Abstract
Physics-based modeling aids in designing efficient data center power and cooling systems. These systems have traditionally been modeled independently under the assumption that the inherent coupling of effects between the systems has negligible impact. This study tests the assumption through uncertainty quantification of models for a typical 300 kW data center supplied through either an alternating current (AC)-based or direct current (DC)-based power distribution system. A novel calculation scheme is introduced that couples the calculations of these two systems to estimate the resultant impact on predicted power usage effectiveness (PUE), computer room air conditioning (CRAC) return temperature, total system power requirement, and system power loss values. A two-sample z-test for comparing means is used to test for statistical significance with 95% confidence. The power distribution component efficiencies are calibrated to available published and experimental data. The predictions for a typical data center with an AC-based system suggest that the coupling of system calculations results in statistically significant differences for the cooling system PUE, the overall PUE, the CRAC return air temperature, and total electrical losses. However, none of the tested metrics are statistically significant for a DC-based system. The predictions also suggest that a DC-based system provides statistically significant lower overall PUE and electrical losses compared to the AC-based system, but only when coupled calculations are used. These results indicate that the coupled calculations impact predicted general energy efficiency metrics and enable statistically significant conclusions when comparing different data center cooling and power distribution strategies.
1 Introduction
The growth of the data center industry calls for energy efficient practices. In 2018, global data center electricity consumption exceeded 200 TWh, which is roughly 1% of worldwide electricity use and more than the national energy consumption of some countries, including Iran [1]. Shehabi et al. [2] projected that U.S. data centers would consume approximately 73 × 10⁹ kWh of electricity in 2020, while estimated consumption in 2014 was 70 × 10⁹ kWh, representing around 1.8% of total U.S. electricity consumption. More recently, Masanet et al. [3] estimated that the data center computing workload increased by about 550% from 2010 to 2018, while global data center electricity consumption increased by only 6% owing to significant improvements in energy efficiency over this time interval. However, Shehabi et al. [4] doubted that the improvements in energy efficiency would be sufficient to offset the energy demand of the rapidly expanding industry. At the same time, some recent reports predicted that data centers in 2025 will consume around 20% of global electricity production [5]. The recent expansion of data-intensive technologies such as cryptocurrency, artificial intelligence, autonomous vehicles, digitized manufacturing, and energy systems further increases the demand for data processing and storage in data centers. Data centers are responsible for approximately 0.5% of U.S. greenhouse gas emissions [6] due to their large energy consumption. Therefore, improving data center energy efficiency is critical to enhancing energy security and reducing environmental burden.
Instantaneous PUE values, which are used in this study, apply the above equations but use instantaneous power draws instead of annual values.
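As a minimal illustration, the widely used definition of overall PUE (total facility power divided by IT power) can be evaluated on instantaneous power draws as sketched below. The function name and the 150 kW non-IT draw are hypothetical and are not values from this study.

```python
def instantaneous_pue(it_power_kw: float, non_it_power_kw: float) -> float:
    """Overall PUE from instantaneous power draws: total facility power / IT power."""
    if it_power_kw <= 0:
        raise ValueError("IT power must be positive")
    return (it_power_kw + non_it_power_kw) / it_power_kw


# Hypothetical example: 300 kW of IT load plus an assumed 150 kW of
# cooling power and electrical losses (illustrative only).
example_pue = instantaneous_pue(300.0, 150.0)
print(f"PUE = {example_pue:.2f}")
```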
Cooling systems have been the primary focus of data center energy research since cooling systems generally dominate the non-IT power consumption in PUE calculations [8]. Tools such as the in-house flow network software Villanova Thermodynamic Analysis of Systems (VTAS) [9] or the data center computational fluid dynamics (CFD) software 6SigmaDC [10] have been developed to aid in improving data center cooling system efficiency by predicting metrics such as the cooling system PUE. These tools enable data center designers and managers to compare different cooling strategies and to optimize the chosen strategy.
This study uses VTAS because of its flexibility in incorporating a wide variety of cooling equipment (e.g., chillers, evaporative coolers, and cooling towers) and cooling strategies. For example, VTAS has been used to show that the second-law efficiency of cooling systems increases when a traditional air cooling strategy containing computer room air conditioning (CRAC) or computer room air handler (CRAH) units is replaced with hybrid liquid-air cooling (in-row, overhead, or rear door coolers) [11] or direct (cold plate) liquid cooling equipment [12]. VTAS has also been used to demonstrate reasonable agreement with CFD modeling for exergy destruction due to data center airflow mixing [13]. In addition, VTAS has been combined with CFD to show how variations in whitespace (i.e., airspace) flow patterns influence the system-wide exergy destruction [14] and to develop strategies for implementing hybrid liquid-air cooling equipment [15]. The software tool has been validated against an experimental testbed as part of a study suggesting that the combination of supervisory control and data acquisition and ON/OFF system-level controls provides low cooling system PUE values while maintaining reliability [16].
Contributions of electrical power system inefficiencies to the overall PUE should also be considered. Fan et al. [17] used modeling to discover opportunities for energy savings in clusters (i.e., thousands of servers) when studying the power provisioning for a warehouse-scale computer installation. In addition, Meisner et al. [18] proposed mechanisms to eliminate idle power waste in servers and showed a 74% average server power reduction by combining an energy conservation approach (PowerNap) and a power provisioning approach (redundant array for inexpensive load sharing, or RAILS).
Some studies have combined the effects of both cooling and power distribution inefficiencies in data centers. Pelley et al. [19] show a theoretical framework for total data center power calculations that include the effects of power distribution from the power distribution unit (PDU) to the servers (including load distribution), and the influence of cooling equipment and airflow recirculation. They provide a parametric power distribution model in their study instead of a physics-based detailed analysis of components. Also, Tran et al. [20] developed the datacenter workload energy simulation tool to estimate the energy consumption of all cooling and power equipment in a data center.
These approaches perform basic data center energy consumption calculations that include contributions of both cooling and power delivery equipment, but they do not incorporate the fact that an inherent coupling exists between cooling and power systems in data centers: each piece of cooling equipment requires a power feed, and electrical inefficiencies translate into cooling loads provided that the source of the inefficiencies resides within the data hall (Fig. 1). Moreover, when the cooling system power draw is increased to handle cooling loads caused by electrical equipment inefficiencies, the corresponding power losses also increase. As a result, the traditionally uncoupled calculated cooling system, power distribution, and overall PUE values will be artificially low. This inherent coupling has not yet been explored analytically, yet it could be important for accurate predictions of PUE values. The key purpose of this study is therefore to determine if this error is statistically significant when compared to the inherent uncertainties within the model. This examination is performed on models of typical alternating current (AC)-based and direct current (DC)-based power delivery systems, enabling the determination of statistical significance in several predicted data center energy efficiency metrics (cooling system PUE, electrical system PUE, overall PUE, CRAC return temperature, total grid power requirement, and total electrical equipment losses) for uncoupled versus coupled calculations as well as for comparing the two systems.
This work builds upon preliminary work by the authors [21] to estimate the influence of coupling electrical and mechanical system calculations on energy efficiency metrics. The preliminary work first introduced a standalone power system calculation scheme and then described the relationship between cooling and power systems. The coupled calculations suggest a significant increase in the cooling system PUE. However, the electrical system calculation framework presented in that work was not stable for data centers beyond ten racks and did not allow for converging power flows (i.e., multiple power sources). Additionally, no system-wide validation was performed, and the power system component models showed only modest agreement in validation exercises to data from The Green Grid [22]. Also, no formal analysis was presented for statistical significance when comparing uncoupled and coupled results. Finally, the reported quantitative impact of coupling on energy efficiency metrics was overestimated due to an error later found in the software [23]. This study therefore advances the previous study in four ways:
(1) A new power system calculation framework has been developed that allows for converging power flows and offers improved stability for data centers beyond ten racks. The old framework began by assuming values of component efficiencies, calculating the current at the loads, and working back to the grid; it then adjusted the currents based on updated component efficiencies and line losses using derived correlations. That framework is unstable since the component efficiency calculations would diverge unless the initially guessed efficiencies are close to the final values. The new framework addresses this condition-related problem by calibrating component efficiencies to experimental data.
(2) The power distribution component efficiency ranges have been calibrated to available published data and additional new data from experimental data center measurements. The new models therefore agree with experimental measurements.
(3) The energy efficiency metric calculations have been corrected.
(4) A formal statistical analysis has been performed to test for significant differences in key metrics.
In addition, these calculations have been extended to compare the efficiencies of AC versus DC-based power distribution.
This study provides significant advances in describing how (1) to model data center power delivery systems, (2) to quantify the influence of coupling cooling and power delivery system calculations on cooling system PUE, electrical system PUE, and overall PUE predictions, and (3) to use statistical analysis on the results of the coupled calculations to determine if one cooling or power delivery strategy is significantly more efficient than another. The study also demonstrates that some key data center metrics are significantly reduced when DC-based power systems are used in place of AC-based systems and the coupled equations are applied to each system model.
2 Methodology
The coupled calculation scheme requires independent algorithms for cooling and power system calculations, followed by translating information between the systems in an iterative fashion until convergence. VTAS, the tool used for the analysis in this study, was originally developed for data center cooling systems analysis. The concept behind the VTAS modeling scheme is that various components in a data center (e.g., cooling and IT equipment) are linked through fluid loops, such as CRAC units interacting with servers through an air loop. The loops contain a closed network of fluid branches. In a typical cooling system calculation, fluid flow rates are calculated based on user-specified pump/fan and fluid branch network information. An energy balance is then performed to determine the equipment capacity based on the known instantaneous IT load. The component heat exchange, fluid stream inlet temperature, and fluid stream outlet temperature are used to size the components and to calculate the component exergy destruction. Transient simulations may subsequently be run using the configuration as a starting state. The cooling system algorithm has been substantially discussed and validated elsewhere [9,11,12,16]. The cooling system for this study features two CRAC units, each providing 10 m³/s of supply airflow at 20 °C and 50% relative humidity.
Power system modeling follows a network modeling scheme similar to that of the cooling system analysis. Power components are connected by electrical lines with calculated inlet and outlet voltages and currents, enabling calculation of component efficiencies. The AC-based and DC-based power distribution systems are shown in Figs. 2 and 3, respectively. In both systems, 13.8 kVAC, three-phase power is received from the electric grid and passed through an AC transformer, which reduces the voltage to 480 VAC. In the AC-based power distribution system, the power is then split to mechanical equipment and to a second transformer, which in turn steps down the voltage to 208 VAC. This circuit then passes through an uninterruptible power supply (UPS), which is modeled as a rectifier and inverter in series. The power leaving the inverter is then distributed to the server power supply units (PSUs) through a series of row-based PDUs and racks.
The DC-based power distribution system differs from the AC-based system in that the second AC transformer and AC UPS are replaced with a single transformation of 480 VAC to 380 VDC using a DC UPS containing a controlled three-phase bridge rectifier, denoted here as a component named “RectifierV.” The output DC power is then fed into the PDUs for distribution to the racks and servers. It should be noted that the AC PSUs contain a rectifier and DC/DC buck converter in series, whereas the DC PSUs only contain the buck converter. Both AC PSUs and DC PSUs terminate with a load at 12 VDC. The test data center used in this study contains four rows of 10 racks, each containing 15 servers. Each server load is 500 W, leading to a total IT load of 300 kW.
2.1 Power System Components.
Fixed component efficiencies, defined as real power out divided by real power in, are directly used for all component models: AC transformers (step down in AC voltage), rectifiers (AC-to-DC conversion), inverters (DC-to-AC conversion), DC/DC buck converters (step down in DC voltage), and the RectifierV (AC-to-DC conversion plus step down in voltage) component. The ranges of component efficiencies in the AC and DC power system models were calibrated to match available data from The Green Grid [22], Southern California Edison (Rosemead, CA) [24], and measurements in an experimental data center at Binghamton University (Binghamton, NY). Data from The Green Grid include transformer efficiencies over various load levels, indicating a nearly flat efficiency of ∼0.97 when the loads exceed 20%. The Green Grid data for an AC UPS also indicate ∼0.90 efficiency for loads exceeding 20%. Finally, Southern California Edison's data for an AC PSU indicate an efficiency range of 0.87–0.90 for loads exceeding 40%.
Calibration of component efficiencies to Binghamton University experimental data required models similar to those shown in Figs. 2 and 3 except with a single rack of 48 servers and without mechanical power feeds. These single-rack models are sufficient for calibrating specific component efficiencies since experimental data pertain only to the PDU level. Power system measurements were used at a variety of server load levels for both systems, although the variation in results is not captured in the models due to the use of fixed component efficiencies.
The data from modeling the DC-based single-rack system were used to calibrate the efficiency of RectifierV. The rectifier within the UPS draws more power than at the PDU since the latter only reports rack power draw. Therefore, the efficiency at the PDU measurement position is calculated as the DC PDU reading divided by the UPS rectifier DC power, or the efficiency of RectifierV per Fig. 3. The experimental data for loads ranging from 25% to 100% indicate an efficiency range of –0.950, so an efficiency range of is used for RectifierV.
The experimental dataset also compares AC-based and DC-based PSUs by determining the ratio of power for both systems at the PDU. Since fixed efficiencies are used in this study, and the buck converter efficiencies are assumed to be equivalent for both systems, a PSU rectifier efficiency of 0.90 achieves a ratio of 1.11, which falls in the experimental data range of 1.09–1.15 while adhering to Southern California Edison's [24] AC PSU efficiency range.
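The ratio of 1.11 follows directly from the fixed-efficiency assumption: with identical buck converters, the AC-to-DC ratio of PDU power reduces to the reciprocal of the PSU rectifier efficiency. The short sketch below illustrates this arithmetic. The function name and the buck converter efficiency of 0.95 are placeholders chosen here for illustration (the latter cancels out of the ratio), and the 500 W load is the server load used in this study.

```python
def ac_to_dc_pdu_power_ratio(rectifier_eff: float, buck_eff: float = 0.95) -> float:
    """Ratio of PDU power drawn by an AC PSU to that drawn by a DC PSU.

    The AC PSU is a rectifier and buck converter in series; the DC PSU is
    the buck converter alone. buck_eff is a placeholder value that cancels.
    """
    load_w = 500.0  # server load at the PSU output
    ac_pdu_power_w = load_w / (rectifier_eff * buck_eff)
    dc_pdu_power_w = load_w / buck_eff
    return ac_pdu_power_w / dc_pdu_power_w


ratio = ac_to_dc_pdu_power_ratio(0.90)
print(f"AC/DC PDU power ratio = {ratio:.2f}")
```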
The calibrated component efficiency ranges and the results of the calibration exercises are summarized in Tables 1 and 2, respectively. Table 1 also includes conservative engineering judgment ranges for the coefficient of performance (COP) for the CRAC units and the efficiencies of fans in the system. These component metric ranges form the basis for the subsequent statistical analysis.
2.2 Power System Framework.
A modified power system calculation framework is used in this study to address the limitations (i.e., the inability to handle converging power flows, and lack of robustness with regard to scalability) of the preliminary method described in Ref. [21]. The approach used here to address these two issues is to set up and solve a system of nonlinear equations. This nonlinear equation set includes component losses, line losses, and a direct solution for real and reactive system input power values. The algorithm is also designed to allow for flexibility in defining loads as AC or DC, input power sources as AC or DC, and AC or DC electrical junctions. The system assumes balance in three-phase AC transmission. The unknowns in the system are:
The input real power (AC and DC power sources) and input reactive power (AC power sources only) to the system.
The voltage magnitudes (AC and DC electrical lines) and phase angles (AC electrical lines only) at the beginning and end points of all electrical lines.
The system of nonlinear equations solves for these unknowns at each iteration. Figure 4 shows that component m is defined as upstream of component k, which is upstream of component n. A power flow defined as going from k to m uses the subscript km, a flow from k to n uses the subscript kn, and so forth.
where G and B are the line conductance and susceptance, respectively.
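For orientation, the textbook per-phase expressions for the real and reactive power leaving node k toward node m over a line with series admittance G + jB can be sketched as follows. This is the standard power flow form, offered under the assumption of a purely series line model; it is not necessarily the exact set of equations used in this framework, and the function names and numerical values are hypothetical. The closed-form result is cross-checked against a direct phasor calculation.

```python
import cmath
import math
from typing import Tuple


def line_power_flow(v_k: float, theta_k: float, v_m: float, theta_m: float,
                    g: float, b: float) -> Tuple[float, float]:
    """Closed-form real and reactive power leaving node k toward node m."""
    d = theta_k - theta_m  # phase angle difference across the line
    p_km = g * v_k**2 - v_k * v_m * (g * math.cos(d) + b * math.sin(d))
    q_km = -b * v_k**2 - v_k * v_m * (g * math.sin(d) - b * math.cos(d))
    return p_km, q_km


def line_power_flow_phasor(v_k: float, theta_k: float, v_m: float, theta_m: float,
                           g: float, b: float) -> Tuple[float, float]:
    """Same quantities from S = V_k * conj(I), with I = (G + jB)(V_k - V_m)."""
    vk = cmath.rect(v_k, theta_k)
    vm = cmath.rect(v_m, theta_m)
    current = complex(g, b) * (vk - vm)
    s_km = vk * current.conjugate()
    return s_km.real, s_km.imag


# Hypothetical line data (per-phase volts, radians, siemens).
p_closed, q_closed = line_power_flow(480.0, 0.0, 478.0, -0.01, 5.0, -15.0)
p_phasor, q_phasor = line_power_flow_phasor(480.0, 0.0, 478.0, -0.01, 5.0, -15.0)
print(p_closed, q_closed)
```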
These equations are modified for specific components:
The above system of power balance equations provides a set of nonlinear equations for each component. However, establishing the outlet voltage magnitude and phase angle (if applicable) for each component k is also part of this system of equations, which yields the same number of equations and unknowns. The application of these equations is achieved through component-specific means as described in Table 3.
The electrical power system solution algorithm is as follows per Fig. 7:
- (1) Calculate G and B for all electrical lines based on user-specified wire length, gage, and nearest-neighbor spacing.
- (2) Determine the size of the system of equations using the criteria specified above.
- (3) Provide an initial guess for the solution vector by:
  - Assuming that all phase angles are zero
  - Assuming zero voltage drop in each electrical line
  - Assuming zero input power to the system
- (4) Populate the stiffness matrix and force vector by applying the power balance equations and outlet voltage/phase angle relations for all components.
- (5) Solve the linearized system of equations.
- (6) Update the inlet power to the power sources, the voltages and phase angles in the electrical lines, and calculate the complex current for each electrical line.
- (7) Update the component real and reactive (if applicable) power losses.
- (8) Check for convergence and update the old solution vector using successive under-relaxation. A relaxation parameter of 0.5 is used here.
- (9) Go to step 5 until converged using the two-norm of the absolute change in the solution vector between successive iterations as the convergence criterion (10⁻² in this study). Convergence was achieved in all system models in this study.
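The under-relaxation and convergence check in the steps above can be sketched on a toy problem. The example below is not the VTAS implementation: it replaces the full stiffness matrix solve with a hypothetical two-unknown DC circuit (a 380 V source feeding a 5 kW constant-power load through an assumed 0.1 Ω line), and all function and variable names are illustrative. It uses the relaxation parameter of 0.5, the two-norm criterion of 10⁻², and the initial guess of zero voltage drop and zero input power described above.

```python
import math
from typing import Callable, List, Tuple


def under_relaxed_solve(solve_linearized: Callable[[List[float]], List[float]],
                        x0: List[float], relax: float = 0.5,
                        tol: float = 1e-2, max_iter: int = 200) -> Tuple[List[float], int]:
    """Iterate a linearized solve with successive under-relaxation."""
    x_old = list(x0)
    for iteration in range(1, max_iter + 1):
        x_lin = solve_linearized(x_old)  # solution of the linearized system
        x_new = [xo + relax * (xl - xo) for xo, xl in zip(x_old, x_lin)]
        # Two-norm of the absolute change in the solution vector.
        change = math.sqrt(sum((xn - xo) ** 2 for xn, xo in zip(x_new, x_old)))
        x_old = x_new
        if change < tol:
            return x_old, iteration
    raise RuntimeError("power system iteration did not converge")


V_SOURCE, R_LINE, P_LOAD = 380.0, 0.1, 5000.0  # hypothetical toy circuit


def toy_linearized(x: List[float]) -> List[float]:
    """Unknowns are [load voltage, source input power]."""
    current = P_LOAD / x[0]  # line current frozen at the old load voltage
    return [V_SOURCE - R_LINE * current, V_SOURCE * current]


# Initial guess: zero voltage drop in the line and zero input power.
(v_load, p_input), iterations = under_relaxed_solve(toy_linearized, [V_SOURCE, 0.0])
print(v_load, p_input, iterations)
```

The source input power starts far from its final value, so it dominates the iteration count in this toy case; a larger relaxation parameter would converge faster here but offers less damping for stiffer systems.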
2.3 Coupled Power and Cooling System Calculations.
The implementation of the coupling of the two systems follows a standard iteration cycle as shown in Fig. 8:
- (1) The mechanical system calculates the cooling equipment power requirements assuming no electrical inefficiencies in the system.
- (2) The electrical system model is updated with power feeds to the cooling equipment.
- (3) The electrical system is solved to determine the various electrical power losses.
- (4) The relative error norm is calculated from the change in the total system electrical power input between successive iterations (Eq. (18)). Convergence is achieved when the error norm falls below 10⁻³.
- (5) The additional heat sources due to electrical power losses are incorporated into the cooling system calculations. The cooling calculations are then repeated.
- (6) Go to step 2 and iterate until convergence. Convergence was achieved for all of the simulations used in this study.
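The iteration cycle can be illustrated with a heavily simplified lumped model. In the sketch below, the cooling system is reduced to a single COP and the electrical system to a single lumped efficiency; the values of 3.0 and 0.90 are assumptions chosen only for illustration and are not the calibrated values of this study. The 300 kW IT load and the 30% share of CRAC power that becomes cooling load follow the text. The normalization of the relative error by the current total power is also an assumption.

```python
from typing import Tuple


def coupled_power_cooling(it_load_kw: float = 300.0, cop: float = 3.0,
                          elec_eff: float = 0.90, crac_heat_fraction: float = 0.30,
                          tol: float = 1e-3, max_iter: int = 50) -> Tuple[float, int]:
    """Toy coupled iteration returning (total grid power in kW, iterations)."""
    extra_heat_kw = 0.0  # first pass: no electrical inefficiencies
    previous_total_kw = None
    for iteration in range(1, max_iter + 1):
        # Cooling calculation: power needed to remove IT heat plus coupled heat.
        cooling_power_kw = (it_load_kw + extra_heat_kw) / cop
        # Electrical solve: grid power feeding the IT and cooling equipment.
        total_kw = (it_load_kw + cooling_power_kw) / elec_eff
        losses_kw = total_kw - it_load_kw - cooling_power_kw
        # Relative change in total system real power between iterations.
        if previous_total_kw is not None:
            if abs(total_kw - previous_total_kw) / total_kw < tol:
                return total_kw, iteration
        previous_total_kw = total_kw
        # Electrical losses and part of the CRAC power become cooling load.
        extra_heat_kw = losses_kw + crac_heat_fraction * cooling_power_kw
    raise RuntimeError("coupled iteration did not converge")


uncoupled_total_kw = (300.0 + 300.0 / 3.0) / 0.90  # single pass, no feedback
coupled_total_kw, n_iterations = coupled_power_cooling()
print(uncoupled_total_kw, coupled_total_kw, n_iterations)
```

Even in this toy form, the coupled total exceeds the uncoupled total because each increment of electrical loss adds cooling load, which in turn adds electrical loss.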
2.4 Statistical Significance Testing.
where the systematic standard uncertainty is based on conservative engineering judgment and is assumed to be identical for each case (AC-based versus DC-based, uncoupled versus coupled).
Statistical significance is then achieved when the test statistic exceeds the critical value corresponding to 95% confidence.
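A generic two-sample z-test for comparing means can be sketched as below. This is the textbook form with a two-sided critical value of 1.96 at 95% confidence; it does not reproduce how the systematic standard uncertainty enters the test statistic in this study, and the sample means, standard deviations, and sample sizes are hypothetical.

```python
import math


def two_sample_z(mean_a: float, std_a: float, n_a: int,
                 mean_b: float, std_b: float, n_b: int) -> float:
    """Textbook z statistic for the difference between two sample means."""
    standard_error = math.sqrt(std_a**2 / n_a + std_b**2 / n_b)
    return (mean_a - mean_b) / standard_error


def is_significant(z: float, critical: float = 1.96) -> bool:
    """Two-sided test at 95% confidence."""
    return abs(z) > critical


# Hypothetical overall PUE samples for two cases (illustrative numbers only).
z_value = two_sample_z(1.60, 0.05, 100, 1.45, 0.06, 100)
print(z_value, is_significant(z_value))
```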
3 Results
3.1 Base Case Systems.
The predictions for an AC-based system with the mean values of parameters in Table 1 with uncoupled and coupled calculations are shown in Table 4. The coupled simulations converged in six iterations. As expected, the results show that only those components upstream of the cooling system components (i.e., the CRAC units and fans) have additional losses due to the coupled calculations, since the server power draw is fixed. Incorporating the additional cooling loads increases the predicted compressor power for each CRAC unit by over 60%, thereby increasing the predicted cooling system PUE by 20%. The CRAC blower's predicted power consumption is not affected by the coupling since the air volumetric flow rate is not affected in the model. The cooling system equipment represents a minor part of the overall grid power draw in the data center, as seen by the real power requirement increase of 17%, so the coupled calculations produce negligible growth in the electrical system PUE.
Of note is the large growth in cooling load (267 kW), which is nearly as large as the IT load itself (300 kW). A fully uncoupled model does not take into account any cooling loads generated by electrical losses, whereas the coupled scheme here incorporates these loads after the first iteration. The resultant change in PUE values and cooling load, shown in the table as a “partially coupled” case, suggests a growth in real power requirement of 56 kW due to the additional CRAC power required to handle the additional 186 kW from electrical losses outside the CRAC compressor. The remaining 36 kW growth in real power for iterations 2–6 stems from cooling loads generated by the CRAC units themselves, assumed to be 30% of total CRAC power draw in this model. It should be noted that this approach represents a “worst case” scenario where all electrical losses and a significant portion of cooling equipment cooling load are located within the data hall.
Table 5 provides the results for uncoupled and coupled calculations associated with a DC-based power distribution system. The coupled system calculations also converge in six iterations and, as in the AC-based system model, have no impact on PSU component losses. The DC-based system has less electrical equipment power loss than the AC-based system, so the coupling has less of an impact on data center energy efficiency metrics. This is seen in a smaller increase in cooling system and overall PUE compared to the impact of coupling in the AC-based system model. The reduced electrical equipment power loss results in smaller cooling loads in a coupled system, so the increase in CRAC compressor power requirement is lower for the DC-based system model than for the AC-based system model.
3.2 Statistical Significance Tests.
Tables 6 and 7 indicate uncertainty ranges for data center energy efficiency metrics when sampled input values from Table 1 are provided for AC-based and DC-based power distribution systems, respectively. Both tables include the chosen values for the systematic standard uncertainty associated with each metric. Table 6 shows that the differences in the cooling system PUE, the overall PUE, the CRAC return air temperature, and the total electrical equipment losses are all deemed statistically significant. These results are expected when examining Table 4 since the mean values for the metrics in Table 6 are similar to those in Table 4, and the effects of coupled calculations in Table 4 are most apparent for those metrics listed as statistically significant in Table 6. It should be noted that the increased CRAC load in Table 4 corresponds to an increased CRAC return temperature, hence the statistical significance for this metric in Table 6.
The results of the statistical analysis for the DC-based system model, shown in Table 7, surprisingly show no statistical significance in any category. The reduced impact of coupling seen in this system model in Table 5 compared to Table 4 results in the lack of statistical significance given the uncertainty range provided in Table 7. These results suggest that coupled calculations are not necessary for efficient power delivery systems but become increasingly important as the electrical system PUE increases.
The quantified uncertainty ranges and systematic standard uncertainties in Tables 6 and 7 lend themselves to determining the statistical significance when comparing AC-based versus DC-based power distribution systems when coupling is included. Table 8 provides the results of testing the statistical significance of the calculated data center system metrics in AC-based versus DC-based power distribution systems, indicating that differences in statistical significance are seen when comparing uncoupled versus coupled results. No statistical significance is seen in any metrics when comparing the uncoupled cases, whereas examination of the coupled cases indicates a statistically significant lower overall PUE and electrical equipment losses for the DC-based system. The reason for the lack of statistical significance for uncoupled systems is the similar values of their metrics, whereas the coupled calculations exacerbate the differences in electrical losses to the point where some metrics become statistically significant.
3.3 Discussion.
One advantage of the network modeling approach used by VTAS is the inherent flexibility to model a wide variety of cooling systems (e.g., CRAC or CRAH-based cooling [9], rear door heat exchangers [11], water-based cold plates [12], and systems with airside economization [16]) and power distribution systems (e.g., AC-based and DC-based systems as in this study). For each component in the cooling network, the user can either specify performance metrics (e.g., the coefficient of performance for a CRAC unit) or derive the metrics using physics-based component models. The electrical power system components also have this capability, but the instability in their physics-based efficiency calculations calls for calibrated, user-defined efficiencies as used in this study. VTAS can also perform parameter sweeping to ascertain the impact of input parameters on system-level metrics such as PUE. Therefore, future work can investigate the influence of coupling the cooling and power distribution system calculations for alternative cooling and power distribution system configurations.
4 Conclusions
Several insights can be gained from this study, notably that coupling system calculations becomes increasingly important as the inefficiencies in the electrical power distribution system increase. Coupling the system calculations tends to impact the cooling equipment load most significantly—and therefore the cooling system PUE—whereas the electrical system PUE is largely unaffected since the coupling only affects the power draw by cooling components and their upstream power distribution components. In addition, the overall PUE is affected for the AC-based system when coupled calculations are used, and the coupled calculations suggest that the DC-based system has a statistically significant lower overall PUE than an AC-based system for the data center model in this study. Finally, coupled calculations are necessary for comparing two different power distribution strategies since the coupling enhances differences in electrical system efficiencies, enabling a greater possibility for passing statistical significance tests. Future work should explore the coupling effect on PUE predictions for models of alternative cooling systems (CRAH-chiller-cooling tower, evaporative cooler, etc.) and the impact of spatial heterogeneity in IT equipment utilization.
Acknowledgment
This material is based upon work supported by the National Science Foundation under Grant No. IIP-1738782. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation. Advice from industrial mentors as part of the NSF Industry/University Cooperative Research Center (I/UCRC) in Energy-Smart Electronic Systems (ES2), and measurement data from the Binghamton University data center under the supervision of Kanad Ghosh and Bahgat Sammakia are greatly appreciated.
Funding Data
Directorate for Engineering (Grant No. IIP-1738782; Funder ID: 10.13039/100000084).