Machine Learning Enhanced Building Performance Gui
1 State Key Laboratory of Subtropical Building Science, School of Architecture, South China University of
Technology, Guangzhou 510641, China
2 Architectural Design & Research Institute of South China University of Technology (SCUT) Co., Ltd.,
Guangzhou 510641, China
3 Guangzhou International Engineering Consult Co., Ltd., Guangzhou 510600, China
* Correspondence: [email protected] (X.X.); [email protected] (Y.N.)
Abstract: Given the dominant role of high-rise office buildings in energy expenditure within
China’s Hot Summer and Warm Winter (HSWW) zone, high-fidelity performance prediction and a
multi-objective optimization framework during the early design phase are critical for achieving
sustainable energy efficiency. This study presents an innovative approach integrating machine
learning (ML) algorithms and multi-objective genetic optimization to predict and optimize
the performance of high-rise office buildings in China’s HSWW zone. By integrating
Rhino/Grasshopper parametric modeling, Ladybug Tools performance simulation, and
Python programming, this study developed a parametric high-rise office building model
and validated five advanced and mature machine learning algorithms for predicting energy
use intensity (EUI) and useful daylight illuminance (UDI) based on architectural form
parameters under HSWW climatic conditions. The results demonstrate that the CatBoost
algorithm outperforms other models with an R² of 0.94 and CVRMSE of 1.57%. The Pareto
optimal solutions identify substantial shading dimensions, southeast orientations, high
aspect ratios, appropriate spatial depths, and reduced window areas as critical determinants
for optimizing EUI and UDI in high-rise office buildings of the HSWW zone. This research
fills a gap in the existing literature by systematically investigating the application of ML
algorithms to predict the complex relationships between architectural form parameters and
performance metrics in high-rise building design. The proposed data-driven optimization
framework provides architects and engineers with a scientific decision-making tool for
early-stage design, offering methodological guidance for sustainable building design in
similar climatic regions.
Keywords: high-rise office buildings; building performance; machine learning; building
performance optimization; Pareto-optimal solutions; hot-summer and warm winter zone
Academic Editor: Peixian Li
Received: 16 March 2025
Revised: 25 April 2025
Accepted: 29 April 2025
Published: 1 May 2025
Citation: Xie, X.; Ni, Y.; Zhang, T. Machine-Learning-Enhanced Building Performance-Guided
Form Optimization of High-Rise Office Buildings in China’s Hot Summer and Warm Winter
Zone—A Case Study of Guangzhou. Sustainability 2025, 17, 4090.
https://doi.org/10.3390/su17094090
Copyright: © 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open
access article distributed under the terms and conditions of the Creative Commons Attribution
(CC BY) license (https://creativecommons.org/licenses/by/4.0/).
1. Introduction
1.1. Background of the Study
2023 was officially declared the warmest year in recorded human history by the World
Meteorological Organization (WMO), with projections indicating further intensification of
heatwaves in the coming decades [1]. Meanwhile, the past two decades (2000–2019) also
marked the warmest period in China since 1900 [2]. The continuous temperature increase
has already exerted significant impacts on human society. The Intergovernmental Panel
on Climate Change (IPCC) has identified anthropogenic greenhouse gas emissions as the
primary driver of accelerating global warming trends [3]. Hence, implementing energy
conservation measures represents a critical strategy for preserving ecological integrity and
advancing sustainable development objectives in the context of climate change mitigation.
As the world’s second-largest carbon emitter [4], China faces significant opportunities
for energy conservation and emission reduction, particularly within the construction sector.
Recent studies have indicated that energy consumption in China’s building industry
accounts for 36.3% of the national total [5], positioning it as the sector with the highest
decarbonization potential among the three primary energy-consuming sectors (construction,
industry, and transportation) [6]. Office buildings, representing the most rapidly expanding
building typology over the past decade, now comprise 20% of China’s public building
stock [7]. These structures contribute to 20% of the nation’s total energy use, with per-unit-
area electricity consumption 10 to 20 times greater than that of residential buildings [8,9]. In
China’s HSWW zone, the eight-month air conditioning season further exacerbates energy
intensity in office buildings, resulting in energy consumption significantly higher than the
national average. Projections indicate that commercial buildings will continue to increase
their share of total energy consumption in the coming decades [10]. Given this concern,
optimizing the performance of high-rise office buildings in China’s HSWW zone has
emerged as a critical priority for advancing the nation’s energy conservation and emission
reduction agenda.
Performance optimization during the early-stage architectural design phase is crucial
for achieving energy-efficient buildings, with studies demonstrating up to 40% energy
savings potential [11]. Building performance simulation (BPS) plays a key role in design
validation through two model categories [12]. Forward models (e.g., EnergyPlus [13],
TRNSYS [14]) rely on physical principles for accurate predictions [15], while inverse
models, also known as data-driven methods and comprising black-box (e.g., ML) and
gray-box approaches [16], offer computational efficiency for parametric optimization.
Specifically, the efficiency of data-driven methods enables architects to explore broader
design spaces within constrained timelines and identify sustainable design solutions rapidly
in early design stages [15,17].
However, their accuracy remains context-dependent, particularly when it comes to
predicting the relationship between a building’s architectural form and its performance.
This study addresses these limitations by developing an integrated optimization framework
and taking high-rise office buildings in China’s HSWW zone as an example. By combining
real-time feedback mechanisms with multi-objective optimization, the approach enhances
the practicality of traditional green design workflows while maintaining academic rigor
through validated simulation tools and parametric analysis.
solutions represented by the Pareto frontier [29,30]—a concept widely adopted to visualize
non-dominated design alternatives [31].
To reduce computational costs in performance simulation, traditional MOO ap-
proaches integrate building performance simulation (BPS) engines with optimization
algorithms [32], such as genetic algorithms (GA), particle swarm optimization (PSO),
ant colony optimization (ACO), and simulated annealing (SA). Among these, GA and its
variants have emerged as the most prevalent and effective methods [33,34]. Platforms
like ModeFrontier [35] and Octopus [36] have operationalized these algorithms, enabling
designers to conduct parametric optimizations efficiently. Recent applications include
Alelwani et al.’s [37] GA-based optimization of vernacular Rawshan elements in Saudi
Arabian buildings to improve energy efficiency and useful daylight illuminance (UDI).
Wang et al. [38] applied the SPEA-2 algorithm to optimize annual cooling and lighting
energy consumption in rural residences of China’s Hot Summer and Cold Winter Zone.
Chaturvedi et al. [39] used NSGA-II to balance annual energy use and cooling duration
in Indian residential buildings, while Zhao et al. [40] employed NSGA-II to optimize win-
dow and shading parameters for thermal comfort and energy performance in a high-rise
office building.
While GAs reduce simulation frequency and computational time compared to brute-force
approaches, their efficiency remains insufficient for real-world design workflows [41].
Recent advancements in ML have revolutionized building performance prediction by de-
ploying surrogate models trained on limited simulation datasets, enabling rapid feedback
and optimization [42]. Over 100 ML algorithms have been applied to building perfor-
mance modeling [43], with dominant approaches including artificial neural networks
(ANN) [44–47], support vector regression (SVR) [48–50], gradient-boosted decision trees
(GBDT) [51], long short-term memory networks (LSTM) [52], decision trees (DT) [53],
random forests (RF) [54–56], and their variants. Chi et al. [57] demonstrated over 90%
accuracy in predicting heating/cooling energy consumption across six building typologies
using eight ML algorithms. Siamak et al. [58] integrated Gaussian process regression (GPR)
with MOO to optimize nine design parameters for heating/cooling energy savings. Chen
et al. [59] developed surrogate models using MLR, MARS, and SVM for 5000+ simula-
tion cases of Hong Kong high-rise residential buildings, identifying SVM as the optimal
performer before applying NSGA-II to derive Pareto-optimal solutions for three energy
objectives. Gou et al. [60] combined ANN with NSGA-II to analyze 20 design metrics
affecting cooling thermal response (CTR) and building energy density (BED) in Shanghai
high-rises. Wu et al. [51] employed BO-XGBoost to model envelope parameters’ impacts on
energy, daylighting, and thermal comfort, achieving Pareto optimization through NSGA-II.
Overall, the core of integrating ML and GA resides in the combination of “predictive
capability” and “optimization capability”: ML enables quick and accurate prediction of
building performance, while the GA supports more effective optimization. This integration
overcomes the limitations of traditional methods in terms of accuracy, efficiency, and
multi-objective balance, thereby establishing itself as a key technical approach for advanced
building performance prediction and optimization.
While ML-driven building optimization has made notable advancements, prior studies
have primarily focused on thermal design parameters [61–64] or geometric adjustments to
specific building components [65], leaving the relationship between architectural form and
performance underexplored. Additionally, despite the recognized importance of occupant
comfort in enhancing health and productivity [66], existing ML-based frameworks have
primarily focused on energy efficiency, overlooking metrics such as thermal comfort and
daylight illuminance [67–69]. Moreover, as demonstrated in the prior literature, no single
ML algorithm universally outperforms others across all building performance tasks, as
Sustainability 2025, 17, 4090 4 of 27
2. Methodology
As illustrated in Figure 1, the research framework comprises three interconnected
phases.
Phase 1: Develop the parametric building model, which involves two core tasks:
(1) defining parameters of architectural form (e.g., building orientation, building width,
and room depth), configuring the thermal parameters of the building envelope, selecting
proper climate data files, and establishing occupancy schedules to create a parametric
prototype of high-rise office buildings in China’s HSWW region; (2) generating datasets
of parameter combinations via Latin hypercube sampling within the defined parame-
ter ranges.
Phase 2: Run the performance simulation for each sampled building model using
Ladybug Tools, generating a ‘parameter-performance’ dataset where each parameter com-
bination is mapped to its corresponding EUI and UDI values.
Phase 3: Train multiple ML algorithms on the simulated datasets to develop surrogate
models for multi-objective prediction. The best-performing models are integrated with
the non-dominated sorting genetic algorithm (NSGA-II) to derive Pareto-optimal solu-
tions. Based on the derived Pareto front, the optimal ranges and interactions of building
design parameters are analyzed to identify performance trade-offs between EUI and UDI,
thereby guiding design decision-making. The subsequent sections detail the tools and
methodologies employed in each phase.
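The Pareto-front step in Phase 3 can be illustrated with a minimal non-dominated filter. The sketch below is not NSGA-II itself; it shows only the dominance test on which NSGA-II's non-dominated sorting is built, assuming that EUI is minimized and UDI is maximized. The candidate values are made-up placeholders, not results from this study.

```python
from typing import List, Tuple

Design = Tuple[float, float]  # (EUI in kWh/m2, UDI in %), hypothetical values


def dominates(a: Design, b: Design) -> bool:
    """Return True if design a is no worse than b on both objectives
    (lower EUI, higher UDI) and strictly better on at least one."""
    no_worse = a[0] <= b[0] and a[1] >= b[1]
    strictly_better = a[0] < b[0] or a[1] > b[1]
    return no_worse and strictly_better


def pareto_front(designs: List[Design]) -> List[Design]:
    """Keep only the designs that no other design dominates."""
    return [d for d in designs
            if not any(dominates(other, d) for other in designs)]


# Placeholder candidates, e.g. surrogate predictions for five form variants.
candidates = [(95.0, 60.0), (90.0, 55.0), (100.0, 58.0), (92.0, 62.0), (90.0, 50.0)]
front = pareto_front(candidates)
print(front)
```

In a full workflow the candidates would come from the trained surrogate models, and an evolutionary algorithm would repeatedly generate and filter new candidates rather than filtering a fixed list once.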
[Figure 2: panels (a) and (b); labels shown in the drawing: VSS, HSS, CH, FH, VSD, WSH]
Figure 2. The references for each parameter: (a) reference of the planar parameters. For planar
parameters, the plan length can be calculated as W × R, and the core (circulation and other
non-support office spaces) size automatically adapts to W, R, and D. (b) Reference of the
vertical and shading parameters; the height of the window can be given by FH − (CH + WSH).
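The two geometric relations stated in the Figure 2 caption can be written out directly. In the sketch below, the function names and the numeric inputs are illustrative assumptions; only the formulas (plan length = W × R, window height = FH − (CH + WSH)) come from the caption, and the symbols are kept as the paper's own.

```python
def plan_length(w: float, r: float) -> float:
    """Plan length computed from W and R as W x R."""
    return w * r


def window_height(fh: float, ch: float, wsh: float) -> float:
    """Window height computed as FH - (CH + WSH).
    Raises ValueError when the combination leaves no room for a window."""
    height = fh - (ch + wsh)
    if height <= 0:
        raise ValueError("FH must exceed CH + WSH for a window to exist")
    return height


# Hypothetical inputs in metres (R is dimensionless); not values from the study.
print(plan_length(w=30.0, r=1.5))
print(window_height(fh=4.2, ch=0.8, wsh=0.9))
```

A check of this kind is useful when sampling parameters independently, because some combinations of FH, CH, and WSH would otherwise yield a non-positive window height.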
Figure 3. 3D models generated via different parametric configurations, showcasing the capability of
the parametric model with 10 form parameters.
the Honeybee plugin. This plugin offers a “Program” port, which has eight sub-ports,
including “People”, “Lighting”, “Electric Equipment”, “Gas Equipment”, “Hot Water”,
“Infiltration”, “Ventilation”, and “Setpoint”. These sub-ports are used to define the device
power, usage time, occupant density, and space occupancy rate.
For the sake of generality, the building operation schedule in this study is set according
to the recommended values for office buildings in references [74,75]. The only difference
is that the winter heating temperature specified in the standards is removed, and no
heating equipment is used in winter. This adjustment aligns with the actual usage and
design guidelines of most local office buildings in the HSWW zone. The detailed building
operation schedule is presented in Table 3.
Figure 4. The geographical coverage of China’s HSWW climate zone and the major cities within it.
The red star represents the location of Beijing, the capital of China.
Figure 5. The climate differences between six major cities in the HSWW zone: (a) monthly average
temperature; (b) monthly average sunshine duration.
Generally, open-source .epw data of Guangzhou were sourced from two repositories:
1. CSWD (Chinese Standard Weather Data): A 2005 historical dataset provided by the
China Meteorological Administration.
2. TMYx (Typical Meteorological Year): A dynamically updated dataset from the US
NOAA, incorporating 2007–2021 monthly averages to reflect contemporary climate
trends.
While CSWD has been widely used in Chinese studies, its static 2005 datasets may
be less relevant to current conditions due to climate change. Figure 6 compares hourly
dry bulb temperatures from the CSWD and TMYx datasets for Guangzhou, revealing
significantly more high-temperature hours in TMYx. The average annual temperatures
differ by 0.88 °C (22.23 °C for CSWD vs. 23.11 °C for TMYx), representing a 3.96% increase.
Given the strong influence of ambient temperature on building energy use, TMYx data
were selected to ensure up-to-date and realistic simulation outcomes.
2.5. Creation of Building Performance Simulation Datasets
Following parametric model development and simulation parameter configuration,
energy and daylight simulations were executed using Honeybee 1.4.0 [78], an open-source
plugin for Grasshopper. This plugin’s computational accuracy has been validated in prior
studies [79,80], ensuring reliable performance predictions.
To evaluate building performance, appropriate metrics were selected. For energy
efficiency, energy use intensity (EUI, kWh/m²) was adopted, combining cooling energy
(EUI_cooling) and lighting energy (EUI_lighting). Daylight performance was assessed
using the useful daylight illuminance (UDI, %) proposed by Nabil and Mardaljevic [81], which
quantifies the annual percentage of occupied hours with horizontal illuminance within a
specified useful range. Incorporating both glare and illuminance criteria, the UDI is widely
recognized as a comprehensive metric for daylight quality. In this study, the UDI was
calculated by placing sensors at 0.8 m above floor level (desk height) across a 1-m grid in
office zones. The effective illuminance range of 300–2000 lux for typical office tasks was
applied, and the final UDI value represented the average across all sensor points.
Figure 6. Comparison of hourly dry bulb temperatures of Guangzhou from CSWD and TMYx
datasets: (a) hourly temperatures of Guangzhou from CSWD with less red color (representing high
temperature); (b) hourly temperatures of Guangzhou for TMYx with more red color.
To enhance the representativeness, accuracy, and generalizability of the datasets
for ML training, Latin hypercube sampling (LHS) was employed to generate parameter
combinations. LHS efficiently captures design space variability without requiring excessive
samples [82] and has been widely adopted in building performance analysis [82–84]. This
method ensures uniform parameter distribution while minimizing redundant sampling [17],
making it particularly suitable for high-dimensional design spaces.
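The stratification idea behind LHS can be sketched in a few lines of standard-library Python. This is a minimal illustration, not the sampling code used in the study: the function name, the seed, and the three parameter ranges are assumptions, and the study's actual ranges are not reproduced here.

```python
import random
from typing import Dict, List, Tuple


def latin_hypercube(bounds: Dict[str, Tuple[float, float]],
                    n_samples: int, seed: int = 0) -> List[Dict[str, float]]:
    """Draw n_samples points so that each parameter's range is split into
    n_samples equal strata and every stratum is used exactly once."""
    rng = random.Random(seed)
    columns = {}
    for name, (low, high) in bounds.items():
        # One random point inside each stratum, in unit coordinates.
        unit = [(k + rng.random()) / n_samples for k in range(n_samples)]
        rng.shuffle(unit)  # decouple the stratum order between parameters
        columns[name] = [low + u * (high - low) for u in unit]
    return [{name: columns[name][i] for name in bounds}
            for i in range(n_samples)]


# Hypothetical ranges for three form parameters (placeholders only).
bounds = {"O": (-90.0, 90.0), "D": (8.0, 14.0), "HSS": (0.0, 1.5)}
samples = latin_hypercube(bounds, n_samples=5)
for s in samples:
    print(s)
```

Because every stratum of every parameter is hit exactly once, even a small sample covers each range evenly. Library implementations such as `scipy.stats.qmc.LatinHypercube` offer the same scheme with additional options.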
To achieve an optimal balance between computational cost and machine learning data
requirements, a total of 50 parametric samples were generated. Simulations were executed
on a 16-core CPU (AMD Ryzen™ 9 7945HX, Advanced Micro Devices, Santa Clara, CA),
taking approximately 10 min per sample simulation and totaling about 80 h for finishing
the entire dataset.
quantifies the annual percentage of occupied hours with horizontal illuminance. Incorp
2.6. Machine Learning Algorithm
rating both glare and illuminance criteria, the UDI is widely recognized as a comprehe
As previously discussed, no single ML algorithm universally outperforms others in
sive metric
buildingfor daylight analysis
performance quality.[85].
In this study,
Predicting the UDI
algorithm was calculated
suitability by dataset
for a specific placing senso
at 0.8remains
m above floor level
challenging (desk
prior to height)
training. across
To address this,aa 1-m grid inevaluation
comparative office zones. The effect
of multiple
algorithms
illuminance was conducted
range of 300–2000to identify thetypical
lux for most accurate
officepredictor.
tasks wasGiven the impracticality
applied, and the final U
of testing all available algorithms, selection criteria focused on algorithmic strengths and
value represented the average across all sensor points.
validated application scenarios. After comprehensive analysis, three algorithms were
To enhance the representativeness, accuracy, and generalizability of the datasets
chosen for this study:
ML training, Latin hypercube sampling (LHS) was employed to generate parameter co
binations. Latin hypercube sampling (LHS) efficiently captures design space variabil
without requiring excessive samples [82], meaning it has been widely adopted in buildi
performance analysis [82–84]. This method ensures uniform parameter distribution wh
Sustainability 2025, 17, 4090 11 of 27
• Multi-Layer Perceptron (MLP): a prevalent and simple artificial neural network (ANN)
architecture comprising input, hidden, and output layers was selected for this study
due to its proven capacity to capture complex nonlinear relationships in building
performance datasets. The input layer receives the parameters of architectural form,
while the output layer generates predicted performance metrics. By incorporating
nonlinear activation functions (e.g., ReLU, Sigmoid), MLPs can capture complex input-
output relationships, making them well-suited for mapping static or low-dimensional
time-series data like building design parameters to performance outcomes.
• Support Vector Regression (SVR): originating from support vector machine (SVM)
theory [86], SVR is a regression model that maps low-dimensional data to a high-
dimensional feature space using kernel functions (e.g., RBF). By constructing an
optimal regression hyperplane, SVR effectively captures latent relationships between
input and output variables, making it well-suited for predicting building performance
metrics from design parameters. Unlike other continuous variable prediction methods,
SVR exhibits robust generalization when applied to unseen data [87], maintaining
superior predictive performance even with limited training data—a critical advantage
for building optimization workflows constrained by computational resources.
• Random Forest (RF): an ensemble learning method that constructs multiple decision
trees for classification and regression tasks, enhancing prediction accuracy and robust-
ness through aggregating tree outputs [54]. This algorithm reduces model variance
and mitigates overfitting risks via bootstrap sampling and random feature selection.
Due to its insensitivity to noise and missing values, it maintains stable performance
even with limited training data, which is a critical advantage over most other ML
models. Additionally, tree-based models are favored for their interpretability, enabling
transparent analysis of feature contributions to predictions [17].
• XGBoost: a powerful ensemble learning algorithm based on the Gradient Boosting
Decision Tree (GBDT) [88]. It enhances prediction accuracy by combining multiple de-
cision trees. Distinguishing itself from GBDT, XGBoost attains superior computational
accuracy. It leverages the second-order Taylor expansion formula and incorporates
a regularization term into the objective function, effectively mitigating overfitting
risks. Currently, it has demonstrated advantages such as fast computation speed, high
prediction accuracy, and strong robustness in regression problems and has become a
very popular algorithm.
• CatBoost: an open-source GBDT framework developed by Yandex in 2017 [89] specifi-
cally designed for handling categorical features in classification, regression, and rank-
ing tasks. Unlike traditional ML algorithms, CatBoost automates categorical feature
processing through advanced techniques such as target encoding and combinatorial
optimization, eliminating the need for manual pre-processing. This native capability
makes CatBoost particularly suitable for unstructured datasets and high-cardinality
categorical scenarios. Furthermore, it has demonstrated effectiveness in predicting
energy consumption across diverse domains [90], where it often outperforms XGBoost
in both prediction accuracy and computational efficiency.
To evaluate the predictive performance of different algorithms, three metrics were
adopted: the coefficient of determination (R²), root mean squared error (RMSE), and
coefficient of variation of RMSE (CVRMSE). R² (0 ≤ R² ≤ 1) quantifies the proportion of variance
in the dependent variable explained by the model, with values closer to 1 indicating better
fit. RMSE measures the average magnitude of prediction errors, calculated as the square
root of the mean squared deviation between predicted and observed values. CVRMSE
normalizes RMSE by the mean of observed values, yielding a dimensionless metric for
fair cross-dataset comparisons. Recommended by ASHRAE [91], CVRMSE eliminates
scale dependency and is particularly useful for benchmarking models across different
building types or climates. The mathematical expressions for these metrics are provided in
Equations (1)–(3).
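$$R^2 = 1 - \frac{\sum_{i=1}^{n} (y_i - \check{y}_i)^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2} \tag{1}$$

$$\mathrm{RMSE} = \sqrt{\frac{\sum_{i=1}^{n} (\check{y}_i - y_i)^2}{n}} \tag{2}$$

$$\mathrm{CVRMSE} = \frac{\sqrt{\sum_{i=1}^{n} (\check{y}_i - y_i)^2 / n}}{\sum_{i=1}^{n} y_i / n} \tag{3}$$

where $\check{y}_i$, $y_i$, and $\bar{y}$ represent the predicted value of sample i, the actual value of sample i,
and the mean value of all sample datasets, respectively; n denotes the number of samples.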
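Equations (1)–(3) translate into a few lines of plain Python. The sketch below follows the definitions given in the text; the function names and the four sample values are illustrative assumptions, and CVRMSE is returned as a fraction that can be multiplied by 100 to obtain a percentage.

```python
from math import sqrt
from typing import Sequence


def r2(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination, Equation (1)."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def rmse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Root mean squared error, Equation (2)."""
    n = len(y_true)
    return sqrt(sum((p - t) ** 2 for t, p in zip(y_true, y_pred)) / n)


def cvrmse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """RMSE normalized by the mean of the observed values, Equation (3)."""
    return rmse(y_true, y_pred) / (sum(y_true) / len(y_true))


# Hypothetical observed and predicted values, not data from the study.
y_true = [10.0, 12.0, 14.0, 16.0]
y_pred = [11.0, 12.0, 13.0, 16.0]
print(r2(y_true, y_pred), rmse(y_true, y_pred), 100 * cvrmse(y_true, y_pred))
```

For these placeholder values the residual sum of squares is 2 and the total sum of squares is 20, so R² is 0.9, RMSE is √0.5, and CVRMSE is √0.5 / 13.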
Figure 7. Frequency distribution plot of EUI and UDI values: (a) frequency distribution plot of EUI;
(b) frequency distribution plot of UDI.
Table 4. The training results of the machine learning models.
Model Name    R²        RMSE      CVRMSE (%)
MLP           0.8728    0.2486    5.96
SVR           0.4476    0.5182    37.89
RF            0.8224    0.2938    15.1
XGBoost       0.8672    0.2541    8.89
CatBoost      0.9406    0.1930    1.57
Figure 9 illustrates the training progression of the CatBoost model and its regression
prediction performance. During training, both training and validation losses demonstrated
sustained downward trends, indicating overall stability in the optimization process.
The validation loss stabilized below 0.1 after approximately 18,000 iterations, at
which point the model snapshot was selected as the final performance prediction surrogate
model. Regression plots reveal that predicted outputs for the training, validation, and
test sets align closely with target values along the diagonal, reflecting strong fitting
relationships. This confirms the model’s capability to capture complex non-linear relationships
between architectural form parameters and EUI/UDI.
Figure 9. Training progression of the CatBoost model and its regression prediction performance:
(a) MSE changes during iterations; (b) regression of training data; (c) regression of validation
data; (d) regression of test data.
Figure 10. Distribution of ML predictions, Shenzhen’s performance simulation results, and adjusted
values incorporating climatic differences for 50 random parameter samples.
Figure 11. SHAP importance ranking of each parameter for EUI and UDI: (a) SHAP importance
ranking for EUI; (b) SHAP importance ranking for UDI.
In Figure 11a, HSS emerges as the most impactful parameter for EUI among all form
factors, with a SHAP importance exceeding twice that of any other parameter. Parameters
such as D, WSH, FH, VSD, and VSS show moderate effects on EUI, while O, R, W, and CH
have negligible influences. For UDI, Figure 11b shows that HSS again has the strongest
effect on UDI, parallel to its role in EUI, while D and FH exhibit notably significant
impacts, with WSH, VSD, and CH contributing moderately. The remaining parameters, by
contrast, have minimal effects on UDI.
Collectively, these results highlight that HSS is a critical factor for both EUI and UDI,
underscoring the importance of horizontal shading design in the HSWW zone, which is
consistent with the prior Pearson analysis. Parameters D, FH, and WSH rank second, third,
and fourth in importance, indicating substantial effects of spatial depth, floor height, and
windowsill height on high-rise building performance. The lower importance of the vertical
sunshade parameters (VSS, VSD) indicates that horizontal sunshades are more critical than
their vertical counterparts, and the higher ranking of WSH compared to CH implies greater
optimization potential in adjusting windowsill height. Notably, D stands out as the only
planar parameter with high importance, emphasizing that spatial depth impacts performance
more significantly than building dimensions or shape.
Unlike importance ranking, SHAP beeswarm plots show the direction of each parameter's
effect on the performance metrics: a positive SHAP value means the parameter pushes the
prediction for that sample above the model's average output, a negative value pushes it
below, and a larger absolute value reflects a stronger influence. Additionally, the color
gradient of samples in these plots shows the relationship between parameter values and
SHAP values, where redder colors indicate larger parameter values within their respective
ranges, and bluer tones signify smaller values. A parameter is therefore read as positively
correlated with a metric when its high-value (red) samples lie on the positive SHAP side,
and as negatively correlated when they lie on the negative side.
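The visual reading of a beeswarm plot can be approximated numerically. The following minimal sketch labels the direction of a parameter's effect from the Pearson correlation between its values and its SHAP values. The data are invented, and both the helper names and the 0.3 threshold are arbitrary assumptions made for illustration.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 if either series is constant."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return 0.0 if sx == 0 or sy == 0 else cov / (sx * sy)


def effect_direction(feature_values, shap_values, threshold=0.3):
    """Classify the trend between parameter values and SHAP values (threshold is assumed)."""
    r = pearson(feature_values, shap_values)
    if r > threshold:
        return "positive"
    if r < -threshold:
        return "negative"
    return "no clear trend"


# Toy example: larger values come with lower SHAP values for the first parameter.
param_a_values = [0.5, 0.8, 1.1, 1.5]
param_a_shap = [0.6, 0.2, -0.3, -0.7]
# Toy example: SHAP values unrelated to the parameter value.
param_b_values = [0.2, 0.6, 1.0, 1.4]
param_b_shap = [0.05, -0.05, -0.05, 0.05]

print(effect_direction(param_a_values, param_a_shap))
print(effect_direction(param_b_values, param_b_shap))
```

A linear correlation misses non-monotonic patterns, such as an intermediate value being best, so it complements the plot rather than replacing it.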
Analysis of the SHAP importance ranking plot also reveals that the SHAP importance
of parameter CH for EUI is negligible, while its importance for UDI remains moderate.
Although removing this parameter from ML model training might theoretically improve
prediction performance, testing results indicate that excluding CH causes the model’s
overall R2 to drop from 0.94 to 0.70. This suggests that CH still plays a significant role in
maintaining prediction accuracy.
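For readers who want to reproduce this kind of with/without comparison, the coefficient of determination can be computed as in the sketch below. The target values and the two prediction sets are toy numbers, not the study's data, and `r_squared` is a hypothetical helper using the standard definition.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Toy targets and two hypothetical models (e.g., trained with and without one parameter).
y_true = [10.0, 20.0, 30.0, 40.0]
pred_full = [11.0, 19.0, 31.0, 39.0]
pred_reduced = [20.0, 20.0, 30.0, 30.0]

r2_full = r_squared(y_true, pred_full)
r2_reduced = r_squared(y_true, pred_reduced)
print(f"full: {r2_full:.3f}, reduced: {r2_reduced:.3f}")
```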
Figure 12a presents the SHAP beeswarm plot for EUI, showing that W, R, D, WSH,
HSS, and VSS correlate negatively with EUI. Among these parameters, HSS, WSH, and D
exhibit particularly pronounced negative correlations, indicating that the larger sizes of
horizontal sunshades, higher windowsills, and greater spatial depth reduce energy use
in high-rise offices of the HSWW climate zone. Conversely, FH and VSD show positive
correlations, indicating lower floor heights and smaller vertical shading distances are
linked to lower energy consumption. O displays a non-linear relationship with energy
use: intermediate O values (corresponding to south-facing orientations) are associated
with reduced energy consumption, whereas decreases in O (west-facing) or increases in O
(east-facing) correlate with higher energy use—likely due to balanced sunlight on south
facades reducing extreme thermal loads compared to east/west orientations; CH has no
significant directional trend, suggesting minimal direct impact on EUI.
Figure 12. SHAP beeswarm plots for EUI and UDI: (a) beeswarm plot of EUI; (b) beeswarm plot of UDI.
Figure 12b presents the SHAP beeswarm plot for UDI, revealing that W, D, and VSD
exhibit strong negative correlations with UDI, indicating that smaller plan widths,
shallower spatial depths, and narrower vertical shading intervals enhance UDI. R demonstrates
a moderate negative relationship with UDI, suggesting that building plans closer to a
square configuration (lower R values) correlate with higher UDI; notably, a subset of
samples with low R values display negative SHAP values, potentially attributable to model
uncertainty or edge-case scenarios. Conversely, O, CH, and HSS show positive correlations
with UDI: east-facing orientations, increased ceiling heights, and larger horizontal shading
components consistently improve UDI performance. For HSS in particular, the beneficial
effect is hypothesized to arise from its ability to filter direct solar radiation while preserving
diffused light. In addition, FH, WSH, and VSS exhibit no clear directional trends, a result likely
arising from strong interactive effects with other parameters.
Overall, the interpretability analysis of the prediction model using SHAP reveals the
following results: Parameters HSS, D, FH, WSH, VSS, and VSD have substantial impacts on
prediction outcomes. Larger horizontal and vertical sunshade sizes and slightly
southeast-facing orientations can reduce EUI while improving UDI. Conversely, greater
building plan width, plan aspect ratio, and spatial depth lower EUI but diminish UDI.
Shorter floor heights (FH), higher windowsill heights (WSH), and narrower vertical
sunshade intervals reduce EUI, though their effects on UDI require evaluation in conjunction
with other parameters. Ceiling height has a minimal impact on EUI, yet larger CH values
contribute to higher UDI.
These conclusions align well with practical design experience, underscoring the
prediction model's strong interpretability (all parameters exert discernible effects
on performance, without irrelevant factors) and thereby validating its credibility. Thus, this
prediction model can be employed as a reliable surrogate for form parameter optimization
via NSGA-II.
3.4. Performance and Analysis of Optimization
Leveraging the predictive capabilities of the surrogate model, the NSGA-II algorithm
was implemented in Python to derive Pareto optimal solutions for EUI and UDI.
Approximately 2 min of optimization yielded 65 Pareto-optimal solutions.
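The notion of Pareto optimality used here can be made concrete with a small non-dominated filter. This is only the dominance test at the core of NSGA-II, not the full algorithm (no crossover, mutation, or crowding distance), and the candidate values are invented. EUI is assumed to be minimized and UDI maximized.

```python
def dominates(a, b):
    """True if solution a = (eui, udi) is at least as good as b in both objectives
    (lower EUI, higher UDI) and strictly better in at least one."""
    return a[0] <= b[0] and a[1] >= b[1] and (a[0] < b[0] or a[1] > b[1])


def pareto_front(points):
    """Keep only the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical (EUI, UDI) predictions from a surrogate model; not the paper's data.
candidates = [(30.0, 70.0), (31.0, 75.0), (32.0, 74.0), (33.0, 80.0), (30.5, 69.0)]
front = pareto_front(candidates)
print(front)
```

In a surrogate-assisted workflow, each candidate's objectives come from model predictions rather than simulations, which is what makes evaluating many generations inexpensive.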
Figure 13 reveals that the solutions closely align with a quadratic curve
($y = -29.65x^2 + 1946x - 31{,}849$), reflecting a non-linear and competitive trade-off between
EUI and UDI. In the initial segment, UDI increases at a higher rate with rising EUI, whereas in
the latter part, UDI growth decelerates as EUI increases. Consequently, the convex point of the
parabola is identified as the optimal design solution balancing both performance metrics.
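The diminishing return described above follows directly from the fitted coefficients. The sketch below evaluates the reported curve and its slope and locates the vertex of the parabola. The sample EUI values are illustrative and not taken from the Pareto set, and treating the vertex as the "convex point" is an assumption of this sketch.

```python
# Coefficients of the fitted curve reported for Figure 13 (y = UDI, x = EUI).
A, B, C = -29.65, 1946.0, -31849.0


def udi_fit(eui):
    """UDI predicted by the fitted quadratic for a given EUI."""
    return A * eui**2 + B * eui + C


def udi_slope(eui):
    """Derivative dUDI/dEUI of the fitted quadratic."""
    return 2 * A * eui + B


# Vertex of the parabola, where the slope is zero.
vertex_eui = -B / (2 * A)

for eui in (32.0, 32.4, 32.8):  # illustrative EUI values only
    print(f"EUI={eui:.1f}  UDI_fit={udi_fit(eui):.2f}  slope={udi_slope(eui):.2f}")
print(f"vertex at EUI={vertex_eui:.3f}")
```

Because the quadratic coefficient is negative, the slope falls linearly with EUI, so each further unit of EUI gains less UDI until the vertex is reached.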
Figure 13. The distribution of all Pareto-optimal solutions of EUI and UDI generated by NSGA-II.
The distribution approximately fits a quadratic curve ($y = -29.65x^2 + 1946x - 31{,}849$, $R^2 = 0.99$),
where y represents UDI and x represents EUI, illustrating a non-linear trade-off between them.
Figure 14. The distribution of Pareto solutions of EUI and UDI by re-simulation based on
Pareto-optimal solutions generated by NSGA-II.
Figure 15. Distribution of normalized parameter values for eight solutions with difference rates
exceeding 5% (orange) and other Pareto-optimal solutions (blue).
After excluding the 8 solutions with difference rates > 5%, 57 Pareto-optimal solutions
in close agreement with predictions were retained. Table 5 lists the maximum,
median, mean, and minimum values for the two performances, along with the baseline
model's values.
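The screening step can be sketched as follows. The exact definition of the difference rate is not spelled out in this section, so the signed relative difference between the predicted and re-simulated value, and the rule that a solution is rejected when either metric exceeds the tolerance, are assumptions. The three solutions are invented.

```python
def difference_rate(predicted, simulated):
    """Signed relative difference of the prediction from the re-simulated value, in percent
    (assumed definition)."""
    return (predicted - simulated) / simulated * 100.0


def split_by_tolerance(solutions, tol_pct=5.0):
    """Separate solutions whose EUI and UDI predictions both lie within the tolerance."""
    kept, rejected = [], []
    for s in solutions:
        eui_diff = abs(difference_rate(s["eui_pred"], s["eui_sim"]))
        udi_diff = abs(difference_rate(s["udi_pred"], s["udi_sim"]))
        (kept if max(eui_diff, udi_diff) <= tol_pct else rejected).append(s)
    return kept, rejected


# Toy solutions; NOT the study's results.
solutions = [
    {"eui_pred": 32.0, "eui_sim": 32.1, "udi_pred": 78.0, "udi_sim": 79.0},
    {"eui_pred": 31.0, "eui_sim": 33.0, "udi_pred": 75.0, "udi_sim": 74.5},
    {"eui_pred": 33.0, "eui_sim": 33.2, "udi_pred": 80.0, "udi_sim": 75.0},
]
kept, rejected = split_by_tolerance(solutions)
print(len(kept), "kept;", len(rejected), "rejected")

# Share of retained solutions for the counts reported in the text (57 of 65).
retained_share = 57 / 65 * 100
print(f"{retained_share:.1f}% retained")
```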
Table 5. Maximum, median, mean, and minimum values of the Pareto-optimal solutions for EUI/UDI,
along with baseline model values.
Figure 16. Statistical distribution and median of each planar parameter for the Pareto-optimal solutions.
Figure 16 illustrates the distribution of planar parameters for Pareto-optimal solutions.
Orientations within the Pareto set cluster at 105° (15° east of south), with secondary
frequency peaks at 90° (south) and 120° (30° east of south). This clustering indicates
that southeast/south-facing orientations are optimal for high-rise office buildings in this
region, as they align with prevailing monsoon patterns, enhancing natural ventilation
and reducing west-facing solar heat gain, which minimizes cooling loads. The median
plan width and length values are 40.2 m and 58.3 m, and a dominant 1.45 aspect ratio
(length/width) implies that increasing the south-facing facade area improves performance.
Spatial depth and plan area cluster at 9 m/1500 m² and 12 m/2500 m², indicating an
adaptive strategy balancing daylight access (shallow and small plans) and heat reduction
(deep and large plans).
Figure 17. Statistical distribution and median of each vertical parameter for the Pareto-optimal solutions.
Figure 18. Statistical distribution and median of each shading parameter for the Pareto-optimal solutions.
Figure 17 shows the distribution of vertical parameters for Pareto-optimal solutions.
Floor heights ranging from 4.1 to 4.3 m reflect a balance between daylight volume and
spatial comfort. The maximum allowable values for windowsill (1.2 m) and ceiling heights
(1.4 m) indicate that reduced window area contributes to enhanced overall building
performance, which is supported by window height distributions (1.4–1.9 m) and WWR (~0.3).
Figure 18 presents the distribution of shading parameters for Pareto-optimal solutions.
The uniform adoption of 1.5 m horizontal sunshade size and 3 m vertical sunshade intervals,
both representing minimum design thresholds, highlights their critical role in performance
optimization. While vertical sunshade sizes contribute less significantly than horizontal
shading, larger vertical sunshades are still preferentially selected in optimal solutions.
This parameter distribution evinces their supplementary role in mitigating solar heat gain and
reducing excessive direct solar radiation that causes indoor overheating and discomfort
glare, thereby minimizing cooling loads while addressing glare-related visual disturbances.
Collectively, these findings identify enlarged shading systems, southeast/south
orientations, substantially large or small spatial depths, high aspect ratios, and reduced window
areas as key determinants in balancing EUI and UDI for high-rise office building
performance in China’s HSWW zone. Additionally, the observed parameter distribution patterns
and median values of Pareto-optimal solutions provide meaningful design guidelines for
such buildings in this climate zone, offering valuable references for future practices that
balance energy efficiency and visual comfort criteria.
4. Conclusions
This study proposes a performance-oriented optimization method for building mor-
phology, integrating machine learning (ML) algorithms with genetic algorithms (GAs), and
establishes a complete workflow. Guided by this framework, this paper uses Guangzhou
as a case study in China’s HSWW climate zone to optimize the form parameters of local
high-rise office buildings. The primary findings are as follows:
• Through comparative analysis of multiple ML algorithms, ensemble ML algorithms
are found to effectively capture the complex nonlinear relationships between build-
ing form parameters and performance metrics. Among them, the CatBoost algo-
rithm demonstrates the best predictive performance for this study’s target (R2 = 0.94,
CVRMSE = 1.59%).
• SHAP analysis shows that horizontal sunshade size (HSS), spatial depth (D), floor
height (FH), windowsill height (WSH), vertical sunshade size (VSS), and vertical
shading distance (VSD) strongly influence the predictions of the machine learning
model. Increasing horizontal sunshade size, decreasing vertical shading distance, and
rotating the building orientation slightly toward the southeast are the most effective
adjustments for performance optimization, reducing EUI while improving UDI. In general,
SHAP analysis indicates that shading parameters have the greatest effect on performance
results, followed by vertical parameters, with planar parameters exerting the smallest
influence.
• The Pareto-optimal morphological parameters generated by the surrogate model show
good agreement with their corresponding actual simulation results, with 87.7% (57 out
of 65) of the results having an error rate below 5% and an average error rate of 0.34%
for EUI and −1.4% for UDI. This demonstrates the effectiveness of the integrated
optimization approach using machine learning and genetic algorithms.
• Compared to the baseline model, a Pareto-optimal solution achieves a 3.31% reduction
in EUI and a 5.12% increase in UDI.
• Based on the Pareto-optimal solutions, the following design strategies for form
parameters are proposed to fully enhance the energy-saving potential of high-rise office
buildings in China’s HSWW zone: (1) adopting a building orientation ranging from
due south to 30 degrees east of south; (2) using a rectangular floor plan measuring
approximately 40 m in width and 58 m in length (an aspect ratio of 1.45, total area of
about 2300 m², and office area depth of 12 m); (3) implementing a facade design with a
floor height of 4.0–4.2 m, the largest possible windowsill and ceiling heights, and a
window-to-wall ratio of 0.37–0.45; and (4) employing horizontal and vertical sunshades
longer than 1.3 m as well as high-density vertical sunshades.
However, this research has several limitations. First, the parametric model only uses
10 constrained form parameters, which are insufficient for designs requiring greater preci-
sion, although they are suitable for most early-stage design processes. Second, only two
performance objectives were selected, lacking consideration of other metrics (e.g., thermal
comfort, carbon emissions). Third, the algorithm comparison was limited to five well-
established classical algorithms, without exploring advanced algorithms or deep neural
network approaches. Additionally, this study only chose Guangzhou as a representative of
this climate zone. Although cities within the same climate zone share similar climatic
conditions, there are still subtle differences between them, which may lead to certain deviations
in the optimal results.
Future research should focus on the following directions: (1) applying the proposed
method to building designs with more parameters, such as the rotation angles of shading
devices and other geometric details; (2) incorporating additional performance objectives;
(3) investigating advanced machine learning (ML) algorithms to improve prediction accu-
racy; (4) extending this research method to more cities within this climate zone to obtain
more precise form optimization results and attempting to develop a universal prediction
model applicable to all major cities in this climate zone; (5) applying this method to the form
optimization of high-rise office buildings in other climatic zones of China; and (6) extending
the simulation duration to account for long-term climate change impacts.
Author Contributions: Conceptualization, X.X. and Y.N.; methodology, X.X. and Y.N.; software, X.X.;
validation, X.X., Y.N. and T.Z.; formal analysis, X.X. and T.Z.; investigation, X.X.; resources, X.X. and
Y.N.; data curation, X.X.; writing—original draft preparation, X.X.; writing—review and editing, X.X.;
visualization, X.X.; supervision, Y.N.; project administration, X.X. and Y.N.; funding acquisition, Y.N.
All authors have read and agreed to the published version of the manuscript.
Funding: This research was funded and supported by State Key Laboratory of Subtropical Building
Science, South China University of Technology for the project titled “Comprehensive Demonstration
of Green and Low-Carbon Construction Technologies for Buildings and Cities in China’s Hot-Summer
and Warm-Winter (HSWW) zone” (Grant No. 2022KC16).
Data Availability Statement: The data presented in this study are available on request from the
corresponding author.
Conflicts of Interest: Author Xie Xie was a PhD student at South China University of Technology
and was interning at the company Architectural Design & Research Institute of South China
University of Technology (SCUT) Co., Ltd. Author Yang Ni was a professor at South China University
of Technology and was employed by the company Architectural Design & Research Institute of South
China University of Technology (SCUT) Co., Ltd. Author Tianzi Zhang was employed by the company
Guangzhou International Engineering Consult Co., Ltd. The authors declare that the research was
conducted in the absence of any commercial or financial relationships that could be construed as a
potential conflict of interest.
Abbreviations
The following abbreviations are used in this manuscript:
References
1. WMO Confirms That 2023 Smashes Global Temperature Record. Available online: https://wmo.int/news/media-centre/wmo-
confirms-2023-smashes-global-temperature-record (accessed on 15 March 2025).
2. The 2019 Blue Book on Climate Change in China. Available online: https://www.cma.gov.cn/zfxxgk/gknr/qxbg/201905/t20190
524_1709279.html (accessed on 15 March 2025).
Sustainability 2025, 17, 4090 24 of 27
3. United Nations; Intergovernmental Panel on Climate Change (IPCC). Climate Change 2013: The Physical Science Basis; Plattner, M.,
Ed.; Cambridge University Press: Cambridge, UK, 2013.
4. Geng, Y.; Sarkis, J. China-US trade spat could hit the environment. Nature 2018, 557, 309. [CrossRef] [PubMed]
5. China Association of Building Energy Efficiency; Chongqing University. Research Report on Energy Consumption and Carbon
Emission of Buildings in China (2023). Construct Archit. 2024, 2024, 46–59.
6. Reshaping Energy: A Study on the Roadmap of China’s Energy Consumption and Production Revolution Towards 2050. 2016.
Available online: https://china.lbl.gov (accessed on 15 March 2025).
7. China Association of Building Energy Efficiency. 2022 Research Report of China Building Energy Consumption and Carbon Emissions;
China Association of Building Energy Efficiency: Chongqing, China, 2023; Volume 27, p. 12.
8. Ma, M.; Cai, W.; Wu, Y. China Act on the Energy Efficiency of Civil Buildings (2008): A decade review. Sci. Total Environ. 2019,
651, 42–60. [CrossRef] [PubMed]
9. Song, L.; Zhang, C.; Li, H.J. 2015 National Green Building Evaluation Label Statistical Report. Constr. Sci. Technol. 2016, 10, 12–15.
[CrossRef]
10. Ruparathna, R.; Hewage, K.; Sadiq, R. Improving the energy efficiency of the existing building stock: A critical review of
commercial and institutional buildings. Renew. Sustain. Energy Rev. 2016, 53, 1032–1045. [CrossRef]
11. The Ministry of Housing and Urban Rural Development of China. Several Opinions of the Ministry of Housing and Urban Rural
Development on Promoting the Development and Reform of the Construction Industry. Intell. Build. City Inf. 2014, 7, 24–28.
12. American Society of Heating, Refrigeration and Air-Conditioning Engineers. ASHRAE Handbook: Fundamentals; ASHRAE:
Atlanta, GA, USA, 2009.
13. EnergyPlus. Available online: https://energyplus.net/ (accessed on 15 March 2025).
14. TRNSYS: Transient System Simulation Tool. Available online: http://www.trnsys.com/ (accessed on 15 March 2025).
15. Lin, B.; Chen, H.; Liu, Y.; He, Q.; Li, Z. A Preference-Based Multi-Objective Building Performance Optimization Method for Early
Design Stage. Build. Simul. 2021, 14, 477–494. [CrossRef]
16. Li, Y.; O’Neill, Z.; Zhang, L.; Chen, J.; Im, P.; DeGraw, J. Grey-box modeling and application for building energy simulations—A
critical review. Renew. Sustain. Energy Rev. 2021, 146, 111–174. [CrossRef]
17. Manmatharasan, P.; Bitsuamlak, G.; Grolinger, K. AI-driven design optimization for sustainable buildings: A systematic review.
Energy Build. 2025, 332, 115440. [CrossRef]
18. Clarke, J.A.; Clarke, J.A. Energy Simulation in Building Design; Routledge: London, UK, 2001.
19. Javanroodi, K.; Nik, V.M.; Mahdavinejad, M. A novel design—Based optimization framework for enhancing the energy efficiency
of high-rise office buildings in urban areas. Sustain. Cities Soc. 2019, 49, 101577. [CrossRef]
20. Božiček, D.; Kunič, R.; Krainer, A.; Stritih, U.; Dovjak, M. Mutual Influence of External Wall Thermal Transmittance, Thermal
Inertia, and Room Orientation on Office Thermal Comfort and Energy Demand. Energies 2023, 16, 3524. [CrossRef]
21. Soflaei, F.; Shokouhian, M.; Tabadkani, A.; Moslehi, H.; Berardi, U. A simulation-based model for courtyard housing design based
on adaptive thermal comfort. J. Build. Eng. 2020, 31, 101335. [CrossRef]
22. Du, Y.; Mak, C.M.; Li, Y. A multi-stage optimization of pedestrian level wind environment and thermal comfort with lift-up
design in ideal urban canyons. Sustain. Cities Soc. 2019, 46, 101424. [CrossRef]
23. Moazzeni, M.H.; Ghiabaklou, Z. Investigating the Influence of Light Shelf Geometry Parameters on Daylight Performance and
Visual Comfort, a Case Study of Educational Space in Tehran, Iran. Buildings 2016, 6, 26. [CrossRef]
24. Alhagla, K.; Mansour, A.; Elbassuoni, R. Optimizing windows for enhancing daylighting performance and energy saving. Alex.
Eng. J. 2019, 58, 283–290. [CrossRef]
25. Susa-Páez, A.; Piderit-Moreno, M.B. Geometric Optimization of Atriums with Natural Lighting Potential for Detached High-Rise
Buildings. Sustainability 2020, 12, 6651. [CrossRef]
26. Gan, V.J.L.; Wang, B.; Chan, C.M.; Weerasuriya, A.U.; Cheng, J.C.P. Physics-based, data-driven approach for predicting natural
ventilation of residential high-rise buildings. Build. Simul. 2022, 15, 129–148. [CrossRef]
27. Østergård, T.; Jensen, R.L.; Maagaard, S.E. Building simulations supporting decision making in early design—A review. Renew.
Sustain. Energy Rev. 2016, 61, 187–201. [CrossRef]
28. Wortmann, T.; Cichocka, J.; Waibel, C. Simulation-based optimization in architecture and building engineering—Results from an
international user survey in practice and research. Energy Build. 2022, 259, 111863. [CrossRef]
29. Radford, A.D.; Gero, J.S. On optimization in computer aided architectural design. Build. Environ. 1980, 15, 73–80. [CrossRef]
30. Deb, K. Multi-Objective Optimization Using Evolutionary Algorithm; John Wiley & Sons: Hoboken, NJ, USA, 2001; p. 497.
31. Longo, S.; Montana, F.; Riva Sanseverino, E. A review on optimization and cost-optimal methodologies in low-energy buildings
design and environmental considerations. Sustain. Cities Soc. 2019, 45, 87–104. [CrossRef]
32. Attia, S. Computational Optimisation for Zero Energy Building Design, Interviews with Twenty Eight International Experts.
In Proceedings of the Building Simulation 2013—13th International IBPSA Conference, Chambery, France, 25–28 August 2012;
Architecture et Climat: Paris, France, 2012.
Sustainability 2025, 17, 4090 25 of 27
33. Wetter, M.; Wright, J.A. A comparison of deterministic and probabilistic optimization algorithms for nonsmooth
simulation—Based optimization. Build. Environ. 2004, 39, 989–999. [CrossRef]
34. Hamdy, M.; Nguyen, A.-T.; Hensen, J.L.M. A performance comparison of multi-objective optimization algorithms for solving
nearly-zero-energy-building design problems. Energy Build. 2016, 121, 57–71. [CrossRef]
35. modeFRONTIER. Available online: http://www.esteco.com/modefrontier (accessed on 15 March 2025).
36. Octopus. Available online: https://www.grasshopper3d.com/group/octopus?overrideMobileRedirect=1 (accessed on
15 March 2025).
37. Alelwani, R.; Ahmad, M.W.; Rezgui, Y.; Alshammari, K. Optimising Energy Efficiency and Daylighting Performance for Designing
Vernacular Architecture—A Case Study of Rawshan. Sustainability 2025, 17, 315. [CrossRef]
38. Wang, M.; Xu, Y.; Shen, R.; Wu, Y. Performance—Oriented Parametric Optimization Design for Energy Efficiency of Rural
Residential Buildings: A Case Study from China’s Hot Summer and Cold Winter Zone. Sustainability 2024, 16, 8330. [CrossRef]
39. Chaturvedi, S.; Rajasekar, E.; Natarajan, S. Multi-objective Building Design Optimization under Operational Uncertainties Using
the NSGA II Algorithm. Buildings 2020, 10, 88. [CrossRef]
40. Zhao, J.; Du, Y. Multi-objective optimization design for windows and shading configuration considering energy consumption and
thermal comfort: A case study for office building in different climatic regions of China. Sol. Energy 2020, 206, 997–1017. [CrossRef]
41. Amasyali, K.; El-Gohary, N.M. A review of data-driven building energy consumption prediction studies. Renew. Sustain. Energy
Rev. 2018, 81, 1192–1205. [CrossRef]
42. Qiao, Q.; Yunusa-Kaltungo, A.; Edwards, R.E. Towards developing a systematic knowledge trend for building energy consumption
prediction. J. Build. Eng. 2021, 35, 101967. [CrossRef]
43. Zhang, L.; Wen, J.; Li, Y.; Chen, J.; Ye, Y.; Fu, Y.; Livingood, W. A review of machine learning in building load prediction. Appl.
Energy 2021, 285, 116452. [CrossRef]
44. Kalogirou, S.A. Applications of artificial neural-networks for energy systems. Appl. Energy 2000, 67, 17–35. [CrossRef]
45. Wong, S.L.; Wan, K.K.W.; Lam, T.N.T. Artificial neural networks for energy analysis of office buildings with daylighting. Appl.
Energy 2010, 87, 551–557. [CrossRef]
46. Moon, J.W.; Kim, J.-J. ANN-based thermal control models for residential buildings. Build. Environ. 2010, 45, 1612–1625. [CrossRef]
47. Geyer, P.; Singaravel, S. Component-based machine learning for performance prediction in building design. Appl. Energy 2018,
228, 1439–1453. [CrossRef]
48. Shao, M.; Wang, X.; Bu, Z.; Chen, X.; Wang, Y. Prediction of energy consumption in hotel buildings via support vector machines.
Sustain. Cities Soc. 2020, 57, 102128. [CrossRef]
49. Liu, Y.; Chen, H.; Zhang, L.; Wu, X.; Wang, X. -J. Energy consumption prediction and diagnosis of public buildings based on
support vector machine learning: A case study in China. J. Clean. Prod. 2020, 272, 122542. [CrossRef]
50. Cai, W.; Wen, X.; Li, C.; Shao, J.; Xu, J. Predicting the energy consumption in buildings using the optimized support vector
regression model. Energy 2023, 273, 127188. [CrossRef]
51. Wu, C.; Pan, H.; Luo, Z.; Liu, C.; Huang, H. Multi-objective optimization of residential building energy consumption, daylighting,
and thermal comfort based on BO-XGBoost-NSGA-II. Build. Environ. 2024, 254, 111386. [CrossRef]
52. Yan, K.; Li, W.; Ji, Z.; Qi, M.; Du, Y. A Hybrid LSTM Neural Network for Energy Consumption Forecasting of Individual
Households. IEEE Access 2019, 7, 157633–157642. [CrossRef]
53. Yu, Z.; Haghighat, F.; Fung, B.C.M.; Yoshino, H. A decision tree method for building energy demand modeling. Energy Build.
2010, 42, 1637–1646. [CrossRef]
54. Breiman, L. Random forests. Mach. Learn. 2001, 45, 5–32. [CrossRef]
55. Wang, Z.; Wang, Y.; Zeng, R.; Srinivasan, R.S.; Ahrentzen, S. Random Forest based hourly building energy prediction. Energy
Build. 2018, 171, 11–25. [CrossRef]
56. Pham, A.-D.; Ngo, N.-T.; Truong, T.T.H.; Huynh, N.-T.; Truong, N.-S. Predicting energy consumption in multiple buildings using
machine learning for improving energy efficiency and sustainability. J. Clean. Prod. 2020, 260, 121082. [CrossRef]
57. Chi, B.; Li, Y.; Zhou, D. A Hybrid Method of Cooling and Heating Consumption Prediction for Six Types of Buildings Based on
Machine Learning. Sustainability 2024, 16, 11200. [CrossRef]
58. Safarzadegan Gilan, S.; Goyal, N.; Dilkina, B. Active learning in multi-objective evolutionary algorithms for sustainable building
design. In Proceedings of the Genetic and Evolutionary Computation Conference 2016, Denver, CO, USA, 20–24 July 2016.
59. Chen, X.; Yang, H. A multi-stage optimization of passively designed high-rise residential buildings in multiple building operation
scenarios. Appl. Energy 2017, 206, 541–557. [CrossRef]
60. Gou, S.; Nik, V.M.; Scartezzini, J.L.; Zhao, Q.; Li, Z. Passive design optimization of newly-built residential buildings in Shanghai
for improving indoor thermal comfort while reducing building energy demand. Energy Build. 2018, 169, 484–506. [CrossRef]
61. Ilbeigi, M.; Ghomeishi, M.; Dehghanbanadaki, A. Prediction and optimization of energy consumption in an office building using
artificial neural network and a genetic algorithm. Sustain. Cities Soc. 2020, 61, 102325. [CrossRef]
62. Chen, R.; Tsay, Y.-S.; Ni, S. An integrated framework for multi-objective optimization of building performance: Carbon emissions,
thermal comfort, and global cost. J. Clean. Prod. 2022, 359, 131978. [CrossRef]
63. Ding, Z.; Li, J.; Wang, Z.; Xiong, Z. Multi-Objective Optimization of Building Envelope Retrofits Considering Future Climate
Scenarios: An Integrated Approach Using Machine Learning and Climate Models. Sustainability 2024, 16, 8217. [CrossRef]
64. Si, B.; Ni, Z.; Xu, J.; Li, Y.; Liu, F. Interactive effects of hyperparameter optimization techniques and data characteristics on
the performance of machine learning algorithms for building energy metamodeling. Case Stud. Therm. Eng. 2024, 55, 104124.
[CrossRef]
65. Al-Masrani, S.M.; Al-Obaidi, K.M. Dynamic shading systems: A review of design parameters, platforms and evaluation strategies.
Autom. Constr. 2019, 102, 195–216. [CrossRef]
66. Zhou, F.; Wang, Z.; Su, X.; Yang, Y.; Duanmu, L.; Zhou, X.; Lian, Z.; Zhai, Y.; Cao, B.; Zhang, Y.; et al. Study on the Thermal
Adaptation Model During the Transition Season in Hot Summer and Cold Winter Regions. Heat. Vent. Air Cond. 2022, 52, 132–136.
[CrossRef]
67. Kheiri, F. A review on optimization methods applied in energy-efficient building geometry and envelope design. Renew. Sustain.
Energy Rev. 2018, 92, 897–920. [CrossRef]
68. Li, S.; Liu, L.; Peng, C. A Review of Performance-Oriented Architectural Design and Optimization in the Context of Sustainability:
Dividends and Challenges. Sustainability 2020, 12, 1427. [CrossRef]
69. Xuanyuan, P.; Zhang, Y.; Yao, J.; Zheng, R. Sensitivity Analysis and Optimization of Energy-Saving Measures for Office Building
in Hot Summer and Cold Winter Regions. Energies 2024, 17, 1675. [CrossRef]
70. Ma, Y.; Deng, W.; Xie, J.; Heath, T.; Xiang, Y.; Hong, Y. Generating prototypical residential building geometry models using a new
hybrid approach. Build. Simul. 2022, 15, 17–28. [CrossRef]
71. Touloupaki, E.; Theodosiou, T. Performance Simulation Integrated in Parametric 3D Modeling as a Method for Early Stage Design
Optimization—A Review. Energies 2017, 10, 637. [CrossRef]
72. Honeybee for Grasshopper. Available online: https://github.com/mostaphaRoudsari/Honeybee/ (accessed on 15 March 2025).
73. Ward, G.J. The Radiance lighting simulation and rendering system. In Proceedings of the 21st Annual Conference on Computer
Graphics and Interactive Techniques, SIGGRAPH, Orlando, FL, USA, 24–29 July 1994.
74. GB 55015-2021; General Specification for Building Energy Efficiency and Renewable Energy Utilization. China Architecture &
Building Press: Beijing, China, 2022.
75. GB 50189-2015; General Administration of Quality Supervision, Inspection and Quarantine of the People’s Republic of China.
Design Standard for Energy Efficiency of Public Buildings. China Architecture & Building Press: Beijing, China, 2015.
76. ASHRAE Standard 90.1-2019; Energy Standard for Buildings Except Low-Rise Residential Buildings. ASHRAE: Atlanta, GA,
USA, 2019.
77. GB 50352-2019; Unified Standard for Civil Building Design. China Architecture & Building Press: Beijing, China, 2014.
78. Honeybee. Available online: https://www.ladybug.tools/honeybee.html (accessed on 15 March 2025).
79. Roudsari, M.S.; Pak, M.; Smith, A. Ladybug: A parametric environmental plugin for grasshopper to help designers create an
environmentally-conscious design. In Proceedings of the 13th International IBPSA Conference, Lyon, France, 25–28 August 2013;
Volume 8.
80. Negendahl, K.; Nielsen, T.R. Building energy optimization in the early design stages: A simplified method. Energy Build. 2015,
105, 88–99. [CrossRef]
81. Nabil, A.; Mardaljevic, J. Useful daylight illuminance: A new paradigm for assessing daylight in buildings. Light Res. Technol.
2005, 37, 41–57. [CrossRef]
82. Tian, W. A review of sensitivity analysis methods in building energy analysis. Renew. Sustain. Energy Rev. 2013, 20, 411–419.
[CrossRef]
83. Mahmoud, A.H.A.; Elghazi, Y. Parametric-based designs for kinetic facades to optimize daylight performance: Comparing
rotation and translation kinetic motion for hexagonal facade patterns. Solar Energy 2016, 126, 111–127. [CrossRef]
84. Helton, J.C.; Johnson, J.D.; Sallaberry, C.J.; Storlie, C.B. Survey of sampling-based methods for uncertainty and sensitivity analysis.
Reliab. Eng. Syst. Saf. 2006, 91, 1175–1209. [CrossRef]
85. Ascione, F.; Bianco, N.; De Stasio, C.; Mauro, G.M.; Vanoli, G.P. Artificial neural networks to predict energy performance and
retrofit scenarios for any member of a building category: A novel approach. Energy 2017, 118, 999–1017. [CrossRef]
86. Drucker, H.; Burges, C.J.; Kaufman, L.; Smola, A.; Vapnik, V. Support vector regression machines. Adv. Neural Inf. Process. Syst.
1997, 9, 155–161.
87. Jain, R.K.; Smith, K.M.; Culligan, P.J.; Taylor, J.E. Forecasting energy consumption of multi-family residential buildings using
support vector regression: Investigating the impact of temporal and spatial monitoring granularity on performance accuracy.
Appl. Energy 2014, 123, 168–178. [CrossRef]
88. Chen, T.; Guestrin, C. XGBoost: A scalable tree boosting system. In Proceedings of the 22nd ACM SIGKDD International
Conference on Knowledge Discovery and Data Mining (KDD '16), San Francisco, CA, USA, 13–17 August 2016; pp. 785–794.
[CrossRef]
89. Yandex. CatBoost: Gradient Boosting with Categorical Features. Available online: https://catboost.ai (accessed on
16 March 2025).
90. Bian, J.; Wang, J.; Yece, Q. A novel study on power consumption of an HVAC system using CatBoost and AdaBoost algorithms
combined with the metaheuristic algorithms. Energy 2024, 302, 131841. [CrossRef]
91. American Society of Heating, Refrigerating and Air-Conditioning Engineers. Measurement of Energy and Demand Savings (ASHRAE
Guideline 14-2014); ASHRAE: Atlanta, GA, USA, 2014.
92. Deb, K.; Agrawal, S.; Pratap, A.; Meyarivan, T. A fast elitist non-dominated sorting genetic algorithm for multi-objective
optimization: NSGA-II. In Proceedings of the International Conference on Parallel Problem Solving from Nature, Paris, France,
18–20 September 2000; pp. 849–858. [CrossRef]
93. Delgarm, N.; Sajadi, B.; Delgarm, S.; Kowsary, F. A novel approach for the simulation-based optimization of the buildings energy
consumption using NSGA-II: Case study in Iran. Energy Build. 2016, 127, 552–560. [CrossRef]
94. Cohen, J. Statistical Power Analysis for the Behavioral Sciences, 2nd ed.; Lawrence Erlbaum Associates: Mahwah, NJ, USA, 1988.
95. Lundberg, S.; Lee, S. A unified approach to interpreting model predictions. In Proceedings of the 31st International Conference
on Neural Information Processing Systems (NIPS’17), Long Beach, CA, USA, 4–9 December 2017; Volume 4, p. 12.
96. Salih, A.M.; Raisi-Estabragh, Z.; Galazzo, I.B.; Radeva, P.; Petersen, S.E.; Lekadir, K.; Menegaz, G. A Perspective on
Explainable Artificial Intelligence Methods: SHAP and LIME. Adv. Intell. Syst. 2024, 7, 2400304. [CrossRef]
97. Suga, K.; Kato, S.; Hiyama, K. Structural analysis of Pareto-optimal solution sets for multi-objective optimization: An application
to outer window design problems using Multiple Objective Genetic Algorithms. Build. Environ. 2010, 45, 1144–1152. [CrossRef]