


Article

Machine-Learning-Enhanced Building Performance-Guided

Form Optimization of High-Rise Office Buildings in China’s Hot
Summer and Warm Winter Zone—A Case Study of Guangzhou
Xie Xie 1,2, *, Yang Ni 1,2, * and Tianzi Zhang 3

1 State Key Laboratory of Subtropical Building Science, School of Architecture, South China University of
Technology, Guangzhou 510641, China
2 Architectural Design & Research Institute of South China University of Technology (SCUT) Co., Ltd.,
Guangzhou 510641, China
3 Guangzhou International Engineering Consult Co., Ltd., Guangzhou 510600, China
* Correspondence: [email protected] (X.X.); [email protected] (Y.N.)

Abstract: Given the dominant role of high-rise office buildings in energy expenditure within China’s Hot Summer and Warm Winter (HSWW) zone, high-fidelity performance prediction and a multi-objective optimization framework during the early design phase are critical for achieving sustainable energy efficiency. This study presents an innovative approach integrating machine learning (ML) algorithms and multi-objective genetic optimization to predict and optimize the performance of high-rise office buildings in China’s HSWW zone. By integrating Rhino/Grasshopper parametric modeling, Ladybug Tools performance simulation, and Python programming, this study developed a parametric high-rise office building model and validated five advanced and mature machine learning algorithms for predicting energy use intensity (EUI) and useful daylight illuminance (UDI) based on architectural form parameters under HSWW climatic conditions. The results demonstrate that the CatBoost algorithm outperforms other models with an R² of 0.94 and CVRMSE of 1.57%. The Pareto-optimal solutions identify substantial shading dimensions, southeast orientations, high aspect ratios, appropriate spatial depths, and reduced window areas as critical determinants for optimizing EUI and UDI in high-rise office buildings of the HSWW zone. This research fills a gap in the existing literature by systematically investigating the application of ML algorithms to predict the complex relationships between architectural form parameters and performance metrics in high-rise building design. The proposed data-driven optimization framework provides architects and engineers with a scientific decision-making tool for early-stage design, offering methodological guidance for sustainable building design in similar climatic regions.

Keywords: high-rise office buildings; building performance; machine learning; building performance optimization; Pareto-optimal solutions; hot-summer and warm winter zone

Academic Editor: Peixian Li
Received: 16 March 2025
Revised: 25 April 2025
Accepted: 29 April 2025
Published: 1 May 2025

Citation: Xie, X.; Ni, Y.; Zhang, T. Machine-Learning-Enhanced Building Performance-Guided Form Optimization of High-Rise Office Buildings in China’s Hot Summer and Warm Winter Zone—A Case Study of Guangzhou. Sustainability 2025, 17, 4090. https://doi.org/10.3390/su17094090

Copyright: © 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Sustainability 2025, 17, 4090. https://doi.org/10.3390/su17094090

1. Introduction

1.1. Background of the Study

2023 was officially declared the warmest year in recorded human history by the World Meteorological Organization (WMO), with projections indicating further intensification of heatwaves in the coming decades [1]. Meanwhile, the past two decades (2000–2019) also marked the warmest period in China since 1900 [2]. The continuous temperature increase has already exerted significant impacts on human society. The Intergovernmental Panel on Climate Change (IPCC) has identified anthropogenic greenhouse gas emissions as the primary driver of accelerating global warming trends [3]. Hence, implementing energy conservation measures represents a critical strategy for preserving ecological integrity and advancing sustainable development objectives in the context of climate change mitigation.
As the world’s second-largest carbon emitter [4], China faces significant opportunities
for energy conservation and emission reduction, particularly within the construction sector.
Recent studies have indicated that energy consumption in China’s building industry
accounts for 36.3% of the national total [5], positioning it as the sector with the highest
decarbonization potential among the three primary energy-consuming sectors (construction,
industry, and transportation) [6]. Office buildings, the most rapidly expanding building typology over the past decade, now comprise 20% of China’s public building stock [7]. These structures account for 20% of the nation’s total energy use, with per-unit-area electricity consumption 10 to 20 times greater than that of residential buildings [8,9]. In China’s HSWW zone, the eight-month air conditioning season further exacerbates energy intensity in office buildings, resulting in energy consumption significantly higher than the national average. Projections indicate that commercial buildings will continue to increase their share of total energy consumption in the coming decades [10]. Given this concern,
optimizing the performance of high-rise office buildings in China’s HSWW zone has
emerged as a critical priority for advancing the nation’s energy conservation and emission
reduction agenda.
Performance optimization during the early-stage architectural design phase is crucial
for achieving energy-efficient buildings, with studies demonstrating up to 40% energy
savings potential [11]. Building performance simulation (BPS) plays a key role in design
validation through two model categories [12]: forward models (e.g., EnergyPlus [13],
TRNSYS [14]) rely on physical principles for accurate predictions [15], while inverse
models—also known as data-driven methods (comprising black-box (e.g., ML) and gray-
box approaches [16])—offer computational efficiency for parametric optimization. Specifi-
cally, the efficiency of data-driven methods enables architects to explore broader design
spaces within constrained timelines and identify sustainable design solutions rapidly in
early design stages [15,17].
However, the accuracy of data-driven methods remains context-dependent, particularly when predicting the relationship between a building’s architectural form and its performance.
This study addresses these limitations by developing an integrated optimization framework
and taking high-rise office buildings in China’s HSWW zone as an example. By combining
real-time feedback mechanisms with multi-objective optimization, the approach enhances
the practicality of traditional green design workflows while maintaining academic rigor
through validated simulation tools and parametric analysis.

1.2. Related Work

After over five decades of evolution [18], building performance simulation (BPS) tech-
nology has achieved significant advancements in accurately predicting building energy
consumption [19,20], thermal environments [20–22], daylight levels [23–25], and natural
ventilation [26]. Leveraging BPS, researchers can conduct performance optimization of
architectural designs [27]. Building performance optimization (BPO) is classified into
single-objective optimization (SOO) and multi-objective optimization (MOO) based on the
number of performance metrics targeted. Given the potential trade-offs among conflict-
ing performance objectives (e.g., energy efficiency vs. daylight level), MOO has gained
prominence in both academic research and industrial applications due to its ability to
balance competing requirements [28]. Unlike SOO, MOO yields a set of Pareto-optimal
solutions represented by the Pareto frontier [29,30]—a concept widely adopted to visualize
non-dominated design alternatives [31].
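The notion of non-dominated alternatives can be illustrated with a minimal sketch. It assumes every objective is expressed as a minimization (so UDI would be negated), and the candidate values below are hypothetical numbers chosen for illustration, not results from this study.

```python
from typing import List, Sequence, Tuple


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if a dominates b when every objective is minimized."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(points: List[Tuple[float, ...]]) -> List[Tuple[float, ...]]:
    """Keep only the non-dominated points (brute-force pairwise filter)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical (EUI, -UDI) pairs; UDI is negated so both objectives are minimized.
candidates = [
    (100.0, -60.0),
    (95.0, -55.0),
    (105.0, -58.0),
    (98.0, -62.0),
    (110.0, -50.0),
]
front = pareto_front(candidates)
print(front)
```

In this toy set, two candidates survive: one with the lowest EUI and one with the highest UDI. Neither is better on both objectives, which is the trade-off a Pareto frontier makes visible.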
To reduce computational costs in performance simulation, traditional MOO ap-
proaches integrate building performance simulation (BPS) engines with optimization
algorithms [32], such as genetic algorithms (GA), particle swarm optimization (PSO),
ant colony optimization (ACO), and simulated annealing (SA). Among these, GA and its
variants have emerged as the most prevalent and effective methods [33,34]. Platforms
like ModeFrontier [35] and Octopus [36] have operationalized these algorithms, enabling
designers to conduct parametric optimizations efficiently. Recent applications include
Alelwani et al.’s [37] GA-based optimization of vernacular Rawshan elements in Saudi
Arabian buildings to improve energy efficiency and useful daylight illuminance (UDI).
Wang et al. [38] applied the SPEA-2 algorithm to optimize annual cooling and lighting
energy consumption in rural residences of China’s Hot Summer and Cold Winter Zone.
Chaturvedi et al. [39] used NSGA-II to balance annual energy use and cooling duration
in Indian residential buildings, while Zhao et al. [40] employed NSGA-II to optimize win-
dow and shading parameters for thermal comfort and energy performance in a high-rise
office building.
While GAs reduce simulation frequency and computational time compared to brute-force approaches, their efficiency remains insufficient for real-world design workflows [41].
Recent advancements in ML have revolutionized building performance prediction by de-
ploying surrogate models trained on limited simulation datasets, enabling rapid feedback
and optimization [42]. Over 100 ML algorithms have been applied to building perfor-
mance modeling [43], with dominant approaches including artificial neural networks
(ANN) [44–47], support vector regression (SVR) [48–50], gradient-boosted decision trees
(GBDT) [51], long short-term memory networks (LSTM) [52], decision trees (DT) [53],
random forests (RF) [54–56], and their variants. Chi et al. [57] demonstrated over 90%
accuracy in predicting heating/cooling energy consumption across six building typologies
using eight ML algorithms. Siamak et al. [58] integrated Gaussian process regression (GPR)
with MOO to optimize nine design parameters for heating/cooling energy savings. Chen
et al. [59] developed surrogate models using MLR, MARS, and SVM for 5000+ simula-
tion cases of Hong Kong high-rise residential buildings, identifying SVM as the optimal
performer before applying NSGA-II to derive Pareto-optimal solutions for three energy
objectives. Gou et al. [60] combined ANN with NSGA-II to analyze 20 design metrics
affecting cooling thermal response (CTR) and building energy density (BED) in Shanghai
high-rises. Wu et al. [51] employed BO-XGBoost to model envelope parameters’ impacts on
energy, daylighting, and thermal comfort, achieving Pareto optimization through NSGA-II.
Overall, the core of integrating ML and GA lies in combining “predictive capability” with “optimization capability”: ML enables quick and accurate prediction of building performance, while GA supports more effective optimization. This integration overcomes the limitations of traditional methods in terms of accuracy, efficiency, and
multi-objective balance, thereby establishing itself as a key technical approach for advanced
building performance prediction and optimization.
While ML-driven building optimization has made notable advancements, prior studies
have primarily focused on thermal design parameters [61–64] or geometric adjustments to
specific building components [65], leaving the relationship between architectural form and
performance underexplored. Additionally, despite the recognized importance of occupant
comfort in enhancing health and productivity [66], existing ML-based frameworks have
primarily focused on energy efficiency, overlooking metrics such as thermal comfort and
daylight illuminance [67–69]. Moreover, as demonstrated in the prior literature, no single
ML algorithm universally outperforms others across all building performance tasks, as
prediction accuracy is inherently dependent on data characteristics (e.g., data quality,
dimensionality) and task-specific requirements. Consequently, algorithm selection must be
context-aware, necessitating iterative validation through cross-scenario testing to determine
optimal modeling approaches.

1.3. Aims and Originality

To address these research gaps, this study develops a multi-objective optimization
framework that integrates ML and GA for performance prediction and optimization of
energy consumption and daylight level in high-rise office buildings during the early design
phase, explicitly incorporating parameters of architectural form as design variables. The
framework is validated using a case study of buildings in China’s HSWW zone. The
research objectives are threefold:
• Develop a form parametric model of typical high-rise office buildings in China’s
HSWW zone, simulate their performance across diverse form parameter combinations,
and train high-fidelity ML surrogate models for energy use intensity (EUI) and useful
daylight illuminance (UDI).
• Integrate the surrogate models with GA to establish a computationally efficient multi-
objective optimization workflow.
• Provide designers and policymakers with Pareto-optimal solutions and optimal archi-
tectural form parameter ranges for balancing energy efficiency and daylight levels in
HSWW high-rise office buildings.
The paper is structured as follows: Section 2 outlines the methodology, including
climate data selection for China’s HSWW zone; parametric model development using
Rhino 7.0/Grasshopper 1.0; performance simulation via Ladybug Tools 1.4.0; machine
learning (ML) algorithm selection criteria; and implementation of the GA-based optimiza-
tion framework using Python 3.1.2. Section 3 discusses the accuracy of five ML models
and presents Pareto-optimal solutions for EUI and UDI, along with corresponding optimal
parameter distributions. Section 4 concludes the study, highlighting limitations and future
research directions.

2. Methodology
As illustrated in Figure 1, the research framework comprises three interconnected
phases.
Phase 1: Develop the parametric building model, which involves two core tasks:
(1) defining parameters of architectural form (e.g., building orientation, building width, and room depth), configuring the thermal parameters of the building envelope, selecting
proper climate data files, and establishing occupancy schedules to create a parametric
prototype of high-rise office buildings in China’s HSWW region; (2) generating datasets
of parameter combinations via Latin hypercube sampling within the defined parame-
ter ranges.
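The Latin hypercube sampling step in Phase 1 can be sketched as follows. This is a minimal standard-library illustration, not the authors' implementation: the function name `latin_hypercube`, the sample count, and the seed are assumptions, and only four of the ten form parameters are included (with ranges and steps taken from Table 1).

```python
import random
from typing import Dict, List, Optional, Tuple

# (low, high, step) per parameter, copied from Table 1; step=None means continuous.
# Only four of the ten form parameters are listed to keep the sketch short.
RANGES: Dict[str, Tuple[float, float, Optional[float]]] = {
    "O": (0.0, 180.0, 15.0),  # orientation, degrees
    "W": (30.0, 50.0, 0.1),   # plan width, m
    "R": (1.0, 1.5, 0.05),    # aspect ratio
    "D": (8.0, 14.0, None),   # spatial depth, m (no step listed in Table 1)
}


def latin_hypercube(
    ranges: Dict[str, Tuple[float, float, Optional[float]]],
    n_samples: int,
    seed: int = 0,
) -> List[Dict[str, float]]:
    """Draw n_samples points with one draw per equal-width stratum in each dimension."""
    rng = random.Random(seed)
    columns: Dict[str, List[float]] = {}
    for name, (low, high, step) in ranges.items():
        strata = list(range(n_samples))
        rng.shuffle(strata)  # independent random pairing of strata across dimensions
        column = []
        for stratum in strata:
            u = (stratum + rng.random()) / n_samples  # uniform inside the stratum
            x = low + u * (high - low)
            if step is not None:
                # Snap to the parameter's step grid; coarse grids can merge strata.
                x = low + round((x - low) / step) * step
                x = round(min(max(x, low), high), 6)
            column.append(x)
        columns[name] = column
    return [{name: columns[name][i] for name in ranges} for i in range(n_samples)]


samples = latin_hypercube(RANGES, n_samples=20, seed=42)
print(samples[0])
```

Snapping to the step size keeps every sample on a buildable grid, at the cost of strict stratification for coarse parameters such as orientation, which has only 13 admissible values.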
Phase 2: Run the performance simulation for each sampled building model using
Ladybug Tools, generating a ‘parameter-performance’ dataset where each parameter com-
bination is mapped to its corresponding EUI and UDI values.
Phase 3: Train multiple ML algorithms on the simulated datasets to develop surrogate
models for multi-objective prediction. The best-performing models are integrated with
the non-dominated sorting genetic algorithm (NSGA-II) to derive Pareto-optimal solu-
tions. Based on the derived Pareto front, the optimal ranges and interactions of building
design parameters are analyzed to identify performance trade-offs between EUI and UDI,
thereby guiding design decision-making. The subsequent sections detail the tools and
methodologies employed in each phase.
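The model comparison in Phase 3 relies on accuracy metrics such as R² and CVRMSE. The sketch below assumes the common definitions (CVRMSE as RMSE divided by the mean of the simulated values, in percent); the exact formulas used in this study may differ, and the model names and numbers are hypothetical.

```python
import math
from typing import Dict, Sequence, Tuple


def r_squared(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def cvrmse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """RMSE normalized by the mean of the reference values, in percent."""
    n = len(y_true)
    rmse = math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)
    return 100.0 * rmse / (sum(y_true) / n)


# Hypothetical simulated EUI values and predictions from two hypothetical surrogates.
y_sim = [100.0, 110.0, 120.0, 130.0]
predictions: Dict[str, Sequence[float]] = {
    "model_a": [102.0, 108.0, 121.0, 129.0],
    "model_b": [105.0, 104.0, 125.0, 127.0],
}
scores: Dict[str, Tuple[float, float]] = {
    name: (r_squared(y_sim, pred), cvrmse(y_sim, pred))
    for name, pred in predictions.items()
}
best = min(scores, key=lambda name: scores[name][1])  # lowest CVRMSE wins
print(best, scores[best])
```

Selecting on CVRMSE alone is a simplification; in practice the two metrics would be read together and on held-out data.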

Figure 1. Research framework.

2.1. Development of Parametric Model

Given the morphological diversity of high-rise office buildings, a parametric prototype representing typical configurations in China’s HSWW zone was developed [70]. To enable efficient iterative design, Rhino/Grasshopper was selected as the modeling platform due to its visual programming environment, interoperability with simulation tools, and widespread adoption in performance-driven design [71]. The integration of Ladybug and Honeybee [72], plugins leveraging EnergyPlus 9.6.0 and Radiance 5.4a [73], allowed seamless connection between parametric modifications and multi-domain performance analysis.
Based on the survey of typical HSWW high-rise office buildings, the scope of parameters of architectural form was confirmed to guide the development of the parametric models. Parametric models were constructed for planar, vertical, and shading parameters, with each parameter adjustment automatically generating a new 3D model. To reduce model complexity and facilitate subsequent machine learning fitting, only 10 essential independent variables were selected to create the parametric model:
• In the planar aspect, the orientation parameter O, representing the rotation angle of the parametric model and simulating building orientation, changes in 15-degree steps and has its value range limited to half a full circle (180 degrees) due to the symmetrical plan. The plan width parameter W and aspect ratio parameter R define the planar size and shape, while the spatial depth parameter D determines the dimensions of office areas and core zones.
• In the vertical aspect, three parameters (floor height parameter FH, windowsill height parameter WSH, and ceiling height parameter CH) are sufficient to construct any common high-rise office facade. Notably, unlike most existing studies that rely on the commonly used window-to-wall ratio (WWR) parameter to control window/curtain wall size, this study utilizes WSH and CH to precisely define the size and vertical position of windows/curtain walls on the building facade. This allows for a more accurate assessment of how window size and placement influence the building’s performance.
• In the shading aspect, three parameters (horizontal sunshade size HSS, vertical sunshade size VSS, and vertical sunshade distance VSD) can model most common building shading configurations. In the parametric model, horizontal sunshades are fixed at the upper edge of windows/curtain walls, while vertical shading panels are evenly distributed along each facade at intervals defined by VSD. Both horizontal and vertical sunshades are oriented at a 90-degree angle to the facade.
By controlling the 10 parameters mentioned above, the parametric model can generate forms that match commonly found high-rise office building types in reality.
Figure 2 illustrates the parameters’ specific location on the building. Table 1 summarizes the value ranges and step sizes for all independent variables (covariates are derived from the independent parameters), while Figure 3 further visualizes several 3D models under different parametric configurations. Additionally, a set of parameters for the baseline model was established based on common high-rise office building typologies in the HSWW zone. The performance of the baseline model serves to quantify the improvement of the optimization through comparative analysis.

Figure 2. The references for each parameter: (a) Reference of the planar parameters. For planar parameters, the plan length can be calculated as W × R, and the core (circulation and other non-support office spaces) size automatically adapts to W, R, and D. (b) Reference of the vertical and shading parameters; the height of the window can be given by FH − (CH + WSH).

Figure 3. 3D models generated via different parametric configurations, showcasing the capability of the parametric model with 10 form parameters.

Table 1. Summary of form parameters.

Classification | Form Parameter | Range | Units | Step | Property | Baseline
Planar parameters | Orientation (O) ¹ | [0, 180] | degree | 15 | Independent | 90
Planar parameters | Plan width (W) | [30, 50] | m | 0.1 | Independent | 45
Planar parameters | Aspect ratio (R) | [1, 1.5] | - | 0.05 | Independent | 1
Planar parameters | Spatial depth (D) | [8, 14] | m | - | Independent | 12.5
Planar parameters | Plan length (L) | [30, 75] | m | - | Covariate ² | 45
Planar parameters | Plan area (A) | [900, 3750] | m² | - | Covariate ² | 2025
Vertical parameters | Floor height (FH) | [3.9, 4.5] | m | 0.1 | Independent | 4.2
Vertical parameters | Ceiling height (CH) | [1, 1.5] | m | 0.1 | Independent | 1.2
Vertical parameters | Windowsill height (WSH) | [0.1, 1.2] | m | 0.1 | Independent | 0.1
Vertical parameters | Window height (WH) | [1.2, 3.4] | m | - | Covariate ² | 2.9
Vertical parameters | Window-wall ratio (WWR) | [30, 75] | % | - | Covariate ² | ~69
Vertical parameters | Building storey (BS) ³ | 15 | - | - | Fixed | 15
Shading parameters | Horizontal sunshade size (HSS) | [0.3, 1.5] | m | 0.1 | Independent | 0.9
Shading parameters | Vertical sunshade size (VSS) | [0.3, 1.5] | m | 0.1 | Independent | 0.9
Shading parameters | Vertical sunshade distance (VSD) | [3, 9] | m | 0.1 | Independent | 3

¹ Orientation is represented numerically, with a 0–180 range due to the symmetric plan: 0° corresponds to west, 90° to south, and 180° to east. ² Covariate parameters are solely used for analytical purposes and excluded from machine learning input features. ³ The number of floors is fixed at 15, the average value for high-rise office buildings in China.
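The covariates in Table 1 follow from the independent parameters. The sketch below uses the two relations stated in the Figure 2 caption (L = W × R and WH = FH − (CH + WSH)); the plan area A = W × L and the ratio WWR = WH / FH are assumptions inferred from the tabulated ranges and baseline values, and the class and function names are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class FormParameters:
    W: float    # plan width, m
    R: float    # aspect ratio
    FH: float   # floor height, m
    CH: float   # ceiling height parameter, m
    WSH: float  # windowsill height, m


def derive_covariates(p: FormParameters) -> Dict[str, float]:
    """Compute covariates from independent parameters."""
    length = p.W * p.R             # stated in the Figure 2 caption
    area = p.W * length            # assumption: rectangular plan
    window_height = p.FH - (p.CH + p.WSH)  # stated in the Figure 2 caption
    wwr = 100.0 * window_height / p.FH     # assumption: per-floor height ratio, %
    return {"L": length, "A": area, "WH": window_height, "WWR": wwr}


# Baseline values from Table 1.
baseline = derive_covariates(FormParameters(W=45.0, R=1.0, FH=4.2, CH=1.2, WSH=0.1))
print(baseline)
```

With the baseline inputs, these relations give a 45 m plan length, a 2025 m² plan area, a 2.9 m window height, and a WWR of about 69%, which agrees with the baseline column of Table 1.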

2.2. Specification of Material and Thermophysical Parameters

In addition to architectural form parameters, thermophysical properties for building
envelope components (walls, windows, floors, and roofs) must also be determined for
simulation accuracy. During early-stage simulations, detailed material layering was omit-
ted in favor of simplified thermophysical values to balance accuracy and computational
efficiency. Given the prevalence of green building standards in China’s new office construc-
tion, thermophysical and operational parameters were derived from the General Code for
Energy Efficiency and Renewable Energy Utilization in Buildings (GB 55015-2021) [74] and
the Energy Efficiency Design Standard for Public Buildings (GB 50189-2015) [75]. Where
conflicts arose, the more recent GB 55015-2021 took precedence. For parameters not covered
by Chinese codes, values from the ASHRAE 90.1-2019 standard [76] for Climate Zone 2A
(corresponding to China’s HSWW zone) were adopted. Table 2 presents the thermophysical
properties assigned to each envelope component.

Table 2. Summary of thermophysical properties of the envelope.

Envelope | Thermal Conductivity [W/(m²·K)] | Solar Heat Gain Coefficient (SHGC) | Visible Transmittance
Curtain wall ¹ (transmitting) | 1.5 | - | -
Curtain wall ¹ (opaque) | 2.4 | 0.2 | 0.6
Internal wall | 2.1 | - | -
Floor | 1.1 | - | -
Ground | 1.5 | - | -
Roof | 0.4 | - | -

¹ Curtain wall consists of two components: opaque cladding and transparent glazing.

2.3. Setup of Building Operation Schedule

Since the energy consumption and light environment of an office building are closely
related to its operation schedule, proper settings are necessary to achieve accurate sim-
ulation results. The building operation schedule can be meticulously configured using
the Honeybee plugin. This plugin offers a “Program” port, which has eight sub-ports,
including “People”, “Lighting”, “Electric Equipment”, “Gas Equipment”, “Hot Water”,
“Infiltration”, “Ventilation”, and “Setpoint”. These sub-ports are used to define the device
power, usage time, occupant density, and space occupancy rate.
For the sake of generality, the building operation schedule in this study is set according
to the recommended values for office buildings in references [74,75]. The only difference
is that the winter heating temperature specified in the standards is removed, and no
heating equipment is used in winter. This adjustment aligns with the actual usage and
design guidelines of most local office buildings in the HSWW zone. The detailed building
operation schedule is presented in Table 3.

Table 3. Summary of detailed building operation schedule.

Classification | Component | Value
People | Occupant heat power | 120 W/person
People | Occupant density | 10 m²/person
People | Occupancy period | From 7 AM to 9 PM on weekdays
Lighting | Illuminance | 300 lx
Lighting | Lighting power | 8 W/m²
Lighting | Operating period | From 7 AM to 9 PM on weekdays
HVAC | Outdoor airflow rate | 30 m³/(h·person)
HVAC | Cooling temperature setpoint | 26 °C
HVAC | Heating temperature setpoint | Off ¹
HVAC | Coefficient of Performance (COP) | 4.0
HVAC | Operating period | From 7 AM to 9 PM on weekdays

¹ The heating function of the HVAC is deactivated by default, which better aligns with the typical operational patterns of office buildings in the HSWW zone.
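As a rough plausibility check on the schedule in Table 3, the values can be turned into simple per-area figures. The sketch below is back-of-envelope arithmetic, not simulation output: it assumes 52 working weeks with no holidays and lights at full power for every operating hour (no daylight dimming), so the lighting figure is an upper bound.

```python
# Values taken from Table 3.
OCCUPANT_HEAT_W_PER_PERSON = 120.0
OCCUPANT_DENSITY_M2_PER_PERSON = 10.0
LIGHTING_POWER_W_PER_M2 = 8.0
START_HOUR, END_HOUR = 7, 21  # 7 AM to 9 PM
WEEKDAYS_PER_WEEK = 5

# Assumption for illustration only: 52 working weeks, holidays ignored.
WEEKS_PER_YEAR = 52

daily_hours = END_HOUR - START_HOUR
annual_hours = daily_hours * WEEKDAYS_PER_WEEK * WEEKS_PER_YEAR

# Peak occupant heat gain per unit floor area at full occupancy.
occupant_gain_w_per_m2 = OCCUPANT_HEAT_W_PER_PERSON / OCCUPANT_DENSITY_M2_PER_PERSON

# Upper bound on annual lighting energy per unit floor area, in kWh/m².
lighting_upper_bound_kwh_per_m2 = LIGHTING_POWER_W_PER_M2 * annual_hours / 1000.0

print(daily_hours, annual_hours)
print(occupant_gain_w_per_m2, lighting_upper_bound_kwh_per_m2)
```

Under these assumptions the schedule corresponds to 3640 operating hours per year, a peak occupant gain of 12 W/m², and at most 29.12 kWh/m² of annual lighting energy.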

2.4. Selection of Climate Dataset

China’s HSWW zone, one of the country’s five major climate regions, spans approximately 1.25 million km² across multiple provinces and cities. Figure 4 illustrates the geographical coverage of this climatic zone and the major cities within it. The zone is characterized by monthly mean temperatures > 10 °C, July averages of 25–29 °C, and ≥25 °C daily temperatures for 100–200 days annually (with extreme highs reaching 40 °C) [77]. Its vast geographic diversity necessitates careful selection of representative climate datasets for simulation accuracy. Figure 5 presents the monthly average temperatures
and sunshine durations for the six major cities with the highest density of high-rise office
buildings in China’s HSWW zone. All temperature and sunshine data are derived from
the latest Typical Meteorological Year data (TMYx) provided by the U.S. National Oceanic
and Atmospheric Administration (NOAA). The minimal differences between these cities,
especially in the main cooling period, justify the selection of a single representative climate
dataset to represent the zone. To ensure comparability with prior studies, this research
adopted the .epw weather data of Guangzhou (China) to represent the climate of the
HSWW zone.
Open-source .epw data for Guangzhou are generally available from two repositories:
1. CSWD (Chinese Standard Weather Data): A 2005 historical dataset provided by the
China Meteorological Administration.
2. TMYx (Typical Meteorological Year): A dynamically updated dataset from the
U.S. NOAA, incorporating 2007–2021 monthly averages to reflect contemporary cli-
mate trends.
especially in the main cooling period, justify the selection of a single representative c
the
mateHSWW
datasetzone.
to represent the zone. To ensure comparability with prior studies, this
Sustainability 2025, 17, 4090
search adopted the .epw weather data of Guangzhou (China) to represent the climate
9 of 27
the HSWW zone.

Figure 4. The geographical coverage of China’s HSWW climate zone and the major cities within it.
The red star represents the location of Beijing, the capital of China.

Figure 5. The climate differences between six major cities in the HSWW zone: (a) monthly average
temperature; (b) monthly average sunshine duration.
While CSWD has been widely used in Chinese studies, its static 2005 datasets may
be less relevant to current conditions due to climate change. Figure 6 compares hourly
dry bulb temperatures from the CSWD and TMYx datasets for Guangzhou, revealing
significantly more high-temperature hours in TMYx. The average annual temperatures
differ by 0.88 °C (22.23 °C for CSWD vs. 23.11 °C for TMYx), representing a 3.96%
increase. Given the strong influence of ambient temperature on building energy use, TMYx data
were selected to ensure up-to-date and realistic simulation outcomes.
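The CSWD and TMYx comparison comes down to simple statistics on the dry-bulb column of each .epw file. The following is a minimal sketch, assuming the standard EnergyPlus weather layout (eight header lines, then comma-separated hourly records with dry-bulb temperature as the seventh field). The function names and the 35 °C threshold are illustrative assumptions, not values taken from the study.

```python
import csv
import io

EPW_HEADER_LINES = 8   # standard EPW files start with 8 header records
DRY_BULB_FIELD = 6     # 0-based index of the dry-bulb temperature (deg C)


def read_dry_bulb(epw_text):
    """Return the hourly dry-bulb temperatures (deg C) from EPW file content."""
    rows = list(csv.reader(io.StringIO(epw_text)))[EPW_HEADER_LINES:]
    return [float(row[DRY_BULB_FIELD]) for row in rows if row]


def summarize(temps, hot_threshold=35.0):
    """Mean temperature and number of hours above a (hypothetical) threshold."""
    mean = sum(temps) / len(temps)
    hot_hours = sum(1 for t in temps if t > hot_threshold)
    return mean, hot_hours


def relative_increase(old, new):
    """Percentage increase of `new` over `old`."""
    return (new - old) / old * 100.0


# Annual means reported in the text: 22.23 deg C (CSWD) vs. 23.11 deg C (TMYx).
print(round(relative_increase(22.23, 23.11), 2))
```

In practice, `read_dry_bulb` would be called once per dataset on the text of the corresponding Guangzhou .epw file, and the two `summarize` results compared.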
2.5. Creation of Building Performance Simulation Datasets
Following parametric model development and simulation parameter configuration,
energy and daylight simulations were executed using Honeybee 1.4.0 [78], an open-source
plugin for Grasshopper. This plugin’s computational accuracy has been validated in prior
studies [79,80], ensuring reliable performance predictions.
To evaluate building performance, appropriate metrics were selected. For energy
efficiency, energy use intensity (EUI, kWh/m²) was adopted, combining cooling energy
(EUI_cooling) and lighting energy (EUI_lighting). Daylight performance was assessed
using the Useful Daylight Illuminance (UDI, %) proposed by Nabil and Mardaljevic [81], which
quantifies the annual percentage of occupied hours during which horizontal illuminance
falls within a useful range. Incorporating both glare and illuminance criteria, the UDI is
widely recognized as a comprehensive metric for daylight quality. In this study, the UDI was
calculated by placing sensors at 0.8 m above floor level (desk height) across a 1-m grid in
office zones. The effective illuminance range of 300–2000 lux for typical office tasks was
applied, and the final UDI value represented the average across all sensor points.
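The two metrics can be illustrated with a short sketch. It assumes hourly illuminance series per sensor and a Boolean occupancy schedule. The function names, the inclusive treatment of the 300 and 2000 lux bounds, and the sample numbers are assumptions made for illustration, not details taken from the Honeybee workflow.

```python
def udi_for_sensor(hourly_lux, occupied, low=300.0, high=2000.0):
    """Percentage of occupied hours whose illuminance lies within [low, high] lux."""
    occupied_lux = [lux for lux, occ in zip(hourly_lux, occupied) if occ]
    if not occupied_lux:
        return 0.0
    useful = sum(1 for lux in occupied_lux if low <= lux <= high)
    return 100.0 * useful / len(occupied_lux)


def mean_udi(sensor_series, occupied):
    """Average the per-sensor UDI over all sensor points of the grid."""
    values = [udi_for_sensor(series, occupied) for series in sensor_series]
    return sum(values) / len(values)


def total_eui(cooling_kwh, lighting_kwh, floor_area_m2):
    """EUI (kWh/m2) as cooling plus lighting energy per unit floor area."""
    return (cooling_kwh + lighting_kwh) / floor_area_m2


# Made-up data: two sensors, five hours, the last hour unoccupied.
occupied = [True, True, True, True, False]
sensors = [
    [100, 500, 1500, 2500, 800],   # 2 of 4 occupied hours in range
    [400, 600, 900, 1900, 50],     # 4 of 4 occupied hours in range
]
print(mean_udi(sensors, occupied))
```

Averaging per-sensor percentages, rather than pooling all sensor-hours, mirrors the statement that the final UDI is the average across all sensor points.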

Figure 6. Comparison of hourly dry bulb temperatures of Guangzhou from CSWD and TMYx
datasets: (a) hourly temperatures of Guangzhou from CSWD with less red color (representing high
temperature); (b) hourly temperatures of Guangzhou for TMYx with more red color.
To enhance the representativeness, accuracy, and generalizability of the datasets
for ML training, Latin hypercube sampling (LHS) was employed to generate parameter
combinations. LHS efficiently captures design space variability without requiring
excessive samples [82] and has therefore been widely adopted in building
performance analysis [82–84]. This method ensures uniform parameter distribution while
minimizing redundant sampling [17], making it particularly suitable for high-dimensional
design spaces.
To achieve an optimal balance between computational cost and machine learning data
requirements, a total of 500 parametric samples were generated. Simulations were executed
on a 16-core CPU (AMD Ryzen™ 9 7945HX, Advanced Micro Devices, Santa Clara, CA),
taking approximately 10 min per sample and about 80 h for the entire dataset.
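The stratification idea behind LHS can be sketched in a few lines of standard-library Python. This is an illustrative implementation, not the sampling component used in the study. The parameter ranges are hypothetical, and the sample count of 500 matches the 500 simulation results analysed in Section 3.1.

```python
import random


def latin_hypercube(n_samples, bounds, seed=0):
    """Split each parameter range into n_samples equal strata and sample
    every stratum exactly once, pairing strata randomly across parameters."""
    rng = random.Random(seed)
    columns = {}
    for name, (low, high) in bounds.items():
        # One uniform draw inside each stratum, then shuffle the stratum order.
        unit = [(k + rng.random()) / n_samples for k in range(n_samples)]
        rng.shuffle(unit)
        columns[name] = [low + u * (high - low) for u in unit]
    return [{name: columns[name][i] for name in bounds} for i in range(n_samples)]


# Hypothetical ranges for three of the form parameters, for illustration only.
bounds = {"HSS": (0.0, 1.5), "WSH": (0.3, 1.2), "D": (8.0, 15.0)}
samples = latin_hypercube(500, bounds)
print(samples[0])
```

Because every stratum of every parameter is visited once, no region of a single parameter's range is left unsampled, which is the property the text refers to as uniform parameter distribution.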
2.6. Machine Learning Algorithm
As previously discussed, no single ML algorithm universally outperforms others in
building performance analysis [85]. Predicting algorithm suitability for a specific dataset
remains challenging prior to training. To address this, a comparative evaluation of multiple
algorithms was conducted to identify the most accurate predictor. Given the impracticality
of testing all available algorithms, selection criteria focused on algorithmic strengths and
validated application scenarios. After comprehensive analysis, five algorithms were
chosen for this study:


• Multi-Layer Perceptron (MLP): a prevalent and simple artificial neural network (ANN)
architecture comprising input, hidden, and output layers was selected for this study
due to its proven capacity to capture complex nonlinear relationships in building
performance datasets. The input layer receives the parameters of architectural form,
while the output layer generates predicted performance metrics. By incorporating
nonlinear activation functions (e.g., ReLU, Sigmoid), MLPs can capture complex input-
output relationships, making them well-suited for mapping static or low-dimensional
time-series data like building design parameters to performance outcomes.
• Support Vector Regression (SVR): originating from support vector machine (SVM)
theory [86], SVR is a regression model that maps low-dimensional data to a high-
dimensional feature space using kernel functions (e.g., RBF). By constructing an
optimal regression hyperplane, SVR effectively captures latent relationships between
input and output variables, making it well-suited for predicting building performance
metrics from design parameters. Unlike other continuous variable prediction methods,
SVR exhibits robust generalization when applied to unseen data [87], maintaining
superior predictive performance even with limited training data—a critical advantage
for building optimization workflows constrained by computational resources.
• Random Forest (RF): an ensemble learning method that constructs multiple decision
trees for classification and regression tasks, enhancing prediction accuracy and robust-
ness through aggregating tree outputs [54]. This algorithm reduces model variance
and mitigates overfitting risks via bootstrap sampling and random feature selection.
Due to its insensitivity to noise and missing values, it maintains stable performance
even with limited training data, which is a critical advantage over most other ML
models. Additionally, tree-based models are favored for their interpretability, enabling
transparent analysis of feature contributions to predictions [17].
• XGBoost: an ensemble learning algorithm based on the Gradient Boosting
Decision Tree (GBDT) [88] that enhances prediction accuracy by combining multiple
decision trees. Unlike standard GBDT, XGBoost uses a second-order Taylor expansion
of the loss and incorporates a regularization term into the objective function,
which improves accuracy and mitigates overfitting risks. It offers fast computation,
high prediction accuracy, and strong robustness in regression problems and is now
widely used.
• CatBoost: an open-source GBDT framework developed by Yandex in 2017 [89] specifi-
cally designed for handling categorical features in classification, regression, and rank-
ing tasks. Unlike traditional ML algorithms, CatBoost automates categorical feature
processing through advanced techniques such as target encoding and combinatorial
optimization, eliminating the need for manual pre-processing. This native capability
makes CatBoost particularly suitable for unstructured datasets and high-cardinality
categorical scenarios. Furthermore, it has demonstrated effectiveness in predicting
energy consumption across diverse domains [90], where it often outperforms XGBoost
in both prediction accuracy and computational efficiency.
To evaluate the predictive performance of different algorithms, three metrics were
adopted: the coefficient of determination (R²), root mean squared error (RMSE), and
coefficient of variation of RMSE (CVRMSE). R² (0 ≤ R² ≤ 1) quantifies the proportion of variance
in the dependent variable explained by the model, with values closer to 1 indicating better
fit. RMSE measures the average magnitude of prediction errors, calculated as the square
root of the mean squared deviation between predicted and observed values. CVRMSE
normalizes RMSE by the mean of observed values, yielding a dimensionless metric for
fair cross-dataset comparisons. Recommended by ASHRAE [91], CVRMSE eliminates
scale dependency and is particularly useful for benchmarking models across different
building types or climates. The mathematical expressions for these metrics are provided in
Equations (1)–(3).
$$R^2 = 1 - \frac{\sum_{i=1}^{n} (y_i - \check{y}_i)^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2} \tag{1}$$

$$\mathrm{RMSE} = \sqrt{\frac{\sum_{i=1}^{n} (\check{y}_i - y_i)^2}{n}} \tag{2}$$

$$\mathrm{CVRMSE} = \frac{\sqrt{\sum_{i=1}^{n} (\check{y}_i - y_i)^2 / n}}{\sum_{i=1}^{n} y_i / n} \tag{3}$$

where $\check{y}_i$, $y_i$, and $\bar{y}$ represent the predicted value of sample i, the actual value of sample i,
and the mean value of all sample data, respectively; n denotes the number of samples.
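Equations (1)–(3) translate directly into code. The sketch below uses only the standard library; the function names and the four sample values are illustrative and do not come from the study's dataset.

```python
import math


def r2_score(actual, predicted):
    """Equation (1): 1 minus residual sum of squares over total sum of squares."""
    mean = sum(actual) / len(actual)
    ss_res = sum((y - p) ** 2 for y, p in zip(actual, predicted))
    ss_tot = sum((y - mean) ** 2 for y in actual)
    return 1.0 - ss_res / ss_tot


def rmse(actual, predicted):
    """Equation (2): square root of the mean squared prediction error."""
    n = len(actual)
    return math.sqrt(sum((p - y) ** 2 for y, p in zip(actual, predicted)) / n)


def cvrmse(actual, predicted):
    """Equation (3): RMSE divided by the mean of the actual values."""
    return rmse(actual, predicted) / (sum(actual) / len(actual))


# Made-up EUI values (kWh/m2), for illustration only.
actual = [30.0, 32.0, 34.0, 36.0]
predicted = [31.0, 32.0, 33.0, 36.0]
print(r2_score(actual, predicted), rmse(actual, predicted), cvrmse(actual, predicted))
```

Multiplying the result of `cvrmse` by 100 expresses it as a percentage.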

2.7. Multi-Objective Optimization with Machine Learning

Upon establishing the surrogate model, multi-objective optimization was conducted
using the Non-dominated Sorting Genetic Algorithm II (NSGA-II) [30] in Python. NSGA-II
is a fast elitist algorithm renowned for its efficiency in low-dimensional optimization
problems. Its elitism mechanism preserves the best solutions across generations by merging
parent and offspring populations, ensuring convergence stability. NSGA-II may occasionally
encounter duplicate solutions [30,92], but it remains the most widely adopted genetic
algorithm in building optimization [93].
The optimization objective was to identify Pareto-optimal solutions minimizing EUI
while maximizing useful daylight illuminance (UDI). The optimization was run in
Python using the following NSGA-II parameters: population size: 200, generations: 50,
crossover rate: 0.8, mutation rate: 0.9, and elitism ratio: 0.5. This parameter combination
effectively maintained population diversity while avoiding prolonged convergence time.
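The notion of Pareto optimality used here can be illustrated with the dominance test at the core of NSGA-II. The sketch covers only that step. Crowding distance, crossover, mutation, and the surrogate model are omitted, and the (EUI, UDI) pairs are made-up values.

```python
def dominates(a, b):
    """True if design a is no worse than b in both objectives and strictly
    better in at least one (lower EUI is better, higher UDI is better)."""
    eui_a, udi_a = a
    eui_b, udi_b = b
    no_worse = eui_a <= eui_b and udi_a >= udi_b
    better = eui_a < eui_b or udi_a > udi_b
    return no_worse and better


def pareto_front(designs):
    """Return the designs that no other design dominates."""
    return [d for d in designs if not any(dominates(o, d) for o in designs)]


# Hypothetical (EUI, UDI) predictions for five candidate forms.
designs = [(33.0, 70.0), (32.6, 68.0), (34.0, 80.0), (33.5, 69.0), (34.5, 79.0)]
front = pareto_front(designs)
print(front)
```

In a full NSGA-II run, this test is applied repeatedly to rank the merged parent and offspring population into successive fronts before selection.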

3. Results and Discussion

3.1. Analysis of the Building Performance Datasets
Prior to implementing machine learning (ML), conducting a preliminary analysis of
performance simulation results is crucial. This analysis aids researchers in refining ML
model parameters (e.g., judging relevant variables) and identifying potential prediction
issues (e.g., output distortion).
Figure 7 presents the statistical distributions of the 500 simulation results for EUI and
UDI. The data closely approximate normal distributions (R²_EUI = 0.94, R²_UDI = 0.98),
supporting the reliability of the simulation outcomes. The mean EUI of 34.02 kWh/m²
is 1.67% higher than that of the baseline model, while the mean UDI of 70.91 is 3.71% lower,
indicating that the baseline parameters are reasonable. However, the baseline model’s EUI
and UDI lag behind the best simulated results (EUI = 32.53, UDI = 81.70) by 2.78% and
11.1%, respectively, demonstrating substantial optimization potential for high-rise office
buildings in the HSWW zone.
Figure 8 presents Pearson correlation coefficients (r) between 10 architectural form
parameters and building performance metrics. Pearson’s r quantifies the linear relationship
strength between two variables, with values ranging from −1 to 1. Generally, |r| in the
range of 0.3–0.5 indicates moderate correlation, while values > 0.5 mean a significantly
strong correlation [94].
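For reference, Pearson's r and the strength bands quoted above can be computed as follows. This is a standard-library sketch; the label "weak" for |r| below 0.3 is an assumption, since the text only names the moderate and strong bands.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def strength(r):
    """Label |r| with the bands from the text; 'weak' is an assumed label."""
    magnitude = abs(r)
    if magnitude > 0.5:
        return "strong"
    if magnitude >= 0.3:
        return "moderate"
    return "weak"


# Perfectly linear toy data give r = 1 and r = -1 up to rounding.
print(pearson_r([1, 2, 3], [2, 4, 6]), pearson_r([1, 2, 3], [6, 4, 2]))
```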

Figure 7. Frequency distribution plot of EUI and UDI values: (a) frequency distribution plot of EUI;
(b) frequency distribution plot of UDI.

Figure 8. Heatmap of Pearson correlation coefficients (r) between building form parameters and
EUI/UDI.
Heatmap analysis reveals that EUI correlates negatively with HSS (horizontal sunshade
size, r = −0.637), D (spatial depth, r = −0.395), W (building width, r = −0.369), WSH
(windowsill height, r = −0.296), and VSS (vertical shading size, r = −0.286). By the
thresholds above, the correlation is strong for HSS, moderate for D and W, and below the
moderate range for WSH and VSS. The signs indicate that larger shading devices, increased
windowsill heights, and deeper spatial configurations reduce solar penetration and cooling
energy consumption. For UDI, a moderate negative correlation with D (r = −0.342) arises from
limited daylight access in deep-plan spaces, while a moderate positive correlation with HSS
(r = 0.496) suggests that expanded horizontal shading mitigates glare risks while
maintaining adequate daylight levels.
3.2. Training and Evaluation of Machine Learning Models
The 500 simulated samples were randomly divided into 3 groups: 80% (400 samples)
served as the training set, 10% (50 samples) were used for the validation set in an early
stopping mechanism to prevent overfitting, and the remaining 10% (50 samples) formed
the test set for validating prediction accuracy.
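The 80/10/10 partition can be reproduced with a shuffled index split. The sketch below is illustrative: the function name and the random seed are assumptions, and the study's actual splitting code is not given in the text.

```python
import random


def split_indices(n_samples, train=0.8, val=0.1, seed=42):
    """Shuffle sample indices and cut them into train/validation/test parts."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    n_train = round(n_samples * train)
    n_val = round(n_samples * val)
    return idx[:n_train], idx[n_train:n_train + n_val], idx[n_train + n_val:]


train_idx, val_idx, test_idx = split_indices(500)
print(len(train_idx), len(val_idx), len(test_idx))  # 400 50 50
```

Keeping the test indices untouched during training and early stopping is what allows the test-set scores to serve as an estimate of prediction accuracy on unseen designs.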
Table 4 presents the training results of the five machine learning models. CatBoost
outperformed all other algorithms, while MLP, XGBoost, and RF also demonstrated
competitive performances. Notably, SVR performed poorly in this context. A key observation
is that three of the four best-performing models (CatBoost, XGBoost, and RF) are ensemble
learning algorithms, suggesting their potential superiority in predicting building performance
using architectural form parameters. The CatBoost model, as the best performer, was selected for
subsequent NSGA-II multi-objective optimization.
Figure 9 illustrates the training progression of the CatBoost model and its regression
prediction performance. During training, both training and validation losses demonstrated
sustained downward trends, indicating overall stability in the optimization process. The
validation loss stabilized below 0.1 after approximately 18,000 iterations, at which point
the model snapshot was selected as the final performance prediction surrogate model.
Regression plots reveal that predicted outputs for the training, validation, and test sets
align closely with target values along the diagonal, reflecting strong fitting relationships.
This confirms the model’s capability to capture complex non-linear relationships between
architectural form parameters and EUI/UDI.

Table 4. The training results of the machine learning models.

Model Name    R²        RMSE      CVRMSE (%)
MLP           0.8728    0.2486    5.96
SVR           0.4476    0.5182    37.89
RF            0.8224    0.2938    15.1
XGBoost       0.8672    0.2541    8.89
CatBoost      0.9406    0.1930    1.57

Figure 9. Training progression of the CatBoost model and its regression prediction performance:
(a) MSE changes during iterations; (b) regression of training data; (c) regression of validation
data; (d) regression of test data.

Additionally, to evaluate whether the prediction model can be applied to performance
prediction for high-rise office buildings in other cities within the HSWW zone, this study
conducted performance predictions on 50 random parameter samples and simulated their
actual EUI and UDI using the climate data of Shenzhen, which, similar to Guangzhou, also
has a large number of high-rise office buildings in the region. The comparison results are
shown in Figure 10. It can be observed that the actual EUI of high-rise office buildings in
Shenzhen is generally slightly higher than that in Guangzhou, while the UDI is generally
slightly lower. This result is likely due to two factors: (1) Shenzhen’s average temperature
(23.89 °C) is approximately 3.50% higher than Guangzhou’s (23.08 °C), which causes extra
cooling consumption; and (2) the monthly hourly data of horizontal diffuse illuminance
(HDI) above 10,000 lux in Shenzhen (119 h) are 4.2% higher than in Guangzhou (114 h), a
factor that increases the likelihood of glare and decreases UDI.
Figure
Figure 10. Distribution of ML predictions, Shenzhen’s performance simulation results, and adjusted
values incorporating climatic differences for 50 random parameter samples.

When adjusting the ML model’s predictions by incorporating the 3.50% temperature
difference and 4.2% monthly hourly HDI difference identified earlier, the average deviations
between the adjusted predictions and Shenzhen’s actual data are approximately 1.33% and
1.80%, respectively. This indicates that after adjusting for climatic data differences, the
prediction model demonstrates certain general predictive capabilities within the HSWW
zone, and can be used in the early stages of architectural design where high precision is
not required. However, for more detailed form optimization of buildings in specific cities,
retraining the model with the target city’s simulation data is necessary.

3.3. Interpretability Analysis of Machine Learning Model Based on SHAP

Even when the CatBoost model exhibits a superior predictive performance, its predic-
tions remain difficult to fully trust if the decision-making process remains opaque to human
comprehension. Therefore, prior to engaging in performance optimization, conducting a
rigorous interpretability analysis of the predictive model is essential. Interpretability refers
to the systematic translation of complex model behaviors into causal frameworks under-
standable to humans, enabling researchers to discern the logical pathways underpinning
model decisions.
As a variant of gradient-boosted decision trees (GBDT), CatBoost operates within
a tree-based model [89]. Given the inherent explanatory advantages of SHAP (SHapley
Additive exPlanation) [95] for tree-based models, this study employs SHAP to conduct
interpretability analysis of the predictive model. Based on Shapley values from game
theory, SHAP provides a theoretically rigorous framework to quantify the contribution of
each feature to prediction outcomes [96]. SHAP not only ranks feature relevance but also
elucidates the directional impact of individual features—how increases or decreases in a
feature value influence the predicted result.
For the model developed in this study, SHAP quantifies the effect of each building’s
form parameter on EUI and UDI by computing the average marginal contribution of each
parameter across all possible feature combinations. This approach accounts for interactive
effects between parameters, offering a comprehensive understanding of how form parame-
ters collectively influence prediction results. The analytical process is conducted via the
SHAP library in Python.
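The "average marginal contribution across all possible feature combinations" is the Shapley value. For a handful of features it can be computed exactly by enumeration, as in the sketch below. This is not the SHAP library's tree algorithm: absent features are simply replaced by fixed baseline values, and the toy EUI function, its coefficients, and the parameter values are hypothetical.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, instance, baseline):
    """Exact Shapley values of `predict` at `instance`; features outside a
    coalition take their `baseline` value (a simplifying assumption)."""
    names = list(instance)
    n = len(names)

    def value(subset):
        x = {k: (instance[k] if k in subset else baseline[k]) for k in names}
        return predict(x)

    phi = {}
    for name in names:
        others = [k for k in names if k != name]
        total = 0.0
        for size in range(n):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                coalition = set(subset)
                total += weight * (value(coalition | {name}) - value(coalition))
        phi[name] = total
    return phi


# Hypothetical toy surrogate for EUI (not the trained CatBoost model).
def toy_eui(x):
    return 36.0 - 2.0 * x["HSS"] - 0.2 * x["D"] + 0.5 * x["FH"]


instance = {"HSS": 1.0, "D": 12.0, "FH": 4.0}
baseline = {"HSS": 0.5, "D": 10.0, "FH": 4.2}
phi = shapley_values(toy_eui, instance, baseline)
print(phi)
```

The contributions sum to the difference between the prediction for the instance and the prediction for the baseline, which is the additivity property that makes SHAP values interpretable as a decomposition of a single prediction.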
Figure 11 illustrates the importance ranking of each parameter for EUI and UDI
based on SHAP values. The horizontal axis represents the average of the sum of absolute
SHAP values across all samples, reflecting the influence of the form parameters on the two
performance metrics.
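The ranking quantity on the horizontal axis, the mean absolute SHAP value per parameter, can be sketched as follows. The input layout (one dictionary of SHAP values per sample) and the two sample rows are assumptions made for illustration.

```python
def mean_abs_importance(shap_rows):
    """Rank features by the mean of |SHAP value| over all samples.

    shap_rows: list of dicts mapping feature name -> SHAP value, one per sample.
    """
    names = list(shap_rows[0])
    n = len(shap_rows)
    importance = {k: sum(abs(row[k]) for row in shap_rows) / n for k in names}
    return sorted(importance.items(), key=lambda item: item[1], reverse=True)


# Made-up SHAP values for two samples and two parameters.
rows = [{"D": 0.2, "HSS": -1.0}, {"D": -0.4, "HSS": 0.6}]
ranking = mean_abs_importance(rows)
print(ranking)
```

Taking absolute values before averaging prevents positive and negative contributions from cancelling, so a parameter that pushes predictions strongly in both directions still ranks as important.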

Figure 11. SHAP importance ranking of each parameter for EUI and UDI: (a) SHAP importance
ranking for EUI; (b) SHAP importance ranking for UDI.
In Figure 11a, HSS emerges as the most impactful parameter for EUI among all form
factors, with a SHAP importance exceeding twice that of any other parameter. Parameters
such as D, WSH, FH, VSD, and VSS show moderate effects on EUI, while O, R, W, and CH
have negligible influences. For UDI, Figure 11b shows that HSS again has the strongest
effect, parallel to its role in EUI, while D and FH exhibit notably significant
impacts, with WSH, VSD, and CH contributing moderately. The remaining parameters, by
contrast, have minimal effects on UDI.
Collectively, these results highlight that HSS is a critical factor for both EUI and UDI,
underscoring the importance of horizontal shading design in the HSWW zone—consistent
with prior Pearson analysis; parameters like D, FH, and WSH rank second, third, and fourth
in importance, indicating substantial effects of spatial depth, floor height, and windowsill
height on high-rise building performance. The lower importance of vertical sunshade
parameters (VSS, VSD) indicates that horizontal sunshades are more critical than vertical
counterparts; the higher ranking of WSH compared to CH implies greater optimization
potential in adjusting windowsill height. Notably, D stands out as the only planar pa-
rameter with high importance, emphasizing that spatial depth impacts performance more
significantly than building dimensions or shape.
Unlike importance ranking, SHAP beeswarm plots evaluate the positive or negative
impacts of parameters on performance metrics: positive SHAP values indicate positive
correlations, while negative values denote negative correlations, with larger absolute values
reflecting stronger influences. Additionally, the color gradient of samples in these plots
shows the relationship between parameter values and SHAP values, where redder colors
indicate larger parameter values within their respective ranges, and bluer tones signify
smaller values.
Analysis of the SHAP importance ranking plot also reveals that the SHAP importance
of parameter CH for EUI is negligible, while its importance for UDI remains moderate.
Although removing this parameter from ML model training might theoretically improve
prediction performance, testing results indicate that excluding CH causes the model’s
overall R2 to drop from 0.94 to 0.70. This suggests that CH still plays a significant role in
maintaining prediction accuracy.

Figure 12a presents the SHAP beeswarm plot for EUI, showing that W, R, D, WSH,
HSS, and VSS correlate negatively with EUI. Among these parameters, HSS, WSH, and D
exhibit particularly pronounced negative correlations, indicating that the larger sizes of
horizontal sunshades, higher windowsills, and greater spatial depth reduce energy use
in high-rise offices of the HSWW climate zone. Conversely, FH and VSD show positive
correlations, indicating lower floor heights and smaller vertical shading distances are
linked to lower energy consumption. O displays a non-linear relationship with energy
use: intermediate O values (corresponding to south-facing orientations) are associated
with reduced energy consumption, whereas decreases in O (west-facing) or increases in O
(east-facing) correlate with higher energy use—likely due to balanced sunlight on south
facades reducing extreme thermal loads compared to east/west orientations; CH has no
significant directional trend, suggesting minimal direct impact on EUI.

Figure 12. SHAP beeswarm plots for EUI and UDI: (a) beeswarm plot of EUI; (b) beeswarm plot of UDI.

Figure 12b presents the SHAP beeswarm plot for UDI, revealing that W, D, and VSD
exhibit strong negative correlations with UDI, indicating that smaller plan widths,
shallower spatial depths, and narrower vertical shading intervals enhance UDI. R
demonstrates a moderate negative relationship with UDI, suggesting that building plans
closer to a square configuration (lower R values) correlate with higher UDI; notably, a
subset of samples with low R values displays negative SHAP values, potentially attributable
to model uncertainty or edge-case scenarios. Conversely, O, CH, and HSS show positive
correlations with UDI: east-facing orientations, increased ceiling heights, and larger
horizontal shading components consistently improve UDI performance. For HSS in particular,
the beneficial effect is hypothesized to arise from its ability to filter direct solar
radiation while preserving diffused light. In addition, FH, WSH, and VSS exhibit no clear
directional trends, a result likely arising from strong interactive effects with other
parameters.
Overall, the interpretability analysis of the prediction model using SHAP reveals the
following results: Parameters HSS, D, FH, WSH, VSS, and VSD have substantial impacts on
prediction outcomes. Larger horizontal and vertical sunshade sizes and a slightly
southeast-facing orientation can reduce EUI while improving UDI. Conversely, greater
building plan width, plan aspect ratio, and spatial depth lower EUI but diminish UDI.
Shorter floor heights (FH), higher windowsill heights (WSH), and narrower vertical
sunshade intervals reduce EUI, though their effects on UDI require evaluation in
conjunction with other parameters. Ceiling height has a minimal impact on EUI, yet larger
CH values contribute to higher UDI.
These conclusions align well with practical design experience, underscoring the
prediction model’s strong interpretability, whereby all parameters exert discernible
effects on performance without irrelevant factors, and thereby validating its credibility.
Thus, this prediction model can be employed as a reliable surrogate for form parameter
optimization via NSGA-II.
3.4. Performance and Analysis of Optimization
Leveraging the predictive capabilities of the surrogate model, the NSGA-II algorithm
was implemented in Python to derive Pareto-optimal solutions for EUI and UDI.
Approximately 2 min of optimization yielded 65 Pareto-optimal solutions.
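The notion of Pareto optimality used here can be sketched with a plain dominance test. The snippet below is not NSGA-II itself (it has no crossover, mutation, or crowding distance) and is not the authors' implementation; it only extracts the first non-dominated front from a list of candidates, assuming EUI is minimized and UDI is maximized. The candidate values are hypothetical.

```python
def dominates(a, b):
    """True if design a is at least as good as b on both objectives
    and strictly better on at least one.

    Each design is (eui, udi): lower EUI is better, higher UDI is better.
    """
    no_worse = a[0] <= b[0] and a[1] >= b[1]
    strictly_better = a[0] < b[0] or a[1] > b[1]
    return no_worse and strictly_better

def pareto_front(designs):
    """Return the designs that no other design dominates."""
    return [d for d in designs if not any(dominates(o, d) for o in designs)]

# Hypothetical (EUI in kWh/m2, UDI in %) predictions for five candidate forms.
candidates = [(32.0, 63.0), (32.4, 71.0), (32.6, 70.0), (33.2, 83.0), (33.3, 80.0)]
front = pareto_front(candidates)
```

Here the third and fifth candidates drop out because another candidate has both lower EUI and higher UDI; the remaining three cannot be ranked against one another without a designer's preference.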
Figure 13 reveals that the solutions closely align with a quadratic curve
($y = -29.65x^{2} + 1946x - 31{,}849$), reflecting a non-linear and competitive trade-off
between EUI and UDI. In the initial segment, UDI increases at a higher rate with rising
EUI, whereas in the latter part, UDI growth decelerates as EUI increases. Consequently,
the convex point of the parabola is identified as the optimal design solution balancing
both performance metrics.

Figure 13. The distribution of all Pareto-optimal solutions of EUI and UDI generated by NSGA-II.
The distribution approximately fits a quadratic curve ($y = -29.65x^{2} + 1946x - 31{,}849$,
$R^{2} = 0.99$), where y represents UDI and x represents EUI, illustrating a non-linear
trade-off between them.
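The decelerating UDI growth described for Figure 13 can be read directly from the derivative of the fitted curve. The short derivation below uses the coefficients as printed; because they are rounded, any numerical value obtained from them should be treated as indicative only.

```latex
y = -29.65x^{2} + 1946x - 31{,}849
\quad\Longrightarrow\quad
\frac{dy}{dx} = -59.30x + 1946,
\qquad
\frac{d^{2}y}{dx^{2}} = -59.30 < 0 .
```

The negative second derivative means that each additional unit of EUI yields a smaller gain in UDI than the previous one. With the printed coefficients the slope would reach zero near x ≈ 1946/59.30 ≈ 32.8 kWh/m², although small rounding changes in the coefficients shift this point noticeably.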

However, these 65 Pareto-optimal solutions were predicted by the surrogate model.
Although the surrogate model performed well on the testing data, deviations may still
exist between its predictions and actual simulation results. To assess potential errors in
these solutions, their corresponding form parameters were re-simulated using the original
simulation tool. Figure 14 illustrates the discrepancies between re-simulated results and
the initial Pareto solutions, presenting an average EUI difference rate of +0.34%, a UDI
difference rate of −1.40%, and 8 solutions with difference rates between simulated and
predicted values exceeding 5%. Figure 15 illustrates the distribution of normalized
parameter values for these 8 samples within the complete set of optimal solutions, revealing
that these outlier samples generally have larger values of W and D, as well as smaller FH
and WSH. This suggests that the prediction model may produce certain deviations for a
minority of parameter combinations at boundary conditions. While data processing and
regularization techniques can mitigate these prediction discrepancies, their complete
elimination remains unfeasible, highlighting the necessity of validating ML and GA
optimization results through re-simulation.
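The validation step described above amounts to computing a signed difference rate for each solution and setting aside those beyond a tolerance. The sketch below assumes the rate is taken relative to the surrogate prediction; the text does not state which value serves as the denominator, so this choice, the function names, and the sample values are all hypothetical.

```python
def difference_rate(simulated, predicted):
    """Signed difference of the re-simulated value relative to the
    surrogate prediction, in percent."""
    return (simulated - predicted) / predicted * 100.0

def split_by_tolerance(pairs, tolerance=5.0):
    """Split (predicted, simulated) pairs into kept and rejected lists.

    A pair is rejected when its absolute difference rate exceeds
    the tolerance (in percent).
    """
    kept, rejected = [], []
    for predicted, simulated in pairs:
        rate = difference_rate(simulated, predicted)
        if abs(rate) > tolerance:
            rejected.append((predicted, simulated, rate))
        else:
            kept.append((predicted, simulated, rate))
    return kept, rejected

# Hypothetical UDI values (%): surrogate prediction vs. re-simulation.
udi_pairs = [(70.0, 69.3), (75.0, 70.5), (80.0, 80.4)]
kept, rejected = split_by_tolerance(udi_pairs)
```

In a two-objective setting, a solution would presumably be rejected if either its EUI or its UDI rate exceeded the tolerance; the same function can be applied to each metric in turn.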

Figure 14. The distribution of Pareto solutions of EUI and UDI by re-simulation based on
Pareto-optimal solutions generated by NSGA-II.

Figure 15. Distribution of normalized parameter values for eight solutions with difference rates
exceeding 5% (orange) and other Pareto-optimal solutions (blue).

After excluding the 8 solutions with difference rates > 5%, 57 Pareto-optimal solutions
in close agreement with predictions were retained. Table 5 lists the maximum, median,
mean, and minimum values for the two performances, along with the baseline model’s values.

Table 5. Maximum, median, mean, and minimum values of the Pareto-optimal solutions for EUI/UDI,
along with baseline model values.

Performance     Minimum   Maximum   Median   Mean    Baseline
EUI (kWh/m²)    31.95     33.21     32.36    32.45   33.46
UDI (%)         62.41     83.18     71.25    72.36   73.64

Since solutions within the Pareto-optimal set are non-dominated and cannot be directly
ranked [97], designers should select appropriate designs based on their preferences.
For energy minimization, the solution with the lowest EUI (31.95 kWh/m², 4.51% lower
than the baseline model) can be chosen, but it reduces UDI by 9.70% compared to the
baseline model. Conversely, the solution with the highest UDI (83.18%, 13.0% higher than
the baseline) increases EUI by 1.34%. For balanced performance, the convex point of the
Pareto-optimal front is recommended. Five solutions at the convex point achieve both
metrics better than the baseline: a mean EUI of 32.35 kWh/m² (3.31% lower than the
baseline) and a mean UDI of 77.41% (5.12% higher than the baseline). These results
demonstrate the effectiveness of the optimization framework in balancing energy efficiency
and daylight performance.
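The percentages quoted against the baseline follow from a simple relative change. The sketch below recomputes four of them from the EUI and UDI values reported in the text and Table 5; the variable names are illustrative, and small last-digit differences from the published figures are to be expected because the inputs are already rounded.

```python
def percent_change(value, baseline):
    """Relative change of value against baseline, in percent
    (negative means lower than the baseline)."""
    return (value - baseline) / baseline * 100.0

# Baseline values as reported in Table 5.
BASELINE_EUI = 33.46  # kWh/m2
BASELINE_UDI = 73.64  # %

# Values quoted in the text for the Pareto-optimal solutions.
lowest_eui_change = percent_change(31.95, BASELINE_EUI)    # lowest-EUI solution
highest_udi_change = percent_change(83.18, BASELINE_UDI)   # highest-UDI solution
balanced_eui_change = percent_change(32.35, BASELINE_EUI)  # mean of five balanced solutions
balanced_udi_change = percent_change(77.41, BASELINE_UDI)
```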
However, in practical design scenarios, designers and decision-makers rarely adopt
optimal parameters directly due to various subjective and objective constraints. The
Pareto-optimal set therefore also provides numerical references through the maximum,
minimum, and median values of each form parameter across the Pareto-optimal solutions.
Figures 16–18 present the parameter distributions of Pareto-optimal solutions using box
plots, categorized into three aspects: planar parameters, vertical parameters, and
shading parameters.
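The numerical references mentioned above are the quantities a box plot draws. As a minimal sketch, the five-number summary for one parameter can be obtained with the Python standard library; the floor-height values below are hypothetical and the quartile convention (`method="inclusive"`) is an assumption, since plotting tools differ in how they interpolate quartiles.

```python
from statistics import median, quantiles

def box_summary(values):
    """Five-number summary of the kind drawn by a box plot:
    minimum, first quartile, median, third quartile, maximum."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    return {
        "min": min(values),
        "q1": q1,
        "median": median(values),
        "q3": q3,
        "max": max(values),
    }

# Hypothetical floor heights (m) for a handful of Pareto-optimal solutions.
floor_heights = [4.1, 4.1, 4.2, 4.2, 4.3]
summary = box_summary(floor_heights)
```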

Figure 16. Statistical distribution and median of each planar parameter for the Pareto-optimal solutions.

Figure 16 illustrates the distribution of planar parameters for Pareto-optimal solutions.
Orientations within the Pareto set cluster at 105° (15° east of south), with secondary
frequency peaks at 90° (south) and 120° (30° east of south). This clustering indicates
that southeast/south-facing orientations are optimal for high-rise office buildings in this
region, as they align with prevailing monsoon patterns, enhancing natural ventilation
and reducing west-facing solar heat gain, which minimizes cooling loads. The median
plan width and length values are 40.2 m and 58.3 m, and a dominant 1.45 aspect ratio
(length/width) implies that increasing the south-facing facade area improves performance.
Spatial depth and plan area cluster at 9 m/1500 m² and 12 m/2500 m², indicating an
adaptive strategy balancing daylight access (shallow and small plans) and heat reduction
(deep and large plans).

Figure 17. Statistical distribution and median of each vertical parameter for the Pareto-optimal solutions.

Figure 18. Statistical distribution and median of each shading parameter for the Pareto-optimal solutions.

Figure 17 shows the distribution of vertical parameters for Pareto-optimal solutions.
Floor heights ranging from 4.1 to 4.3 m reflect a balance between daylight volume and
spatial comfort. The maximum allowable values for windowsill (1.2 m) and ceiling heights
(1.4 m) indicate that reduced window area contributes to enhanced overall building
performance, which is supported by window height distributions (1.4–1.9 m) and WWR (~0.3).
Figure 18 presents the distribution of shading parameters for Pareto-optimal solutions.
The uniform adoption of 1.5 m horizontal sunshade size and 3 m vertical sunshade intervals,
both representing minimum design thresholds, highlights their critical role in performance
optimization. While vertical sunshade sizes contribute less significantly than horizontal
shading, larger vertical sunshades are still preferentially selected in optimal solutions.
This parameter distribution evinces their supplementary role in mitigating solar heat gain
and reducing excessive direct solar radiation that causes indoor overheating and discomfort
glare, thereby minimizing cooling loads while addressing glare-related visual disturbances.
Collectively, these findings identify enlarged shading systems, southeast/south
orientations, substantially large or small spatial depths, high aspect ratios, and reduced
window areas as key determinants in balancing EUI and UDI for high-rise office building
performance in China’s HSWW zone. Additionally, the observed parameter distribution patterns
and median values of Pareto-optimal solutions provide meaningful design guidelines for
such buildings in this climate zone, offering valuable references for future practices that
balance energy efficiency and visual comfort criteria.

4. Conclusions
This study proposes a performance-oriented optimization method for building mor-
phology, integrating machine learning (ML) algorithms with genetic algorithms (GAs), and
establishes a complete workflow. Guided by this framework, this paper uses Guangzhou
as a case study in China’s HSWW climate zone to optimize the form parameters of local
high-rise office buildings. The primary findings are as follows:
• Through comparative analysis of multiple ML algorithms, ensemble ML algorithms
are found to effectively capture the complex nonlinear relationships between build-
ing form parameters and performance metrics. Among them, the CatBoost algo-
rithm demonstrates the best predictive performance for this study’s target (R2 = 0.94,
CVRMSE = 1.59%).
• SHAP analysis shows that horizontal sunshade size (HSS), spatial depth (D), floor
height (FH), windowsill height (WSH), vertical sunshade size (VSS), and vertical
shading distance (VSD) strongly influence the predictions of the machine learning
model. Increasing horizontal sunshade size, decreasing vertical shading distance, and
rotating the building slightly toward the southeast are the most effective adjustments,
reducing EUI while improving UDI. In general, SHAP analysis indicates that shading
parameters have the greatest effect on performance results, followed by vertical
parameters, with planar parameters exerting the smallest influence.
• The Pareto-optimal morphological parameters generated by the surrogate model show
good agreement with their corresponding actual simulation results, with 87.7% (57 out
of 65) of the results having an error rate below 5% and an average error rate of 0.34%
for EUI and −1.4% for UDI. This demonstrates the effectiveness of the integrated
optimization approach using machine learning and genetic algorithms.
• Compared to the baseline model, a Pareto-optimal solution achieves a 3.31% reduction
in EUI and a 5.12% increase in UDI.
• Based on the Pareto-optimal solutions, the following design strategies for form
parameters are proposed to fully enhance the energy-saving potential of high-rise office
buildings in China’s HSWW zone: (1) adopting a building orientation ranging from
due south to 30 degrees east of south; (2) using a rectangular floor plan measuring
approximately 40 m in width and 58 m in length (an aspect ratio of 1.45, total area of
about 2300 m², and office area depth of 12 m); (3) implementing a facade design with a
floor height of 4.0–4.2 m, the largest feasible windowsill and ceiling heights, and a
window-to-wall ratio of 0.37–0.45; and (4) employing horizontal and vertical sunshades
longer than 1.3 m as well as high-density vertical sunshades.
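For readers unfamiliar with the two accuracy metrics quoted in the findings above, the sketch below shows one common way to compute them. It assumes CVRMSE is the RMSE divided by the mean of the reference values with n in the denominator; some guidelines use a degrees-of-freedom correction instead, and this section does not state which form was used. The sample values are hypothetical.

```python
from math import sqrt

def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot

def cvrmse(actual, predicted):
    """Coefficient of variation of the RMSE, in percent:
    RMSE divided by the mean of the actual values."""
    n = len(actual)
    rmse = sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)
    return rmse / (sum(actual) / n) * 100.0

# Hypothetical simulated vs. surrogate-predicted EUI values (kWh/m2).
simulated = [32.0, 32.5, 33.0, 33.5]
predicted = [32.1, 32.4, 33.1, 33.4]

r2 = r_squared(simulated, predicted)
cv = cvrmse(simulated, predicted)
```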
However, this research has several limitations. First, the parametric model only uses
10 constrained form parameters, which are insufficient for designs requiring greater preci-
sion, although they are suitable for most early-stage design processes. Second, only two
performance objectives were selected, lacking consideration of other metrics (e.g., thermal
comfort, carbon emissions). Third, the algorithm comparison was limited to five well-
established classical algorithms, without exploring advanced algorithms or deep neural
network approaches. Additionally, this study only chose Guangzhou as a representative of
this climate zone. Although cities within the same climate zone share similar climatic
conditions, there are still subtle differences between them, which may lead to certain
deviations in the optimal results.
Future research should focus on the following directions: (1) applying the proposed
method to building designs with more parameters, such as the rotation angles of shading
devices and other geometric details; (2) incorporating additional performance objectives;
(3) investigating advanced machine learning (ML) algorithms to improve prediction accu-
racy; (4) extending this research method to more cities within this climate zone to obtain
more precise form optimization results and attempting to develop a universal prediction
model applicable to all major cities in this climate zone; (5) applying this method to the form
optimization of high-rise office buildings in other climatic zones of China; and (6) extending
the simulation duration to account for long-term climate change impacts.

Author Contributions: Conceptualization, X.X. and Y.N.; methodology, X.X. and Y.N.; software, X.X.;
validation, X.X., Y.N. and T.Z.; formal analysis, X.X. and T.Z.; investigation, X.X.; resources, X.X. and
Y.N.; data curation, X.X.; writing—original draft preparation, X.X.; writing—review and editing, X.X.;
visualization, X.X.; supervision, Y.N.; project administration, X.X. and Y.N.; funding acquisition, Y.N.
All authors have read and agreed to the published version of the manuscript.

Funding: This research was funded and supported by State Key Laboratory of Subtropical Building
Science, South China University of Technology for the project titled “Comprehensive Demonstration
of Green and Low-Carbon Construction Technologies for Buildings and Cities in China’s Hot-Summer
and Warm-Winter (HSWW) zone” (Grant No. 2022KC16).

Institutional Review Board Statement: Not applicable.

Informed Consent Statement: Not applicable.

Data Availability Statement: The data presented in this study are available on request from the
corresponding author.

Conflicts of Interest: Author Xie Xie was a PhD student at South China University of Technology
and was interning at the company Architectural Design & Research Institute of South China
University of Technology (SCUT) Co., Ltd. Author Yang Ni was a professor at South China University
of Technology and was employed by the company Architectural Design & Research Institute of South
China University of Technology (SCUT) Co., Ltd. Author Tianzi Zhang was employed by the company
Guangzhou International Engineering Consult Co., Ltd. The authors declare that the research was
conducted in the absence of any commercial or financial relationships that could be construed as a
potential conflict of interest.

Abbreviations
The following abbreviations are used in this manuscript:

HSWW Hot-summer and warm winter

BPS Building performance simulation
BPO Building performance optimization
EUI Energy use intensity
UDI Useful daylight illuminance
ML Machine learning
NSGA-II Non-dominated sorting genetic algorithm
SHAP SHapley Additive exPlanation

References
1. WMO Confirms That 2023 Smashes Global Temperature Record. Available online:
https://wmo.int/news/media-centre/wmo-confirms-2023-smashes-global-temperature-record (accessed on 15 March 2025).
2. The 2019 Blue Book on Climate Change in China. Available online:
https://www.cma.gov.cn/zfxxgk/gknr/qxbg/201905/t20190524_1709279.html (accessed on 15 March 2025).

3. United Nations; Intergovernmental Panel on Climate Change (IPCC). Climate Change 2013: The Physical Science Basis; Plattner, M.,
Ed.; Cambridge University Press: Cambridge, UK, 2013.
4. Geng, Y.; Sarkis, J. China-US trade spat could hit the environment. Nature 2018, 557, 309. [CrossRef] [PubMed]
5. China Association of Building Energy Efficiency; Chongqing University. Research Report on Energy Consumption and Carbon
Emission of Buildings in China (2023). Construct Archit. 2024, 2024, 46–59.
6. Reshaping Energy: A Study on the Roadmap of China’s Energy Consumption and Production Revolution Towards 2050. 2016.
Available online: https://china.lbl.gov (accessed on 15 March 2025).
7. China Association of Building Energy Efficiency. 2022 Research Report of China Building Energy Consumption and Carbon Emissions;
China Association of Building Energy Efficiency: Chongqing, China, 2023; Volume 27, p. 12.
8. Ma, M.; Cai, W.; Wu, Y. China Act on the Energy Efficiency of Civil Buildings (2008): A decade review. Sci. Total Environ. 2019,
651, 42–60. [CrossRef] [PubMed]
9. Song, L.; Zhang, C.; Li, H.J. 2015 National Green Building Evaluation Label Statistical Report. Constr. Sci. Technol. 2016, 10, 12–15.
[CrossRef]
10. Ruparathna, R.; Hewage, K.; Sadiq, R. Improving the energy efficiency of the existing building stock: A critical review of
commercial and institutional buildings. Renew. Sustain. Energy Rev. 2016, 53, 1032–1045. [CrossRef]
11. The Ministry of Housing and Urban Rural Development of China. Several Opinions of the Ministry of Housing and Urban Rural
Development on Promoting the Development and Reform of the Construction Industry. Intell. Build. City Inf. 2014, 7, 24–28.
12. American Society of Heating, Refrigeration and Air-Conditioning Engineers. ASHRAE Handbook: Fundamentals; ASHRAE:
Atlanta, GA, USA, 2009.
13. EnergyPlus. Available online: https://energyplus.net/ (accessed on 15 March 2025).
14. TRNSYS: Transient System Simulation Tool. Available online: http://www.trnsys.com/ (accessed on 15 March 2025).
15. Lin, B.; Chen, H.; Liu, Y.; He, Q.; Li, Z. A Preference-Based Multi-Objective Building Performance Optimization Method for Early
Design Stage. Build. Simul. 2021, 14, 477–494. [CrossRef]
16. Li, Y.; O’Neill, Z.; Zhang, L.; Chen, J.; Im, P.; DeGraw, J. Grey-box modeling and application for building energy simulations—A
critical review. Renew. Sustain. Energy Rev. 2021, 146, 111–174. [CrossRef]
17. Manmatharasan, P.; Bitsuamlak, G.; Grolinger, K. AI-driven design optimization for sustainable buildings: A systematic review.
Energy Build. 2025, 332, 115440. [CrossRef]
18. Clarke, J.A.; Clarke, J.A. Energy Simulation in Building Design; Routledge: London, UK, 2001.
19. Javanroodi, K.; Nik, V.M.; Mahdavinejad, M. A novel design—Based optimization framework for enhancing the energy efficiency
of high-rise office buildings in urban areas. Sustain. Cities Soc. 2019, 49, 101577. [CrossRef]
20. Božiček, D.; Kunič, R.; Krainer, A.; Stritih, U.; Dovjak, M. Mutual Influence of External Wall Thermal Transmittance, Thermal
Inertia, and Room Orientation on Office Thermal Comfort and Energy Demand. Energies 2023, 16, 3524. [CrossRef]
21. Soflaei, F.; Shokouhian, M.; Tabadkani, A.; Moslehi, H.; Berardi, U. A simulation-based model for courtyard housing design based
on adaptive thermal comfort. J. Build. Eng. 2020, 31, 101335. [CrossRef]
22. Du, Y.; Mak, C.M.; Li, Y. A multi-stage optimization of pedestrian level wind environment and thermal comfort with lift-up
design in ideal urban canyons. Sustain. Cities Soc. 2019, 46, 101424. [CrossRef]
23. Moazzeni, M.H.; Ghiabaklou, Z. Investigating the Influence of Light Shelf Geometry Parameters on Daylight Performance and
Visual Comfort, a Case Study of Educational Space in Tehran, Iran. Buildings 2016, 6, 26. [CrossRef]
24. Alhagla, K.; Mansour, A.; Elbassuoni, R. Optimizing windows for enhancing daylighting performance and energy saving. Alex.
Eng. J. 2019, 58, 283–290. [CrossRef]
25. Susa-Páez, A.; Piderit-Moreno, M.B. Geometric Optimization of Atriums with Natural Lighting Potential for Detached High-Rise
Buildings. Sustainability 2020, 12, 6651. [CrossRef]
26. Gan, V.J.L.; Wang, B.; Chan, C.M.; Weerasuriya, A.U.; Cheng, J.C.P. Physics-based, data-driven approach for predicting natural
ventilation of residential high-rise buildings. Build. Simul. 2022, 15, 129–148. [CrossRef]
27. Østergård, T.; Jensen, R.L.; Maagaard, S.E. Building simulations supporting decision making in early design—A review. Renew.
Sustain. Energy Rev. 2016, 61, 187–201. [CrossRef]
28. Wortmann, T.; Cichocka, J.; Waibel, C. Simulation-based optimization in architecture and building engineering—Results from an
international user survey in practice and research. Energy Build. 2022, 259, 111863. [CrossRef]
29. Radford, A.D.; Gero, J.S. On optimization in computer aided architectural design. Build. Environ. 1980, 15, 73–80. [CrossRef]
30. Deb, K. Multi-Objective Optimization Using Evolutionary Algorithm; John Wiley & Sons: Hoboken, NJ, USA, 2001; p. 497.
31. Longo, S.; Montana, F.; Riva Sanseverino, E. A review on optimization and cost-optimal methodologies in low-energy buildings
design and environmental considerations. Sustain. Cities Soc. 2019, 45, 87–104. [CrossRef]
32. Attia, S. Computational Optimisation for Zero Energy Building Design, Interviews with Twenty Eight International Experts.
In Proceedings of the Building Simulation 2013—13th International IBPSA Conference, Chambery, France, 25–28 August 2012;
Architecture et Climat: Paris, France, 2012.

33. Wetter, M.; Wright, J.A. A comparison of deterministic and probabilistic optimization algorithms for nonsmooth
simulation-based optimization. Build. Environ. 2004, 39, 989–999. [CrossRef]
34. Hamdy, M.; Nguyen, A.-T.; Hensen, J.L.M. A performance comparison of multi-objective optimization algorithms for solving
nearly-zero-energy-building design problems. Energy Build. 2016, 121, 57–71. [CrossRef]
35. modeFRONTIER. Available online: http://www.esteco.com/modefrontier (accessed on 15 March 2025).
36. Octopus. Available online: https://www.grasshopper3d.com/group/octopus?overrideMobileRedirect=1 (accessed on
15 March 2025).
37. Alelwani, R.; Ahmad, M.W.; Rezgui, Y.; Alshammari, K. Optimising Energy Efficiency and Daylighting Performance for Designing
Vernacular Architecture—A Case Study of Rawshan. Sustainability 2025, 17, 315. [CrossRef]
38. Wang, M.; Xu, Y.; Shen, R.; Wu, Y. Performance-Oriented Parametric Optimization Design for Energy Efficiency of Rural
Residential Buildings: A Case Study from China’s Hot Summer and Cold Winter Zone. Sustainability 2024, 16, 8330. [CrossRef]
39. Chaturvedi, S.; Rajasekar, E.; Natarajan, S. Multi-objective Building Design Optimization under Operational Uncertainties Using
the NSGA II Algorithm. Buildings 2020, 10, 88. [CrossRef]
40. Zhao, J.; Du, Y. Multi-objective optimization design for windows and shading configuration considering energy consumption and
thermal comfort: A case study for office building in different climatic regions of China. Sol. Energy 2020, 206, 997–1017. [CrossRef]
41. Amasyali, K.; El-Gohary, N.M. A review of data-driven building energy consumption prediction studies. Renew. Sustain. Energy
Rev. 2018, 81, 1192–1205. [CrossRef]
42. Qiao, Q.; Yunusa-Kaltungo, A.; Edwards, R.E. Towards developing a systematic knowledge trend for building energy consumption
prediction. J. Build. Eng. 2021, 35, 101967. [CrossRef]
43. Zhang, L.; Wen, J.; Li, Y.; Chen, J.; Ye, Y.; Fu, Y.; Livingood, W. A review of machine learning in building load prediction. Appl.
Energy 2021, 285, 116452. [CrossRef]
44. Kalogirou, S.A. Applications of artificial neural-networks for energy systems. Appl. Energy 2000, 67, 17–35. [CrossRef]
45. Wong, S.L.; Wan, K.K.W.; Lam, T.N.T. Artificial neural networks for energy analysis of office buildings with daylighting. Appl.
Energy 2010, 87, 551–557. [CrossRef]
46. Moon, J.W.; Kim, J.-J. ANN-based thermal control models for residential buildings. Build. Environ. 2010, 45, 1612–1625. [CrossRef]
47. Geyer, P.; Singaravel, S. Component-based machine learning for performance prediction in building design. Appl. Energy 2018,
228, 1439–1453. [CrossRef]
48. Shao, M.; Wang, X.; Bu, Z.; Chen, X.; Wang, Y. Prediction of energy consumption in hotel buildings via support vector machines.
Sustain. Cities Soc. 2020, 57, 102128. [CrossRef]
49. Liu, Y.; Chen, H.; Zhang, L.; Wu, X.; Wang, X.-J. Energy consumption prediction and diagnosis of public buildings based on
support vector machine learning: A case study in China. J. Clean. Prod. 2020, 272, 122542. [CrossRef]
50. Cai, W.; Wen, X.; Li, C.; Shao, J.; Xu, J. Predicting the energy consumption in buildings using the optimized support vector
regression model. Energy 2023, 273, 127188. [CrossRef]
51. Wu, C.; Pan, H.; Luo, Z.; Liu, C.; Huang, H. Multi-objective optimization of residential building energy consumption, daylighting,
and thermal comfort based on BO-XGBoost-NSGA-II. Build. Environ. 2024, 254, 111386. [CrossRef]
52. Yan, K.; Li, W.; Ji, Z.; Qi, M.; Du, Y. A Hybrid LSTM Neural Network for Energy Consumption Forecasting of Individual
Households. IEEE Access 2019, 7, 157633–157642. [CrossRef]
53. Yu, Z.; Haghighat, F.; Fung, B.C.M.; Yoshino, H. A decision tree method for building energy demand modeling. Energy Build.
2010, 42, 1637–1646. [CrossRef]
54. Breiman, L. Random forests. Mach. Learn. 2001, 45, 5–32. [CrossRef]
55. Wang, Z.; Wang, Y.; Zeng, R.; Srinivasan, R.S.; Ahrentzen, S. Random Forest based hourly building energy prediction. Energy
Build. 2018, 171, 11–25. [CrossRef]
56. Pham, A.-D.; Ngo, N.-T.; Truong, T.T.H.; Huynh, N.-T.; Truong, N.-S. Predicting energy consumption in multiple buildings using
machine learning for improving energy efficiency and sustainability. J. Clean. Prod. 2020, 260, 121082. [CrossRef]
57. Chi, B.; Li, Y.; Zhou, D. A Hybrid Method of Cooling and Heating Consumption Prediction for Six Types of Buildings Based on
Machine Learning. Sustainability 2024, 16, 11200. [CrossRef]
58. Safarzadegan Gilan, S.; Goyal, N.; Dilkina, B. Active learning in multi-objective evolutionary algorithms for sustainable building
design. In Proceedings of the Genetic and Evolutionary Computation Conference 2016, Denver, CO, USA, 20–24 July 2016.
59. Chen, X.; Yang, H. A multi-stage optimization of passively designed high-rise residential buildings in multiple building operation
scenarios. Appl. Energy 2017, 206, 541–557. [CrossRef]
60. Gou, S.; Nik, V.M.; Scartezzini, J.L.; Zhao, Q.; Li, Z. Passive design optimization of newly-built residential buildings in Shanghai
for improving indoor thermal comfort while reducing building energy demand. Energy Build. 2018, 169, 484–506. [CrossRef]
61. Ilbeigi, M.; Ghomeishi, M.; Dehghanbanadaki, A. Prediction and optimization of energy consumption in an office building using
artificial neural network and a genetic algorithm. Sustain. Cities Soc. 2020, 61, 102325. [CrossRef]
62. Chen, R.; Tsay, Y.-S.; Ni, S. An integrated framework for multi-objective optimization of building performance: Carbon emissions,
thermal comfort, and global cost. J. Clean. Prod. 2022, 359, 131978. [CrossRef]
63. Ding, Z.; Li, J.; Wang, Z.; Xiong, Z. Multi-Objective Optimization of Building Envelope Retrofits Considering Future Climate
Scenarios: An Integrated Approach Using Machine Learning and Climate Models. Sustainability 2024, 16, 8217. [CrossRef]
64. Si, B.; Ni, Z.; Xu, J.; Li, Y.; Liu, F. Interactive effects of hyperparameter optimization techniques and data characteristics on
the performance of machine learning algorithms for building energy metamodeling. Case Stud. Therm. Eng. 2024, 55, 104124.
[CrossRef]
65. Al-Masrani, S.M.; Al-Obaidi, K.M. Dynamic shading systems: A review of design parameters, platforms and evaluation strategies.
Autom. Constr. 2019, 102, 195–216. [CrossRef]
66. Zhou, F.; Wang, Z.; Su, X.; Yang, Y.; Duanmu, L.; Zhou, X.; Lian, Z.; Zhai, Y.; Cao, B.; Zhang, Y.; et al. Study on the Thermal
Adaptation Model During the Transition Season in Hot Summer and Cold Winter Regions. Heat. Vent. Air Cond. 2022, 52, 132–136.
[CrossRef]
67. Kheiri, F. A review on optimization methods applied in energy-efficient building geometry and envelope design. Renew. Sustain.
Energy Rev. 2018, 92, 897–920. [CrossRef]
68. Li, S.; Liu, L.; Peng, C. A Review of Performance-Oriented Architectural Design and Optimization in the Context of Sustainability:
Dividends and Challenges. Sustainability 2020, 12, 1427. [CrossRef]
69. Xuanyuan, P.; Zhang, Y.; Yao, J.; Zheng, R. Sensitivity Analysis and Optimization of Energy-Saving Measures for Office Building
in Hot Summer and Cold Winter Regions. Energies 2024, 17, 1675. [CrossRef]
70. Ma, Y.; Deng, W.; Xie, J.; Heath, T.; Xiang, Y.; Hong, Y. Generating prototypical residential building geometry models using a new
hybrid approach. Build. Simul. 2022, 15, 17–28. [CrossRef]
71. Touloupaki, E.; Theodosiou, T. Performance Simulation Integrated in Parametric 3D Modeling as a Method for Early Stage Design
Optimization—A Review. Energies 2017, 10, 637. [CrossRef]
72. Honeybee for Grasshopper. Available online: https://github.com/mostaphaRoudsari/Honeybee/ (accessed on 15 March 2025).
73. Ward, G.J. The Radiance lighting simulation and rendering system. In Proceedings of the 21st Annual Conference on Computer
Graphics and Interactive Techniques, SIGGRAPH, Orlando, FL, USA, 24–29 July 1994.
74. GB 55015-2021; General Specification for Building Energy Efficiency and Renewable Energy Utilization. China Architecture &
Building Press: Beijing, China, 2022.
75. GB 50189-2015; General Administration of Quality Supervision, Inspection and Quarantine of the People’s Republic of China.
Design Standard for Energy Efficiency of Public Buildings. China Architecture & Building Press: Beijing, China, 2015.
76. ASHRAE Standard 90.1-2019; Energy Standard for Buildings Except Low-Rise Residential Buildings. ASHRAE: Atlanta, GA,
USA, 2019.
77. GB 50352-2019; Unified Standard for Civil Building Design. China Architecture & Building Press: Beijing, China, 2014.
78. Honeybee. Available online: https://www.ladybug.tools/honeybee.html (accessed on 15 March 2025).
79. Roudsari, M.S.; Pak, M.; Smith, A. Ladybug: A parametric environmental plugin for grasshopper to help designers create an
environmentally-conscious design. In Proceedings of the 13th International IBPSA Conference, Lyon, France, 25–28 August 2013;
Volume 8.
80. Negendahl, K.; Nielsen, T.R. Building energy optimization in the early design stages: A simplified method. Energy Build. 2015,
105, 88–99. [CrossRef]
81. Nabil, A.; Mardaljevic, J. Useful daylight illuminance: A new paradigm for assessing daylight in buildings. Light Res. Technol.
2005, 37, 41–57. [CrossRef]
82. Tian, W. A review of sensitivity analysis methods in building energy analysis. Renew. Sustain. Energy Rev. 2013, 20, 411–419.
[CrossRef]
83. Mahmoud, A.H.A.; Elghazi, Y. Parametric-based designs for kinetic facades to optimize daylight performance: Comparing
rotation and translation kinetic motion for hexagonal facade patterns. Sol. Energy 2016, 126, 111–127. [CrossRef]
84. Helton, J.C.; Johnson, J.D.; Sallaberry, C.J.; Storlie, C.B. Survey of sampling-based methods for uncertainty and sensitivity analysis.
Reliab. Eng. Syst. Saf. 2006, 91, 1175–1209. [CrossRef]
85. Ascione, F.; Bianco, N.; De Stasio, C.; Mauro, G.M.; Vanoli, G.P. Artificial neural networks to predict energy performance and
retrofit scenarios for any member of a building category: A novel approach. Energy 2017, 118, 999–1017. [CrossRef]
86. Drucker, H.; Burges, C.J.; Kaufman, L.; Smola, A.; Vapnik, V. Support vector regression machines. Adv. Neural Inf. Process. Syst.
1997, 9, 155–161.
87. Jain, R.K.; Smith, K.M.; Culligan, P.J.; Taylor, J.E. Forecasting energy consumption of multi-family residential buildings using
support vector regression: Investigating the impact of temporal and spatial monitoring granularity on performance accuracy.
Appl. Energy 2014, 123, 168–178. [CrossRef]
88. Chen, T.; Guestrin, C. XGBoost: A scalable tree boosting system. In Proceedings of the 22nd ACM SIGKDD International
Conference on Knowledge Discovery and Data Mining (KDD’16), San Francisco, CA, USA, 13–17 August 2016; Volume 9,
p. 3. [CrossRef]
89. Yandex. CatBoost: Gradient Boosting with Categorical Features. Available online: https://catboost.ai (accessed on
16 March 2025).
90. Bian, J.; Wang, J.; Yece, Q. A novel study on power consumption of an HVAC system using CatBoost and AdaBoost algorithms
combined with the metaheuristic algorithms. Energy 2024, 302, 131841. [CrossRef]
91. American Society of Heating, Refrigerating and Air-Conditioning Engineers. Measurement of Energy and Demand Savings (ASHRAE
Guideline 14-2014); ASHRAE: Atlanta, GA, USA, 2014.
92. Deb, K.; Agrawal, S.; Pratap, A.; Meyarivan, T. A fast elitist non-dominated sorting genetic algorithm for multi-objective
optimization: NSGA-II. In Proceedings of the International Conference on Parallel Problem Solving from Nature, Paris, France,
18–20 September 2000; pp. 849–858. [CrossRef]
93. Delgarm, N.; Sajadi, B.; Delgarm, S.; Kowsary, F. A novel approach for the simulation-based optimization of the buildings energy
consumption using NSGA-II: Case study in Iran. Energy Build. 2016, 127, 552–560. [CrossRef]
94. Cohen, J. Statistical Power Analysis for the Behavioral Sciences, 2nd ed.; Lawrence Erlbaum Associates: Mahwah, NJ, USA, 1988.
95. Lundberg, S.; Lee, S. A unified approach to interpreting model predictions. In Proceedings of the 31st International Conference
on Neural Information Processing Systems (NIPS’17), Long Beach, CA, USA, 4–9 December 2017; Volume 4, p. 12.
96. Salih, A.M.; Raisi-Estabragh, Z.; Galazzo, I.B.; Radeva, P.; Petersen, S.E.; Lekadir, K.; Menegaz, G. A Perspective on
Explainable Artificial Intelligence Methods: SHAP and LIME. Adv. Intell. Syst. 2024, 7, 2400304. [CrossRef]
97. Suga, K.; Kato, S.; Hiyama, K. Structural analysis of Pareto-optimal solution sets for multi-objective optimization: An application
to outer window design problems using Multiple Objective Genetic Algorithms. Build. Environ. 2010, 45, 1144–1152. [CrossRef]

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual
author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to
people or property resulting from any ideas, methods, instructions or products referred to in the content.
