
Design and optimization of neural networks for multifidelity cosmological emulation

Yanhui Yang (杨焱辉)¹,∗ Simeon Bird¹,† Ming-Feng Ho (何銘峰)¹,²,³, and Mahdi Qezlou⁴

¹ Department of Physics & Astronomy, University of California, Riverside, 900 University Ave., Riverside, CA 92521, USA
² Department of Physics, University of Michigan, 450 Church St, Ann Arbor, MI 48109, USA
³ Leinweber Center for Theoretical Physics, 450 Church St, Ann Arbor, MI 48109, USA
⁴ The University of Texas at Austin, 2515 Speedway Boulevard, Stop C1400, Austin, TX 78712, USA

∗ [email protected]
† [email protected]
(Dated: July 11, 2025)
Accurate and efficient simulation-based emulators are essential for interpreting cosmological survey data down to nonlinear scales. Multifidelity emulation techniques reduce simulation costs by combining high- and low-fidelity data, but traditional regression methods such as Gaussian processes struggle with scalability in sample size and dimensionality. In this work, we present T2N-MusE, a neural network framework characterized by (i) a novel 2-step multifidelity architecture, (ii) a 2-stage Bayesian hyperparameter optimization, (iii) a 2-phase k-fold training strategy, and (iv) a per-z principal component analysis strategy. We apply T2N-MusE to selected data from the Goku simulation suite, covering a 10-dimensional cosmological parameter space, and build emulators for the matter power spectrum over a range of redshifts with different configurations. We find that the emulators outperform our earlier Gaussian process models significantly and demonstrate that each of these techniques is efficient in training neural networks and/or effective in improving generalization accuracy. We observe a reduction in validation error by more than a factor of five compared to previous work. This framework has been used to build the most powerful emulator for the matter power spectrum, GokuNEmu, and will also be used to construct emulators for other statistics in the future.

I. INTRODUCTION

Cosmological surveys such as DESI [1], LSST [2], Euclid [3], the Nancy Grace Roman Space Telescope [4], the China Space Station Telescope (CSST) [5], and the Prime Focus Spectrograph (PFS) on the Subaru Telescope [6] will enable precise measurements of the galaxy power spectrum, as well as the weak lensing shear field. These measurements will be used to constrain cosmological models motivated by unresolved fundamental physics questions.

Interpreting the data and inferring cosmological parameters requires making predictions for the matter field or a summary statistic, such as the matter power spectrum, and using Bayesian methods. A naive inference run may require 10^6–10^7 matter power spectrum evaluations at different cosmological parameters, which would be computationally expensive.

Emulation replaces intensive numerical computation for every likelihood evaluation by the evaluation of a cheap pre-trained surrogate model. For instance, emulators have been widely used to replace the Boltzmann codes in cosmological inference [7–14]. Emulators based on N-body simulations are needed to interpret observations on nonlinear scales, k ≳ 0.1 h/Mpc. There have been several such cosmological emulators, e.g., FrankenEmu [15–17], the emulators of the Aemulus project [18–20], NGenHalofit [21], the Dark Quest emulator [22], BE-HaPPY [23], the baryonification emulator of the BACCO project [24], the emulators built on the Quijote simulations [25], the emulators based on the Mira-Titan Universe suite [26–30], the E-MANTIS emulator [31], EuclidEmulator [32, 33], the CSST Emulator [34] and GokuEmu [35]. These emulators are able to predict summary statistics within their parameter space at orders of magnitude lower computational cost than full simulations.

There are several well-motivated extensions of the standard cosmological model which are constrained by current and future surveys. However, including these extensions in emulators is challenging due to the high dimensionality of the parameter space, which necessitates a large number of computationally expensive samples. Multi-fidelity (MF) techniques have been developed to reduce the computational cost of building emulators, e.g., MFEmulator [36] and MF-Box [37]. Ref. [35] built GokuEmu, an emulator for the matter power spectrum, which expanded the parameter space to 10 dimensions for the first time, taking into account dynamical dark energy, massive neutrinos, the effective number of ultra-relativistic neutrinos and the running of the primordial spectral index. This was achieved using MF-Box, which combines simulations with different box sizes and particle loads, at a computational cost 94% lower than single-fidelity approaches.

Despite the success of MF-Box in reducing the computational cost of producing the training data (simulations), the regression technique used, Gaussian process (GP) regression, still suffers from the curse of dimensionality. The computational complexity of GP regression scales poorly (cubically) with sample size (see Chapter 8 of Ref. [38] or Chapter 9 of Ref. [39]). This in turn leads to lengthy prediction and training times, as well as increased memory usage. GP regression struggles to satisfy our need for next-generation cosmological emulators, which would ideally become yet more complex, including non-standard dark matter models or baryonic physics.


Neural networks (NNs) have been used in emulators. For example, Ref. [40] built an NN emulator for the Lyman-α forest 1D flux power spectrum, Ref. [41] constructed an MF emulator for large-scale 21 cm lightcone images using generative adversarial networks, and Ref. [42] trained models for gravitational waves using NNs. NNs are suitable for larger data sets, given that they typically scale linearly or sublinearly with sample size (see, e.g., Ref. [43]). They are also more efficient in inference time and memory usage. In addition, Ref. [44] showed that NN MF regression can outperform GP regression in terms of accuracy in some cases and suggested that a high-dimensional parameter space would prevent GP regression from being effective.

In this work, we develop the “Triple-2” neural network framework for multifidelity cosmological emulation (T2N-MusE), characterized by a “2-step” MF architecture, a “2-stage” hyperparameter optimization process, and a “2-phase” k-fold training strategy. Compared to Ref. [44], we have made several improvements. We introduce a modified “2-step” MF architecture, which turns out to be more suitable in the context of cosmological emulation than the original “2-step” architecture. The “2-stage” hyperparameter optimization process and “2-phase” training strategy further improve the emulation performance. In addition, we propose a per-redshift data compression strategy to further boost the emulator's accuracy. We test the performance of T2N-MusE on selected data from the Goku simulation suite [35], demonstrating the efficacy of these training strategies.

We organize this paper as follows. Sec. II introduces the cosmological simulation data used in this study (Sec. II A), the MF architectures of the neural networks (Sec. II B), the workflow of training the neural networks (Sec. II C), and the comparative study we design to evaluate the performance of different choices of architectures and strategies for data compression and optimization of NNs (Sec. II D). In Sec. III, we present the results of the comparative study, showing the effects of different approaches on the emulation performance. Finally, we conclude in Sec. IV.

II. METHODS

A. Simulation Data

We briefly recap the Goku simulation suite and the specific data we use in this work. This paper focuses on the machine learning techniques we have developed for building highly optimized emulation models. For more details on the simulation suite, please refer to Ref. [35].

Goku is a suite of N-body simulations that covers 10 cosmological parameters, performed using the MP-Gadget code [45]. A relatively large number of low-fidelity (LF) simulations were sampled in the parameter space using a sliced Latin hypercube design [46], and a small number of high-fidelity (HF) cosmologies were chosen from the LF cosmologies so as to optimize the available HF information. Goku includes two Latin hypercubes that cover different parameter boxes, Goku-W and Goku-N, with wide and narrow ranges of parameters, respectively. For convenience of testing the methods, only Goku-W will be used in this study. We note that the emulator trained on Goku-W exhibits larger generalization errors than that trained on Goku-N [35], underscoring the need for improved modeling over this broader parameter space. Although there are two LF nodes, L1 and L2, we only use L2 in this study (hereafter, we refer to L2 as LF).¹ We summarize the LF and HF simulations in Table I. The redshifts considered are z = 0, 0.2, 0.5, 1, 2 and 3. The matter power spectra measured from these simulations, along with their cosmologies, are the data we use to train the neural networks.

¹ L2 corresponds to the k range where emulation error dominates the total uncertainty, making it more suitable for evaluating improvement from the applied techniques.

TABLE I. Specifications and numbers of simulations in the Goku-W suite.

  Simulation fidelity   Box size (Mpc/h)   Particle load   Number of simulations
  HF                    1000               3000^3          n_H = 21
  LF                    250                750^3           n_L = 564

Specifically, the input of the target model is the 10 cosmological parameters², i.e., the input vector x ∈ R^{d_in}, where d_in = 10. The output is the matter power spectrum at a series of k modes and redshifts, i.e., the output vector

y = [y(z_1, k_1), y(z_1, k_2), ..., y(z_1, k_{n_k}),
     y(z_2, k_1), ..., y(z_2, k_{n_k}),
     ...,
     y(z_{n_z}, k_1), ..., y(z_{n_z}, k_{n_k})],    (1)

where y(z_i, k_j) = lg P(z_i, k_j) is the matter power spectrum in log space at redshift z_i and wavenumber k_j, n_z is the number of z bins, and n_k is the number of k modes. The output vector y ∈ R^{d_out}, where d_out = n_z × n_k. In our case with Goku-W, we have n_z = 6 and n_k = 64, hence d_out = 384.

² In practice, they are normalized to [−0.5, 0.5].

B. Multifidelity Architectures

Ref. [44] proposed a “2-step” architecture for NN MF regression, which consists of two NNs. A first NN is trained on the LF data, T^L = {(x^{L,i}, y^{L,i}) : i = 1, 2, ..., n_L}, to learn the LF function f^L, such that y^L = f^L(x). Then a second NN approximates the correlation between the LF and HF functions, F,

such that y^H = F(x, y^L), based on the input data (X^H, f^L_NN(X^H)) = {(x^{H,i}, f^L_NN(x^{H,i})) : i = 1, 2, ..., n_H} and the available HF output data Y^H = y^H(X^H) = {y^{H,i} : i = 1, 2, ..., n_H}. Note that in our case, the HF cosmologies are a subset of the LF cosmologies, so we can replace f^L_NN(X^H) with y^L(X^H), such that the two NNs can be trained independently and simultaneously. While Ref. [44] restricts the second NN to be a shallow NN with only one hidden layer, we allow multiple hidden layers in the second NN to increase the flexibility of the model.

Figure 1 illustrates the original 2-step architecture with a simple example of 2D input and 3D output. Note that the input of NN_LH (the NN modeling the LF-HF correlation) is a 5D vector, which is a concatenation of the LF output and the initial input vector.

We propose a modified 2-step architecture with the same NN_L but a different NN_LH, illustrated in Fig. 1. Instead of approximating the correlation between the LF and HF functions, the new NN_LH learns the ratio of y^H to y^L, r, with components r_i = y^H_i / y^L_i for i = 1, 2, ..., d_out, as a function of the input vector x, i.e., r = G(x). The training data for NN_LH are T^H = {(x^{H,i}, r^{H,i}) : i = 1, 2, ..., n_H}, where r^{H,i} = y^{H,i} ⊘ f^L_NN(x^{H,i}). As before, we replace f^L_NN(x^{H,i}) with y^L(x^{H,i}). With the trained NN_L and NN_LH, we can predict the HF output as y^H_NN = G_NN(x) ⊙ f^L_NN(x).³ Note that, for the matter power spectrum, the ratio is calculated in original space rather than log space.

³ The symbol ⊙ denotes element-wise multiplication, and ⊘ denotes element-wise division.

The modified 2-step model significantly reduces the dimensionality of the input of NN_LH, which is d_in + d_out in the original architecture and d_in in the modified architecture. This is particularly important for high-dimensional output data, such as the matter power spectrum, where d_out ≫ d_in.

We will test the performance of both architectures in our comparative study (Sec. II D) and show the results in Sec. III A.
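To make the two-step structure concrete, the following minimal PyTorch sketch (our illustration, not the released T2N-MusE code; the layer widths and depths are placeholders) builds NN_L and the modified NN_LH and combines them as y^H_NN = G_NN(x) ⊙ f^L_NN(x), applying the ratio in linear power units as described above.

```python
import torch
import torch.nn as nn

def mlp(d_in, d_out, width=128, depth=3):
    """Fully-connected network with SiLU activations (widths/depths are illustrative)."""
    layers, d = [], d_in
    for _ in range(depth):
        layers += [nn.Linear(d, width), nn.SiLU()]
        d = width
    layers.append(nn.Linear(d, d_out))
    return nn.Sequential(*layers)

d_in, d_out = 10, 384          # 10 cosmological parameters; d_out = n_z * n_k = 6 * 64
nn_L = mlp(d_in, d_out)        # step 1: x -> y^L (log power spectrum at all z and k)
nn_LH = mlp(d_in, d_out)       # step 2 (modified): x -> r = P^H / P^L; the input is x only.
                               # In the original architecture the input would instead be the
                               # concatenation (x, y^L), of dimension d_in + d_out.

def predict_hf(x):
    """HF prediction: correct the LF spectrum with the learned ratio in linear power units."""
    log_PL = nn_L(x)           # predicted LF log-spectrum
    r = nn_LH(x)               # predicted HF/LF ratio
    return r * 10.0 ** log_PL  # element-wise product, r ⊙ P^L

x = torch.rand(5, d_in) - 0.5  # parameters normalized to [-0.5, 0.5]
print(predict_hf(x).shape)     # torch.Size([5, 384])
```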
C. Neural Network Workflow

We show a schematic of the training workflow for a highly optimized fully-connected neural network (FCNN) in Fig. 2. First, we perform data compression to reduce the dimensionality of the output using principal component analysis (PCA). Then we explore the hyperparameter space of the neural networks through a two-stage Bayesian optimization process. In the first stage, we perform a coarse search over a large space of hyperparameters, and in the second stage, we perform a fine-tuning search over a narrower space. The bounds of the fine-tuning search are defined around the best-performing configurations found in the first stage. Each evaluation of the hyperparameters involves both training and validation of an NN. Importantly, this pipeline is not limited to the multifidelity emulation context; it is broadly applicable to other tasks, including single-fidelity emulation and general regression problems involving high-dimensional outputs.

More details of each component of the workflow are given in their dedicated sections. See Sec. II C 1 for data compression, Sec. II C 2 for neural network training, and Sec. II C 3 for hyperparameter optimization.

1. Data compression

We use PCA to reduce the dimensionality of the output data. Two strategies are explored in this work: global PCA and per-redshift (hereafter, local) PCA. The former was adopted in some existing emulators, e.g., EuclidEmulator2 [33] and the CSST Emulator [34]. We propose the latter as a new approach to compress the output data, allowing a more flexible representation that may be better suited to cases where the redshift evolution of the output is nonlinear or complex.

In the global PCA approach, we perform PCA on all k modes and redshifts together, and then each of the original output components can be expressed as a linear combination of the principal components (PCs), i.e.,

y(z_i, k_j; x) = \mu(z_i, k_j) + \sum_{l=1}^{n_{PCA}} a_l(x) \phi_l(z_i, k_j),    (2)

where \mu(z_i, k_j) is the mean of the output data, a_l(x) is the coefficient of the lth PC (i.e., eigenvector), \phi_l(z_i, k_j) is the lth PC at redshift z_i and wavenumber k_j, and n_PCA is the number of PCs. The compressed output can then be expressed as

y_c = [a_1, a_2, ..., a_{n_PCA}],    (3)

reducing the dimensionality of the output from n_z × n_k [see Eq. (1)] to d^{glob}_out = n_PCA.

In the local PCA, we perform PCA on each redshift separately. For redshift z_i, we have

y(z_i, k_j; x) = \mu^i(k_j) + \sum_{l=1}^{n^i_{PCA}} a^i_l(x) \phi^i_l(k_j),    (4)

where \mu^i, a^i_l and \phi^i_l are the mean, coefficient and PC at redshift z_i, respectively, and n^i_PCA is the number of PCs for z_i. The compressed output vector is then

y_c = [a^{(1)}_1, a^{(1)}_2, ..., a^{(1)}_{n^{(1)}_{PCA}}, a^{(2)}_1, ..., a^{(2)}_{n^{(2)}_{PCA}}, ..., a^{(n_z)}_1, ..., a^{(n_z)}_{n^{(n_z)}_{PCA}}],    (5)

with dimensionality d^{loc}_out = \sum_{i=1}^{n_z} n^i_PCA.

[FIG. 1 (schematic): Step 1: NN_L; Step 2: NN_LH. Panels: (a) Original, (b) Modified. Each network is drawn as Input → Hidden layers → Output for a 2D input (x_1, x_2) and 3D output. In (a), NN_LH takes (x_1, x_2, y^L_1, y^L_2, y^L_3) and outputs (y^H_1, y^H_2, y^H_3); in (b), it takes (x_1, x_2) and outputs (r_1, r_2, r_3), with y^H_i = r_i y^L_i.]

FIG. 1. Examples of the original and modified 2-step MF NN architectures. Both architectures have the same NN_L (step 1: the LF NN) but different NN_LH (step 2: the NN used to correct the LF output). The original NN_LH (a) approximates the correlation between the LF and HF functions, with (x, y^L) as input and y^H as output. The modified NN_LH (b) learns the mapping from x to the ratio of y^H to y^L, r = G(x), and the final HF output is the element-wise product of the LF output with the correction ratio r.

[FIG. 2 (flowchart): Data → Compression (I) → Compressed data → Hyperparameter optimization (II), comprising an initial search and fine-tuning, each involving training (III) and validation (IV) → Model. Techniques: I. PCA; II. Bayesian optimization; III. Fully-connected NN; IV. k-fold cross validation.]

FIG. 2. Overview of the workflow of training a highly optimized NN. The workflow consists of three main steps: data compression, hyperparameter optimization, and training the final model. When optimizing the hyperparameters, a large space is explored in the initial search stage, and then a smaller space is searched in the second stage (fine-tuning). Each evaluation of the hyperparameters involves both training and validation of an NN.

Following Ref. [33], we determine the number of PCs based on the cumulative variance they explain. Specifically, we select the smallest value of n_PCA (or n^i_PCA for local PCA) such that the remaining unexplained variance is < 10^-5. While we do not investigate how emulator performance varies with this threshold in the present work, we note that the optimal choice is likely data-dependent. As such, it is generally advisable to assess emulator accuracy across a range of variance thresholds prior to finalizing the compression scheme.

PCA is applied to the output for each NN prior to training, i.e., to both NN_L and NN_LH in the 2-step architecture (original and modified versions). We implement PCA using the scikit-learn library [47].
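As an illustration of the two compression schemes (a minimal sketch with synthetic arrays, not the T2N-MusE implementation; the shapes and the variance threshold follow the description above), global PCA fits one basis to the full n_z × n_k output, while local PCA fits a separate basis per redshift and concatenates the coefficients that the NNs then learn to predict.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
n_samples, n_z, n_k = 564, 6, 64
# y holds lg P(z, k) for each training cosmology, flattened as in Eq. (1).
y = rng.normal(size=(n_samples, n_z * n_k))

threshold = 1e-5  # keep PCs until the unexplained variance drops below this

def n_components(explained_ratio, threshold):
    """Smallest number of PCs whose cumulative explained variance exceeds 1 - threshold."""
    return int(np.searchsorted(np.cumsum(explained_ratio), 1.0 - threshold) + 1)

# Global PCA: one basis for all redshifts and k modes together.
pca_glob = PCA().fit(y)
n_glob = n_components(pca_glob.explained_variance_ratio_, threshold)
coeffs_glob = PCA(n_components=n_glob).fit(y).transform(y)    # shape (n_samples, n_glob)

# Local (per-z) PCA: an independent basis for each redshift, coefficients concatenated.
coeffs_loc = []
for i in range(n_z):
    block = y[:, i * n_k:(i + 1) * n_k]
    pca_i = PCA().fit(block)
    n_i = n_components(pca_i.explained_variance_ratio_, threshold)
    coeffs_loc.append(PCA(n_components=n_i).fit(block).transform(block))
coeffs_loc = np.concatenate(coeffs_loc, axis=1)               # shape (n_samples, sum_i n_i)
```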
2. Neural network training

Here we describe how we train an NN, assuming the architecture of the NN is given (i.e., the number of layers and the layer widths are pre-defined). Training an NN is essentially a process of minimizing the discrepancy between the predicted output and the true output, which is usually done by minimizing a loss function through iteratively updating the weights (W) and biases (b) of the NN. PyTorch [48] is used to implement the NNs in this work.

Suppose we have a training data set T = {(x^i, y^i) : i = 1, 2, ..., N_train}, where N_train is the number of training samples, x^i is the ith input, and y^i is the corresponding output, and a separate validation set T_val = {(x^i_val, y^i_val) : i = 1, 2, ..., N_val}, where N_val is the number of validation samples. A good model should not only fit the training data well but also generalize well to unseen data. To achieve this, a loss function that can prevent a model from being too complex is needed. In this work, we employ a regularized loss function of the form

L(W, b) = L_train(W, b) + \lambda ||W||_2^2,    (6)

where L_train is the training loss, measuring the distance between the predicted output and the training output data, and the second term is the regularization term, which penalizes large weights to prevent overfitting. The regularization parameter \lambda is a hyperparameter that controls the strength of the regularization. We use the mean squared error (MSE) as the training loss, i.e.,

L_train(W, b) = \frac{1}{N_train} \sum_{i=1}^{N_train} ||f_NN(x^i; W, b) - y^i||_2^2,    (7)

where f_NN(x^i; W, b) is the predicted output of the NN with weights W and biases b at x^i. Note that the loss is computed using the PCA coefficients instead of the raw output itself. The loss function is minimized using the AdamW optimizer [49].⁴ The activation function used in the hidden layers is the SiLU function, which is a special case of the Swish function [51] with β = 1:

SiLU(x) = x · σ(x) = x / (1 + e^{-x}),    (8)

where σ(x) refers to the sigmoid function.

⁴ The AdamW optimizer is a variant of the Adam optimizer [50] that decouples weight decay from the optimization process. However, we use explicit L2 regularization instead of weight decay in this work, so the AdamW optimizer behaves like the Adam optimizer.

We define the validation loss, L_val, in a similar way to the training loss, but replacing the training data with the validation data in Eq. (7).

A dynamically decreasing learning rate (LR) schedule is implemented to stabilize the training process. An initial LR is set and decreased if L + L_val does not improve for a certain number of epochs (patience). The schedule parameters, including the initial LR and patience, can be adjusted for different training runs.

As a function of a large number of variables (the weights and biases), L can be very complex and have many local minima. The optimizer may converge to suboptimal solutions (bad local minima). To mitigate this, we perform multiple training runs with different random seeds for initialization, and the best model with the lowest loss is retained.
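A minimal PyTorch training sketch consistent with Eqs. (6)–(8) follows: MSE on the (PCA-compressed) outputs, an explicit L2 penalty on the weights, the AdamW optimizer, a plateau-based learning-rate schedule driven by L + L_val, and several random seeds with the best run retained. The tensors, widths, and schedule settings are placeholders rather than the values used for the emulators.

```python
import torch
import torch.nn as nn

def l2_penalty(model):
    """Sum of squared weights (biases excluded): the ||W||_2^2 term of Eq. (6)."""
    return sum((p ** 2).sum() for name, p in model.named_parameters() if "weight" in name)

def train(model, x_tr, y_tr, x_val, y_val, lam=1e-7, lr=1e-3, patience=20, epochs=500):
    # Explicit L2 regularization is used instead of weight decay, so AdamW acts like Adam.
    opt = torch.optim.AdamW(model.parameters(), lr=lr, weight_decay=0.0)
    sched = torch.optim.lr_scheduler.ReduceLROnPlateau(opt, factor=0.5, patience=patience)
    mse = nn.MSELoss()
    for _ in range(epochs):
        model.train()
        opt.zero_grad()
        loss = mse(model(x_tr), y_tr) + lam * l2_penalty(model)   # Eq. (6) with Eq. (7)
        loss.backward()
        opt.step()
        model.eval()
        with torch.no_grad():
            val = mse(model(x_val), y_val)
        sched.step(loss.item() + val.item())   # lower the LR when L + L_val stops improving
    return loss.item() + val.item()

# Placeholder data standing in for PCA coefficients of the power spectra.
torch.manual_seed(0)
x_tr, y_tr = torch.rand(500, 10) - 0.5, torch.randn(500, 40)
x_val, y_val = torch.rand(64, 10) - 0.5, torch.randn(64, 40)

# Multiple random initializations; keep the model with the lowest combined loss.
best = None
for seed in range(5):
    torch.manual_seed(seed)
    model = nn.Sequential(nn.Linear(10, 128), nn.SiLU(),
                          nn.Linear(128, 128), nn.SiLU(),
                          nn.Linear(128, 40))
    score = train(model, x_tr, y_tr, x_val, y_val)
    if best is None or score < best[0]:
        best = (score, model)
```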


3. Hyperparameter optimization

Our objective is to identify the optimal set of hyperparameters for the NN that minimizes the combined training and validation losses, thereby balancing underfitting and overfitting. The hyperparameters subject to optimization include the number of hidden layers L, the number of neurons per layer M (assumed uniform across layers), and the regularization parameter λ.

We perform Bayesian optimization implemented with Hyperopt [52] in two stages. In the first stage, we perform a coarse search over a large space of hyperparameters, and in the second stage, we perform a refined search over a smaller region. The initial hyperparameter ranges used in this work⁵ are given in Table II. For the first stage, we use a uniform prior for L and M, and a log-uniform prior for λ. The ranges of the hyperparameters are chosen to be wide enough to cover a large space of hyperparameters. In the second stage, L is fixed to the best value found in the first stage, since a different L would lead to a significantly different NN that is unlikely to result in a better performance.⁶ The prior for M follows U({M_1 − 16 + 2q : q = 0, 1, ..., 16}), where M_1 is the best value found in the first stage. This defines a uniform prior over integers centered at M_1 with a step size of 2. The prior for λ is defined as LU(λ_1/2, 2λ_1), where λ_1 is the best value found in the first stage.

⁵ We chose the prior ranges empirically, picking values so that the optimal hyperparameters were not at the boundaries. The ranges can be data-dependent and may be adjusted for specific problems.

⁶ We empirically confirm that none of the best hyperparameter sets found in the first stage led to a better performance when L was changed in our tests.

TABLE II. Ranges of hyperparameters used in the first stage of the hyperparameter optimization process. The prior for M is uniform over integers from 16 to 512 in steps of 16. U({...}) denotes the discrete uniform distribution, and LU denotes the log-uniform distribution.

  Hyperparameter   Prior
  L                U({1, 2, 3, 4, 5, 6, 7})
  M                U({16, 32, 48, ..., 512})
  λ                LU(10^-9, 5 × 10^-6)

Evaluating a point in the hyperparameter space involves training and validating the NN with the given hyperparameters. Notice that Goku-W does not have a separate validation set of HF data, so we use leave-one-out cross-validation (LOOCV) to evaluate the performance of the emulator. LOOCV is a special case of k-fold cross-validation [53] with k = N_train. In the next section, we detail how a given set of hyperparameters is evaluated with k-fold training and validation, for NN_L and NN_LH, respectively. We have confirmed that the LOOCV performance is consistent with the performance on a separate test set in Appendix A, where we trained an emulator based on the Goku-pre-N simulations [35] and tested it on the available test set.
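The two-stage search could be wired up with Hyperopt roughly as follows. This is a sketch: the dummy objective stands in for the k-fold combined loss of Eq. (9), and the priors follow Table II and the fine-tuning rules described above.

```python
import numpy as np
from hyperopt import fmin, tpe, hp, Trials, space_eval

def combined_loss(params):
    """Stand-in for the k-fold combined loss of Eq. (9); in practice this trains
    and validates an NN with the given hyperparameters."""
    L, M, lam = params["L"], params["M"], params["lam"]
    return (L - 3) ** 2 * 0.01 + (M - 256) ** 2 * 1e-6 + abs(np.log10(lam) + 7) * 0.01

# Stage 1: coarse search over the wide priors of Table II.
space1 = {
    "L": hp.choice("L", list(range(1, 8))),
    "M": hp.choice("M", list(range(16, 513, 16))),
    "lam": hp.loguniform("lam", np.log(1e-9), np.log(5e-6)),
}
best1 = space_eval(space1, fmin(combined_loss, space1, algo=tpe.suggest,
                                max_evals=80, trials=Trials()))

# Stage 2: fine-tuning around the stage-1 optimum; L is kept fixed.
L1, M1, lam1 = best1["L"], best1["M"], best1["lam"]
space2 = {
    "L": L1,
    "M": hp.choice("M2", [M1 - 16 + 2 * q for q in range(17)]),
    "lam": hp.loguniform("lam2", np.log(lam1 / 2), np.log(2 * lam1)),
}
best2 = space_eval(space2, fmin(combined_loss, space2, algo=tpe.suggest,
                                max_evals=40, trials=Trials()))
print(best2)
```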

4. k-fold training and validation

k-fold cross-validation is a technique to estimate the performance of a model by splitting the training data into k subsets (folds). The model is trained on k − 1 folds and validated on the remaining fold. This process is repeated k times, with each fold being used as the validation set once. The final performance is obtained by averaging the performance over all k folds. The quantity we minimize in the hyperparameter optimization process (Sec. II C 3) is the mean of the training and validation losses, i.e., a combined loss as a function of the hyperparameters:

\Phi(L, M, \lambda) = \frac{1}{2k} \sum_{i=1}^{k} [\Phi_{train,i}(L, M, \lambda) + \Phi_{val,i}(L, M, \lambda)],    (9)

where \Phi_{train,i}(L, M, \lambda) = min_{W,b} L_train(L, M, \lambda; W, b) (the minimum training loss from Eq. (7)) and i is the index of the fold. \Phi_{val,i}(L, M, \lambda) is defined in a similar way to \Phi_{train,i}(L, M, \lambda) but with the training loss replaced by the validation loss.

For NN_LH, the data set has n_H samples, and we split the data into k = n_H folds. In each iteration, we use n_H − 1 samples for training and 1 sample for validation. In addition, n^{LH}_seed = 5 random seeds are used to initialize the weights and biases of the NN for each fold training to avoid bad local minima.

Likewise, for NN_L, the data set has n_L samples, and we split the data into k = n_L folds. However, only the n_H HF cosmologies should be tested on for our purpose, i.e., we only need to iterate over the n_H folds that leave the HF cosmologies out in training. We can take advantage of this feature and use a 2-phase training strategy, where a good local minimum is found in the first phase and then used as the common initial model for the second phase of training for each fold. This is much more efficient than regular methods (such as what we do for NN_LH), since there is no need to search for good minima for every fold by trying different random seeds independently. Specifically, in the first phase, the LF cosmology-only data (samples with HF cosmologies are excluded), T^L_{1,train} = {(x^i, y^i) : 1 ≤ i ≤ n_L, x^i ∉ X^H}, are used as the training set, and the LF data with the HF cosmologies, T^L_{1,val} = T^L \ T^L_{1,train}, are used as the validation set. We train the NN with n^L_seed = 15 random seeds for initialization in the first phase. The best model found in the first phase is then used to set the initial state of the NN for the second phase of training. We illustrate the process of the 2-phase training using a simple example in Fig. 3. A bonus of this approach is that the second phase takes only a small number of epochs to converge, and is thus quite efficient in computation.⁷ The second phase is the target k-fold training and validation, i.e., for each fold, the training set is T^{L,j}_{2,train} = {(x^i, y^i) : 1 ≤ i ≤ n_L, i ≠ j} with x^j ∈ X^H, where j is the index of the fold (also the HF cosmology), and the validation set is T^{L,j}_{2,val} = {(x^{(j)}, y^{(j)})} (i.e., the point left out).

⁷ In practice, we also set the initial learning rate in the second phase to be equal to the final learning rate from the first phase to avoid jumping to other local minima.

[FIG. 3 (schematic): Phase 1: searching for a good local minimum (with separate training and validation data); 9 samples, 3 random seeds give Models A, B, and C; the best (Model C) initializes Phase 2: regular k-fold training and validation (with k = 9, k_test = 3), producing the fold models.]

FIG. 3. Illustration of the two-phase k-fold training and cross-validation strategy for NN_L, assuming a total of 9 samples, of which 3 (orange circles) are supposed to be tested against (i.e., the HF cosmologies). In phase 1, the model is trained on the remaining 6 samples (blue circles) using 3 separate runs with different random seeds, and validated on the 3 held-out test samples. In phase 2, we perform regular k-fold training and validation, with the initial model (weights and biases) set to the best model found in phase 1.

For the best-performing set of hyperparameters, we initialize the final model training using the fold model with the median regularized loss. To prevent overfitting during this final training step, we impose a lower bound on the training loss: specifically, the final training loss is not allowed to fall below 80% of the median training loss observed across the folds. This threshold has proven effective in practice, as we have verified that the final model's performance remains consistent with the LOOCV results (see Appendix A). Nevertheless, a more comprehensive evaluation of this thresholding strategy could be pursued in future work.

The 2-phase training strategy ensures that all the fold models fall into the same local minimum, and the validation error should be a better representative of the generalization error for the final model (also in the same local minimum) trained on the full LF data set compared to regular k-fold validation. Note that the HF cosmologies must be excluded from the training set in the first phase, even though that phase only searches for a good local minimum rather than performing the final validation; otherwise, the model used for initialization would have memorized the data we are supposed to test on, and validation in the second phase would be invalid. This is also the reason why we cannot use the 2-phase strategy for NN_LH (no data are available other than the test points) but have to try multiple random seeds for each fold training.
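The 2-phase procedure for NN_L can be sketched as follows (a toy, self-contained illustration with synthetic data and a simplified fit routine, not the production training code): phase 1 searches for a good minimum on the LF-only cosmologies using many seeds, and phase 2 runs the leave-one-out folds over the HF cosmologies warm-started from that common minimum.

```python
import copy
import torch
import torch.nn as nn

def fit(model, x, y, epochs=300, lr=1e-3):
    """Minimal full-batch MSE fit; stands in for the regularized training of Sec. II C 2."""
    opt = torch.optim.AdamW(model.parameters(), lr=lr)
    for _ in range(epochs):
        opt.zero_grad()
        loss = nn.functional.mse_loss(model(x), y)
        loss.backward()
        opt.step()
    return loss.item()

def make_model():
    return nn.Sequential(nn.Linear(10, 64), nn.SiLU(), nn.Linear(64, 40))

# Toy LF data: the first n_H cosmologies are the ones with HF counterparts.
torch.manual_seed(0)
n_L, n_H = 60, 6
x, y = torch.rand(n_L, 10) - 0.5, torch.randn(n_L, 40)
hf_idx = list(range(n_H))
lf_only = [i for i in range(n_L) if i not in hf_idx]

# Phase 1: several seeds on the LF-only cosmologies, validated on the HF cosmologies.
best_state, best_val = None, float("inf")
for seed in range(15):
    torch.manual_seed(seed)
    model = make_model()
    fit(model, x[lf_only], y[lf_only])
    val = nn.functional.mse_loss(model(x[hf_idx]), y[hf_idx]).item()
    if val < best_val:
        best_val, best_state = val, copy.deepcopy(model.state_dict())

# Phase 2: leave-one-out folds over the HF cosmologies, each warm-started from the
# common phase-1 minimum so that all folds stay in the same basin.
fold_errors = []
for j in hf_idx:
    model = make_model()
    model.load_state_dict(best_state)
    train_idx = [i for i in range(n_L) if i != j]
    fit(model, x[train_idx], y[train_idx], epochs=50)   # converges in few epochs
    fold_errors.append(nn.functional.mse_loss(model(x[[j]]), y[[j]]).item())
print(sum(fold_errors) / len(fold_errors))
```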

D. Comparative Study Design

The techniques evaluated in this work are summarized in Table III. To assess the effectiveness of each technique, we design a comparative study with a series of different approaches for emulator construction. These approaches are distinct combinations of the techniques mentioned above. The configurations for each approach are defined in Table IV.

Mid serves as the reference approach: it uses the modified 2-step architecture and separate PCA for each redshift, but does not include hyperparameter fine-tuning or 2-phase training of NN_L. Base is the most basic approach, with the original 2-step architecture, global PCA, and no additional optimization strategies. The most advanced approach, Optimal, incorporates all enhanced techniques. The remaining approaches differ from Mid by altering only one component, allowing us to isolate the contribution of each technique. For example, Arch-0 uses the original 2-step architecture but keeps the other techniques the same as Mid. HO-2 uses 2-stage hyperparameter optimization without changing other components.

By comparing HO-2 and Mid, we will see the effect of hyperparameter fine-tuning. However, this is not a strictly fair comparison, since the two would have significant differences in compute time. To make a fair comparison, we define HO-3, which uses the same 1-stage hyperparameter optimization as Mid but with a larger number of trials, n_trial = 120, ensuring that the total compute time is similar to HO-2. Similarly, while comparing NNL-1 and Mid will show the effect of 2-phase training of NN_L, we also define NNL-0+, which uses the same 1-phase training as Mid (which does not try multiple seeds) but with a larger number of random seeds, n_seed = 3, leading to a compute time similar to NNL-1. Although we do not present a detailed quantitative comparison of compute times across all approaches, we note that training each emulator requires less than 24 hours on a single Grace-Hopper node of the Vista supercomputer.⁸ This cost is negligible relative to the computational expense of running the simulations themselves in the context of simulation-based emulation.

⁸ https://tacc.utexas.edu/systems/vista/

The results of the comparative study are shown in Sec. III, where we compare the performance of the emulators built with different approaches and also discuss the impact of each technique at the level of the component NNs (NN_L and NN_LH).

III. RESULTS

We present the results of the comparative study in this section. The models trained with different approaches are evaluated using LOOCV, with the validation error defined as the relative mean absolute error (rMAE) of the predicted power spectrum compared to the true power spectrum, denoted Φ_rMAE. For clarity, each model is identified by the name of the approach used in its construction (e.g., the model trained with the Base approach is referred to as Base).
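For reference, the rMAE statistic can be computed as in the short sketch below, where P_pred and P_true are placeholder arrays of emulated and simulated power spectra on a common grid of cosmologies, redshifts and k modes.

```python
import numpy as np

def rmae(P_pred, P_true, axis=0):
    """Relative mean absolute error of the predicted power spectrum,
    averaged over cosmologies (axis 0) for each (z, k) bin."""
    return np.mean(np.abs(P_pred - P_true) / P_true, axis=axis)

rng = np.random.default_rng(1)
P_true = rng.uniform(1e2, 1e4, size=(21, 6, 64))            # (cosmologies, z bins, k modes)
P_pred = P_true * (1 + 0.01 * rng.normal(size=P_true.shape))
err = rmae(P_pred, P_true)          # shape (6, 64): mean error per z and k
print(err.mean(), err.max())        # overall mean and a worst-case summary
```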

TABLE III. Techniques considered in this work. The numbers 0, 1, and 0+ (if applicable) refer to the choices of the strategies, e.g., choice 0 represents the original 2-step model for the MF NN architecture. n_trial is the number of trials in the coarse search stage of the hyperparameter optimization process, and n^tune_trial is the number of trials in the fine-tuning stage. "1-stage" means no fine-tuning. For the training of NN_L, "1-phase" refers to regular k-fold training and validation (n_seed = 1 by default).

  Choice   MF NN architecture   PCA              Hyperparameter optimization                        Training of NN_L
  0        Original 2-step      Global (all-z)   1-stage with n_trial = 80                          1-phase
  1        Modified 2-step      Local (per-z)    2-stage with n_trial = 80 and n^tune_trial = 40    2-phase
  0+                                                                                                1-phase with n_seed = 3

TABLE IV. Approaches tested in this work. MFA, PCA, HO, and NN_L are short for the column names in Table III. The numbers 0, 1 and 0+ refer to the techniques described in Table III for each column.

  Approach   MFA   PCA   HO   NN_L
  Base       0     0     0    0
  Arch-0     0     1     0    0
  PCA-0      1     0     0    0
  Mid        1     1     0    0
  NNL-1      1     1     0    1
  NNL-0+     1     1     0    0+
  Optimal    1     1     1    1

[FIG. 4 (plots): Φ_rMAE (%) versus k (h Mpc^-1) for panels Base (1.73%), Mid (1.03%), and Optimal (0.62%), with curves for z = 0, 0.2, 0.5, 1, 2 and 3 and a shaded 1% band.]

FIG. 4. LOO errors of the emulators built with approaches Base, Mid, and Optimal. Redshifts are color coded. The solid lines are the error averaged over cosmologies, and the corresponding shaded regions indicate the range of individual cosmologies. The gray-shaded area marks the region where the error is less than 1%. Each model is titled with the name of the approach and its overall validation error.

[FIG. 5 (summary plot): overall mean and worst-case Φ_rMAE (%) for Base, Arch-0, PCA-0, Mid, NNL-0+, NNL-1, and Optimal.]

FIG. 5. Summary comparison of the LOO errors of the emulators built with different approaches. Black crosses indicate the mean validation error of each approach, while gray crosses show the worst-case errors (the maximal error over all test points).

Figure 4 shows the validation errors as functions of k and z for Base, Mid, and Optimal. We found that even the basic model, Base, achieves a validation error significantly lower than GokuEmu's 3% error (see Fig. 13 of Ref. [35]). This suggests that NNs may be better suited than GPs for emulation tasks involving large training sets and high-dimensional parameter spaces. The improvement may be accounted for by more efficient training that allows more intensive hyperparameter optimization, though PCA could also have contributed to the performance improvement. Compared to Base, Mid achieves a significant improvement in accuracy, with an overall validation error of 1.03% (compared to 1.73% for Base), attributed to the modified 2-step architecture and the local PCA strategy.⁹ The improvement is observed across all redshifts and wavenumbers, though the worst-case error is still much higher than the average. The validation error of Optimal is less than 1% for all redshifts and almost all wavenumbers, with an overall mean of 0.62%, which is a further improvement over Mid resulting from the changes in hyperparameter optimization and training of NN_L. Not only is the overall validation error reduced, but the worst-case error is also considerably lower than that of Mid (a reduction by a factor of 5 will be seen in Fig. 5).

⁹ Similar compute times were used to train these two models.

A summary comparison of the LOO errors of the emulators built with different approaches is shown in Fig. 5, with both the overall mean error and the worst-case error shown. The error of Mid is lower than that of Arch-0 and PCA-0, indicating that both the modified 2-step architecture and the local PCA strategy are effective in improving the performance of the emulator, with the modified 2-step architecture providing a larger improvement than the local PCA data compression strategy. While the aforementioned techniques improve the overall mean error, the worst-case error remains high. A substantial reduction in the worst-case error is seen when implementing the 2-phase training strategy, NNL-1, which allows a large number of local minima to be explored efficiently. NNL-0+ also shows an improvement over Mid, but the improvement is not as significant as that of NNL-1, despite consuming a similar amount of compute time (essentially because NNL-0+ tried fewer random seeds than NNL-1). Optimal achieves slightly lower errors than NNL-1, attributed to hyperparameter fine-tuning.¹⁰

¹⁰ We have checked that simply increasing the number of trials (i.e., n_trial > 80) did not improve the performance, suggesting that it is the hyperparameter fine-tuning which is responsible for the (small) improvement in performance.

3.0 3.0
Arch-0 PCA-0
2.5 (1.03%) 2.5 (1.15%)
Mid Mid
2.0 (0.33%) 2.0 (0.97%)

Φ LrMAE (%)
rMAE (%)

z = 0.0 z = 0.0
1.5 1.5 z = 1.0
z = 1.0
Φ LH

1.0 z = 3.0 1.0 z = 3.0

0.5 0.5

0.0 0.0
1 10 1 100 101
10 100 101 3.0
−1
k (h Mpc ) k (h Mpc −1 ) PCA-0
2.5 (0.38%)
Mid
FIG. 6. Comparison of the LF-to-HF correction NNs of the 2.0 (0.33%)

rMAE (%)
original (Arch-0) and the modified (Mid) 2-step architectures.
Blue lines are LOO errors of Arch-0’s N NLH , while orange 1.5

Φ LH
lines are Mid’s. The solid, dashed, and dotted lines correspond
1.0
to z = 0, 1, and 3, respectively. The overall mean errors
averaged over 6 redshifts are shown in the legends. 0.5
0.0
10 1 100 101
each technique in more detail, by comparing the perfor-
−1
mance of the component NNs (N NL and N NLH ) trained k (h Mpc )
with different approaches. Figures 6, 7 and 8 show the
rMAE of the component NNs, defined as the LOO error FIG. 7. Comparison of the two data compression strategies:
of the predicted power spectrum compared to the true global PCA (PCA-0, in blue) and separate PCA for each red-
power spectrum. This ensures that N NL and N NLH are shift (Mid, in orange). The top and bottom panels show the
evaluated separately and independently. Specifically, in LOO errors for N NL and N NLH , respectively. The solid,
Figure 6 and the lower panel of Figure 7, the component dashed, and dotted lines correspond to z = 0, 1, and 3, re-
shown is N NLH . Thus the input is the test cosmology spectively. The overall mean errors averaged over 6 redshifts
and the true LF power spectrum, instead of the LF power are shown in the legends.
spectrum predicted by N NL . In the upper panel of Fig-
ure 7 and Figure 8, the component shown is N NL . In this
case, both the predicted power spectrum and the true
power spectrum that it is tested against are LF power the information of the data is not fully exploited. The
spectra. improvement is likely due to the aforementioned signifi-
cantly reduced complexity of the NN (Sec. II B) relative
to the original architecture [44].
A. Architecture: 2-Step vs. Modified 2-Step

Figure 6 compares the LF-to-HF correction NNs of the


original and modified 2-step architectures. The valida-
tion error of Mid is shown to be significantly lower than B. Data Compression: Global vs. Local (PCA)
that of Arch-0 across redshifts and scales, with the aver-
age error reduced by a factor of ∼ 3. In addition, Mid’s er-
ror decreases with increasing redshift (especially at small From Fig. 7, we observe that the local PCA strategy
scales), which is consistent with our expectation that it (Mid) outperforms the global PCA strategy (PCA-0) for
is easier to learn the LF-to-HF correction at higher red- both N NL and N NLH , which is likely because the global
shifts, where the spectrum is more linear and less affected PCA is not as flexible as the local PCA in capturing
by nonlinear effects. In contrast, Arch-0 has a moder- redshift-dependent features of the spectrum. In partic-
ately larger error at z = 3 than at z = 0 and 1. This sug- ular, the improvement is more pronounced at z = 0 in
gests that the original architecture struggles to learn the both NNs, where the spectrum is more nonlinear.
correlation between the LF and HF power spectra and
We also note that ΦLrMAE is larger than ΦLH
rMAE in both
cases, indicating the uncertainty of the interpolation of
the LF power spectrum in the parameter space domi-
is the hyperparameter fine-tuning which is responsible for the nates the overall error of the emulator, consistent with
(small) improvement in performance. the findings of Ref. [35].
10

2.5 In this work, we build emulators based on the non-


Mid linear matter power spectra from the Goku simulations
2.0 (0.97%) suite [35] using different combinations of the various tech-
NNL-1 niques for a comparative study. The results show that
(0.55%)
Φ LrMAE (%)

1.5 all the techniques we proposed are effective in improving


NNL-0+
(0.74%) the performance of the emulator, although the effect of
1.0 z = 0.0 hyperparameter fine-tuning is modest. The novel 2-step
z = 1.0 MF architecture reduces the complexity of the LF-to-HF
0.5 z = 3.0 correction NN, decreasing the error by a factor of ∼ 3.
The per-z PCA strategy allows NNs to learn the redshift-
0.0 dependent features of the statistics of interest more ac-
10 1 100 101 curately, with accuracy improved by more than 10% in
k (h Mpc −1 ) both the LF NN and the LF-to-HF correction NN com-
pared to the global PCA strategy. The 2-stage hyperpa-
rameter optimization strategy moderately improves the
FIG. 8. Comparison of the training strategies for N NL . Mid
(blue) and NNL-0+ (green) use regular training, while NNL-1 performance of the emulator by fine-tuning the hyperpa-
(orange) uses the 2-phase training strategy. NNL-0+ tried more rameters in a smaller space after a coarse search. The
random seeds than Mid to match the compute time of NNL-1. 2-phase training strategy for the LF NN efficiently finds
Redshifts z = 0, 1, and 3 are coded with solid, dashed, and a common local minimum for k-fold (or LOO) training
dotted lines, respectively. The overall LOO errors averaged and validation and substantially improves the worst-case
over 6 redshifts are shown in the legends. error.
T2N-MusE realizes highly efficient training of NNs on
large data sets with high-dimensional parameter spaces
C. Training of the LF NN: Regular vs. 2-Phase that traditional GP-based methods struggle with. This
demonstrates the effectiveness of T2N-MuSE not only as a
Fig. 8 compares the LF NNs trained with different strategies. Compared to Mid, NNL-1 reduces the overall error significantly, from 0.97% to 0.55%, improving performance across all redshifts and scales. When we simply increased the number of random seeds over Mid (NNL-0+), the worst-case error was about midway between Mid and NNL-1, despite a similar compute time. Regular training with more random seeds, e.g., 15 distinct seeds, might allow a performance similar to NNL-1, but it would take much longer to train the NN, and the final model initialized by one of the fold models might not generalize as well as the model trained with the 2-phase strategy: the fold models might have fallen into different local minima, and the chosen model is not guaranteed to be the one with the best generalization performance.
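The sketch below illustrates one reading of this 2-phase schedule in PyTorch: phase 1 trains a single network on the full LF set to locate a shared basin, and phase 2 fine-tunes per-fold copies (for k-fold or LOO validation) and the final model from those shared weights. The architecture, epoch counts, and fold assignment are placeholders, not the settings used for NNL-1.

    import copy
    import torch
    from torch import nn

    def make_mlp(n_in, n_out, width=64):
        # Small MLP stand-in; the real width and depth are tuned hyperparameters.
        return nn.Sequential(nn.Linear(n_in, width), nn.SiLU(),
                             nn.Linear(width, width), nn.SiLU(),
                             nn.Linear(width, n_out))

    def train(model, x, y, epochs, lr=1e-3):
        opt = torch.optim.AdamW(model.parameters(), lr=lr)
        for _ in range(epochs):
            opt.zero_grad()
            loss = nn.functional.mse_loss(model(x), y)
            loss.backward()
            opt.step()
        return model

    # Toy LF training data: 10 cosmological parameters -> 10 PCA coefficients.
    x, y = torch.randn(297, 10), torch.randn(297, 10)

    # Phase 1: one model trained on the full LF set provides a common starting point.
    base = train(make_mlp(10, 10), x, y, epochs=200)

    # Phase 2: fine-tune per-fold copies from the shared phase-1 weights, so the
    # validation models and the final model sit in the same local minimum.
    k = 5
    fold = torch.arange(297) % k
    fold_losses = []
    for i in range(k):
        m = train(copy.deepcopy(base), x[fold != i], y[fold != i], epochs=50)
        with torch.no_grad():
            fold_losses.append(
                nn.functional.mse_loss(m(x[fold == i]), y[fold == i]).item())

    final_model = train(copy.deepcopy(base), x, y, epochs=50)

Compared with restarting every fold from a fresh random seed (as in Mid and NNL-0+), most of the compute is spent once in phase 1, and the fold models remain comparable to the final model, which is consistent with the improved worst-case error reported above.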

IV. CONCLUSION

We have developed T2N-MusE, a multifidelity neural network framework for cosmological emulation, which is capable of building highly optimized regression models to predict summary statistics. This framework is characterized by a novel 2-step architecture, per-z PCA for data compression, 2-stage hyperparameter optimization, and a 2-phase training strategy for the low-fidelity regression model. This NN approach improves on our earlier GP approach by a factor of more than 5 on the same data.¹¹

¹¹ The training of GokuEmu used both the L1 and L2 nodes, while we only use L2 in this work. So a factor of 5 is a very conservative estimate.

In this work, we build emulators based on the nonlinear matter power spectra from the Goku simulation suite [35] using different combinations of the various techniques for a comparative study. The results show that all the techniques we proposed are effective in improving the performance of the emulator, although the effect of hyperparameter fine-tuning is modest. The novel 2-step MF architecture reduces the complexity of the LF-to-HF correction NN, decreasing the error by a factor of ∼3. The per-z PCA strategy allows NNs to learn the redshift-dependent features of the statistics of interest more accurately, with accuracy improved by more than 10% in both the LF NN and the LF-to-HF correction NN compared to the global PCA strategy. The 2-stage hyperparameter optimization strategy moderately improves the performance of the emulator by fine-tuning the hyperparameters in a smaller space after a coarse search. The 2-phase training strategy for the LF NN efficiently finds a common local minimum for k-fold (or LOO) training and validation and substantially improves the worst-case error.
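As a rough illustration of how such a 2-step composition can be wired together, the PyTorch sketch below predicts the HF spectrum as an LF prediction plus a low-complexity correction. Whether NN_LH conditions on the cosmological parameters, on the LF prediction, or on both, and how the correction is parameterized, are assumptions of this sketch rather than the exact T2N-MusE design.

    import torch
    from torch import nn

    class TwoStepEmulator(nn.Module):
        # Step 1 (NN_L): cosmology -> log P_LF(k), trained on the many LF runs.
        # Step 2 (NN_LH): cosmology -> log[P_HF(k) / P_LF(k)], a smoother and
        # lower-complexity correction trained on the few LF/HF pairs.
        def __init__(self, n_params=10, n_k=128, width=64):
            super().__init__()
            self.nn_l = nn.Sequential(nn.Linear(n_params, width), nn.SiLU(),
                                      nn.Linear(width, n_k))
            self.nn_lh = nn.Sequential(nn.Linear(n_params, width), nn.SiLU(),
                                       nn.Linear(width, n_k))

        def forward(self, theta):
            log_p_lf = self.nn_l(theta)
            log_ratio = self.nn_lh(theta)
            return log_p_lf + log_ratio  # predicted log P_HF(k)

    emu = TwoStepEmulator()
    print(emu(torch.randn(4, 10)).shape)  # torch.Size([4, 128])

Because the LF-to-HF correction is much smoother than the spectrum itself, the second network can be kept small, which is one way to read the reduction in complexity and error attributed to the 2-step architecture above.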
T2N-MusE realizes highly efficient training of NNs on large data sets with high-dimensional parameter spaces that traditional GP-based methods struggle with. This demonstrates the effectiveness of T2N-MusE not only as a high-accuracy optimization scheme in its own right, but also as a general tool for upgrading existing emulators to higher performance or expanding their parameter space, all at significantly reduced computational costs. We have rebuilt a production emulator for the matter power spectrum with T2N-MusE based on Goku, named GokuNEmu. GokuNEmu is the highest-performing emulator in existence in terms of error, dimensionality, parameter coverage, and inference speed, and is presented in Ref. [54]. We will also apply this framework to build emulators for other summary statistics, such as the Lyman-α forest flux power spectrum [55], in future work. The code of T2N-MusE is publicly available at https://github.com/astro-YYH/T2N-MusE for the community to use and extend.

ACKNOWLEDGMENTS

YY and SB acknowledge funding from NASA ATP 80NSSC22K1897. MFH is supported by the Leinweber Foundation and DOE grant DE-SC0019193. Computing resources were provided by Frontera LRAC AST21005. The authors acknowledge the Frontera and Vista computing projects at the Texas Advanced Computing Center (TACC, http://www.tacc.utexas.edu) for providing HPC and storage resources that have contributed to the research results reported within this paper. Frontera and Vista are made possible by National Science Foundation award OAC-1818253.
[1] DESI Collaboration, A. Aghamousa, and J. Aguilar et al., The DESI Experiment Part I: Science, Targeting, and Survey Design, arXiv e-prints, arXiv:1611.00036 (2016), arXiv:1611.00036 [astro-ph.IM].
[2] P. A. Abell et al., LSST Science Book, Version 2.0 (arXiv, 2009), arXiv:0912.0201 [astro-ph.IM].
[3] R. Laureijs et al., Euclid Definition Study Report, arXiv e-prints, arXiv:1110.3193 (2011), arXiv:1110.3193 [astro-ph.CO].
[4] R. Akeson et al., The Wide Field Infrared Survey Telescope: 100 Hubbles for the 2020s, arXiv e-prints, arXiv:1902.05569 (2019), arXiv:1902.05569 [astro-ph.IM].
[5] Y. Gong, X. Liu, Y. Cao, X. Chen, Z. Fan, R. Li, X.-D. Li, Z. Li, X. Zhang, and H. Zhan, Cosmology from the Chinese Space Station Optical Survey (CSS-OS), Astrophys. J. 883, 203 (2019), arXiv:1901.04634 [astro-ph.CO].
[6] M. Takada, R. S. Ellis, M. Chiba, J. E. Greene, H. Aihara, N. Arimoto, K. Bundy, J. Cohen, O. Doré, G. Graves, J. E. Gunn, T. Heckman, C. M. Hirata, P. Ho, J.-P. Kneib, O. Le Fèvre, L. Lin, S. More, H. Murayama, T. Nagao, M. Ouchi, M. Seiffert, J. D. Silverman, L. Sodré, D. N. Spergel, M. A. Strauss, H. Sugai, Y. Suto, H. Takami, and R. Wyse, Extragalactic science, cosmology, and Galactic archaeology with the Subaru Prime Focus Spectrograph, Publications of the Astronomical Society of Japan 66, R1 (2014), arXiv:1206.0737 [astro-ph.CO].
[7] T. Auld, M. Bridges, M. P. Hobson, and S. F. Gull, Fast cosmological parameter estimation using neural networks, MNRAS 376, L11 (2007), arXiv:astro-ph/0608174 [astro-ph].
[8] T. Auld, M. Bridges, and M. P. Hobson, COSMONET: fast cosmological parameter estimation in non-flat models using neural networks, MNRAS 387, 1575 (2008), arXiv:astro-ph/0703445 [astro-ph].
[9] G. Aricò, R. E. Angulo, and M. Zennaro, Accelerating Large-Scale-Structure data analyses by emulating Boltzmann solvers and Lagrangian Perturbation Theory, arXiv e-prints, arXiv:2104.14568 (2021), arXiv:2104.14568 [astro-ph.CO].
[10] A. Spurio Mancini, D. Piras, J. Alsing, B. Joachimi, and M. P. Hobson, COSMOPOWER: emulating cosmological power spectra for accelerated Bayesian inference from next-generation surveys, MNRAS 511, 1771 (2022), arXiv:2106.03846 [astro-ph.CO].
[11] A. Nygaard, E. B. Holm, S. Hannestad, and T. Tram, CONNECT: a neural network based framework for emulating cosmological observables and cosmological parameter inference, Journal of Cosmology and Astroparticle Physics 2023, 025 (2023), arXiv:2205.15726 [astro-ph.IM].
[12] S. Günther, J. Lesgourgues, G. Samaras, N. Schöneberg, F. Stadtmann, C. Fidler, and J. Torrado, CosmicNet II: emulating extended cosmologies with efficient and accurate neural networks, Journal of Cosmology and Astroparticle Physics 2022, 035 (2022), arXiv:2207.05707 [astro-ph.CO].
[13] M. Bonici, F. Bianchini, and J. Ruiz-Zapatero, Capse.jl: efficient and auto-differentiable CMB power spectra emulation, The Open Journal of Astrophysics 7, 10 (2024), arXiv:2307.14339 [astro-ph.CO].
[14] M. Bonici, G. D'Amico, J. Bel, and C. Carbone, Effort: a fast and differentiable emulator for the Effective Field Theory of the Large Scale Structure of the Universe, arXiv e-prints, arXiv:2501.04639 (2025), arXiv:2501.04639 [astro-ph.CO].
[15] K. Heitmann, D. Higdon, M. White, S. Habib, B. J. Williams, E. Lawrence, and C. Wagner, The Coyote Universe. II. Cosmological models and precision emulation of the nonlinear matter power spectrum, The Astrophysical Journal 705, 156 (2009).
[16] K. Heitmann, M. White, C. Wagner, S. Habib, and D. Higdon, The Coyote Universe. I. Precision determination of the nonlinear matter power spectrum, The Astrophysical Journal 715, 104 (2010).
[17] K. Heitmann, E. Lawrence, J. Kwan, S. Habib, and D. Higdon, The Coyote Universe extended: precision emulation of the matter power spectrum, The Astrophysical Journal 780, 111 (2013).
[18] J. DeRose, R. H. Wechsler, J. L. Tinker, M. R. Becker, Y.-Y. Mao, T. McClintock, S. McLaughlin, E. Rozo, and Z. Zhai, The Aemulus Project. I. Numerical Simulations for Precision Cosmology, The Astrophysical Journal 875, 69 (2019).
[19] T. McClintock, E. Rozo, M. R. Becker, J. DeRose, Y.-Y. Mao, S. McLaughlin, J. L. Tinker, R. H. Wechsler, and Z. Zhai, The Aemulus Project. II. Emulating the Halo Mass Function, The Astrophysical Journal 872, 53 (2019).
[20] Z. Zhai, J. L. Tinker, M. R. Becker, J. DeRose, Y.-Y. Mao, T. McClintock, S. McLaughlin, E. Rozo, and R. H. Wechsler, The Aemulus Project. III. Emulation of the Galaxy Correlation Function, The Astrophysical Journal 874, 95 (2019).
[21] R. E. Smith and R. E. Angulo, Precision modelling of the matter power spectrum in a Planck-like Universe, Monthly Notices of the Royal Astronomical Society 486, 1448 (2019).
[22] T. Nishimichi, M. Takada, R. Takahashi, K. Osato, M. Shirasaki, T. Oogi, H. Miyatake, M. Oguri, R. Murata, Y. Kobayashi, and N. Yoshida, Dark Quest. I. Fast and Accurate Emulation of Halo Clustering Statistics and Its Application to Galaxy Clustering, The Astrophysical Journal 884, 29 (2019).
[23] D. Valcin, F. Villaescusa-Navarro, L. Verde, and A. Raccanelli, BE-HaPPY: bias emulator for halo power spectrum including massive neutrinos, Journal of Cosmology and Astroparticle Physics 2019 (12), 057.
[24] G. Aricò, R. E. Angulo, S. Contreras, L. Ondaro-Mallea, M. Pellejero-Ibañez, and M. Zennaro, The BACCO simulation project: a baryonification emulator with neural networks, MNRAS 506, 4070 (2021), arXiv:2011.15018 [astro-ph.CO].
[25] F. Villaescusa-Navarro, C. Hahn, E. Massara, A. Banerjee, A. M. Delgado, D. K. Ramanah, T. Charnock, E. Giusarma, Y. Li, E. Allys, A. Brochard, C. Uhlemann, C.-T. Chiang, S. He, A. Pisani, A. Obuljen, Y. Feng, E. Castorina, G. Contardo, C. D. Kreisch, A. Nicola, J. Alsing, R. Scoccimarro, L. Verde, M. Viel, S. Ho, S. Mallat, B. Wandelt, and D. N. Spergel, The Quijote Simulations, The Astrophysical Journal Supplement Series 250, 2 (2020), arXiv:1909.05273 [astro-ph.CO].
[26] K. Heitmann, D. Bingham, E. Lawrence, S. Bergner, S. Habib, D. Higdon, A. Pope, R. Biswas, H. Finkel, N. Frontiere, and S. Bhattacharya, The Mira–Titan Universe: precision predictions for dark energy surveys, The Astrophysical Journal 820, 108 (2016).
[27] E. Lawrence, K. Heitmann, J. Kwan, A. Upadhye, D. Bingham, S. Habib, D. Higdon, A. Pope, H. Finkel, and N. Frontiere, The Mira-Titan Universe. II. Matter Power Spectrum Emulation, The Astrophysical Journal 847, 50 (2017).
[28] S. Bocquet, K. Heitmann, S. Habib, E. Lawrence, T. Uram, N. Frontiere, A. Pope, and H. Finkel, The Mira-Titan Universe. III. Emulation of the Halo Mass Function, Astrophys. J. 901, 5 (2020), arXiv:2003.12116 [astro-ph.CO].
[29] K. R. Moran, K. Heitmann, E. Lawrence, S. Habib, D. Bingham, A. Upadhye, J. Kwan, D. Higdon, and R. Payne, The Mira-Titan Universe - IV. High-precision power spectrum emulation, MNRAS 520, 3443 (2023), arXiv:2207.12345 [astro-ph.CO].
[30] J. Kwan, S. Saito, A. Leauthaud, K. Heitmann, S. Habib, N. Frontiere, H. Guo, S. Huang, A. Pope, and S. Rodriguéz-Torres, Galaxy Clustering in the Mira-Titan Universe. I. Emulators for the Redshift Space Galaxy Correlation Function and Galaxy-Galaxy Lensing, Astrophys. J. 952, 80 (2023), arXiv:2302.12379 [astro-ph.CO].
[31] I. Sáez-Casares, Y. Rasera, T. R. G. Richardson, and P. S. Corasaniti, The e-MANTIS emulator: Fast and accurate predictions of the halo mass function in f(R)CDM and wCDM cosmologies, Astronomy & Astrophysics 691, A323 (2024), arXiv:2410.05226 [astro-ph.CO].
[32] Euclid Collaboration, M. Knabenhans, and J. Stadel et al., Euclid preparation: II. The EuclidEmulator – a tool to compute the cosmology dependence of the nonlinear matter power spectrum, Monthly Notices of the Royal Astronomical Society 484, 5509 (2019), https://academic.oup.com/mnras/article-pdf/484/4/5509/27790453/stz197.pdf.
[33] Euclid Collaboration, M. Knabenhans, and J. Stadel et al., Euclid preparation: IX. EuclidEmulator2 – power spectrum emulation with massive neutrinos and self-consistent dark energy perturbations, Monthly Notices of the Royal Astronomical Society 505, 2840 (2021).
[34] Z. Chen, Y. Yu, J. Han, and Y. P. Jing, CSST Cosmological Emulator I: Matter Power Spectrum Emulation with one percent accuracy, arXiv e-prints, arXiv:2502.11160 (2025), arXiv:2502.11160 [astro-ph.CO].
[35] Y. Yang, S. Bird, and M.-F. Ho, Ten-parameter simulation suite for cosmological emulation beyond ΛCDM, Phys. Rev. D 111, 083529 (2025), arXiv:2501.06296 [astro-ph.CO].
[36] M.-F. Ho, S. Bird, and C. R. Shelton, Multifidelity emulation for the matter power spectrum using Gaussian processes, MNRAS 509, 2551 (2022), arXiv:2105.01081 [astro-ph.CO].
[37] M.-F. Ho, S. Bird, M. A. Fernandez, and C. R. Shelton, MF-Box: multifidelity and multiscale emulation for the matter power spectrum, MNRAS 526, 2903 (2023), arXiv:2306.03144 [astro-ph.CO].
[38] C. E. Rasmussen and C. K. I. Williams, Gaussian Processes for Machine Learning (MIT Press, 2006).
[39] R. Garnett, Bayesian Optimization (Cambridge University Press, 2023).
[40] L. Cabayol-Garcia, J. Chaves-Montero, A. Font-Ribera, and C. Pedersen, A neural network emulator for the Lyman-α forest 1D flux power spectrum, MNRAS 525, 3499 (2023), arXiv:2305.19064 [astro-ph.CO].
[41] K. Diao and Y. Mao, Multi-fidelity emulator for large-scale 21 cm lightcone images: a few-shot transfer learning approach with generative adversarial network, arXiv e-prints, arXiv:2502.04246 (2025), arXiv:2502.04246 [astro-ph.IM].
[42] F. Zhang, Y. Luo, B. Li, R. Cao, W. Peng, J. Meyers, and P. R. Shapiro, SageNet: Fast Neural Network Emulation of the Stiff-amplified Gravitational Waves from Inflation, arXiv e-prints, arXiv:2504.04054 (2025), arXiv:2504.04054 [astro-ph.CO].
[43] J. Hestness, S. Narang, N. Ardalani, G. Diamos, H. Jun, H. Kianinejad, M. M. A. Patwary, Y. Yang, and Y. Zhou, Deep Learning Scaling is Predictable, Empirically, arXiv e-prints, arXiv:1712.00409 (2017), arXiv:1712.00409 [cs.LG].
[44] M. Guo, A. Manzoni, M. Amendt, P. Conti, and J. S. Hesthaven, Multi-fidelity regression using artificial neural networks: Efficient approximation of parameter-dependent output quantities, Computer Methods in Applied Mechanics and Engineering 389, 114378 (2022), arXiv:2102.13403 [math.NA].
[45] Y. Feng, S. Bird, L. Anderson, A. Font-Ribera, and C. Pedersen, MP-Gadget/MP-Gadget: A tag for getting a DOI (2018).
[46] P. Z. G. Qian, Sliced Latin Hypercube Designs, Journal of the American Statistical Association 107, 393 (2012), https://doi.org/10.1080/01621459.2011.644132.
[47] F. Pedregosa, G. Varoquaux, A. Gramfort, V. Michel, B. Thirion, O. Grisel, M. Blondel, P. Prettenhofer, R. Weiss, V. Dubourg, J. Vanderplas, A. Passos, D. Cournapeau, M. Brucher, M. Perrot, and E. Duchesnay, Scikit-learn: Machine learning in Python, Journal of Machine Learning Research 12, 2825 (2011).
[48] A. Paszke, S. Gross, F. Massa, A. Lerer, J. Bradbury, G. Chanan, T. Killeen, Z. Lin, N. Gimelshein, L. Antiga, A. Desmaison, A. Köpf, E. Yang, Z. DeVito, M. Raison, A. Tejani, S. Chilamkurthy, B. Steiner, L. Fang, J. Bai, and S. Chintala, PyTorch: An Imperative Style, High-Performance Deep Learning Library, arXiv e-prints, arXiv:1912.01703 (2019), arXiv:1912.01703 [cs.LG].
[49] I. Loshchilov and F. Hutter, Decoupled Weight Decay Regularization, arXiv e-prints, arXiv:1711.05101 (2017), arXiv:1711.05101 [cs.LG].
[50] D. P. Kingma and J. Ba, Adam: A Method for Stochastic Optimization, arXiv e-prints, arXiv:1412.6980 (2014), arXiv:1412.6980 [cs.LG].
[51] P. Ramachandran, B. Zoph, and Q. V. Le, Searching for Activation Functions, arXiv e-prints, arXiv:1710.05941 (2017), arXiv:1710.05941 [cs.NE].
[52] J. Bergstra, D. Yamins, and D. Cox, Making a science of model search: Hyperparameter optimization in hundreds of dimensions for vision architectures, in Proceedings of the 30th International Conference on Machine Learning, Proceedings of Machine Learning Research, Vol. 28, edited by S. Dasgupta and D. McAllester (PMLR, Atlanta, Georgia, USA, 2013) pp. 115–123.
[53] R. Kohavi, A study of cross-validation and Bootstrap for accuracy estimation and model selection, in Proceedings of the International Joint Conference on Artificial Intelligence (IJCAI) (Morgan Kaufmann, 1995) pp. 1137–1143.
[54] Y. Yang, S. Bird, M.-F. Ho, and M. Qezlou, Ten-dimensional neural network emulator for the nonlinear matter power spectrum, arXiv e-prints (2025), submitted concurrently to arXiv.
[55] S. Bird, M. Fernandez, M.-F. Ho, M. Qezlou, R. Monadi, Y. Ni, N. Chen, R. Croft, and T. Di Matteo, PRIYA: a new suite of Lyman-α forest simulations for cosmology, Journal of Cosmology and Astroparticle Physics 2023, 037 (2023), arXiv:2306.05471 [astro-ph.CO].

Appendix A: LOOCV vs. Separate Test Set

We train an emulator based on the preliminary simulation set, Goku-pre-N, using the Optimal approach and test it on the available test set. Goku-pre-N contains 297 pairs of LF simulations and 27 HF simulations in the training set and 12 HF simulations in the test set. Following the main text, we do not use L1 simulations in this study. The HF simulations evolve 300³ particles in a box of size 100 Mpc/h. For more details about the Goku-pre-N simulations, see Ref. [35].

The LOO error and test error of the emulator are shown in the top and bottom panels of Fig. 9, respectively. They are consistent with each other, with the test error being slightly lower than the LOO error. This indicates that the LOO cross-validation is a good representative of the generalization error of the final emulator trained on the full training set.
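Schematically, the two error estimates compared in this appendix can be computed as follows; Ridge regression is only a stand-in for the full T2N-MusE pipeline, and the error definition is our reading of Φ_rMAE as the mean absolute fractional deviation of the predicted power spectrum.

    import numpy as np
    from sklearn.linear_model import Ridge  # placeholder for the full emulator
    from sklearn.model_selection import LeaveOneOut

    rng = np.random.default_rng(1)
    theta = rng.uniform(size=(27, 10))      # HF training cosmologies (toy values)
    pk = np.exp(rng.normal(size=(27, 64)))  # P(k) on a fixed k grid (toy values)

    loo_err = []
    for tr, te in LeaveOneOut().split(theta):
        model = Ridge().fit(theta[tr], np.log(pk[tr]))
        pred = np.exp(model.predict(theta[te]))
        loo_err.append(np.mean(np.abs(pred / pk[te] - 1.0)))
    print(f"LOO estimate of Phi_rMAE: {100 * np.mean(loo_err):.2f}%")

    # The test error replaces the held-out split with the 12 separate test
    # cosmologies and a single model trained on all 27 training cosmologies.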
[Figure 9: Φ_rMAE (%) versus k (h Mpc^-1) in two panels, LOOCV (2.24%) on top and Test (2.05%) on the bottom, with curves for z = 0.0, 0.2, 0.5, 1.0, 2.0, and 3.0.]

FIG. 9. Comparison of LOO error (top) and test error (bottom) for the emulator trained on the Goku-pre-N simulations. The solid lines are the mean errors, and the shaded regions indicate the range of individual cosmologies. The redshifts are color coded, and the overall mean errors are shown in the titles of the panels.