The Large Scale Structure
of the Universe:
from simulations to observations
by Santiago Javier Ávila Pérez
PhD Thesis in Theoretical Physics
May 6, 2016
supervised by
Alexander Knebe &
Juan García-Bellido Capdevila
Facultad de Ciencias
Departamento de Física Teórica
&
Instituto de Fı́sica Teórica (UAM-CSIC)
To my family,
who fought to see me get here.
The Large Scale Structure of the Universe: Simulating the Observations
Authorship

[3] nIFTy Cosmology: galaxy/halo mock catalogue comparison project on clustering statistics
Chuang C.-H., Zhao C., Prada F., Munari E., Avila S., Izard A., Kitaura F.-S., Monaco P., Murray S., Knebe A., Scoccola C.G., Yepes G., García-Bellido J., Marín F., Müller V., et al.
2015, MNRAS, 452, 686-700
Contents

Authorship
Prólogo
Preface
Closure
Epílogo
Bibliography
Prólogo
H(a) = \dot{a}/a = H_0 \sqrt{(\Omega_c + \Omega_b)\,a^{-3} + \Omega_{rad}\,a^{-4} + \Omega_k\,a^{-2} + \Omega_{DE}\,a^{-3(1+w)}}   (1)

where Ω_i represents the density parameter of species i, that is, the ratio of the density ρ_i to the critical density ρ_crit = 3H_0^2/(8πG) at the current epoch.
Experiments find a component of ordinary (baryonic) matter of Ωb = 0.049, and a small component of radiation (photons and neutrinos) of Ωrad = 9·10^{-5}. The novelty with respect to the old Big Bang theory is the dominant effect of two new species: Cold Dark Matter (Ωc = 0.266) and Dark Energy (ΩDE = 0.685). In the standard ΛCDM model, the curvature component is negligible (Ωk = 0), and the equation of state of Dark Energy is w = −1, corresponding to a Cosmological Constant Λ. The current expansion rate of the Universe is H0 = 67.3 (km/s)/Mpc. All quoted values are from [4].

Cold Dark Matter (CDM; hereafter all acronyms derive from the English nomenclature) is indistinguishable from ordinary matter in its gravitational behaviour and, therefore, in the way it affects the expansion of the Universe (Equation 1). However, a collisionless (non-baryonic) matter component is needed to explain the formation of the structures that we find in the Universe.
w = w_0 + (1 - a)\,w_a   (2)
[...] baryonic matter at those scales is limited and may lie behind the cause (e.g. [12]). In fact, new observations are providing solutions to some of the problems that had long been awaiting an answer (e.g. the discovery of new satellite galaxies [13]). Regarding the other two cases, estimating the probability of rare events (the existence of extreme clusters and the peculiarities of the CMB), once we know that they occur, can be a complicated and subjective task. New estimates by other groups find that the cited anomalies are compatible with ΛCDM [14–16].
The Cosmological Revolution
[...] Energy Survey (DES) [19], and on the right with data from the future Euclid survey [20]. The main objective of these two experiments is to measure with unprecedented precision the equation of state of Dark Energy and its time variation (Equation 2). This will help us understand the nature of the mysterious force that dominates the energy density of the Universe.

As seen in the upper part of Figure 1, three main types of experiments contributed to the early stages of Precision Cosmology. I briefly review them below.
Type Ia Supernovae (SNIa) are violent explosions that can be used as standard candles (objects whose luminosity is known) and observed at cosmological distances. By measuring their redshift z, we can study its relation to the luminosity distance:

d_L(z) = (1 + z) \int_0^z \frac{c\, dz'}{H(z')}   (3)

which depends strongly on the cosmological parameters that determine the evolution of the Universe ([21, 22]).
Towards the end of the last century (1998-1999), the High-Z Supernova Search Team and the Supernova Cosmology Project measured d_L(z), finding that, contrary to expectations, the expansion of the Universe was accelerating [23, 24]. This was the first of a series of pieces of evidence for the existence of Dark Energy.
\chi_{BAO} = \frac{c}{\sqrt{3}} \int_0^{a_{dec}} \frac{da}{a^2 H(a) \sqrt{1 + 3\Omega_b a/(4\Omega_\gamma)}}   (4)
d_A(z) = \frac{1}{1+z} \int_0^z \frac{c\, dz'}{H(z')}   (5)

which is related to Equation 3 through d_M(z) = d_A(z)·(1 + z) = d_L(z)/(1 + z). Figure 2 shows the BAO and SNIa signals together.
Although gaining a deep understanding of the highly non-linear physics associated with structure formation and its relation to the observables is no easy task, the physics that determines χ_BAO rests on much simpler principles. The BAO signal measured in the Large Scale Structure (LSS) thus soon became a very fruitful source for determining the cosmological parameters.

The BAO was first detected in the galaxy distribution by the 2dFGRS^7 [27] and SDSS^8 [28] collaborations. Later on, more precise measurements were established by 6dFGS^9 [29], WiggleZ^10 [30] and BOSS^11 [31]. In addition to these measurements, BOSS measured the BAO through the so-called Lyman-α forests: reconstructing the three-dimensional distribution of the clouds of intergalactic neutral hydrogen that leave absorption lines in the spectra of distant quasars [32, 33].

^7 http://www.2dfgrs.net/
^8 http://www.sdss.org/
^9 http://www.6dfgs.net/
^10 http://wigglez.swin.edu.au/site/
^11 http://Cosmology.lbl.gov/BOSS/
Preface

A concordance model has been reached in the field of Cosmology, able to reconcile
all the cosmological experiments: ΛCDM. This model is based on the principles of
the Hot Big Bang theory, which explains the expansion of the Universe through the Friedmann equation:

H(a) = \dot{a}/a = H_0 \sqrt{(\Omega_c + \Omega_b)\,a^{-3} + \Omega_{rad}\,a^{-4} + \Omega_k\,a^{-2} + \Omega_{DE}\,a^{-3(1+w)}}   (1)

with Ω_i representing the density parameter of species i; that is, the ratio of the density ρ_i to the critical density ρ_crit = 3H_0^2/(8πG) at the current epoch.
Experiments find a component of ordinary (baryonic) matter of Ωb = 0.049 and a
small component of radiation (photons and neutrinos) of Ωrad = 9·10^{-5}. The novelty
with respect to the old Big Bang theory is the dominant effect of two new species:
the Cold Dark Matter (Ωc = 0.266) and the Dark Energy (ΩDE = 0.685). In the
standard ΛCDM model, the curvature is negligible (Ωk = 0) and the equation of
state of Dark Energy is w = −1, corresponding to a Cosmological Constant Λ. The
measured current expansion rate is H0 = 67.3 (km/s)/Mpc (all quoted values from [4]).
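The expansion rate of Equation 1 is straightforward to evaluate numerically. The following is a minimal sketch in Python, assuming the parameter values just quoted; the function and variable names are ours:

```python
import numpy as np

# Density parameters and expansion rate quoted in the text [4]
H0   = 67.3     # (km/s)/Mpc
Oc   = 0.266    # Cold Dark Matter
Ob   = 0.049    # baryons
Orad = 9e-5     # photons + neutrinos
Ok   = 0.0      # curvature (flat)
Ode  = 0.685    # Dark Energy
w    = -1.0     # equation of state: a Cosmological Constant

def hubble(a):
    """Expansion rate H(a) from Equation 1, in (km/s)/Mpc."""
    return H0 * np.sqrt((Oc + Ob) * a**-3 + Orad * a**-4
                        + Ok * a**-2 + Ode * a**(-3.0 * (1.0 + w)))

print(hubble(1.0))   # today: recovers H0
print(hubble(0.5))   # at redshift z = 1/a - 1 = 1
```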
The Cold Dark Matter is indistinguishable from ordinary matter in its gravitational
behaviour and, hence, in the way it affects the expansion of the Universe (Equa-
tion 1). However, we need a non-baryonic collisionless component to explain the
formation of the structures that we find in the Universe. Its presence has been
w = w_0 + (1 - a)\,w_a   (2)
Figure 2: Distance-redshift relation. Compilation of SNIa from [42] (2008) and BAO
measurements from [29–31, 33]. Figure from C. Blake in [43].
Cosmological Revolution
represented in the middle panels. The top panels used to appear in any presentation remotely related to Cosmology; nowadays, Planck data have taken over that role. But there is still much work to be done in the future, as we keep advancing towards Precision Cosmology. The two bottom panels show constraints forecast for the next decade: on the left for the completed Dark Energy Survey [19] and on the right for the future survey Euclid [20]. The main target of both of these experiments is to set unprecedented constraints on the time-dependent equation of state of Dark Energy (Equation 2). This will help us understand the nature of this mysterious force that dominates the energy density of the Universe.
As seen at the top of Figure 1, three main types of experiments contributed to the early stages of Precision Cosmology. I will briefly review them below.
Type Ia Supernovae (SNIa) are violent explosions that can be used as standard candles and detected at cosmological distances. Measuring their redshift z, we can study the luminosity distance-redshift relation

d_L(z) = (1 + z) \int_0^z \frac{c\, dz'}{H(z')}   (3)

which depends strongly on the cosmological parameters that determine the evolution of the late Universe ([21, 22]).
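Equation 3 lends itself to direct numerical integration. A minimal sketch, assuming the flat ΛCDM parameters quoted earlier and neglecting radiation at low redshift (function names are ours):

```python
import numpy as np
from scipy.integrate import quad

H0, c = 67.3, 299792.458   # (km/s)/Mpc and km/s
Om, Ode = 0.315, 0.685     # (Omega_c + Omega_b) and Omega_DE

def H(z):
    """H(z) for flat LCDM with w = -1 (radiation neglected at low z)."""
    return H0 * np.sqrt(Om * (1 + z)**3 + Ode)

def d_L(z):
    """Luminosity distance of Equation 3, in Mpc."""
    chi, _ = quad(lambda zp: c / H(zp), 0.0, z)   # comoving distance
    return (1 + z) * chi

for z in (0.1, 0.5, 1.0):
    print(f"d_L({z}) = {d_L(z):.0f} Mpc")
```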
At the end of the last century (1998-1999), the High-Z Supernova Search Team and the Supernova Cosmology Project measured d_L(z), determining that instead of decelerating, as was expected, the expansion of the Universe was accelerating [23, 24]. This was the first evidence for Dark Energy.
[...] the fluid following sound waves caused by the pressure gradients. Eventually, the
Universe becomes neutral at recombination and baryonic matter and photons stop
interacting shortly after: at decoupling. At this moment, oscillations also freeze and
leave imprinted the scale of the sound horizon χBAO in the distribution of matter
[25, 26]:
\chi_{BAO} = \frac{c}{\sqrt{3}} \int_0^{a_{dec}} \frac{da}{a^2 H(a) \sqrt{1 + 3\Omega_b a/(4\Omega_\gamma)}}   (4)

where a_dec is the scale factor at decoupling.
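Numerically, the sound horizon follows from a single quadrature. A minimal sketch under the same illustrative parameters; the photon-only density Ω_γ and the value of a_dec are assumptions here:

```python
import numpy as np
from scipy.integrate import quad

H0, c = 67.3, 299792.458             # (km/s)/Mpc and km/s
Om, Ode = 0.315, 0.685               # matter (baryons + CDM) and Dark Energy
Orad, Og, Ob = 9e-5, 5.4e-5, 0.049   # radiation; photons only (assumed); baryons

def H(a):
    return H0 * np.sqrt(Om * a**-3 + Orad * a**-4 + Ode)

def integrand(a):
    R = 3.0 * Ob * a / (4.0 * Og)    # baryon loading of the photon fluid
    return 1.0 / (a**2 * H(a) * np.sqrt(1.0 + R))

a_dec = 1.0 / 1090.0                 # decoupling at z ~ 1089 (assumed)
chi, _ = quad(integrand, 1e-8, a_dec)
print(c / np.sqrt(3.0) * chi)        # of order 145 Mpc comoving
```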
This scale can be found at different cosmological times in the distribution of galaxies
(and other tracers of matter) as a bump in the correlation function at large scales.
We can use it as a standard ruler to determine the angular distance-redshift relation
d_A(z) = \frac{1}{1+z} \int_0^z \frac{c\, dz'}{H(z')}   (5)

which is related to Equation 3 through d_M(z) = d_A(z)·(1 + z) = d_L(z)/(1 + z). Both BAO and SNIa measurements are shown together in Figure 2.
Even though disentangling the highly non-linear physics involved in galaxy and structure formation might be arduous, BAO measurements from the Large Scale Structure (LSS) soon became a very powerful tool to constrain Cosmology. This is partly because the size of χ_BAO relies on more basic principles. The BAO was first detected in the galaxy distribution by the 2dFGRS collaboration^21 [27] and SDSS^22 [28]; later, more precise measurements were performed by 6dFGS^23 [29], WiggleZ^24 [30] and BOSS^25 [31]. Additionally, BOSS measured the BAO from Lyman-α forests: a 3D reconstruction of the intergalactic blobs of neutral hydrogen that imprint absorption lines in the spectra of distant quasars [32, 33].
^21 http://www.2dfgrs.net/
^22 http://www.sdss.org/
^23 http://www.6dfgs.net/
^24 http://wigglez.swin.edu.au/site/
^25 http://Cosmology.lbl.gov/BOSS/
Currently, the strongest constraints on Cosmology are set by the CMB. This is in part due to the fact that CMB physics can be easily understood and modelled from linear perturbation theory; this boosted this type of experiment, pioneering Precision Cosmology. But the exploitation of its information has reached a maximum, and attention now focuses on higher-order and more subtle effects (polarization, spectral distortions, etc.).

On the other hand, our understanding of the Large Scale Structure (LSS) is improving notably with time. This is due to better control of the systematic experimental errors and more precise and specialised instruments, but also to a better understanding of the astrophysics involved and a more precise knowledge of the underlying Cosmology.
^26 http://lambda.gsfc.nasa.gov/product/cobe/
^27 http://map.gsfc.nasa.gov/
[...] (a scheme representing the history of halos). These are analysed by means of the halo
age, merger rate and mass evolution.
Additionally, measurements from LSS need error bars and covariance matrices accounting for systematic errors, cosmic variance and their interplay. In order to estimate them we need not just one simulation but hundreds or even thousands.
Precise N -Body simulations are very costly in terms of computing resources, and
running that many is prohibitive (see Section 2.1 and references therein). Hence,
we need a new generation of simulating tools to generate approximate synthetic cat-
alogues. In Chapter 2 I present halogen, a technique designed to generate halo
catalogues with the correct 2-point correlation function at large scales, reducing the
CPU-hours required by a factor of ∼ 10^{3-5} compared to an N-Body simulation, and the memory by a factor of ∼ 10^{1-2} (Table 2.5). Other approximate methods for fast gener-
ation of halo mock catalogues from the literature are also presented and compared
in Section 2.6.
Finally, in Chapter 3, I present the application of halogen in the context of the
Dark Energy Survey data analysis. Firstly, the catalogues are adapted with three
additional observational features: construction of a lightcone, simulation of photo-
metric redshift and the implementation of a Halo Occupation Distribution scheme
(HOD, see Section 3.2.3 and references therein) fitted to reproduce the observed
galaxy clustering. Then, a batch of mock catalogues is generated and we show its applicability to gaining insight into the modelling, optimising the analysis methodology and computing error bars and covariance matrices for the Large Scale Structure analysis.
Chapter 1

Merger Trees and Halo Finder Comparison
In the early stages of evolution of the Universe, the homogeneous and isotropic as-
sumption represents a good approximation. For studies of CMB where perturbations
are very small (δ ∼ 10^{-5}), we can use linear perturbation theory to model the dis-
tribution of matter in the Universe. However, these small fluctuations continue to
grow due to gravitational collapse forming the complex cosmic web (with filaments,
knots and walls) that we find around us at the present epoch (Figure 3).
The details of structure formation cannot be properly modelled from perturbation theory because collapsed objects enter the highly non-linear regime of gravity. In this regime, we can only rely on N-Body simulations, where particles are evolved under gravity step by step. We will briefly review the methods to perform N-Body simulations in Section 1.1.1.
The final outcome of an N -Body simulation represents the dark matter density field
as shown in Figure 1.1, which is not a direct observable. Hence, for each type of
N -Body simulation and associated observation, a post-processing is needed (Sec-
tion 1.1.2). Halo finders and merger tree builders are part of the analysis pipeline of
N -Body simulations. The former finds collapsed objects called halos within the dark
matter distribution at a given time-step or snapshot (Section 1.2). The latter links
those halos across different time-steps and identifies the mergers of halos, generating a scheme called a merger tree (Figure 1.2 and Section 1.3).
There is a wide variety of methods used by the community for halo finding and merger tree building. In this chapter we analyse the differences and similarities in the resulting merger trees for the different combinations of methods; see Section 1.1.3 for a more detailed description of the context and motivation of this study. This analysis is done on the one hand from the geometrical point of view (Section 1.4), and on the other hand by studying the halo mass evolution (Section 1.5). Finally, conclusions are presented in Section 1.6.
N-Body simulations are used in many areas of Astrophysics and in other fields of physics. Depending on the area, different physical processes may be relevant, and the simulations will have different requirements. For Large Scale Structure, we only simulate dark matter particles, for which only gravity and the expansion of the Universe are relevant. These particles are not fundamental particles, but collisionless tracers of the phase-space, with very high masses exceeding a million (and often a billion) solar masses. Even though baryons – which represent a relevant fraction of the matter of the Universe – are not collisionless, in these simulations the hydrodynamics of baryons is neglected, since its effects are only relevant at small scales (≲ 2 Mpc) and would severely increase the computing time. See [45] for a thorough study of N-Body simulations in different fields and a detailed derivation of the computations presented below. A review more specialised in N-Body methods in Cosmology is [46], and [47] is a more recent review better contextualised with experiments.
Cosmological simulations represent the Universe in a box of constant comoving vol-
ume sampled with N particles. In order to model an infinite and boundless Universe,
we impose periodic conditions, i.e. a particle leaving one side of the box appears
on the opposite side, and the gravitational potential is generated not only by the
particles in the box, but also by an infinite number of replicas of the same box.

Figure 1.1: MICE Grand Challenge N-Body simulation dark matter distribution; brighter parts represent denser regions (Table 2.1). This figure explores the very large scales up to 3072 Mpc/h, which is the simulation size (the extension beyond it being replicas following the periodic conditions, see text), as well as intermediate scales (100 Mpc/h), where the deviations from homogeneity are more pronounced.

Each
particle is represented by its comoving coordinate \vec{x} = \vec{r}/a and comoving velocity \vec{u}, where \vec{r} is its physical position. In this comoving frame, the equations of motion become
d\vec{x}/dt = \vec{u}
d\vec{u}/dt = -2H\,\vec{u} - \frac{1}{a^3}\nabla_x \phi   (1.1)
where φ(\vec{x}) is the Newtonian potential, determined by the density perturbations δ = (ρ − ρ̄)/ρ̄ with respect to the mean ρ̄ through the Poisson equation (written here in the convention matching Equation 1.1):

\nabla_x^2 \phi = 4\pi G\, a^3 \bar{\rho}\, \delta   (1.2)
Having all the equations and physics that govern the system, we now need to specify the method to generate the initial conditions, numerically compute the potential, and integrate the equations of motion.
Initial Conditions
The initial power spectrum of matter is well known, as it is measured from the CMB. Given the cosmology, it can be easily calculated with codes such as camb [48]. Starting from a completely uniform Universe, we use Lagrangian Perturbation Theory to perturb the field and generate a density field with the correct power spectrum.

Lagrangian Perturbation Theory (LPT) studies how particles (fluid elements) move across the fixed coordinate space, unlike Eulerian theory, where the object of study is the variation of the density and velocity fields at a given position [49, 50]. At z = ∞, particles are distributed on a regular grid with coordinates \vec{q} (Lagrangian position). As the Universe expands, particles are displaced to their Eulerian position \vec{x}:
\vec{x}(t) = \vec{q} + \vec{\Psi}(t, \vec{q})   (1.3)

\vec{\Psi} = \vec{\Psi}^{(1)} + \vec{\Psi}^{(2)} + \dots   (1.4)
In the Zel'dovich Approximation (ZA [51], the traditional name given to 1st-order LPT), we only keep the first term, whereas for 2nd-order LPT (2LPT) we keep terms up to 2nd order. Solving the corresponding equations of motion, one arrives at:

\nabla_q \cdot \vec{\Psi}^{(1)} = -D_1(t)\,\delta(\vec{q})
\nabla_q \cdot \vec{\Psi}^{(2)} = D_2(t) \sum_{i>j} \left[ \Psi^{(1)}_{i,i}\Psi^{(1)}_{j,j} - \left(\Psi^{(1)}_{i,j}\right)^2 \right]   (1.5)

where D_1 and D_2 are the 1st- and 2nd-order growth factors, respectively.
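In Fourier space the first of these equations inverts trivially, \vec{\Psi}^{(1)}(\vec{k}) = i\,\vec{k}\,D_1\,\delta(\vec{k})/k^2, which is how ZA displacements are produced in practice. A minimal sketch on a periodic grid; the grid size, box size, growth factor and the white-noise stand-in for δ are all illustrative assumptions:

```python
import numpy as np

N, L, D1 = 64, 100.0, 0.02                 # grid cells, box size, D_1 (assumed)
delta = np.random.normal(size=(N, N, N))   # stand-in for a proper density field

k1d = 2.0 * np.pi * np.fft.fftfreq(N, d=L / N)
kx, ky, kz = np.meshgrid(k1d, k1d, k1d, indexing="ij")
k2 = kx**2 + ky**2 + kz**2
k2[0, 0, 0] = 1.0                          # avoid dividing by zero at k = 0

delta_k = np.fft.fftn(delta)
psi = []
for ki in (kx, ky, kz):
    psi_k = 1j * ki / k2 * D1 * delta_k    # solves div Psi^(1) = -D1 delta
    psi_k[0, 0, 0] = 0.0                   # no mean displacement
    psi.append(np.fft.ifftn(psi_k).real)

q = (np.indices((N, N, N)) + 0.5) * (L / N)   # Lagrangian grid positions
x = (q + np.array(psi)) % L                   # Eulerian positions (Eq. 1.3)
```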
Initial conditions have to be generated at a redshift high enough that perturbations are still in the linear regime. But if we set them at too large a redshift, the gravity solver will integrate numerical noise. The standard method for many years has been to use the ZA at the redshift when density perturbations reach δ ∼ 0.1. However, it has been shown that transients may appear with that method [52], and using 2LPT is
Force computation
Once the initial conditions are set, particles move according to gravity via the force
\vec{F} = \frac{1}{a}\nabla\phi   (1.6)
There are different ways to compute it, depending on the N-Body code. The most naive way to compute this force is using Newton's law for each particle i, under the so-called Particle-Particle (PP) approach:

\vec{F}(\vec{x}_i) = -\sum_{j \neq i} \frac{G m_i m_j}{r_{ij}^2}\, \hat{r}_{ij}   (1.7)

where \vec{r}_{ij} = \vec{x}_i - \vec{x}_j, with r_{ij} = |\vec{r}_{ij}| and \hat{r}_{ij} = \vec{r}_{ij}/r_{ij}.
This method is very accurate, but very slow, since it scales as O(N²). Tree solvers [53–55] transform it into an O(N log N) problem by arranging particles in a tree (a scheme where particles are hierarchically grouped by proximity) and only resolving groups of particles that subtend an angle θ > θ_0 from the position \vec{x}_i.
Another problem associated with both of these methods is that we have to manually add a softening to the force, to avoid the strong accelerations caused by 2-body interactions at small distances (recall that we are simulating a collisionless fluid). This softening arises naturally in the Particle-Mesh (PM) approach [56], where Equation 1.2 is solved on a grid in Fourier space and the force is derived from Equation 1.6. This method is really fast, but it lacks accuracy at small scales.
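A minimal sketch of the PP sum of Equation 1.7 with a Plummer-type softening ε, as just described; the units, values and function names are illustrative assumptions:

```python
import numpy as np

G = 1.0   # gravitational constant in internal units (assumed)

def pp_forces(pos, mass, eps=0.01):
    """Direct O(N^2) pairwise forces (Eq. 1.7) with Plummer softening."""
    F = np.zeros_like(pos)
    for i in range(len(pos)):
        d = pos - pos[i]                  # vectors from particle i to all j
        r2 = (d**2).sum(axis=1) + eps**2  # softened squared distances
        r2[i] = np.inf                    # exclude the self-interaction
        F[i] = G * mass[i] * (mass[:, None] * d / r2[:, None]**1.5).sum(axis=0)
    return F

pos = np.random.random((100, 3))
F = pp_forces(pos, np.ones(100))
```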
Combining the efficiency of PM and the accuracy of PP, we find the P³M method [57, 58]. It computes the large-scale part of φ with the PM approach and the small-scale contributions with Equation 1.7. This can still be computationally costly in very clustered regions, and another solution is the TreePM method [59, 60]. In this method, small-scale forces are computed using the Tree approach, whereas large scales are computed with the Fourier transform as in PM. Nowadays, one of the
Time integration
At every time-step, after calculating the force, we need to move the particles according to their velocities and accelerate them according to their forces. It turns out to be desirable to do this in an alternating fashion, using a leap-frog scheme. That is, position and momentum are not updated simultaneously, but with a delay of half the time-step
Δt:

\vec{x}^{\,k+1/2} = \vec{x}^{\,k} + \frac{\Delta t}{2}\,\frac{\vec{p}^{\,k}}{a^2 m}
\vec{p}^{\,k+1} = \vec{p}^{\,k} + \Delta t \cdot \vec{F}^{\,k+1/2}   (1.8)
\vec{x}^{\,k+1} = \vec{x}^{\,k+1/2} + \frac{\Delta t}{2}\,\frac{\vec{p}^{\,k+1}}{a^2 m}
Note that the third part of step k = l is identical to the first part of step k = l + 1, so they can be applied together, forming the leap-frog scheme. It is in the second part where we need to compute the force at every step, as explained before.
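A toy version of one such drift-kick-drift step, with the force supplied as a callback; the harmonic test force and all values are illustrative, and a real cosmological code would also advance a between steps:

```python
import numpy as np

def leapfrog_step(x, p, a, dt, m, force):
    """One drift-kick-drift step of Equation 1.8."""
    x = x + 0.5 * dt * p / (a**2 * m)   # drift half a step
    p = p + dt * force(x)               # kick using the mid-step force
    x = x + 0.5 * dt * p / (a**2 * m)   # drift the remaining half step
    return x, p

# toy usage: a harmonic force in a static box (a = 1)
x, p = np.array([1.0]), np.array([0.0])
for _ in range(1000):
    x, p = leapfrog_step(x, p, a=1.0, dt=0.01, m=1.0, force=lambda x: -x)
```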
The N-Body simulations give us the distribution of dark matter in the Universe. However, this is not a direct observable, and we need to include galaxies if we want to compare with observations. There are different methods to do so, relying on different techniques that will be explained below:
^1 http://wwwmpa.mpa-garching.mpg.de/gadget/
• Halo Finder. Galaxies live in dark matter halos: self-bound, virialised and
very dense objects with spheroidal shape. Halo finders are codes that identify
these objects from the distribution of particles in the simulation at a given
time-step or snapshot. Some halo finders also identify sub-halos (halos that lie
in another halo). See Section 1.2 and [63] for a review.
• Merger Tree builders. A merger tree is a scheme that traces back a halo from the latest snapshot to the origin of all its progenitors; it tells us about the history of halos, including their age, merger rate, etc. A merger tree builder is a code that links halos across different snapshots. See Section 1.3 and [64] for a description of most methods.
Figure 1.2: Merger Trees representation. On the left panel we see the particles of an
N -Body Simulation as red dots, the halos as blue circles with radius R200c (defined by
Equation 1.9) and merger trees as arrows linking halos across snapshots (the green
one represents a merger). On the right panel (from [80]) we find a chart representing
a merger tree following the mergers of halos (circles) through different snapshots ti .
We just saw in Section 1.1.2 that many techniques and prescriptions implemented in codes are used to analyse simulations. Different codes are used by different research groups in the community for the same purpose. Each of these codes makes some approximations and assumptions but, do they lead to the same results? This is the question posed by the Mocking Astrophysics programme^2. During a series of workshops and subsequent studies, this programme has been analysing and validating the post-processing pipeline used by the community. Among the targets of study we find the halo finders [81], merger tree builders [64] and Semi-Analytical Models [65]. The analysis presented in the remainder of this chapter was done as part of this programme and as a consequence of the SussingMergerTrees workshop^3. The aim is to address the question of how the combination of halo finder and merger tree builder affects the properties of the final merger tree [1].
^2 http://www.nottingham.ac.uk/~ppzfrp/mockingastrophysics/
^3 http://popia.ft.uam.es/SussingMergerTrees
As already explained, halo finders search for dark matter halos within the particle distribution of a simulation snapshot. The exact definition of halos in simulations is actually set by the halo finder itself and can vary significantly from one finder to another, especially when it comes to subhalos (halos lying inside another halo). The subtleties of each code will be explained below, but we introduce here the two basic types of halo finding techniques at the main halo level: Friends-of-Friends (FoF), which links together particles separated by less than a given linking length, and Spherical Overdensity (SO), which grows spheres around density peaks until a given overdensity with respect to a reference density is reached.

While FoF can only give main halos, the SO method may also be used for subhalos. A minimum number of particles N_min must be required for halos to be valid; generally N_min = 20 is chosen.
The halo catalogues used for this study are extracted from 62 snapshots of a cosmological dark-matter-only simulation undertaken using the Gadget-3 N-body code [98], with initial conditions drawn from the WMAP-7 cosmology [99]. We use 270³ particles in a box of comoving width 62.5 h⁻¹Mpc, with a dark-matter particle mass of m_p = 9.31 × 10⁸ h⁻¹M⊙. We use 62 snapshots (000, . . . , 061) evenly spaced in log a from redshift 50 to redshift 0.
While in previous comparison projects [e.g. 81, 93, 96] the same mass definition was imposed (or a common post-processing pipeline was even used to ensure this), no such thing was requested this time, i.e. every halo finder was allowed to use its own mass definition.
On the one hand, AHF and Rockstar define a spherically truncated mass through

M_{ref}(< R_{ref}) = \Delta_{ref} \times \rho_{ref} \times \frac{4\pi}{3} R_{ref}^3 ,   (1.9)

adopting the values Δ_ref = 200 and ρ_ref = ρ_crit (we will call this mass M200c) and iteratively removing particles not bound to the structure. On the other hand, HBThalo and SUBFIND return arbitrarily shaped self-bound objects based upon initial Friends-of-Friends (FoF) groups, assigning them the mass of all particles gravitationally bound to the halo (i.e. with no spherical truncation).
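Inverting Equation 1.9 gives the radius corresponding to a given M200c. A minimal sketch, assuming an illustrative z = 0 value for ρ_crit:

```python
import numpy as np

rho_crit = 2.775e11    # critical density at z = 0 in h^2 Msun / Mpc^3 (assumed)
delta_ref = 200.0

def r200c(m200c):
    """R200c in Mpc/h for a mass M200c in Msun/h (Equation 1.9 inverted)."""
    return (3.0 * m200c / (4.0 * np.pi * delta_ref * rho_crit))**(1.0 / 3.0)

print(r200c(1e14))     # a cluster-sized halo: roughly 0.75 Mpc/h
```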
Furthermore, some halo finders include the mass of any bound substructures in the main halo mass, whereas others do not. Technically, finders for which particles can only belong to one halo are termed exclusive, while finders for which particles can belong to more than one halo are termed inclusive. As substructures can typically account for 10% of the halo mass, this choice alone can make a substantial difference to the halo mass function.
Given these definitions we can now describe the general properties of the halo finders
applied to the data:
• HBThalo [102] is a tracking algorithm working in the time domain that fol-
lows structures from one time-step to the next. It returns exclusive arbitrarily
shaped gravitationally bound objects. It uses FoF groups for the initial particle
collection.
Figure 1.3: Cumulative mass functions at redshift z = 0 (left panel) and z = 2 (right
panel) for the four halo finders. There are two lines for Rockstar corresponding to
the two mass definitions discussed in the text: one corresponding to M200c (Mass)
and one based upon the particle list (N × m_p, where N is the number of particles and m_p the particle mass). The upper set of curves in each panel is based upon main
halos whereas the lower set of curves in each panel refers only to subhalos.
[...] the 20 particle threshold). Given that some tree builders only use particle mem-
bership information for a halo whereas others combine this with a table of global
properties (including halo mass), this choice of mass definition will also contribute
to the differences in the final trees.
We find that, other than for the largest 100 main halos, the different mass definitions make little difference for the main halos at z = 0, unless the mass taken from the returned Rockstar particle membership is used. This mass is systematically higher than the other estimates (and Rockstar's own returned mass). The differences in mass for main halos are slightly more pronounced at z = 2.
For subhalos there are noticeably different mass functions: AHF is incomplete at the low-mass end, with a trend that appears to worsen as the redshift increases^4.
However, despite generally finding more subhalos the other finders do not appear
to have converged to a common set. Part of this relates to the rather ambigu-
ous definition of subhalo mass: whereas for main halos it simply appears to be a
matter of choice for ∆ref and ρref (or some other well-defined criterion for viriali-
sation/boundness/linkage), subhalos – due to the embedding within the inhomoge-
neous background of the host – cannot easily follow any such rule. Again, each finder
has been allowed to pick its favourite definition for subhalo mass. But please note
that the variations seen here are not the prime focus of this study; they should nev-
ertheless be taken into account when interpreting the results presented and discussed
below. Further, the scatter in subhalo mass functions seen in previous comparisons
was much reduced due to the use of a common post-processing pipeline that ensured
a unique subhalo mass definition [93, 94, 96].
All these differences should and will certainly leave an imprint and be reflected in
the outcome when building merger trees.
^4 It was checked that a more restrictive parameter set for AHF leads to the recovery of the missing low-mass subhalos at high redshift. As already shown by [101] (Fig. 5 therein), the number of low-mass objects found depends directly on the refinement threshold applied by AHF to construct its mesh hierarchy (upon which halos are based).
Figure 1.4: A summary of the main features and requirements of the different merger
tree algorithms. For details see the individual descriptions in the text.
In the first merger tree comparison paper [64], we can find an extensive description of most available merger tree builders and a terminology convention to describe them, which we also use here. A lot of the methodology is similar across the various codes used for this study; the main features and requirements have been captured in Figure 1.4. We first categorise tree builders into those using halo trajectories (JMerge and Consistent Trees) and those using individual particle identifiers (together with possibly some additional information; all remaining tree builders). Consistent Trees is the only method that utilises both types of approach. HBT constructs halo catalogues and merger trees at the same time, as it is a tracking finder that follows structures in time. A cautionary note regarding HBT: it can be applied both as a halo finder and as a tree builder and includes elements of both, so we will always specify whether we refer to one or the other by appending 'halo' or 'tree', as necessary.
The codes themselves are best portrayed as follows:
• HBTtree is built into the halo finder HBT. It identifies and tracks objects at
the same time using particle membership information to follow objects between
output times.
• JMerge only uses halo positions and velocities to construct connections between snapshots, i.e. halos are moved backwards/forwards in time to identify matches that comply with pre-selected thresholds for mass and position changes.
• SubLink tracks particle IDs in a weighted fashion, giving priority to the in-
nermost parts of subhalos and allowing branches to skip one snapshot if an
object disappears.
Two codes were allowed to modify the original catalogue: Consistent Trees and
HBTtree. Consistent Trees adds halos when it considers they are missing: i.e.,
the halo was found both at an earlier and at a later snapshot. Consistent Trees
also removes halos when it considers them to be numerical fluctuations: i.e., the
halo does not have a descendant and both merger and tidal annihilation are unlikely
due to the distance to other halos. For external halo finders (i.e. halo catalogues not generated by its own inbuilt routine), HBTtree takes the main halo catalogue and reconstructs the substructure. This produces an exclusive halo catalogue in which the properties of the main halos may also have changed.
In this section we present the geometry and structure of merger trees and the result-
ing evolution of dark matter halos. This includes the length of the tree (Section 1.4.1)
and the tree branching ratio (Section 1.4.2). Further, it is shown graphically how
halo finders and tree builders work differently, to illustrate the features found in the
comparison.
One of the conceptually simplest properties of a tree is the length of the main branch.
It measures how far back a halo can be traced in time – starting in this case at z = 0.
This property not only relies on the performance of the halo finder and its ability
to identify halos throughout cosmic history, but also on the tree builder correctly
matching the same halo between snapshots. [64] found that the different tree building
methods produced a variety of main branch lengths, ascribing some of the features
to halo finder flaws. We shall verify this now.
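Operationally, the main-branch length discussed below is a simple backward walk along main-progenitor links. A minimal sketch, assuming a toy dictionary representation of those links:

```python
def main_branch_length(halo_id, main_prog):
    """Number of snapshots a halo can be traced back through (the length l)."""
    length = 0
    while halo_id in main_prog:        # stop when no main progenitor exists
        halo_id = main_prog[halo_id]
        length += 1
    return length

# toy tree: halo 'c' at snapshot 061 <- 'b' at 060 <- 'a' at 059
main_prog = {"c": "b", "b": "a"}
print(main_branch_length("c", main_prog))   # -> 2
```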
Figure 1.5 shows a histogram of the main branch length l, defined as the number of
snapshots a halo main branch extends backwards in time from snapshot 61 (z = 0)
to snapshot 61 − l. This is roughly equivalent to an age, given that the last 50
snapshots are separated uniformly in expansion factor, a = 1/(1 + z). On the left,
we selected the 1000 most massive main halos, whereas on the right we see the results
for the 200 most massive subhalos. The main halo population coincides from one
halo catalogue to another in at least 85% of the objects. The subhalo population
is more complicated and, in some cases, they only agree in 15% of the objects from
one finder to another. However, if we focus on comparing AHF with Rockstar or
HBThalo with SUBFIND, we find a better agreement between catalogues, rising
to ∼ 95% for main halos and ∼ 70% for subhalos. Due to these differences, the
applied number threshold translates to mass thresholds Mth that are different from
finder to finder (see also Figure 1.3); we therefore list the corresponding values in
Table 1.1. Furthermore, when using HBTtree, the individual masses of the halos
can change and so does the mass threshold. In what follows we will consistently use
Figure 1.5: Histogram of the length of the main branch. The length l is defined as
the number of snapshots a halo can be traced back through from z = 0. The left
group of panels show the 1000 most massive main halos. The right group of panels
show the 200 most massive subhalos. These number selections are equivalent to the
mass cuts shown in Table 1.1. Different panels contain results from different tree
building methods (as indicated), while within each panel there is one line for each
halo finder (as marked in the legend).
Table 1.1: Mass threshold in units of 10¹¹ h⁻¹M⊙ needed to select at z = 0 the 1000 most massive main halos (rows 1 and 2) and the 200 most massive subhalos (rows 3 and 4) for the different halo finders (columns). Odd rows show the threshold for a general tree builder, whereas even rows show the threshold for HBTtree.
Figure 1.6: Distance to the centre of the host halo vs. length of the tree l for the 200
most massive subhalos. We show the results for the four halo finders (see legend) for
the MergerTree builder.
Subhalo finding becomes especially difficult as the subhalo approaches the centre of
the host halo, as has been shown in Fig.4 of [106] and Fig.7 of [93]. In particular,
SUBFIND underestimates the mass of subhalos close to the centre of their host
halo. Given that the 200 most massive subhalos are not the same for all finders, the
subhalos selected for SUBFIND tend to be further from the host halo centres (see
Figure 1.6), and therefore they are easier to trace. AHF and especially Rockstar
find many (massive) subhalos near the centre but, due to the difficulties in that
region, a fraction of them cannot be provided with a credible progenitor in an earlier
snapshot, resulting in early tree termination. Finally, the HBThalo selection is
composed of subhalos at short, medium and large distances from the host halo centre
but, by construction, they are always required to be traceable.
On the tree builder side, JMerge only allows halos to shrink in mass by a factor of up to 0.7 and to grow by a factor of up to 4 in one snapshot, and it estimates their trajectories from global quantities (Section 1.3). This artificially truncates main branches too early for massive objects when it loses track of halos. This effect is enhanced for subhalos, whose trajectories are difficult to estimate due to the non-linear environment and the fact that their mass is more likely to grow or shrink
• AHF considers one of the merging halos to be the main halo (blue) and the
other to be a subhalo (red). In snapshot 060 the subhalo found is quite small,
so that most of the tree building codes do not link it with the (much larger) halo
in the next snapshot (061). In simple codes (JMerge, MergerTree... ) this
leads to an artificial truncation of the tree. Consistent Trees artificially
adds one halo to snapshot 060 to replace the small subhalo whereas SubLink
jumps snapshot 060 for this object. In this way both codes continue the tree.
HBTtree recomputes the substructure, creating a more traceable subhalo.
• HBThalo is able to identify at snapshot 060 two big and well defined halos
of almost the same size (only possible for exclusive halo catalogues). This is
due to the tracking nature of the finder and ensures the correct follow-up by
most tree builders. Only JMerge encounters problems due to the non-smooth
trajectories of the halos.
• Rockstar uses phase-space information so that even when the halos are over-
lapping (snapshot 060) it is able to distinguish them by their velocities. This
allows almost all tree codes (besides JMerge) to follow the evolution of the halos.

Figure 1.7: Projected image of a 1.2 Mpc/h-side cube from the N-Body simulation. Halos are represented by circles of radius corresponding to R200c (defined by Equation 1.9). This is an example of a merger between two halos that are found at z = 0 (snapshot 061) and linked across snapshots by the tree builders: the blue and red colours represent the two trees. Other halos found are represented in green. Each subfigure presents a single halo finder, with each row representing the indicated tree builder. In each row time evolves from left to right, with each cell a different snapshot.
This example neatly illustrates the difficulties that arise when dealing with subhalos.
However, the left panel of Figure 1.5 tells us that there are also situations in which
the main halo branch is truncated. We studied several of these cases and found two
main types: in the first type the main halo lies in the vicinity of a bigger halo, and is
likely to have entered it and become a subhalo a few snapshots before. In this case
the problems encountered are similar to those illustrated in the subhalo example
above, but here the in-falling halo has been classified as a main halo at z = 0.
The other type occurs when at some point the halo was wrongly associated with some other smaller halo, as happened with the red halo in Figure 1.7 for the combination JMerge-HBThalo. In this case the incorrect halo assignment never gets corrected, and typically the much smaller halo has a much shorter prior history.
Already at this stage of the analysis we can draw some conclusions from this sub-
section:
• In general, the influence of the halo finder is at least as (if not more) important
than the tree building algorithm.
• The way the halo finder deals with substructure is crucial for merger trees.
• Tree building tricks such as the creation of artificial halos or omitting snapshots
help in some cases, but are not infallible.
• AHF and Rockstar catalogues lead to earlier tree truncation for most tree
builders. This is especially true for subhalos, because they try to find subhalos
close to the host halo centre and are not able to provide them with credible
progenitors.
• SUBFIND tends to find more subhalos in the outer regions of the host, which
are easier to track.
• HBT appears to be very well designed not to truncate a tree too early, both as a halo finder and as a tree builder (as seen in Figures 1.5 and 1.7).
• Consistent Trees also stands out in avoiding low-l cases (Figure 1.5).
Another simple tree property, which is nevertheless very important for characterising
the structure or geometry of a tree, is the number of direct progenitors Ndprog (or
local branches) that a halo typically has. Figure 1.8 shows the normalised (divided by
the total number of events) histogram of Ndprog for all halos in the range 0 ≤ z ≤ 2.
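Operationally, N_dprog is just a count of progenitor links pointing at each descendant; a minimal sketch with toy links:

```python
from collections import Counter

# toy (progenitor, descendant) links between two snapshots
links = [("a", "d"), ("b", "d"), ("c", "e")]
ndprog = Counter(desc for _, desc in links)
print(ndprog["d"], ndprog["e"])   # 2 progenitors merge into 'd', 1 into 'e'
```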
For all the various combinations of tree building method and halo finder the most
common situation is to have just one single progenitor, corresponding to a halo
having no mergers on this step (which can happen multiple times during a halo
lifetime). The second most common situation is for a halo to have no progenitors,
which corresponds to a halo passing above the detection threshold and appearing for
the first time, which can happen only once. As with other properties examined in this study, our results would certainly change if we were to use a different set of output times, so the importance lies not in the individual tree results, but in their differences. For an elaborate study of the optimal choice for the temporal spacing of
snapshots to construct merger trees see [107] or [86].
It is noticeable that the Rockstar catalogue (blue dotted line) yields a tree with a significantly larger branching ratio for the tree builders SubLink, TreeMaker, and VELOCIraptor. Also, despite using a very similar technique, MergerTree shows a more moderate branching ratio. By removing objects with mass lower than
20mp (cyan dash-dotted line), we verified that this high branching ratio is caused
by objects with very low mass, as these high-N_dprog cases disappear. Recall that, even though all the halo finders cut their catalogues at 20 particles, for Rockstar the mass M200c can be lower if some of those particles lie outside R200c. This small
change, in general, moves the curves for Rockstar from the highest branching ratio
to the lowest one. Note that the mass limited tree shown in cyan is not equivalent
to the other trees because the catalogue was reduced after running the tree building
algorithm on it, hence giving non-self-consistent trees. Nevertheless, we do not expect
great variations in Figure 1.8 between the cyan line and a fully self-consistent tree
with the same mass limit. This serves as an illustration of the great influence of the
lower mass limit, pointing out again the importance of the input halo catalogue in
the resulting tree construction. To illustrate a high branching ratio case we have
selected one of the extreme cases with Ndprog > 30 in Figure 1.9. It corresponds to
one of the two most massive halos (depending on the halo finder) at snapshot 050
(z=0.32). Figure 1.9 shows all the direct progenitors of that halo and other halos
found in the area. The blue halo is the main and most massive progenitor in the
plot. The red and magenta circles represent other direct progenitors at snapshot 049
while green circles represent other (sub)halos detected in the same region. Magenta
is used for halos whose mass is below 20mp (only possible for Rockstar), while red
halos have larger mass. SubLink also links to the big halo at snapshot 050 some halos that were found at snapshot 048 but not at snapshot 049; these are marked as crosses.
Figure 1.9 tells us that, when comparing different halo catalogues, N_dprog tends to be correlated with the number of (small) halos available to be absorbed, i.e. the more green halos we find, the more merging (red and magenta) halos we find. We further
confirm that most secondary progenitors (red and magenta circles) are subhalos of
the main progenitor (blue circle) and lie within R200c . However, in some cases sec-
ondary progenitors were found outside the volume displayed (e.g. the halos missing in Consistent Trees with AHF). But in general, the properties of these halos fit into the standard merging picture, in which halos approaching a bigger one become satellites (subhalos), lose mass via tidal stripping and are eventually totally absorbed.
Figure 1.8: Normalised histograms of the number of direct progenitors N_dprog for all halos from z = 0 to z = 2 (snapshots 061 to 031). Each panel corresponds to a single tree building method; within each panel each line represents a halo catalogue as indicated. For Rockstar we show two lines, one with all the halos ('Rockstar all') and one where halos with mass lower than 20 m_p were removed ('Rockstar cut').
Figure 1.9: Projected image of a 3 Mpc/h-side cube from snapshot 049 centred on one of the most massive objects (M > 10¹⁴ h⁻¹M⊙) for all the combinations of halo finder (column) and tree builder (row). Symbol and colour coding are explained in the text. VELOCIraptor (omitted) gives the same results as TreeMaker. The label N_dprog indicates the number of progenitors (some of which might be outside this volume). For Rockstar we show a second value in which only those with mass larger than 20 m_p are considered.
If all the available halos are considered, Rockstar is the catalogue with the most small halos, leading to a higher branching ratio, which drops when removing the low-mass halos. HBThalo is also able to discern more substructure, yielding a slightly higher N_dprog than SUBFIND and AHF.
From the tree building point of view we remark that SubLink, with the possibility of
omitting one snapshot, increases Ndprog considerably for the two catalogues with more
substructure: Rockstar and HBThalo. HBTtree, in modifying the catalogue,
tends to recover the halo set generated by HBThalo. This effect is more noticeable
in the case of SUBFIND because it is also based on FoF catalogues (Section 1.2).
JMerge shows very little branching (Ndprog = 1 or 2) because by construction it
never associates a small merging halo with a much bigger one. It rather associates
the in-falling halo with another small halo.
Note, however, that this was a very extreme case and that Figure 1.9 is not necessarily
representative of the statistics seen in Figure 1.8, rather it helps to understand the
kind of factors that influence the branching ratio.
Mass growth can be characterised by the discretised logarithmic growth, defined as

\alpha_M(k, k+1) = \frac{(t_{k+1} + t_k)\,(M_{k+1} - M_k)}{(t_{k+1} - t_k)\,(M_{k+1} + M_k)}   (1.10)

where k and k + 1 are a halo and its descendant, with masses M_k and M_{k+1} at times t_k and t_{k+1}, respectively [64]. In order to reduce the range of possible values of this quantity, it is mapped onto the interval (−1, 1) through

\beta_M = \frac{1}{\pi/2} \arctan(\alpha_M)   (1.11)
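Both statistics are trivial to evaluate along a main branch. A minimal sketch, with toy output times and masses as assumptions:

```python
import numpy as np

def beta_M(t, M):
    """beta_M (Eqs. 1.10-1.11) for consecutive (halo, descendant) pairs."""
    tk, tk1 = t[:-1], t[1:]
    Mk, Mk1 = M[:-1], M[1:]
    alpha = (tk1 + tk) / (tk1 - tk) * (Mk1 - Mk) / (Mk1 + Mk)
    return np.arctan(alpha) / (np.pi / 2.0)

t = np.array([1.0, 1.1, 1.2, 1.3])            # output times (toy units)
M = np.array([1.0, 1.4, 1.3, 2.0]) * 1e12     # masses along the branch
print(beta_M(t, M))                           # growth (>0) and a dip (<0)
```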
Figure 1.10 shows the distribution of β_M for three populations: all halos (A, on the left), main halos (B, in the centre) and subhalos (C, on the right). All distributions have been normalised by the total number of events found in halo sample A in each case. Selection is done as follows: all the halos identified at z = 0 are traced back along the main branch, and at any snapshot, if both a halo and its descendant are main [sub] halos and have mass M > M_th^main [M > M_th^sub] (Table 1.1), they are added to population B [C]. Population A is compiled similarly, but taking all pairs of halos satisfying M > M_th^main, regardless of whether they are main or subhalos. Note that distribution A is dominated by main halos, since they are more numerous.
Within the hierarchical structure formation scenario one expects halos to grow over
time. This can be appreciated in column A, where the distribution of βM is skewed
towards values βM > 0. However, there is a non-negligible number of cases (∼
15 − 30%) where it decreases (βM < 0). While mass loss could be associated with
tidal stripping of subhalos, column B shows that this is not the sole explanation
within this simulation: while subhalos have an important contribution at the very
far end of the distribution (corresponding to large mass losses), there are also many
instances leading to βM < 0 for main halos. Nevertheless, there are physical ways
for main halos to lose mass: when two main halos approach each other, the effective
radius for tidal stripping extends beyond the virial radius of the larger halo [see
108, for an elaborate discussion of exactly this phenomenon], thus, the small one
can experience mass loss before becoming a satellite. Also, when halos change their
shape, the specific halo mass definition (e.g. M200c for AHF/Rockstar) of a halo
finder can lead to an apparent mass loss.
The plot clearly shows that the differences across halo finders are greater than the variations introduced by the tree building method, with the exception of HBTtree (which modifies the input halo catalogue). There are two distinct classes of distribution for main halos (B): on the one hand, Rockstar and AHF, and on the other hand, SUBFIND and HBThalo, which have a more skewed distribution.
Figure 1.10: Mass growth distribution between two snapshots, β_M, related to the logarithmic mass growth through Equation 1.11, for halos that can be identified at z = 0, with mass M > M_th at both output times. We distinguish 3 populations: A, which contains all halos with M_th = M_th^main; B, with only main halos and M_th = M_th^main; and C, with only subhalos and M_th = M_th^sub. M_th is tabulated in Table 1.1 for the different halo finders. Each row displays a different tree building algorithm (as indicated). Each halo finder has its own line style as indicated in the legend. The distribution is computed as a histogram, normalised by the total number of events found by the corresponding halo finder for population A.
Recall from Section 1.2 that the former use an inclusive mass definition; thus, for a subhalo that has just crossed the centre and is moving away, the total (inclusive) mass of the host halo can decrease if part of that subhalo crosses R200c.
We finally remark that while subhalos are present in our somewhat low-resolution
simulation (when compared to the state-of-the-art), they contribute significantly to
neither the shape nor the amplitude of the mass growth distribution shown in column
A (all halos). However, their own distribution (column C) is interesting in its own
regard: we primarily observe mass loss due to tidal stripping, i.e. an imbalance
of the distribution towards negative βM values. In this case we find that whereas
HBThalo follows one distribution, the other three follow their own. This reflects
the inconsistency in subhalo mass functions already seen in Figure 1.3.
In conclusion, most of the differences in the mass growth βM can be accounted
for by the choices made by the respective halo finder when defining quantities. In
particular, HBThalo and SUBFIND agree best with the a priori expectation from
hierarchical structure formation.
Abrupt changes along the main branch can be further quantified by the mass fluctuation

\xi_M = \frac{\beta_M(k, k+1) - \beta_M(k-1, k)}{2}   (1.12)
where k − 1, k, k + 1 represent consecutive time-steps. When far from zero, it implies
a growth followed by a dip in mass (ξM < 0) or vice versa (ξM > 0). Within the
hierarchical structure formation scenario this behaviour can be considered unphys-
ical and equates to a snapshot where the halo finder might not have assigned the
correct mass – though there are certainly situations where the definition of correct
mass remains arguable. Nevertheless, it provides another means of quantifying the
influence of the halo finder upon a merger tree.
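Given the β_M sequence of a branch, ξ_M follows immediately; a minimal sketch continuing the toy example above:

```python
import numpy as np

def xi_M(beta):
    """xi_M (Eq. 1.12) at each interior snapshot of a main branch."""
    return 0.5 * (beta[1:] - beta[:-1])

beta = np.array([0.3, 0.6, -0.4, 0.5])   # toy beta_M(k, k+1) sequence
print(xi_M(beta))   # strongly negative values flag growth followed by a dip
```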
The (normalised) distribution of ξM is presented in Figure 1.11 in the same way
as Figure 1.10, i.e. three distinct columns for all halos (A, left), main halos (B,
1.5. Mass Evolution 63
middle), and subhalos (C, right). It reconfirms most of the claims of Section 1.5.1.
We again find the distribution is essentially independent of the tree builder (besides
HBTtree) for all three populations. We find two types of distributions for main
halos (B): on the one hand, the SUBFIND and HBThalo catalogues give the
broadest distributions and on the other hand, Rockstar and AHF have a more
peaked distribution. This implies that the first pair of halo finders presents more mass fluctuations (ξ_M ≠ 0) than the second. Note that this pairing is identical
to the one reported in Section 1.5.1. And we also find (again) that subhalos (C)
do not provide an explanation for the wings of the mass fluctuation distribution in
column A, even though their own plot indicates that they predominantly undergo
abrupt changes, i.e. they have easily distinguished wings.
Given that subhalos often undergo fluctuations (column C of Figure 1.11), this could
cause fluctuations in main halos when the mass is defined exclusively (HBThalo
and SUBFIND). In order to study this effect, we selected a halo whose mass evo-
lution is characterised by a large ξM value (for the SUBFIND/HBThalo pair) in
Figure 1.12. We localised the same object (the blue halo) and surrounding ones
(red and green) in all four halo catalogues, showing the three consecutive snapshots
used for the calculation of ξM given at the very right hand side of each panel. The
halo undergoes a mass fluctuation for the finders HBThalo and SUBFIND, while
it keeps growing for AHF and Rockstar. Figure 1.12 shows that, although it is
true that for HBThalo/ SUBFIND the total mass of the subhalos increases when
the main halo decreases and vice versa, the fluctuation of subhalo mass is one order
of magnitude smaller than the main halo fluctuation and this cannot be the sole
explanation. The fact that the red halo changes from being a subhalo to a main
halo and then back to a subhalo again may be related (in a non-trivial way, since
masses are defined exclusively) to the mass fluctuation. For this simple (compared
to Figure 1.7 & Figure 1.9) configuration of halos, all the tree building algorithms
agree in the resulting trees. We also note that even small fluctuations (10% in mass)
are detected by this parameter ξM , in part due to an enhancement of ξM at late
times (cf. Equation 1.11 & Equation 1.12).
Figure 1.11: Distribution of mass fluctuations ξM (Equation 1.12), for halos found
in three consecutive snapshots along a main branch that can be identified at z = 0,
with mass M > Mth for each appearance of the halo. We distinguish 3 populations:
A, which contains all halos with Mth = Mth^main; B, with only main halos and
Mth = Mth^main; and C, with only subhalos and Mth = Mth^sub. Mth is tabulated
in Table 1.1. Comparison is made between different tree builders (each row as labelled)
and halo finders (line styles as in the legend). The distribution is computed as a
histogram normalised by the total number of events for the corresponding halo finder
for population A.
Figure 1.12: Projected 1 Mpc/h-side cube containing two halos (three for
HBThalo) evolving from snapshot 058 (left column) to 059 (central column) to
060 (right column). Each row shows a different halo finder. The radius of each circle
is drawn proportional to the mass of the object, with an extra factor of ×5 for
the small (red and green) halos. Dashed lines denote subhalos whereas solid lines
are used for main halos. The mass of each halo is also shown in units of 10¹⁰ h−1 M⊙.
At the right of each row we show the value of ξM for the big halo, which quantifies
the mass fluctuation as defined by Equation 1.12.
Figure 1.13: Summary of Figure 1.10 and Figure 1.11. On the abscissa we show the
fraction of halos for which the mass grows; on the ordinate we show the standard
deviation of the mass fluctuations. Only main halos satisfying M > Mth^main (Table 1.1)
are taken into account. Every point represents a combination of a tree builder (size-
and colour-coded) and a halo catalogue (symbol-coded, see legend).
To better draw conclusions from our study of the mass evolution of main halos,
we summarise the results of the βM and ξM statistics (Section 1.5.1 & Section 1.5.2)
in Figure 1.13: the x-axis shows the fraction fβM>0 of objects for which βM > 0,
whereas the y-axis shows the standard deviation σξM of ξM. Different sizes (or
colours) now represent different tree building methods, whereas the symbols stand
for the input halo catalogue. A priori, the desirable features of a tree describing
hierarchical structure formation are small mass loss for main halos (high fβM>0)
and small mass fluctuations (low σξM), although we have also discussed physical
causes for both phenomena. Note also that the quantities plotted here do not
substitute for the whole curves shown in Figure 1.10 & Figure 1.11, but they capture
the features of interest well. This summary plot illustrates very well how sensitively
the mass evolution depends on the choice of the halo finder:
• Points for the same halo finder (symbol) group together. The small scatter
amongst those groups reflects the small influence of the tree building method
on these quantities.
• HBTtree points deviate from the group, approaching the area of the HBThalo
finder (crosses).
We have verified that mass growth and fluctuations are intrinsically related to the
mass definition. A simple change from an inclusive to an exclusive halo catalogue,
or from M200c to arbitrarily shaped halos, would change the shape of the curves seen
in Figure 1.10 & Figure 1.11 and the position of the points in Figure 1.13. But
other fundamental properties of the halo finder also leave their imprint; the evident
differences between HBThalo and SUBFIND in Figure 1.13 are proof of this.
1.6 Conclusions
We investigated the influence of the input halo catalogue on the quality of the resulting
merger trees. 'Quality' in this regard has been identified with the length of the main
branch, the number of direct progenitors and, as quantities highly relevant for
semi-analytical modelling, the mass growth and mass fluctuations of halos. We also
showed some specific examples of cases that aided our understanding of the influence
of the halo finder and tree builder on the resulting properties of the trees.
In total, seven different tree building methods have been applied to the halo cata-
logues produced by four different halo finding algorithms which examined the same
cosmological simulation. This produced 28 merger trees to be analysed. The influ-
ence of both groups of codes is summarised below, and the particular achievements
and difficulties of the different methods discussed.
The primary conclusion of all the studies presented here is that the influence of the
input halo catalogue is greater than the influence of the tree building method employed.
This is especially clear in the mass evolution studies (Section 1.5), although
it is also noticeable in the results for the main branch length (Section 1.4.1), and
the studies on the branching ratio also suggest it (Section 1.4.2). Part of these differences
is due to the fact that for this comparison we allowed the halo finders to
choose their own definitions instead of unifying them as done in previous halo finder
comparison projects. However, in this way we find the real impact a user will encounter
when choosing one or the other halo finder for his/her analysis.
Another pattern encountered along our studies is the pairing AHF/Rockstar vs.
HBThalo/SUBFIND. This is very clear in the mass evolution of main halos (cen-
tral columns of Figure 1.10 and Figure 1.11, summarised in Figure 1.13) and can
also be seen in the main branch length distribution (Figure 1.5). We interpret this
pairing to be caused by the fundamental construction of the halo catalogues, namely
spherically truncated M200c inclusive masses (Equation 1.9) for the former pair vs.
self-bound exclusive objects starting from FoF groups for the latter. These differences
can already be seen in the main halo mass function shown in Figure 1.3.
The studies on the length of the tree (Section 1.4.1) are the cleanest test, since they
do not rely on arbitrary choices such as the lower mass cut (which makes a significant
difference for the branching ratio) or the mass definition (which is of great influence
in the mass evolution). The tracking nature of HBThalo showed excellent results
in this section, with no early truncation of (sub)halos. Rockstar and AHF showed
early truncation of trees, especially for subhalos near the centre of their host, whereas
SUBFIND did not show much early truncation of subhalos, because they are
systematically missing in the centres of the hosts. AHF led to the shortest main
branches: halos disappear due to the high-z low-M incompleteness, and the main
branches tend to end early.
The relevance of the lower mass cut was also seen in the study of the branching ratio
(Figure 1.8 in Section 1.4.2). In particular, for Rockstar a cut in mass was not
equivalent to a cut in the number of particles. Because of this, applying the same
particle-number cut as for the other catalogues made the branching ratio of
Rockstar too high.
The mass evolution of halos was found to be mostly dependent upon the mass def-
inition employed by the halo finder. However, it is not clear which finders perform
best: HBThalo/SUBFIND show less mass loss whereas AHF/Rockstar show
fewer mass fluctuations. Mass evolution is intrinsically related to the way the mass
is defined, and the choice of a different mass definition within the same halo finder
would lead to different results.
Along these lines, note that some properties of the halo finders are simple choices that
are relatively easy to change, as for example the exclusive/inclusive mass assignment
or the choice of spherical halos vs. self-bound objects. However, we have seen in [96]
that other, more fundamental, details of each halo finder (such as the initial particle
collection) leave their own unique signature in the catalogue. These are practically
unavoidable and hence users have to decide upfront which halo finder best suits their
needs.
Although we found a greater dependence on the halo finder than on the tree building
method, each of the tree codes also has its own peculiarities:
Along these lines, and confirming the results of [64], being able to skip snapshots or having
a tracking nature is found to be crucial to properly trace the history of halos.
Outlook
The main outcome of the present study is that the fundamental properties of halo
finders have a major impact on the merger trees constructed from them, and that
some tree building techniques can help improve those trees by correcting for halo
finder defects. We pointed out the repercussions that several properties of the halo
finders and tree building codes can have on the final trees. This should help the
community choose, design or modify their pipelines to construct merger trees
tailored to their specific purposes.
It is worth mentioning that, although here we focused on the differences among the
resulting merger trees, the agreement among them is nevertheless remarkable. The
general features of the trees come out as one would expect, and are similar
from one tree to another. Often the differences between trees are only seen
when plots are drawn on a logarithmic scale, since those differences are of the order
of a few cases per thousand objects plotted.

The series of workshops and studies within the Mocking Astrophysics programme
is helping us to quantify the degree of understanding that we actually have of
structure formation. It helps the community validate and improve the algorithms
used in the simulation pipeline, whose outcome we compare with observations
to learn about the physics of the Universe. This process will continue, since for every
milestone reached in Cosmology a few more arise on the horizon.
Chapter 2
HALOGEN: an approximate halo catalogue generator
2.1 Introduction
In this study we seek to abstract this pattern, providing a framework in which each
step is highly modular. Whilst modular, halogen implements default behaviour
with very simple (and rapid) components – using 2nd-order Lagrangian Perturbation
Theory (2LPT) as the gravity solver, theoretical mass functions, a single-parameter
bias prescription (as opposed to 2 or more parameters for other statistical-type meth-
ods) and a direct linear transformation of the velocities. As such, halogen can be
rapidly calibrated, and easily extended. In addition, we introduce physically moti-
vated constraints for halo exclusion and mass conservation, which tie the individual
steps together.
We will compare the results from halogen to the reference N -body simulations pre-
sented in Section 2.1.2. We introduce the general ideas of the method in Section 2.2,
leaving a more detailed explanation of the spatial placement of halos – which we
consider the essence of halogen – for Section 2.3. Section 2.4 demonstrates the
effects of each parameter of halogen and how to optimise them. We present some
applications and results of halogen in Section 2.5 and compare it to other methods
in Section 2.6.
Goliat Simulation This simulation was run with the Gadget2 code [60] from
initial conditions generated by 2LPTic at z = 32. It uses N = 512³ dark matter
particles in a box with side length Lbox = 1000 h−1 Mpc. The cosmological parameters
used in this simulation are ΩM = 0.27, ΩΛ = 0.73, Ωb = 0.044, h = 0.7, σ8 = 0.8,
ns = 0.96, yielding a mass resolution of mp = 5.58 × 10¹¹ h−1 M⊙. In this catalogue we
use a reference halo number density of n = 2.0 · 10⁻⁴ (Mpc/h)⁻³. The halo catalogue
was obtained from a z = 0 snapshot and has been generated with the halo finder
AHF [123], a spherical-overdensity (SO) algorithm (see Section 1.2). Though AHF
identifies subhalos, they have been discarded for the present analysis as these scales
are too small for 2LPT to resolve. We show in Section 3.2.3 how to add substructure
in a phenomenological way following a Halo Occupation Distribution.
halogen requires an input density field obtained from 2LPT (see Section 1.1.1). For
this purpose, we run a 2LPTic snapshot at z = 0 with the same initial condition
phases as those used in goliat.
Table 2.1: Properties of the three reference N -body halo catalogues. From left to right: Side-length of the simulated
cubic volume (in h−1 Mpc), number of particles (for N -body and halogen), redshift of the snapshot, cosmological
parameters (density of baryons, total matter and dark energy, Hubble parameter, power spectrum normalisation and
spectral index), halo finding technique, halo number density (in (Mpc/h)−3 ), method used to generate the initial
conditions and redshift at which they were generated.
This simulation will also be the reference for the comparison of the methods in
Section 2.6.
In this section we briefly outline our method, leaving a more detailed presentation
of the actual modus operandi of halogen for Section 2.3. The general algorithm
consists of four (major) steps:
1. Generate a dark matter density field at the target redshift, sampled by particles
(by default using 2LPT).

2. Sample a theoretical halo mass function n(> M) to obtain a list of halo masses,
ordered in descending mass.

3. Place the halos on a subset of the particles, selected with a mass-dependent
stochastic bias.

4. Assign a velocity to each halo from the velocity of its selected particle.
Figure 2.1: Here we show the difference between performing an actual N-body simulation
(left) and using 2LPT (right) to generate a particle distribution at z = 0.5,
with the same initial conditions. The image shows a slice of the density contrast δ
distribution in a (1 h−1 Gpc)³ box.
We aim to de-couple each of these steps from the others as far as possible so that
different algorithms may be used at each point. The first two steps are relatively
trivial, as they use pre-developed prescriptions from the literature, and we discuss
these, and basic outlines of the last two steps, in this section.
The basic scaffolding of halogen is an appropriate dark matter density field realised
at the desired redshift, sampled by N particles. For simplicity we choose to use
2nd-order Lagrangian Perturbation Theory (2LPT) (see Section 1.1.1 or [49, 50]) to
produce this field, which can be obtained with the public code 2LPTic.
We show in Figure 2.1 the density distribution of an N-body simulation (left panel)
and a 2LPT representation (right panel) at z = 0.5. Notably, the 2LPT distribution
appears to be blurred in comparison to the N -body simulation. This is due to
the fact that 2LPTic – as the name suggests – was originally designed only to
generate initial conditions [129], since even 2nd -order perturbation theory breaks
down at low redshift when over-densities become highly non-linear. The small-scale
difference in Figure 2.1 can be explained by shell crossing, an effect in which particles
following their 2LPT trajectories cross paths and continue rather than gravitationally
attracting each other in a fully non-linear manner [130, 131]. In order to compensate
for shell-crossing, [113] advocates the use of a smoothing kernel over the input power
spectrum. We tested the effect of this smoothing in halogen but did not find any
improvement in the final catalogue.
Nevertheless, 2LPT provides a suitable approximation of the large scale distribution
of matter, where perturbations have not yet entered into the highly non-linear regime
and this is sufficient for halogen. Note that halogen is in principle agnostic about
the method by which this density-field snapshot is produced. Other methods, for
instance the 'Quick-PM' code cola [119] or 3LPT, could equally be employed by the user.
A different choice of density field will yield somewhat different results, especially at
smaller scales. As long as the chosen method reconstructs large scales correctly, the
remaining steps of halogen should be unmodified.
Despite this, we have by default incorporated 2LPTic as part of the halogen code
(which bypasses the costly I/O of writing the snapshot to disk), but also allow the
user to provide an arbitrary snapshot with a distribution of N particles in a cosmo-
logical volume. Our choice for 2LPT was mainly driven by its low computational
cost and success in the distribution of matter at large scales. We use this approach
for all results in this study.
The halo mass function (HMF) n(> M ) measures the number density of halos above
a given mass scale. It is required to generate mass-conditional clustering, which in
turn is a pre-requisite for extension to HOD-based galaxy mock generation.
The most accurate HMF for a given cosmology, over a range of suitable scales, may
be obtained from an N-body simulation via a halo-finding algorithm – although there
are notable variations depending on the technique [81]. Since we require a full N -
body simulation for the tuning of halogen, it would be perfectly acceptable to use
this simulation to generate the HMF. However, in the hope of future improvements,
we wish to avoid using the full simulation as far as possible. Fortunately, there is a
wealth of literature concerning accurate predictions of the HMF for widely varying
cosmologies and redshifts using Extended Press-Schechter theory [39, 132].
The mass function may be calculated by any means, as long as a discretised
function of n(> M) is provided. For simplicity, we decided to use the online halo
mass function calculator HMFcalc10 [133] to obtain the halo mass distribution
in this study. We produce a sampled mass function from an arbitrary input HMF by
the standard inverse-CDF method: we draw uniform random numbers yi and invert
the cumulative mass function n(> M) to obtain the halo masses Mi = n−1(yi).
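As an illustration, a minimal sketch of this inverse-CDF sampling for a tabulated cumulative mass function (the table would come from e.g. HMFcalc; the function and argument names are ours, not halogen's):

import numpy as np

def sample_halo_masses(masses, n_gtM, volume, seed=0):
    """Draw halo masses M_i = n^{-1}(y_i) from a tabulated n(>M).

    `masses` (ascending, numpy array) and `n_gtM` give the cumulative
    mass function; `volume` is the box volume, so roughly
    n_gtM.max() * volume halos are drawn.
    """
    rng = np.random.default_rng(seed)
    n_halos = int(n_gtM.max() * volume)               # expected number of halos
    y = rng.uniform(n_gtM.min(), n_gtM.max(), n_halos)
    # n(>M) decreases with M: interpolate log y -> log M on reversed tables
    logM = np.interp(np.log(y), np.log(n_gtM[::-1]), np.log(masses[::-1]))
    return np.sort(np.exp(logM))[::-1]                # descending mass order

Drawing the yi uniformly over the range covered by the table guarantees, by construction, that the sampled masses follow the input n(>M) up to shot noise, which is the behaviour seen in Figure 2.2.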
In Figure 2.2 we demonstrate how well the input HMF is reproduced, only differing
from the mass function fit [Watson, 134] at high mass due to Poisson shot-noise
controlled by the volume V (where expected numbers are of order unity). Further,
the HMF of BigMultiDark shows similar behaviour, indicating that the chosen
fit is appropriate for this simulation.
10 http://hmf.icrar.org
[Figure 2.2: cumulative halo mass function, shown as the number density n(>M) and the counts N(>M), for the Watson fit, the FoF reference catalogue and halogen.]
The crucial step in the generation of approximate halo catalogues is the assignment
of halo positions. In keeping with the philosophy of modularity, the halo-placement
step is de-coupled from the rest. Any routine which takes a vector of halo
masses and an array of dark matter particle positions and returns a subset of those
positions as the halo locations is acceptable. However, we consider this step to be
at the heart of the halogen method, as it is responsible for generating the correct
mass-dependent clustering.
To achieve an efficient placement that reconstitutes the target two-point statistics,
we recognise the validity of the clustering on large scales from the broad-brush 2LPT
field. We place halos on 2LPT field particles, essentially using the estimated density
field as scaffolding on which to build an approximate halo field. We will follow a
series of steps in the construction of the method of spatial placement to be presented
in Section 2.3 below.
The most obvious way to assign velocities to each halo would be to use the velocity
of the particle on which it is centred. However, halos are virialised systems whose
velocities tend to be lower than that of their constituent particles. This is potentially
mitigated by using the average velocity of all particles within a defined radius of the
artificially placed halo. However, this is not robust as there are often very few
particles inside the halo radius. Additionally, the 2LPT particle velocities will differ
from their N -body counterparts due to shell-crossing, especially on the small scales
associated with halos.
Thus, we prefer to take a phenomenological approach, and assume that a simple
mapping via a factor fvel, vhalo = fvel · vpart, can be applied to the collection of halo
velocities to recover the results of the N-body distribution.
This factor could a priori depend on the velocity (i.e. a non-linear mapping) and
the mass of the halo fvel (vpart , Mhalo ). However, we will show in Section 2.4.2 that a
linear mapping is sufficient and present a way to compute fvel (Mhalo ).
Though halogen is a four-stage process, the most crucial aspect is the assignment
of halo positions, which this section describes in some detail. The general concept is
to specify a sample of particles from an underlying density field as halos.
The motivating philosophy of halogen is to start from the simplest idea and im-
prove if necessary. In this vein, we present here successive stages of evolution of
the halogen method, which we hope will show satisfactorily that the method as
it stands is optimal. Figure 2.3 will serve as the showcase for the various stages of
halogen. In it we present the 2-point correlation function (2PCF) for each stage of
development to verify that the method approaches the goliat reference catalogue
as new characteristics are added.
Note that the 2PCF is computed with the publicly available parallel code CUTE11
[135]. In the fitting routine that is included in the halogen package and described
in Section 2.4.1 we also use the same code.
We start with the simplest approach: using random particles from the 2LPT snapshot
as the sites for halos. We expect to recover the large-scale shape of the 2PCF in this
way, as this is encoded in the 2LPT density field which we trace.
However, it is clear from Figure 2.3 that this method ('random no-exc') consistently
underestimates the 2PCF on all scales except r < 1 h−1 Mpc, where it should sharply
drop to −1 but instead remains positive.
The consistent under-estimate is a manifestation of an inaccurate linear bias, b, defined
as the scaling factor between the 2-point function of the halos and that of the
underlying matter density field:

ξhalo(r) = b² ξdm(r).    (2.2)

11 http://members.ift.uam-csic.es/dmonge/CUTE.html
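As an aside, given measured correlation functions this linear bias can be estimated in a few lines (a sketch; the averaging window corresponds to the quasi-linear fitting range used later in Section 2.4.1):

import numpy as np

def linear_bias(xi_halo, xi_dm, r, rmin=15.0, rmax=47.0):
    """b = sqrt(xi_halo / xi_dm), averaged over large, quasi-linear scales."""
    sel = (r > rmin) & (r < rmax)
    return np.sqrt(np.mean(xi_halo[sel] / xi_dm[sel]))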
The simplest improvement to the random case is to eliminate the artificial small-scale
correlations. Though the primary application of halogen will be for large scales, a
simple improvement at small scales is useful.
As we have noted, the artificial clustering at small scales arises from the fact that
particles can be arbitrarily close, whereas simulated halos have a minimum sepa-
ration. The radius of a halo is a rather subjective quantity, and its definition is
modified in various applications and halo finders. However, we may parametrise it by

R∆ = [3 Mhalo / (4π ∆h ρcrit)]^{1/3},    (2.3)
where ∆h is the overdensity of the halo with respect to the critical density of the
Universe. For the work presented here we used ∆h = 200.
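In code, Equation 2.3 is straightforward; a sketch assuming ∆h = 200, masses in 10¹⁰ M⊙/h and lengths in Mpc/h (the constant below is the critical density in these units; the function name is ours):

import numpy as np

RHO_CRIT = 27.755  # critical density in 10^10 (Msun/h) per (Mpc/h)^3

def exclusion_radius(M_halo, delta_h=200.0):
    """R_Delta = [3 M / (4 pi Delta_h rho_crit)]^(1/3)  (Equation 2.3)."""
    return (3.0 * M_halo / (4.0 * np.pi * delta_h * RHO_CRIT)) ** (1.0 / 3.0)

# e.g. a 10^14 Msun/h halo gives R_200c ~ 0.76 Mpc/h
print(exclusion_radius(1.0e4))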
Using this scale, we introduce exclusion: a modifiable option controlling the degree
to which halos can overlap, set to mimic the halo finder's specification.
For example, in this work we use both AHF and FOF (see Section 1.2). For the
latter we do not allow any overlap, whereas for the former halogen's halo centres
are not allowed to lie inside another halo's radius.
The effect of exclusion is presented in Figure 2.3 ('random exc'). As expected, scales
of r < 1 h−1 Mpc show a turnover while larger scales are unaffected. We note that
the turnover is at smaller scales for halogen than for AHF. This is to be expected,
as it is unlikely to find two AHF halos separated by a distance slightly exceeding
R∆ , due to reasons akin to the FOF over-linking problem. In such cases, there is an
increased likelihood of the two halos being subsumed into one, or one becoming a
subhalo of the other. It is conceivable that one could empirically model these effects
by tuning the value of ∆h by some factor which captures this suppressed probability.
However, as we are more interested in large scales and these considerations touch
upon the subtleties of halo definition, we consider these exclusion criteria sufficient
for present purposes. We will use this form of exclusion (in an appropriate form) for
all following work.
The effect of introducing a scale length, lcell , is also clearly seen in this result. There
is a turnover in the 2PCF below lcell , which corresponds to a significant reduction of
bias on these scales since a random particle is chosen within the cell.
2.3.4 α approach

We find that selecting completely random particles yields too low a bias, whereas
the ranked approach is highly biased. We require an intermediate solution, which
has a higher probability of selecting dense areas than the random approach, and a lower
probability than the ranked approach.

The probability that a cell is chosen is a function of its density,

Pcell ∝ G(ρcell).    (2.4)

In the completely random case, we have G(ρcell) = ρcell. In principle we can tailor
G(ρcell) so that the probability of selecting a cell reproduces the appropriate bias.
We choose to constrain G(ρcell) to have a power-law form, i.e.

G(ρcell) = ρcell^α.    (2.5)
[Figure 2.3: 2PCF ξ(r) of the goliat reference catalogue compared to successive stages of halogen: 'random no-exc', 'random exc', 'ranked exc', 'α = 1.5 exc', 'α = 2 exc' and 'α(M) exc'. The scale lcell is marked.]
We do not update the value of the probability after every halo placement because it is
computationally very expensive (O(Ncell³)), and we have checked that doing so has a
negligible effect on the output statistics.
We note that a similar method was employed in QPM [120]. In fact, the physically
meaningful distribution is fhalo (ρ) – the fraction of halos in cells with density ρ. This
can be written as
fhalo (ρ) = P (cell|ρ)fcell (ρ), (2.6)
where P (cell|ρ) specifies the relative probability of choosing a cell given its density
(in our case, ρα ), and fcell (ρ) is the intrinsic distribution of cell densities given the
cell size and cosmology (heavily related to the cosmological parameter σ8 ). QPM
specifies the target distribution fhalo (ρ) directly, as a Gaussian. In halogen we
instead specify P (cell|ρ), which is more closely tied to our algorithm. In principle
one can convert from QPM-like methods to halogen with Equation 2.6.
The approach as it stands reproduces the 2PCF accurately down to the scale of lcell .
If the 2PCF of a sample of given number density is all that is required for a specific
application, then this will do well.
However, if we were to select a sub-sample of the most massive halos of our catalogues
and recompute the 2PCF, the bias would be incorrect, since more massive halos are
more biased [136]. For a truly representative catalogue, in which the halos are
conditionally placed based on their mass, the bias model is required to be mass-
dependent. Failing this, there is no physical meaning attached to the assignment of
masses in the second step (Section 2.2.2).
Mass-dependent halo bias is also crucial for implementing HOD models on the cat-
alogue, for use in galaxy survey statistics, as the number of galaxies associated with
a halo depends on its mass.
We incorporate this mass-dependence into the α parameter, so that we finally have

G(ρcell, M) = ρcell^{α(M)},    (2.7)
bin   Mth^i [h−1 M⊙]   n^i [(h−1 Mpc)−3]   αi
0     1.64 · 10¹⁴      0.05 · 10⁻⁴         3.54
1     4.80 · 10¹³      0.40 · 10⁻⁴         2.26
2     2.65 · 10¹³      0.90 · 10⁻⁴         1.77
3     1.86 · 10¹³      1.40 · 10⁻⁴         1.48
4     1.38 · 10¹³      2.00 · 10⁻⁴         1.41

Table 2.2: Properties of the selected mass bins for the goliat simulation: mass
threshold Mth^i, equivalent number density n^i = n(M > Mth^i) and best-fit αi in
Mth^{i−1} < M < Mth^i for the halogen α(M) approach.
2.3.6 Summary

In summary, for each halo of mass M (taken in descending order) the placement consists of:

1. selecting a cell with probability Pcell ∝ ρcell^{α(M)},

2. randomly selecting a particle within the cell and using its coordinates as the
halo position,

3. ensuring that the halo does not overlap (following an exclusion criterion) with
any previously placed halo in any cell, and re-choosing a different random
particle in that case,12

4. subtracting the halo's mass from the selected cell, mcell = mcell − M: if mcell ≤ 0
the cell is removed from selection.

A condensed sketch of this placement loop is given below.
12 If, after several iterations, all the particles are found to lie inside another halo, re-choose the cell (to avoid infinite loops).
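As announced above, a condensed sketch of this loop, assuming precomputed cell densities and cell masses, a mapping from each cell to the positions of its 2LPT particles, and a fitted α(M); all names are illustrative, and the overlap re-draw of step 3 is only indicated:

import numpy as np

def place_halos(halo_masses, rho_cell, mass_in_cell,
                particles_in_cell, alpha_of, seed=0):
    """Place halos (descending mass) on 2LPT particles, cell by cell."""
    rng = np.random.default_rng(seed)
    available = mass_in_cell.astype(float).copy()
    positions = []
    for M in halo_masses:
        # step 1: P_cell ~ rho^alpha(M), excluding exhausted cells
        w = np.where(available > 0.0, rho_cell ** alpha_of(M), 0.0)
        cell = rng.choice(len(rho_cell), p=w / w.sum())
        # step 2: random particle inside the chosen cell
        pts = particles_in_cell[cell]
        pos = pts[rng.integers(len(pts))]
        # step 3: the overlap test against previously placed halos goes here
        positions.append(pos)
        # step 4: conserve mass in the cell
        available[cell] -= M
    return np.asarray(positions)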
Note that the physically motivated nature of the process suggests that higher-order
statistics may also be recovered with some success.
We have mentioned several parameters of the halogen method, and these are of
particular importance in producing accurate realisations. In this section we will
discuss each parameter, its effects and how to optimise for it if possible.
There are three parameters in halogen (with other options and parameters being
expressly determined by the required output, such as the size of the simulation box
L): the two physical parameters of the model, α – controlling the linear bias – and
fvel – controlling the velocity bias – and the one parameter of the algorithm, lcell .
In the previous Section we used goliat as a reference. We now turn to BigMul-
tiDark and its FOF catalogue: this simulation has a larger volume, allowing us to
probe BAO scales. The increased volume also reduces cosmic variance on interme-
diate scales. halogen primarily aims at reproducing clustering statistics for even
larger volumes, hence it is beneficial to assess the performance of halogen and its
parameters in this regime. Furthermore, this demonstrates independence from the
underlying simulation and halo finding technique.
The halogen package includes a stand-alone routine which determines a best fit for α(M), which can then be passed
to halogen to generate any number of realisations. We describe this routine here,
and illustrate it with an application to BigMultiDark. The fitting of α(M) is based
on the standard χ²-minimisation technique. However, a few details are worth
mentioning.
13 Cosmic variance – strictly speaking – requires the study of the same volume, but in a different place in the Universe. What we do here is more appropriately called 'sampling variance', yet it is nevertheless the generally accepted technique for generating covariance matrices.
For each mass bin we minimise

χ² = Σi [ξH(ri) − ξNB(ri)]² / σH²(ri),    (2.8)

where ξH and ξNB are the 2PCFs of halogen and the reference catalogue, respectively.
We note that minimising this statistic is susceptible to systematic errors in
halogen in bins where the stochastic error (σH) is much smaller than the systematic
error (∆ξ). This is especially likely when the region of the fit approaches lcell.
To test whether the region is stable, we may choose a distance estimator to be minimised
that treats all scales with the same weight, e.g. ∆ = (ξH − ξNB)²/ξNB². We
have tried both definitions over our fitted range, and the results are left unchanged,
indicating that the range of the fit is stable.
We use a grid of α values covering the expected result for each mass bin, and a cubic-spline
interpolation over χ²(α) to locate a precise minimum for the best-fit α; a sketch of this procedure is given below.
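A sketch of this per-bin minimisation, assuming a callable returning the (averaged) halogen 2PCF over the fitting range for a trial α (all names are illustrative):

import numpy as np
from scipy.interpolate import CubicSpline

def best_fit_alpha(alpha_grid, xi_halogen, xi_ref, sigma_H):
    """chi2(alpha) on a grid (Equation 2.8), then a cubic-spline minimum."""
    chi2 = np.array([np.sum((xi_halogen(a) - xi_ref) ** 2 / sigma_H ** 2)
                     for a in alpha_grid])
    spline = CubicSpline(alpha_grid, chi2)
    fine = np.linspace(alpha_grid[0], alpha_grid[-1], 10000)
    return fine[np.argmin(spline(fine))]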
Number of mass bins. The number of bins to use in this procedure will depend
on the needs of the user, and on the size and resolution of the reference simulation. It
determines the reliability of the mass-dependent clustering. For BigMultiDark we
distribute the halos into 8 roughly equi-numbered bins with the mass thresholds Mth^i
shown in Table 2.4. In that table we also show the best-fit αi, and the equivalent
number density n^i for each mass threshold.
Fitting Range. We restrict the range of the fit to scales in which the shape of
ξH (r)/ξNB (r) is flat. This corresponds to mid-range scales of 15h−1 Mpc < r <
47h−1 Mpc, which avoids small-scale effects of halogen, and large-scale cosmic vari-
ance.
The 2PCFs for our 8 values of ni are shown in Figure 2.4, where we compare the
results from halogen against the BigMultiDark reference catalogue. The range
used during the fitting procedure and for the χ2 -minimisation is indicated by the
vertical lines.
[Figure 2.4: ξ(r) of halogen (lines) against the BigMultiDark reference (points) for the number densities n0–n7.]

We note that the choice of α finely controls the bias. This is demonstrated in
Figure 2.5, in which we show the resultant ξ(r) for the entire grid of α7 for this
fit (top figure). There is a ∼ 10 per cent deviation in ξH (r) over the grid range
(1% between consecutive lines). In the bottom figure, we show the χ² of each of
those curves and the cubic-spline interpolation used to find the minimum, which
corresponds to the α7 best-fit value shown in Table 2.4.
Following the phenomenological approach introduced in Section 2.2, the halo velocities
are obtained from the selected-particle velocities through a linear mapping,

vh = fvel(M) · vp.    (2.9)
Figure 2.5: Illustration of variations in α and their consequences for the 2PCF. Top
figure: correlation function of the target halo catalogue (BigMultiDark, crosses)
and the grid of ξH corresponding to the grid of α7 used for minimisation. The lower
sub-panel shows the ratios to the BigMultiDark result. The vertical dashed lines
mark the spatial r-range of the fit. Bottom figure: χ² (Equation 2.8) as a function
of α7 for the grid of values used in the top figure (red crosses) and the interpolated
curve (dashed blue line). In the inner box we zoom into the area near the minimum
(green circle).
bin   Mth^i [h−1 M⊙]   n^i [(h−1 Mpc)−3]   αi     fvel^i
0     1.64 · 10¹⁴      0.05 · 10⁻⁴         4.80   0.564
1     4.93 · 10¹³      0.45 · 10⁻⁴         2.79   0.672
2     2.95 · 10¹³      0.95 · 10⁻⁴         2.28   0.715
3     2.15 · 10¹³      1.45 · 10⁻⁴         2.00   0.743
4     1.70 · 10¹³      1.95 · 10⁻⁴         1.90   0.754
5     1.41 · 10¹³      2.45 · 10⁻⁴         1.84   0.760
6     1.21 · 10¹³      2.95 · 10⁻⁴         1.73   0.771
7     1.04 · 10¹³      3.50 · 10⁻⁴         1.73   0.771

Table 2.4: Properties of the selected mass bins for the BigMultiDark simulation:
mass threshold Mth^i, equivalent number density n(M > Mth^i), best-fit αi for the
interval of masses Mth^{i−1} < M < Mth^i, and fvel computed for the same interval (see
Section 2.4.2).
The factor fvel(M) is computed for each interval of mass M ∈ (Mth^{i−1}, Mth^i] while performing the fit for α. These
results are also listed in Table 2.4. There is a noticeable decrease in fvel towards
higher mass halos. We will see in Section 2.5.4 below how this affects the modelling
of Redshift Space Distortions.
We finally note that there may be other, more complex, models of velocity bias,
accounting for small-scale physics and adjusting other statistics beyond the overall
velocity distribution. However, the model presented here is very simple and capable
of reproducing the halo velocity distribution with great accuracy.

Figure 2.6: One-component (vx) velocity distribution of the halo catalogues. The
FOF halos from the BigMultiDark simulation are shown as a red solid line, the
vx,p of the particles selected by halogen as a green dashed line, and the corrected
vh halos from halogen as a blue dotted line. The correction provides a very closely
matching distribution, which has a generally lower velocity.
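Applying Equation 2.9 with a tabulated fvel, as in Table 2.4, amounts to a few lines (a sketch with ascending bin edges; all names are ours):

import numpy as np

def remap_velocities(v_particles, halo_masses, bin_edges, f_vel_bins):
    """v_h = f_vel(M) * v_p, with f_vel constant within each mass bin.

    `v_particles` is an (Nh, 3) array of selected-particle velocities;
    `bin_edges` are ascending mass edges delimiting the bins of `f_vel_bins`.
    """
    idx = np.clip(np.digitize(halo_masses, bin_edges) - 1,
                  0, len(f_vel_bins) - 1)
    return v_particles * np.asarray(f_vel_bins)[idx][:, None]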
Figure 2.7: Two-point correlation function on logarithmic (top figure) and linear
(bottom figure) scale of the FOF catalogue of the BigMultiDark simulation
(crosses) against the results from halogen (lines) for different values of lcell (differ-
ent line styles as indicated in the legend). Note that in the bottom figure the 2PCF
has been multiplied by r2 to increase the visibility of the BAO peak. The lower
sub-panels show the ratio with respect to the BigMultiDark curve.
The first effect is clearly noticeable in the top figure where the halogen 2PCF
detaches from the BigMultiDark curve at r ≈ lcell . This is expected, since particles
are chosen at random inside the cell, tending towards a bias of unity at these scales.
The second effect is more noticeable in the bottom figure. As lcell is decreased,
the broadening and damping (best seen in the lower sub-panel as the difference
between the artificial peak at r = 80 h−1 Mpc and the trough at r = 100 h−1 Mpc) are
reduced. The reason for this is that we introduce an uncertainty (on a scale lcell) in
the position of the halos that propagates to an uncertainty in the determination of
rBAO . In effect, the density field has been filtered by a quasi-top-hat function [137],
which has the known effect of peak-broadening.
Clearly, lcell should be set as small as possible to mitigate these effects. However, a
limit is enforced by the mean-interparticle-separation, dp , of the input density field.
We cannot hope to reliably probe scales smaller than dp , and even just above this
scale we run into the problem of having poor statistics within cells. We recommend
using a value of lcell ≥ 1.5dp (ensuring > 3 particles per cell on average), and in this
work we take lcell = 4h−1 Mpc ≈ 2dp as the reference.
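This recommendation is trivial to encode (a sketch; dp is the mean inter-particle separation of the input particle distribution, and the function name is ours):

def recommended_cell_size(box_size, n_particles, factor=1.5):
    """l_cell >= factor * d_p, with d_p = L / Npart^(1/3)."""
    d_p = box_size / n_particles ** (1.0 / 3.0)
    return factor * d_p

# goliat-like setup: L = 1000 Mpc/h with 512^3 particles -> ~2.9 Mpc/h
print(recommended_cell_size(1000.0, 512 ** 3))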
We comment here that the choice of lcell affects the optimal α(M ) relation. This is
unfortunate, because it would be useful to be able to perform the fit for α using a
lower resolution (since this is the bottleneck). The mechanism by which this effect
occurs is known, and we hope to be able to correct for it in the future.
Let us illustrate the mechanism with an example: suppose we take a cell of size
lcell^I and density ρcell^I from a volume (N lcell^I)³. For the same distribution, we could
also use lcell^II = lcell^I/2, which forms 8 sub-cells i with densities ρcell,i^II. For the same α,
the probability of choosing the cell in case I is

Pcell^I = (ρ^I)^α / Σj^{N³} (ρj^I)^α = [ (1/8) Σi^8 ρcell,i^II ]^α / Σj^{N³} (ρj^I)^α    (2.11)
Figure 2.8: Best-fit α(M ) functions for different values of lcell , as marked in the
legend (units of h−1 Mpc).
We thus expect the difference in the distributions to depend on α, on the two cell-sizes
and their ratio, and on the cosmology, via the mass variance σ(r). In future studies
we hope to be able to quantify this relationship to enable faster fitting.
Figure 2.8 shows the effect of changing lcell on the best-fit α(M), and we notice two
characteristics. Firstly, α(M) is an increasing function for all lcell, as expected since
b(M) is increasing. Secondly, low masses are less sensitive to lcell, which we expect
mathematically from Equations 2.11 and 2.12 with an increasing α(M) (the greater
α is, the greater the differences expected).
In Figure 2.7 we have re-fit the α(M ) relation for each value of lcell , ensuring proper
comparison between curves. Furthermore, we run 5 realisations of each and display
the average, to reduce the effects of halogen variance.
While previous sections were dedicated to the design and optimisation of halogen,
we have now defined the final method and fixed the optimal parameters. In this
section we discuss the performance of halogen in more detail, both in the clustering
statistics analysed so far and in other statistics that halogen is not constrained to
reproduce.
The driving motivation of developing fast methods for synthetic halo catalogues is
to accurately produce robust covariance matrices for large galaxy survey statistics.
Though halogen requires a full N-body simulation to calibrate its two parameter
sets, once these parameters have been established we are free to run as many
realisations of the halo catalogue (with different phases for the initial conditions,
but the same cosmological parameters, volume, mass resolution etc.) as we like.
This process is expected to purely simulate the effects of cosmic variance, and thus
is extremely valuable for deriving the covariance matrices.
In order to verify that the variance seen in the resulting data traces the expected
cosmic variance, we complemented the generation of the halogen catalogues with
several corresponding N -body simulations. Due to the computational time con-
straints, we were only able to run five simulations, which were based on goliat, and
in which only the seed for the random Initial Condition (IC) phases was changed.
The initial conditions for these runs were generated with 2LPTic at redshift z = 32
(for the N -body) and z = 0 (for halogen), using the same seed for each pair. The
N -body particle distributions were evolved to z = 0 using Gadget2 (and subse-
quently analysed with AHF).
In Figure 2.9 we present the 2PCF of those 5 pairs of catalogues (random seeds
are colour-coded, with halogen as solid lines and AHF as points). The halo-
gen lines are the average of 5 realisations of halogen placement (maintaining the
same phases) and the error bars show the halogen variance. Given that the go-
liat box size is rather small (1 h−1 Gpc), scales r ≳ 60 h−1 Mpc are dominated by
cosmic variance effects. This makes it easy to identify the signature of each set of
initial conditions. Though the realisations are significantly different, we note that
the halogen catalogue follows the N -body result, and maintains the correct nor-
malisation at intermediate scales (20h−1 Mpc < r < 50h−1 Mpc). We stress that the
fitting procedure has only been performed once; all five cases used fixed parameters.
The similarity of the goodness of fit in each case (compared to the realisation that was
directly fitted to) demonstrates that the fitted α(M) is universal with respect to the input seed. We
note also that the halogen variance is significantly sub-dominant to the cosmic
variance.
To better appreciate the dominance of the cosmic variance in a more applicable
scenario, we return to the BigMultiDark simulation. This has a reduced cosmic
variance due to the larger volume, but has the disadvantage that we cannot run
several N -body simulations of this magnitude. The blue line of Figure 2.10 shows
how the 2PCF of a single-run halogen (neither halogen nor cosmic variance has
been averaged out) compares to the reference BigMultiDark catalogue when they
have the same initial condition phases. We further show the halogen variance (σH )
and cosmic variance (σcosm ). The former has been computed as usual: running 5
realisations of halogen on the same 2LPT snapshot. For the latter we run five
2LPTic snapshots with different IC seeds. In order to avoid mixing σcosm and σH,
for each of them we first averaged out the halogen variance by running 5 realisations
of halogen; σcosm is then computed as the dispersion of the five resulting (σH-free)
lines. We find on all scales that the halogen variance is dominated by the cosmic
variance, σH < σcosm .
A simple but powerful statistic for point particles is the Probability Distribution
Function (PDF), which is the distribution of particles per cell on a given scale.
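A sketch of how such a counts-in-cells PDF can be measured from a catalogue of positions in a periodic box (all names are illustrative):

import numpy as np

def halo_count_pdf(positions, box_size, n_mesh):
    """Histogram of the number of halos per cubic cell of side L/N."""
    cells = np.floor(positions / box_size * n_mesh).astype(int) % n_mesh
    cell_id = np.ravel_multi_index(cells.T, (n_mesh,) * 3)
    counts = np.bincount(cell_id, minlength=n_mesh ** 3)  # halos per cell
    return np.bincount(counts)  # number of cells hosting 0, 1, 2, ... halos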
Figure 2.9: 2PCF of the halogen (lines) and AHF (points) catalogues for five
different 2LPTic random seeds (colour-coded). The first case corresponds to the
original goliat used to obtain the α(M) relation, whereas the others share the
same setup apart from the seed.
Figure 2.10: 2PCF of the FOF catalogue of the BigMultiDark simulation com-
pared to that of a single-run halogen (non-averaged) with the same initial condi-
tion phases. We also include error bars: in green the cosmic variance and in orange
the halogen variance (see text). The lower panel shows the ratio with respect to
BigMultiDark.
Figure 2.11: PDF of halo counts for both halogen (lines) and BigMultiDark
(points) catalogues from BigMultiDark. Several mesh numbers are used, as
labelled by colours, and these correspond to the physical scales of 2.5h−1 Mpc,
5h−1 Mpc, 10h−1 Mpc and 20h−1 Mpc respectively.
halogen has been designed to recover the 2PCF ξ(r) of a provided halo catalogue.
As the power spectrum P(k) is its Fourier transform, it theoretically contains the
same information. However, this information is distributed differently in the two
functions, and there is mode coupling when transforming from one to the other: an
error at a given scale in one of the quantities can propagate to an error at all scales
in the other. We therefore expect different strengths and weaknesses in P(k).
Figure 2.12: Power spectrum P(k) of halogen (blue line) and FOF (red line) for
BigMultiDark. The bottom panel shows their ratio. The power spectrum has
been computed using an N = 1024³ mesh and corrected for shot noise as explained
in [142].
In Figure 2.12 we compare the power spectrum of the BigMultiDark FOF catalogue
to the corresponding halogen realisation. We find agreement to 5% across the
scales 0.01 h Mpc−1 < k < 0.3 h Mpc−1, but note that smaller scales, k > 0.3 h Mpc−1
(r < 20 h−1 Mpc), are underestimated. This underestimation arises from the smallest
scales of the 2PCF, r < lcell, whose errors enter P(k) over a wide range of wavenumbers
through the transform.
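For reference, a simplified sketch of a shot-noise-corrected P(k) estimator (nearest-grid-point assignment and no window deconvolution; this follows the standard |δk|²/V convention and is not necessarily the exact estimator of [142]):

import numpy as np

def power_spectrum(positions, L, N, nbins=40):
    """Shot-noise-corrected P(k) of points in a periodic box of side L."""
    grid, _ = np.histogramdd(positions, bins=(N,) * 3, range=[(0, L)] * 3)
    delta = grid / grid.mean() - 1.0
    delta_k = np.fft.rfftn(delta) * (L / N) ** 3        # FT with cell volume
    pk3d = np.abs(delta_k) ** 2 / L ** 3                # raw P(k) on the mesh
    kf = 2 * np.pi * np.fft.fftfreq(N, d=L / N)
    kz = 2 * np.pi * np.fft.rfftfreq(N, d=L / N)
    kmag = np.sqrt(kf[:, None, None] ** 2 + kf[None, :, None] ** 2
                   + kz[None, None, :] ** 2).ravel()
    shot = L ** 3 / len(positions)                      # Poisson shot noise 1/nbar
    edges = np.linspace(kz[1], kmag.max(), nbins + 1)   # from the fundamental mode
    which = np.digitize(kmag, edges)
    sums = np.bincount(which, weights=pk3d.ravel(), minlength=nbins + 2)
    cnts = np.bincount(which, minlength=nbins + 2)
    pk = sums[1:nbins + 1] / np.maximum(cnts[1:nbins + 1], 1) - shot
    return 0.5 * (edges[:-1] + edges[1:]), pk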
Observed galaxies are not located directly in 3D space: we measure two angular
coordinates (θ, φ), with the redshift z converted into a radial distance. However,
such distances are modified by the galaxies' peculiar velocities – velocity components
that are not due to the Hubble expansion. These modifications are encoded as
Redshift Space Distortions (RSD), and we can begin to account for them by assigning
correct velocities to halos.
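A sketch of the standard plane-parallel mapping that produces such distortions from a z = 0 snapshot (line of sight taken along z; with comoving Mpc/h and velocities in km/s, aH = 100 km s⁻¹ (Mpc/h)⁻¹ at z = 0; this illustrates the effect rather than reproducing the exact pipeline used here):

import numpy as np

def to_redshift_space(positions, velocities, box_size, aH=100.0):
    """s = x + v_los / (a H), applied along z with periodic wrapping."""
    s = positions.copy()
    s[:, 2] = (s[:, 2] + velocities[:, 2] / aH) % box_size
    return s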
Figure 2.13: 2PCF in redshift space (RS) for FOF (red points), and halogen (blue
line) of the BigMultiDark simulation. We also include in magenta the results of
our catalogue without applying the velocity bias (i.e. fvel = 1, ’selected particles’)
and find that a correct velocity bias is needed.
Using the halo velocities, we can mimic this effect when calculating the 2PCF. We
show the results of such an analysis in Figure 2.13, in which the monopole of the
2PCF in redshift space is compared for the halogen and BigMultiDark cata-
logues. To show the effect of our velocity transformation, we also include the 2PCF
of the 'selected particles', in which the velocities were not transformed. The nor-
malisation and shape are significantly improved by the simple linear transformation
(Equation 2.9), and we find agreement to below 5 per cent at intermediate scales.
So far we have devoted this chapter to the construction and analysis of halogen.
However, there are other approximate methods in the literature that also generate
fast halo mock catalogues. Within the Mocking Astrophysics programme described in
Section 1.1.3, the 'nIFTy cosmology' workshop14 arose, in which we compared nearly
all existing methods for approximate halo mock catalogues.

Table 2.5: Computing resources and related properties used by each code to generate
the halo catalogue analysed in this study. The CPU-hours can vary significantly from one
machine to another, but it is important to note their order of magnitude, which depends
on the algorithm and the particle-mesh size. The memory usage is mostly determined by
the mesh size, which sets the spatial and mass resolution. Whereas most codes use
the same particle and force mesh, cola needs 3 times more resolution in the latter. Codes
that need to resolve halos need more particles and, hence, more resources, though always far
fewer than a full N-body simulation (BigMD).
We briefly present here some of the results that emerged from that comparison [3].
As described before (Section 2.1.1 & Section 2.2), all methods can be seen as a four-step
process. The main differences among methods lie in the way they generate the
density field and how they apply a bias to generate a halo distribution. This idea is
graphically represented in Figure 2.14.
The methods presented here can be used in different contexts and each of them is
designed with different purposes. Some of them require less computing resources
at the price of having lower resolution, whereas others prefer to keep the resources
higher but gain accuracy. Table 2.5 compares the resources needed by each method
to generate the same halo catalogue (with the BigMultiDark N-body simulation
of Table 2.1 as the reference catalogue). Those that need to resolve halos (cola,
pinocchio and PThalos) have a predictive nature and typically require more
resources than those with a stochastic nature (EZmocks, halogen, patchy and
Log-Normal), which need to be fitted to a reference simulation.
14 http://popia.ft.uam.es/nIFTyCosmology
Figure 2.14: Scheme of the approximate methods. Most of them use a gravity solver
(2LPT, ALPT, 2LPT+PM, ZA) to generate a density field from which halos are
generated, either using a halo finder or a stochastic bias. Some methods additionally
need to modify the initial power spectrum (EZmocks and Log-Normal). pinocchio
computes halo formation and evolution via collapse times.
The main characteristics of the methods are shown in Table 2.6, and we briefly
describe them below:
2.6.2 Results
Here we study how the different methods perform on 1-, 2- and 3-point statistics.
Recall that the PDF is primarily a 1-point statistic, but with contributions from all
higher orders. This section focuses more on the 2-point functions, which are the most
studied in the literature (and the primary objective of halogen) because they contain
the most net information about cosmology (including the BAO). 3-point functions are
more difficult to measure with current surveys (although it has been done [147]) and
have high contributions from non-linearities that are more difficult to predict from
theory. Nevertheless, they are also a target for future surveys.
The PDF is shown in Figure 2.15, where we find two outliers: the
Log-Normal method and pinocchio. Note, however, that the scales explored here
(2.6 Mpc/h) are already highly non-linear.
Regarding the 2-point function, looking at ξ in the top part of Figure 2.16 we
find that most methods give similar results both in real (left) and redshift (right)
space. For the Log-Normal method velocities were not computed (although they
can be obtained with linear theory), so all its redshift-space results are missing.
The normalisation of PThalos is off by more than 20%; this is because
here we took the linking length b2LPT from a theoretical value and categorised
PThalos as a predictive method, but b2LPT could be left free and the bias fitted.
In Fourier space (bottom of Figure 2.16) the agreement at linear scales appears
similar, but these panels focus more on the non-linear scales (k > 0.1 h/Mpc, compare
Figure 2.12), where methods based on 2LPT (halogen, PThalos and pinocchio)
and Log-Normal start having problems. Only methods with an accurate density field
(cola and patchy) or many free bias parameters (EZmocks) can reproduce these
scales within 5% error.
For the 3-point function something similar occurs: we need more sophisticated methods.
In particular, the Log-Normal method does not even reproduce the shape of the functions,
whereas halogen and PThalos (and, to a lesser extent, pinocchio) have an offset in
the normalisation but reproduce the shape.
Figure 2.15: PDF of halo counts in a grid with N = 960³ cells for the different
approximate methods.

In conclusion, as far as the 2-point function is concerned, nearly all the methods presented
here give good results at large scales. This is particularly interesting for
BAO analysis. If we are also interested in higher-order statistics, we will need more
sophisticated methods that may require more computing resources or a more complex
bias model. Depending on the needs of a particular study, it will be more convenient
to use one code or another.
2.7 Conclusions
In this chapter we presented halogen, a method to generate approximate halo catalogues in four steps:

1. Generate a 2LPT density field at the target redshift, sampled by particles.

2. Sample a theoretical halo mass function n(> M) with a list of Nh halo masses
M and order them in descending mass.
Figure 2.16: Comparison of the 2-point functions for the different methods. Top
sub-figures show configuration space, whereas bottom sub-figures show Fourier space.
Left sub-figures show real space and right sub-figures redshift space.
Figure 2.17: Comparison of the 3-point functions for the different methods. Left: 3-point
function in real space with fixed r1 = 10 h−1 Mpc and r2 = 20 h−1 Mpc and free
r3. Right: bispectrum with k1 = 0.1 h Mpc−1 and k2 = 0.2 h Mpc−1, and a varying
angle θ12.
3. Place the halos at positions of particles, chosen stochastically with a probability
that depends on the cell density and halo mass, Pcell ∝ ρcell^{α(M)}. We select random
particles within cells, respecting the exclusion criterion and conserving mass in cells
(cf. Section 2.3).

4. Assign the velocity of the selected particle to the halo through a factor, vhalo =
fvel(M) · vpart.
Further, we noted the modularity of these steps and acknowledged alternatives for
each of them. The 2LPT in step (1) provides us with the correct large scale clustering
at a low computational cost, while step (2) reconstructs the halo mass function. The
heart of halogen is step (3) where the mass dependent bias is modelled through
the parameter α(M ) that stochastically places more massive halos in overdensities,
recovering the correct 2-point correlation function as a function of mass. We also
preclude halos from overlapping to match the small-scale behaviour of the 2-point
clustering. In the last step (4), we re-map particle velocities in order to obtain the
correct halo velocity distribution.
We studied how the parameters of the method – α(M ), fvel (M ) and lcell – can be
optimised and summarised the results in Table 2.3. Though halogen needs a
reference halo catalogue from an N-Body simulation to obtain α(M ) and fvel (M ),
once they have been optimised for a given setup, halogen can be used to generate
a multitude of halo catalogues, allowing the quantification of cosmic variance.
The halo mass function is recovered, by construction, to the theoretical value. The
2-point function at intermediate scales (10 h⁻¹Mpc < r < 50 h⁻¹Mpc, where the bias
is controlled by α(M)) can be obtained in a BigMultiDark-like simulation at the
∼2% level, and at the 15% level at BAO scales (80 h⁻¹Mpc < r < 110 h⁻¹Mpc)
(Figure 2.10). In redshift space, the error at intermediate scales rises to ∼4%
and remains at ∼15% at large scales (Figure 2.13). The clustering has a mass
dependence, whose accuracy is controlled by the number of bins in the α(M)
fit (Figure 2.4). The power spectrum can be recovered at the 5% level in the range
of scales 0.01 h Mpc⁻¹ < k < 0.3 h Mpc⁻¹ (Figure 2.12). The halo PDF is accurately
reproduced at low Nhalo/cell, but its high-Nhalo/cell tail, where the contribution
of non-linearities is larger, is overpredicted (Figure 2.11).
halogen was designed in favour of simplicity and adaptability. Even though
goliat and BigMultiDark have different characteristics (see Table 2.1),
halogen can be used for both with little recalibration effort. In Section 3.2.1
we will also fit it to the mice simulation, with yet another very different setup.
This shows that halogen is not tied to one specific box size, redshift or cosmology,
which makes it a powerful tool for exploring the statistics of varying cosmologies.
We have also verified that changing the initial phases in 2LPTic for halogen leads
to changes in the correlation function (due to cosmic variance) that follow the N-body
simulation both in shape and normalisation. This implies that doing so will
yield robust estimates of cosmic variance over potentially hundreds to thousands of
realisations. Hence, halogen has been demonstrated to be a powerful tool for
modelling the statistics of halo catalogues and for quantifying the effects of cosmic
variance on them.
Comparing halogen with other methods, we find that the 2-point correlation function
at large scales is well recovered by nearly all methods, including halogen.
halogen is also found to be well suited for PDF statistics. If we also want to recover
non-linear scales or 3-point functions, a more sophisticated method is required.
Such a method could either have a very accurate density field, like cola, whose
computational cost is high compared with a statistical approach like halogen,
or a complex bias model with many free parameters that need to be tuned to
recover all the different statistics (like patchy and EZmocks), losing adaptability
and simplicity. This links with the idea of modularity highlighted throughout the
chapter: we could change the density field (step 1) or the way we place halos (step 3).
For example, for BAO physics, where only the large scales of the 2-point function are
relevant, or for Counts-in-Cells (the observational counterpart of the PDF), halogen
has been demonstrated to be a powerful tool, able to generate fast mock catalogues with
low computing resources and simple algorithms. In the next chapter we will see an
example of exactly this: how halogen is used to study the systematics and account
for the cosmic variance of an experiment, and how it will eventually be used to
determine the error bars of a BAO measurement.
Approximate halo mock generation is an emerging field that will have a great impact
in the coming years, with the increasing volumes surveyed by the experiments. For
different studies there will be a different optimal method, depending on the accuracy
needed, the computing resources available, the adaptability to different cosmologies
required, the number of catalogues needed, etc. Having a variety of methods available,
and knowing the strengths and weaknesses of each of them, will be crucial for the
experiments. The new era of observational cosmology is moving forward fast, and
cosmology modelling must keep pace with the new times.
Table 2.6: Main technical features of the methodologies. From top to bottom: whether they provide masses and velocities;
how they generate initial conditions; whether they used the same initial random seeds as BigMultiDark (cola did not,
and its large scales could be affected by cosmic variance); whether they generate or assume a halo mass function (EZmocks
and patchy generate it with a post-processing procedure explained in [146]); whether they assume a bias model; whether
they provide substructure and merger trees; and the number of free parameters introduced in total, for the RSD, for the
HMF and for the bias.
Chapter 3

Dark Energy Survey Galaxy Mock Catalogues

The Dark Energy Survey (DES) [19] is a photometric survey designed to observe the
southern-hemisphere sky. In particular, DES aims at constraining the equation of
state w(a) of Dark Energy in order to shed light on its nature. For that, it combines
four main probes: Baryonic Acoustic Oscillations, Weak Lensing, galaxy cluster
counts and type Ia Supernovae, each of which is described below.
Observations are performed with the 570-megapixel digital Dark Energy Camera
(DECam), mounted on the 4-meter Victor Blanco Telescope in Chile. DECam was
specifically designed for this experiment. Its main peculiarity is its high sensitivity at
the red end of the visible spectrum and in the near infrared, crucial for the detection
of objects at high redshift. The survey will cover 5000 deg², using a field of view of
2.2 deg in diameter with five different filters (the traditional g, r, i, z and the infrared
Y), over 5 years, reaching a magnitude limit of 24 in the i band. DES will observe
∼200 million galaxies up to z ∼ 1.4, determining their angular positions, photometric
redshifts (photo-z) and shapes.
As opposed to spectroscopic redshift surveys, where the redshift can be measured
accurately (σz ∼ (0.001 − 0.0001)(1 + z)) with a spectrograph, DES is a photometric
survey, where redshifts are estimated from the combination of the fluxes obtained in
the 5 filters (see some techniques in [148–150]), with a typical accuracy of σz ≈ 0.03(1 + z).
This decreases our knowledge of the galaxy radial positions but, at the
same time, allows us to increase significantly the number of observed galaxies (as
spectral measurements are very time-consuming), obtaining a complete magnitude-limited
survey. Additionally, the problems with fibre collisions and apertures associated
with spectroscopy do not appear here.
Baryonic Acoustic Oscillations (BAO) of the primordial photon-matter plasma leave
their imprint on the large-scale distribution of matter. From galaxy positions, we can
measure the correlation function and find the BAO feature. Detecting BAO with
a photometric survey will be an arduous task, since most of the radial information
is lost and the density field is effectively smoothed. But, by carefully analysing the
data, DES will be able to detect the evolution of the BAO scale with redshift in the range
0.6 ≲ z ≲ 1.4 and, consequently, measure the evolution of the expansion of the Universe.
More importantly, this is a range not explored before with BAO, and it will tighten
the constraints on the distance-redshift relation shown in Figure 2.
Galaxy shapes are distorted by gravitational lensing. Whereas in some cases this
effect is so strong that we see multiple images of the same galaxy (strong lensing),
it is generally much milder and cannot be seen in individual galaxies (as the intrinsic
dispersion of shapes is larger), but only studied statistically. This phenomenon is
known as Weak Lensing, and it tells us about the amount and clustering state of dark
matter. Galaxy cluster counts are another means of measuring dark matter and its
state of clustering, as they are tightly related to the abundance of high-mass halos seen in
simulations. In the standard ΛCDM model we expect to detect over 100,000 clusters
with DES (being sensitive to clusters with ∼10 red-sequence galaxies). Studying
these two effects as a function of time will be another probe of the expansion of the
Universe.
SNIa are used as standard candles in Cosmology to study the evolution of the Universe,
and they were the first evidence of Dark Energy. DES has 4 special fields for the SNIa
search, different from the galaxy fields (Figure 3.1), since for SNIa we need to target the
same field periodically to search for newly appearing objects and characterise their light
curves (flux as a function of time). Each DES supernova field is revisited ∼5 times
every month, and DES will discover ∼4000 SNIa up to redshift z ∼ 1.
All the probes combined will tightly constrain the time-dependent equation of
state of Dark Energy, parametrised as w = w0 + (1 − a)wa, as already indicated at
the bottom of Figure 1. But DES is well suited for many more astrophysical studies.
From DES early data there have been many remarkable discoveries [151]: 17 (out of
48 known) Milky Way satellite galaxies, a new type of object termed super-luminous
SN, high-redshift (z ∼ 6) and lensed quasars, 34 new trans-Neptunian objects, etc.
DES is also relevant for the two major astrophysical events of the last few months,
which even reached the general public: the evidence for a ninth planet in the Solar
System [152] and the direct detection of gravitational waves by the LIGO experiment
[153]. DES has an agreement with LIGO to search for an optical counterpart of any
triggered detection of gravitational waves. No optical counterpart was found for the
LIGO event GW150914 [154, 155], caused by the merger of two massive black holes.
This is not surprising, since this type of merger is not expected to emit in the optical,
but the search can be really useful for other types of events or to find unexpected
physics. As for the ninth planet, the predicted trajectory [156] passes through the
area observed by DES, so a detection may be possible in the future.
The DES data are split by seasons into Science Verification (SV), Year-1 (Y1), Year-2
(Y2), etc. The Science Verification observations were taken in 2012 and 2013 and
provided data over more than 250 deg² at nearly the nominal depth of DES. They were
used to test the science potential of the 5-year survey, finding promising results for
cosmology [157–162]. While SV has been widely analysed, the DES collaboration is now
analysing the Y1 post-processed data, taken between August 2013 and February 2014. Y1
covers a large fraction of the targeted area, but at a shallower flux limit (Figure 3.1). Y2
is currently being post-processed, although some of the results quoted above have
already emerged from it.

Figure 3.1: DES observing strategy footprint, from [151]. We show the supernova fields,
the SV regions, and the Y1, Y2 and Y5 masks in equatorial coordinates. Dashed and
dotted lines represent, respectively, the galactic and ecliptic planes.
This chapter is part of the work done within the Large Scale Structure Working
Group with the aim of detecting the BAO in the Y1 data. It focuses particularly on the
creation of galaxy mock catalogues matching the overall statistics of the selected
LSS-Y1 sample (see Section 3.2), to be used to compute the covariance matrices and
error bars of the large-scale clustering. We also present preliminary results of their
application to the data analysis (Section 3.3).
3.2 HALOGEN lamps: observational galaxy mock catalogues
• Photo-z. DES is a photometric survey, for which zrsd is estimated by zph with
low precision. This effect mixes galaxies of different z in the same zph-bin; we
will see how to implement it in Section 3.2.2.
In this section we simulate these three effects with the aim of creating galaxy
mock catalogues with the same statistical properties as the selected LSS-Y1 sample:
namely, the same galaxy number density as a function of redshift, n(zph); the same
angular correlation function in zph-bins, wi(θ); and the same P(zrsd|zph) distribution.
The selection of the LSS-Y1 sample has been optimised to yield a BAO detection with
an error below 5%. It consists of a sub-sample of the full Y1 data, to which we apply
three main cuts in the different filter magnitudes: completeness 17.5 < mi < 22,
brightness mi < 19 + 3 zph, and red selection (mi − mz) + 2(mr − mi) < 1.7.¹ The
selection has been done balancing the trade-off between having a sample with a higher
bias and better photo-z (a brighter and redder sample) and reducing the shot-noise
(which increases if we reduce the number density). This is optimised together with
the selection of the mask, based on the quality of the different areas (see [163] for
the details).

¹ From now on we will use zrsd for the ideally measured redshift, with no error but with
redshift-space distortions included. We introduce this notation to distinguish it from z = ztrue,
which represents cosmological time and has been used so far, and also from zph.
This links with a fourth observational feature: the application of a mask. A mask
consists of a list of pixels (we use the healpix pixelisation, http://healpix.sourceforge.net/)
telling us which regions of the sky can be used and which cannot. Regions can be
excluded due to no observation, insufficient observation (for magnitude-limited samples),
bad seeing, foreground (mainly stellar) contamination, or other causes of large
systematics. This leads to a somewhat patchy footprint that depends on the selected
sample. The red region in Figure 3.1 represents approximately the Y1 mask. More
specifically, the Y1-LSS mask has an area of ∼1426 deg². This mask does not fit in
an octant (it is around 150° wide in ra), so we need to cover it with three different
patches of the lightcone generated in Section 3.2.1. We will not enter into the details
of masking, beyond noting that the Y1-LSS mask has been applied to all the catalogues
analysed in the figures from Figure 3.4 onwards.
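As an illustration of the masking step, a few lines of Python with the healpy package suffice (the nside value and the array names are placeholders, not those of the actual Y1-LSS pipeline):

    import numpy as np
    import healpy as hp

    def apply_mask(ra_deg, dec_deg, good_pixels, nside=4096):
        # Pixel index of every object (ra/dec in degrees, equatorial).
        pix = hp.ang2pix(nside, ra_deg, dec_deg, lonlat=True)
        # Keep only objects falling in pixels allowed by the mask.
        return np.isin(pix, good_pixels)

    # keep = apply_mask(cat['ra'], cat['dec'], y1lss_pixels)
    # cat = cat[keep]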
3.2.1 Lightcone
ra = arctan(Y/X)
dec = arcsin(Z/r)                                                   (3.1)
zrsd = z(r) + [(1 + z(r))/c] u·r̂

being r = √(X² + Y² + Z²), u the comoving velocity, r̂ = r/r the radial unit vector,
and z(r) the inverse of

r(z) = c ∫₀ᶻ dz′/H(z′)                                              (3.2)
This implementation is called a lightcone because the cosmological time t(z) is
determined by the radial distance r, in the same way as in observations, where light
travels at a finite speed. But the Universe changes with time and, hence, our
simulations must also change with redshift.
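A minimal Python sketch of Equations 3.1 and 3.2 (function names and tabulation choices are ours) could read:

    import numpy as np
    from scipy.integrate import quad
    from scipy.interpolate import interp1d

    C = 299792.458  # speed of light [km/s]

    def z_of_r(H, z_max=1.5, n=2000):
        # Tabulate Equation 3.2, r(z) = c * int_0^z dz'/H(z'), and invert it.
        z = np.linspace(0.0, z_max, n)
        r = np.array([quad(lambda zp: C / H(zp), 0.0, zz)[0] for zz in z])
        return interp1d(r, z)

    def box_to_sky(X, Y, Z, vel, z_of_r_func):
        # Equation 3.1: observer at the origin of the comoving box.
        r = np.sqrt(X**2 + Y**2 + Z**2)
        ra = np.degrees(np.arctan2(Y, X))     # arctan2 resolves the quadrant
        dec = np.degrees(np.arcsin(Z / r))
        zcos = z_of_r_func(r)
        rhat = np.stack([X, Y, Z], axis=-1) / r[:, None]
        u_par = np.sum(vel * rhat, axis=-1)   # u . r_hat, in km/s
        zrsd = zcos + (1.0 + zcos) / C * u_par
        return ra, dec, zrsd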
In particular, we are interested in redshift-dependent clustering, and hence the
halogen parameters (α and fvel, summarised in Table 2.3) will vary as a function
of redshift as well. For this we use as a reference the mice N-body simulation (see
Table 2.1 and Section 2.1.2), fit α(M) and fvel(M) at the snapshots
z = 0, 0.5, 1.0, 1.5, and interpolate at intermediate redshifts.
The outcome of the fit is shown in Figure 3.2, where we see the mass-dependent
clustering for the snapshots z = 0.5 and z = 1.0. Note that the reference number
density for this simulation is ∼4.5 times higher than the one previously used for
BigMultiDark, and the minimum mass Mmin is here 4 times smaller. This is roughly
the minimum number density that we need to simulate the sample. We found that in
this case a logarithmic binning of masses was more useful, and we show in Figure 3.2
the mass thresholds used during the fitting.
The halogen parameters (including the HMF) were interpolated to z = 0.55,
0.625, 0.675, 0.725, 0.775, 0.825, 0.875, 0.925, 0.975, 1.05, and halogen was run
at those redshifts. We build the lightcone from the superposition of zrsd shells of
those snapshots, setting the shell edges at the intermediate redshifts and saving data
in the range 0.45 < zrsd < 1.2 (restricted to save storage). We repeat the same process
8 times, placing the observer at each of the 8 corners of the box, to generate 8 different
catalogues. This process might not be ideal, and we are working on a future version of
the catalogues in which we avoid the need for 10 snapshots by building the lightcone
directly in one box, with growth factors that depend on the position, D1,2(z(r)), in
the 2LPT Equation 1.5.
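Schematically, the shell superposition can be written as follows (a sketch using the snapshot redshifts quoted above; the exact edge convention is our assumption):

    import numpy as np

    snap_z = np.array([0.55, 0.625, 0.675, 0.725, 0.775,
                       0.825, 0.875, 0.925, 0.975, 1.05])
    # Shell edges at the intermediate redshifts, truncated to the stored range.
    edges = np.concatenate(([0.45], 0.5 * (snap_z[:-1] + snap_z[1:]), [1.2]))

    def shell_mask(zrsd, i):
        # Keep the objects of snapshot i that fall inside its own zrsd shell;
        # concatenating the shells of all snapshots builds the lightcone.
        return (zrsd >= edges[i]) & (zrsd < edges[i + 1])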
Finally, we compare the resulting halogen lightcone with the halo lightcone gener-
ated by mice in Figure 3.3. The mice simulated lightcone is constructed from fine
shells (∆z = 0.005 − 0.025) of the snapshots generated from a full N-body simulation,
using the velocities of the particles to extrapolate their positions to the precise
moment they cross the lightcone [164]. Remarkably, despite the great differences in
methodology, the angular correlation functions of both lightcones show very good
agreement at all the interpolated redshifts.

Figure 3.2: 2-point correlation function of mice vs. halogen halos in the simulation
box at the snapshots z = 0.5 (left) and z = 1.0 (right). We show the different mass
thresholds used during the fit, from Mth = 2.5e12 to 1.6e14.
Figure 3.3: Angular correlation function w(θ) of halos from the mice (crosses) and
halogen (lines) lightcones. The different curves correspond to different redshift bins
of width ∆zrsd = 0.05, centred at the indicated zrsd.
3.2.2 Photo-z

In a first method, we assign to each galaxy a photometric redshift

zph = zrsd + ∆ph(zrsd) · Rgauss(0, 1)                               (3.3)

with

∆ph(z) = s_data^{bin(z)}                                            (3.4)

being Rgauss(0, 1) a Gaussian random number with mean 0 and standard deviation
1, and bin(z) a discrete function giving i between 1 and 8 according to Table 3.1.
For zrsd > 1.0 and zrsd < 0.6 we use, respectively, the value of ∆ph from the last and
first bins.
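A minimal Python sketch of this photo-z model, using the cp widths of Table 3.1 (variable names are ours), could be:

    import numpy as np

    bin_edges = np.array([0.6, 0.65, 0.7, 0.75, 0.8, 0.85, 0.9, 0.95, 1.0])
    delta_cp = np.array([0.030, 0.030, 0.029, 0.029,
                         0.030, 0.035, 0.041, 0.049])

    def add_photoz(zrsd, delta, seed=0):
        # Equation 3.3: zph = zrsd + Delta_ph(zrsd) * Rgauss(0, 1),
        # re-using the first/last bin width outside [0.6, 1.0).
        rng = np.random.default_rng(seed)
        i = np.searchsorted(bin_edges, zrsd, side='right') - 1
        i = np.clip(i, 0, len(delta) - 1)
        return zrsd + delta[i] * rng.standard_normal(np.size(zrsd))

    # zph = add_photoz(cat_zrsd, delta_cp)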
However, as shown in Figure 3.4 (cp, dashed lines), this is far from reproducing the
data. The reason is that, since σi is not independent of redshift (although si is flat in a
certain z range), a width in the P(zrsd|zph) distribution is not equivalent to a width in
P(zph|zrsd). This concept may be better understood with an illustration: a galaxy
that has been assigned zph = 0.65 is more likely to come from zrsd = 0.8 than
from zrsd = 0.5, since the error applied at higher redshifts is bigger. This effect skews
and widens the distribution. This is also seen in the widths si_cp measured from the
P(zrsd|i) distribution of the catalogues after applying this method (Table 3.1).

Table 3.1: Photo-z. Redshift intervals for the 8 z-bins and, below, the measured (from
P(zrsd|i)) and applied (in Equation 3.3) widths. First we list the width measured in
the data, si_data; then the input (∆) and output (s) of the two models: cp, for which
we take ∆i_cp = si_data, and opt, for which the ∆i_opt are free parameters set to
minimise Ξ². For ∆i_opt we allow two additional z-bins. The last column shows the
Ξ² (as defined in Equation 3.5) obtained for each model.

bin-i     1          2          3          4          5          6          7          8          –         –         Ξ²
z-range   [0.6,0.65) [0.65,0.7) [0.7,0.75) [0.75,0.8) [0.8,0.85) [0.85,0.9) [0.9,0.95) [0.95,1.0) [1.0,1.1) [1.1,1.2) –
si_data   0.030      0.030      0.029      0.029      0.030      0.035      0.041      0.049      –         –         –
∆i_cp     0.030      0.030      0.029      0.029      0.030      0.035      0.041      0.049      –         –         0.612
si_cp     0.031      0.034      0.039      0.042      0.044      0.044      0.044      0.043      –         –         –
∆i_opt    0.031      0.029      0.029      0.029      0.029      0.029      0.029      0.030      0.040     0.050     –
si_opt    0.030      0.030      0.031      0.032      0.035      0.039      0.040      0.039      –         –         0.098
In order to improve on this, in a second method we vary the values of ∆i_ph and
minimise

Ξ² = Σ_{i=1}^{8} (si_method − si_data)² / (si_data)²                (3.5)

where si_method is the width measured in the P(zrsd|i) distribution after having
applied ∆i_method in Equation 3.3.
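As a cross-check of Table 3.1, Ξ² can be evaluated directly from the tabulated widths (a sketch; the small difference with the quoted 0.612 comes from the rounding of the listed values):

    import numpy as np

    def xi2(s_method, s_data):
        # Equation 3.5: relative quadratic mismatch of the P(zrsd|i) widths.
        s_method, s_data = np.asarray(s_method), np.asarray(s_data)
        return np.sum((s_method - s_data)**2 / s_data**2)

    s_data = [0.030, 0.030, 0.029, 0.029, 0.030, 0.035, 0.041, 0.049]
    s_cp = [0.031, 0.034, 0.039, 0.042, 0.044, 0.044, 0.044, 0.043]
    print(xi2(s_cp, s_data))   # ~0.6, consistent with Table 3.1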
Further, we allow different values of ∆ph in two additional bins, z ∈ [1.0, 1.1) and
z ∈ [1.1, 1.2), as we find this helps minimise Ξ². The best-fit values ∆i_opt and the
outcome si_opt are shown in the last two rows of Table 3.1. Note that ∆opt remains
nearly flat across the whole target redshift range, and that it is the contamination
from higher redshifts that makes si_opt change with redshift.
The P(zrsd|i) for this method has also been plotted in Figure 3.4 (opt, solid lines),
showing an improvement over the previous method. We fix this photo-z scheme for
the rest of the results presented below.
Figure 3.4: zrsd probability distribution in each zph-bin i, for the data and for the two
photo-z schemes described in the text, as applied to the mock catalogues.

So far, all the clustering measurements shown throughout this thesis were obtained
from halo catalogues at a given mass threshold. But observed clustering is typically
measured from galaxy catalogues, with a magnitude-limited sample and its associated
selection effects and, more generally, with redshift-dependent colour and magnitude
cuts.
One could assume that the most massive halo in a simulation corresponds to
the most luminous galaxy in the observations, so that we could do a one-to-one
mapping in rank order. This is certainly very optimistic, and we need to add a
scatter to the Luminosity-Mass (L − M) relation, which will decrease the clustering
of a magnitude-limited sample. This idea is the basis of the Halo Abundance
Matching (HAM) method [76–79].
halogen was designed to deal only with main halos, neglecting subhalos (Section
2.3.2 & Section 2.1.2). This limits the potential of HAM, as we cannot use its
natural extension to subhalos, SHAM, where there is more freedom in the physics
modelling (see e.g. [167]).
Nevertheless, we already argued (Section 2.1.2) for the possibility of adding substructure
to a main halo catalogue with a Halo Occupation Distribution (HOD) scheme. We
know that halos can host more than one galaxy; this is especially true of massive halos,
which represent galaxy clusters and host tens of galaxies. If we attribute to the halos
of a mock catalogue a number of galaxies Ngal that is an increasing function of the
halo mass Mh, the clustering will be enhanced, since massive halos will be
over-represented (as occurs in reality). This is the basis of the HOD methods [67–75].
We have just presented two models that need to be applied to a halo catalogue to
obtain a realistic galaxy catalogue. The details of these methods need to be matched
to observations via parameter fitting. This process can be particularly difficult if
one aims at a general model that serves for any sample, with any magnitude and
colour cuts, at any redshift (e.g. [73]). Additionally, the HOD implementation
will determine the small-scale clustering corresponding to the correlation between
galaxies of the same halo. This is the 1-halo term of the halo model [168],
as opposed to the 2-halo term, which is relevant at large scales and corresponds to
the correlation between galaxies of different halos.
Here, we aim at presenting the first end-to-end galaxy mock catalogue set for the
Y1-LSS selected sample. Hence, we start from the simplest model that can match
the large-scale clustering and number density. This consists of applying a HAM
with a one-parameter L − M scatter when the halo clustering is higher than in the
data, and a one-parameter HOD when the halo clustering is lower than in the data.
As we are only interested in a particular sample, we use M^gal as a proxy for
luminosity, and the HAM scatter is modelled as

log10(M^gal) = log10(Mh) + γ · Rgauss(0, 1)                         (3.6)

with γ a free parameter indicating the dispersion in dex (decimal exponent units).
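A sketch of this scatter, assuming the additive-in-log form of Equation 3.6 as reconstructed above (function names are ours):

    import numpy as np

    def ham_proxy(Mh, gamma, seed=0):
        # Scatter halo masses by gamma dex (Gaussian in log10 M) to build
        # the luminosity proxy M_gal used for the rank-ordered selection.
        rng = np.random.default_rng(seed)
        return 10.0 ** (np.log10(Mh) + gamma * rng.standard_normal(np.size(Mh)))

    def ham_select(Mh, gamma, n_target):
        # Magnitude-limited-like cut: keep the n_target largest proxies.
        M_gal = ham_proxy(np.asarray(Mh), gamma)
        return np.argsort(M_gal)[::-1][:n_target]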
For the HOD, we always set a central galaxy at the centre of the halo, plus Nsat
satellite galaxies following an NFW profile [66], where Nsat is a Poisson draw of the
halo mass divided by the free parameter M1:

Nsat(Mh) = RPoisson(Mh/M1)                                          (3.7)
More complex Nsat(Mh) functions can be found in the literature, including
exponentials, piecewise exponentials and error functions. However, we chose Equation 3.7
for its simplicity in the fitting, finding valid results for the desired purpose. Moreover,
studies using power-law HODs find best-fit values of the exponent very close to
unity [73], which leads back to Equation 3.7.
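The occupation itself is then a one-line Poisson draw per halo; a minimal sketch (with the bin-1 value of Table 3.2 as an example) is:

    import numpy as np

    def n_galaxies(Mh, M1, seed=0):
        # Equation 3.7: one central per halo plus Poisson satellites,
        # to be placed afterwards on an NFW profile around the centre.
        rng = np.random.default_rng(seed)
        return 1 + rng.poisson(np.asarray(Mh) / M1)

    # Example with the bin-1 fit of Table 3.2, log10(M1) = 13.4:
    # n_gal = n_galaxies(halo_masses, 10**13.4)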
The concentration needed for the NFW placement is determined from the mass
following [169]. The velocities of the central galaxies are taken from the host halo,
whereas the velocities of the satellites have an added dispersion following [170, 171],
where we use fvir = 0.9 and ∆vir(z) = 18π² + 82 d(z) − 39 d(z)², from spherical top-hat
collapse theory, being d(z) = 1 − Ω(z) and E(z)² = H(z)²/H0². Note that the virial
theorem together with Equation 1.9 already predicts σ ∝ M^{1/3}.
All the galaxies contained in a halo of mass Mh are assigned the same mass
M^gal = Mh.
This HOD-HAM process is done before constructing the lightcone and applying the
photo-z, but the measurements of the target wi(θ) in the 8 zph-bins, and the associated
χ², are performed after those processes:

χ² = Σ_{i=1}^{8} Σ_{0.1°<θj<1°} [wi_data(θj) − w̄i(θj)]² / ∆w(θj)²   (3.9)
Here, we use the same z-bins previously introduced in Table 3.1. The fit is done
using 8 catalogues, as follows:

• In each ztrue-bin i, apply either the HOD or the HAM scatter, with one
parameter (Mi_1 or γi), depending on whether we need to enhance or reduce the
clustering, respectively. The bin-1 value is also used for the low-z extension of
the lightcone (ztrue < 0.6), and the bin-8 value for the upper part (ztrue > 1.0).
• Apply the lightcone, mask and photo-z. This mixes galaxies drawn from different
physics (γi/Mi_1), but it helps smooth the transition between z-bins.

• Compute the mass threshold M^gal_th(zph) (a proxy for a redshift-dependent
magnitude cut) needed to match the N(z) of the data for each of the 8 catalogues.
Compute their average and apply that average threshold M̄^gal_th(zph) to all the
catalogues.
• Compute the angular correlation function in the 8 zph-bins for the 8 catalogues,
and measure their mean w̄(θ) and standard deviation ∆w(θ) to estimate the χ²
of Equation 3.9 (a sketch of this estimator is given after the list).
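A minimal sketch of this χ² estimator (the array shapes are our convention):

    import numpy as np

    def chi2_wtheta(theta, w_data, w_mocks):
        # Equation 3.9, restricted to 0.1 deg < theta < 1 deg.
        # theta:   (n_theta,) bin centres in degrees
        # w_data:  (8, n_theta) measured w(theta) in the 8 zph-bins
        # w_mocks: (n_mocks, 8, n_theta) same measurement on each mock
        sel = (theta > 0.1) & (theta < 1.0)
        w_mean = w_mocks.mean(axis=0)          # mean over mocks
        dw = w_mocks.std(axis=0, ddof=1)       # Delta w(theta_j)
        return np.sum((w_data - w_mean)[:, sel]**2 / dw[:, sel]**2)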
The resulting fitted parameters are shown in Table 3.2. These are used for the
generation of the 8 halogen-lamps catalogues, whose statistics are shown in Figure 3.5
and Figure 3.6. Both the number density and the angular clustering show excellent
agreement with the data. Moreover, we see that the galaxy mock catalogues
(halogen-lamps) represent a great improvement with respect to the halo catalogues
(halogen). Hence, the implementation of the HOD-HAM scheme appears necessary.
Finally, we remark that the dispersion γ found in the last three bins is large compared
to the typical dispersions found in the literature [172, 173]. This is partially due to the
sample selection and partially due to the modelling. Firstly, in those bins the density
drops quickly (Figure 3.5), and low-density (highly biased) samples typically
present more dispersion. Moreover, the photo-z selection becomes more contaminated
(see the broadening in Figure 3.4, or the si_data values), and what is meant to be a
highly biased sample (especially due to the low density) may be selecting average
galaxies from other redshifts, needing more scatter to compensate. Finally, the halo
catalogues have a mass resolution of M = 2.5 · 10¹¹ h⁻¹ M⊙. Given the large dispersions
that we are applying (over 2-3 orders of magnitude), it is clear that we are lacking the
lower-mass halos that would decrease the bias more efficiently. In fact, the bias barely
changes for γ ≳ 1.5, clearly pointing towards the convenience of improving the
mass resolution of the catalogue.
bin-i        1     2     3     4     5     6     7     8
log10(Mi_1)  13.4  13.6  14.2  14.5  14.0  –     –     –
γi           –     –     –     –     –     2.6   2.6   3.5

Table 3.2: HOD and HAM fitted parameters for the 8 z-bins (Table 3.1). M1 is the
mass scale of the HOD and γ the scatter of the L − M relation in dex.
Figure 3.5: Left axis: number density of galaxies, in units of (h/Mpc)³, as a function
of zph, for the data and the galaxy mock catalogues. For the latter we show the mean
and σ (halogen-lamps mean) and the individual curves (halogen-lamps) over the
8 realisations. Right axis: mass threshold M̄^gal_th used to generate the halogen-lamps
catalogues; the values indicate the decimal logarithm of the mass in h⁻¹ M⊙.
Certainly, as we improve our understanding of the data we will improve the modelling,
and vice versa. For the moment, in this section we have constructed the first
set of catalogues that reproduce the three main properties of the Y1-LSS sample, as
shown in Figure 3.4, Figure 3.5 and Figure 3.6.

3.3 Results and Applications

Once we have a set of galaxy mock catalogues, we can use them for many applications:

• First, to gain insight into the modelling and to compare statistics with theoretical
predictions (Section 3.3.1).
Figure 3.6: Angular correlation function w(θ) in the 8 zph-bins of Table 3.1 (one per
panel, from 0.60 < z < 0.65 to 0.95 < z < 1.00). We compare the clustering of the
Y1-LSS sample data with the halo mock catalogues (halogen) and the galaxy mock
catalogues (halogen-lamps), showing the mean and standard deviation computed
over the 8 mock catalogues. We also show the clustering of each of the 8 galaxy mock
catalogues individually, which can be compared more directly with the data. The
same rough n(z) has been imposed on all catalogues.
• Additionally, to study the optimal methodology to extract the BAO from the data
(Section 3.3.2).

• Eventually, to compute covariance matrices and set the uncertainty on the BAO
(and other) measurements (Section 3.3.3).
All the results presented in this section are provisional, and some of the figures
shown here are based on a previous version of the catalogues (with a simpler
photo-z modelling and only halos). We want to emphasise the need for, and usefulness
of, these catalogues in the process of analysis and optimisation, rather than to present
results, which will not be definitive until the optimisation of the sample is finished,
the method is fixed, and the results are published by the collaboration.
Whereas in Section 3.2.3 we already compared the catalogues with the data during
the calibration, here we start by comparing them with purely theoretical models. This
will help us understand the models and their range of validity.
In Figure 3.7 we show a comparison of the clustering and its error with the theoretical
predictions obtained with the method explained in [174]. The theory implements the
same bias b(z), photo-z P(zrsd|i) and number density N(z) as the mock catalogues.
We find good agreement, both for the mean of w(θ) and for its error, in all zph-bins
although, as expected, the theoretical predictions underestimate the errors. These
errors represent the diagonal part of the covariance matrices; we leave the comparison
of the off-diagonal components for a future study.
Being able to model the data with simulations allows us to better understand the
physics behind them, and to control it in the simulation. For example, in the left panel
of Figure 3.8 we study the exact effect on the clustering of adding a photo-z to our
catalogues, and the difference between the two photo-z models introduced in Section 3.2.2.
Figure 3.7: Comparison of the estimates, from theory and from the mock catalogues,
of the mean angular clustering w(θ) (top) and its 1-σ error ∆w(θ) (bottom), in the
8 zph-bins. 504 halogen halo mock catalogues were used for these estimates. Theory
predictions provided by M. Crocce.
Figure 3.8: Studying the effects of photo-z. Left: clustering of a sample selected in
zrsd compared with samples selected in zph following the two methods of Section 3.2.2
(cp and opt), all with the same number density, for the 0.8 < z < 0.85 bin. Right:
error introduced in w(θ) by the photo-z alone, by photo-z plus HOD, and by photo-z
plus HOD plus cosmic variance (see text), for the 0.8 < zph < 0.85 bin.
As expected, the clustering decreases when adding the photo-z, because the density field
is effectively smoothed along the line of sight and inhomogeneities appear less
pronounced. The cp model, which presents a higher σph (Table 3.1), reduces the
clustering even further.
One of the motivations we put forward for the need for mock catalogues was to
account for the interplay between systematics and cosmic variance. In the right panel of
Figure 3.8, we show a comparison of the error induced in w(θ) by the photo-z, by the
combination of photo-z and HOD, and the total error from the combination of photo-z,
HOD and cosmic variance. These were computed from: a) 8 different realisations of the
photo-z on the same catalogue without HOD; b) 8 realisations of photo-z and HOD
on the same catalogue; and c) 8 different catalogues with the full implementation. The
HOD here refers to the combination of HOD and HAM scatter, as fixed by Table 3.2.
Interestingly, we find that the error introduced by the photo-z seeds an important
fraction (∼0.3 − 0.5) of the total error, whereas the HOD stochasticity introduces a
negligible error.
Another important application of the mocks is to test the methodology. The way
we compress the data, from the list of coordinates of the full galaxy catalogue down to
a χBAO measurement, will affect χBAO itself and, particularly, ∆χBAO. This can be
analysed statistically with a large set of mock catalogues.
In the past sections we took 8 zph-bins and an angular binning of 0.015° for
w(θ) as a starting point. The convenience of changing both bin sizes to optimise the
precision on χBAO is now under study. Preliminary results from theory suggest that
a smaller θ-binning will increase the precision on the BAO, and that the zph-bins can be
widened without information loss while reducing the size of the covariance matrices.
This needs to be confirmed or refuted with the mock catalogues, since the theoretical
predictions at small θ-binning may not be realistic.
In addition, a comparison of the different methods to extract the BAO information
will be carried out in [175], where the different proposed methods analyse both the data
and the mock catalogues. This will include methods that analyse the clustering in
3D space (ξ(s)), the angular clustering in configuration space (w(θ)), and the angular
clustering in Fourier space (Cℓ).
Although most of the methods extract the BAO from the angular clustering, since
most of the information along the line of sight is lost, preliminary results from [176]
show that, by carefully combining the 3D information, one can recover χBAO with a
precision similar to that of the angular clustering, with the advantage of drastically
reducing the size of the dataset. In [176] we study how the BAO information is
distributed in µ (the cosine of the angle with respect to the line of sight), finding a
non-negligible amount at 0.2 < µ < 0.4 that is neglected by the angular clustering. The
relative information in different µ intervals can be seen in Figure 3.9, where we see a
well-pronounced peak at µ < 0.2 that fades at larger µ.

Figure 3.9: 3D correlation function s²ξ(s), integrated in the µ intervals µ < 0.2,
0.2 < µ < 0.4, 0.4 < µ < 0.6 and 0.6 < µ < 0.8, from [176]. Solid lines follow the
theoretical models, whereas the points are computed from 504 halogen mock
catalogues; error bars represent the standard deviation divided by √504.
3.3.3 Uncertainty
Finally, the ultimate goal of the mock catalogues is to compute the covariance matrices
of the correlation functions and the error bars on the BAO scale. Preliminary results
are shown in Figure 3.10, where we compare the correlation function from the data
(BPZ) with the halogen mock catalogues, with a catalogue from an N-body simulation
(Buzzard) and with a theoretical model. The errors on the data and on Buzzard are
computed with a Jack-Knife algorithm. In the future, a similar figure will show the
errors estimated purely from the mock catalogues. The results are very encouraging,
and all the work done in the LSS working group is promising.
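Once the mocks are in place, the covariance estimate itself is straightforward; a minimal sketch follows (the Jack-Knife used for the data in Figure 3.10 is a different, resampling-based estimator):

    import numpy as np

    def covariance_from_mocks(xi_mocks):
        # xi_mocks: (n_mocks, n_bins) correlation function of each mock,
        # e.g. the 504 halogen measurements of Figure 3.10.
        xi = np.asarray(xi_mocks)
        return np.cov(xi, rowvar=False)   # (n_bins, n_bins), 1/(n_mocks-1)

    # Diagonal errors, comparable to the error bars shown in Figure 3.10:
    # sigma = np.sqrt(np.diag(covariance_from_mocks(xi_mocks)))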
3.4 Conclusions

We have presented the first end-to-end set of galaxy mock catalogues for the Y1-LSS
sample of the Dark Energy Survey. They have been designed to match the data in
three statistics:
Figure 3.10: Preliminary 3D correlation function r²ξ(r) of the Y1 red sample from the
data (BPZ), the halogen catalogues, the Buzzard catalogue (N-body simulation) and
a theoretical model (solid line). The halogen points represent the mean and standard
deviation over 504 halo mock catalogues. The errors on both Buzzard and BPZ are
estimated via Jack-Knife. Figure provided by A. J. Ross.
• Lightcone. The parameters of halogen are fitted at several snapshots (z = 0.0,
0.5, 1.0, 1.5) and then interpolated to intermediate redshifts. The lightcone is
constructed by the superposition of z-shells of ∆z = 0.05 over the range of
interest.
Closure

In this thesis, I presented the research carried out during my PhD in the field of the
Large Scale Structure of the Universe. It connects simulations, on one side, with
observations, on the other. In Chapter 1, I studied the suitability of certain halo
finders and merger tree builders in the field of N-body simulations. In Chapter 2,
I developed a technique to generate approximate halo mock catalogs: halogen.
Finally, in Chapter 3, that technique was applied to the analysis of observed data
from a specific galaxy survey: the Dark Energy Survey. I presented the conclusions
of each chapter extensively in Section 1.6, Section 2.7 and Section 3.4, respectively,
but I synthesise here some of the most general conclusions, together with a wider
outlook.
In Chapter 1, I showed that we need to carefully select or design the tools of the
simulation pipeline according to the application. Specifically, for accurate merger
trees we need halo finders able to trace halos as they cross the centre of another
halo, which can be aided by tracking algorithms or phase-space finders. It is also
desirable to have merger tree builders able to correct for halo finder flaws by, for
example, skipping one snapshot.
In Chapter 2, I argued for the need for a new generation of tools for the massive
production of synthetic galaxy catalogs. halogen was shown to be a powerful tool,
able to produce halo catalogs with the correct 1- and 2-point statistics at large scales.
It consists of a single-parameter bias routine applied to a 2LPT density field, with
an analytic mass function and a velocity rescaling. I also presented a comparison of
practically all the approximate methods existing at that time, and discussed how each
of them is best suited to different applications, depending on the trade-off between
accuracy, simplicity, amount of resources and versatility.
Although no final results can be claimed until the selection and analysis of the Y1-LSS
sample conclude and the results are published, things are converging fast and
we expect to have a new BAO detection soon. And this is only the first year of data,
out of the 5 years of DES observations. During that period, a new point will appear in
the dM(z) diagram presented in Figure 2, eventually settling in the unexplored BAO
region of z ≈ 1, perhaps solving some of the current puzzles in Cosmology, or maybe
posing new questions.
We have shown that precision Cosmology with Large Scale Structure is possible, but
not necessarily easy. The field of fast generation of mock catalogs is now booming,
and it will soon need to deal simultaneously with increasing volumes, higher precision
in the measurements, more statistics to reproduce and larger covariance matrices to
estimate. There is a lot of work ahead for the scientific community, and a new
generation of tools is needed.
Data analysis, theory and simulations must interact with each other, opening new
windows to explore and new horizons to reach. The cosmological revolution goes on,
supported by the work of countless cosmologists around the world doing their share
and, little by little, widening our knowledge as a collective mind.
Epı́logo
Although the final results will not be ready until the selection and analysis of the
Y1-LSS sample have concluded and the results have been published by the
collaboration, the situation is converging rapidly and we expect to obtain a new BAO
detection soon. But this is only the analysis of the first-year data, out of the 5 years
planned for DES. During that time, a new point will appear in the dM(z) diagram
that we introduced in Figure 2. With new data and the corresponding analysis, this
point will settle around the unexplored area of z ≈ 1. Perhaps this new point will
help us solve some of the current enigmas of Cosmology or, perhaps, it will pose new
questions.
Data analysis, simulations and theory will have to interact with one another, opening
new windows through which to explore new horizons. The Cosmological Revolution
continues, counting on the support of countless cosmologists across the planet, each
contributing their grain of sand, and little by little widening knowledge as a collective
mind.
Bibliography
[3] C.-H. Chuang, C. Zhao, F. Prada, E. Munari, S. Avila, A. Izard, F.-S. Kitaura,
M. Manera, P. Monaco, S. Murray, A. Knebe, C. G. Scóccola, G. Yepes,
J. Garcia-Bellido, F. A. Marín, V. Müller, R. Skibba, M. Crocce, P. Fosalba,
S. Gottlöber, A. A. Klypin, C. Power, C. Tao, and V. Turchaninov. nIFTy cosmology:
Galaxy/halo mock catalogue comparison project on clustering statistics.
MNRAS, 452:686–700, September 2015.
[5] G. Bertone, D. Hooper, and J. Silk. Particle dark matter: evidence, candidates
and constraints. Phys. Rep., 405:279–390, January 2005.
[6] L. Amendola and S. Tsujikawa. Dark Energy: Theory and Observations.
Cambridge University Press, 2010.
[7] A. Linde. Inflationary Cosmology after Planck 2013. ArXiv e-prints, February
2014.
[14] D. Kraljic and S. Sarkar. How rare is the Bullet Cluster (in a ΛCDM universe)?
Journal of Cosmology and Astroparticle Physics, 4:050, April 2015.
[15] I. Harrison and P. Coles. Testing cosmology with extreme galaxy clusters.
MNRAS, 421:L19–L23, March 2012.
[16] A. Rassat, J.-L. Starck, P. Paykari, F. Sureau, and J. Bobin. Planck CMB
anomalies: astrophysical and cosmological secondary effects and the curse of
masking. Journal of Cosmology and Astroparticle Physics, 8:006, August 2014.
[19] J. Frieman and Dark Energy Survey Collaboration. The Dark Energy Survey:
Overview. In American Astronomical Society Meeting Abstracts 221, volume
221 of American Astronomical Society Meeting Abstracts, page 335.01, January
2013.
[40] Dark Energy Survey Collaboration. The Dark Energy Survey Science Program.
[43] O. Lahav and A. R. Liddle. The Cosmological Parameters 2014. ArXiv e-prints,
January 2014.
[45] R. W. Hockney and J. W. Eastwood. Computer Simulation Using Particles.
A. Hilger, special student edition, 1988.
[48] A. Lewis, A. Challinor, and A. Lasenby. Efficient computation of cosmic
microwave background anisotropies in closed Friedmann-Robertson-Walker models.
ApJ, 538:473–476, August 2000.
[67] Y. P. Jing, H. J. Mo, and G. Börner. Spatial Correlation Function and Pairwise
Velocity Dispersion of Galaxies: Cold Dark Matter Models versus the Las
Campanas Survey. ApJ, 494:1–12, February 1998.
[68] J. A. Peacock and R. E. Smith. Halo occupation numbers and galaxy bias.
MNRAS, 318:1144–1156, November 2000.
[71] H. Guo, I. Zehavi, and Z. Zheng. A New Method to Correct for Fiber Collisions
in Galaxy Two-point Statistics. ApJ, 756:127, September 2012.
[75] R. A. Skibba and R. K. Sheth. A halo model of galaxy colours and clustering
in the Sloan Digital Sky Survey. MNRAS, 392:1080–1091, January 2009.
[84] P. Monaco, F. Fontanot, and G. Taffoni. The MORGANA model for the rise
of galaxies and active nuclei. MNRAS, 375:1189–1219, March 2007.
[88] J. R. Bond, S. Cole, G. Efstathiou, and N. Kaiser. Excursion set mass functions
for hierarchical Gaussian fluctuations. ApJ, 379:440–460, October 1991.
[89] F. Jiang and F. C. van den Bosch. Generating Merger Trees for Dark Matter
Haloes: A Comparison of Methods. ArXiv e-prints, November 2013.
[91] C. Lacey and S. Cole. Merger rates in hierarchical models of galaxy formation.
MNRAS, 262:627–649, June 1993.
[101] S. R. Knollmann and A. Knebe. AHF: Amiga’s Halo Finder. ApJS, 182:608–
624, June 2009.
[102] J. Han, Y. P. Jing, H. Wang, and W. Wang. Resolving subhaloes’ lives with
the Hierarchical Bound-Tracing algorithm. MNRAS, 427:2437–2449, December
2012.
[114] P. Coles and B. Jones. A lognormal model for the cosmological mass distribu-
tion. MNRAS, 248:1–13, January 1991.
[118] F.-S. Kitaura, G. Yepes, and F. Prada. Modelling baryon acoustic oscillations
with perturbation theory and stochastic halo biasing. MNRAS, 439:L21–L25,
March 2014.
[120] M. White, J. L. Tinker, and C. K. McBride. Mock galaxy catalogues using the
quick particle mesh method. MNRAS, 437:2594–2606, January 2014.
[121] C.-H. Chuang, F.-S. Kitaura, F. Prada, C. Zhao, and G. Yepes. EZmocks:
extending the Zel’dovich approximation to generate mock galaxy catalogues
with accurate clustering statistics. MNRAS, 446:2621–2628, January 2015.
[122] Y. Feng, M.-Y. Chu, and U. Seljak. FastPM: a new scheme for fast simulations
of dark matter and halos. ArXiv e-prints, March 2016.
[123] S. R. Knollmann and A. Knebe. AHF: Amiga’s Halo Finder. ApJS, 182:608–
624, June 2009.
[132] J. R. Bond, S. Cole, G. Efstathiou, and N. Kaiser. Excursion set mass functions
for hierarchical Gaussian fluctuations. ApJ, 379:440–460, October 1991.
[135] D. Alonso. CUTE solutions for two-point correlation functions from large
cosmological datasets. ArXiv e-prints 1210.1833, October 2012.
[142] Y. P. Jing. Correcting for the Alias Effect When Measuring the Power Spec-
trum Using a Fast Fourier Transform. ApJ, 620:559–563, February 2005.
[145] F.-S. Kitaura and S. Heß. Cosmological structure formation with augmented
Lagrangian perturbation theory. MNRAS, 435:L78–L82, August 2013.
[146] C. Zhao, F.-S. Kitaura, C.-H. Chuang, F. Prada, G. Yepes, and C. Tao. Halo
mass distribution reconstruction across the cosmic web. MNRAS, 451:4266–
4276, August 2015.
[152] K. Batygin and M. E. Brown. Evidence for a Distant Giant Planet in the Solar
System. AJ, 151:22, February 2016.
[163] M. Crocce et al. Optimisation of the galaxy sample from Y1-DES data for
BAO analysis. In preparation.
[168] A. Cooray and R. Sheth. Halo models of large scale structure. Phys. Rep.,
372:1–129, December 2002.
[171] R. K. Sheth and A. Diaferio. Peculiar velocities of galaxies and clusters. MN-
RAS, 322:901–917, April 2001.
[175] E. Sanchez et al. Extracting BAO from a photometric galaxy survey:
comparison of methods. In preparation.

[176] A. J. Ross et al. Optimizing BAO Measurements for photo-z surveys:
Application to Dark Energy Survey Galaxy Clustering. In preparation.