Madrid
INTRODUCTION
The analysis of historical rainfall data provides critical insights into long-term climatic
patterns and their impact on environmental and socio-economic activities. Understanding trends
in rainfall over time is essential for water resource management, agricultural planning, and
climate change adaptation. The dataset presented includes rainfall depths (in millimeters) from
1860 to 1989, offering a comprehensive overview of precipitation trends over more than a
century. Variations in rainfall across the years reflect both natural fluctuations and potential
influences of broader climatic shifts, such as global warming and regional climate changes.
Analyzing this data can help to identify patterns, anomalies, and periods of extreme
rainfall, which are essential for flood risk management, drought mitigation, and designing
sustainable agricultural practices. Additionally, it may provide context for correlating hydrological
events with other environmental factors, offering opportunities for predictive modeling and long-
OBJECTIVES
1. Identify Long-term Rainfall Trends: Analyze the dataset to determine whether there
are discernible long-term increases or decreases in rainfall depth and explore potential
2. Examine Rainfall Variability: Investigate the variability of rainfall across the years,
identifying any anomalies or extreme rainfall events that may indicate periods of drought
or excessive precipitation.
3. Assess Climate Change Indicators: Use the data to assess any potential indicators of
climate change, including significant deviations from historical norms and trends that
4. Inform Water Resource Management: Provide insights into how historical rainfall data
flooding or drought.
6. Predictive Modeling: Develop models that predict future rainfall patterns based on
historical data, contributing to more accurate weather forecasting and climate resilience
strategies.
METHODOLOGY
The rainfall data for Madrid, comprising annual observations, were analyzed to
understand their statistical and probabilistic characteristics. The dataset first underwent
preprocessing. Gaps in the data were checked for; none required imputation, which
confirmed that the record was complete. Outliers were detected using the interquartile
range (IQR) method, calculated as: IQR = Q3 – Q1. Observations falling below Q1 – 1.5 × IQR
or above Q3 + 1.5 × IQR were removed to minimize skewness and retain data integrity.
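The IQR filter described above can be sketched in base R as follows. This is a minimal illustration, not the study's own code: the vector `rain` holds made-up annual totals, and the default quantile definition of `quantile()` (type 7) is an assumption, since the text does not say how Q1 and Q3 were computed.

```r
# Hypothetical annual totals (mm); illustrative values, not the Madrid record
rain <- c(310, 455, 402, 980, 367, 521, 433, 298, 476, 390)

# Quartiles (R's default type 7 definition; the report does not state which was used)
q <- quantile(rain, probs = c(0.25, 0.75), na.rm = TRUE)
q1 <- q[[1]]
q3 <- q[[2]]
iqr <- q3 - q1                      # IQR = Q3 - Q1

# Fences at 1.5 x IQR beyond the quartiles
lower <- q1 - 1.5 * iqr
upper <- q3 + 1.5 * iqr

# Keep observations inside the fences and record what was dropped
rain_clean <- rain[rain >= lower & rain <= upper]
removed <- setdiff(rain, rain_clean)
```

With these illustrative values only the 980 mm total falls outside the fences, so `removed` contains that single observation.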
To improve the interpretability and distribution symmetry of the rainfall data, the following
transformations were applied. The square root transformation reduced the influence of
extremely high values, stabilizing variance while preserving the data’s core structure. The cube root
transformation was particularly useful for balancing the distribution of both high
and low rainfall observations, making it well suited to exceedance probability analysis. The logarithmic
transformation was applied to differentiate low rainfall values, spreading
specific thresholds. The Weibull formula was used: P = m / (n + 1). For each transformation,
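A short base R sketch of the Weibull plotting position, P = m / (n + 1), is given here for illustration. It assumes that m is the rank of an observation when the totals are sorted from largest to smallest and that n is the number of observations; the vector `rain` is made up. The Gringorten position, which the appendix comments mention without showing a formula, is included in its commonly cited form, (m - 0.44) / (n + 0.12), for comparison.

```r
# Hypothetical annual totals (mm); illustrative values, not the Madrid record
rain <- c(402, 455, 310, 521, 367, 433, 298, 476, 390)

n <- length(rain)
# Rank 1 is assigned to the largest total (assumed descending ranking)
m <- rank(-rain, ties.method = "first")

# Exceedance probability by plotting position
p_weibull <- m / (n + 1)
p_gringorten <- (m - 0.44) / (n + 0.12)

# Collect the results, ordered from the rarest to the most common threshold
exceed <- data.frame(rain, m, p_weibull, p_gringorten)
exceed <- exceed[order(exceed$m), ]
```

Because the denominator is n + 1, the Weibull estimate never reaches 0 or 1, so even the largest observed total keeps a non-zero exceedance probability.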
Return periods, defined as the average time interval between occurrences of events
exceeding specific thresholds, were derived as: T = 1 / P. This provided actionable insights into the
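As a worked illustration of T = 1 / P, the snippet below converts the three exceedance probabilities quoted in this report (5%, 50% and 90%) into return periods. The variable names are illustrative only.

```r
# Exceedance probabilities quoted in the report for the 600 mm, 426 mm and 300 mm thresholds
p_exceed <- c(high = 0.05, typical = 0.50, low = 0.90)

# Return period in years, T = 1 / P (annual data, so T is in years)
return_period <- 1 / p_exceed

round(return_period, 2)
# high = 20, typical = 2, low = 1.11
```

Since P cannot exceed 1, a return period computed from annual data cannot be shorter than one year.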
To convey the data’s distribution and exceedance characteristics, histogram bins were adjusted for
optimal visualization, balancing resolution and clarity. Relative frequencies were computed as
percentages. Kernel density estimation was applied to smooth the distribution for visual
probabilities were plotted against transformed rainfall values, showcasing trends for extreme
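The relative frequency and kernel density calculations can be sketched in base R without plotting. This is a hedged example: `rain` is a made-up vector, the 50 mm bin width is an arbitrary choice, and the Gaussian kernel with default bandwidth is an assumption, since the text does not specify the settings used.

```r
# Hypothetical annual totals (mm); illustrative values, not the Madrid record
rain <- c(402, 455, 310, 521, 367, 433, 298, 476, 390)

bin_width <- 50                                  # assumed bin width in mm
breaks <- seq(250, 550, by = bin_width)          # must span the data range
h <- hist(rain, breaks = breaks, plot = FALSE)

# Relative frequency of each bin as a percentage of all observations
rel_freq <- 100 * h$counts / sum(h$counts)

# Kernel density estimate (Gaussian kernel, default bandwidth)
kde <- density(rain, na.rm = TRUE)

# Rescale the density to "percent per bin" so it shares a scale with rel_freq
kde_pct <- 100 * bin_width * kde$y
```

Multiplying the density by the bin width matters: a density integrates to 1 over the rainfall axis, whereas the relative frequencies sum to 100%, so the two only line up on one axis after this rescaling.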
Descriptive metrics (mean, median, standard deviation) were calculated for raw and
transformed data to establish baselines and evaluate the effects of transformations. The
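The comparison of descriptive metrics across raw and transformed data might look like the following base R sketch. The vector `rain` is made up, the column names are illustrative, and the natural logarithm is an assumption because the text does not state the base used.

```r
# Hypothetical annual totals (mm); illustrative values, not the Madrid record
rain <- c(402, 455, 310, 521, 367, 433, 298, 476, 390)

# Raw values alongside the three transformations discussed in the text
transformed <- data.frame(
  raw       = rain,
  sqrt_rain = sqrt(rain),
  cbrt_rain = rain^(1 / 3),
  log_rain  = log(rain)          # natural log (assumed base)
)

# Mean, median and standard deviation for every column
summary_stats <- sapply(transformed, function(x) {
  c(mean = mean(x), median = median(x), sd = sd(x))
})

round(summary_stats, 3)
```

The result is a 3 x 4 matrix with one column per version of the data, which makes it straightforward to see how each transformation shifts the gap between mean and median.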
The raw dataset revealed an average annual rainfall of 426.64 mm, with a standard
deviation of 96.00 mm, indicating moderate variability. Rainfall values ranged between 258 mm
and 697 mm, capturing both typical and extreme events. The median rainfall was consistent
with the mean, emphasizing the dataset's central tendency. However, the presence of outliers
highlighted the need for transformation to stabilize variance and improve interpretability.
enhancing the visualization of rainfall distribution: The square root transformation reduced the
impact of extreme rainfall values, producing a more symmetric distribution. This transformation
was particularly effective in stabilizing variance while retaining the dataset’s core structure.
The cube root transformation balanced both high and low rainfall values, resulting in an evenly
spread dataset. This transformation provided the clearest and most interpretable visualizations,
especially for exceedance probabilities and return period analyses. The logarithmic transformation
spread smaller rainfall values over a wider range, making low-end variations more
distinguishable. However, it compressed higher values, which slightly limited its effectiveness
specific rainfall thresholds. High rainfall events exceeding 600 mm were rare, with probabilities
below 5%, reflecting their extreme nature. Conversely, typical rainfall values around the mean
(426 mm) had an exceedance probability of approximately 50%, confirming the dataset's central
tendency. Annual totals below 300 mm were uncommon: the 300 mm threshold was exceeded with a
probability close to 90%. These findings align with the climatological patterns expected for Madrid, where moderate
implications for rainfall event recurrence: A rainfall event of 600 mm or more had a return period
of approximately 20 years, emphasizing its rarity. Typical rainfall totals (e.g., 426 mm) were
expected to be exceeded about every 2 years, aligning with the dataset's moderate variability. Totals
above 300 mm were frequent, with return periods of just over a year (1/0.9 ≈ 1.1 years), indicating their
rainfall values increased, underscoring the rarity of extreme events. The peaks of the
histograms consistently aligned with typical rainfall values between 400–500 mm. Density plots
revealed a smooth distribution curve across transformations, with the cube root transformation
achieving the best balance between high and low rainfall values. These plots highlighted how
Among the transformations, the cube root emerged as the most effective for balancing
the dataset. It not only smoothed the distribution but also facilitated clear exceedance and return
period visualizations. The square root and logarithmic transformations, while useful, had more
specialized applications: the square root for stabilizing variance and the logarithmic for
The findings emphasize the moderate variability of Madrid's rainfall and the prevalence
of typical rainfall events around 400–500 mm. Extreme events, though rare, are critical for
hydrological and urban planning. By analyzing rainfall through various transformations, the
climatological research and practical applications like disaster risk reduction and resource
allocation.
CONCLUSION
The analysis of annual rainfall data for Madrid revealed critical insights into its statistical
characteristics and patterns. The dataset, with an average annual rainfall of 426.64 mm and a
standard deviation of 96.00 mm, exhibited moderate variability, indicating a relatively stable
climate with occasional extreme events. By applying transformations, particularly the cube root
and logarithmic scales, the study successfully normalized the data distribution, making it easier
to interpret rare and extreme events while preserving the integrity of the data. The square root
transformation was effective in reducing the dominance of high-end values, while the logarithmic
transformation provided a clearer view of lower rainfall magnitudes. The cube root
transformation stood out as the most balanced, effectively spreading data points across the
Exceedance probability calculations highlighted that extreme rainfall events above 600
mm are rare, with probabilities below 5%, while typical rainfall around the mean (~426 mm)
is exceeded with a probability of about 50%. Return periods derived from exceedance probabilities offered
actionable insights, showing that high rainfall events exceeding 600 mm are expected
approximately once every 20 years, whereas moderate rainfall events occur more frequently.
These findings are critical for hydrological planning, flood risk assessments, and water resource
management.
The integration of relative frequency and density visualizations further illuminated the
dataset's behavior, particularly the clustering of typical rainfall values around 400–500 mm.
These graphs also demonstrated the transformations' impact on spreading and smoothing the
transformations and statistical techniques to better understand and predict climate behavior.
Overall, the study offers a robust framework for analyzing rainfall patterns, with applications in
urban planning, agriculture, and disaster risk reduction. The conclusions drawn not only
enhance our understanding of Madrid’s rainfall dynamics but also provide a methodology that
Utilize cube root transformations in future studies to balance rainfall data for analysis and
visualization effectively.
Integrate additional climatic factors, such as temperature and humidity, to enrich the
Develop predictive models leveraging return period data to inform water resource
Conduct further studies on seasonal and monthly rainfall distributions for a finer temporal
analysis.
APPENDIX A: R CODE
library(tidyverse)
library(ggpubr)
library(rstatix)
library(car)
library(broom)
####
#REMOVE OUTLIERS
#########
#MEAN AND SD
#######
#RANKING
##################
#PROBABILITY OF EXCEEDANCE
#WEIBULL AND GRINGORTEN
######
#PLOT
#WEIBULL AND GRINGORTEN
#################
#PROBABILITY OF EXCEEDANCE
#####################
#RETURN PERIOD
####################
#RELATIVE FREQ AND DENSITY VS. RAINFALL
# Calculate density
density_data <- density(MADRID2$observation, na.rm = TRUE)
####################
#PROBABILITY
#RETURN PERIOD
#EVENTS
################
#SQRT TRANSFORMATION
#EXCEEDANCE VS SQRT RAINFALL
# Apply the square root transformation (assumes MADRID2 is already loaded)
MADRID2 <- MADRID2 %>%
  mutate(SQRT_Observation = sqrt(observation))
#######
#FREQ VS SQRT RAINFALL
#######
#CUBE ROOT
#EXCEEDANCE VS CUBE ROOT RAINFALL
########
#FREQ AND DENSITY VS. CUBE ROOT
# Step 2: Plot relative frequency and density with increased bin width
# Assumes MADRID2 already has a CUBE_ROOT_Observation column, e.g. observation^(1/3)
ggplot(MADRID2, aes(x = CUBE_ROOT_Observation)) +
  geom_histogram(aes(y = after_stat(count / sum(count) * 100)), # Relative frequency as a percentage
                 binwidth = 1.0,      # Increased bin width (adjust this value as needed)
                 fill = "lightblue",
                 alpha = 0.5,
                 color = "black") +   # Outline color for the bars
  # Density x 100 matches the percentage scale only while binwidth = 1;
  # for other widths multiply by the bin width as well
  geom_density(aes(y = after_stat(density) * 100),
               color = "blue", linewidth = 1) +   # Overlay density line
  labs(title = "MADRID - Total Rainfall",
       x = "Cube Root Transformed Rainfall (mm)",
       y = "Relative Frequency (%)") +
  scale_y_continuous(sec.axis = sec_axis(~ ., name = "Density (%)")) + # Secondary y-axis for density
  theme_minimal() # A cleaner theme
#############
#LOGARITHM
#EXCEEDANCE VS LOGARITHM
###########
#FREQ AND DENSITY VS LOGARITHM