Earthquake Prediction Using Machine Learning Algorithm
Abstract: According to statistics reported by the BBC, the toll varies from one earthquake to the next. Approximately, up to thousands of people are killed, about 50,000 are injured, around 1-3 million are displaced, and a significant number go missing or are left homeless. Structural damage can reach almost 100%, and the economic loss varies from 10 to 16 million dollars. An earthquake of magnitude 5 and above is classified as deadliest. The most life-threatening earthquake to date took place in Indonesia where, by the figures cited, about 3 million people died, 1-2 million were injured and structural damage amounted to 100%. The consequences of an earthquake are therefore devastating: they are not limited to the loss of and damage to living and non-living things, but also bring significant change to surroundings, lifestyle and the economy. Each of these consequences motivates earthquake forecasting. With even a couple of minutes' notice, individuals can act to protect themselves from injury and death, harm and financial losses can be reduced, and property and natural resources can be secured.
In this work, a forecaster is designed and developed with the aim of predicting the catastrophe accurately. It focuses on detecting early signs of an earthquake by using machine learning algorithms. The system follows the basic steps of developing a learning system along with the life cycle of data science. Data-sets for the Indian subcontinent and the rest of the world are collected from government sources. Pre-processing of the data is followed by construction of a stacking model that combines the Random Forest and Support Vector Machine algorithms. The algorithms build a mathematical model from a "training data-set". The model looks for patterns that lead to a catastrophe and adapts to them, so that it can make decisions and forecasts without being explicitly programmed to perform the task. After a forecast, we broadcast the message to government officials and across various platforms.
The information to be forecast is represented by three factors: Time, Locality and Magnitude.
Keywords: Earthquake, Forecast, Machine Learning, Random Forest, Support Vector Machine
I. INTRODUCTION
The association of earthquakes with structural damage and loss of life is an enduring one, and it is therefore a focal point for many fields, including but not limited to seismological research and environmental engineering[1]. Its significance extends to human life as well, to sustenance and survival. An accurate and reliable prediction is a requisite for all disaster-prone areas as well as for locations where earthquakes are unlikely. It prepares us for the worst possible scenarios and for the measures that can be taken beforehand to handle an upcoming crisis. As technology evolves and helps humans towards a better and more convenient lifestyle, the possibility of saving lives is pursued with the help of efficient ML algorithms and Data Science to give accurate forecasts.
Machine Learning is a subset of Artificial Intelligence. It permits a system to adapt its behaviour based on its own learning and to improve from experience without explicit programming or human intervention[8].
A machine learning process starts with feeding a good-quality data-set to the algorithm(s) so as to build an ML prediction model. Algorithms perform knowledge discovery and statistical evaluation, determining patterns and trends in data. The selection of algorithms depends on the data and on the task that requires automation.
Our target is to foresee catastrophic events and improve the manner in which we react to them. Good forecasts and warnings save lives. A notice of an approaching calamity, issued well ahead of time, helps reduce both deaths and structural loss.
ML algorithms construct two types of predictive models, Regression and Classification models[6]. Each of them approaches data in a different way. The system described here uses a regression model, whose core idea is forecasting a numerical value.
Published By: Blue Eyes Intelligence Engineering & Sciences Publication
Retrieval Number: F9110038620/2020©BEIESP
DOI:10.35940/ijrte.E9110.018620
... recognising suitable, necessary and appropriate parameters, identifying patterns in these parameters and understanding correlations between actual earthquakes from the past so as to predict future occurrences.
Various Random Forest-Support Vector Machine ensemble models are studied, modelled and deployed.
II. RELATED WORK
Wenrui Li, Nakshatra, Nishita Narvekar, Nitisha Raut, Birsen Sirkeci and Jerry Gao introduce the idea that a strong earthquake is followed by aftershocks, whose locations can be detected by analysing the arrival times of P-waves and S-waves. The authors collect data from 16 earthquake stations in SAC file format, which holds time-series waveform data, and use it to study trends in P-waves and S-waves. The data is clipped and then denoised, by means of a triggering algorithm and filters, so that only the needed waveform remains. The AR picker algorithm is used to determine the P-wave and S-wave arrival times, which are treated as extracted features. The waveform is then converted into ASCII format, and the data is fed to different machine learning models (SVM, Decision Trees, Random Forest and Linear Regression) for comparison. Random Forest distinguishes best between earthquake-leading and non-earthquake-leading data, with an accuracy of 90%. A triangulation technique is used to calculate the epicentre and to predict the arrival times of the P-wave and S-wave and the difference between the two arrivals.[2]
Khawaja Muhammad Asim, Adnan Idris, Francisco Martínez-Álvarez and Talat Iqbal carry out earthquake prediction for the Hindu-Kush region, where small to medium earthquakes hit regularly, using tree-based ensemble classifiers such as RotBoost, Random Forest and Rotation Forest. They take an earthquake data-set and convert magnitude into binary classes, thereby adopting the concept of binary classification. They use a new combination of features based on three factors: the Gutenberg-Richter relationship, seismic rate changes and the distribution of fore-shock frequency. The highlight is the calculation of 51 seismic features using suitable procedures and techniques. Since all the models performed exceptionally well, we can conclude that the strategy of calculating 51 features was very effective. Rotation Forest gives an accuracy of 95.9%, the best among the models.[3] The useful insight for us is that a prediction model needs to be deployed for every region on earth; however, the work does not predict when an earthquake will occur or with what magnitude.
G. T. Prasanna Kumari develops a classification model using ensemble learning methods. The emphasis is on two notable ensemble algorithms, Bagging and Boosting, to examine how the creation of diverse ensembles improves the precision of an algorithm and how their effectiveness compares with the traditional approach of constructing a single model, which is usually followed in ML to build classifiers. Bagging and Boosting are discussed in depth by specifying how the process flow of each differs from the other, the different ways in which they can be applied, and their respective algorithms, strengths, achievements and limitations. She further discusses how the performance of each differs for batch processing (data given at once) and online processing (data generated continuously). She concludes that ensembles are usually considered impractical for systems where online processing takes place, but that here online processing performs better than batch processing, with the advantage of low run time, especially for larger data-sets.[4] Her insights help us in constructing our own ensemble models.
António E. Ruano, Maria G. Ruano, Pedro M. Ferreira, Ozias Barros, G. Madureira and Hamid R. Khosravani acquire seismic information from the PVAQ and PESTR stations of the seismic monitoring system. They note that the detectors already present at such stations produce an enormous number of false alarms and fail to detect events because they are based on a standard STA/LTA ratio. They therefore present a new seismic detector based on an SVM classifier and apply it continuously at these stations. They compare the specificity and recall obtained for each station and conclude that the SVM classifier can successfully differentiate between noise and seismic events. Next, they shift their focus to reducing detection time in an Early Warning System. The detection times first obtained (88 and 110 sec) are too large to be considered for deployment, so a new approach of overlapping windows is adopted, which brings the times down to 1.3 sec and 1.8 sec respectively. On the other hand, the accompanying change in recall and specificity results in an increase in correct detections and in false alarms as well.[5]
III. PROPOSED WORK
Developing a predictive model is a gradual, step-by-step procedure. Tools conventionally used for developing such models are Python, Hadoop and R.
The various steps involved are:
A. DATA ACQUISITION
Data acquisition is the process of bringing data into production use, either from a source outside the system or from data produced by the system itself. It is the first step and refers to gathering the required information. We obtain the required data-sets from government-provided websites such as:
USGS.gov (United States Geological Survey) - Scientific agency of the United States government.[13]
IMD.gov (India Meteorological Department) - Agency of the Ministry of Earth Sciences of the Government of India.[14]
Kaggle, which has been acquired by Google, also contains data-sets collected from different agencies of different governments.
The columns in the data-set are -
Date
Time
Latitude
Longitude
Magnitude
Depth
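As a minimal sketch of how records with the six columns listed above might be loaded, the snippet below parses a small in-memory CSV with Python's standard csv module. The two sample rows are made-up placeholders rather than real catalogue entries, and the date and time formats (MM/DD/YYYY and HH:MM:SS) are assumptions; actual exports from the sources named above may use different column names and formats.

```python
import csv
import io
from datetime import datetime

# Hypothetical sample in the column layout listed above; the values are
# placeholders for illustration, not real earthquake records.
SAMPLE_CSV = """Date,Time,Latitude,Longitude,Magnitude,Depth
01/02/2001,13:44:18,19.2,145.6,6.0,131.6
01/04/2001,11:29:49,1.8,127.3,5.8,80.0
"""


def load_events(text):
    """Parse CSV text into a list of dicts with typed fields."""
    events = []
    for row in csv.DictReader(io.StringIO(text)):
        events.append({
            # Assumed formats: MM/DD/YYYY for Date and HH:MM:SS for Time.
            "timestamp": datetime.strptime(
                row["Date"] + " " + row["Time"], "%m/%d/%Y %H:%M:%S"),
            "latitude": float(row["Latitude"]),
            "longitude": float(row["Longitude"]),
            "magnitude": float(row["Magnitude"]),
            "depth": float(row["Depth"]),
        })
    return events


events = load_events(SAMPLE_CSV)
```

Merging Date and Time into one timestamp keeps the Time factor in a single field, while Latitude and Longitude together describe Locality.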
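The stacking idea that the results refer to can be sketched with scikit-learn's StackingRegressor, which combines a Random Forest and a Support Vector Machine regressor through a meta-regressor. This is an illustrative sketch under assumptions, not the authors' implementation: the library choice, the linear meta-regressor, the hyperparameters and the synthetic data (random features standing in for latitude, longitude and depth, with an artificial numeric target) are all hypothetical.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor, StackingRegressor
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

# Synthetic stand-in data (assumption): three features playing the role of
# latitude, longitude and depth, and an artificial numeric target.
rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(300, 3))
y = 5.0 + X[:, 0] - 0.5 * X[:, 1] ** 2 + rng.normal(0.0, 0.1, size=300)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

# Base learners: a Random Forest and an SVM regressor (inputs scaled for SVR).
base_learners = [
    ("rf", RandomForestRegressor(n_estimators=50, random_state=0)),
    ("svr", make_pipeline(StandardScaler(), SVR(kernel="rbf"))),
]

# The meta-regressor is fitted on cross-validated base-learner predictions.
stack = StackingRegressor(
    estimators=base_learners,
    final_estimator=LinearRegression(),
    cv=5,
)
stack.fit(X_train, y_train)
predictions = stack.predict(X_test)
r2 = stack.score(X_test, y_test)
```

Because the meta-regressor only sees the predictions of the base learners, the quality of the stacked model depends on how accurate those learners are individually.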
... time taken is slightly higher for stacking. Results are as follows:
Table- I: Result Table
Parameters/Algorithms   ACCURACY   TRAINING TIME   RESPONSE TIME
Bagging                 74%        3m5sec          5sec
Boosting                76%        3m19sec         5sec
Stacking                83%        11m37sec        5sec
V. CONCLUSION
Thus we can conclude that integrating seismic activity data with machine learning technology yields efficient and significant results and can be used to predict earthquakes widely, provided that past records are well maintained. Our attempt can be termed successful. The combination of the two can be developed further to guard against earthquakes more effectively. Large data-sets prove to be very significant. Prediction models can be deployed in an area-centric manner, which substantially increases the chances of accurate prediction, but at the cost of studying the algorithms used to build the Stacking model, as it will perform well only if the algorithms chosen to build the meta-regressor are accurate themselves. The methodology can also be extended to predicting various other natural disasters.
14. “India Meteorological Department.” Wikipedia, Wikimedia Foundation, 21 Jan. 2020, en.wikipedia.org/wiki/IndiaMeteorologicalDepartment.
15. Kumar, Vivek. “Vivek Kumar.” Pluralsight, 13 May 2019, www.pluralsight.com/guides/building-classification-models-scikit-learn
AUTHORS PROFILE
Pratiksha Bangar, an undergraduate, is pursuing a Bachelor of Engineering in Information Technology at the Department of Information Technology, JSPM's Jaywantrao Sawant College of Engineering, Pune, and is currently in her final year. Her research area is Machine Learning.
Deeksha Gupta, an undergraduate, is pursuing a Bachelor of Engineering in Information Technology at the Department of Information Technology, JSPM's Jaywantrao Sawant College of Engineering, Pune, and is currently in her final year. Her area of interest is Machine Learning.
Sonali Gaikwad, an undergraduate, is pursuing a Bachelor of Engineering in Information Technology at the Department of Information Technology, JSPM's Jaywantrao Sawant College of Engineering, Pune, and is currently in her final year. Her research area is Machine Learning.