Australasian Transport Research Forum 2018 Proceedings

30 October – 1 November, Darwin, Australia

Publication website: http://www.atrf.info

A Design Framework for Big Data Analysis in Railway Transport

Jingyu Zhang¹, Andrew Padley¹, Yuqi Wang², Shiping Chen²,³
¹ NSW Trains, 470 Pitt St., Sydney, Australia 2000
² University of Sydney, Australia 2006
³ CSIRO, Australia 2122

{Jingyu.Zhang, Andrew.Pedley}@transport.nsw.gov.au

Abstract
It is well known that cloud computing has many potential advantages, and a multitude of
enterprise applications are currently migrating to public or hybrid cloud environments. This
paper presents the workflow and implementation of a big data analysis framework with both
descriptive and predictive analytics capability. The framework supports not only a high-level
integrated KPI framework for enterprise reporting but also a structured deployment of machine
learning techniques to predict key drivers of railway performance. The two use cases
highlighted are the prediction of anti-social behavior incidence and of train punctuality.
Based on our evaluation, these two models can assist railway staff in estimating anti-social
behavior cases and train punctuality at NSW TrainLink (NSWTL), Australia.

1. Introduction
With the increase in both data storage capability and the prevalence of distributed computing,
big data is expected to change the railway transport domain completely. These technologies,
like numerous systems before them, are another step towards more informed decision making
at both an operational and a strategic level. Currently, the most popular areas for advanced
techniques in rail network analytics are safety, operations, and maintenance (Ghofrani, He,
Goverde, & Liu, 2018). It is vital for a railway operating in the 21st century to be able not
only to adapt to new technologies but also to capitalise on them, to ensure corporate decision
making is timely, effective and cost-efficient. The features of big data can be described using
the ‘5Vs’: volume, velocity and variety (Laney, 2001), together with veracity and value.

In this paper, we address the above challenges by developing a framework for data analytics
that allows multiple systems to cooperate and synchronize with each other to analyze and
predict the key elements in efficient railway transport services.

4. NSW TrainLink Integrated Key Performance Indicator System

BABEL has been deployed as an enabling technology for advanced analysis and predictive
modelling at NSW TrainLink (NSWTL). NSWTL is a multi-modal passenger transport service
provider, providing rail and coach services across New South Wales and connecting to other
states inside Australia. The main vision of the Integrated Key Performance Indicator
application is to identify and analyse the dominant levers of business objectives for both
managers and operational staff.

This is an abridged version of the paper presented at the conference. The full version is being submitted
elsewhere. Details on the full paper can be obtained from the author(s).

Our KPI regime serves three main purposes. First, it serves customers as we mature the
ability to enhance the elements of our service that affect them most. Second, it delivers
enhanced evidence to our main stakeholders that the levers of our performance have not only
been identified but are being actively managed. Third, it gives NSWTL the enhanced
capability of moving from an empirical recording of performance to deriving general rules
of operational performance. This translates to learning and proving the why behind the what.

Figures 1 and 2 are provided as illustrative examples only. Due to privacy and security
concerns, the real data cannot be provided. To reiterate, the data below is fabricated and the
plots are provided as examples of the platform outputs.

Figure 1. Customer Injury Analysis

Figure 2. Technical Incidents Analysis

The overall benefit of the KPI framework was to focus enterprise efforts on the levers of
business performance, as all employees are provided with an overview of the major drivers of
NSWTL performance. The senior leadership was better able to view the business in a strategic
context, and rectification efforts could be entered, tracked and resolved more efficiently.
A derivative benefit was the illumination of data quality gaps in the business, which laid
the foundation for robust discussions around the prioritization of fixes.

5. Predictive modelling with BABEL

In BABEL, one of the key functions is predictive modelling. We present two predictive
modelling applications that are currently running on BABEL in this section.

5.1. Anti-Social Behaviour (ASB) prediction
Combating anti-social behavior (ASB) through proper allocation of public security resources
is pivotal to ensuring public safety on the New South Wales railway network. We recognize,
however, that the Police Transport Command has limited resources to ensure the safety of our
customers and staff at all times. Rail customers who travel short or long distances by train
rate safety as their highest priority (Moore, 2011). To enhance public safety, transport
operators must find a way to allocate limited resources effectively. The introduction of
predictive modelling for ASB at NSWTL is a progressive step towards anticipating ‘hot-spots’
on the network and helping to provide a safe environment for staff, customers and the
community, both at stations and on trains.

Machine learning is a methodology to discover latent information or patterns in datasets
(Nirkhi, Dharaskar, & Thakre, 2012). It is an interdisciplinary field that combines real-world
situations with mining algorithms in a purely mathematical description to give insightful
analytics into various situations. Within the supervised learning area of machine learning,
decision tree methods are frequently utilized. Tree-based methods for regression and
classification involve stratifying and segmenting the predictor space into a number of smaller
regions (James, Witten, Hastie & Tibshirani, 2013).

This algorithm is based on information theory and splits nodes to allow for optimal
exploration of all available features (Rokach & Maimon, 2005). The split criteria are based
on information gain, as trees are expected to be simple. Given the entropy defined for the
classes as in Equation 1,

$$H(p_1, p_2, \ldots, p_n) = -\sum_{i=1}^{n} p_i \log p_i \tag{1}$$

where $p_i$ represents the $i$-th class' chance of appearance, the information gain can be
calculated as shown in Equation 2,

$$IG(T, a) = H(T) - H(T \mid a) \tag{2}$$

where $IG(T, a)$ is the information gain, $H(T) = -\sum_{i=1}^{n} p_i \log p_i$ represents the
parent entropy, and $H(T \mid a)$ is the weighted sum of the entropies of the children given
the parent $T$:

$$H(T \mid a) = -\sum_{a} p(a) \sum_{i=1}^{n} \Pr(i \mid a) \log \Pr(i \mid a).$$
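
The entropy and information-gain calculation described above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: the function names, the base-2 logarithm, and the toy labels with a made-up `late_night` flag are all assumptions introduced here.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (base 2, an assumption) of a sequence of class labels."""
    n = len(labels)
    counts = Counter(labels)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def information_gain(labels, feature_values):
    """Parent entropy minus the weighted sum of the child entropies."""
    n = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    weighted_children = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - weighted_children


# Hypothetical toy data: 1 = incident recorded, 0 = no incident,
# split on an invented binary "late_night" feature.
labels = [1, 1, 0, 0, 1, 0, 0, 0]
late_night = [1, 1, 1, 0, 0, 0, 0, 0]
gain = information_gain(labels, late_night)
```

A tree learner would evaluate `information_gain` for each candidate feature and split on the one with the largest value.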

The combination of several decision trees to produce better predictive performance is called
an ensemble method. Gradient boosting is an ensemble technique for prediction that applies
gradient descent to optimize the boosting of trees, as in Equation 3 (Friedman, 2001).

[Equation 3: not legible in the source]
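
To make the idea of boosting trees by gradient descent concrete, the following is a minimal sketch for squared-error loss, where each round fits a one-split tree (a stump) to the current residuals. It is not the paper's Equation 3 or the LightGBM algorithm; the helper names, the single-feature setting, the learning rate and the toy data are assumptions made for illustration, and the feature is assumed to have at least two distinct values.

```python
def fit_stump(x, residuals):
    """Find the single threshold on x that minimises squared error on the residuals."""
    best = None
    for threshold in sorted(set(x))[:-1]:  # drop the largest value so both sides are non-empty
        left = [r for xi, r in zip(x, residuals) if xi <= threshold]
        right = [r for xi, r in zip(x, residuals) if xi > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((r - left_mean) ** 2 for r in left)
               + sum((r - right_mean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    _, threshold, left_mean, right_mean = best
    return lambda xi: left_mean if xi <= threshold else right_mean


def gradient_boost(x, y, n_rounds=50, learning_rate=0.1):
    """Additive model: a constant plus a shrunken sum of stumps fitted to residuals."""
    base = sum(y) / len(y)
    stumps = []
    predictions = [base] * len(y)
    for _ in range(n_rounds):
        # For squared-error loss the negative gradient is the residual y - prediction.
        residuals = [yi - pi for yi, pi in zip(y, predictions)]
        stump = fit_stump(x, residuals)
        stumps.append(stump)
        predictions = [pi + learning_rate * stump(xi) for xi, pi in zip(x, predictions)]
    return lambda xi: base + learning_rate * sum(s(xi) for s in stumps)


def mse(model, x, y):
    """Mean squared error of a model over paired inputs and targets."""
    return sum((model(xi) - yi) ** 2 for xi, yi in zip(x, y)) / len(y)


# Hypothetical toy data, for illustration only.
x = [1, 2, 3, 4, 5, 6]
y = [2, 3, 2, 8, 9, 10]
model = gradient_boost(x, y)
```

Each round moves the predictions a small step towards the targets, which is why a small learning rate is usually paired with a large number of rounds.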

Several effective implementations of gradient tree boosting exist, such as XGBoost (Chen
& Guestrin, 2016), pGBRT (Tyree, Weinberger, Agrawal, & Paykin, 2011), and LightGBM
(Ke, Meng, Wang, Chen, Ma, & Liu, 2017). The XGBoost algorithm grows trees depth-wise
and controls model complexity through the maximum depth of each subtree. In contrast, the
LightGBM algorithm grows trees leaf-wise, and it is reported to outperform XGBoost in terms
of speed and memory usage (Ke, Meng, Wang, Chen, Ma, Liu, et al., 2017). In this study, we used
LightGBM to predict the number of ASB incidents at every railway station on a weekly basis.

5.1.1. Model evaluation

This step of the analysis is the implementation and calibration of our prediction model.
LightGBM has gained a good reputation for accuracy and memory usage (Ke, Meng, Wang, Chen,
Ma, Liu, et al., 2017) and was implemented in this study. K-fold cross-validation is a method
that evaluates predictive models by partitioning the original training sample into k
subsamples of equal size. One subsample is set aside for validation and the remaining k-1
subsamples are used for model building. This process is repeated k times, so that each
subsample is used once for validation and k-1 times for model building. In our model, a
six-fold cross-validation method has been used to validate the accuracy of the prediction
model.
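
The partitioning step of k-fold cross-validation can be sketched as follows. This is an illustrative helper, not the authors' code: the function name `k_fold_indices` and the sample count of 20 are assumptions, and the folds are taken in order without shuffling.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, validation_indices) for each of k near-equal folds."""
    indices = list(range(n_samples))
    # Spread any remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        validation = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, validation
        start += size


# Six folds, as in the study; the sample count here is made up.
folds = list(k_fold_indices(n_samples=20, k=6))
```

Each sample lands in exactly one validation fold and in the training set of the other five, which matches the "once for validation, k-1 times for model building" description.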

Table 1. LightGBM Core Parameters

Name of Parameter         Value of Parameter
objective                 Regression method
max_depth (one tree)      3
num_leaves (one tree)     25
learning_rate             0.007
n_estimators              30000
min_child_samples         80
subsample                 0.8
reg_alpha                 0.02
reg_lambda                0.02
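
The parameter names in Table 1 match the keyword arguments of LightGBM's scikit-learn style `LGBMRegressor`, so the configuration could be expressed roughly as below. This is a sketch under assumptions: the table only says "Regression method", so the `"regression"` objective string is a guess, and the helper name `build_asb_model` is hypothetical.

```python
# Values copied from Table 1; the objective string is an assumption.
TABLE_1_PARAMS = {
    "objective": "regression",
    "max_depth": 3,
    "num_leaves": 25,
    "learning_rate": 0.007,
    "n_estimators": 30000,
    "min_child_samples": 80,
    "subsample": 0.8,
    "reg_alpha": 0.02,
    "reg_lambda": 0.02,
}


def build_asb_model(params=None):
    """Return an untrained LightGBM regressor configured from Table 1."""
    # Imported lazily so the dictionary can be inspected without LightGBM installed.
    import lightgbm as lgb
    return lgb.LGBMRegressor(**(params or TABLE_1_PARAMS))
```

Note that a tree of depth 3 has at most 8 leaves, so with these values `max_depth` rather than `num_leaves` is likely the binding limit on tree size.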

After 20 rounds, the configuration in Table 1 was the best fit for our ASB forecasting. The
prediction error was calculated using the Root Mean Squared Logarithmic Error (RMSLE) as
follows:

$$\mathrm{RMSLE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left( \log(\hat{y}_i + 1) - \log(y_i + 1) \right)^2} \tag{4}$$

where $\hat{y}_i$ are the forecasts, $y_i$ are the observed values and $n$ is the number of
observations. For our model, the RMSLE values of the six folds are 0.26820, 0.26167, 0.26978,
0.26467, 0.27509, and 0.26762 respectively. Deploying the machine learning model to production
is the next step after a model has been trained.
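
As a small worked example of the error measure, the sketch below implements RMSLE and averages the six fold scores quoted above. The function name `rmsle` and the use of the natural logarithm are assumptions; only the six fold values come from the paper.

```python
import math


def rmsle(forecasts, observed):
    """Root Mean Squared Logarithmic Error between forecasts and observed values."""
    n = len(observed)
    squared = [(math.log1p(p) - math.log1p(a)) ** 2 for p, a in zip(forecasts, observed)]
    return math.sqrt(sum(squared) / n)


# Fold scores reported in the text.
fold_scores = [0.26820, 0.26167, 0.26978, 0.26467, 0.27509, 0.26762]
mean_score = sum(fold_scores) / len(fold_scores)
spread = max(fold_scores) - min(fold_scores)
```

The mean of the six folds is about 0.268 and the spread between the best and worst fold is about 0.013, which suggests the error is fairly stable across folds.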

5.1.2. Model publication
The best model was selected from all the trained models. Deploying a machine learning model
into a production environment is a very important step in a workflow. We used a batch offline
prediction method to publish our prediction results because our ASB model is run on a
weekly basis.

Figure 3. Visualisation of the Anti-Social Behaviour Predictions

Our forecasting results are shown in Figure 3. This map presents ASB cases along with the
location of cities, towns, and regional boundaries. It is a fully interactive map, which
ensures consistency and efficiency in visualisation. It should be noted at this stage that we
have omitted Central station from this process, both because its high volume of incidents
overwhelms the algorithm (i.e. all resources would tend to concentrate at that location) and
because a large number of the incidents that occur there are not related to rail operations.

5.2. Predicting Punctuality

On complex railway networks, passengers are frequently confronted with train delays,
especially during peak hours of demand. Delays can be caused by a number of factors,
including bad weather, freight movements, signal problems, customer or staff injuries, or
over-crowding leading to longer dwell times. A train delay or cancellation leads to a
deterioration in the quality of service, which impacts not only customer satisfaction but
also the reputation of the train operator. Using the collected data, machine learning models
can be trained to predict train punctuality. Automated data collection systems collect daily
data on how passengers use the transit system. Thus, the punctuality of the train network can
be abstracted as a time series, which is an important form of structured data for train
delays. The punctuality time series is of fixed frequency, which means the data points occur
at regular intervals according to a given operation pattern.
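
One common way to turn a fixed-frequency series into training samples for a sequence model is a sliding window, sketched below. The paper does not say how its inputs were prepared, so this is an assumption for illustration; the function name `make_windows`, the window length and the punctuality figures are all invented.

```python
def make_windows(series, window):
    """Turn a fixed-frequency series into (input_window, next_value) pairs."""
    samples = []
    for i in range(len(series) - window):
        samples.append((series[i:i + window], series[i + window]))
    return samples


# Hypothetical daily punctuality percentages, for illustration only.
daily_punctuality = [92.0, 88.5, 90.1, 85.3, 91.7, 89.9, 93.2]
samples = make_windows(daily_punctuality, window=3)
```

Each sample pairs three consecutive observations with the value that follows them, so a model can learn to predict the next point from the recent history.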

A powerful type of algorithm designed to handle sequential data is Long Short-Term Memory
(LSTM). LSTM is a prominent variant of the recurrent neural network used in deep learning to
handle input spaces with latent sequential connections (Hochreiter & Schmidhuber, 1997). A
typical LSTM unit consists of a cell, an input gate, an output gate and a forget gate, which
allow for flexible manipulation of input samples in sequence. Given an input vector $x_t$ at
timestamp $t$ and hidden unit $h_t$, with corresponding $W$, $U$ and $b$ for the weights and
biases to be learned in training, the forward pass of an LSTM unit equipped with a forget gate
can be calculated in the following way.

The forget gate activation vector $f_t \in \mathbb{R}^h$:

$$f_t = \sigma_g(W_f x_t + U_f h_{t-1} + b_f) \tag{5}$$

The input gate activation vector $i_t \in \mathbb{R}^h$:

$$i_t = \sigma_g(W_i x_t + U_i h_{t-1} + b_i) \tag{6}$$

The output gate activation vector $o_t \in \mathbb{R}^h$:

$$o_t = \sigma_g(W_o x_t + U_o h_{t-1} + b_o) \tag{7}$$

The cell state vector $c_t \in \mathbb{R}^h$:

$$c_t = f_t \circ c_{t-1} + i_t \circ \sigma_c(W_c x_t + U_c h_{t-1} + b_c) \tag{8}$$

where $\circ$ denotes the Hadamard product and $\sigma_g$, $\sigma_c$ and $\sigma_h$ are
activation functions. The hidden unit $h_t$ is the Hadamard product of $o_t$ and
$\sigma_h(c_t)$. Cell states store historical information about units at the previous
timestamp, whereby LSTM is able to take comprehensive account of input spaces with latent
sequential connections.
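
The forward pass of a single LSTM unit with a forget gate can be written out directly, as in the sketch below. It follows the standard formulation and is not taken from the authors' model: the choice of the logistic sigmoid for the gates and tanh for the cell and output activations, the list-based matrix layout, and the names `affine` and `lstm_step` are assumptions.

```python
import math


def sigmoid(z):
    """Logistic sigmoid, assumed here as the gate activation."""
    return 1.0 / (1.0 + math.exp(-z))


def affine(W, x, U, h_prev, b):
    """Compute W x + U h_prev + b with list-based matrices and vectors."""
    return [
        sum(w * xi for w, xi in zip(W[r], x))
        + sum(u * hi for u, hi in zip(U[r], h_prev))
        + b[r]
        for r in range(len(b))
    ]


def lstm_step(x, h_prev, c_prev, params):
    """One forward step; params maps 'f', 'i', 'o', 'c' to (W, U, b) tuples."""
    pre = {name: affine(W, x, U, h_prev, b) for name, (W, U, b) in params.items()}
    f = [sigmoid(z) for z in pre["f"]]          # forget gate
    i = [sigmoid(z) for z in pre["i"]]          # input gate
    o = [sigmoid(z) for z in pre["o"]]          # output gate
    c_tilde = [math.tanh(z) for z in pre["c"]]  # candidate cell state
    # Element-wise (Hadamard) products combine the old state with the candidate.
    c = [ft * cp + it * ct for ft, cp, it, ct in zip(f, c_prev, i, c_tilde)]
    h = [ot * math.tanh(ct) for ot, ct in zip(o, c)]
    return h, c


# Toy check with all-zero weights: every gate is 0.5 and the candidate is 0.
zeros_W = [[0.0, 0.0], [0.0, 0.0]]
zeros_b = [0.0, 0.0]
params = {name: (zeros_W, zeros_W, zeros_b) for name in ("f", "i", "o", "c")}
h, c = lstm_step(x=[1.0, 2.0], h_prev=[0.0, 0.0], c_prev=[1.0, -1.0], params=params)
```

With zero weights the forget gate halves the previous cell state, which makes the role of each gate easy to verify by hand before moving to learned weights.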

Figure 4. The Map View Forecasted Punctuality at Key Network Points.

As shown above (Figure 4), regression analysis is used to estimate the punctuality at
interchange stations where passengers change trains from the regional network to the city
network. According to our experimental results, the produced train delay model, which is
trained on limited features from our raw data, performs well at predicting train punctuality.

6. Conclusion
In this paper, we have proposed a novel framework to enable big data analysis within the
operations of a railway. Our framework blends heterogeneous data sources to provide
descriptive and predictive analytics. In the descriptive space, we designed and implemented an
integrated Key Performance Indicator regime over the top of the KPI framework. In the
predictive analytics space, we showcased both an anti-social behavior prediction engine and
a train punctuality prediction capability. These capabilities were built to ultimately deliver
an evidence base that flows through to both strategic and tactical decision making within
railway operations. By using leading-edge technology within a very traditional business, we
have been able to drive better decisions, enhance engagement with key stakeholders, and
focus efforts and resources in the pursuit of increased community value through a more
successful railway.

References
Chen, T. and Guestrin, C. (2016) ‘XGBoost: Reliable Large-scale Tree Boosting System’, arXiv. doi:
10.1145/2939672.2939785.
Cozens, P. et al. (2002) ‘Managing crime and the fear of crime at railway stations-A case study in South
Wales (UK)’, International Journal of Transport Management. doi: 10.1016/j.ijtm.2003.10.001.
Friedman, J. H. (2001) ‘Greedy function approximation: A gradient boosting machine’, Annals of
Statistics. doi: 10.1214/aos/1013203451.
Ghofrani, F. et al. (2018) ‘Recent applications of big data analytics in railway transportation systems:
A survey’, Transportation Research Part C: Emerging Technologies, 90, pp. 226–246. doi:
10.1016/j.trc.2018.03.010.
Goverde, R. M. P. and Hansen, I. A. (2000) ‘TNV-Prepare: Analysis of Dutch railway operations based
on train detection data’, Computers in Railways VII. doi: 10.2495/CR000751.
Hochreiter, S. and Schmidhuber, J. (1997) ‘Long Short-Term Memory’, Neural Computation. doi:
10.1162/neco.1997.9.8.1735.
Ke, G., Meng, Q., Wang, T., Chen, W., Ma, W. and Liu, T.-Y. (2017) ‘A Highly Efficient Gradient
Boosting Decision Tree’, Advances in Neural Information Processing Systems 30.
Ke, G., Meng, Q., Wang, T., Chen, W., Ma, W., Liu, T.-Y., et al. (2017) ‘LightGBM: A highly efficient
gradient boosting decision tree’, Advances in Neural Information Processing Systems.
Kecman, P. and Goverde, R. M. P. (2012) ‘Process mining of train describer event data and automatic
conflict identification’, in WIT Transactions on the Built Environment. doi: 10.2495/CR120201.
Kianmehr, K. and Alhajj, R. (2008) ‘Effectiveness of support vector machine for crime hot-spots
prediction’, Applied Artificial Intelligence. doi: 10.1080/08839510802028405.
Kouziokas, G. N. (2017) ‘The application of artificial intelligence in public administration for
forecasting high crime risk transportation areas in urban environment’, in Transportation Research
Procedia. doi: 10.1016/j.trpro.2017.05.083.
Laney, D. (2001) ‘3-D Data Management: Controlling Data Volume, Velocity, and Variety’, META
Group Res Note 6, 6.
Moore, S. (2011) ‘Understanding and managing anti-social behaviour on public transport through value
change: The considerate travel campaign’, Transport Policy. doi: 10.1016/j.tranpol.2010.05.008.
Neale, R. et al. (2004) ‘Tackling crime and fear of crime while waiting at Britain’s railway stations’,
Journal of Public Transportation. doi: 10.5038/2375-0901.7.3.2.
Nirkhi, S. M., Dharaskar, R. V. and Thakre, V. M. (2012) ‘Data Mining: A Prospective
Approach for Digital Forensics’, International Journal of Data Mining & Knowledge Management
Process.
Oneto, L. et al. (2017) ‘Dynamic delay predictions for large-scale railway networks: Deep and shallow
extreme learning machines tuned via thresholdout’, IEEE Transactions on Systems, Man, and
Cybernetics: Systems. doi: 10.1109/TSMC.2017.2693209.
Pongnumkul, S. et al. (2014) ‘Improving arrival time prediction of Thailand’s passenger trains using
historical travel times’, in 2014 11th Int. Joint Conf. on Computer Science and Software Engineering:
‘Human Factors in Computer Science and Software Engineering’ - e-Science and High Performance
Computing: eHPC, JCSSE 2014. doi: 10.1109/JCSSE.2014.6841886.
Rokach, L. and Maimon, O. (2005) ‘Top-Down Induction of Decision Trees Classifiers—A Survey’,
IEEE Transactions on Systems, Man and Cybernetics, Part C (Applications and Reviews). doi:
10.1109/TSMCC.2004.843247.
Tyree, S. et al. (2011) ‘Parallel boosted regression trees for web search ranking’, in Proceedings of the
20th international conference on World wide web - WWW ’11. doi: 10.1145/1963405.1963461.
Wang, R. and Work, D. B. (2015) ‘Data Driven Approaches for Passenger Train Delay Estimation’, in
IEEE Conference on Intelligent Transportation Systems, Proceedings, ITSC. doi:
10.1109/ITSC.2015.94.
Zheng, X., Cao, Y. and Ma, Z. (2011) ‘A mathematical modeling approach for geographical profiling
and crime prediction’, in ICSESS 2011 - Proceedings: 2011 IEEE 2nd International Conference on
Software Engineering and Service Science. doi: 10.1109/ICSESS.2011.5982362.
