Overview of Research of Machine Learning in Air TR
DOI: 10.54254/2755-2721/8/20230091
Zihao Cai
College of Civil Aviation, NanJing University of Aeronautics and Astronautics,
Nanjing, Jiangsu, 211106
Abstract. As demand for air traffic has grown rapidly in recent years, the efficiency and safety
of air traffic management face greater challenges under limited airspace resources. Existing air
traffic management capability, an important part of the civil aviation system, can no longer keep
pace with the growth in air traffic. Machine learning, an advanced approach to computer
modelling, has shown good value and promise for application in air traffic management. This
paper first introduces the application methods and modelling process of machine learning in air
traffic management, then reviews the current state of research in three areas, namely air traffic
flow management, air traffic services and airspace management, and finally points out the
challenges and development prospects of applying machine learning to air traffic management.
Overall, the introduction of machine learning into air traffic management represents a major
trend with significant implications for its development.
Keywords: air traffic management, machine learning, deep learning, model building.
1. Introduction
In recent years, growing demand for air traffic has placed air traffic systems in all countries under
enormous pressure and caused significant flight delays. According to statistics from the Civil Aviation
Administration of China (CAAC), the average on-time rate of passenger flights nationwide during
2015-2019 was 75.71%, with an average delay of 18 min; in 2015 the on-time rate was only 68.33%,
with an average delay of over 20 min. According to the US Bureau of Transportation Statistics, the
on-time performance of major US airlines during the same period was only 79.99% [1-2]. Global air
transport declined under the impact of the COVID-19 pandemic, and delays fell briefly as a result.
However, as the industry recovers from the pandemic, domestic and international aviation is gradually
picking up and demand for air traffic is again growing strongly. Flight delays not only disrupt
passengers' journeys but also cause huge losses to the national economy. Flight delays are estimated to
have indirectly caused an economic loss of about 350 billion yuan to China in 2013 [3], and the
economic loss caused by flight delays in the United States in 2007 was about US$3.3 billion [4]. To
address these issues, there is an urgent need to improve air traffic management capabilities and to
deploy air traffic flow management strategies in advance to reduce losses. Regulatory bodies and
researchers have therefore started to promote computer
© 2023 The Authors. This is an open access article distributed under the terms of the Creative Commons Attribution License 4.0
(https://creativecommons.org/licenses/by/4.0/).
Proceedings of the 2023 International Conference on Software Engineering and Machine Learning
modelling as an alternative research method. Among these approaches, machine learning, which has
developed rapidly in recent years, is widely used in fields such as natural language processing, image
recognition and autonomous driving [5-7]. Machine learning resembles human learning: a program
learns general patterns from a large amount of data, builds a model and predicts the target value; by
comparing predicted and true values, it iterates through trial and error and optimises its parameters
until it outputs a model with high accuracy. This strong learning capability and versatility could open
up new opportunities for air traffic management.
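The predict, compare and adjust loop described above can be illustrated with a minimal sketch. It is an assumption made for illustration only and not a model taken from the reviewed literature: a one-variable linear model is fitted to synthetic data by gradient descent with NumPy. The function name `fit_linear`, the learning rate, the number of iterations and the data are all hypothetical.

```python
import numpy as np


def fit_linear(x, y, lr=0.5, epochs=2000):
    """Fit y ~ w * x + b by gradient descent on the mean squared error."""
    w, b = 0.0, 0.0
    for _ in range(epochs):
        pred = w * x + b                    # predict the target value
        err = pred - y                      # compare prediction with the true value
        w -= lr * 2.0 * np.mean(err * x)    # adjust parameters against the gradient
        b -= lr * 2.0 * np.mean(err)
    return w, b


# Synthetic data (hypothetical): true relationship y = 3x + 1 plus small noise.
rng = np.random.default_rng(0)
x = rng.uniform(0.0, 1.0, size=200)
y = 3.0 * x + 1.0 + rng.normal(0.0, 0.01, size=200)

w, b = fit_linear(x, y)
```

After training, `w` and `b` should lie close to the values used to generate the data, which is the sense in which the program has learned the pattern.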
This paper therefore examines the current application of machine learning in air traffic management
research. First, it introduces how machine learning can be applied to air traffic management, covering
basic machine learning concepts and the modelling process (dataset construction, data pre-processing,
model selection and construction, model analysis). Second, it discusses existing research under the
three components of air traffic management (air traffic services, air traffic flow management, airspace
management). Third, on the basis of this review, it points out the challenges and prospects of current
research on machine learning in air traffic management. It concludes that, despite the challenges, the
introduction of machine learning has significant implications for the research and development of air
traffic management. This paper aims to give future researchers a reference on research directions and
so to facilitate the application of machine learning in real production environments.
recurrent neural networks and fully connected neural networks[16]. Neural network frameworks such as
PyTorch, TensorFlow and Caffe have been open-sourced to facilitate their use by researchers[17-19].
2.2.1. Dataset construction. A large, high-quality dataset is the basis for building a reliable machine
learning model, and dataset construction generally proceeds in several steps. The first step is data
collection; data are generally drawn from published papers, public datasets and internal datasets of
regulatory agencies or departments. Where data are severely lacking, new datasets can also be
constructed from ADS-B messages and ATC communication records. The next step is to extract the
experimentally relevant information as model inputs, such as aircraft altitude, heading, speed and
position. Finally, the objectives of the study (future air traffic flow, evaluation indicators, etc.) are
taken as the predicted outputs of the model.
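As a minimal sketch of the input and output extraction step, the example below turns one time-ordered aircraft track into supervised samples. Everything in it is an assumption for illustration: the column order in `FEATURES`, the helper `make_windows`, the window length and the toy track are hypothetical, and taking the next position as the target is only one of many possible study objectives.

```python
import numpy as np

# Hypothetical column order for one decoded surveillance record.
FEATURES = ("altitude_ft", "heading_deg", "speed_kt", "lat_deg", "lon_deg")


def make_windows(track, n_past=3):
    """Turn one time-ordered track of shape (T, 5) into supervised samples.

    Each input is the last `n_past` states flattened into one vector; each
    target is the (lat, lon) position of the record that follows them.
    """
    track = np.asarray(track, dtype=float)
    inputs, targets = [], []
    for t in range(n_past, len(track)):
        inputs.append(track[t - n_past:t].ravel())
        targets.append(track[t, 3:5])
    return np.array(inputs), np.array(targets)


# Toy track (made-up values): level flight heading east at constant speed.
track = [[35000, 90, 450, 31.0, 118.0 + 0.1 * i] for i in range(6)]
X, y = make_windows(track)
```

With six records and a window of three, this yields three samples of fifteen input values each, paired with three two-dimensional targets.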
Here are some common air traffic databases.
2.2.2. Data pre-processing. Collected data often contain missing values and irregular formats, so data
cleaning and pre-processing are needed before modelling to improve data quality. When a feature has a
large proportion of missing values, it is generally discarded, since it would otherwise degrade model
performance. Samples containing outliers can usually be removed directly. In addition, the data are
usually normalised or standardised to restrict values to a common range and to prevent the model from
being biased by features with large magnitudes. Note that string-valued variables that machine learning
algorithms cannot process directly (e.g., categorical variables) can be converted into numeric variables
using one-hot encoding.
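The pre-processing steps above can be sketched with NumPy alone. This is an illustrative sketch under stated assumptions, not a prescribed pipeline: the helper names, the 50% missing-value threshold and the example values are hypothetical, and outlier removal is omitted because the appropriate rule depends on the data.

```python
import numpy as np


def drop_sparse_columns(X, max_missing=0.5):
    """Drop feature columns whose share of NaN values exceeds `max_missing`."""
    keep = np.isnan(X).mean(axis=0) <= max_missing
    return X[:, keep], keep


def min_max_normalise(X):
    """Rescale each column to [0, 1], ignoring NaN values."""
    lo, hi = np.nanmin(X, axis=0), np.nanmax(X, axis=0)
    span = np.where(hi > lo, hi - lo, 1.0)   # avoid division by zero
    return (X - lo) / span


def standardise(X):
    """Rescale each column to zero mean and unit variance, ignoring NaN values."""
    mu, sigma = np.nanmean(X, axis=0), np.nanstd(X, axis=0)
    return (X - mu) / np.where(sigma > 0, sigma, 1.0)


def one_hot(labels):
    """Encode category strings as a 0/1 matrix with one column per category."""
    categories = sorted(set(labels))
    index = {c: i for i, c in enumerate(categories)}
    encoded = np.zeros((len(labels), len(categories)), dtype=int)
    encoded[np.arange(len(labels)), [index[c] for c in labels]] = 1
    return encoded, categories


# Made-up example: altitude (ft), speed (kt) and a mostly missing sensor column.
X = np.array([[30000.0, 420.0, np.nan],
              [35000.0, 450.0, np.nan],
              [40000.0, 480.0, 1.0]])
X_kept, kept_mask = drop_sparse_columns(X)
X_norm = min_max_normalise(X_kept)
X_std = standardise(X_kept)
phase_onehot, phase_names = one_hot(["climb", "cruise", "climb"])
```

In this example the third column is discarded because two of its three values are missing, and the remaining columns are rescaled so that altitude, measured in tens of thousands, no longer dominates speed.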
2.2.3. Model selection and construction. Machine learning models are primarily used for classification,
which predicts discrete targets, and regression, which predicts continuous targets. Most common
algorithms, such as decision trees, random forests, support vector machines, k-nearest neighbours,
artificial neural networks and XGBoost, can be used for both classification and regression; the
exceptions are naive Bayes, which is generally used for classification, and multiple linear regression,
which is used only for regression. Within the air traffic management domain, classification models can
be used to determine whether current air traffic conditions will cause delays, while regression models
are often used to predict air traffic flows in the airspace over time.
Model building is a central part of the machine learning process in air traffic management. Before
building the model, the pre-processed dataset is split into a training set, a validation set and a test set,
usually in the ratio 7:2:1. The training set is used with a suitable algorithm to fit the model, and the
validation set is used to check its predictions and tune its hyperparameters. The hyperparameters (e.g.,
the number of decision trees in a random forest, the maximum depth of each tree, the number of
neurons in each layer of a neural network) often have a significant impact on prediction performance.
In Python, the GridSearchCV class in the Scikit-Learn library is often used to search a grid of
hyperparameter values for the best combination. Finally, the test set is used to check the model's
ability to generalise. Alternatively, the robustness of the model can be tested using k-fold
cross-validation: the dataset is divided into k folds, k-1 folds are used to train the model and the
remaining fold is used to validate it. This is repeated k times so that each fold serves once as the
validation set, and the average of the k validation results is reported as the model's performance.
10-fold cross-validation is currently the most widely used.
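A minimal sketch of this workflow with Scikit-Learn is given below. It rests on several assumptions that are not taken from the reviewed studies: the data are synthetic, the model is a random forest regressor, and the hyperparameter grid is deliberately tiny. `PredefinedSplit` is used here so that GridSearchCV scores each candidate on the fixed validation set; by default GridSearchCV would instead use internal cross-validation.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import (GridSearchCV, PredefinedSplit,
                                     cross_val_score, train_test_split)

# Synthetic stand-in for a pre-processed data set (not real traffic data).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
y = 2.0 * X[:, 0] - X[:, 1] + rng.normal(scale=0.1, size=200)

# 7:2:1 split: hold out 10% for testing, then 20% of the total for validation.
n = len(X)
X_rest, X_test, y_rest, y_test = train_test_split(
    X, y, test_size=n // 10, random_state=0)
X_train, X_val, y_train, y_val = train_test_split(
    X_rest, y_rest, test_size=2 * n // 10, random_state=0)

# Grid search scored on the fixed validation set: -1 marks rows that are
# always used for training, 0 marks rows of the single validation fold.
X_tv = np.vstack([X_train, X_val])
y_tv = np.concatenate([y_train, y_val])
fold = np.concatenate([np.full(len(X_train), -1),
                       np.zeros(len(X_val), dtype=int)])
param_grid = {"n_estimators": [20, 50], "max_depth": [3, None]}
search = GridSearchCV(RandomForestRegressor(random_state=0), param_grid,
                      cv=PredefinedSplit(fold))
search.fit(X_tv, y_tv)   # refits the best setting on training + validation rows

# Generalisation check on the untouched test set (R^2 for regressors).
test_r2 = search.score(X_test, y_test)

# Alternative robustness check: 10-fold cross-validation on the full data set.
cv_scores = cross_val_score(
    RandomForestRegressor(random_state=0, **search.best_params_), X, y, cv=10)
mean_cv_r2 = cv_scores.mean()
```

The selected hyperparameters are available as `search.best_params_`; in practice the grid would cover more values and the scoring metric would be chosen to match the study objective.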
Depending on the type of target value, models can be grouped into regression models and classification
models. Regression models are used when the target is a continuous numerical variable, and
classification models are used when the target is a discrete variable. Regression models are generally
evaluated by the coefficient of determination (R²) and the root mean square error (RMSE), while
classification models are evaluated by accuracy, precision, recall, the receiver operating characteristic
(ROC) curve and the area under the ROC curve (AUC) [20].
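For reference, the scalar metrics named above can be computed directly from predictions, as in the following sketch. The function names and example values are hypothetical; ROC and AUC are omitted because they require predicted scores rather than hard labels, and the binary case with 1 as the positive class is assumed for the classification metrics.

```python
import numpy as np


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)


def rmse(y_true, y_pred):
    """Root mean square error."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


def classification_metrics(y_true, y_pred):
    """Accuracy, precision and recall for binary labels (1 = positive class)."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    tp = int(np.sum((y_true == 1) & (y_pred == 1)))   # true positives
    fp = int(np.sum((y_true == 0) & (y_pred == 1)))   # false positives
    fn = int(np.sum((y_true == 1) & (y_pred == 0)))   # false negatives
    accuracy = float(np.mean(y_true == y_pred))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return accuracy, precision, recall


# Made-up values: observed vs. predicted delay in minutes, and delayed (1) or not (0).
delay_r2 = r2_score([10, 20, 30], [12, 18, 30])
delay_rmse = rmse([10, 20, 30], [12, 18, 30])
accuracy, precision, recall = classification_metrics([1, 0, 1, 1, 0], [1, 0, 0, 1, 1])
```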
2.2.4. Model analysis. Because the process of building machine learning models is automated and
opaque, they are often referred to as 'black box' models [21], which makes it difficult for researchers
to derive relationships between input variables and results from model predictions. Model
interpretability is therefore one of the key areas of interest for researchers. Methods commonly used
to explore interpretability, such as sensitivity analysis and random forest feature importance ranking,
can help researchers identify the important features in the data.
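Random forest feature ranking can be sketched with Scikit-Learn's impurity-based `feature_importances_` attribute. The data below are synthetic and the feature names are hypothetical; the target is constructed so that only the first feature matters, which makes the expected ranking easy to check.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Synthetic data (not real traffic data): only the first feature drives the target.
rng = np.random.default_rng(0)
feature_names = ["wind_speed", "traffic_count", "visibility"]   # hypothetical names
X = rng.normal(size=(300, 3))
y = 5.0 * X[:, 0] + rng.normal(scale=0.1, size=300)

model = RandomForestRegressor(n_estimators=50, random_state=0).fit(X, y)

# Impurity-based importances sum to 1; sort features from most to least important.
order = np.argsort(model.feature_importances_)[::-1]
ranking = [(feature_names[i], float(model.feature_importances_[i])) for i in order]
```

Impurity-based importances can be misleading for correlated or high-cardinality features, so a ranking of this kind is best treated as a starting point for further analysis rather than a definitive explanation.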
By the above definitions, the existing studies are discussed in three categories.
Mollinga et al. proposed an air traffic control model that uses a graph-based deep learning network to
control aircraft movements, reducing traffic density in the area and the number of potential conflicts
and collisions[28]. Li and Pham used reinforcement learning to train agents that detect potential
conflicts in the airspace and, in place of air traffic controllers, issue commands to multiple aircraft
(changing heading, climbing or descending to another altitude level, or increasing or decreasing
speed) to avoid those conflicts[29-30]. Such models can support the training of air traffic controllers
or assist busy controllers with decision making in high-density traffic, and could in future act as a
replacement for human controllers.
Airports also fall within the scope of air traffic control, where the aim is likewise to identify and
predict risk precursors and to mitigate risks before accidents occur. Herrema et al. used machine
learning and statistical modelling to predict taxi-out times, flight times on final approach and airspeed
profiles, and to identify abnormal runway occupancy times, in order to anticipate potential risks and
accidents[31].
as a guide for the selection of aircraft for temporary re-routing and emergency landings[37]. In
addition, airspace sector complexity plays a considerable role in air traffic management, and traffic
density in future open UAV scenarios will be orders of magnitude higher than it is today. Xie et al.
proposed a CNN-based framework for airspace sector complexity assessment and devised a new data
structure, the multichannel traffic scenario image (MTSI), which represents the airspace as a
two-dimensional grid populated with navigation information, enabling more efficient and flexible
extraction of deep features for evaluating airspace complexity[38]. Li et al., considering the cost of
collecting such expert-annotated data too high, proposed an unsupervised learning method with a novel
loss function that better addresses adverse properties of airspace complexity features, including
dimensional coupling, category imbalance and boundary overlap[39]. The method was validated in
extensive experiments on six regions of southwest Chinese airspace, demonstrating that the model can
provide a more objective evaluation to help air traffic controllers understand airspace complexity.
Going further, Ribeiro et al. used reinforcement learning to adapt the layered design of the airspace to
the flight paths and traffic conditions of the aircraft currently in it, reducing the number of conflicts
and losses of minimum separation; the results suggest that this airspace structure distributes aircraft
more evenly and thereby increases airspace capacity[40].
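To make the idea of a grid-based, multichannel traffic representation concrete, the sketch below rasterises a handful of aircraft states into a small image-like array that a CNN could take as input. It is not the MTSI definition of Xie et al.: the choice of channels (aircraft count, mean altitude, mean speed), the grid size, the function name `rasterise_traffic` and the sample traffic are all assumptions made for illustration.

```python
import numpy as np


def rasterise_traffic(aircraft, lat_range, lon_range, grid=(4, 4)):
    """Build a (3, rows, cols) array: aircraft count, mean altitude, mean speed.

    `aircraft` is a list of (lat, lon, altitude, speed) tuples whose positions
    are assumed to lie inside the given latitude and longitude ranges.
    """
    rows, cols = grid
    image = np.zeros((3, rows, cols))
    lat_span = lat_range[1] - lat_range[0]
    lon_span = lon_range[1] - lon_range[0]
    for lat, lon, alt, spd in aircraft:
        # Map the position to a cell index, clamping points on the upper edge.
        r = min(int((lat - lat_range[0]) / lat_span * rows), rows - 1)
        c = min(int((lon - lon_range[0]) / lon_span * cols), cols - 1)
        image[0, r, c] += 1
        image[1, r, c] += alt
        image[2, r, c] += spd
    occupied = image[0] > 0
    image[1][occupied] /= image[0][occupied]   # sum -> mean altitude per cell
    image[2][occupied] /= image[0][occupied]   # sum -> mean speed per cell
    return image


# Made-up sector and traffic sample.
traffic = [(30.1, 104.1, 9000.0, 420.0),
           (30.2, 104.2, 11000.0, 460.0),
           (33.9, 107.9, 10000.0, 440.0)]
scene = rasterise_traffic(traffic, lat_range=(30.0, 34.0), lon_range=(104.0, 108.0))
```

Stacking such arrays over successive time steps would give a sequence of traffic scenes, but how the published framework encodes navigation information is not specified here and should be taken from the original paper.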
4. Challenges and prospects for research of machine learning in air traffic management
5. Conclusion
This paper discusses the development, applications and prospects of machine learning technology in air
traffic management. Machine learning is still some distance from large-scale operational use. However,
as a key part of future intelligent transportation systems, and with the continuous improvement of
computing hardware represented by GPUs and the gradual evolution of deep learning algorithms, the
key technologies of artificial intelligence will continue to develop significantly. There is a growing
trend to introduce artificial intelligence into the air traffic management industry, where it can help
cope with the growth in air traffic demand driven by economic development. By predicting aircraft
trajectories, optimising the airspace structure and enhancing the productivity of air traffic controllers,
it can improve the utilisation of airspace resources, reduce environmental pollution and reduce the
occurrence of air traffic accidents.
Owing to the limitations of the author's research experience and the length of this paper, the three
sub-scopes of Air Traffic Services (ATS), namely Air Traffic Control (ATC) Services, Flight
Information Services (FIS) and Alerting Services, have not been studied separately, examined in
greater depth or compared. In future research, the author will continue to investigate and analyse the
application of machine learning in air traffic management.
References
[1] Civil Aviation Administration of China. Civil aviation industry development statistics bulletin
[R]. Beijing: Civil Aviation Administration of China, 2020: 14.
[2] United States Department of Transportation. Understanding the reporting of causes of flight
delays and cancellations [EB/OL]. [2020-11-12].
https://www.bts.gov/topics/airlines-and-airports/understanding-reporting-causes-flight-delays-and-cancellations.
[3] Chen Y, Yu J, Tsai S B, et al. An empirical study on the indirect impact of flight delay on China’s
economy[J]. Sustainability, 2018, 10(2): 357.
[4] Ball M, Barnhart C, Dresner M, et al. Total delay impact study: a comprehensive assessment of
the costs and impacts of flight delay in the United States[J]. 2010.
[5] Otter D W, Medina J R, Kalita J K. A survey of the usages of deep learning for natural language
processing[J]. IEEE transactions on neural networks and learning systems, 2020, 32(2): 604-
624.
[6] Ker J, Wang L, Rao J, et al. Deep learning applications in medical image analysis[J]. IEEE Access,
2017, 6: 9375-9389.
[7] Grigorescu S, Trasnea B, Cocias T, et al. A survey of deep learning techniques for autonomous
driving[J]. Journal of Field Robotics, 2020, 37(3): 362-386.
[8] Aguilera P A, Fernández A, Fernández R, et al. Bayesian networks in environmental modelling[J].
Environmental Modelling & Software, 2011, 26(12): 1376-1388.
[9] Loh W Y. Classification and regression trees[J]. Wiley interdisciplinary reviews: data mining and
knowledge discovery, 2011, 1(1): 14-23.
[10] Burden F R, Winkler D A. Relevance vector machines: sparse classification methods for
QSAR[J]. Journal of chemical information and modeling, 2015, 55(8): 1529-1534.
[11] Svetnik V, Liaw A, Tong C, et al. Random forest: a classification and regression tool for
compound classification and QSAR modeling[J]. Journal of chemical information and
computer sciences, 2003, 43(6): 1947-1958.
[12] Chen T, Guestrin C. Xgboost: A scalable tree boosting system[C]//Proceedings of the 22nd acm
sigkdd international conference on knowledge discovery and data mining. 2016: 785-794.
[13] Pedregosa F, Varoquaux G, Gramfort A, et al. Scikit-learn: Machine learning in Python[J].
Journal of Machine Learning Research, 2011, 12: 2825-2830.
[14] LeCun Y, Bengio Y, Hinton G. Deep learning[J]. Nature, 2015, 521(7553): 436-444.
[15] Krizhevsky A, Sutskever I, Hinton G E. Imagenet classification with deep convolutional neural
networks[J]. Communications of the ACM, 2017, 60(6): 84-90.
[16] Hochreiter S, Schmidhuber J. Long short-term memory[J]. Neural computation, 1997, 9(8): 1735-1780.
[17] Paszke A, Gross S, Massa F, et al. Pytorch: An imperative style, high-performance deep learning
library[J]. Advances in neural information processing systems, 2019, 32.
[18] Abadi M, Barham P, Chen J, et al. TensorFlow: a system for large-scale machine
learning[C]//12th USENIX symposium on operating systems design and implementation
(OSDI 16). 2016: 265-283.
[19] Jia Y, Shelhamer E, Donahue J, et al. Caffe: Convolutional architecture for fast feature
embedding[C]//Proceedings of the 22nd ACM international conference on Multimedia. 2014:
675-678.
[20] Fawcett T. An introduction to ROC analysis[J]. Pattern recognition letters, 2006, 27(8): 861-874.
[21] Ribeiro M T, Singh S, Guestrin C. "Why should I trust you?" Explaining the predictions of any
classifier[C]//Proceedings of the 22nd ACM SIGKDD international conference on knowledge
discovery and data mining. 2016: 1135-1144.
[22] Naessens H, Philip T, Piatek M, et al. Predicting flight routes with a Deep Neural Network in the
operational Air Traffic Flow and Capacity Management system[J]. EUROCONTROL
Maastricht Upper Area Control Centre, Maastricht Airport, The Netherlands, Tech. Rep, 2017.
[23] Shi Z, Xu M, Pan Q, et al. LSTM-based flight trajectory prediction[C]//2018 International Joint
Conference on Neural Networks (IJCNN). IEEE, 2018: 1-8.
[24] Ma L, Tian S. A hybrid CNN-LSTM model for aircraft 4D trajectory prediction[J]. IEEE access,
2020, 8: 134668-134680.
[25] Yang B, Tan X, Chen Z, et al. ATCSpeech: A multilingual pilot-controller speech corpus from
real air traffic control environment[J]. arXiv preprint arXiv:1911.11365, 2019.
[26] Helmke H, Kleinert M, Ohneiser O, et al. Machine learning of air traffic controller command
extraction models for speech recognition applications[C]//2020 AIAA/IEEE 39th Digital
Avionics Systems Conference (DASC). IEEE, 2020: 1-9.
[27] Shen Z, Wei Y. A high-precision feature extraction network of fatigue speech from air traffic
controller radiotelephony based on improved deep learning[J]. ICT Express, 2021, 7(4): 403-
413.
[28] Mollinga J, van Hoof H. An autonomous free airspace en-route controller using deep
reinforcement learning techniques[J]. arXiv preprint arXiv:2007.01599, 2020.
[29] Wang Z, Li H, Wang J, et al. Deep reinforcement learning based conflict detection and resolution
in air traffic control[J]. IET Intelligent Transport Systems, 2019, 13(6): 1041-1047.
[30] Pham D T, Tran N P, Alam S, et al. A machine learning approach for conflict resolution in dense
traffic scenarios with uncertainties[C]//ATM 2019, 13th USA/Europe Air Traffic
Management Research and Development Seminar. 2019.
[31] Herrema F, Treve V, Desart B, et al. A novel machine learning model to predict abnormal Runway
Occupancy Times and observe related precursors[C]//12th USA/Europe Air Traffic
Management Research and Development Seminar. 2017.
[32] Gui G, Zhou Z, Wang J, et al. Machine learning aided air traffic flow analysis based on aviation
big data[J]. IEEE Transactions on Vehicular Technology, 2020, 69(5): 4817-4826.
[33] Lin Y, Zhang J, Liu H. Deep learning based short-term air traffic flow prediction considering
temporal–spatial correlation[J]. Aerospace Science and Technology, 2019, 93: 105113.
[34] Malakis S, Psaros P, Kontogiannis T, et al. Classification of air traffic control scenarios using
decision trees: insights from a field study in terminal approach radar environment[J].
Cognition, Technology & Work, 2020, 22(1): 159-179.
[35] Gui G, Liu F, Sun J, et al. Flight delay prediction based on aviation big data and machine
learning[J]. IEEE Transactions on Vehicular Technology, 2019, 69(1): 140-150.
[36] Liu Y, Liu Y, Hansen M, et al. Using machine learning to analyze air traffic management actions:
Ground delay program case study[J]. Transportation Research Part E: Logistics and
Transportation Review, 2019, 131: 80-95.
[37] Zhang K, Liu Y, Wang J, et al. Tree-based airspace capacity estimation[C]//2020 Integrated