Anomaly detection in GNSS time series using machine learning methods
Student's Name: Haidus Oleh
Qualification Level: Master
Speciality: Space Geodesy
Institute: Institute of Geodesy
Mode of Study: full-time
Academic Year: 2023-2024
Language of Defence: Ukrainian
Abstract: Technological progress in global navigation satellite systems (GNSS) is improving both the quality and the quantity of geodetic data. This has led to the development of new methods for processing and analyzing geodetic time series from various space missions. Technological advances also make it possible to combine advanced mathematical models with high-performance software, so that GNSS data processing methods can be improved in a timely manner. Computing software is becoming more powerful and gives scientists tools to process and interpret large amounts of data more efficiently. The accumulation of data from GNSS stations that have been operating for decades calls for mathematical approaches to extracting anomalies that may not be visually distinguishable in long time series. For such long GNSS data series, developing algorithms that can identify simultaneous anomalies at several stations is therefore an urgent task. Detecting anomalies in GNSS data is one of the most promising methods for monitoring seismic activity. An anomaly is a deviation from normal behavior, and anomalies in GNSS data can be associated with seismic activity. This information can be used to identify patterns in station position changes that may be related to seismic activity. Accurate and reliable methods for detecting gaps in time series will help solve problems related to unknown anomalies in the data and ensure the correctness of further GNSS time series analysis. Anomalies can be caused by various factors, including seismic events, weather conditions, and man-made disasters. They can distort the results of data analysis, which can have serious consequences for security and the economy. Detecting seismic events is an important task that can help reduce the risk of loss of life and property damage.
Early detection of seismic events can provide enough time to evacuate people from the risk zone and to prevent or reduce damage. Current methods of detecting seismic events rely on seismographs that record vibrations of the earth’s crust. However, these methods have limitations, as they can be susceptible to noise and other interference. The machine learning method presented in this study is a promising approach to detecting seismic events. It is based on the Isolation Forest algorithm, which detects outliers in the data that may indicate potential anomalies. In the first stage of the proposed algorithm, GNSS data from one station are processed in the following steps: reading data from a text file, calculating distances, training the model, predicting anomalies, detecting anomalies, visualizing initial anomalies, visualizing recent anomalies, feature engineering, and adjusting model hyperparameters. In the second stage of the study, the following work was carried out: data selection, data processing, and anomaly detection using the Isolation Forest algorithm. Object of study: GNSS time series. Scope of research: anomalies in GNSS time series that may be caused by seismic activity. Goal of the research: development of a method for detecting anomalies in GNSS time series in order to identify known seismic events. Results of the study: Current trends in the development of machine learning algorithms in geodesy were investigated. A machine learning algorithm based on Isolation Forest was developed. The anomalies around a known seismic event were studied for one, two, and five GNSS stations. One significant advantage of the method is the concentration of anomalies around the seismic event, which makes it easier to detect patterns in GNSS station position changes at a considerable distance from the epicenter.
This may be important for further research and analysis of geodynamic phenomena in real time. The study showed that the method of anomaly detection based on GNSS data is promising for monitoring and predicting seismic events. However, to improve the accuracy of the method, additional research is needed to reduce the number of false positives.
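The processing steps listed in the abstract (calculating distances, training the model, predicting anomalies) can be illustrated with a minimal sketch. It is not the implementation used in the study. It assumes scikit-learn's IsolationForest, takes "distance" to mean the 3D displacement between consecutive epochs, and replaces the text-file input with synthetic station data. The function names, the noise level, the offset sizes, and the hyperparameters (n_estimators, contamination) are all hypothetical choices made for illustration.

```python
import numpy as np
from sklearn.ensemble import IsolationForest


def step_distances(positions):
    """3D distance between consecutive epochs (one value per epoch after the first)."""
    return np.linalg.norm(np.diff(positions, axis=0), axis=1)


def detect_anomalies(positions, contamination=0.01, seed=0):
    """Boolean mask over epochs; True where Isolation Forest flags an outlier."""
    features = step_distances(positions).reshape(-1, 1)
    # Assumed hyperparameters; the abstract does not state the values used.
    model = IsolationForest(n_estimators=200, contamination=contamination,
                            random_state=seed)
    labels = model.fit_predict(features)  # -1 = outlier, 1 = inlier
    mask = np.zeros(len(positions), dtype=bool)
    mask[1:] = labels == -1  # the first epoch has no preceding position
    return mask


def synthetic_station(rng, n_epochs, event_epoch, offset_mm):
    """Hypothetical daily east/north/up positions (mm) with a step at event_epoch."""
    positions = rng.normal(0.0, 2.0, size=(n_epochs, 3))  # 2 mm white noise
    positions[event_epoch:, 0] += offset_mm  # permanent offset in the east component
    return positions


rng = np.random.default_rng(42)
n_epochs, event_epoch = 400, 200
station_a = synthetic_station(rng, n_epochs, event_epoch, 30.0)
station_b = synthetic_station(rng, n_epochs, event_epoch, 25.0)

flagged_a = detect_anomalies(station_a)
flagged_b = detect_anomalies(station_b)

# Epochs flagged at both stations: a simple stand-in for "simultaneous anomalies".
simultaneous = np.flatnonzero(flagged_a & flagged_b)

print("station A anomalies:", np.flatnonzero(flagged_a))
print("station B anomalies:", np.flatnonzero(flagged_b))
print("flagged at both stations:", simultaneous)
```

Because the contamination parameter fixes the share of epochs that are flagged, each station reports a few outliers even where no event occurred. Intersecting the masks across stations is one simple way to suppress such isolated false positives, which is the limitation the abstract itself points out.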