
PatrickLdA/milan-telecom-analysis


MTP-NT: A Mobile Traffic Predictor Enhanced by Neighboring and Transportation Data

MTP-NT by Bing

This code is a technical analysis of the paper "A multi-source dataset of urban life in the city of Milan and the Province of Trentino" and the development of a predictive model to forecast network traffic. The work was carried out during the master's program at the Federal University of Uberlândia.

Introduction

Techniques able to forecast the mobile network traffic in a city can feed data-driven applications, such as VNF orchestrators, optimizing resource allocation and increasing the capacity of mobile networks. Although several studies have addressed this problem, many considered neither the traffic relationship among city regions nor information from public transport stations, which may provide useful information to better anticipate the network traffic.

In this work, we present a new deep learning architecture to forecast the network traffic using representation learning and recurrent neural networks. The framework, named MTP-NT, has two major components: the first is responsible for learning from the time series of the region to be predicted, and the second learns from the time series of both neighboring regions and public transportation stations. The work also reviews the 5G infrastructure based on open 3GPP specifications to explore ways to implement the framework in a real architecture. Several experiments were conducted over a dataset from the city of Milan, as well as comparisons against widely adopted and state-of-the-art techniques. The results shown in this work demonstrate that the usage of public transport information contributes to improving the forecasts in central areas of the city, as well as in regions with aperiodic demands, such as tourist regions.

Thus, this research seeks to evaluate the performance of traffic forecasting models using public data, in order to validate the performance gain with the aggregation of public transport data. The aggregation of unconventional data can be a way of adding information to the model through input that has not been explored in the scope of this research area.

The development of MTP-NT was carried out during the master's program at the Federal University of Uberlândia. The slides used in the defense, presented on 11/21/2023, can be found in the file named defesa.pdf.

Getting Started

Before executing any of the files, install the dependencies listed in requirements.txt using pip and Anaconda.

Technical Overview

Database preprocessing

Before any model development, some preparatory work was done on the original database and on the collected data of public transport hubs.

misc/database_adapt.py: converts the original dataframe from the "one file per day" format to the "one region per day" format.
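The regrouping step can be pictured with a minimal sketch. It is not the code in misc/database_adapt.py: the function name regroup_by_region, the (square_id, timestamp, value) row layout, and the toy data are all assumptions made for illustration, and it only uses the Python standard library.

```python
from collections import defaultdict


def regroup_by_region(daily_files):
    """Regroup {day: [(square_id, timestamp, value), ...]} by region.

    Returns {square_id: [(timestamp, value), ...]}, each series sorted by
    timestamp. The row layout is a hypothetical simplification.
    """
    per_region = defaultdict(list)
    for day in daily_files:
        for square_id, timestamp, value in daily_files[day]:
            per_region[square_id].append((timestamp, value))
    # Sort each region's series so the samples are in chronological order,
    # regardless of the order in which the daily files were read.
    for series in per_region.values():
        series.sort(key=lambda row: row[0])
    return dict(per_region)


# Hypothetical toy input: two "daily files" covering two regions.
daily = {
    "2013-11-02": [(1, 200, 5.0), (2, 200, 7.0)],
    "2013-11-01": [(1, 100, 3.0), (2, 100, 4.0)],
}
by_region = regroup_by_region(daily)
print(by_region[1])  # [(100, 3.0), (200, 5.0)]
```

Grouping by region once up front means each model can later load a single contiguous time series instead of scanning every daily file.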

transport_modelling: contains the code to map the transport hubs in Milan. The sources used were the ATM website, the Wikipedia list of Milan Metro stations, and Google Maps Platform. All data was compiled in transport_modelling/public_transport_locations.csv.

Libs

Some code was developed to support model training (both MTP-NT and its competitors) at different stages:

Code used in model development:

  • libs/get_milano.py: a library built to get the requested data from the dataset.
  • libs/functions.py: NMAE (Normalized Mean Absolute Error) and MARE (Mean Absolute Relative Error) implementations.
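For reference, the two metrics can be sketched as follows. This is not the code in libs/functions.py: NMAE has more than one normalization in the literature, and the sketch assumes normalization by the mean of the observed values. MARE is taken as the mean of the absolute errors divided by the observed values. The function names and sample numbers are illustrative.

```python
def nmae(y_true, y_pred):
    """Mean absolute error divided by the mean of the observed values.

    The normalization by mean(y_true) is an assumption; other variants
    normalize by the range of the observations.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    mae = sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)
    return mae / (sum(y_true) / len(y_true))


def mare(y_true, y_pred):
    """Mean of |error| / |observed value|; undefined when an observation is 0."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(t - p) / abs(t) for t, p in zip(y_true, y_pred)) / len(y_true)


# Illustrative values only.
observed = [10.0, 20.0, 30.0]
predicted = [12.0, 18.0, 33.0]
print(round(nmae(observed, predicted), 4))  # 0.1167
print(round(mare(observed, predicted), 4))  # 0.1333
```

Both metrics are scale-free, which makes errors comparable between regions with very different traffic volumes.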

MTP-NT compiling

MTP-NT is the proposed model, compiled by the model_building.py script.

Some variables need attention for the script to work:

Lines 142 to 178 select the region ids where the model will be evaluated. At the end, the list of ids is stored in ids_to_use.

A plot of the selected ids is saved to check_selected_ids.jpg in line 191.

transport_hubs is a list that controls whether transport hub data is used, and neighbors controls which neighbor degrees will be compiled.
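One possible reading of "neighbor degree" is the ring of grid squares at a given distance from the target region. The sketch below illustrates that reading only. It assumes the Milan grid has 100 x 100 squares with ids from 1 to 10000 assigned row by row, and that degree means Chebyshev distance. Neither assumption is confirmed by model_building.py, and the function name is hypothetical.

```python
GRID_SIZE = 100  # assumed: 100 x 100 squares, ids 1..10000, row-major


def neighbors_of_degree(square_id, degree, grid_size=GRID_SIZE):
    """Return the ids on the ring at Chebyshev distance `degree`.

    Squares that would fall outside the grid are dropped, so border
    regions have fewer neighbors.
    """
    if degree < 1:
        raise ValueError("degree must be >= 1")
    row, col = divmod(square_id - 1, grid_size)
    ring = []
    for r in range(row - degree, row + degree + 1):
        for c in range(col - degree, col + degree + 1):
            on_ring = max(abs(r - row), abs(c - col)) == degree
            inside = 0 <= r < grid_size and 0 <= c < grid_size
            if on_ring and inside:
                ring.append(r * grid_size + c + 1)
    return ring


print(neighbors_of_degree(1, 1))          # [2, 101, 102]
print(len(neighbors_of_degree(5051, 1)))  # 8
print(len(neighbors_of_degree(5051, 2)))  # 16
```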

After model construction and compiling, the results are saved:

Competitors compiling

model_building_ARIMA.py: constructs ARIMA models for a selected number of regions.

model_building_HW.py: constructs Holt-Winters models for a selected number of regions.

model_building_LSTM.py: constructs LSTM models for a selected number of regions.

model_building_SARIMAX.py: constructs SARIMAX models for a selected number of regions.

Hourly compiling

After it has been compiled as described in Database preprocessing, the original database can be recompiled into hourly samples with the script misc/database_adapt_hourly.py.

After preprocessing, the resulting data can also be processed with the procedures explained in MTP-NT compiling and Competitors compiling.
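The hourly recompilation can be sketched as a simple bucketed aggregation. This is not the code in misc/database_adapt_hourly.py. It assumes samples are (timestamp in milliseconds, value) pairs and that values within an hour are summed rather than averaged. The function name to_hourly is hypothetical.

```python
from collections import defaultdict

HOUR_MS = 3_600_000  # milliseconds in one hour


def to_hourly(samples):
    """Sum (timestamp_ms, value) samples into (hour_start_ms, total) pairs.

    Summation is an assumption; averaging would be an equally plausible
    choice depending on how the traffic volume is defined.
    """
    totals = defaultdict(float)
    for ts_ms, value in samples:
        hour_start = ts_ms - ts_ms % HOUR_MS  # floor to the start of the hour
        totals[hour_start] += value
    return sorted(totals.items())


# Illustrative samples: two in the first hour, one in the second.
samples = [(0, 1.0), (600_000, 2.0), (3_600_000, 4.0)]
print(to_hourly(samples))  # [(0, 3.0), (3600000, 4.0)]
```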

Post-processing of results

Code used for validation and compilation of results:

License

This project is licensed under the Creative Commons 4.0.

Acknowledgments

Special thanks to the following contributors:

We would also like to express our gratitude to PPGCO-UFU for their support, and to Luis Miguel Contreras Murillo for supporting the research.
