Representation learning for trajectory prediction.
First, this framework allows comparing clusters derived from different trajectory segments relevant to the prediction task: predictors can be analyzed based on clusters derived from the observation segment, the future segment, or the entire trajectory.
Second, the clustering analysis makes it possible to study which representations are best suited for clustering, i.e., to find the input states (a.k.a. modalities) that map best to the cluster classes.
Finally, we study how contrastive learning can be used within this framework to force the embeddings of the input-state representations to be grouped like the cluster classes. To showcase these features, we created a synthetic dataset.
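As a minimal sketch of this idea (illustrative only; `cluster_contrastive_loss`, its shapes, and the PyTorch style are assumptions, not the repository's implementation), a supervised contrastive loss can pull embeddings that share a cluster class together and push the rest apart:

```python
# Minimal sketch, NOT the repository's implementation: a supervised
# contrastive loss over input-state embeddings, using cluster classes
# as the supervision signal for positives.
import torch
import torch.nn.functional as F

def cluster_contrastive_loss(embeddings: torch.Tensor,
                             cluster_ids: torch.Tensor,
                             temperature: float = 0.1) -> torch.Tensor:
    """embeddings: (N, D) input-state embeddings; cluster_ids: (N,) cluster classes."""
    z = F.normalize(embeddings, dim=1)           # cosine-similarity space
    sim = z @ z.t() / temperature                # (N, N) similarity logits
    n = z.size(0)
    self_mask = torch.eye(n, dtype=torch.bool, device=z.device)
    sim.masked_fill_(self_mask, float("-inf"))   # exclude self-pairs
    pos = (cluster_ids.unsqueeze(0) == cluster_ids.unsqueeze(1)) & ~self_mask
    log_prob = sim - torch.logsumexp(sim, dim=1, keepdim=True)
    # average log-likelihood of positives (same-cluster pairs) per anchor
    pos_counts = pos.sum(dim=1).clamp(min=1)
    loss = -(log_prob.masked_fill(~pos, 0.0).sum(dim=1) / pos_counts)
    return loss[pos.sum(dim=1) > 0].mean()       # skip anchors without positives

# Toy usage: embeddings of observed segments, labeled by trajectory cluster.
emb = torch.randn(8, 16)
ids = torch.tensor([0, 0, 1, 1, 2, 2, 0, 1])
print(cluster_contrastive_loss(emb, ids))
```

Under such an objective, the input-state embeddings organize into the same groups as the cluster classes, which is what the clustering analysis below measures.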
Project organization:

```
├── LICENSE            <- Open-source license if one is chosen
├── Makefile           <- Makefile with convenience commands like `make data` or `make train`
├── README.md          <- The top-level README for developers using this project.
├── data
│   ├── external       <- Data from third party sources.
│   ├── interim        <- Intermediate data that has been transformed.
│   ├── processed      <- The final, canonical data sets for modeling.
│   └── raw            <- The original, immutable data dump.
│
├── docs               <- A default mkdocs project; see www.mkdocs.org for details
│
├── models             <- Trained and serialized models, model predictions, or model summaries
│
├── notebooks          <- Jupyter notebooks. Naming convention is a number (for ordering),
│                         the creator's initials, and a short `-` delimited description, e.g.
│                         `1.0-jqp-initial-data-exploration`.
│
├── pyproject.toml     <- Project configuration file with package metadata for
│                         trajrep_learning and configuration for tools like black
│
├── references         <- Data dictionaries, manuals, and all other explanatory materials.
│
├── reports            <- Generated analysis as HTML, PDF, LaTeX, etc.
│   └── figures        <- Generated graphics and figures to be used in reporting
│
├── requirements.txt   <- The requirements file for reproducing the analysis environment, e.g.
│                         generated with `pip freeze > requirements.txt`
│
├── setup.cfg          <- Configuration file for flake8
│
└── trajrep_learning   <- Source code for use in this project.
    │
    ├── __init__.py    <- Makes trajrep_learning a Python module
    │
    ├── config.py      <- Store useful variables and configuration
    │
    ├── dataset.py     <- Scripts to download or generate data
    │
    ├── features.py    <- Code to create features for modeling
    │
    ├── modeling
    │   ├── __init__.py
    │   ├── predict.py <- Code to run model inference with trained models
    │   └── train.py   <- Code to train models
    │
    └── plots.py       <- Code to create visualizations
```
Set up the environment:

```
conda env create -f environment.yml
```
Run a model 5 times based on a config file (arguments: number of runs, path to the config file, and device):

```
python -m trajrep_learning.data_modeling.runners.n_runs 5 trajrep_learning/data_modeling/cfgs/prediction/synthetic/two_stage/prediction_tf.yaml cpu
```
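For intuition, an `n_runs`-style runner amounts to repeating training with different seeds and aggregating the metric; a toy sketch (everything here is a placeholder, not the repository's runner):

```python
# Hedged, illustrative sketch of a multi-run protocol: train n times with
# different seeds and report mean +/- std of the metric.
import random
import statistics

def train_once(cfg_path: str, device: str, seed: int) -> float:
    """Placeholder for the actual training entry point; returns a dummy score."""
    random.seed(seed)
    return random.uniform(0.3, 0.5)   # pretend this is an ADE-style metric

scores = [train_once("cfgs/prediction_tf.yaml", "cpu", seed) for seed in range(5)]
print(f"metric over 5 runs: {statistics.mean(scores):.3f} +/- {statistics.stdev(scores):.3f}")
```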
The different types of 1-stage models:
- baseline_X: baseline models, class-agnostic.
- cX: models conditioned either on observable human-labeled classes or on data-driven classes.
- combined_cl_X: models that combine contrastive learning and prediction (similar to ABC++).
- fm_X: models using feature matching (with class-based branches).
The different types of 2-stage models (see the sketch after this list):
- mtl_X: multi-task (classification and prediction) models, with one stage for representation learning and the other for learning the respective tasks.
- prediction_X: the two-stage prediction framework, comprising one stage for representation learning through contrastive learning and another for prediction.
- prediction_sup_X: the same two-stage prediction framework, but with class labels also given as input to the contrastive representation-learning stage.
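As a rough sketch of the two-stage idea (the `Encoder`/`Predictor` modules, the GRU/MLP choices, and all shapes below are illustrative assumptions, not the repository's architecture): stage 1 trains the encoder with a contrastive objective, and stage 2 freezes it and fits a predictor on top.

```python
# Hedged sketch of a two-stage prediction_X-style pipeline.
import torch
import torch.nn as nn

class Encoder(nn.Module):
    def __init__(self, in_dim=2, hidden=64, emb_dim=32):
        super().__init__()
        self.rnn = nn.GRU(in_dim, hidden, batch_first=True)
        self.head = nn.Linear(hidden, emb_dim)

    def forward(self, obs):                 # obs: (B, T_obs, 2)
        _, h = self.rnn(obs)
        return self.head(h[-1])             # (B, emb_dim)

class Predictor(nn.Module):
    def __init__(self, emb_dim=32, horizon=12):
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(emb_dim, 128), nn.ReLU(),
                                 nn.Linear(128, horizon * 2))
        self.horizon = horizon

    def forward(self, z):                   # z: (B, emb_dim)
        return self.mlp(z).view(-1, self.horizon, 2)

encoder, predictor = Encoder(), Predictor()
# Stage 1: train `encoder` with a contrastive loss (see the earlier sketch).
# Stage 2: freeze the encoder and train only the predictor.
for p in encoder.parameters():
    p.requires_grad = False
obs = torch.randn(4, 8, 2)                  # 4 observed trajectories, 8 steps
pred = predictor(encoder(obs))              # (4, 12, 2) future positions
```

In such a setup, freezing the encoder in stage 2 keeps the representation fixed, so prediction quality directly reflects how informative the learned embeddings are.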
Run the preprocessing:

```
python -m trajrep_learning.data_processing.preprocessing thor_magni data/external/thor_magni/ data/processed/thor_magni/ 0 2.5 20
```

where the arguments are, in order: the dataset name, the directory where the raw data (from thor-magni-tools) is stored, the output path, the minimum speed, the maximum speed, and the trajectory length.
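A hedged sketch of what the numeric arguments plausibly control (`filter_trajectory`, the sampling period, and the exact filtering rule are assumptions, not the repository's preprocessing):

```python
# Illustrative only: keep a trajectory if it is at least `traj_len` steps
# long and its average speed lies within [min_speed, max_speed].
import numpy as np

def filter_trajectory(xy: np.ndarray, dt: float, min_speed: float,
                      max_speed: float, traj_len: int) -> bool:
    """xy: (T, 2) positions in meters; dt: sampling period in seconds."""
    if len(xy) < traj_len:
        return False
    step_dists = np.linalg.norm(np.diff(xy, axis=0), axis=1)  # per-step distance
    avg_speed = step_dists.mean() / dt                        # m/s
    return min_speed <= avg_speed <= max_speed

# Toy check mirroring the command's values: min_speed=0, max_speed=2.5, traj_len=20
xy = np.cumsum(np.random.randn(25, 2) * 0.1, axis=0)
print(filter_trajectory(xy, dt=0.4, min_speed=0.0, max_speed=2.5, traj_len=20))
```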
Run the neighbor extraction:

```
python -m trajrep_learning.data_processing.extract_neighbors data/processed/thor_magni/ data/processed/thor_magni_neighbors/ 3.70 10 "file_name"
```

where the arguments are, in order: the input data path, the output path, the neighborhood radius, the number of neighbors, and the identifier that uniquely maps each trajectory to the file it belongs to.
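A hedged sketch of radius-based neighbor extraction (`nearest_neighbors` and its logic are assumptions, not the repository code): for an ego agent, keep up to the given number of nearest agents within the radius at a given timestamp:

```python
# Illustrative only: indices of up to k nearest agents within `radius` meters.
import numpy as np

def nearest_neighbors(ego: np.ndarray, others: np.ndarray,
                      radius: float, k: int) -> np.ndarray:
    """ego: (2,) position; others: (N, 2) positions at the same timestamp."""
    dists = np.linalg.norm(others - ego, axis=1)
    in_range = np.where(dists <= radius)[0]
    order = in_range[np.argsort(dists[in_range])]  # closest first
    return order[:k]

ego = np.array([0.0, 0.0])
others = np.random.uniform(-5, 5, size=(12, 2))
print(nearest_neighbors(ego, others, radius=3.70, k=10))
```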
Run clustering analysis (1), mapping the clusterer's inputs to the cluster class (essentially just learning the clustering):

```
python -m trajrep_learning.data_modeling.runners.unsupervised_cl_analysis.n_runs_baseline 5 trajrep_learning/data_modeling/cfgs/prediction/thor_magni/two_stage/prediction_tf.yaml cpu "cluster_input->cluster_class" Scenario_2 Scenario_3
```

Run clustering analysis (2), mapping the predictor's input to the cluster class:

```
python -m trajrep_learning.data_modeling.runners.unsupervised_cl_analysis.n_runs_baseline 5 trajrep_learning/data_modeling/cfgs/prediction/thor_magni/two_stage/prediction_tf.yaml cpu "predictor_input->cluster_class"
```

Run clustering analysis (3), mapping the contrastive-learning embeddings to the cluster class:

```
python -m trajrep_learning.data_modeling.runners.unsupervised_cl_analysis.n_runs_embeddings 5 trajrep_learning/data_modeling/cfgs/prediction/thor_magni/two_stage/prediction_tf.yaml cpu "embeddings->cluster_class"
```
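Conceptually, all three analyses fit a classifier from some representation to the data-driven `cluster_class`. A toy scikit-learn sketch (the synthetic features and all names are illustrative, not the repository pipeline):

```python
# Illustrative only: cluster full trajectories to obtain `cluster_class`,
# then check how well a given input state (e.g., the observed segment,
# the predictor's input, or the CL embeddings) maps to it.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
full_traj_feats = rng.normal(size=(200, 16))    # features of full trajectories
obs_feats = full_traj_feats[:, :8] + 0.1 * rng.normal(size=(200, 8))

cluster_class = KMeans(n_clusters=4, n_init=10, random_state=0).fit_predict(full_traj_feats)
acc = cross_val_score(LogisticRegression(max_iter=1000), obs_feats, cluster_class, cv=5)
print("input -> cluster_class accuracy:", acc.mean())
```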
It is also possible to run k-fold cross-validation on a dataset. For instance, for (1):

```
python -m trajrep_learning.data_modeling.runners.unsupervised_cl_analysis.k_fold_baseline 5 trajrep_learning/data_modeling/cfgs/prediction/thor_magni_act/two_stage/prediction_rnn.yaml cpu "cluster_input->cluster_class" Scenario_1 Scenario_2 Scenario_3 Scenario_4 Scenario_5
```

where the first argument (5) is the number of folds and the trailing arguments are the file names to consider for cross-validation.
Analogously for (3), only the analysis type changes:

```
python -m trajrep_learning.data_modeling.runners.unsupervised_cl_analysis.k_fold_embeddings 5 trajrep_learning/data_modeling/cfgs/prediction/thor_magni_act/two_stage/prediction_rnn.yaml cpu "embeddings->cluster_class" Scenario_1 Scenario_2 Scenario_3 Scenario_4 Scenario_5
```
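For intuition, grouping folds by file can be sketched with scikit-learn's `GroupKFold`, so that all trajectories from one scenario file land in the same fold (illustrative only; the repository's runner handles the splitting itself via the file-name arguments):

```python
# Illustrative only: one scenario file held out per fold.
import numpy as np
from sklearn.model_selection import GroupKFold

X = np.random.randn(50, 8)
files = np.repeat(["Scenario_1", "Scenario_2", "Scenario_3",
                   "Scenario_4", "Scenario_5"], 10)
for train_idx, test_idx in GroupKFold(n_splits=5).split(X, groups=files):
    print(sorted(set(files[test_idx])))   # the held-out scenario for this fold
```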



