
Commit 9e9d795

cleaned up folder structure

committed · 1 parent 599865a · commit 9e9d795


51 files changed: +187 −355 lines changed

README.md

+184 −2
@@ -1,7 +1,189 @@
# RFML

This repo provides a pipeline for working with RF datasets: labeling them and training both IQ-based and spectrogram-based models. The SigMF standard is used for managing RF data and the labels/annotations on the data. The repo also makes use of the TorchSig framework for performing RF-related augmentation of the data, which helps make the trained models more robust and functional in the real world.
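For reference, SigMF annotations live in a recording's `.sigmf-meta` file as JSON objects. A minimal, hand-written illustration (the field values here are made up) might look like:

```json
{
  "annotations": [
    {
      "core:sample_start": 1000000,
      "core:sample_count": 20000,
      "core:freq_lower_edge": 2400000000,
      "core:freq_upper_edge": 2420000000,
      "core:label": "mavic3_video"
    }
  ]
}
```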
## Prerequisites

### Poetry

Follow the instructions here to install Poetry: https://python-poetry.org/docs/#installation
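At the time of writing, the linked instructions boil down to running the official installer (check the docs above for the current command):

```bash
curl -sSL https://install.python-poetry.org | python3 -
```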
### TorchSig

Download TorchSig from [GitHub](https://github.com/TorchDSP/torchsig) and then use Poetry to install it. The most recent version of TorchSig is not supported; version 0.4.1 should be used instead:

```bash
git clone https://github.com/TorchDSP/torchsig.git
cd torchsig
git checkout 8049b43
```
Make sure you have the Poetry environment activated, using `poetry shell`, then run:

```bash
poetry add ./torchsig
```

(Update the path to point to the directory where TorchSig was cloned.)
### Torch Model Archiver

Install the Torch Model Archiver:

```bash
sudo pip install torch-model-archiver
```

More information about this tool is available here:
https://github.com/pytorch/serve/blob/master/model-archiver/README.md
### Inspectrum (optional)

This utility is useful for inspecting SigMF files and the annotations that the autolabeling scripts produce.

https://github.com/miek/inspectrum
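Once built, Inspectrum can open a SigMF recording directly; for example, with a hypothetical capture file:

```bash
inspectrum recording.sigmf-data
```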
## Start Virtual Env

To start up the virtual environment with all of the Python modules configured, run the following:
```bash
poetry install
poetry shell
jupyter notebook
```
# Building a Model

## Approach

Our current approach is to capture samples of the background RF environment and to also isolate signals of interest and capture samples of each of those signals. The same label is applied to all of the signals present in the background environment samples; we use this to essentially teach the model to ignore those signals. For this to work, it is important that none of the signals of interest are present in the background captures. Since it is really tough these days to find an RF-free environment, we have built a mini Faraday cage enclosure by lining the inside of a Pelican case with foil. There are lots of instructions, like [this one](https://mosequipment.com/blogs/blog/build-your-own-faraday-cage), available online if you want to build your own. With this, the signal will be very strong, so make sure you adjust the SDR's gain appropriately.
## Labeling IQ Data

The scripts in the [label_scripts](./label_scripts/) directory use signal processing to automatically label IQ data. The scripts look at the signal power to detect when a signal is present in the IQ data. When a signal is detected, the script looks at the frequencies for that set of samples and finds the upper and lower bounds.
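Conceptually, the power-based detection step works something like the sketch below. This is a simplified illustration using assumed names (`iq`, `detect_bursts`), not the actual implementation in the label scripts:

```python
import numpy as np

def detect_bursts(iq, avg_window_len=256, threshold_db=-58):
    """Return (start, stop) sample-index pairs where averaged power exceeds the threshold."""
    # Instantaneous power of the complex IQ samples
    power = np.abs(iq) ** 2
    # Moving average over avg_window_len samples, converted to dB
    window = np.ones(avg_window_len) / avg_window_len
    avg_power_db = 10 * np.log10(np.convolve(power, window, mode="same") + 1e-12)
    # Mark regions above the threshold and find their rising/falling edges
    # (edge cases at the array boundaries are ignored for brevity)
    active = avg_power_db > threshold_db
    edges = np.diff(active.astype(np.int8))
    starts = np.flatnonzero(edges == 1) + 1
    stops = np.flatnonzero(edges == -1) + 1
    return list(zip(starts, stops))
```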
### Tuning Autolabeling

In the labeling scripts, the settings for autolabeling need to be tuned for the type of signals that were collected.
```python
annotation_utils.annotate(
    f,
    label="mavic3_video",  # The label that is applied to all of the matching annotations
    avg_window_len=256,  # The number of samples over which to average signal power
    avg_duration=0.25,  # The number of seconds, from the start of the recording, used to automatically calculate the SNR threshold; if None, all of the samples will be used
    debug=False,
    estimate_frequency=True,  # Whether the frequency bounds for an annotation should be calculated; must be enabled if you use min_bandwidth/max_bandwidth
    spectral_energy_threshold=0.95,  # Percentage used to determine the upper and lower frequency bounds for an annotation
    force_threshold_db=-58,  # Manually sets the threshold used for detecting a signal and creating an annotation; if None, the automatic threshold calculation is used instead
    overwrite=False,  # If True, any existing annotations in the .sigmf-meta file will be removed
    min_bandwidth=16e6,  # The minimum bandwidth (in Hz) of a signal to annotate
    max_bandwidth=None,  # The maximum bandwidth (in Hz) of a signal to annotate
    min_annotation_length=10000,  # The minimum length, in samples, for a signal to be annotated. This is directly related to the sample rate the signal was captured at and does not take bandwidth into account: 10000 samples at 20,000,000 samples per second corresponds to a minimum transmission length of 0.0005 seconds
    # max_annotations=500,  # The maximum number of annotations to automatically add
    dc_block=True,  # De-emphasize the DC spike when trying to calculate the frequencies for a signal
)
```
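If you have many recordings to label, the same call can be applied in a loop. This is a hypothetical sketch: it assumes `annotate()` accepts the path to each `.sigmf-meta` file as its first argument (the directory name is also made up):

```python
from pathlib import Path

import annotation_utils

# Apply the same tuned settings to every recording in a capture directory
for meta_file in Path("data/samples/mavic-30db").glob("*.sigmf-meta"):
    annotation_utils.annotate(
        str(meta_file),
        label="mavic3_video",
        avg_window_len=256,
        avg_duration=0.25,
        estimate_frequency=True,
        spectral_energy_threshold=0.95,
        min_bandwidth=16e6,
        min_annotation_length=10000,
        dc_block=True,
    )
```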
### Tips for Tuning Autolabeling

#### Force Threshold dB

![low threshold](./images/low_threshold.png)

If you see annotations where harmonics or low-power, unintentional signals are getting selected, try setting `force_threshold_db`. The automatic threshold calculation may be selecting a value that is too low. Find a value for `force_threshold_db` where it selects the intended signals and ignores the low-power ones.
#### Spectral Energy Threshold

![spectral energy](./images/spectral_energy.png)

If the frequency bounds are not lining up with the top or bottom of a signal, raise the `spectral_energy_threshold`. Sometimes a setting as high as 0.99 is required.
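For intuition, `spectral_energy_threshold=0.95` means the frequency bounds are chosen so that they contain roughly 95% of the signal's spectral energy. A simplified sketch of that calculation (assumed names, not the repo's actual code):

```python
import numpy as np

def freq_bounds(iq, sample_rate, spectral_energy_threshold=0.95):
    """Return (lower, upper) frequency offsets (Hz) containing the requested energy fraction."""
    # Power spectrum with DC centered
    psd = np.abs(np.fft.fftshift(np.fft.fft(iq))) ** 2
    freqs = np.fft.fftshift(np.fft.fftfreq(len(iq), d=1.0 / sample_rate))
    # Normalized cumulative energy across frequency
    cum = np.cumsum(psd) / np.sum(psd)
    # Trim an equal energy tail off each side of the spectrum
    tail = (1.0 - spectral_energy_threshold) / 2.0
    lower = freqs[np.searchsorted(cum, tail)]
    upper = freqs[min(np.searchsorted(cum, 1.0 - tail), len(freqs) - 1)]
    return lower, upper
```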
#### Skipping "small" Signals

![small signals](./images/min_annotation.png)

Some tuning is needed for signals that have a short transmission duration and/or limited bandwidth. Here are a couple of things to try if they are getting skipped:
- `min_annotation_length` is the minimum number of samples for an annotation. If the signal has fewer samples than this, it will not be annotated. Try lowering this (see the example after this list).
- The `avg_duration` setting may be too long, causing the signal to be averaged into the noise. Try lowering this.
- `min_bandwidth` is the minimum bandwidth (in Hz) for a signal to be detected. If this value is too high, signals with less bandwidth will be ignored. Try lowering this.
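For example, to pick a `min_annotation_length` from a desired minimum transmission duration (illustrative numbers matching the comment in the snippet above):

```python
sample_rate = 20_000_000      # samples per second (20 Msps)
min_duration_s = 0.0005       # shortest transmission you want to keep
min_annotation_length = int(min_duration_s * sample_rate)
print(min_annotation_length)  # 10000 samples
```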
## Training a Model

After you have finished labeling your data, the next step is to train a model on it. This repo makes it easy to train both IQ-based and spectrogram-based models from SigMF data.

### Configure

This repo provides an automated script for training and evaluating models. To use it, configure the [run_experiments.py](./run_experiments.py) file to point to the data you want to use and set the training parameters:
```python
"experiment_0": {  # The key needs to be `experiment_` followed by an increasing number
    "experiment_name": "experiment_1",  # A name to refer to the experiment
    "class_list": ["mavic3_video", "mavic3_remoteid", "environment"],  # The labels that are present in the sigmf-meta files
    "train_dir": ["/home/iqt/lberndt/gamutrf-depoly/data/samples/mavic-30db", "/home/iqt/lberndt/gamutrf-depoly/data/samples/mavic-0db", "/home/iqt/lberndt/gamutrf-depoly/data/samples/environment"],  # The SigMF files to use, including the path to the file
    "iq_epochs": 10,  # Number of epochs for IQ training; if 0 or None, it will be skipped
    "spec_epochs": 10,  # Number of epochs for spectrogram training; if 0 or None, it will be skipped
    "notes": "DJI Mavic3 Detection"  # Notes to your future self
}
```
Once you have the **run_experiments.py** file configured, run it:

```bash
python3 run_experiments.py
```
Once the training has completed, it will print out the log location, model accuracy, and the location of the best checkpoint:

```bash
I/Q TRAINING COMPLETE


Find results in experiment_logs/experiment_1/iq_logs/08_08_2024_09_17_32

Total Accuracy: 98.10%
Best Model Checkpoint: /home/iqt/lberndt/rfml-dev-1/rfml-dev/lightning_logs/version_5/checkpoints/experiment_logs/experiment_1/iq_checkpoints/checkpoint.ckpt
```
### Convert Model

Once you have a trained model, you need to convert it into a portable format that can easily be served by TorchServe. To do this, use **convert_model.py**:

```bash
python3 convert_model.py --model_name=drone_detect --checkpoint=/home/iqt/lberndt/rfml-dev-1/rfml-dev/lightning_logs/version_5/checkpoints/experiment_logs/experiment_1/iq_checkpoints/checkpoint.ckpt
```

This will export a **_torchscript.pt** file. Then package it for TorchServe with the Torch Model Archiver:

```bash
torch-model-archiver --force --model-name drone_detect --version 1.0 --serialized-file weights/drone_detect_torchscript.pt --handler custom_handlers/iq_custom_handler.py --export-path models/ -r custom_handlers/requirements.txt
```
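The resulting `.mar` archive in **models/** can then be served with TorchServe. A hypothetical invocation (adjust the model name and paths to your setup):

```bash
torchserve --start --ncs --model-store models/ --models drone_detect=drone_detect.mar
```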
## Files

[annotation_utils.py](annotation_utils.py) - DSP-based automated labeling tools

[auto_label.py](auto_label.py) - CV-based automated labeling tools

[data.py](data.py) - RF data operations tool

[experiment.py](experiment.py) - Class to manage experiments

[models.py](models.py) - Class for I/Q models (based on TorchSig)

[run_experiments.py](run_experiments.py) - Experiment configurations and run script

[sigmf_pytorch_dataset.py](sigmf_pytorch_dataset.py) - PyTorch-style dataset class for SigMF data (based on TorchSig)

[spectrogram.py](spectrogram.py) - Spectrogram tools

[test_data.py](test_data.py) - Tests for data.py (might be outdated)

[train_iq.py](train_iq.py) - Training script for I/Q models

[train_spec.py](train_spec.py) - Training script for spectrogram models

[zst_parse.py](zst_parse.py) - ZST file parsing tool, for GamutRF-style filenames

The [experiments](./experiments/) directory contains various experiments we have conducted during development.
189+
File renamed without changes.
File renamed without changes.
File renamed without changes.

rfml-dev/data.py → data.py (file renamed without changes)


rfml-dev/models.py → models.py (file renamed without changes)


models/NOTE.md

+1
@@ -0,0 +1 @@
Note: Models packaged for TorchServe will be stored here

pyproject.toml

+1 −1

@@ -19,7 +19,7 @@ python-on-whales = "^0.69.0"
 sigmf = "^1.2.0"
 tqdm = "^4.66.4"
 cupy = "^13.2.0"
-torchsig = {path = "torchsig"}
+

 [tool.poetry.group.dev.dependencies]
 jupyter = "^1.0.0"
