The Gen AI Traffic Modelling Framework (GenAI-TMF) enables the characterization of GenAI LLM and VLM traffic patterns, network QoS requirements, and computing requirements. The framework includes a set of scripts for running experiments with various GenAI LLM and VLM models, using different prompts and datasets, under varying network conditions (e.g., latency, packet loss, jitter) and computing conditions (e.g., host, CPU, RAM, power levels).
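For illustration only, the sketch below shows one common way such "varying network conditions" can be emulated on Linux, using `tc netem` to add latency, jitter, and packet loss. This is a generic example, not the framework's own mechanism (experiment scenarios are driven through the AdvantEdge platform); the interface name and impairment values are placeholders.

```python
# Illustrative sketch: emulate network impairments with Linux tc/netem.
# Interface name and values are placeholders; requires root privileges.
import subprocess

IFACE = "eth0"  # placeholder interface name


def set_netem(delay_ms: int = 50, jitter_ms: int = 10, loss_pct: float = 1.0) -> None:
    """Apply delay, jitter, and packet loss to IFACE."""
    subprocess.run(
        ["tc", "qdisc", "replace", "dev", IFACE, "root", "netem",
         "delay", f"{delay_ms}ms", f"{jitter_ms}ms",
         "loss", f"{loss_pct}%"],
        check=True,
    )


def clear_netem() -> None:
    """Remove the netem qdisc and restore normal conditions."""
    subprocess.run(["tc", "qdisc", "del", "dev", IFACE, "root"], check=False)


if __name__ == "__main__":
    set_netem(delay_ms=100, jitter_ms=20, loss_pct=0.5)
    # ... run an experiment here ...
    clear_netem()
```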
Documentation for setting up the experimentation platform environment and devices, running experiments, and analyzing the experimentation results.
Experimentation results.
YAML configuration defining the demo scenarios for experimentation using the AdvantEdge platform.
Scripts for configuring the experimentation platform environment and devices.
- `download_hf_dataset_locally.py`: Downloads datasets from HuggingFace into a specified local directory.
- `download_hf_model_locally.py`: Downloads models from HuggingFace into a specified local directory (see the sketch after this list).
- `quantize_model.py`: Converts and optimizes models into lower-precision formats for faster inference.
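As a minimal sketch of the kind of download these scripts perform, the example below uses `huggingface_hub.snapshot_download` and `datasets.load_dataset` (both libraries appear in the dependency table below). The model and dataset identifiers and the target directories are placeholders, not the scripts' actual defaults.

```python
# Minimal sketch: download a model and a dataset from HuggingFace into
# local directories. Identifiers and paths are placeholders.
from huggingface_hub import snapshot_download
from datasets import load_dataset

# Download the full model repository (weights, tokenizer, config) to a local path.
model_path = snapshot_download(
    repo_id="Qwen/Qwen2.5-0.5B-Instruct",          # placeholder model id
    local_dir="./models/qwen2.5-0.5b-instruct",
)

# Download a dataset and cache it under a local directory.
dataset = load_dataset(
    "squad",                                        # placeholder dataset id
    cache_dir="./datasets/squad",
)

print(model_path)
print(dataset)
```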
Scripts for configuring and running experiments.
- `start_model_server.sh`: The detailed script that starts the model inference server.
- `run_model_wrapper.sh`: Wrapper around `start_model_server.sh` that runs models with predefined parameters.
- `device.conf`: Configuration file specifying device names, IP addresses, usernames, and passwords.
- `prepare_setup.py`: Prepares the environment by creating experiment directories and starting the command server.
- `command_server.py`: Backend server that listens for experiment control commands and executes them on the host inside a specified Conda environment.
- `advantedge_api.py`: Module for interacting with and configuring the AdvantEdge platform.
- `openapi_ui.py`: Serves an OpenAPI-based user interface for testing model inference.
- `benchmark.py`: Runs performance benchmarks across different model configurations and hardware setups (see the sketch after this list).
- `modify_cpu_state.sh`: Brings a specified number of CPUs offline.
- `run_with_conda.sh`: Executes scripts within a specified Conda environment to ensure reproducible dependencies; called by `command_server.py`.
- `params.py`: Centralized configuration parameters for experiments.
- `poc_utils.py`: Utility functions and helpers shared across the Python modules.
- `launch_experiments.py`: Automates launching multiple experiment runs with varying configurations.
- `process_expr_results.py`: Parses raw experiment outputs and aggregates the results.
- `plot_results.py`: Generates visualizations (plots and charts) from processed experiment data using Seaborn and Matplotlib.
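The sketch below illustrates the kind of measurement a benchmark against a served model can take: it streams one chat completion from an OpenAI-compatible endpoint (such as a local vLLM server) and records time-to-first-token and end-to-end latency. The endpoint URL, model name, prompt, and the specific metrics are assumptions for illustration; the framework's own `benchmark.py` may measure different quantities.

```python
# Illustrative benchmark sketch: stream one chat completion from an
# OpenAI-compatible endpoint and record time-to-first-token (TTFT) and
# total latency. Endpoint, model name, and prompt are placeholders.
import time
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")  # placeholder endpoint


def measure_once(prompt: str, model: str = "my-served-model") -> dict:
    start = time.perf_counter()
    first_token_at = None
    chunks = []
    stream = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
        max_tokens=128,
        stream=True,
    )
    for event in stream:
        if not event.choices:
            continue
        delta = event.choices[0].delta.content
        if delta:
            if first_token_at is None:
                first_token_at = time.perf_counter()
            chunks.append(delta)
    end = time.perf_counter()
    return {
        "ttft_s": (first_token_at or end) - start,  # time to first token
        "total_s": end - start,                     # end-to-end latency
        "output_chars": sum(len(c) for c in chunks),
    }


if __name__ == "__main__":
    print(measure_once("Describe the weather in one sentence."))
```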
Name | Version | License | URL |
---|---|---|---|
Flask | 3.1.2 | BSD-3-Clause | https://github.com/pallets/flask/ |
bitsandbytes | 0.45.3 | MIT License | https://github.com/bitsandbytes-foundation/bitsandbytes |
datasets | 4.0.0 | Apache Software License | https://github.com/huggingface/datasets |
fastapi | 0.119.0 | MIT License | https://github.com/fastapi/fastapi |
flashinfer-python | 0.4.0 | Apache-2.0 | https://github.com/flashinfer-ai/flashinfer |
gevent | 25.5.1 | MIT | http://www.gevent.org/ |
gevent | 25.9.1 | MIT | http://www.gevent.org/ |
gradio | 5.49.1 | Apache-2.0 | https://github.com/gradio-app/gradio |
huggingface-hub | 0.35.3 | Apache Software License | https://github.com/huggingface/huggingface_hub |
llmcompressor | 0.8.1 | Apache Software License | https://github.com/vllm-project/llm-compressor |
locust | 2.41.6 | MIT License | https://locust.io/ |
matplotlib | 3.10.7 | Python Software Foundation License | https://matplotlib.org |
ninja | 1.13.0 | Apache Software License; BSD License | http://ninja-build.org/ |
numpy | 2.2.6 | BSD License | https://numpy.org |
openai | 2.3.0 | Apache Software License | https://github.com/openai/openai-python |
orjson | 3.11.3 | Apache Software License; MIT License | https://github.com/ijl/orjson |
pandas | 2.3.3 | BSD License | https://pandas.pydata.org |
pillow | 11.3.0 | MIT-CMU | https://python-pillow.github.io |
pyarrow | 21.0.0 | Apache Software License | https://arrow.apache.org/ |
requests | 2.32.5 | Apache Software License | https://requests.readthedocs.io |
seaborn | 0.13.2 | BSD License | https://github.com/mwaskom/seaborn |
torch | 2.8.0 | BSD License | https://pytorch.org/ |
torchaudio | 2.8.0 | BSD License | https://github.com/pytorch/audio |
torchvision | 0.23.0 | BSD | https://github.com/pytorch/vision |
transformers | 4.56.2 | Apache Software License | https://github.com/huggingface/transformers |
triton | 3.4.0 | MIT License | https://github.com/triton-lang/triton/ |
uvicorn | 0.37.0 | BSD License | https://uvicorn.dev/ |
vllm | 0.11.1 | Apache-2.0 | https://github.com/vllm-project/vllm |
xformers | 0.0.32.post1 | BSD License | https://facebookresearch.github.io/xformers/ |