Artifact for "Apparate: Rethinking Early Exits to Tame Latency-Throughput Tensions in ML Serving" [SOSP '24]

SysML-Princeton/apparate

Apparate: Rethinking Early Exits to Tame Latency-Throughput Tensions in ML Serving

This repository contains the source code for the SOSP '24 paper "Apparate: Rethinking Early Exits to Tame Latency-Throughput Tensions in ML Serving".

Please note that the arXiv version is not up to date with our SOSP submission. We will update the arXiv paper once the camera-ready version is finalized.

Getting Started

Apparate is implemented in Python. We have tested Apparate on Ubuntu 22.04 with Python 3.8.13.
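
As a quick sanity check before running anything, the sketch below compares the running interpreter against the tested Python release stated above. It is only an illustration: the helper name `matches_tested_python` and the choice to compare major and minor versions only are assumptions, not part of the Apparate codebase.

```python
import sys

# Version reported as tested in this README; other versions are untested,
# not necessarily unsupported.
TESTED_PYTHON = (3, 8, 13)


def matches_tested_python(version=None, tested=TESTED_PYTHON):
    """Return True if `version` shares major.minor with the tested release.

    Hypothetical helper for illustration; not provided by Apparate.
    """
    if version is None:
        version = sys.version_info[:3]
    return tuple(version[:2]) == tuple(tested[:2])


if __name__ == "__main__":
    current = sys.version_info[:3]
    if matches_tested_python(current):
        print("Python %d.%d.%d matches the tested 3.8 series." % current)
    else:
        print("Warning: running Python %d.%d.%d; "
              "this README reports testing on 3.8.13." % current)
```

A mismatch only prints a warning, since the README states what was tested rather than a hard requirement.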

Detailed instructions on how to reproduce the main results from our SOSP paper are in EXPERIMENTS.md.

References

@article{dai2023apparate,
  title={Apparate: Rethinking Early Exits to Tame Latency-Throughput Tensions in ML Serving},
  author={Dai, Yinwei and Pan, Rui and Iyer, Anand and Li, Kai and Netravali, Ravi},
  journal={arXiv preprint arXiv:2312.05385},
  year={2023}
}
