prima.cpp: Speeding up 70B-scale LLM inference on low-resource everyday home clusters
Updated Apr 21, 2025 - C++
Official implementation of the ACM MM paper "Identity-Aware Attribute Recognition via Real-Time Distributed Inference in Mobile Edge Clouds". A distributed inference model for pedestrian attribute recognition with Re-ID in an MEC-enabled camera monitoring system. Joint training of pedestrian attribute recognition and Re-ID.
Source code of the paper "Private Collaborative Edge Inference via Over-the-Air Computation".