Paper-list-resource-efficient-large-language-model

Target venues: system conferences (OSDI/SOSP/ATC/EuroSys/ASPLOS), network conferences (NSDI/SIGCOMM), mobile conferences (MobiCom/MobiSys/SenSys/UbiComp), AI conferences (NeurIPS/ACL/ICLR/ICML)

We will keep maintaining this list. :)

Note: We currently focus only on inference. We plan to include training work in the future.

Example: [Conference'year] Title, affiliation

Model

[ICLR'23] GPTQ: Accurate Post-Training Quantization for Generative Pre-trained Transformers, IST Austria

Input

Inference engine

Compiler

Hardware