MESS+ — Energy-Optimal LLM Inference 🔋🤖

MESS+ is a method from a NeurIPS 2024 Adaptive Foundation Models (AFM) workshop paper. It routes each request to the lowest-energy LLM that still meets a user-defined service-level guarantee (accuracy ≥ $\alpha$).
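
One way to read this guarantee is as a small assignment problem. The formulation below is a sketch under stated assumptions, not necessarily the paper's exact program: the symbols $x_{i,m}$ (1 if request $i$ is sent to model $m$), $E_m$ (energy per request for model $m$), $\hat{a}_{i,m}$ (predicted accuracy of model $m$ on request $i$), $N$ (number of requests), and $\mathcal{M}$ (the set of available models) are introduced here for illustration, and the guarantee is assumed to apply to the mean accuracy over the $N$ requests.

$$
\min_{x_{i,m} \in \{0,1\}} \; \sum_{i=1}^{N} \sum_{m \in \mathcal{M}} x_{i,m} E_m
\quad \text{s.t.} \quad
\frac{1}{N} \sum_{i=1}^{N} \sum_{m \in \mathcal{M}} x_{i,m} \hat{a}_{i,m} \ge \alpha,
\qquad
\sum_{m \in \mathcal{M}} x_{i,m} = 1 \;\; \forall i
$$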

Key Contributions

Adaptive model selection: Predicts per-query performance (accuracy, latency) with a lightweight fastText model.
Energy minimization: Solves a mixed-integer program to choose the most energy-efficient model that still satisfies the SLA.
Operational support: Supports batch inference and power monitoring (Jetson/x86), and ships notebooks for reproducing the experiments.
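
The selection step described above can be illustrated with a toy example. The sketch below is not the repository's implementation: it swaps the mixed-integer solver for plain exhaustive search over a tiny batch, and it assumes the guarantee applies to the mean predicted accuracy of the batch. The model names, energy figures, predicted accuracies, and the function `select_models` are all hypothetical.

```python
from itertools import product

# Hypothetical energy cost per request in joules (illustrative numbers only).
ENERGY_J = {"small": 1.0, "medium": 3.0, "large": 9.0}


def select_models(pred_acc, energy, alpha):
    """Pick one model per request, minimizing total energy subject to
    mean predicted accuracy >= alpha (assumed form of the guarantee).

    pred_acc: list of dicts, one per request, mapping model name to the
              predicted probability of a correct answer.
    Returns (assignment, total_energy), or (None, None) if no assignment
    satisfies the constraint.
    """
    if not pred_acc:
        return (), 0.0
    models = list(energy)
    best_assignment, best_energy = None, None
    # Exhaustive search: only viable for tiny batches and model zoos.
    for assignment in product(models, repeat=len(pred_acc)):
        mean_acc = sum(p[m] for p, m in zip(pred_acc, assignment)) / len(pred_acc)
        if mean_acc < alpha:
            continue  # violates the service-level guarantee
        total = sum(energy[m] for m in assignment)
        if best_energy is None or total < best_energy:
            best_assignment, best_energy = assignment, total
    return best_assignment, best_energy


# Hypothetical predictor outputs for three requests.
predictions = [
    {"small": 0.90, "medium": 0.95, "large": 0.99},
    {"small": 0.40, "medium": 0.80, "large": 0.95},
    {"small": 0.20, "medium": 0.50, "large": 0.90},
]
assignment, total_energy = select_models(predictions, ENERGY_J, alpha=0.8)
print(assignment, total_energy)
```

With these made-up numbers, the easy request goes to the small model and the hardest one to the large model. Exhaustive search grows exponentially with batch size, which is why a dedicated integer-programming solver is the more realistic choice at scale.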

Cite

@inproceedings{zhang2024messplus,
  title = {MESS+: Energy-Optimal Inference in Language Model Zoos with Service Level Guarantees},
  author = {Zhang, Ryan and Woisetschläger, Herbert and Wang, Shiqiang and Jacobsen, Hans-Arno},
  booktitle = {Adaptive Foundation Models Workshop @ NeurIPS 2024},
  year = {2024}
}

