Replay: https://drive.google.com/file/d/15ycxnqnx-d8nVuAabASGFo0pnJjyke-8/view?usp=drive_link
Agenda
- Llama 2 7B and 13B model conversion
- Docker Compose: Triton + Prometheus + Node Exporter
- LLM benchmark analysis (MMLU)
- Latency / token analysis
Notes