HF Evaluation Leaderboard

Shows MMLU, BigCodeBench, and ARC MC scores for the top text-generation models. Scores are pulled from each model's model-index metadata or, when not present there, from pull requests on the model's repository.
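
As a rough illustration of the model-index lookup, here is a minimal Python sketch. It assumes the metadata has already been parsed into the usual model card layout (a list of entries, each with `results` containing `dataset` and `metrics`), and that benchmarks are matched by exact dataset name. The function name `extract_scores`, the matching rule, and the sample values are all hypothetical and are not taken from the leaderboard's actual code.

```python
from __future__ import annotations

from typing import Any

# Benchmarks named in the description above.
BENCHMARKS = ("MMLU", "BigCodeBench", "ARC MC")


def extract_scores(model_index: list[dict[str, Any]]) -> dict[str, float]:
    """Return the first metric value found for each benchmark of interest.

    Assumption: benchmarks are identified by an exact match on dataset name.
    The real leaderboard may match on dataset type or another field.
    """
    scores: dict[str, float] = {}
    for entry in model_index:
        for result in entry.get("results", []):
            name = result.get("dataset", {}).get("name", "")
            # Skip datasets we do not display and ones already recorded.
            if name not in BENCHMARKS or name in scores:
                continue
            metrics = result.get("metrics", [])
            if metrics and "value" in metrics[0]:
                scores[name] = float(metrics[0]["value"])
    return scores


# Hypothetical metadata with placeholder values (not real results).
example_index = [
    {
        "name": "example-model",
        "results": [
            {
                "task": {"type": "text-generation"},
                "dataset": {"name": "MMLU", "type": "mmlu"},
                "metrics": [{"type": "accuracy", "value": 71.3}],
            },
            {
                "task": {"type": "text-generation"},
                "dataset": {"name": "ARC MC", "type": "arc"},
                "metrics": [{"type": "accuracy", "value": 64.0}],
            },
            {
                "task": {"type": "text-generation"},
                "dataset": {"name": "SomeOtherBench", "type": "other"},
                "metrics": [{"type": "accuracy", "value": 12.0}],
            },
        ],
    }
]

print(extract_scores(example_index))
```

A model with no entry for a benchmark is simply missing that key in the result, which is the case where a fallback to pull-request data would apply.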