MMLU
The MMLU (Massive Multitask Language Understanding) leaderboard on the Papers with Code platform tracks the latest model rankings on this benchmark. The page lists the accuracy of models such as GPT and LLaMA across the benchmark's 57 subject tasks, links each result to its paper and code, and serves as a core reference for researchers and developers following the state of the art in AI language understanding.
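The accuracy figures the leaderboard reports reduce to multiple-choice scoring: each question has one gold answer, per-task accuracy is the fraction answered correctly, and an overall score can be taken as the unweighted mean over tasks. A minimal sketch of that computation (the example data below is illustrative, not drawn from the real benchmark):

```python
# Sketch of MMLU-style scoring: multiple-choice accuracy per task,
# macro-averaged over tasks. Example data is hypothetical.
from collections import defaultdict

def mmlu_accuracy(examples):
    """examples: iterable of (task, predicted_choice, gold_choice) tuples."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for task, pred, gold in examples:
        total[task] += 1
        correct[task] += int(pred == gold)
    per_task = {t: correct[t] / total[t] for t in total}
    macro = sum(per_task.values()) / len(per_task)  # unweighted mean over tasks
    return per_task, macro

examples = [
    ("abstract_algebra", "A", "A"),
    ("abstract_algebra", "C", "B"),
    ("anatomy", "D", "D"),
    ("anatomy", "B", "B"),
]
per_task, macro = mmlu_accuracy(examples)
print(per_task)  # {'abstract_algebra': 0.5, 'anatomy': 1.0}
print(macro)     # 0.75
```

Macro-averaging weights every task equally regardless of how many questions it contains, which matches the common convention for reporting MMLU scores across its subjects.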