Large Model Evaluation - Sesame Pie AI

Ai-Ceping

Ai-Ceping is a large language model evaluation platform initiated by Professor Wang Haofen of Tongji University and guided by professors from several universities. It aims to provide authoritative, fair, and transparent collection and analysis of evaluation data.

Ai-Ceping, Tongji University, Large Model Evaluation, Wang Haofen

April 15, 2026 396 0

C-Eval ranking

A leaderboard that ranks the overall ability of large language models (LLMs) on multi-level, multi-disciplinary Chinese-language tasks.

Large Model Evaluation

April 15, 2026 447 0

MMBench

The official leaderboard page for MMBench, maintained by the OpenCompass community.

Large Model Evaluation

April 15, 2026 315 0

HELM

HELM (Holistic Evaluation of Language Models) is a large model evaluation framework introduced by Stanford University. Its methodology has three main components: scenarios, adaptation, and metrics. Each evaluation run specifies a scenario, an adaptation method (how the model is prompted for that scenario), and one or more metrics.

Large Model Evaluation, Open Source Project

April 15, 2026 473 0
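
The three-part run described in the HELM entry (a scenario, a way of prompting the model, and one or more metrics) can be illustrated with a toy Python sketch. This is not the HELM API: `RunSpec`, `run`, `exact_match`, and `toy_model` are hypothetical names made up for illustration, and the "model" is a stand-in function that always returns the same answer.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

# Hypothetical types and names throughout; this is NOT the HELM API.
Example = Tuple[str, str]  # (input text, reference answer)


@dataclass
class RunSpec:
    scenario: List[Example]                          # what to evaluate on
    adapt: Callable[[str], str]                      # how an input becomes a prompt
    metrics: Dict[str, Callable[[str, str], float]]  # how outputs are scored


def run(spec: RunSpec, model: Callable[[str], str]) -> Dict[str, float]:
    """Average each metric over all examples in the scenario."""
    totals = {name: 0.0 for name in spec.metrics}
    for text, reference in spec.scenario:
        output = model(spec.adapt(text))
        for name, metric in spec.metrics.items():
            totals[name] += metric(output, reference)
    return {name: total / len(spec.scenario) for name, total in totals.items()}


def exact_match(output: str, reference: str) -> float:
    """1.0 if the stripped output equals the reference, else 0.0."""
    return float(output.strip() == reference)


def toy_model(prompt: str) -> str:
    """Stand-in for a real model: always answers "4"."""
    return "4"


spec = RunSpec(
    scenario=[("2 + 2", "4"), ("3 + 5", "8")],
    adapt=lambda text: f"Q: {text}\nA:",
    metrics={"exact_match": exact_match},
)

scores = run(spec, toy_model)
print(scores)
```

Keeping the three parts separate means the same scenario can be re-run with a different prompting strategy or an extra metric without touching the other two.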