
Large Model Evaluation

Ai-Ceping

Ai-Ceping is a large language model evaluation platform initiated by Professor Wang Haofen of Tongji University and guided by professors from several universities. It aims to provide authoritative, fair, and transparent collection and analysis of evaluation data.

April 15, 2026. Views: 396. Favorites: 0.

C-Eval ranking

The C-Eval ranking shows how different large language models (LLMs) compare in overall ability on Chinese-language tasks that span multiple difficulty levels and multiple disciplines.

April 15, 2026. Views: 447. Favorites: 0.

HELM

HELM (Holistic Evaluation of Language Models) is a large-model evaluation framework introduced by Stanford University. Its methodology has three main components: scenarios, adaptation, and metrics. Each evaluation run specifies a scenario, an adaptation method (how the model is prompted for that scenario), and one or more metrics.

April 15, 2026. Views: 473. Favorites: 0.
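
To make the HELM description concrete, here is a minimal Python sketch of an evaluation run built from a scenario, a prompting step, and a set of metrics. It is an illustration only and does not use HELM's actual API: `Scenario`, `zero_shot_adapter`, `exact_match`, `run_evaluation`, and `toy_model` are hypothetical names, and the two-question dataset is made up for the example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class Scenario:
    """Hypothetical scenario: a named set of (input, reference answer) pairs."""
    name: str
    instances: List[Tuple[str, str]]


def zero_shot_adapter(text: str) -> str:
    """Hypothetical prompting step: turn a raw input into a model prompt."""
    return f"Question: {text}\nAnswer:"


def exact_match(prediction: str, reference: str) -> float:
    """Metric: 1.0 if the answers match, ignoring case and outer whitespace."""
    return 1.0 if prediction.strip().lower() == reference.strip().lower() else 0.0


def run_evaluation(
    scenario: Scenario,
    adapter: Callable[[str], str],
    model: Callable[[str], str],
    metrics: Dict[str, Callable[[str, str], float]],
) -> Dict[str, float]:
    """Run one evaluation: prompt the model on every instance and average each metric."""
    if not scenario.instances:
        raise ValueError("scenario has no instances")
    totals = {name: 0.0 for name in metrics}
    for text, reference in scenario.instances:
        prediction = model(adapter(text))
        for name, metric in metrics.items():
            totals[name] += metric(prediction, reference)
    count = len(scenario.instances)
    return {name: total / count for name, total in totals.items()}


def toy_model(prompt: str) -> str:
    """Stand-in for a real LLM call; it only knows one answer."""
    return "Paris" if "France" in prompt else "unknown"


capitals = Scenario(
    name="capitals",
    instances=[
        ("What is the capital of France?", "Paris"),
        ("What is the capital of Japan?", "Tokyo"),
    ],
)
scores = run_evaluation(capitals, zero_shot_adapter, toy_model, {"exact_match": exact_match})
print(scores)
```

Keeping the three pieces as separate arguments means the same scenario can be re-scored with a different prompt format or an extra metric without touching the rest of the run.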