ChemGraph Leaderboard

ChemGraph Leaderboard provides a reproducible evaluation of agentic AI frameworks and large language models (LLMs) for computational chemistry and materials science.

Models are evaluated daily on 14 chemistry queries grouped into 8 task categories:

Category	Queries	Description
SMILES Lookup	2	Convert molecule names to SMILES strings
Coordinate Gen	2	Generate 3D coordinates from SMILES
Geometry Opt	1	Geometry optimization with DFT/ML potentials
Vib Frequency	1	Vibrational frequency analysis
Thermochem	1	Thermochemical properties (enthalpy, entropy, Gibbs)
Dipole	1	Dipole moment calculation
Energy	3	Single-point energy and geometry opt with JSON extraction
Reaction Gibbs	3	Reaction Gibbs free energy for multi-step workflows

Each model's score reflects its ability to follow structured tool protocols, generate physically meaningful results, and reason across chemistry-specific contexts. An LLM judge grades each query as correct or incorrect (binary accuracy); numerical values are accepted when they fall within a 5% relative tolerance.
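To make the grading rule concrete, here is a minimal Python sketch of a 5% relative-tolerance check and of a per-category percentage built from binary verdicts. The leaderboard itself uses an LLM judge, so this is only an illustration under stated assumptions: the function names, the choice of the reference value as the denominator, and the handling of a zero reference are hypothetical, and the percentage formula is inferred from scores such as 66.67 and 33.33 in three-query categories.

```python
def within_relative_tolerance(predicted: float, reference: float,
                              rel_tol: float = 0.05) -> bool:
    """Return True if `predicted` is within `rel_tol` of `reference`.

    Assumption: the error is measured relative to |reference|.
    """
    if reference == 0.0:
        # Relative error is undefined at zero; require an exact match
        # here (an assumption, not a documented rule).
        return predicted == 0.0
    return abs(predicted - reference) <= rel_tol * abs(reference)


def category_score(verdicts: list[bool]) -> float:
    """Percentage of queries in one category judged correct."""
    return round(100.0 * sum(verdicts) / len(verdicts), 2)


# A value 4% off the reference passes; one 6% off does not.
print(within_relative_tolerance(104.0, 100.0))
print(within_relative_tolerance(106.0, 100.0))
# Two of three queries correct in a three-query category.
print(category_score([True, True, False]))
```

Under these assumptions, a three-query category such as Energy can only take the values 0, 33.33, 66.67, or 100, which matches the scores reported in the results.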

Use this leaderboard to explore how different models and agents perform across core chemistry tasks, from small-molecule modeling to multi-step reaction workflows.

{
  "headers": [
    "T",
    "Model",
    "Average ⬆️",
    "SMILES Lookup",
    "Coordinate Gen",
    "Geometry Opt",
    "Vib Frequency",
    "Thermochem",
    "Dipole",
    "Energy",
    "Reaction Gibbs",
    "Type",
    "Architecture",
    "Precision",
    "Hub License",
    "#Params (B)",
    "Hub ❤️",
    "Available on the hub",
    "Model sha"
  ],
  "data": [
    [
      1,
      "<a target=\"_blank\" href=\"https://huggingface.co/openai/gpt-4o\" style=\"color: var(--link-text-color); text-decoration: underline;text-decoration-style: dotted;\">openai/gpt-4o</a>",
      75, 50, 50, 100, 100, 100, 100, 66.67, 33.33,
      "", "?", "float16", "?", 0, 0, true, "main"
    ],
    [
      2,
      "<a target=\"_blank\" href=\"https://huggingface.co/openai/gpt-5.4\" style=\"color: var(--link-text-color); text-decoration: underline;text-decoration-style: dotted;\">openai/gpt-5.4</a>",
      58.33, 100, 100, 0, 0, 100, 100, 33.33, 33.33,
      "", "?", "float16", "?", 0, 0, true, "main"
    ],
    [
      3,
      "<a target=\"_blank\" href=\"https://huggingface.co/openai/gpt-5.2\" style=\"color: var(--link-text-color); text-decoration: underline;text-decoration-style: dotted;\">openai/gpt-5.2</a>",
      58.33, 100, 100, 0, 0, 100, 100, 33.33, 33.33,
      "", "?", "float16", "?", 0, 0, true, "main"
    ],
    [
      4,
      "<a target=\"_blank\" href=\"https://huggingface.co/openai/gpt-5.1\" style=\"color: var(--link-text-color); text-decoration: underline;text-decoration-style: dotted;\">openai/gpt-5.1</a>",
      54.17, 100, 100, 0, 0, 100, 100, 33.33, 0,
      "", "?", "float16", "?", 0, 0, true, "main"
    ],
    [
      5,
      "<a target=\"_blank\" href=\"https://huggingface.co/anthropic/claude-opus-4.6\" style=\"color: var(--link-text-color); text-decoration: underline;text-decoration-style: dotted;\">anthropic/claude-opus-4.6</a>",
      41.67, 100, 100, 0, 0, 100, 0, 33.33, 0,
      "", "?", "float16", "?", 0, 0, true, "main"
    ]
  ],
  "metadata": null
}
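The payload above pairs a flat "headers" list with positional rows, so a row is easiest to read once it is zipped with the headers. The following Python sketch does that for the first row, extracts the model name from the HTML anchor, and recomputes the "Average ⬆️" column as the unweighted mean of the eight category scores. That formula is an inference from the published numbers, not a documented rule; the helper names are hypothetical, and the anchor is shortened (the style attribute is dropped) to keep the example readable.

```python
import re

# Column names copied from the payload; only the first 11 are needed here.
HEADERS = [
    "T", "Model", "Average ⬆️",
    "SMILES Lookup", "Coordinate Gen", "Geometry Opt", "Vib Frequency",
    "Thermochem", "Dipole", "Energy", "Reaction Gibbs",
]
CATEGORY_COLUMNS = HEADERS[3:]  # the eight task categories


def model_name(anchor_html: str) -> str:
    """Pull the visible link text out of the HTML anchor in the Model cell."""
    match = re.search(r">([^<]+)</a>", anchor_html)
    return match.group(1) if match else anchor_html


def macro_average(row: dict) -> float:
    """Unweighted mean of the category scores (assumed averaging rule)."""
    scores = [row[name] for name in CATEGORY_COLUMNS]
    return round(sum(scores) / len(scores), 2)


# First data row, truncated to the columns listed in HEADERS.
raw_row = [
    1,
    '<a target="_blank" href="https://huggingface.co/openai/gpt-4o">openai/gpt-4o</a>',
    75, 50, 50, 100, 100, 100, 100, 66.67, 33.33,
]
row = dict(zip(HEADERS, raw_row))

print(model_name(row["Model"]))
print(macro_average(row), row["Average ⬆️"])
```

Because each category carries equal weight in this reading, a single-query category such as Dipole moves the average as much as a three-query category such as Reaction Gibbs.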