ChemGraph Leaderboard
ChemGraph Leaderboard provides a reproducible evaluation of agentic AI frameworks and large language models (LLMs) for computational chemistry and materials science.
Models are evaluated daily on 14 chemistry queries grouped into 8 task categories:
| Category | Queries | Description |
|---|---|---|
| SMILES Lookup | 2 | Convert molecule names to SMILES strings |
| Coordinate Gen | 2 | Generate 3D coordinates from SMILES |
| Geometry Opt | 1 | Geometry optimization with DFT/ML potentials |
| Vib Frequency | 1 | Vibrational frequency analysis |
| Thermochem | 1 | Thermochemical properties (enthalpy, entropy, Gibbs free energy) |
| Dipole | 1 | Dipole moment calculation |
| Energy | 3 | Single-point energy and geometry optimization with JSON extraction |
| Reaction Gibbs | 3 | Reaction Gibbs free energy for multi-step workflows |
Each model's score reflects its ability to follow structured tool protocols, generate physically meaningful results, and reason across chemistry-specific contexts. An LLM judge scores each result as correct or incorrect (binary accuracy), applying a 5% relative tolerance to numerical values.
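The scoring rule above can be illustrated numerically. The following is a minimal sketch, not the leaderboard's actual judging code: the judge is an LLM, so the function names, the choice of the reference value as the denominator, and the handling of a zero reference are all assumptions made for illustration.

```python
def within_relative_tolerance(predicted: float, reference: float,
                              rel_tol: float = 0.05) -> bool:
    """Return True if `predicted` lies within `rel_tol` of `reference`.

    Assumption: the tolerance is measured relative to the reference value.
    """
    if reference == 0.0:
        # Assumption: relative error is undefined for a zero reference,
        # so this sketch falls back to requiring an exact match.
        return predicted == 0.0
    return abs(predicted - reference) <= rel_tol * abs(reference)


def category_score(results: list[bool]) -> float:
    """Percentage of queries in a category judged correct (binary accuracy)."""
    return round(100.0 * sum(results) / len(results), 2)


# Hypothetical values: a prediction 0.4 away from a reference of -76.4
# is well inside the 5% band.
print(within_relative_tolerance(-76.0, -76.4))

# With binary scoring, a three-query category can only take four values.
print(category_score([True, True, False]))
```

Because each query is either correct or incorrect, a category with three queries can only score 0, 33.33, 66.67, or 100, and a single-query category can only score 0 or 100.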
Use this leaderboard to explore how different models and agents perform across core chemistry tasks, from small-molecule modeling to multi-step reaction workflows.
| T | Model | Average ⬆️ | SMILES Lookup | Coordinate Gen | Geometry Opt | Vib Frequency | Thermochem | Dipole | Energy | Reaction Gibbs |
|---|---|---|---|---|---|---|---|---|---|---|
| 1 | [openai/gpt-4o](https://huggingface.co/openai/gpt-4o) | 75 | 50 | 50 | 100 | 100 | 100 | 100 | 66.67 | 33.33 |
| 2 | [openai/gpt-5.4](https://huggingface.co/openai/gpt-5.4) | 58.33 | 100 | 100 | 0 | 0 | 100 | 100 | 33.33 | 33.33 |
| 3 | [openai/gpt-5.2](https://huggingface.co/openai/gpt-5.2) | 58.33 | 100 | 100 | 0 | 0 | 100 | 100 | 33.33 | 33.33 |
| 4 | [openai/gpt-5.1](https://huggingface.co/openai/gpt-5.1) | 54.17 | 100 | 100 | 0 | 0 | 100 | 100 | 33.33 | 0 |
| 5 | [anthropic/claude-opus-4.6](https://huggingface.co/anthropic/claude-opus-4.6) | 41.67 | 100 | 100 | 0 | 0 | 100 | 0 | 33.33 | 0 |

The remaining columns hold the same values for all five entries: Type is empty, Architecture and Hub License are "?", Precision is float16, #Params (B) and Hub ❤️ are 0, Available on the hub is true, and Model sha is main.
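The reported Average values are consistent with an unweighted mean of the eight category scores, so categories with one query weigh as much as categories with three. The sketch below reproduces two of the published averages under that assumption; it is not taken from the leaderboard's source, and the helper name `unweighted_average` is hypothetical.

```python
CATEGORIES = [
    "SMILES Lookup", "Coordinate Gen", "Geometry Opt", "Vib Frequency",
    "Thermochem", "Dipole", "Energy", "Reaction Gibbs",
]


def unweighted_average(scores: dict[str, float]) -> float:
    """Mean of the per-category scores, each category weighted equally.

    Assumption: this mirrors how the Average column is derived; the
    leaderboard itself does not state the formula.
    """
    return round(sum(scores[c] for c in CATEGORIES) / len(CATEGORIES), 2)


# Category scores copied from the leaderboard entries.
gpt_4o = dict(zip(CATEGORIES, [50, 50, 100, 100, 100, 100, 66.67, 33.33]))
claude_opus = dict(zip(CATEGORIES, [100, 100, 0, 0, 100, 0, 33.33, 0]))

print(unweighted_average(gpt_4o))       # 75.0, matching the reported 75
print(unweighted_average(claude_opus))  # 41.67, matching the reported 41.67
```

A query-weighted mean over all 14 queries would give different numbers, which is worth keeping in mind when comparing models whose strengths fall in single-query categories.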