IneqMath Leaderboard

Welcome to the IneqMath Leaderboard!

This is the official leaderboard for IneqMath, an expert-curated dataset of Olympiad-level inequalities.

Please see the paper "Solving Inequality Proofs with Large Language Models" for more details.

🌐 Project | arXiv | 🤗 HF Paper | Code | 🤗 Dataset | 🏆 Leaderboard | 🔮 Visualization

Submit New Model Evaluation Results

Please upload the JSON file with model evaluation results and fill in the following information. If you have any questions, please contact us at jiayi_sheng@berkeley.edu or lupantech@gmail.com.

⚠️ OpenAI API Key Required for Evaluation
• A valid OpenAI API key (Tier 3+ with $10+ budget) is required for LLM judge evaluation. We do not save or store your API key; it is used only during evaluation.
• You can revoke or deactivate your key 15 minutes after the evaluation completes. The evaluation process typically costs about $5, depending on your submission size.
Model Type *

Select the type of your model

Model Source *

Select whether the model is proprietary or open-source

Reasoning effort

Optional: Select the reasoning effort level

Required JSON Structure:

Your JSON file must include at least these 5 fields for each problem:

[
    {
        "data_id": [integer or string] The ID of the test data,
        "problem": [string] The question text,
        "type": [string] The type of question: 'relation' or 'bound',
        "prompt": [string] The prompt used for the problem,
        "response": [string] The response of the model
    },
    ...
]

You can click the download button below to get an example file. The system will process your submission and calculate accuracy metrics automatically.
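
Before uploading, it can help to check a results file locally against the five required fields listed above. The following is a minimal sketch in Python, not the leaderboard's own validation logic, which is not described here and may differ. The function name `validate_submission`, the constant names, and the placeholder record are all hypothetical.

```python
import json

# The five required fields and the two question types named in the
# "Required JSON Structure" description.
REQUIRED_FIELDS = ("data_id", "problem", "type", "prompt", "response")
ALLOWED_TYPES = {"relation", "bound"}


def validate_submission(records):
    """Return a list of human-readable problems found in `records`.

    Hypothetical helper: an empty list means every entry has the five
    required fields with the documented value types.
    """
    if not isinstance(records, list):
        return ["top-level JSON value must be a list of problem objects"]

    errors = []
    for i, rec in enumerate(records):
        if not isinstance(rec, dict):
            errors.append(f"entry {i}: expected a JSON object")
            continue

        missing = [name for name in REQUIRED_FIELDS if name not in rec]
        if missing:
            errors.append(f"entry {i}: missing fields {missing}")
            continue

        # data_id may be an integer or a string (bool is excluded on purpose,
        # since Python treats True/False as integers).
        data_id = rec["data_id"]
        if isinstance(data_id, bool) or not isinstance(data_id, (int, str)):
            errors.append(f"entry {i}: data_id must be an integer or string")

        # The remaining four fields are documented as strings.
        for name in ("problem", "type", "prompt", "response"):
            if not isinstance(rec[name], str):
                errors.append(f"entry {i}: {name} must be a string")

        if isinstance(rec["type"], str) and rec["type"] not in ALLOWED_TYPES:
            errors.append(f"entry {i}: type must be 'relation' or 'bound'")

    return errors


# Placeholder record for illustration only; the text values are not taken
# from the IneqMath dataset.
example_json = json.dumps([
    {
        "data_id": 0,
        "problem": "<question text>",
        "type": "bound",
        "prompt": "<prompt sent to the model>",
        "response": "<model output>",
    }
])

problems = validate_submission(json.loads(example_json))
print("OK" if not problems else "\n".join(problems))
```

To check a real file, load it with `json.load` on an open file handle and pass the result to `validate_submission`. Extra fields beyond the required five are left alone, since the description above only asks for "at least" those fields.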