InferGrade Hub

Find which quantized setup to run on your hardware for your use case.

Ready state

What should happen next

Not synced yet

Local runner

Readiness and setup

Move one machine from not ready to visibly ready, then keep using that path.

Next best action

Pair, start, run

Keep the main local journey obvious, sequential, and grounded in real readiness.

Recommended for you

Choose the quantized setup

Recent runs

Tracked execution

Secondary tools

Build, export, and learn

Top contributors

Community evidence should stay cumulative, exportable, and easy to inspect.

Recommendations

Which setup should I run?

Decision input

Describe the deployment shape and let InferGrade rank the current evidence.

Decision table

Use the table first, then open the rows and families that need a closer read.

Leading candidates

Shortlist cards stay secondary to the table and plot.

Browse raw evidence

Keep one foot in the broader catalog while the decision slice stays active.

Best under VRAM

Practical local choices for the current envelope.

Pareto frontier

Trade-offs worth opening directly in compare.
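
As an illustration of what a Pareto frontier means for benchmark rows, here is a minimal sketch. It assumes two metrics from the results table, TTFT (lower is better) and Tok/s (higher is better). The `Run` type, its field names, and the sample values are hypothetical and are not taken from InferGrade.

```python
from typing import NamedTuple


class Run(NamedTuple):
    # Hypothetical record shape, for illustration only.
    name: str
    ttft_ms: float  # time to first token; lower is better
    tok_s: float    # decode throughput; higher is better


def dominates(a: Run, b: Run) -> bool:
    """True if a is no worse than b on both metrics and strictly better on one."""
    no_worse = a.ttft_ms <= b.ttft_ms and a.tok_s >= b.tok_s
    strictly_better = a.ttft_ms < b.ttft_ms or a.tok_s > b.tok_s
    return no_worse and strictly_better


def pareto_frontier(runs: list[Run]) -> list[Run]:
    """Keep every run that no other run dominates (input order preserved)."""
    return [r for r in runs if not any(dominates(other, r) for other in runs)]


# Made-up sample values, not real benchmark results.
sample = [
    Run("quant-a", ttft_ms=100.0, tok_s=50.0),
    Run("quant-b", ttft_ms=200.0, tok_s=80.0),
    Run("quant-c", ttft_ms=250.0, tok_s=60.0),  # dominated by quant-b
    Run("quant-d", ttft_ms=100.0, tok_s=40.0),  # dominated by quant-a
]
frontier_names = [r.name for r in pareto_frontier(sample)]
```

The pairwise check is quadratic in the number of runs, which is usually acceptable for a shortlist-sized slice. Runs left on the frontier are the ones where improving one metric means giving up some of the other.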

Historical Results

Recent benchmark evidence

Model | Backend | Use Case | TTFT | Tok/s | Hardware | Capability | Verification

Compare

Choose between families, variants, and quants

Preset views

Start from a useful model-choice stance, then refine the exact variants or drop to raw runs.

Raw result drill-down

Result Detail

One run, fully explained

Family Explorer

Branches, quants, and nearby slices

Build

Create a short decision-suite run

Start with a local-friendly decision suite. Expand only when you need deeper reference evidence.

Benchmark identity

Name the run and the decision it should support.

Capability suites

Choose the questions this run should answer.

Benchmark groups

Broaden or narrow the evidence behind those suites.

Individual checks

These are the exact checks that will run.

Resolve from Hugging Face

Paste a repo id, model URL, or `hf://` artifact.

InferGrade fills the runtime and ontology defaults.
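
The three accepted input forms could be normalized along these lines. This is a sketch under stated assumptions, not InferGrade's resolver: the function name, the returned dictionary shape, the `org/name` repo id requirement, and the reading of `hf://` as `hf://org/name/path-to-file` are all hypothetical.

```python
from urllib.parse import urlparse


def parse_model_reference(ref: str) -> dict:
    """Normalize a repo id, model URL, or hf:// reference (hypothetical helper)."""
    ref = ref.strip()
    if ref.startswith("hf://"):
        kind = "artifact"
        parts = ref[len("hf://"):].split("/")
    elif ref.startswith(("http://", "https://")):
        kind = "url"
        parsed = urlparse(ref)
        if parsed.netloc != "huggingface.co":
            raise ValueError(f"not a Hugging Face URL: {ref!r}")
        parts = parsed.path.split("/")
    else:
        kind = "repo_id"
        parts = ref.split("/")

    parts = [p for p in parts if p]  # drop empty segments from stray slashes
    if len(parts) < 2 or (kind == "repo_id" and len(parts) != 2):
        raise ValueError(f"expected an 'org/name' style reference: {ref!r}")

    repo_id = "/".join(parts[:2])
    # Assumption: only hf:// references carry a file path after the repo id.
    filename = "/".join(parts[2:]) or None if kind == "artifact" else None
    return {"kind": kind, "repo_id": repo_id, "filename": filename}


# Placeholder names, not real repositories.
from_id = parse_model_reference("example-org/example-model")
from_url = parse_model_reference("https://huggingface.co/example-org/example-model/tree/main")
from_hf = parse_model_reference("hf://example-org/example-model/model.Q4_K_M.gguf")
```

Reducing every form to the same `repo_id` before filling defaults keeps later steps independent of how the reference was pasted.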

Advanced overrides

Artifact and runtime

Only adjust these if you need an override.

Ontology hints

Most users should keep the inferred values.

Generated Config

Stored server-side so local and cloud execution stay consistent.

No run config generated yet.

Run Status

Active and recent runs

Recent runs

Pick a run to inspect its current stage and progress.

Live timeline

Use the timeline to understand what changed and why.

Stored Run Configs

Reusable benchmark programs

My Runs

Contributor activity