Research — Simulatte

Simulatte International Benchmark — United States — 2026

88.7%

2.3 percentage points from the theoretical human ceiling.

We ran 15 published Pew Research Center questions through a synthetic US population of 60 personas. The result: 88.7% cohort-adjusted distribution accuracy, 2.3pp below the 91.0% human self-consistency ceiling.

60 personas · 15 questions · Pew Research Center ground truth · Reproducible

Simulatte Personas vs Pew — Accuracy Racing to Human Ceiling

Gap to ceiling: 2.3pp

Total improvement: +31.1pp

Simulatte India Benchmark — 2026

85.3%

The first company to replicate India's political landscape at population scale.

Using Sarvam's proprietary AI infrastructure for language and cultural grounding, we rebuilt India's political landscape from scratch — 40 personas, 22 architectural sprints. Ground truth: Pew Research Center + CSDS-Lokniti survey data, the gold standard for Indian public opinion.

40 personas · 22 sprints · Pew + CSDS-Lokniti ground truth · Sarvam infrastructure

From baseline to benchmark

Simulatte India Benchmark (current): 85.3%

Unoptimized baseline: 46.2%

Total gain: +39.1pp

Sprints: 22

Personas: 40

Simulatte vs the LLMs — 2026

Closer to the human ceiling than any LLM tested.

We ran 10 large language models against the same India Pew benchmark, across 5,878 SHA-256 verified API calls. Simulatte scored 85.3%; the best LLM tested, GPT-4o, scored 75.6%, a gap of 9.7pp.

10 LLMs tested · 5,878 verified API calls · SHA-256 checksums

Performance vs Human Ceiling — India Benchmark

Human self-consistency ceiling (Iyengar et al.): 91.0%

Simulatte: 85.3%

GPT-4o (best LLM tested): 75.6%

Gemini: ≈44%

Average LLM (10 models): ≈17%

India Pew + CSDS-Lokniti ground truth · 5,878 SHA-256 verified API calls
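
For readers who want to see what checksum verification of a logged API call can look like, here is a minimal Python sketch. It assumes each raw response body was stored alongside a SHA-256 hex digest; the payload, the field names inside it, and the helper names `sha256_hex` and `verify_call` are hypothetical and are not taken from the audit repository.

```python
import hashlib


def sha256_hex(payload: bytes) -> str:
    """Return the SHA-256 digest of the raw bytes as a lowercase hex string."""
    return hashlib.sha256(payload).hexdigest()


def verify_call(payload: bytes, recorded_checksum: str) -> bool:
    """Check that a stored response body still matches its recorded checksum."""
    return sha256_hex(payload) == recorded_checksum.strip().lower()


# Hypothetical logged response body (illustrative only, not benchmark data).
payload = b'{"question_id": "q1", "answer": "agree"}'
recorded = sha256_hex(payload)          # what would be written at call time

untouched_ok = verify_call(payload, recorded)
tampered_ok = verify_call(payload + b" ", recorded)

print("untouched:", untouched_ok)
print("tampered:", tampered_ok)
```

Any change to the stored bytes, even a single trailing space, produces a different digest, so an auditor can re-hash each logged response and compare it with the recorded value.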

Methodology

How accuracy is measured

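Distribution accuracy = 1 − (Σ_i |real_i − sim_i|) / 2, where real_i and sim_i are the shares of responses for answer option i in the survey data and in the simulation. This is the same formula used in the UC Berkeley synthetic population study, which allows direct comparison. The human ceiling of 91.0% is sourced from Iyengar et al. (Stanford). Every result is reproducible from the public GitHub repository.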
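
The formula can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the benchmark's actual code: it assumes each distribution is given as answer-option shares that sum to 1, the function name `distribution_accuracy` is hypothetical, and the example shares are made up rather than taken from Pew data.

```python
from typing import Mapping


def distribution_accuracy(real: Mapping[str, float], sim: Mapping[str, float]) -> float:
    """Compute 1 - sum(|real_i - sim_i|) / 2 over all answer options.

    Options missing from one side are treated as having a share of 0.
    """
    options = set(real) | set(sim)
    total_abs_diff = sum(abs(real.get(o, 0.0) - sim.get(o, 0.0)) for o in options)
    return 1.0 - total_abs_diff / 2.0


# Hypothetical shares for one three-option question (not benchmark data).
real = {"agree": 0.50, "disagree": 0.30, "unsure": 0.20}
sim = {"agree": 0.40, "disagree": 0.35, "unsure": 0.25}

score = distribution_accuracy(real, sim)
print(f"{score:.3f}")  # prints 0.900
```

In this example the absolute differences are 0.10, 0.05 and 0.05, which sum to 0.20; halving gives 0.10, so the score is 0.90. Identical distributions score 1.0, and distributions with no overlap score 0.0.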
