Dashboard — Modulith Lab

⚠ No model passes: Transparency | 6 critical changes this week

Updated 5 Apr 2026, 10:30 AM

Biggest regression: Grok 3, +18% risk (Security)

Biggest improvement: Gemini 1.5, -9% risk (Reliability)

New threshold crossings: 2 models became high risk

Models to watch: 3 unstable (high week-over-week volatility)

Weekly digest: sent every Monday

Risk Map

Positioned by Likelihood (→) and Impact (↑)

Critical Zone: high impact × high likelihood

Axes: Impact / Harm (↑) and Likelihood (→), each on a Low, Medium, High, Critical scale

Low 0.00–0.25

Medium 0.25–0.50

High 0.50–0.75

Critical 0.75–1.00

We test models with thousands of behavior probes and score two things: how often issues occur (likelihood) and how severe they are (impact).
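
The band boundaries listed above can be read as a simple lookup from a score to a label. The sketch below is an illustration only, not the dashboard's actual logic: the function name `risk_band` is hypothetical, and because adjacent bands share an edge value (0.25, 0.50, 0.75), it assumes a shared edge belongs to the higher band.

```python
from bisect import bisect_right

# Upper edges of the Low, Medium, and High bands, as listed on the dashboard.
BAND_EDGES = [0.25, 0.50, 0.75]
BAND_NAMES = ["Low", "Medium", "High", "Critical"]


def risk_band(score: float) -> str:
    """Map a score in [0, 1] to a band name.

    Assumption: each band is half-open, [lower, upper), so a score that
    sits exactly on a shared edge falls into the higher band. A score of
    1.00 is treated as Critical.
    """
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"score must be between 0 and 1, got {score!r}")
    # bisect_right counts how many edges are <= score, which is the band index.
    return BAND_NAMES[bisect_right(BAND_EDGES, score)]


if __name__ == "__main__":
    for value in (0.10, 0.25, 0.60, 0.90):
        print(f"{value:.2f} -> {risk_band(value)}")
```

The same function can be applied to the likelihood score and the impact score separately. How the dashboard combines the two into a single map position or overall rating is not stated here.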

Top Risk Models Today

1. Claude Haiku 4.5 (Critical)

2. Grok 4.1 (Critical)

3. GPT-4o (High)

Best in Class (this week)

Overall: Gemini 3.1 Flash

Most improved: Gemini 3.1 Flash (-9%)

Strongest security: Mistral Large

Most stable: Gemma 2

What Changed This Week

Compare Mode: select up to 3 models