🎯 Eval System · Layer L503
Test, score and regression-check every AI output automatically.
Score AI outputs against test cases — correctness, tone, safety and format. Run regression suites on every deploy. Catch quality drops before users do.
How it works
Eval System in three steps
Production-grade and live on api.forcedream.ai. One Bearer token, zero extra setup.
01
📝
Define
Define test cases — input, expected output, scoring weights. Unlimited cases, reusable across runs.
POST /v1/eval/cases
02
🎯
Score
Each output is scored on correctness, tone, format, safety, latency and cost, then rolled up into a single composite score per run.
POST /v1/eval/run-case
03
🔔
Alert
Regression reports flag quality drops automatically. Fail CI/CD builds when regression exceeds your threshold.
GET /v1/eval/report
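The three steps above can be illustrated offline. What follows is a minimal sketch, assuming the composite score is a weighted mean of the per-dimension scores; this page does not document the actual formula, and the field names `input`, `expected_output` and `weights` are hypothetical stand-ins for whatever schema `POST /v1/eval/cases` accepts.

```python
from dataclasses import dataclass, field

# Dimensions named on this page; the default weights are hypothetical.
DIMENSIONS = ("correctness", "tone", "format", "safety", "latency", "cost")


@dataclass
class EvalCase:
    """A test case as described in step 01 (field names are assumptions)."""
    case_id: str
    input: str
    expected_output: str
    weights: dict = field(default_factory=lambda: {d: 1.0 for d in DIMENSIONS})


def composite_score(scores: dict, weights: dict) -> float:
    """Weighted mean of per-dimension scores, each expected in [0, 1].

    Assumption: the real service may combine scores differently.
    Dimensions missing from `scores` are skipped rather than counted as 0.
    """
    total_weight = sum(w for d, w in weights.items() if d in scores)
    if total_weight <= 0:
        raise ValueError("no scored dimension has a positive weight")
    weighted = sum(scores[d] * w for d, w in weights.items() if d in scores)
    return weighted / total_weight


case = EvalCase(
    case_id="summarise_v1",
    input="Summarise the attached report in three sentences.",
    expected_output="Here is the summary...",
    weights={"correctness": 3.0, "safety": 2.0, "latency": 1.0, "cost": 1.0},
)
scores = {"correctness": 0.9, "safety": 1.0, "latency": 0.88, "cost": 0.95}
print(round(composite_score(scores, case.weights), 3))
```

Weighting correctness and safety more heavily than latency and cost is only one possible choice; the point is that a case carries its own weights, so the same output can score differently under different cases.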
What's included
Everything you need, nothing you don't
Quick start
$ curl https://api.forcedream.ai/v1/eval/run-case \
-H "Authorization: Bearer $KEY" \
-H "Content-Type: application/json" \
-d '{"case_id":"summarise_v1","output":"Here is the summary...","latency_ms":1200,"cost_pence":4}'
→ {"composite_score":0.91,"pass_match":true,"latency_score":0.88,"cost_score":0.95,"safety_score":1.0}
$ curl https://api.forcedream.ai/v1/eval/report \
-H "Authorization: Bearer $KEY"
→ {"total_cases":47,"pass_rate":0.96,"regressions":1,"cases":[{"id":"summarise_v1","delta":-0.04}]}
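To fail a CI/CD build on regression, a pipeline step could inspect the report and exit non-zero. This is a minimal sketch, assuming the report has the shape of the sample response above and that `delta` is the change in composite score since the previous run (this page does not define it). The function name and both thresholds are hypothetical.

```python
import json

# Sample response copied from the quick start above.
SAMPLE_REPORT = (
    '{"total_cases":47,"pass_rate":0.96,"regressions":1,'
    '"cases":[{"id":"summarise_v1","delta":-0.04}]}'
)


def regression_failures(report: dict, max_drop: float = 0.02,
                        min_pass_rate: float = 0.95) -> list:
    """Return human-readable reasons to fail the build (empty list = pass).

    Assumptions: `delta` is the score change versus the previous run, and
    both thresholds are illustrative defaults, not values from the service.
    """
    reasons = []
    pass_rate = report.get("pass_rate", 0.0)
    if pass_rate < min_pass_rate:
        reasons.append(f"pass_rate {pass_rate} below {min_pass_rate}")
    for case in report.get("cases", []):
        if case.get("delta", 0.0) < -max_drop:
            reasons.append(f"{case['id']} dropped by {-case['delta']:.2f}")
    return reasons


failures = regression_failures(json.loads(SAMPLE_REPORT))
for reason in failures:
    print("FAIL:", reason)
exit_code = 1 if failures else 0  # in CI, pass this value to sys.exit()
```

With the sample report, the 0.04 drop on `summarise_v1` exceeds the illustrative 0.02 threshold, so the gate would fail the build even though the overall pass rate is acceptable.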
Pricing
Simple, transparent pricing
80% of API earnings flow back to you on every call. No hidden fees. Free tier available.
Comparison
How Eval System compares
Purpose-built for AI products. Not retrofitted from general-purpose tools.
| Feature | ForceDream Eval | Braintrust | PromptFoo | LangSmith |
|---|---|---|---|---|
| AI-native scoring | ✓ | ✓ | Partial | ✓ |
| 80% earnings back | ✓ | — | — | — |
| Policy enforcement | ✓ | — | — | — |
| CI/CD integration | ✓ | ✓ | ✓ | Partial |
| WORM audit | ✓ | — | — | — |
| Price | £29/mo | £150+/mo | Free | £100+/mo |
FAQ
Frequently asked questions
Start with Eval System.
Scale to all 22 products.
Free tier available. 80% earnings from your first call. Every call. WORM-sealed by default.
No credit card · 80% earnings guaranteed · WORM-sealed audit