Reliability Layer for LLM Agents

Guaranteed JSON From Any LLM, Every Time

Repair syntax, coerce types, validate against your schema — automatically, mid-stream. One base_url change for agents, pipelines, and streaming UIs across every provider.

LLM output: json.loads() fails

```json
{"status": "shipped",
 "count": "3",
 "active": True,
 "items": ['a','b','c',],}
```

fence • wrong type • Python literal • single quotes • trailing commas

StreamFix output: schema valid

{"status": "shipped",
 "count": 3,
 "active": true,
 "items": ["a", "b", "c"]}

X-StreamFix-Applied: fence_strip, fix_single_quotes, fix_python_literals, remove_trailing_comma, type_coerce

X-StreamFix-Schema-Valid: true

33% → 99.5% strict parse • 336 tests • 8 models • Benchmark

From Broken Output to Guaranteed Schema

Repair, coerce, validate, retry — all before your code sees the response.

Streaming Repair

JSON is fixed as it streams. Fences, <think> tags, trailing commas — stripped token-by-token via SSE. Auto-closes truncated streams.
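
To make "auto-closes truncated streams" concrete, here is a minimal sketch of one way to close a cut-off JSON fragment. It is an illustration under simple assumptions, not StreamFix's actual implementation; the function name close_truncated is hypothetical.

```python
import json


def close_truncated(fragment: str) -> str:
    """Append the closers a cut-off JSON fragment still needs."""
    stack = []          # closers still owed, innermost last
    in_string = False
    escaped = False
    for ch in fragment:
        if in_string:
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
        elif ch == '"':
            in_string = True
        elif ch == "{":
            stack.append("}")
        elif ch == "[":
            stack.append("]")
        elif ch in "}]" and stack:
            stack.pop()
    # Close an open string first, then every open container, innermost first.
    tail = '"' if in_string else ""
    return fragment + tail + "".join(reversed(stack))


closed = close_truncated('{"items": ["a", "b')
data = json.loads(closed)
```

This sketch only handles fragments cut inside a string or right after a complete value. A stream that stops after a comma or a colon would need extra handling (dropping the dangling comma or key) before it parses.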

Syntax + Literal Repair

Trailing commas, unquoted keys, single quotes, Python True/None, leading zeros — all fixed with sub-ms overhead.
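
A rough sketch of what three of these repairs can look like: single quotes, Python literals, and trailing commas. It walks the text once and leaves string contents alone. This is an assumption-laden illustration (the name repair is hypothetical, and unquoted keys and leading zeros are not handled), not the production repair engine.

```python
import json
import re

PY_LITERALS = {"True": "true", "False": "false", "None": "null"}
WORD = re.compile(r"[A-Za-z_]+")


def repair(text: str) -> str:
    out, i = [], 0
    while i < len(text):
        ch = text[i]
        if ch in "\"'":
            # Copy a whole string literal, re-quoting it with double quotes.
            j, body = i + 1, []
            while j < len(text) and text[j] != ch:
                if text[j] == "\\" and j + 1 < len(text):
                    pair = text[j:j + 2]
                    body.append("'" if pair == "\\'" else pair)
                    j += 2
                elif text[j] == '"':
                    body.append('\\"')   # bare quote inside a '...' string
                    j += 1
                else:
                    body.append(text[j])
                    j += 1
            out.append('"' + "".join(body) + '"')
            i = j + 1
        elif ch == "," and text[i + 1:].lstrip()[:1] in ("}", "]"):
            i += 1                       # drop a trailing comma
        elif (m := WORD.match(text, i)):
            # Bare words outside strings: map Python literals to JSON ones.
            out.append(PY_LITERALS.get(m.group(), m.group()))
            i = m.end()
        else:
            out.append(ch)
            i += 1
    return "".join(out)


broken = """{"status": "shipped", "count": "3", "active": True, "items": ['a','b','c',],}"""
fixed = json.loads(repair(broken))
```

Because the scanner tracks string boundaries, a value such as "True story" is left untouched. Only bare True, False, and None outside strings are rewritten.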

Type Coercion

LLM returns "age": "30"? Auto-cast to 30 using your schema. Handles string-to-int, string-to-bool, float-to-int across nested structures.
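
As an illustration of schema-driven coercion, the sketch below walks a parsed value alongside a JSON Schema and casts only where the schema asks for it. The function name coerce and the exact casting rules are assumptions made for this example. The real rules may differ.

```python
def coerce(value, schema):
    """Cast value toward the type the schema declares, recursing into containers."""
    kind = schema.get("type")
    if kind == "object" and isinstance(value, dict):
        props = schema.get("properties", {})
        return {k: coerce(v, props[k]) if k in props else v for k, v in value.items()}
    if kind == "array" and isinstance(value, list):
        items = schema.get("items", {})
        return [coerce(v, items) for v in value]
    if kind == "integer":
        if isinstance(value, str):
            try:
                return int(value.strip())
            except ValueError:
                return value          # leave it for validation to reject
        if isinstance(value, float) and value.is_integer():
            return int(value)
    if kind == "boolean" and isinstance(value, str):
        lowered = value.strip().lower()
        if lowered in ("true", "false"):
            return lowered == "true"
    return value


schema = {
    "type": "object",
    "properties": {
        "age": {"type": "integer"},
        "active": {"type": "boolean"},
        "scores": {"type": "array", "items": {"type": "integer"}},
    },
}
result = coerce({"age": "30", "active": "true", "scores": [1.0, "2"], "name": "Ann"}, schema)
```

Values that cannot be cast safely, such as "thirty" or 2.5 for an integer field, are returned unchanged so that a later validation step can report them.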

Contract Mode

Pass a JSON Schema — get guaranteed conformance. Validates required fields, types, enums, and min/max constraints. Auto-retries with schema-aware prompts on failure.
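
The validate-then-retry loop can be sketched roughly as follows. The validator covers only the checks named above (required fields, types, enums, min/max), and call_model is a hypothetical callable standing in for whatever sends the prompt to a model. None of this is StreamFix's actual code or API.

```python
import json

TYPES = {"string": str, "integer": int, "number": (int, float),
         "boolean": bool, "object": dict, "array": list}


def violations(value, schema, path="$"):
    """Return a list of human-readable schema violations (empty means valid)."""
    errors = []
    expected = schema.get("type")
    if expected:
        ok = isinstance(value, TYPES[expected])
        if expected in ("integer", "number") and isinstance(value, bool):
            ok = False                      # bool is an int subclass in Python
        if not ok:
            return [f"{path}: expected {expected}"]
    if "enum" in schema and value not in schema["enum"]:
        errors.append(f"{path}: not one of {schema['enum']}")
    if isinstance(value, (int, float)) and not isinstance(value, bool):
        if "minimum" in schema and value < schema["minimum"]:
            errors.append(f"{path}: below minimum {schema['minimum']}")
        if "maximum" in schema and value > schema["maximum"]:
            errors.append(f"{path}: above maximum {schema['maximum']}")
    if isinstance(value, dict):
        for key in schema.get("required", []):
            if key not in value:
                errors.append(f"{path}.{key}: required field missing")
        for key, sub in schema.get("properties", {}).items():
            if key in value:
                errors += violations(value[key], sub, f"{path}.{key}")
    return errors


def complete_with_contract(call_model, prompt, schema, max_attempts=3):
    """Call the model until its reply satisfies the schema or attempts run out."""
    for _ in range(max_attempts):
        raw = call_model(prompt)
        try:
            data = json.loads(raw)
            errors = violations(data, schema)
        except json.JSONDecodeError as exc:
            errors = [f"invalid JSON: {exc.msg}"]
        if not errors:
            return data
        # Schema-aware retry: tell the model exactly what was wrong.
        prompt = f"{prompt}\n\nYour previous reply violated the schema:\n" + "\n".join(errors)
    raise ValueError(f"no conforming reply after {max_attempts} attempts")
```

Feeding the violation list back into the next prompt is what makes the retry "schema-aware" in this sketch: the model is told which field failed and why, rather than simply being asked again.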

Tool-Call Repair

OpenAI warns tool args aren't guaranteed valid JSON. We repair function.arguments so your agent framework doesn't break mid-chain.
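
For a sense of what repairing function.arguments involves, the sketch below takes a tool call shaped like an OpenAI chat-completions tool_calls entry and falls back to Python's ast.literal_eval when json.loads rejects the arguments. That fallback is a stand-in technique chosen for brevity, not the repair StreamFix performs, and the name repair_arguments is hypothetical.

```python
import ast
import json


def repair_arguments(tool_call: dict) -> dict:
    """Return a copy of tool_call whose function.arguments is strict JSON."""
    raw = tool_call["function"]["arguments"]
    try:
        parsed = json.loads(raw)
    except json.JSONDecodeError:
        # Python-style dicts (single quotes, True/None, trailing commas)
        # parse safely here; anything else raises ValueError or SyntaxError.
        parsed = ast.literal_eval(raw)
    fixed = dict(tool_call)
    fixed["function"] = {**tool_call["function"], "arguments": json.dumps(parsed)}
    return fixed


call = {
    "id": "call_1",
    "type": "function",
    "function": {"name": "get_weather", "arguments": "{'city': 'Paris', 'metric': True,}"},
}
repaired = repair_arguments(call)
args = json.loads(repaired["function"]["arguments"])
```

The literal_eval fallback covers output that is a valid Python literal. Arguments that mix JSON's true or null with single quotes would still fail and need a dedicated repair pass.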

Repair Provenance

Every response includes X-StreamFix-Applied headers listing exactly which repairs ran. Build alerts and dashboards on stable repair names.
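
A small sketch of how a client might turn that header into counts for a dashboard. It assumes the header value is a comma-separated list of repair names, as in the example at the top of the page. The helper names applied_repairs and record are hypothetical.

```python
from collections import Counter


def applied_repairs(headers) -> list[str]:
    """Split the X-StreamFix-Applied header into individual repair names."""
    raw = headers.get("X-StreamFix-Applied", "")
    return [name.strip() for name in raw.split(",") if name.strip()]


repair_counts = Counter()


def record(headers) -> None:
    """Accumulate how often each repair has run across responses."""
    repair_counts.update(applied_repairs(headers))


record({"X-StreamFix-Applied": "fence_strip, fix_single_quotes, type_coerce"})
record({"X-StreamFix-Applied": "fence_strip"})
record({})  # a response that needed no repairs
```

With the OpenAI Python SDK, response headers should be reachable by calling client.chat.completions.with_raw_response.create(...), which returns an object exposing .headers alongside .parse() for the usual completion.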

Zero Data Retention

Passthrough proxy. Content processed in memory and immediately discarded. Never logged or trained on.

Also on streamfix.dev

Transform any JSON to any shape with one POST

POST /v1/map takes a payload and a plain-English description of the output you want. The LLM compiles a Python transformation function once, runs it in a sandbox, caches it, and returns deterministic results from then on. Same Bearer key as the JSON repair endpoint.

Stripe webhook → CRM

Salesforce, HubSpot, Airtable shapes

Shopify order → QuickBooks

Invoice, SalesReceipt, CreditMemo

EDI X12 → JSON

810, 850, 856, 855

Apache / NGINX logs → JSON

Combined, custom log_format

Vendor CSV imports

Reshape, filter, decode codes

Full /v1/map reference →

Contract, sandbox, errors, limits

One Line to Guaranteed JSON

Change base_url. Everything else stays the same.

OpenAI SDK compatible
Multi-model via OpenRouter

from openai import OpenAI

# Just point to our gateway
client = OpenAI(
    base_url="https://streamfix.dev/v1",
    api_key="sk_YOUR_KEY"
)

# Works with any model
resp = client.chat.completions.create(
    model="openai/gpt-4o-mini",
    messages=[...]
)


FAQ

Why not just use response_format or structured output?

If you control a single provider with reliable structured output, you should.

The benchmark tested the "plain prompt baseline" because many production setups involve:

Multi-provider routing (OpenRouter, fallback chains)
Models where structured output isn't available or behaves inconsistently
Extra latency from constrained decoding
Wrappers that break consumers even with structured mode enabled

StreamFix is for the "messy middle" where you can't guarantee consistent structured output across all your providers.

"Streaming JSON makes no sense" — How does this actually work?

Correct: a full JSON array isn't valid until the closing ].

But that's not what we're doing. We perform incremental object extraction for UI rendering:

// Model streams:
[{"id":1,"name":"A"}, {"id":2,"name":"B"}, ...

// We don't parse the whole array
// Instead we:
1. Maintain a rolling buffer
2. Detect when an object is complete (brace-balanced ...})
3. Parse that single object
4. Render it immediately

So the UI shows Item #1 while Item #10 is still generating. This improves perceived latency without waiting for the final ].
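
The four steps above can be sketched as a small stateful parser. This is a minimal illustration assuming the model streams a top-level array of objects. The class name ObjectStream is hypothetical, and the real extractor may work differently.

```python
import json


class ObjectStream:
    """Yield each complete top-level object as soon as its closing brace arrives."""

    def __init__(self):
        self.buffer = []       # rolling buffer for the object in progress
        self.depth = 0         # current brace nesting depth
        self.in_string = False
        self.escaped = False

    def feed(self, chunk: str):
        for ch in chunk:
            if self.depth == 0 and ch != "{":
                continue       # skip "[", "," and whitespace between objects
            self.buffer.append(ch)
            if self.in_string:
                if self.escaped:
                    self.escaped = False
                elif ch == "\\":
                    self.escaped = True
                elif ch == '"':
                    self.in_string = False
            elif ch == '"':
                self.in_string = True
            elif ch == "{":
                self.depth += 1
            elif ch == "}":
                self.depth -= 1
                if self.depth == 0:
                    # Brace-balanced: parse just this object and hand it over.
                    yield json.loads("".join(self.buffer))
                    self.buffer.clear()


stream = ObjectStream()
rendered = []
for chunk in ['[{"id":1,"na', 'me":"A"}, {"id":2,', '"name":"B}"}', ', {"id":3']:
    for item in stream.feed(chunk):
        rendered.append(item)  # a UI would render the item here
```

Tracking string state matters: a brace inside a string value, as in the second object's name above, must not be counted toward the balance.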

What did the benchmark actually show?

Across 672 API calls with plain prompts (8 models, 7 tasks, temperature=0):

Strict json.loads(content) worked only 33.3% of the time
But 99.5% of responses contained valid JSON that was merely wrapped in fences/prose/think tags
95.5% of failures were markdown fences — not logic errors

A simple cleanup layer increased strict parse success to 98.4% without changing prompts.
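
For illustration, a cleanup layer of this kind can be very small. The sketch below strips think tags, unwraps a markdown fence if one is present, and then decodes the first JSON value it finds. It is an assumption about what such a layer might look like, not the code used in the benchmark, and the name extract_json is hypothetical.

```python
import json
import re

TICKS = "`" * 3   # a markdown code fence marker
THINK = re.compile(r"<think>.*?</think>", re.DOTALL)
FENCE = re.compile(TICKS + r"(?:json)?\s*(.*?)" + TICKS, re.DOTALL)
OPENER = re.compile(r"[{\[]")


def extract_json(content: str):
    """Pull the first JSON object or array out of a wrapped model reply."""
    text = THINK.sub("", content)          # drop reasoning blocks
    fenced = FENCE.search(text)
    if fenced:
        text = fenced.group(1)             # keep only the fenced payload
    decoder = json.JSONDecoder()
    for match in OPENER.finditer(text):
        try:
            # raw_decode tolerates trailing prose after the JSON value.
            value, _ = decoder.raw_decode(text, match.start())
            return value
        except json.JSONDecodeError:
            continue
    raise ValueError("no JSON value found")


reply = ("<think>plan {not json}</think>Here you go:\n"
         + TICKS + 'json\n{"status": "shipped"}\n' + TICKS)
payload = extract_json(reply)
```

This handles wrapping only. Replies whose JSON is itself malformed (trailing commas, single quotes) would still need a repair step.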

→ Read the full benchmark study

Who is StreamFix for?

High fit:

Agent pipelines (CrewAI, LangGraph, AutoGen, n8n) where one malformed output kills the chain
Multi-provider routing (OpenRouter, fallback chains) with inconsistent structured output support
Tool-calling agents where function.arguments can be malformed
Streaming UIs wanting incremental rendering
Teams that need guaranteed schema conformance with type coercion and auto-retry (Contract Mode)

Low fit:

Local llama.cpp with grammar constraints
Single provider with reliable structured outputs

Pricing

Beta

Free

While StreamFix is in beta, all features are free.
No credit card. No commitment.

1,000 free requests on signup (1 credit = 1 request)
Full API + streaming + tool-call repair
Contract Mode: schema validation + type coercion + auto-retry
Provenance headers + repair diagnostics
Share feedback → get 1,000 more credits

Get Free Key →

Running low? Reply to your welcome email with what you're building and what's broken — we'll top you up.

Paid plans after beta. Early users get notified first.

json.decoder.JSONDecodeError: Expecting value: line 1 column 1

SyntaxError: Unexpected token } in JSON at position...

json.decoder.JSONDecodeError: Extra data: line 1 column...

Unterminated string starting at: line 1 column...

Unexpected non-whitespace character after JSON data

DeepSeek R1 <think> tag parsing error
