Repair syntax, coerce types, validate against your schema — automatically, mid-stream. One base_url change for agents, pipelines, and streaming UIs across every provider.
```json {"status": "shipped", "count": "3", "active": True, "items": ['a','b','c',],} ```
{"status": "shipped",
"count": 3,
"active": true,
"items": ["a", "b", "c"]}
Copy this now, it won't be shown again.
Repair, coerce, validate, retry — all before your code sees the response.
JSON is fixed as it streams. Fences, <think> tags, trailing commas — stripped token-by-token via SSE. Auto-closes truncated streams.
Trailing commas, unquoted keys, single quotes, Python True/None, leading zeros — all fixed with sub-ms overhead.
LLM returns "age": "30"? Auto-cast to 30 using your schema. Handles string-to-int, string-to-bool, float-to-int across nested structures.
Pass a JSON Schema — get guaranteed conformance. Validates required fields, types, enums, and min/max constraints. Auto-retries with schema-aware prompts on failure.
OpenAI warns tool args aren't guaranteed valid JSON. We repair function.arguments so your agent framework doesn't break mid-chain.
Every response includes X-StreamFix-Applied headers listing exactly which repairs ran. Build alerts and dashboards on stable repair names.
Passthrough proxy. Content processed in memory and immediately discarded. Never logged or trained on.
POST /v1/map takes a payload and a plain-English description of the output you want. The LLM compiles a Python transformation function once, runs it in a sandbox, caches it, and returns deterministic results from then on. Same Bearer key as the JSON repair endpoint.
Change base_url. Everything else stays the same.
response_format or structured output?
If you control a single provider with reliable structured output: you should.
The benchmark tested the "plain prompt baseline" because many production setups involve:
StreamFix is for the "messy middle" where you can't guarantee consistent structured output across all your providers.
Correct: a full JSON array isn't valid until the closing ].
But that's not what we're doing. We perform incremental object extraction for UI rendering:
So the UI shows Item #1 while Item #10 is still generating. This improves perceived latency without waiting for the final ].
Across 672 API calls with plain prompts (8 models, 7 tasks, temperature=0):
json.loads(content) worked only 33.3% of the timeA simple cleanup layer increased strict parse success to 98.4% without changing prompts.
High fit:
function.arguments can be malformedLow fit:
While StreamFix is in beta, all features are free.
No credit card. No commitment.
Running low? Reply to your welcome email with what you're building and what's broken — we'll top you up.
Paid plans after beta. Early users get notified first.