JSON Transformation API: One Endpoint, Any Schema Mapping

Updated April 2026.

POST /v1/map takes any JSON payload and a plain-English description of the output shape you want, and returns the transformed JSON. The first call with a given (input shape, target text) pair compiles a Python transformation function and runs it in an AST-validated sandbox; every subsequent call with the same pair runs the cached function in milliseconds. The same pair always runs the same function, so the same input always produces the same output: no drift, no surprises.

1. 30-second quickstart

curl

curl https://streamfix.dev/v1/map \
  -H "Authorization: Bearer sk_YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "payload": {"first_name": "Ada", "last_name": "Lovelace", "email": "ada@example.com"},
    "target": "Object with email and full_name (first + space + last)."
  }'

response

{
  "output": {"email": "ada@example.com", "full_name": "Ada Lovelace"},
  "cached": false,
  "fingerprint": "a1b2c3...",
  "elapsed_ms": 2841,
  "code_length": 312
}

Every subsequent call with the same input shape and target text returns in milliseconds with cached: true. The transformation function is reused; you don't pay the LLM-generation cost again.

Python

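import httpx

# Any JSON-serializable value works as the payload.
any_json_value = {"first_name": "Ada", "last_name": "Lovelace"}

r = httpx.post(
    "https://streamfix.dev/v1/map",
    headers={"Authorization": "Bearer sk_YOUR_KEY"},
    json={
        "payload": any_json_value,
        "target": "plain English description of the output shape",
    },
    # Uncached calls wait on code generation (up to 120 s server-side, see Limits).
    timeout=120,
)
r.raise_for_status()  # surface 4xx/5xx errors instead of a KeyError on "output"
output = r.json()["output"]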

2. The contract

POST /v1/map. Auth via Authorization: Bearer sk_YOUR_KEY (the same key you use for any other StreamFix endpoint).

Request body

payload - any JSON value: object, array, string (CSV / log line / EDI segment), or anything else.
target - plain-English description of the output you want. The target text is part of the cache key, so keep the wording stable: any change to the string triggers a fresh compile.

Response body

output - the transformed payload, ready to use.
cached - true if the transformation function was reused; false on the first call with this (shape, target) pair.
fingerprint - opaque cache key. Same fingerprint = same generated function.
elapsed_ms - total request time.
code_length - size of the generated function in characters. Useful sanity check ("is this transformation as complex as I expected?").
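
A minimal sketch of reading the five response fields into a typed record on the client side. `MapResponse` and `parse_map_response` are hypothetical names for this example, not part of any StreamFix SDK; only the field names come from the list above.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class MapResponse:
    """Hypothetical client-side record for a /v1/map response body."""
    output: Any
    cached: bool
    fingerprint: str
    elapsed_ms: int
    code_length: int


def parse_map_response(body: dict) -> MapResponse:
    # Index every documented field directly so a missing one fails loudly.
    return MapResponse(
        output=body["output"],
        cached=body["cached"],
        fingerprint=body["fingerprint"],
        elapsed_ms=body["elapsed_ms"],
        code_length=body["code_length"],
    )


# Body trimmed from the quickstart response; the fingerprint is elided there too.
resp = parse_map_response({
    "output": {"full_name": "Ada Lovelace"},
    "cached": False,
    "fingerprint": "a1b2c3...",
    "elapsed_ms": 2841,
    "code_length": 312,
})
```

Logging `fingerprint` next to `cached` makes it easy to spot a target string that keeps changing and therefore never hits the cache.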

That's the entire public surface. No model knob, no provider routing, no tier selection. The endpoint contract stays stable as we change what runs underneath.

3. How it works

Fingerprint. We compute a structural fingerprint of your payload (just the shape - field names and types - not the values) plus a hash of your target string.
Cache lookup. If the fingerprint is in the cache, skip straight to the Execute step with the cached function. This is the common case for any production workload.
Compile. On a miss, we ask an LLM to generate a Python def transform(payload): ... that produces the shape your target text describes. The function is parsed, AST-validated, and cached.
Execute. The function runs in an AST-validated sandbox - whitelisted stdlib only (csv, decimal, datetime, re, json, math, collections, itertools, statistics, base64, hashlib, html, io, urllib.parse, xml.etree, etc.). No network, no filesystem, no os, no subprocess. Same input always produces the same output.
Return. You get the transformed JSON, plus the fingerprint and cache state for debugging.

The architecture: the LLM is the discovery primitive (called once per shape); deterministic Python is the runtime primitive (called every time). You pay for code generation once and replay it as often as you want.
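
The fingerprint and cache-lookup steps can be sketched roughly as follows. This is an illustration under stated assumptions, not the service's actual algorithm: the way lists are summarized, the use of SHA-256, and the names `shape_of`, `fingerprint`, and `get_transform` are all invented for the example. `compile_fn` stands in for the LLM compile step.

```python
import hashlib
import json


def shape_of(value):
    """Reduce a JSON value to its structure: field names and types, no values."""
    if isinstance(value, dict):
        return {key: shape_of(value[key]) for key in sorted(value)}
    if isinstance(value, list):
        # Assumption: a list is described by the distinct shapes of its items.
        shapes = {json.dumps(shape_of(item), sort_keys=True) for item in value}
        return ["list", sorted(shapes)]
    return type(value).__name__  # e.g. "str", "int", "bool", "NoneType"


def fingerprint(payload, target: str) -> str:
    """Combine the payload's shape with a hash of the target text."""
    shape_json = json.dumps(shape_of(payload), sort_keys=True)
    target_hash = hashlib.sha256(target.encode("utf-8")).hexdigest()
    combined = f"{shape_json}|{target_hash}".encode("utf-8")
    return hashlib.sha256(combined).hexdigest()


_cache = {}  # fingerprint -> compiled transform function


def get_transform(payload, target, compile_fn):
    """Return (function, cached); compile_fn runs only on a cache miss."""
    key = fingerprint(payload, target)
    if key in _cache:
        return _cache[key], True
    fn = compile_fn(payload, target)
    _cache[key] = fn
    return fn, False
```

Two payloads with the same field names and types but different values share a fingerprint, so the second one reuses the function compiled for the first:

```python
a = {"first_name": "Ada", "last_name": "Lovelace"}
b = {"first_name": "Grace", "last_name": "Hopper"}
target = "Object with full_name (first + space + last)."
same = fingerprint(a, target) == fingerprint(b, target)  # True
```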

4. Use cases

Five pages with worked examples and honest comparisons against named alternatives. Each shows real verified inputs and outputs from /v1/map.

Stripe Webhook to CRM →

customer.created -> Salesforce contact, HubSpot properties, Airtable record with computed MRR. Plus invoice.paid and charge.failed.

Parse Apache Logs to JSON →

Combined log format, NGINX log_format, error categorization, bot detection, multi-line aggregation.

Shopify Order to QuickBooks →

orders/create -> QB Invoice, orders/paid -> SalesReceipt, orders/refunded -> CreditMemo, multi-currency with computed exchange rate.

Parse EDI X12 to JSON →

810 invoices, 850 purchase orders, 856 ASNs with hierarchical loops, 855 acknowledgements.

Normalize Vendor CSV Imports →

Drop junk columns, parse currency strings + EU dates, group flat rows back into orders, split rows by status, decode cryptic vendor codes. The per-vendor pattern: N target strings beats N hand-written scripts.

Your transformation isn't on this list? It's still the same endpoint - describe your target shape in plain English. Webhook-to-anything, log-to-event, document-to-record, "this CSV my vendor sends" - all one POST.
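
The per-vendor pattern can be as small as a dictionary of target strings. A minimal sketch, assuming made-up vendor names and target texts; only the request fields `payload` and `target` come from the API contract. Keeping each target as a constant also keeps its wording stable.

```python
# Hypothetical vendors and target texts, for illustration only.
VENDOR_TARGETS = {
    "acme": (
        "List of orders with order_id, total as a decimal string, "
        "and order_date in ISO 8601."
    ),
    "globex": (
        "List of orders with order_id, total as a decimal string, "
        "and order_date in ISO 8601; drop rows where status is CANCELLED."
    ),
}


def build_request(vendor: str, raw_csv: str) -> dict:
    """Build the JSON body for POST /v1/map for one vendor's export."""
    try:
        target = VENDOR_TARGETS[vendor]
    except KeyError:
        raise ValueError(f"no target text registered for vendor {vendor!r}") from None
    return {"payload": raw_csv, "target": target}


body = build_request("acme", "id,total,date\n1,9.99,03/04/2026\n")
```

Onboarding a new vendor then means adding one entry to the dictionary rather than writing a new script.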

5. Why this is not "just an LLM wrapper"

You can call an LLM yourself with the same prompt. Three reasons not to:

The LLM call only happens once per shape. Subsequent requests run cached, deterministic Python - same speed and reliability as a hand-written transform. You don't pay the LLM tax on every request, and your latency floor is milliseconds, not seconds.
Sandboxed execution. The real engineering is running LLM-generated code in a runtime that cannot execute os.system("rm -rf /") when the model hallucinates. AST validation, an import whitelist, and restricted builtins are the load-bearing pieces. We've shipped them so you don't have to.
Determinism. Same input shape + same target = same generated function, and the same function on the same input gives the same output. No drift between call #1 and call #1000. No hallucinated fields months later.
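
To give a feel for the AST-validation piece, here is a minimal sketch built on Python's standard `ast` module. It is not StreamFix's validator: the whitelist is a small illustrative subset, and the blocked-name list and dunder rule are assumptions about what such a check might reject.

```python
import ast

# Assumption: a small illustrative subset of an import whitelist.
ALLOWED_IMPORTS = {"csv", "decimal", "datetime", "re", "json", "math"}
# Assumption: names a generated function should never reference.
BLOCKED_NAMES = {"eval", "exec", "open", "compile", "__import__", "getattr"}


def validate(source: str) -> list:
    """Return a list of violations; an empty list means the code passed."""
    problems = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            modules = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom):
            modules = [node.module or ""]  # relative imports have no module name
        else:
            modules = []
        for name in modules:
            if name.split(".")[0] not in ALLOWED_IMPORTS:
                problems.append(f"import of {name!r} is not whitelisted")
        if isinstance(node, ast.Name) and node.id in BLOCKED_NAMES:
            problems.append(f"use of {node.id!r} is blocked")
        if isinstance(node, ast.Attribute) and node.attr.startswith("__"):
            problems.append(f"dunder attribute {node.attr!r} is blocked")
    return problems


good = "import re\ndef transform(payload):\n    return {'n': len(payload)}\n"
bad = "import os\ndef transform(payload):\n    os.system('rm -rf /')\n    return payload\n"
```

Here `validate(good)` returns an empty list, while `validate(bad)` reports the `os` import before any of the code is executed. A static check like this is only one layer; it is meant to sit alongside runtime restrictions, not replace them.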

The engine has been internally benchmarked on a fixed suite of 48 hard transformation tasks across CSV, JSON, webhooks, logs, EDI, document parsing, compute, and joins. Strict pass rate (output exactly equals the expected output) is 92% on the current default tier and 85% on the fast tier. Failure modes are documented in the per-page deep dives.

6. Errors and status codes

Status	Meaning	What to do
`200`	Success.	`output` is the transformed payload.
`400`	Body missing `payload` or `target`, or invalid JSON.	Both fields are required; `target` must be a non-empty string.
`401`	Missing or invalid Bearer token.	Include `Authorization: Bearer sk_...`.
`402`	API key has zero credits remaining.	Top up at streamfix.dev.
`404`	The endpoint is disabled on this deployment.	Should not happen on streamfix.dev. Self-hosters: set `STREAMFIX_MAP_ENABLED=1`.
`422`	The generated function ran but raised an exception. The body includes the exception type and message.	Usually means the target text didn't match the input shape (e.g. asking for an int cast on a column that's actually strings, or a field that doesn't exist). Adjust the target text.
`502`	Internal generation failure (rare).	Retry. If persistent, simplify the target text or break the transformation into smaller steps.
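
One way a client might act on the table above: retry only on `502`, and surface everything else to the caller. A minimal sketch; the action labels, the `call_with_retry` helper, and the backoff values are assumptions for this example. `send` is any callable that performs the request and returns `(status, body)`.

```python
import time

# Status codes come from the table above; the action labels are
# hypothetical names chosen for this sketch.
ACTIONS = {
    200: "use_output",
    400: "fix_request",
    401: "fix_auth",
    402: "top_up_credits",
    404: "enable_endpoint",
    422: "adjust_target",
    502: "retry",
}


def call_with_retry(send, attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call send() -> (status, body); retry only while the action is 'retry'."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        status, body = send()
        action = ACTIONS.get(status, "unexpected")
        # Stop on any non-retryable status, or when attempts are exhausted.
        if action != "retry" or attempt == attempts - 1:
            return status, body, action
        sleep(base_delay * 2 ** attempt)  # exponential backoff between tries
```

A `422` is deliberately not retried: the same cached function would raise the same exception, so the fix is to change the target text.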

7. Limits and sandbox scope

Compile vs. run. When generating the function, the LLM sees the first ~4,000 characters of your payload. Once compiled, the cached function runs on the full payload regardless of size.
Determinism. While a cache entry lives, the same (input shape, target text) pair always runs the same function, and that function is deterministic in its input.
Sandbox whitelist. Whitelisted stdlib modules: csv, decimal, base64, hashlib, hmac, uuid, codecs, io, json, re, math, string, collections, itertools, functools, statistics, datetime, html, time, xml.etree, urllib.parse, fractions, copy, textwrap, unicodedata, binascii, zlib. Blocked: os, sys, subprocess, socket, http.client, ctypes, pickle, anything that touches network, filesystem, or process state.
Timeout. 120 seconds for the LLM call (uncached). Cached calls return in milliseconds.
Cache lifetime. The cache is in-memory per node and survives until the next deploy. Once we add Postgres-backed caching (v2), entries will be durable.
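
The runtime side of an import whitelist can be illustrated with a guarded `__import__` and a reduced builtins table. This is a sketch under assumptions, not the StreamFix sandbox: the module and builtin lists are illustrative subsets, `load_transform` is a hypothetical name, and restricting builtins alone is not a security boundary in Python.

```python
import builtins

# Assumption: illustrative subsets, not the service's real lists.
ALLOWED_MODULES = {"json", "re", "math", "datetime", "decimal"}
SAFE_BUILTINS = (
    "len", "range", "str", "int", "float", "dict", "list",
    "sorted", "sum", "min", "max", "enumerate", "zip", "isinstance",
)


def _guarded_import(name, globals=None, locals=None, fromlist=(), level=0):
    """Allow absolute imports of whitelisted top-level modules only."""
    if level != 0 or name.split(".")[0] not in ALLOWED_MODULES:
        raise ImportError(f"module {name!r} is not whitelisted")
    return builtins.__import__(name, globals, locals, fromlist, level)


def load_transform(source: str):
    """Execute generated source with reduced builtins; return its transform()."""
    safe = {name: getattr(builtins, name) for name in SAFE_BUILTINS}
    safe["__import__"] = _guarded_import
    namespace = {"__builtins__": safe}
    exec(compile(source, "<generated>", "exec"), namespace)
    return namespace["transform"]


src = "import math\ndef transform(payload):\n    return {'root': math.sqrt(payload['n'])}\n"
transform = load_transform(src)
result = transform({"n": 9})
```

Source that tries `import os` fails at load time with an `ImportError`, and a function that calls `open` fails with a `NameError` because `open` is not in the reduced builtins.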

8. When NOT to use this

The transformation is trivial and stable. If three fields map 1:1 from your input to your output, write the mapping in 10 lines. No external dependency.
You're not a developer. If you don't have application code, you don't need an API. Use Zapier or Make.com - their visual editors solve the same problem without the code.
You need PII to never leave your network. The first call sends the input shape (the structural fingerprint, not the values themselves) and the target text to OpenAI for code generation. Cached calls run entirely in our sandbox and send nothing to OpenAI. If that first-call exposure is unacceptable, this isn't the right fit.
You need air-gapped processing or specific compliance certifications beyond what StreamFix offers - check the security page for current scope.
You're operating at very high throughput with many new shapes (thousands of distinct shapes per second). Every first call for a new shape waits on code generation, which takes seconds. Steady-state cached calls are fine, but bursts of new shapes are expensive in latency.
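
For the first case above, the hand-written alternative really is short. A sketch of the quickstart mapping done directly in application code; the function name `to_contact` and the placeholder email address are made up for this example.

```python
def to_contact(payload: dict) -> dict:
    """Hand-written equivalent of the quickstart target text."""
    return {
        "email": payload["email"],
        "full_name": f'{payload["first_name"]} {payload["last_name"]}',
    }


contact = to_contact(
    {"first_name": "Ada", "last_name": "Lovelace", "email": "ada@example.com"}
)
```

When the mapping is this simple and unlikely to change, a function like this has no network hop and no external dependency.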

9. Get an API key

Free trial credits on signup. The same Bearer key works for /v1/map and StreamFix's other endpoints.
