Zero Data Retention at ScaleDown
At ScaleDown, zero data retention (ZDR) means one specific thing: your prompt data is never written to disk. It exists in memory for the duration of the request and nowhere else. When the response is returned, the prompt is gone. There is no retention period because there is no retention.
The Request Lifecycle
Understanding the guarantee requires understanding exactly what happens to your data from the moment a request enters ScaleDown to the moment it leaves.
A request arrives at the ScaleDown API. The input is tokenized in memory. The tokenized representation is passed to the model for processing. While the model runs, system-level metrics such as latency, throughput, and compression ratio are collected from the tokenized data and from infrastructure telemetry. The model produces a response. The response is returned to the caller. The prompt data is discarded.
Nothing in that sequence involves writing the prompt to a database, a log file, a message queue, or any persistent store. The telemetry collected during processing is derived from structural properties of the request, not from the contents.
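The lifecycle above can be sketched in a few lines of Python. This is a minimal illustration, not ScaleDown's implementation: the names `handle_request`, `tokenize`, `run_model`, and `Telemetry` are hypothetical, the tokenizer and model are stand-ins, and the compression ratio formula (input tokens divided by output tokens) is an assumption made for the example.

```python
import time
from dataclasses import dataclass


@dataclass(frozen=True)
class Telemetry:
    """Structural metrics only; no field can hold prompt text or tokens."""
    latency_ms: float
    input_tokens: int
    output_tokens: int
    compression_ratio: float


def tokenize(prompt: str) -> list[str]:
    # Stand-in tokenizer (assumption): whitespace split, held only in memory.
    return prompt.split()


def run_model(tokens: list[str]) -> list[str]:
    # Stand-in model (assumption): keeps every other token.
    return tokens[::2]


def handle_request(prompt: str) -> tuple[str, Telemetry]:
    start = time.perf_counter()
    tokens = tokenize(prompt)
    output = run_model(tokens)
    telemetry = Telemetry(
        latency_ms=(time.perf_counter() - start) * 1000,
        input_tokens=len(tokens),
        output_tokens=len(output),
        # Assumed definition: input size relative to output size.
        compression_ratio=len(tokens) / max(len(output), 1),
    )
    # No database, log, or queue write occurs in this function. The prompt
    # and tokens are local variables that go out of scope on return.
    return " ".join(output), telemetry
```

The point of the sketch is that the only values that outlive the function are the response, which goes back to the caller, and a handful of numbers computed from lengths and timings.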
What We Retain vs. What We Never Retain
The data ScaleDown keeps is limited to what is needed to operate and improve the infrastructure.
What we retain: request latency, throughput, token counts, compression ratios, model version, and error codes.
What we never retain: prompt contents, tool call inputs and outputs, user-supplied context, conversation history, and request bodies in any form.
The data we keep has no semantic content. Token count tells us a request was 2,400 tokens. It does not tell us what those tokens said. Compression ratio tells us how efficiently the input was encoded. It does not tell us what was encoded. There is no path from retained telemetry back to the original prompt.
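One way to picture the boundary between retained and never-retained data is an allow-list on the telemetry record. The sketch below is an assumption for illustration only: the field names, types, and the `build_telemetry_record` helper are hypothetical and not taken from ScaleDown's codebase.

```python
# Hypothetical allow-list mirroring the retained fields listed above.
ALLOWED_FIELDS = {
    "latency_ms": float,
    "throughput_rps": float,
    "token_count": int,
    "compression_ratio": float,
    "model_version": str,
    "error_code": str,
}


def build_telemetry_record(**fields):
    """Build a record, rejecting any field that is not allow-listed."""
    record = {}
    for name, value in fields.items():
        expected = ALLOWED_FIELDS.get(name)
        if expected is None:
            # Anything such as "prompt" or "request_body" is refused outright.
            raise ValueError(f"field not allow-listed: {name}")
        if not isinstance(value, expected):
            raise TypeError(f"{name} must be {expected.__name__}")
        record[name] = value
    return record
```

For example, `build_telemetry_record(token_count=2400, compression_ratio=3.2)` succeeds, while `build_telemetry_record(prompt="...")` raises `ValueError`. A record built this way can say that a request was 2,400 tokens long, but it has no slot for what those tokens were.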
No Exceptions
The behavior is uniform across all request types and environments. Production requests, staging requests, and debug requests go through the same pipeline with the same data handling.
A request that fails partway through (model timeout, upstream error, rate limit) logs error metadata: the error type, the point of failure, and the relevant system metrics. It does not log the request contents that were in flight when the failure occurred.
This matters because exceptions are where guarantees break down. A system that handles nominal requests correctly but logs full request bodies on errors is not a ZDR system. The failure path is often the highest-risk path: failures happen under load, during debugging, in environments with reduced observability controls. ScaleDown applies the same handling regardless.
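A failure path that records only error metadata might look like the following sketch. The names `process`, `error_metadata`, and the `"model"` stage label are hypothetical. The sketch also assumes that exception messages are treated as untrusted, since an upstream error string can echo request content, so only the exception type name is logged.

```python
import logging

logger = logging.getLogger("zdr_sketch")  # hypothetical logger name


def error_metadata(exc: Exception, stage: str, token_count: int) -> dict:
    # Only the exception type, the failing stage, and a structural metric.
    # str(exc) is deliberately omitted because it may contain request text.
    return {
        "error_type": type(exc).__name__,
        "stage": stage,
        "token_count": token_count,
    }


def process(prompt: str, model):
    """Run the model; on failure return (None, metadata) without the prompt."""
    tokens = prompt.split()  # stand-in tokenizer (assumption)
    try:
        return model(tokens), None
    except Exception as exc:
        meta = error_metadata(exc, "model", len(tokens))
        logger.error("request failed: %s", meta)
        return None, meta
```

If the model raises `TimeoutError("prompt was: ...")`, the logged metadata contains `"TimeoutError"`, the stage, and the token count, and nothing from the message or the request body.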
Key Takeaways
Prompt data exists only for the duration of processing. It is never persisted beyond the request lifecycle, not for debugging, not for model improvement, not in any form that survives the request boundary.
If you’re building on ScaleDown with sensitive inputs (legal documents, financial records, healthcare data, internal communications), the guarantee is structural, not contractual. The data is not retained because the system has no mechanism to retain it.
We offer 50M free tokens for every agent. Try it at scaledown.ai.


