Wafer Privacy Policy

Last updated: June 22, 2026

Wafer, Inc. (“Wafer,” “we,” “us”) operates an inference platform that serves optimized open-source large language models via API. This Privacy Policy explains what data we collect when you use our services at wafer.ai and our API endpoints, how we use it, and the choices you have.

1. Scope

This policy applies to:

Our website (wafer.ai and subdomains).
Our inference API, including Serverless and any dedicated or enterprise endpoints.
Any account or billing interactions you have with us directly or through a reseller (e.g., an API aggregator).

It does not cover third-party services that integrate with Wafer (such as agent harnesses, IDE extensions, or aggregators); those services have their own privacy policies.

2. Data We Collect

2.1 Account and billing data

When you sign up or subscribe, we collect your name, email, and billing information. Payments are processed by our payment provider; we do not store full credit card numbers on our servers.

2.2 API request data

When you call our API, we process:

Prompts, completions, and any tool/function call payloads you send or receive.
Request metadata such as model, timestamp, token counts, latency, IP address, user agent, and API key identifier.
Error traces and diagnostic information when a request fails.

2.3 Website data

When you visit wafer.ai, we collect standard web analytics (pages viewed, referrer, approximate location derived from IP, device and browser type). We use cookies only where needed to operate the site and for basic analytics.

3. How We Use Data

We use the data above to:

Serve your inference requests and return responses.
Operate, monitor, and improve the reliability, performance, and security of our systems.
Investigate abuse, fraud, and violations of our Terms of Service.
Bill for usage and manage your account.
Communicate with you about your account, service changes, and (if you opt in) product updates.

4. Model Training

We do not train the models we serve on your prompts or completions. Inputs and outputs from your API requests are never used to train, fine-tune, or otherwise improve the capabilities or behavior of the open-source models we serve, and we do not share your prompts or completions with any third-party model provider for training.

Speculative decoding. To make inference faster and cheaper, we train small internal “draft” models used only for speculative decoding. These draft models propose candidate tokens that the main model then verifies, so every token you receive is still produced by the open-source model you selected. Speculative decoding does not change model outputs; it only accelerates them. We may use prompts and completions from traffic that is not subject to Zero Data Retention or another contractual retention restriction to train and evaluate these draft models. Draft models are used internally for serving performance, are not exposed as a product, and are not shared with third parties. Traffic covered by Zero Data Retention in Section 6 is excluded from draft-model training.

We also use aggregated, non-content metadata (e.g., token counts, latency distributions, error rates) to improve our kernel optimization and serving infrastructure.

5. Data Retention

For standard Serverless usage and other offerings not covered by Zero Data Retention, request logs containing prompts and completions may be stored in an anonymized form, stripped of direct account identifiers beyond an internal request ID, for up to 30 days to detect, prevent, and investigate abuse, then deleted.

For Serverless requests made with Zero Data Retention enabled or required on models marked ZDR supported in app.wafer.ai, and for grandfathered accounts with Zero Data Retention enabled, prompts and completions are not written to durable storage after request processing. We retain only the operational metadata needed to run, secure, and bill the service.

Other inference API offerings, including dedicated, enterprise, or custom endpoints, may have different retention terms based on the product, endpoint configuration, or customer agreement that applies to that service.

Operational metadata (token counts, latency, error codes, and related service diagnostics) may be retained longer for billing reconciliation, capacity planning, service security, and abuse prevention.

Account and billing records are retained for as long as your account is active and for the period required by applicable tax and accounting laws after closure.

6. Serverless Zero Data Retention (ZDR)

Wafer Serverless supports Zero Data Retention for every model marked ZDR supported in app.wafer.ai. When a request uses a Privacy, Enterprise, or other ZDR-enabled key, includes Wafer-ZDR: required for a ZDR-supported model, or comes from an account we explicitly grandfather into ZDR:

Prompts and completions are not written to durable storage after request processing.
Prompts and completions are not used for the speculative-decoding draft-model training described in Section 4.
Only minimal metadata required to operate the service, bill for usage, and prevent abuse, such as token counts, timestamps, request IDs, model, and latency, may be retained longer.

This ZDR commitment applies to Serverless requests where ZDR is enabled or required for a ZDR-supported model, grandfathered ZDR access, or a customer agreement that explicitly says it applies. Other Wafer inference API products may have different retention terms by product, endpoint, or contract.

We share data only as needed to run the service:

Infrastructure providers. We run inference on cloud and hardware partners under contractual confidentiality and data protection terms. These providers process data only on our instructions.
Payment and business tools. We share limited data with our payment processor, billing platform, and business communications tools.
Legal requests. We may disclose data to comply with valid legal process, to protect our rights or property, or to protect the safety of users or the public.
Business transfers. If Wafer is involved in a merger, acquisition, or asset sale, your data may be transferred as part of that transaction, subject to this policy.

We do not sell personal data, and we do not share prompts or completions with advertisers or data brokers.

8. Security

We use industry-standard safeguards to protect your data, including TLS encryption for data in transit, encryption at rest for stored logs, access controls on internal systems, and audit logging. No system is perfectly secure; we cannot guarantee absolute security, but we work to meet or exceed industry norms for inference providers.

9. Your Rights and Choices

Depending on where you live, you may have rights to access, correct, delete, or port your personal data, or to object to or restrict certain processing. To exercise these rights, email us at the address below. We will verify your identity before acting on a request.

You can also:

Delete your API keys at any time from your account.
Request deletion of your account and associated logs, subject to legal retention requirements.
Opt out of marketing emails using the unsubscribe link in any such email.

10. International Users

Inference is served from data centers in the United States. If you access Wafer from outside the U.S., your data will be transferred to and processed in the U.S., which may have different data protection laws than your jurisdiction. By using the service, you consent to this transfer.

11. Children’s Privacy

Wafer is not directed to children under 13 (or the equivalent minimum age in your jurisdiction), and we do not knowingly collect personal data from them. If you believe a child has provided us personal data, contact us and we will delete it.

12. Changes to This Policy

We may update this policy from time to time. We will post the new version at wafer.ai/privacy-policy and update the “Last updated” date. For material changes, we will provide additional notice (e.g., by email or an in-product banner) before the changes take effect.

13. Account Use and Acceptable Use

Wafer Serverless API access is licensed to the account holder or organization that owns the API key. To keep the service fast and fair for everyone, we enforce the following limits on Serverless usage:

One account per person or organization. You may not operate multiple Serverless accounts to bypass rate limits, billing controls, abuse prevention systems, or eligibility requirements.
Rate and concurrency limits. Serverless enforces per-account, per-key, and per-model rate or concurrency limits. Requests beyond these limits may be queued or rejected until earlier requests complete.
No reselling or pooling. You may not resell access to your Serverless account, proxy it for third parties, or pool a single account across unaffiliated users or organizations. Teams and organizations should contact us at hi@wafer.ai about a dedicated or enterprise plan.

The contractual terms that govern your Serverless access, including billing, refunds, cancellation, enforcement of the limits above, and our right to suspend or terminate accounts that violate them, are described in our Terms of Service.

14. Contact Us

Questions, requests, or complaints about this policy can be sent to:

Wafer, Inc.
Email: hi@wafer.ai
Website: https://wafer.ai