Security and trust
Data Handling by Deployment Level
ML Mind can start with minimal information and requires deeper request-path access only when customers opt into stronger control. This reduces adoption friction and helps teams roll out AI cost governance safely.
| Level | Access | Data typically used | What ML Mind can do |
|---|---|---|---|
| Level 0 | Reports only | Billing exports, usage reports, summarized logs | Discover waste and recommend savings |
| Level 1 | Telemetry | request_id, model, provider, tokens, latency, error type, cost | Build cost dashboards and identify retry/cost patterns |
| Level 2 | Pre-model control | Question metadata, context, retrieved chunks, token budget | Optimize prompts and RAG context before the model sees them |
| Level 3 | Inference gateway | Request, selected model, response status, verification result | Route, cache and verify requests; prevent redundant retries and enforce fallbacks |
| Level 4 | ModelOps | GPU metrics, queue, model replicas, OOM events, batching | Optimize self-hosted serving and GPU utilization |
| Level 5 | Lifecycle | Training runs, datasets, checkpoints, validation metrics | Control training cost, release gates and experiment duplication |
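To make the metadata-only footprint of Level 1 concrete, a telemetry record could look like the sketch below. The field names follow the table above; the schema itself is an illustrative assumption, not ML Mind's actual wire format.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical Level 1 telemetry record: request metadata only.
# No prompt or response content is included at this level.
@dataclass
class TelemetryRecord:
    request_id: str
    model: str
    provider: str
    tokens: int                # total tokens billed for the request
    latency_ms: float
    error_type: Optional[str]  # e.g. "rate_limit", "timeout", or None
    cost: float                # provider-reported cost in USD

record = TelemetryRecord("req-001", "gpt-4o", "openai",
                         tokens=1200, latency_ms=840.0,
                         error_type=None, cost=0.018)
```

A record like this is enough to build cost dashboards and spot retry patterns, which is why Level 1 can be adopted without exposing any content.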
Data minimization principles
Start with aggregate data
Many audits can begin from reports, billing exports and telemetry without raw prompt content.
Use only what is required
Deeper request-path access is limited to the optimization function being deployed.
Separate savings from content
Cost, latency, retry and routing insights can often be generated from metadata before content-level access is needed.
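The "separate savings from content" principle can be sketched as a purely metadata-driven aggregation. The function below is illustrative only (not part of ML Mind): it surfaces spend wasted on failed calls per model from (model, error_type, cost) tuples, with no prompt or response content involved.

```python
from collections import defaultdict

def wasted_spend(records):
    """Sum cost of failed requests per model, from metadata alone.

    records: iterable of (model, error_type, cost) tuples,
    where error_type is None for successful requests.
    """
    waste = defaultdict(float)
    for model, error_type, cost in records:
        if error_type is not None:  # failed or retried request
            waste[model] += cost
    return dict(waste)

records = [
    ("gpt-4o", None, 0.018),
    ("gpt-4o", "rate_limit", 0.018),
    ("claude-3", "timeout", 0.024),
    ("claude-3", None, 0.024),
    ("gpt-4o", "rate_limit", 0.018),
]
print(wasted_spend(records))  # {'gpt-4o': 0.036, 'claude-3': 0.024}
```

Because the input carries no content, this kind of analysis can run at Level 0 or Level 1, before any deeper request-path access is granted.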
Free AI FinOps Audit
Choose the safest starting level
Request a deployment review and learn which level fits your AI stack and privacy requirements.
Turn this page into action
ML Mind is designed to move from content to evidence: simulate your workload, generate a savings report, then request a structured AI FinOps audit.