OWASP LLM Top 10 · LLM10
Unbounded Consumption
Without usage limits, attackers can run up your bill, degrade service, or extract your model.
What it is
LLM calls cost money and compute. Unbounded consumption is the absence of limits on how much a caller can use. That opens the door to denial-of-service, 'denial-of-wallet' (running up your inference bill), and resource-heavy extraction attacks that try to clone your model or harvest its behaviour at scale.
How it shows up in real apps
- No rate or quota limits per user or key, so a single caller can flood the API.
- Unbounded input or output sizes and context windows that drive cost and latency.
- Denial-of-wallet: cheap-to-send, expensive-to-serve requests aimed at your bill.
- High-volume querying to extract or distil the model's behaviour.
A concrete example
Scenario
A public-facing assistant accepts large inputs and long generations.
Attack
An attacker scripts many maximal requests in parallel.
Result
Latency spikes for real users and the monthly inference bill balloons. The attacker needs no exploit against the model itself, only the absence of limits.
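To make the scale of this scenario concrete, here is a rough back-of-the-envelope sketch. Every number in it is a hypothetical assumption for illustration (the per-1k-token prices, the token counts, the attacker's concurrency), and the function name `worst_case_cost` is ours; substitute your provider's real pricing and your own limits.

```python
def worst_case_cost(concurrency, requests_per_worker, input_tokens,
                    output_tokens, usd_per_1k_input, usd_per_1k_output):
    """Estimate spend if every scripted request uses the maximum sizes allowed."""
    total_requests = concurrency * requests_per_worker
    per_request = ((input_tokens / 1000) * usd_per_1k_input
                   + (output_tokens / 1000) * usd_per_1k_output)
    return total_requests * per_request


# Hypothetical prices and sizes, not taken from any real provider.
PRICE_IN = 0.003   # assumed USD per 1k input tokens
PRICE_OUT = 0.015  # assumed USD per 1k output tokens

# 50 parallel workers sending 1,000 requests each.
uncapped = worst_case_cost(50, 1000, 100_000, 4_000, PRICE_IN, PRICE_OUT)
capped = worst_case_cost(50, 1000, 4_000, 500, PRICE_IN, PRICE_OUT)

print(f"uncapped: ${uncapped:,.2f}  capped: ${capped:,.2f}")
```

Under these assumed figures the same attack costs about 18,000 USD with no size limits and about 975 USD with modest caps, before any rate limiting is applied. The point is the ratio, not the absolute numbers.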
How we test for it
We check for rate and quota controls per identity, input and output bounds, and behaviour under bursty load, and we assess whether high-volume querying could feasibly extract model behaviour. Any load-style testing is strictly scoped and authorised. We never run denial-of-service tests against production without explicit agreement.
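As an illustration only, a scoped burst check can be as small as the sketch below. It is not a description of our tooling. The helper `probe_burst` and the `send_request` callable are hypothetical: the callable is assumed to make one authorised request to the target and return its HTTP status code. The sketch treats 429 (Too Many Requests) as the throttling signal, which is itself an assumption, since some systems queue requests or answer with a different status.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def probe_burst(send_request, burst_size, workers=8):
    """Fire burst_size calls concurrently and summarise the status codes seen.

    send_request: hypothetical zero-argument callable returning an HTTP status.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        statuses = list(pool.map(lambda _: send_request(), range(burst_size)))
    counts = Counter(statuses)
    throttled = counts.get(429, 0)
    return {
        "sent": burst_size,
        "throttled": throttled,
        # No 429 at all is a prompt to investigate, not proof of a gap.
        "no_throttling_observed": throttled == 0,
        "counts": dict(counts),
    }
```

Keep `burst_size` small and agreed in advance. The aim is to observe whether a limit exists, not to exhaust the service.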
How to reduce the risk
- Enforce rate limits and quotas per user or API key, and cap input and output sizes.
- Set spend alerts and budgets on inference, and throttle or queue under load.
- Detect and limit abusive high-volume patterns, and require auth for expensive paths.
- Right-size context windows and timeouts to bound worst-case cost.
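The first and last points above can be sketched together as a small admission gate: a token bucket per API key plus hard caps on input size and requested output length. This is a minimal in-process sketch under stated assumptions, not a production design. The names `TokenBucket`, `RequestGate`, `MAX_INPUT_CHARS` and `MAX_OUTPUT_TOKENS`, and the limit values themselves, are hypothetical.

```python
import time

MAX_INPUT_CHARS = 8_000    # assumed cap on prompt size
MAX_OUTPUT_TOKENS = 512    # assumed cap on generation length


class TokenBucket:
    """Allows short bursts up to `capacity`, then `refill_per_sec` on average."""

    def __init__(self, capacity, refill_per_sec, clock=time.monotonic):
        self.capacity = capacity
        self.refill_per_sec = refill_per_sec
        self.clock = clock
        self.tokens = float(capacity)
        self.updated = clock()

    def allow(self, cost=1.0):
        now = self.clock()
        elapsed = now - self.updated
        self.updated = now
        # Refill for the elapsed time, never beyond capacity.
        self.tokens = min(self.capacity,
                          self.tokens + elapsed * self.refill_per_sec)
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False


class RequestGate:
    """Per-key rate limiting plus input and output bounds."""

    def __init__(self, capacity=5, refill_per_sec=1.0, clock=time.monotonic):
        self.capacity = capacity
        self.refill_per_sec = refill_per_sec
        self.clock = clock
        self.buckets = {}

    def admit(self, api_key, prompt, requested_max_tokens):
        """Return (allowed, reason, max_tokens_to_use)."""
        # Reject oversized input before spending any rate budget on it.
        if len(prompt) > MAX_INPUT_CHARS:
            return (False, "input_too_large", 0)
        if api_key not in self.buckets:
            self.buckets[api_key] = TokenBucket(
                self.capacity, self.refill_per_sec, self.clock)
        if not self.buckets[api_key].allow():
            return (False, "rate_limited", 0)
        # Clamp the output length instead of trusting the caller's value.
        return (True, "ok", min(requested_max_tokens, MAX_OUTPUT_TOKENS))
```

In a real deployment the bucket state would normally live in a shared store so that limits hold across instances, and idle keys would be evicted so the table of buckets is itself bounded. A rejected request would typically map to an HTTP 429 or 413 response.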
EU AI Act: commonly maps to Art. 15 (accuracy, robustness and cybersecurity). Redproof reports findings as independent testing evidence, not a conformity verdict.
Test this on your own AI before someone else does
Redproof is independent red-teaming for LLM and AI-agent products. We probe your system for unbounded consumption and the rest of the OWASP LLM Top 10, hand you severity-ranked findings with reproductions, fixes, and EU AI Act mapping, and re-test after you patch. That is the evidence your self-assessment needs, before a regulator or customer asks.