Why Your Rate Limits Fail Under Distributed DDoS Attacks

Discover how distributed DDoS attacks exploit common rate limit failures and learn strategies to enhance your defenses.

Written by I. Solomon

Updated at May 28th, 2026

Table of Contents

  • Key takeaways
  • Rate Limiting Is the Foundation of Layer 7 DDoS Protection
  • How Rate Limiting Worked in Traditional Architectures
  • The Distributed Counting Problem
  • Why This Matters
  • What Security Teams Should Do
      • 1. Run Distributed DDoS Simulations
      • 2. Ask Vendors Direct Questions
      • 3. Tune Policies Based on Real Behavior
  • Final Thoughts
  • FAQs
      • Why do rate limits fail during distributed DDoS attacks?
      • Are cloud WAF rate limits global or local?
      • How can attackers bypass rate limiting?
      • How can organizations verify their rate limiting actually works?

There is often a significant gap between what security teams believe their DDoS protections are doing and how those protections behave during a real attack.

Rate limiting is one of the most widely deployed application-layer DDoS defense mechanisms, yet it is also one of the most misunderstood. Many organizations configure rate limit rules and assume they are protected, only to discover during an incident that the rules behave very differently under distributed attack conditions.

This post explains why.

Key takeaways

  • Rate limiting is still a core Layer 7 DDoS defense, but it is often misunderstood in distributed environments
  • Cloud and CDN architectures split traffic across PoPs, which can prevent global thresholds from triggering
  • A rule like 12 RPS may be enforced per edge location instead of globally across the attack
  • Attackers can bypass protections by distributing traffic across multiple regions or edge nodes
  • The real security outcome depends on how the vendor aggregates and synchronizes request counters
  • Effective protection requires testing, not just configuration
  • DDoS simulations reveal whether rate limits behave globally or only locally
  • Vendor architecture details directly impact real-world protection strength

Rate Limiting Is the Foundation of Layer 7 DDoS Protection

Rate limiting is designed to prevent clients from exhausting application resources by enforcing thresholds on request volume.

The concept is straightforward:

  • Define an acceptable request rate
  • Monitor incoming traffic
  • Block, challenge, or throttle clients that exceed the threshold
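The three steps above can be sketched as a fixed-window counter. This is a minimal illustration, not any vendor's implementation: the class name `FixedWindowLimiter`, the one-second window, and the example client address are assumptions made for the sketch.

```python
from collections import defaultdict


class FixedWindowLimiter:
    """Counts requests per client in one-second windows (a deliberately simple scheme)."""

    def __init__(self, max_rps: int) -> None:
        # Step 1: define the acceptable request rate.
        self.max_rps = max_rps
        # (client_id, window_start_second) -> requests seen in that window.
        # Old windows are never evicted here; a real limiter would expire them.
        self.counts = defaultdict(int)

    def allow(self, client_id: str, now: float) -> bool:
        # Step 2: monitor incoming traffic by counting it per client and window.
        window = int(now)
        self.counts[(client_id, window)] += 1
        # Step 3: reject once the client exceeds the threshold in this window.
        return self.counts[(client_id, window)] <= self.max_rps


limiter = FixedWindowLimiter(max_rps=12)
# 13 requests from one (hypothetical) client inside the same second.
decisions = [limiter.allow("203.0.113.7", now=100.0 + i * 0.05) for i in range(13)]
allowed = sum(decisions)
print(allowed)  # 12 requests pass, the 13th is rejected
```

The sketch works because a single object sees every request. The rest of this post is about what happens when that assumption no longer holds.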

Without rate limiting, application-layer defenses are incomplete.

However, configuring a rate limit rule does not guarantee effective protection. The effectiveness of rate limiting depends heavily on where traffic is inspected, how counters are synchronized, and how requests are aggregated across distributed infrastructure.

Those implementation details matter far more than most teams realize.

How Rate Limiting Worked in Traditional Architectures

Legacy on-premises WAF appliances operated as centralized inspection points inside the data center.

Because every request passed through a single enforcement point, request counting was highly accurate. If a policy allowed 12 requests per second (RPS), the 13th request within the same second was blocked immediately.

The limitation of this model was scalability. Under volumetric attacks, the appliance itself often became the bottleneck, failing through CPU exhaustion, memory saturation, or network link congestion.

Modern cloud WAFs and CDN-based protection platforms solved many of these scaling problems by distributing enforcement across global infrastructure. But distributed enforcement introduced a new challenge: How are requests actually counted across the network?

The Distributed Counting Problem

This is where many rate limiting strategies fail.

Most cloud WAF providers process traffic across multiple edge locations, data centers, or Points of Presence (PoPs). Depending on the vendor architecture, rate limit counters may be maintained:

  • Per edge server
  • Per PoP
  • Per region
  • Or globally with synchronization delays

That distinction is critical during distributed attacks. Consider the following scenario:

A security team configures a rule to block any client exceeding 12 RPS. An attacker launches traffic from Singapore at 30 RPS from a single source IP while intentionally distributing requests across multiple CDN edge locations.

The cloud provider operates several PoPs in the region, and the traffic is distributed across them:

  • PoP A sees 11 RPS
  • PoP B sees 10 RPS
  • PoP C sees 9 RPS

The total attack rate is 30 RPS.

However, no individual PoP observes traffic exceeding the 12 RPS threshold. The result is that the rate limit rule never triggers. The attack bypasses the WAF and reaches the origin infrastructure despite technically violating the configured threshold.
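The scenario can be reproduced in a few lines. The sketch below is a simplified model, not a description of how any particular provider counts: it assumes an even round-robin spread of the 30 requests across three PoPs (the real split would be uneven), and the PoP names and the `simulate_one_second` function are hypothetical.

```python
from collections import Counter

THRESHOLD_RPS = 12                   # the configured rule from the scenario
ATTACK_RPS = 30                      # requests sent by one source IP in one second
POPS = ["pop-a", "pop-b", "pop-c"]   # hypothetical PoP names


def simulate_one_second(scope: str) -> tuple[int, int]:
    """Return (allowed, blocked) for one second of attack traffic.

    scope="pop" keeps a separate counter per PoP; scope="global" shares one counter.
    """
    counters = Counter()
    allowed = blocked = 0
    for i in range(ATTACK_RPS):
        pop = POPS[i % len(POPS)]    # round-robin spread, an assumption
        key = pop if scope == "pop" else "global"
        counters[key] += 1
        if counters[key] > THRESHOLD_RPS:
            blocked += 1
        else:
            allowed += 1
    return allowed, blocked


per_pop = simulate_one_second("pop")
global_scope = simulate_one_second("global")
print(per_pop)        # (30, 0): every request reaches the origin
print(global_scope)   # (12, 18): the rule fires as the dashboard suggests
```

The rule and the traffic are identical in both runs. Only the scope of the counter changes, and with it the outcome.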

 

Why This Matters 

Many security teams assume rate limiting is globally enforced across the provider network. In reality, enforcement behavior varies significantly between vendors and architectures.

Some platforms aggregate counters locally. Others synchronize counters regionally with delays. Some offer global aggregation only under specific configurations or licensing tiers.

As a result, the effective protection level may be dramatically weaker than what appears in the dashboard configuration. A threshold configured at 12 RPS may effectively behave like:

  • 12 RPS per edge server
  • 12 RPS per PoP
  • 12 RPS per region

Each of these has a very different security outcome.
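A rough way to quantify the difference is to multiply the threshold by the number of independent counters. The sketch below computes that upper bound under two assumptions: the attacker can spread traffic perfectly evenly, and counters are never synchronized. The topology numbers are invented for illustration and do not describe any real provider.

```python
def effective_ceiling(threshold_rps: int, enforcement_points: int) -> int:
    """Worst-case aggregate rate that stays under every independent counter."""
    return threshold_rps * enforcement_points


# Hypothetical topology: how many independent counters exist at each scope.
topology = {"edge server": 200, "PoP": 40, "region": 5, "global": 1}

ceilings = {scope: effective_ceiling(12, n) for scope, n in topology.items()}
for scope, ceiling in ceilings.items():
    print(f"12 RPS per {scope}: up to {ceiling} RPS can pass in aggregate")
```

In this made-up topology, the same "12 RPS" rule permits anywhere from 12 to 2,400 requests per second. A real attacker rarely achieves a perfectly even spread, so these figures are ceilings rather than predictions.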

What Security Teams Should Do

1. Run Distributed DDoS Simulations

The only reliable way to validate rate limiting behavior is through controlled, distributed attack simulations. Testing traffic from multiple geographic regions quickly reveals whether counters are local or global, how synchronization behaves under load, and whether protections degrade during distributed attacks.
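One way to read the results of such a test is to compare what was sent with what reached the origin. The helper below is a hypothetical sketch, not part of any testing tool. It assumes every region sends traffic under the same client identity the rule keys on, that each region stays below the threshold while the combined rate exceeds it, and that origin logs provide the observed rate. The function name, region names, and 10% tolerance are all assumptions.

```python
def classify_enforcement(
    threshold_rps: float,
    sent_rps_by_region: dict[str, float],
    observed_origin_rps: float,
    tolerance: float = 0.1,
) -> str:
    """Infer the counter scope from one distributed test run."""
    total_sent = sum(sent_rps_by_region.values())
    if max(sent_rps_by_region.values()) > threshold_rps or total_sent <= threshold_rps:
        raise ValueError(
            "test is not discriminating: keep each region under the "
            "threshold and the combined rate above it"
        )
    if observed_origin_rps <= threshold_rps * (1 + tolerance):
        return "global"   # traffic was capped near the configured threshold
    if observed_origin_rps >= total_sent * (1 - tolerance):
        return "local"    # nearly everything passed: counters are isolated
    return "partial"      # in between, for example counters that sync with a lag


# Hypothetical test: three regions at 10 RPS each against a 12 RPS rule.
sent = {"singapore": 10, "frankfurt": 10, "virginia": 10}
verdict = classify_enforcement(12, sent, observed_origin_rps=29.4)
print(verdict)  # local
```

A "partial" verdict is worth investigating further, since it may indicate synchronization delays rather than a clean local or global model.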

2. Ask Vendors Direct Questions

Most organizations never validate how their provider performs aggregation. Ask your WAF or CDN vendor:

  • Are counters maintained per server, PoP, region, or globally?
  • What synchronization delays exist between enforcement points?
  • How does rate limiting behave under highly distributed traffic patterns?
  • Are there differences between product tiers or deployment modes?
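The synchronization question matters because even a "global" counter can leak traffic between syncs. The toy model below illustrates this under stated assumptions and does not describe any vendor's protocol: three PoPs each receive one request every 100 ms (30 RPS in total), each PoP decides using the last synchronized network-wide total plus what it has counted itself since then, and syncs occur every `sync_every` ticks. All names are hypothetical.

```python
def allowed_in_one_second(threshold: int, pops: int, sync_every: int, ticks: int = 10) -> int:
    """Count requests allowed in one second when counters sync only periodically."""
    total = 0                  # true network-wide request count
    snapshot = 0               # total as of the last sync, known to every PoP
    since_sync = [0] * pops    # what each PoP has counted itself since that sync
    allowed = 0
    for tick in range(ticks):
        if tick % sync_every == 0:
            snapshot = total               # all PoPs learn the current total
            since_sync = [0] * pops
        for p in range(pops):              # one request per PoP per tick
            total += 1
            since_sync[p] += 1
            view = snapshot + since_sync[p]  # this PoP's (possibly stale) estimate
            if view <= threshold:
                allowed += 1
    return allowed


# Sync every 100 ms, every 500 ms, and not at all within the window.
results = {n: allowed_in_one_second(threshold=12, pops=3, sync_every=n) for n in (1, 5, 10)}
print(results)  # {1: 12, 5: 15, 10: 30}
```

In this model, a 500 ms sync interval already lets 15 requests through a 12 RPS rule, and a counter that does not sync within the window behaves exactly like a per-PoP limit.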

3. Tune Policies Based on Real Behavior

Once you understand the aggregation model, calibrate thresholds accordingly. If your vendor isolates counters per data center, you may need to aggressively lower your RPS threshold, or layer your defenses by adding bot protection and behavioral analysis.
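As a starting point for that calibration, the per-counter threshold can be derived from the aggregate rate you actually want to allow. The sketch below assumes counters are fully isolated and that an attacker can reach all of them. The function name and the counter counts are hypothetical.

```python
def per_counter_threshold(global_target_rps: int, counters: int) -> int:
    """Largest per-counter limit that keeps the worst-case aggregate at or below the target."""
    if counters < 1:
        raise ValueError("counters must be at least 1")
    # Integer division rounds down; a limit below 1 RPS is clamped to 1.
    return max(1, global_target_rps // counters)


# Target: no more than 12 RPS in aggregate from one client.
three_pops = per_counter_threshold(12, counters=3)    # 4 RPS per PoP
forty_pops = per_counter_threshold(12, counters=40)   # clamped to 1 RPS per PoP
print(three_pops, forty_pops)
```

The trade-off is visible in the numbers. A legitimate client that is served by a single PoP is now limited to 4 RPS, or 1 RPS in the larger network, which may be too strict for normal use. That is why lowering thresholds alone is often not enough and layered defenses become necessary.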

Final Thoughts

Rate limiting remains one of the most important Layer 7 DDoS defenses. But modern cloud architectures fundamentally changed how enforcement works. If you have never tested how your rate limiting behaves during a globally distributed attack, you should assume your protection model contains blind spots.

During a real DDoS event, the difference between “configured” and “effective” protection becomes very visible.

Distributed attacks don’t respect clean assumptions about thresholds and enforcement layers. If you want to understand how your protections behave under real-world conditions, explore Red Button’s DDoS simulation and testing capabilities to validate your defenses before attackers do.

 

FAQs

Why do rate limits fail during distributed DDoS attacks?

Because traffic is often split across multiple edge locations, each node may stay below the threshold even when total traffic exceeds it.

Are cloud WAF rate limits global or local?

It depends on the vendor. Some enforce per server or PoP limits, while others offer regional or global aggregation with varying delays.

How can attackers bypass rate limiting?

They distribute requests across multiple regions or edge nodes, keeping each location under the configured threshold.

How can organizations verify their rate limiting actually works?

By running distributed DDoS simulations and testing how traffic is counted across different geographic locations and edge points.
