Ottorino Bruni  

How to Use Rate Limiting in ASP.NET Core for Better API Security and Performance

Introduction

When you build an API, it’s important to control how often clients can send requests.
Without limits, a single user or app could make too many calls in a short time. This can slow down your service or even make it crash for everyone.

Rate limiting helps you prevent this.
It defines how many requests are allowed in a specific period (for example, 100 requests per minute). If a client sends more than that, the server will temporarily reject new requests with a standard HTTP 429 – Too Many Requests response.

Using rate limiting helps you:

  • Protect your API from abuse and accidental overload
  • Keep the system stable and responsive
  • Share resources fairly between different users

Starting from .NET 7, ASP.NET Core includes a built-in Rate Limiting Middleware. It became even more powerful and flexible in .NET 8 and .NET 9.
You can now add rate limits to your app with just a few lines of code, without needing extra libraries or external tools.

In this article, you’ll learn how to use this middleware, configure different limiter types, and apply rate limits to specific endpoints in your ASP.NET Core apps.

What is Rate Limiting and Why It Matters

Rate limiting is a technique used to control how many requests a client can send to your API in a specific period of time.
For example, you might allow up to 100 requests per minute. If the client exceeds that number, the server will temporarily reject further requests until the time window resets.

Beyond being a simple throttle mechanism, rate limiting plays a key role in maintaining the overall health, security, and fairness of your system.

Here are the main reasons why it matters:

Preventing Abuse

Rate limiting helps protect your application from misuse or malicious activity by restricting how often a user, client, or IP can access your endpoints.
This is especially important for public APIs, where you cannot fully control who makes the requests.

Ensuring Fair Usage

Without limits, a small number of aggressive clients could consume most of your system’s capacity.
By applying rate limits, you ensure that all users get fair and predictable access to your resources.

Protecting Resources

Every API call consumes CPU time, memory, and possibly database or storage operations.
Rate limiting helps keep these resources under control, preventing overload and maintaining stability even during traffic spikes.

Enhancing Security

Rate limiting can mitigate the risk of Denial of Service (DoS) attacks and brute-force attempts by limiting how quickly requests are processed.
It becomes a first layer of defense that slows down attackers before they can cause serious harm.

Improving Performance

When you control the flow of incoming requests, your system can maintain consistent response times and avoid the cascading failures that happen under heavy load.
This leads to a smoother and more predictable experience for users.

Cost Management

If your backend relies on paid services, such as third-party APIs, databases, or cloud storage, rate limiting helps you control operational costs by keeping usage within defined limits.

The Built-in Rate Limiting Middleware

Starting with .NET 7, ASP.NET Core introduced a built-in Rate Limiting Middleware, designed to help developers easily apply rate control to API requests.
This middleware is part of the ASP.NET Core pipeline and can be configured directly in your Program.cs file, with no external packages or proxies required.

The middleware works by applying rate limiting policies that define how many requests are allowed within a given time window.
These policies can be global, affecting the entire app, or endpoint-specific, applying only to selected routes.

Enabling the Middleware

To start using it, you first need to register it in your service container and enable it in the pipeline.

using System.Threading.RateLimiting;     // QueueProcessingOrder
using Microsoft.AspNetCore.RateLimiting; // AddFixedWindowLimiter

var builder = WebApplication.CreateBuilder(args);

builder.Services.AddRateLimiter(options =>
{
    options.AddFixedWindowLimiter("fixed", limiterOptions =>
    {
        limiterOptions.Window = TimeSpan.FromSeconds(10);
        limiterOptions.PermitLimit = 5;
        limiterOptions.QueueLimit = 2;
        limiterOptions.QueueProcessingOrder = QueueProcessingOrder.OldestFirst;
    });
});

var app = builder.Build();

app.UseRateLimiter();

app.MapGet("/api/hello", () => "Hello from Otto!")
   .RequireRateLimiting("fixed");

app.Run();

In this example:

  • AddRateLimiter() registers the rate limiting services, and AddFixedWindowLimiter() defines a named policy called “fixed”.
  • The Fixed Window limiter allows up to 5 requests every 10 seconds.
  • When the limit is reached, additional requests are placed in a queue (up to 2 in this case).
  • QueueProcessingOrder.OldestFirst ensures requests are processed in the order they arrived.
  • RequireRateLimiting(“fixed”) applies the policy to the specific endpoint /api/hello.
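
Policies do not have to be attached one endpoint at a time. As a sketch (the /api group and /health endpoint here are illustrative, not part of the original example), the same policy can cover a whole route group, with individual endpoints opted out:

// Apply the "fixed" policy to every endpoint in the group...
var api = app.MapGroup("/api").RequireRateLimiting("fixed");

api.MapGet("/hello", () => "Hello from Otto!");

// ...and exempt a specific endpoint, such as a health check.
api.MapGet("/health", () => Results.Ok()).DisableRateLimiting();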

How It Works

When a request arrives:

  1. The middleware checks the active limiter for the given endpoint.
  2. If the request is under the limit, it proceeds normally.
  3. If the limit is exceeded, the middleware rejects the request. The default rejection status code is 503 Service Unavailable, so most APIs set RejectionStatusCode to return the conventional HTTP 429 – Too Many Requests, and add headers such as Retry-After so clients know when to try again (see the sketch below).

This makes it easy for clients to understand when they can safely retry a request.
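
A minimal sketch of that configuration, reusing the “fixed” policy from above:

builder.Services.AddRateLimiter(options =>
{
    // The middleware rejects with 503 by default; opt in to the
    // conventional 429 Too Many Requests instead.
    options.RejectionStatusCode = StatusCodes.Status429TooManyRequests;

    options.AddFixedWindowLimiter("fixed", limiterOptions =>
    {
        limiterOptions.Window = TimeSpan.FromSeconds(10);
        limiterOptions.PermitLimit = 5;
    });
});

With this in place, rejected requests receive a 429 instead of the default 503. The Retry-After and RateLimit-* headers can be added from the OnRejected callback, as shown in the following sections.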

Understanding Different Limiter Types

The ASP.NET Core Rate Limiting Middleware supports several limiter types, each designed for different usage patterns.
Choosing the right one depends on how your API traffic behaves: whether it’s steady, bursty, or highly concurrent.

Let’s look at the main limiter types available in .NET 8 and .NET 9.

1. Fixed Window Limiter

The Fixed Window Limiter divides time into fixed intervals (windows) and allows a certain number of requests per window.

options.AddFixedWindowLimiter("fixed", limiterOptions =>
{
    limiterOptions.Window = TimeSpan.FromSeconds(10);
    limiterOptions.PermitLimit = 5;
});

How it works:
If you allow 5 requests every 10 seconds, and a client sends all 5 requests in the first second, the next requests within that same 10-second window will be rejected.

Best for:
Simple, predictable rate limiting rules where small timing inaccuracies are acceptable, for example protecting low-traffic endpoints or internal APIs.

2. Sliding Window Limiter

The Sliding Window Limiter provides smoother control by splitting each time window into segments.
This reduces the “burst” effect that can happen at the edges of fixed windows.

options.AddSlidingWindowLimiter("sliding", limiterOptions =>
{
    limiterOptions.Window = TimeSpan.FromSeconds(10);
    limiterOptions.PermitLimit = 5;
    limiterOptions.SegmentsPerWindow = 2;
});

How it works:
The window “slides” forward one segment at a time: with a 10-second window split into 2 segments, the count advances every 5 seconds, and requests from the most recent segments still count against the limit.
This prevents users from making two bursts back-to-back at the boundary of a fixed window.

Best for:
Public APIs or endpoints that receive continuous traffic, where you want to prevent spikes while keeping the flow smooth.

3. Token Bucket Limiter

The Token Bucket Limiter allows short bursts while maintaining a long-term rate limit.
Each request consumes a “token,” and tokens are refilled over time.

options.AddTokenBucketLimiter("token", limiterOptions =>
{
    limiterOptions.TokenLimit = 10;
    limiterOptions.TokensPerPeriod = 5;
    limiterOptions.ReplenishmentPeriod = TimeSpan.FromSeconds(10);
});

How it works:
Clients can make several quick requests as long as tokens are available.
Once the bucket is empty, further requests are rejected (or queued, if you set a QueueLimit) until tokens are replenished.

Best for:
APIs that need to handle bursty traffic gracefully, for example mobile or IoT clients that send data in batches.

4. Concurrency Limiter

The Concurrency Limiter restricts how many requests can be processed at the same time, rather than over time.

options.AddConcurrencyLimiter("concurrent", limiterOptions =>
{
    limiterOptions.PermitLimit = 2;
    limiterOptions.QueueLimit = 5;
    limiterOptions.QueueProcessingOrder = QueueProcessingOrder.OldestFirst;
});

How it works:
If more than two requests arrive simultaneously, the extra ones are queued or rejected depending on the queue settings.

Best for:
CPU- or I/O-intensive operations where you must limit the number of active requests, for example image processing or report generation APIs.
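
As a sketch of where this fits (the /api/report endpoint and the Task.Delay are illustrative stand-ins for real work), a slow operation guarded by the “concurrent” policy above:

// Only 2 of these run at once; up to 5 wait in the queue,
// and anything beyond that is rejected.
app.MapGet("/api/report", async () =>
{
    await Task.Delay(TimeSpan.FromSeconds(5)); // simulate expensive work
    return Results.Ok(new { status = "report generated" });
})
.RequireRateLimiting("concurrent");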

Customizing the Response and Headers

When a client exceeds the configured rate limit, the middleware rejects the request; with RejectionStatusCode set as shown earlier, it returns an HTTP 429 – Too Many Requests response.
The middleware does not add rate-limit headers on its own, but you can emit standard headers that tell the client when it can retry safely.

These header names come from the IETF draft “RateLimit header fields for HTTP” (draft-ietf-httpapi-ratelimit-headers):

  • RateLimit-Limit – the maximum number of requests allowed during the current window
  • RateLimit-Remaining – how many requests are still available
  • RateLimit-Reset – the time (in seconds) until the limit resets

This makes it easier for well-behaved clients to automatically throttle their own requests.
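
Since the middleware does not emit these fields itself, a minimal sketch of adding them in OnRejected might look like this. Note the hard-coded limit of 5: the rejecting policy’s configuration is not exposed on the rejection context, so the value has to be kept in sync with your PermitLimit.

builder.Services.AddRateLimiter(options =>
{
    options.RejectionStatusCode = StatusCodes.Status429TooManyRequests;

    options.OnRejected = (context, _) =>
    {
        var headers = context.HttpContext.Response.Headers;

        headers["RateLimit-Limit"] = "5";     // keep in sync with PermitLimit
        headers["RateLimit-Remaining"] = "0"; // the request was just rejected

        // MetadataName lives in System.Threading.RateLimiting.
        if (context.Lease.TryGetMetadata(MetadataName.RetryAfter, out var retryAfter))
        {
            var seconds = ((int)retryAfter.TotalSeconds).ToString();
            headers["RateLimit-Reset"] = seconds;
            headers["Retry-After"] = seconds;
        }

        return ValueTask.CompletedTask;
    };
});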

Example of a Standard Response

HTTP/1.1 429 Too Many Requests
Content-Type: text/plain
RateLimit-Limit: 5
RateLimit-Remaining: 0
RateLimit-Reset: 8
Retry-After: 8

Too many requests. Please try again later.

Customizing the Response

You can change how the middleware responds when the rate limit is exceeded by setting the OnRejected callback inside your limiter configuration.

using System.Threading.RateLimiting; // MetadataName

builder.Services.AddRateLimiter(options =>
{
    options.AddFixedWindowLimiter("fixed", limiterOptions =>
    {
        limiterOptions.Window = TimeSpan.FromSeconds(10);
        limiterOptions.PermitLimit = 5;
        limiterOptions.QueueLimit = 0;
    });

    options.OnRejected = async (context, cancellationToken) =>
    {
        context.HttpContext.Response.StatusCode = StatusCodes.Status429TooManyRequests;
        context.HttpContext.Response.ContentType = "application/json";

        var message = new
        {
            error = "Too many requests",
            // The cast gives the conditional a single type (double?) so it
            // can be used in an anonymous-type property.
            retryAfterSeconds = context.Lease.TryGetMetadata(MetadataName.RetryAfter, out var retry)
                ? (double?)retry.TotalSeconds
                : null
        };

        await context.HttpContext.Response.WriteAsJsonAsync(message, cancellationToken);
    };
});

Explanation:

  • OnRejected is triggered whenever a request is denied due to rate limiting.
  • You can customize the status code, message format, or even return structured JSON for better API client handling.
  • The example above includes a retryAfterSeconds field, telling clients how long to wait before retrying.

This flexibility lets you integrate rate limiting gracefully into your existing API design and provide clear, developer-friendly feedback to your users.

Example: Protecting an API Endpoint

Disclaimer: This example is purely for educational purposes. There are better ways to write code and applications that can optimize this example. Use this as a starting point for learning, but always strive to follow best practices and improve your implementation.

Prerequisites

Before starting, make sure you have the following installed:

  • .NET SDK: Download and install the .NET SDK if you haven’t already.
  • Visual Studio Code (VSCode): Install Visual Studio Code for a lightweight code editor.
  • C# Extension for VSCode: Install the C# extension for VSCode to enable C# support.

Step 1 – Create a Minimal API Project

Open your terminal and create a new project using the .NET CLI:

dotnet new web -n RateLimiterDemo
cd RateLimiterDemo

This creates a simple Minimal API project using the ASP.NET Core template.

Next, open the project in VS Code:

code .

You should now see a Program.cs file with a minimal application setup.

Step 2 – Add and Configure Rate Limiting

Open Program.cs and register the Rate Limiting Middleware in the service container.

using Microsoft.AspNetCore.RateLimiting; // AddFixedWindowLimiter

var builder = WebApplication.CreateBuilder(args);

builder.Services.AddRateLimiter(options =>
{
    // Define a simple fixed window limiter
    options.AddFixedWindowLimiter("fixed", limiterOptions =>
    {
        limiterOptions.PermitLimit = 5; // allow 5 requests
        limiterOptions.Window = TimeSpan.FromSeconds(10); // per 10 seconds
        limiterOptions.QueueLimit = 0; // no queued requests
    });

    // Optional: customize rejection response
    options.OnRejected = async (context, cancellationToken) =>
    {
        context.HttpContext.Response.StatusCode = StatusCodes.Status429TooManyRequests;
        await context.HttpContext.Response.WriteAsync("Too many requests. Please try again later.");
    };
});

var app = builder.Build();

app.UseRateLimiter();

This configuration applies a “fixed” limiter that allows 5 requests every 10 seconds.
If the limit is reached, the OnRejected callback returns the 429 response.

Step 3 – Create a Protected Endpoint

Now, let’s add a simple endpoint that uses this rate limiting policy:

app.MapGet("/api/weather", () =>
{
    var data = new[]
    {
        new { City = "Munich", Temperature = "16°C", Condition = "Cloudy" },
        new { City = "Rome", Temperature = "22°C", Condition = "Sunny" },
        new { City = "New York", Temperature = "18°C", Condition = "Rainy" }
    };

    return Results.Ok(data);
})
.RequireRateLimiting("fixed");

Here’s what happens:

  • Each client can call /api/weather up to 5 times every 10 seconds.
  • When the limit is exceeded, ASP.NET Core returns HTTP 429.

The complete Program.cs file

using Microsoft.AspNetCore.RateLimiting; // AddFixedWindowLimiter

var builder = WebApplication.CreateBuilder(args);

builder.Services.AddRateLimiter(options =>
{
    // Define a simple fixed window limiter
    options.AddFixedWindowLimiter("fixed", limiterOptions =>
    {
        limiterOptions.PermitLimit = 5; // allow 5 requests
        limiterOptions.Window = TimeSpan.FromSeconds(10); // per 10 seconds
        limiterOptions.QueueLimit = 0; // no queued requests
    });

    // Optional: customize rejection response
    options.OnRejected = async (context, cancellationToken) =>
    {
        context.HttpContext.Response.StatusCode = StatusCodes.Status429TooManyRequests;
        await context.HttpContext.Response.WriteAsync("Too many requests. Please try again later.");
    };
});

var app = builder.Build();

app.UseRateLimiter();

app.MapGet("/", () => "Hello World!");

app.MapGet("/api/weather", () =>
{
    var data = new[]
    {
        new { City = "Munich", Temperature = "16°C", Condition = "Cloudy" },
        new { City = "Rome", Temperature = "22°C", Condition = "Sunny" },
        new { City = "New York", Temperature = "18°C", Condition = "Rainy" }
    };

    return Results.Ok(data);
})
.RequireRateLimiting("fixed");

app.Run();

Step 4 – Run and Test the API

Start the application from the terminal:

dotnet run

By default, it will listen on the URLs configured in Properties/launchSettings.json; check the console output for the exact addresses (in this example, http://localhost:5087).

Instead of using curl or Postman, you can test your API directly from Visual Studio Code by creating a simple .http file.

In the root of your project, create a new file named test.http and add the following content:

# Host.
@HostAddress = http://localhost:5087

### 1. Normal request
GET {{HostAddress}}/api/weather
Accept: application/json

### 2. No Rate Limit request
GET {{HostAddress}}/

The first five requests within a 10-second window will succeed with HTTP 200 OK.

The sixth request in the same window will be rejected, and the middleware will respond with:

HTTP/1.1 429 Too Many Requests
Content-Type: text/plain; charset=utf-8

Too many requests. Please try again later.

Best Practices and Common Pitfalls

Implementing rate limiting is simple, but tuning it for production requires balance. Here are a few key recommendations:

Best Practices

  • Pick the right limiter:
    Use Fixed Window for simple cases, Sliding Window for steady traffic, Token Bucket for bursts, and Concurrency for heavy operations.
  • Apply limits selectively:
    Protect only the endpoints that need it; public routes often require stricter rules than internal ones.
  • Return clear feedback:
    Always include RateLimit-* and Retry-After headers so clients know when to retry.
  • Monitor and adjust:
    Log rate-limit rejections and tune thresholds based on real usage.
  • Plan for scaling:
    In multi-instance setups, use a distributed store (like Redis) to keep limits consistent.

Common Pitfalls

  • Forgetting to call app.UseRateLimiter().
  • Applying the same limit to all users instead of partitioning by client (see the sketch below).
  • Setting limits too low and blocking valid traffic.
  • Ignoring horizontal scaling (limits are per instance).
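
For the per-user pitfall, the fix is a partitioned limiter. A minimal sketch, keyed by client IP (you could equally partition by API key or authenticated user ID):

using System.Threading.RateLimiting;

builder.Services.AddRateLimiter(options =>
{
    // Give every client IP its own fixed-window budget, so one aggressive
    // caller cannot consume the limit for everyone else.
    options.GlobalLimiter = PartitionedRateLimiter.Create<HttpContext, string>(httpContext =>
        RateLimitPartition.GetFixedWindowLimiter(
            partitionKey: httpContext.Connection.RemoteIpAddress?.ToString() ?? "unknown",
            factory: _ => new FixedWindowRateLimiterOptions
            {
                PermitLimit = 100,
                Window = TimeSpan.FromMinutes(1)
            }));
});

Note that GlobalLimiter runs for every request before any endpoint policy, and its partitions are tracked in memory per instance, which is why multi-instance deployments need a distributed store to keep limits consistent.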

Keep your limits realistic, monitor behavior, and adjust as your API grows.

Conclusion

Rate limiting is an essential tool for keeping your APIs fast, secure, and reliable.
With the built-in middleware in ASP.NET Core, you can implement effective rate control in just a few lines of code, with no external dependencies required.

Start simple: apply limits to your most exposed endpoints, observe how users interact with your API, and adjust the configuration as needed.
Over time, rate limiting becomes not just a way to prevent abuse, but a key part of building scalable and resilient applications.

For more details and advanced options, check the official documentation: Rate limiting middleware in ASP.NET Core

If you think your friends or network would find this article useful, please consider sharing it with them. Your support is greatly appreciated.

Thanks for reading!

