Golang Project Structure

Tutorials, tips and tricks for writing and structuring code in Go (with additional content for other programming languages)

How to Write a Rate Limiter in Go

Language

  • unknown

by

Web applications are used by more of us than ever before. With this growth comes a significant challenge: how can a web developer or administrator manage user traffic effectively and ensure that services remain stable, responsive and secure?

Rate limiting serves as a crucial technique in addressing this challenge, allowing developers to control how frequently users can interact with their applications and webpages.

By implementing a rate limiter, developers can mitigate the risk of abuse, prevent server overload and ensure fair resource allocation among users.

In this blog post, we will explore how to write a simple rate limiter in Go, specifically using the Token Bucket algorithm, which has been chosen because it strikes a good balance between flexibility and control.

What Is Rate Limiting?

Rate limiting is the process of restricting the number of requests that a user can make to a server within a specific time frame.

Rate limiting as a concept in computing has been around since the earliest days of networked applications, but its formalization and widespread implementation began to take shape in the late 1990s and early 2000s.

The growing complexity of web applications and the rise of distributed systems has led to the need for improved mechanisms to manage traffic and ensure a fair allocation of server resources.

Why Is Rate Limiting Needed?

Rate limiting is often used in order to prevent abuse and protect resources, ensure fair usage among users, manage bandwidth consumption and provide a better user experience by reducing the risk of system overload.

A tortoise from the Galapagos Islands in South America. It is a famously slow-moving animal.
Do you remember Aesop’s fable about the tortoise and the hare? Slowing down can sometimes help you reach your goal more quickly.

By enforcing limits, developers can safeguard their services against unexpected spikes in traffic that could lead to performance degradation or even outages.

Additionally, rate limiting helps to identify and mitigate malicious activities such as denial-of-service attacks or automated bot requests, which contributes to the overall security of our web applications and services.

Overview of Common Rate Limiting Algorithms

There are several algorithms that can be used for rate limiting. The most common ones include:

Token Bucket

The Token Bucket algorithm allows a burst of requests but maintains an average rate. It uses a bucket to hold tokens that represent the amount of requests allowed.

Tokens are replenished at a defined rate, enabling bursts of traffic up to the bucket’s capacity. This approach is particularly useful for applications that can tolerate short spikes in request volume while still enforcing an average rate limit.

Leaky Bucket

Similar to the Token Bucket algorithm, the Leaky Bucket algorithm uses a bucket to hold requests; however, this bucket has a leak. Requests are processed at a fixed rate, and if they arrive too quickly, they overflow and are discarded.

This algorithm is effective in smoothing out bursts of requests, ensuring that traffic is processed consistently over time. It is ideal for scenarios where maintaining a relatively steady request rate is critical.

Fixed Window Counter

The Fixed Window Counter algorithm operates by counting the number of requests made within a fixed time window, such as one minute. Once the limit is reached, any additional requests within that window are denied until the window resets.

This algorithm is very straightforward to implement and can be useful for scenarios where precise tracking of request counts is necessary. However, it can lead to uneven usage patterns at the boundaries of the time windows — this is known as the “burstiness” problem.

Sliding Log

The Sliding Log approach maintains a record of the timestamps for each request made by users. When a new request comes in, the algorithm checks the log to determine if the request can be allowed based on the defined limit and the timestamps of previous requests.

This method provides a high degree of granularity and flexibility, as it can adapt to varying request patterns over time. However, it can consume more memory, especially in high-traffic scenarios, as the log must store each request’s timestamp.

Our Chosen Algorithm

In the example code that we’re going to write below, we’ll implement the Token Bucket algorithm, which strikes a balance between allowing bursts of traffic and controlling the average request rate.

By choosing to use this algorithm, we can efficiently manage the rate at which requests are handled while still providing a responsive experience for users.

Beginning to Implement the Token Bucket Rate Limiter

Now that we understand the basic principles behind the concept of rate limiting, let’s start work on some code in Go.

Creating a Token Bucket Struct

Before we begin writing any functions, we need to define a TokenBucket struct that will hold the configuration data for our rate limiter.

package main

import (
    "sync"
    "time"
)

type TokenBucket struct {
    capacity   int
    tokens     int
    refillRate time.Duration
    lastRefill time.Time
    mutex      sync.Mutex
}

The capacity field specifies the maximum number of tokens that the bucket can ever contain, whilst the tokens field specifies the current count of tokens.

The refillRate field is used to determine the the length of time that it takes to refill the bucket with a token, and the lastRefill field simply stores the last time that the bucket was refilled.

Finally, the mutex is used to ensure that our token-bucket can operate safely across concurrent goroutines.

Initializing the Token Bucket

Now let’s create a constructor function to initialize our TokenBucket:

func NewTokenBucket(capacity int, refillRate time.Duration) *TokenBucket {
    return &TokenBucket{
        capacity:   capacity,
        tokens:     capacity,
        refillRate: refillRate,
        lastRefill: time.Now(),
    }
}

This function returns a pointer to a new TokenBucket struct, which is initialized with the given capacity and refill rate.

The bucket always starts full with a maximum amount of tokens.

Refilling Tokens

Now we need to work on a method that can refill the tokens in the bucket.

This should be called before processing any requests to ensure that we have the right number of tokens available.

Here’s the code:

func (tb *TokenBucket) refill() {
    tb.mu.Lock()
    defer tb.mu.Unlock()

    now := time.Now()
    elapsed := now.Sub(tb.lastRefill)

    // calculate the number of tokens to add
    newTokens := int(elapsed / tb.refillRate)

    // update the number of tokens
    if newTokens > 0 {
        tb.tokens += newTokens
        if tb.tokens > tb.capacity {
            tb.tokens = tb.capacity // ensure we don't exceed capacity
        }
        tb.lastRefill = now
    }
}

The refill method starts by locking the bucket’s mutex in order to prevent concurrent access.

Then it calculates the elapsed time since the last refill and determines how many tokens can be added.

Then it updates the token count and ensures that it does not exceed the bucket’s capacity — if it does, only the maximum amount of tokens that the bucket can actually hold are added.

Handling Requests

Now let’s create a method to handle incoming requests. This method will check if there are any tokens available and process the request accordingly:

func (tb *TokenBucket) Allow() bool {
    tb.refill()

    tb.mu.Lock()
    defer tb.mu.Unlock()

    if tb.tokens > 0 {
        tb.tokens-- // consume a token
        return true // request allowed
    }

    return false // request denied
}

The Allow method first calls the refill method that we’ve just written, in order to update the token count.

It then checks if there are tokens available in the bucket. If so, it consumes one token and allows the request.

On the other hand, if no tokens are available, then the request is denied.

Of course, this is the function that actually allows us to rate-limit requests to our server.

Using the Rate Limiter

Now that we have our TokenBucket implementation written, we can use it to limit requests.

Let’s create a simple HTTP server to demonstrate how this works:

package main

import (
    "fmt"
    "net/http"
)

func main() {
    tb := NewTokenBucket(5, time.Second)

    http.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
        if tb.Allow() {
            fmt.Fprintf(w, "Request allowed")
        } else {
            http.Error(w, "Rate limit exceeded", http.StatusTooManyRequests)
        }
    })

    fmt.Println("Starting server on :8080")
    http.ListenAndServe(":8080", nil)
}

We use the constructor function to create a new TokenBucket that is set up to allow five requests per second.

The HTTP handler checks if a request is allowed using the Allow method that we wrote.

If the request is allowed, the server responds with a success message. Otherwise, it returns a "Rate limit exceeded" error.

Testing the Rate Limiter

To test our rate limiter in action with the simple server that we’ve just written, first get the application up and running:

go run main.go

You can use a tool like curl to send requests to the server:

for i in {1..10}; do curl -s http://localhost:8080; done

Note how we make use of a simple loop in the Bash shell in order to send an HTTP GET request to http://localhost:8080 10 times in a row.

We are also using curl in silent mode, so that all progress and error messages are suppressed.

Each iteration retrieves the response from our local web server that should now be running on port 8080.

Checking That the Output Is As Expected

If everything is working properly, you should see “Request allowed” for the first five requests, and “Rate limit exceeded” for the five subsequent requests.

Implementing the Rate Limiter as Middleware

Here’s a simple example of how you might implement the rate limiter as middleware in a Gin application:

package main

import (
    "github.com/gin-gonic/gin"
)

func RateLimiter(tb *TokenBucket) gin.HandlerFunc {
    return func(c *gin.Context) {
        if tb.Allow() {
            // proceed to the next handler
            c.Next() 
        } else {
            // stop because the rate limit has been exceeded
            c.AbortWithStatus(http.StatusTooManyRequests)
        }
    }
}

func main() {
    r := gin.Default()
    tb := NewTokenBucket(5, 1*time.Second)

    r.Use(RateLimiter(tb))

    r.GET("/", func(c *gin.Context) {
        c.String(http.StatusOK, "Request allowed")
    })

    r.Run(":8080")
}

Now the RateLimiter function returns a Gin middleware handler function that checks the rate limit before continuing to process requests.

If the request is allowed, it calls the Next method on the gin.Context struct in order to proceed to the next Gin middleware handler.

However, if the rate that we set has been exceeded, then the server will send a 429 HTTP status code, signifying that it is currently unable to handle any more requests.

Summing up Our Work on the Rate Limiter

In this blog post, we’ve walked through the process of implementing a simple rate limiter in Go using the Token Bucket algorithm.

We have covered how to create a simple bucket and refill it with tokens, as well as how to allow or deny requests based on the current count of tokens.

As we discussed in an earlier section, rate limiting is essential for maintaining fair and stable access to web services, especially when dealing with potentially abusive traffic.

With this simple implementation, you should now be able to enhance your Go web applications to manage traffic more effectively.

Feel free to experiment with the code, customize the rate-limiting strategy and integrate it into your own projects.

Please let me know how you use and improve it.

Leave a Reply

Your email address will not be published. Required fields are marked *