An Inexact Introduction to Envoy

Apr 3, 2019

A quick primer before you jump in to read the Envoy documentation and tutorials.

Backstory

Of late I have spent a bit of time on Envoy. They say it’s the next big thing for cloud services. It changes how microservices are written and deployed. With that kind of interest and developer traction, you’d imagine that they’d have a fantastic set of tutorials in their docs to get any interested engineer started. Well, they do. But like everything else in this day and age, to read those fantastic docs you already have to know a fair bit about proxies, API gateways, rate limiting, and stuff that I feel not every Envoy newbie need know. What is a dummy like me to do? For one, persevere, and for two, demystify it for the ones who follow (aka help separate the wheat from the chaff). So this one is about essentially what I managed to learn about Envoy so far. Precious little, but I’ll try summarizing nonetheless.

Why Envoy?

Building a web app requires lots of moving parts: so-called microservices that communicate with each other and handle different aspects of a request. The functionality must be exposed cohesively through a common application that aggregates all the APIs offered by the system. As the customer base grows, more users engage with the system and generate a higher volume of traffic, and perhaps greater concurrency. All of it needs to be secured and authenticated. Microservices confer a great deal of flexibility in how we address availability, responsiveness, and scalability of our cloud services. But leveraging all of it isn’t always easy if you’re a microservice author. Simply put, there are a lot of cross-cutting concerns that every microservice author has to think about, from transport layer security and authentication, to discovering peer services, to load balancing and rate limiting. These concerns are not central to the business logic of the microservice. Addressing them is hard enough. Once you consider that different microservices in the same cloud app could be written in different languages or frameworks, the problem becomes harder still. That’s where Envoy comes in.

First look at Envoy

In a nutshell, Envoy, developed at Lyft and written in modern C++, allows you to build microservices without bothering too much about how to discover other microservices and route requests to them, manage secure service-to-service connections, authenticate users, or do load balancing, rate limiting, circuit breaking, and lots more (patience, I’ll explain all the terms). In other words, it helps address all of the cross-cutting concerns mentioned earlier in a polyglot microservices environment. And it does so in an extensible way, allowing anything from hard-coded static configuration to completely dynamic configuration for everything from the endpoints used to serve requests, to the clusters serving them, to the load-balancing policies.

Now true to the promise above, I owe a short explanation of some terms I casually threw at you.

Load balancing: You have lots of requests coming in and you want to serve them all responsively and reliably. What you do is create multiple replicas of your service and then route requests to them spreading the load across the replicas. The exact strategy can vary, and usually depends on whether your services are stateful or stateless.
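To make that concrete, here is a minimal sketch of two strategies in Python. It is a toy, not how Envoy implements load balancing: the replica addresses and the names RoundRobinBalancer and sticky_pick are made up for illustration. Round robin suits stateless replicas, while hashing on a session id is one simple way to keep a stateful client pinned to the same replica.

```python
import itertools
import zlib


class RoundRobinBalancer:
    """Toy round-robin balancer: hands out replicas in a fixed rotation."""

    def __init__(self, replicas):
        if not replicas:
            raise ValueError("need at least one replica")
        self._cycle = itertools.cycle(list(replicas))

    def pick(self):
        # Each call returns the next replica, wrapping around at the end.
        return next(self._cycle)


def sticky_pick(replicas, session_id):
    """Toy hash-based choice: the same session id always maps to the same replica."""
    return replicas[zlib.crc32(session_id.encode("utf-8")) % len(replicas)]


# Hypothetical replica addresses, for illustration only.
REPLICAS = ["10.0.0.1:8080", "10.0.0.2:8080", "10.0.0.3:8080"]

balancer = RoundRobinBalancer(REPLICAS)
picks = [balancer.pick() for _ in range(6)]  # cycles through the replicas twice
```

Note that the hash-based version reshuffles most sessions whenever the replica list changes, which is why real proxies tend to use more careful schemes for sticky routing.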

Circuit Breaking: Service A talks to service B in order to serve requests. If B is heavily loaded and unresponsive, or simply unavailable, A owes it to the user to degrade gracefully instead of hanging. A also owes it to a perhaps already-loaded B to not bombard it with even more requests in such a situation. Detecting such a situation, and preventing requests from A to B for a short time before once again resuming them, is what circuit breaking is about.
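Here is a rough sketch of that idea in Python, as A might wrap its calls to B. It illustrates the general pattern only and says nothing about how Envoy does it internally. The class name, the thresholds, and the injectable clock are all assumptions made for the example.

```python
import time


class CircuitOpenError(Exception):
    """Raised when the breaker refuses to call the backend at all."""


class CircuitBreaker:
    def __init__(self, max_failures=3, reset_after=5.0, clock=time.monotonic):
        self.max_failures = max_failures  # consecutive failures before opening
        self.reset_after = reset_after    # seconds to wait before trying again
        self._clock = clock
        self._failures = 0
        self._opened_at = None            # None means the circuit is closed

    def call(self, func, *args, **kwargs):
        if self._opened_at is not None:
            if self._clock() - self._opened_at < self.reset_after:
                # Fail fast: spare the caller a hang and spare B the extra load.
                raise CircuitOpenError("backend looks unhealthy; failing fast")
            # Cool-off elapsed: allow a trial call; a single failure reopens.
            self._opened_at = None
            self._failures = self.max_failures - 1
        try:
            result = func(*args, **kwargs)
        except Exception:
            self._failures += 1
            if self._failures >= self.max_failures:
                self._opened_at = self._clock()
            raise
        self._failures = 0  # any success resets the failure count
        return result
```

A would then call something like breaker.call(fetch_from_b, request) and treat CircuitOpenError as the cue to serve a fallback response.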

Rate limiting: You don’t want overeager clients to swamp your service with more requests than you can reliably handle. Rate limiting prevents this using various strategies and algorithms. You reject requests beyond a threshold number per second, or throttle requests by introducing small random delays while routing them. You can put such checks at the client, at the server, or both.
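One common algorithm for the "reject beyond a threshold" flavor is the token bucket. The sketch below is a toy version in Python, with a made-up TokenBucket class and an injectable clock so it can be exercised without sleeping. It is not a description of Envoy's own rate limit filters.

```python
import time


class TokenBucket:
    """Toy token bucket: allows short bursts, then a steady rate."""

    def __init__(self, rate_per_sec, burst, clock=time.monotonic):
        self.rate = float(rate_per_sec)  # tokens added per second
        self.capacity = float(burst)     # maximum tokens the bucket can hold
        self._tokens = float(burst)      # start full so an initial burst passes
        self._clock = clock
        self._last = clock()

    def allow(self):
        now = self._clock()
        # Refill in proportion to elapsed time, never beyond capacity.
        self._tokens = min(self.capacity,
                           self._tokens + (now - self._last) * self.rate)
        self._last = now
        if self._tokens >= 1.0:
            self._tokens -= 1.0  # spend one token on this request
            return True
        return False             # over the limit: the caller should reject
```

With rate_per_sec=2 and burst=2, two back-to-back requests pass, a third is rejected, and another is admitted once half a second has gone by.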

Envoy in a wee bit more detail

Ok, back to Envoy. So what does Envoy deal in? As in, what are the abstractions or domain objects in terms of which Envoy operates? I ask this because, without being able to satisfactorily answer this question about a given software system, I have noticed I never do a good job of making sense of the system itself. So here are the key abstractions.

Some notion of a gateway or an entry point (IP address + port + protocol) that downstream clients connect to and send requests to, in order to reach multiple services. Envoy calls them listeners.
Some notion of an endpoint: an IP + port pair that serves a specific service. Envoy calls them, surprisingly, endpoints! The endpoints could use raw IP addresses or FQDNs that are resolved via DNS.
Some concept of a logical grouping of endpoints running a specific service. Envoy calls these clusters.
A notion of routing requests from the gateway to the clusters. Envoy calls these routes and route_configs.
Some concept of a request URL that a client hits, consisting of a virtual host or domain name, an API path prefix, etc. This is used to determine which routing rules are invoked.
Some concept of pluggable middleware for intercepting requests and processing them. Envoy calls them filter_chains and they consist of one or more filters through which each request passes. You can do all sorts of things in these filters, such as handling specific network protocols, authentication, rate limiting, etc.
Policies around load balancing, rate limiting, circuit breaking, etc.
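To show how the request URL ties into routing, here is a toy matcher in Python. The dictionary layout loosely mirrors the vocabulary above (domains, path prefixes, clusters), but the function pick_cluster, the host names, and the cluster names are hypothetical. Real Envoy matching supports far more (wildcard domains, headers, regexes) than this sketch.

```python
def pick_cluster(virtual_hosts, host, path):
    """Return the cluster name for a request, or None if nothing matches."""
    # Prefer a virtual host that names this domain exactly, else a catch-all.
    exact = [vh for vh in virtual_hosts if host in vh["domains"]]
    catch_all = [vh for vh in virtual_hosts if "*" in vh["domains"]]
    candidates = exact + catch_all
    if not candidates:
        return None
    # Within the chosen virtual host, the first route whose prefix matches wins.
    for route in candidates[0]["routes"]:
        if path.startswith(route["prefix"]):
            return route["cluster"]
    return None


# Hypothetical routing table, for illustration only.
VIRTUAL_HOSTS = [
    {
        "domains": ["shop.example.com"],
        "routes": [
            {"prefix": "/cart", "cluster": "cart_service"},
            {"prefix": "/", "cluster": "storefront"},
        ],
    },
    {
        "domains": ["*"],
        "routes": [{"prefix": "/", "cluster": "default_backend"}],
    },
]
```

Order matters inside a virtual host: the "/" route sits last because it matches every path.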

An Envoy configuration typically consists of one or more listeners, each of which defines one or more filter chains. An incoming request is accepted by a listener and, based on the attributes of the request, one of the filter chains is used to process it. Each filter in the matching filter chain processes the request in order. For HTTP requests, the http_connection_manager (hcm) filter is used, while for plain TCP connections, the tcp_proxy filter is used. Routes and route_configs are defined within the http_connection_manager configuration and determine how matching requests are forwarded to specific backend services; tcp_proxy, which has no URLs to match on, typically just names the cluster to forward to. The backend services are modeled as clusters, each consisting of one or more endpoint addresses of the backend services.
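Here is a hedged sketch of how those pieces nest, written as a Python script that builds a static configuration and serializes it to JSON. The field names follow the Envoy v3 API as best I understand it, so check them against the docs for your Envoy version. The listener name, port numbers, and the service_a cluster are invented for the example.

```python
import json

HCM_TYPE = ("type.googleapis.com/envoy.extensions.filters.network."
            "http_connection_manager.v3.HttpConnectionManager")
ROUTER_TYPE = "type.googleapis.com/envoy.extensions.filters.http.router.v3.Router"


def make_cluster(name, host, port):
    """A cluster with a single endpoint, resolved via DNS."""
    return {
        "name": name,
        "connect_timeout": "0.25s",
        "type": "STRICT_DNS",
        "lb_policy": "ROUND_ROBIN",
        "load_assignment": {
            "cluster_name": name,
            "endpoints": [{"lb_endpoints": [{"endpoint": {"address": {
                "socket_address": {"address": host, "port_value": port},
            }}}]}],
        },
    }


def make_http_listener(name, port, cluster_name):
    """A listener with one filter chain holding one hcm filter."""
    hcm = {
        "@type": HCM_TYPE,
        "stat_prefix": "ingress_http",
        "route_config": {
            "name": "local_route",
            "virtual_hosts": [{
                "name": "backend",
                "domains": ["*"],
                # Every path goes to the one cluster in this sketch.
                "routes": [{"match": {"prefix": "/"},
                            "route": {"cluster": cluster_name}}],
            }],
        },
        "http_filters": [{"name": "envoy.filters.http.router",
                          "typed_config": {"@type": ROUTER_TYPE}}],
    }
    return {
        "name": name,
        "address": {"socket_address": {"address": "0.0.0.0", "port_value": port}},
        "filter_chains": [{"filters": [{
            "name": "envoy.filters.network.http_connection_manager",
            "typed_config": hcm,
        }]}],
    }


config = {"static_resources": {
    "listeners": [make_http_listener("listener_0", 10000, "service_a")],
    "clusters": [make_cluster("service_a", "service_a.internal", 8080)],
}}
config_json = json.dumps(config, indent=2)
```

Reading it from the outside in gives you the same order as the paragraph above: listener, filter chain, hcm filter, route_config, and finally the cluster the route points at. If you write config_json to a file, you should be able to point Envoy at it with the -c flag, though I would validate it against your Envoy version first.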

Envoy control planes

Finally, you would soon start hearing about xDS and Envoy control planes. It’s good to understand what these are. What if you had a big Envoy deployment with several tens of services, listeners, clusters, and all the filters they entail? Maintaining all of that in a single static configuration is quite hard. Instead, Envoy can call APIs, backed by other infrastructure data sources, that let it dynamically discover the available services, their addresses, listeners, routes, etc. The APIs go by names such as Cluster Discovery Service (CDS), Route Discovery Service (RDS), Listener Discovery Service (LDS), and Secret Discovery Service (SDS), collectively known as xDS, and the servers that implement them are called control planes. This information can be as dynamic as you want, taking into account the real-time availability of these entities. Envoy defines the specifications for these APIs, and there are free (github.com/envoyproxy/go-control-plane) as well as commercial control plane implementations available. Note that a control plane can be implemented in a language of your choice; it need not be C++, and quite often it is Go.
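The core idea, a proxy asking a control plane "has anything changed since version N?", can be sketched in a few lines of Python. This is a deliberately simplified toy and not the xDS protocol itself, which is defined by Envoy's API specifications. ToyControlPlane, ToyProxy, and their methods are names I made up.

```python
class ToyControlPlane:
    """Holds the current set of clusters and a version that bumps on change."""

    def __init__(self):
        self._version = 0
        self._clusters = {}

    def set_cluster(self, name, endpoints):
        # In real life this would be driven by a service registry or orchestrator.
        self._clusters[name] = list(endpoints)
        self._version += 1

    def discover(self, known_version):
        """Return (version, clusters) if the caller is out of date, else None."""
        if known_version == self._version:
            return None
        return self._version, {n: list(e) for n, e in self._clusters.items()}


class ToyProxy:
    """Keeps a local copy of the clusters and refreshes it on demand."""

    def __init__(self, control_plane):
        self._cp = control_plane
        self.version = 0
        self.clusters = {}

    def sync(self):
        update = self._cp.discover(self.version)
        if update is None:
            return False  # already up to date
        self.version, self.clusters = update
        return True
```

The proxy never needs a restart or a new config file: when an endpoint is added to a cluster, the next sync picks it up.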

Envoy deployment

So how does Envoy run alongside your own services? Several ways are possible. Most commonly it is deployed to run in both of the following roles:

As an API gateway that handles and routes all incoming requests to different microservices. This is called the edge proxy because it sits on the edge of your app boundary. It’s a gateway into your app, so to speak.
As a peer process of each service, intercepting, qualifying, checking, and routing all its incoming and outgoing data. This is called a service proxy. Imagine that you, the microservice writer, don’t need to bother about TLS-secured connections, authentication, discovering service endpoints to talk to, etc. You just identify which other services to talk to and exchange requests and responses with your peer Envoy (usually called a sidecar) on some port on the local machine (technically, in the same network namespace). It takes care of routing those requests.
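From the service author's side, the sidecar arrangement can look as simple as the Python sketch below. It assumes a sidecar listening on local port 9001 and routing on the Host header. Both are assumptions for illustration, as are the helper name and the "orders" service.

```python
import urllib.request

# Hypothetical address of the local Envoy sidecar's outbound listener.
SIDECAR_URL = "http://127.0.0.1:9001"


def build_sidecar_request(service, path):
    """Build a plain HTTP request addressed to the sidecar, not to the peer."""
    # The target service's name travels in the Host header; the sidecar is
    # assumed to use it to pick the right cluster, apply TLS, retries, etc.
    return urllib.request.Request(SIDECAR_URL + path, headers={"Host": service})


request = build_sidecar_request("orders", "/orders/42")
# Sending it would be: urllib.request.urlopen(request, timeout=2)
```

Notice what is missing: no certificates, no service discovery client, no retry logic. The service only knows a name and a local port.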

Now in the overwhelming majority of cases, Envoy would run as a Docker container. As a service proxy, it would likely run as a sidecar container alongside the service container. But it can also run as a standalone binary.

Summary

Thus, Envoy serves as an edge and service proxy. It handles routing of incoming requests and service-to-service requests, and takes care of lots of common concerns. It allows you to write really simple microservices which need to do practically nothing more than get their own business logic right. Now the above is a deliberately dumbed-down version of the truth, because Envoy does a lot more. It can work both at the TCP/UDP and TLS level (as an L3/L4 proxy) and at the HTTP level (L7). It can handle gRPC and HTTP/2. And there is much more to it. But at its core, it is a proxy for microservice-based apps that makes routing between services declarative and easy, and adds a whole host of useful services.