
4 Ways to Reverse Proxy with Nginx

Reverse proxying web applications and services improves system security, performance and operations. This article reviews four ways to route requests from a proxy web server to an origin web server.

Motivation

Nginx is in the top three web servers, along with Apache and IIS. I like it because of its performance and the clean configuration approach. It was designed from the ground up as a non-blocking reverse proxy.

[Figure: reverse proxy overview]

I find nginx a perfect fit in these six cases:

  1. SSL/TLS offloading
    Aggregate system performance can increase when a proxy handles Transport Layer Security (TLS) for applications. (TLS is the successor to Secure Sockets Layer, SSL.) This C-based web server will do TLS more efficiently than Java, Node (JavaScript), Ruby, etc. Client performance can also be boosted when there are multiple endpoints, by aggregating the multiple app interactions into a single TLS session with the proxy. Management of certificates, protocol and cipher whitelists, and library patching is also simplified.
  2. Network segmentation
    The proxy can be put on a public subnet (DMZ) while applications can be on private subnets and only accepting traffic from the proxy.
  3. Access control
    Apply authentication and authorization to unprotected applications, or apply extra security layers on a protected app, such as source IP whitelists.
  4. Subdomain aggregation
    An org could create one subdomain per app, but the SSL certificate costs could be prohibitive. A proxy funnels requests through one or a few subdomains, removing any cap on the number of applications (think microservices).
  5. Request/response manipulation
    You have unlimited power to 'rewrite' (transform) request URIs, headers and bodies, and response headers and bodies. There are multiple directives to search and replace content, usually with the power of regex, specifically Perl Compatible Regular Expressions (PCRE). You could obfuscate use of a third-party application, remove sensitive content, fix a bug, inject CORS headers, automate login in an SSO flow, and so on.
  6. Static content
    Yup, it's a plain old web server too. Improve performance of your Tomcat, Ruby on Rails, Django, etc. apps by serving the HTML, JavaScript, images and other fixed content from nginx.
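
As a rough illustration of cases 3 and 6, the sketch below restricts one location to a single source network and serves fixed files straight from disk. The /admin/ and /static/ paths, the 203.0.113.0/24 range, the /var/www/app directory and the 10.0.4.20 origin are placeholders made up for this example, not values from a real deployment.

```nginx
# Fragment for the http block; all paths and addresses are hypothetical.
server {
    listen 443 ssl;
    ssl_certificate /etc/nginx/cert/certificate.pem;
    ssl_certificate_key /etc/nginx/cert/key.pem;

    # Case 3: only one source network may reach this app.
    location /admin/ {
        allow 203.0.113.0/24;
        deny all;
        proxy_pass http://10.0.4.20:8080;
    }

    # Case 6: serve fixed content from disk instead of from the app.
    # A request for /static/logo.png maps to /var/www/app/static/logo.png.
    location /static/ {
        root /var/www/app;
        expires 1h;
    }
}
```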

Below we see it used to offload TLS and segment public from private subnets:

[Figure: nginx offloading TLS and separating public from private subnets]

There are other use cases too, particularly with the commercial product, NGINX Plus, like load balancing, high availability (HA) and content caching. But I use the vanilla FOSS (Free and Open Source Software) community version for the aforementioned use cases.

If you'd like a quick recipe to get nginx running, see my Nginx Cookbook.

Reverse Proxy Design Options

For a primer on reverse proxy configuration, start with this intro on reverse proxies and then move on to the proxy module reference.

I use four request routing designs: subdomain, port, symmetric path and asymmetric path.

Option 1: Subdomain

Context: You own a subdomain and can map it one-to-one to an application.

[Figure: subdomain routing]

Add another server block to your nginx conf file with the directive server_name, as in:

events {
    worker_connections 1024;
}

http {
    ...

    server {
        server_name app-a.example.com;
      
        listen 443 ssl; 
        ssl_certificate /etc/nginx/cert/certificate.pem;
        ssl_certificate_key /etc/nginx/cert/key.pem;
    
        # proxy_pass is only valid inside a location block
        location / {
            proxy_pass http://10.0.4.14:8080;
        }
   }
}

The proxy_pass directive is the heart of the reverse proxy, establishing the proxied host, port and path.

Here app-a is hosted on 10.0.4.14 and serves on port 8080. In this example, we use a static IP, but you can use a subdomain, e.g. app-a.private.example.com, or an internal (VPC-facing) load balancer.
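
A minimal sketch of the hostname variant, assuming the private name app-a.private.example.com resolves from the nginx host. The upstream name app_a is hypothetical. Keep in mind that nginx resolves a fixed hostname like this when it loads the configuration, so a DNS change needs a reload to take effect.

```nginx
# Fragment for the http block; the upstream name app_a is hypothetical.
upstream app_a {
    server app-a.private.example.com:8080;
}

server {
    server_name app-a.example.com;

    listen 443 ssl;
    ssl_certificate /etc/nginx/cert/certificate.pem;
    ssl_certificate_key /etc/nginx/cert/key.pem;

    location / {
        # Refer to the upstream group by name instead of a static IP.
        proxy_pass http://app_a;
    }
}
```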

The path is not set and is implicitly passed along without transformation. So for example a request to https://app-a.example.com/report/cust/67?mode=monthly will be passed as http://10.0.4.14:8080/report/cust/67?mode=monthly.
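
To make the pass-through behaviour concrete, here is a small sketch contrasting proxy_pass without a URI part and with one. The /plain/ and /mapped/ locations and the /base/ path are invented for illustration only.

```nginx
# Fragment for a server block; location names and /base/ are hypothetical.

# No URI after host:port, so the request URI is passed unchanged.
# /plain/report?mode=monthly -> http://10.0.4.14:8080/plain/report?mode=monthly
location /plain/ {
    proxy_pass http://10.0.4.14:8080;
}

# A URI is given (/base/), so the part of the request URI that matched
# the location prefix is replaced by it.
# /mapped/report?mode=monthly -> http://10.0.4.14:8080/base/report?mode=monthly
location /mapped/ {
    proxy_pass http://10.0.4.14:8080/base/;
}
```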

There are two legs in this request, and the request into app-a will look like it was initiated by the proxy. So it's better to add some headers to indicate it's proxied:

    server {
        ...
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-Host $host:$server_port;
        proxy_set_header X-Forwarded-Proto $scheme;
     }

If the remote client is at, say, 68.73.123.56, the second leg of the request will carry the header:

GET /report/cust/67?mode=monthly HTTP/1.1
...
X-Forwarded-For: 68.73.123.56
...

App-a can now take action on this if necessary. At a minimum it should use it for access logging. (For example, use the RemoteIpValve for a Tomcat app.)

For brevity's sake, I'll leave these out of the conf in the other three options.

Option 2: Port

Context: The application serves content on the root path or its content is difficult to transform, but you can open up a dedicated port.

[Figure: port routing]

Add another server block on a new port.

...
http {
    ...

    server {
        listen 8443 ssl; 
        ssl_certificate /etc/nginx/cert/certificate.pem;
        ssl_certificate_key /etc/nginx/cert/key.pem;
    
        # proxy_pass is only valid inside a location block
        location / {
            proxy_pass http://10.0.4.15:8080;
        }
   }
}

Here app-b is hosted on 10.0.4.15 and serves on port 8080.

You'll have to open up 8443 for inbound on the nginx host(s) firewall (and/or security groups, network ACLs, etc).

Option 3: Symmetric Path

Context: The application serves content on a unique path.

[Figure: symmetric path routing]

Hopefully most of your apps fall into this category.

Add a location block to the main server and pass it on:

...
http {
    ...

    server {
        listen 443 ssl; 
        ...
        
        location /app-c {
            proxy_pass http://10.0.4.16:8080/app-c;
        }
   }
}

Here app-c is hosted on 10.0.4.16 and serves on port 8080.

The location expression is a prefix match without any modifiers. You can get fancy with exact and regex matches. (See here for a great intro).
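
For a taste of the other match types, here is a hedged sketch. The /app-c/health and /app-c/assets/ paths and the file extension list are assumptions for the example, not part of app-c as described here.

```nginx
# Fragment for a server block; the paths are hypothetical.

# Exact match: only /app-c/health, nothing below it.
location = /app-c/health {
    return 200 "ok\n";
}

# Prefix match that also skips the regex checks when it is the longest
# matching prefix.
location ^~ /app-c/assets/ {
    proxy_pass http://10.0.4.16:8080;
}

# Case-insensitive regex match on the file extension.
# proxy_pass in a regex location must not carry a URI part.
location ~* \.(png|jpg|css|js)$ {
    expires 1h;
    proxy_pass http://10.0.4.16:8080;
}
```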

If the app sends redirects to its host/port, you can fix that with:

        location /app-c {
            ...
            proxy_redirect http://$proxy_host/ $scheme://$host:$server_port/;
        }

This takes the Location header in the response and replaces the string http://10.0.4.16:8080 with nginx's scheme, host and port. Say app-c redirects requests for /app-c to /app-c/landing.html. The 301 redirect points to http://10.0.4.16:8080/app-c/landing.html, and the above directive transforms it to https://www.example.com:443/app-c/landing.html. The browser then makes a call back to that URL.
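
If clients always arrive on the default HTTPS port, a variant that leaves the port out of the rewritten URL may read more cleanly. This is a sketch under that assumption. With it, the redirect in the example should come back as https://www.example.com/app-c/landing.html.

```nginx
# Fragment for a server block listening on 443.
location /app-c {
    proxy_pass http://10.0.4.16:8080/app-c;

    # Same Location header rewrite, but without the explicit :443.
    # $host holds the host name only, with no port.
    proxy_redirect http://$proxy_host/ $scheme://$host/;
}
```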

Take note, the terminology and variable use is odd. The $proxy_ variables refer to the origin (target) server, not to the nginx proxy. The $proxy_host variable includes both the host and the port. A $proxy_port variable also exists and contains only the port. The $host variable contains only the nginx host, and $server_port is separate.
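
One way to see these variables side by side is to write them to an access log. This is a sketch, not a recommended production format. The format name proxy_debug and the log file path are hypothetical. For a request to app-c I would expect the front= field to show the nginx side and the origin= fields to show 10.0.4.16:8080 and 8080.

```nginx
# Fragment for the http block; proxy_debug and the log path are hypothetical.
log_format proxy_debug '$remote_addr "$request" '
                       'front=$scheme://$host:$server_port '
                       'origin=$proxy_host origin_port=$proxy_port';

server {
    listen 443 ssl;
    ssl_certificate /etc/nginx/cert/certificate.pem;
    ssl_certificate_key /etc/nginx/cert/key.pem;

    # Log every request with both the nginx-side and origin-side values.
    access_log /var/log/nginx/proxy_debug.log proxy_debug;

    location /app-c {
        proxy_pass http://10.0.4.16:8080/app-c;
    }
}
```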

Option 4: Asymmetric Path

Context: The system meets none of the criteria above. The publicly exposed port and path will be different from the application port and path.

[Figure: asymmetric path routing]

This is the most difficult case.

You could get lucky with an app that uses 100% relative URLs, so that no content transformation is necessary. But most likely not. Applications often reference their own content with absolute URIs and/or absolute URLs (scheme://host:port/path), and use a variety of mechanisms, for example, anchor and link href, image and script src, CSS @import and JavaScript-built dynamic paths. You'll need one transformation directive for each usage pattern in the target application.

It could involve days of trial and error. You'll definitely need your browser's developer tools view. You'll become intimately familiar with the internals of the application, navigating through the entire application, inspecting how it loads external files — HTML, JavaScript, CSS and media. Single page AJAX/Comet and modern frameworks — Angular, React and Vue — may be especially challenging as URLs may be programmatically built.

This is also the most brittle approach. After all that initial effort to proxy, the developer could change a content access approach in some future upgrade and you'll need to fix your reverse proxy rules.

The two primary directives I use in this option are:

  1. rewrite for request URI changes
  2. sub_filter for response content changes

sub_filter uses fixed string substitutions and covers all the cases I have encountered. (I haven't used it but you can rebuild with a non-standard substitutions module to get regex capability.)

Let's say the target app serves on http://10.0.4.17:8080/origin-d and we publish on https://www.example.com/app-d. Add a location block as follows:

...
http {
    ...

    server {
        listen 443 ssl; 
        ...
        
        location /app-d {
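            # rewrite takes the regex first, then the replacement. The break flag
            # stops further rewrite processing so the changed URI is proxied as-is.
            rewrite ^/app-d(.*)$ /origin-d$1 break;
            # With rewrite ... break, the full rewritten URI is passed to the origin.
            proxy_pass http://10.0.4.17:8080;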
            proxy_redirect http://$proxy_host/origin-d/ $scheme://$host:$server_port/app-d/;
            sub_filter 'href="/origin-d' 'href="/app-d';
        }
   }
}

In rewrite, proxy_redirect and sub_filter, the first argument is the pattern to match (a regular expression in the case of rewrite), and the second is the replacement.

  • The rewrite changes a URL like /app-d/product/list?category=blue to /origin-d/product/list?category=blue.
  • The proxy_redirect changes any Location headers in the response so that they work with the published path.
  • Finally the sub_filter makes hyperlinks in the response body work with the published path.

You'll need to add other sub_filter statements for the other patterns used in the app.
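
As a sketch of what those extra rules might look like, the block below adds patterns for src attributes, CSS url() references and absolute URLs. Which patterns you need depends entirely on the app, so treat these as assumptions. Multiple sub_filter directives in one location need nginx 1.9.4 or later. The Accept-Encoding line is there because sub_filter cannot rewrite a compressed response body.

```nginx
# Fragment for a server block; the extra patterns are illustrative guesses.
location /app-d {
    rewrite ^/app-d(.*)$ /origin-d$1 break;
    proxy_pass http://10.0.4.17:8080;

    # Ask the origin for uncompressed responses so sub_filter can see the text.
    # Note: a proxy_set_header here stops this location from inheriting
    # proxy_set_header directives set at the server level, so repeat any
    # X-Forwarded-* headers you still need.
    proxy_set_header Accept-Encoding "";

    sub_filter 'href="/origin-d' 'href="/app-d';
    sub_filter 'src="/origin-d'  'src="/app-d';
    sub_filter 'url(/origin-d'   'url(/app-d';
    sub_filter 'http://10.0.4.17:8080/origin-d' 'https://www.example.com/app-d';
}
```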

A few other things I usually add:

         location /app-d {
             ...
             sub_filter_types *;
             sub_filter_once off;
             sub_filter_last_modified on;
         }

sub_filter_types * applies the substitution to all content types. By default it only applies to text/html, but often you'll need to change JavaScript and CSS too.

sub_filter_once is on by default, meaning it replaces the first match and stops. How useful is that? Not at all. You want to replace every instance. Keep in mind that each string is replaced once, so you can't replace, say, http://10.0.4.17:8080/origin-d/blah/blah with two rules, one for http://$proxy_host and another for /origin-d. You'll need a rule with the full URL.

Without sub_filter_last_modified set to on, nginx removes the Last-Modified header from the modified response, which busts client (and possibly intermediate proxy) caching. Turning it on preserves the “Last-Modified” header field from the original response during replacement, to facilitate response caching.

Summary

Reverse proxies are essential for organizing a complex backend into a simple facade. Nginx has powerfully expressive configuration directives to simplify this effort.


Published Jan 29, 2018
Updated October 3, 2020 (option 4 conf corrected)
NGINX icon used with permission from NGINX, Inc.
Comments? email me - mark at sawers dot com