Setting up a private Nix cache for fun and profit

November 7, 2022

Summary: We set up a private Nix cache for our office to (massively) speed up downloads and to save large amounts of bandwidth.

See our previous blog post for an intro to the Nix package manager.

Intro

Imagine you want to build a centralized caching system for arbitrary data that works over the internet. Your first design decision would be that the same URL always returns the same data, i.e. URLs are immutable. Imagine also that you have an origin server that serves the original data, and a caching server that is allowed to cache responses from the origin server. Whenever the caching server gets a request from a client for a URL that it has already requested from the origin server, it can serve the cached response instead of going back to the origin again.

One neat consequence of this design is that you can have many caching servers that all point to the same origin server. This way you can horizontally scale out the number of caching servers. Another neat consequence is that you can have a hierarchy of caching servers: One caching server can point to another caching server as its upstream. And the latter caching server can then point to the origin server. The key property that makes this design possible is the immutability of the URLs. What we have just described is how e.g. content distribution networks (CDNs) work. However, it is also how Nix caches work. And in this blog post we will explain how you can set up a private Nix cache in your office. This will speed up the downloading of build dependencies and minimize the amount of traffic that needs to be downloaded from outside the office network.
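The hierarchy described above can be illustrated with a small Python sketch. It is a toy in-memory model, not how Nix or Nginx are implemented; the names `CacheLayer`, `origin`, `regional` and `office` are made up for this example.

```python
from typing import Callable, Dict, List


class CacheLayer:
    """A cache keyed by URL that asks its upstream only on a miss."""

    def __init__(self, upstream: Callable[[str], bytes]) -> None:
        self.upstream = upstream
        self.store: Dict[str, bytes] = {}
        self.hits = 0
        self.misses = 0

    def get(self, url: str) -> bytes:
        if url in self.store:
            # URLs are immutable, so a stored response never goes stale.
            self.hits += 1
            return self.store[url]
        self.misses += 1
        body = self.upstream(url)
        self.store[url] = body
        return body


origin_requests: List[str] = []


def origin(url: str) -> bytes:
    """Stand-in for the origin server; records every request it receives."""
    origin_requests.append(url)
    return f"contents of {url}".encode()


# Two cache layers stacked in front of the origin: office -> regional -> origin.
regional = CacheLayer(origin)
office = CacheLayer(regional.get)

first = office.get("/nar/example.nar.xz")
second = office.get("/nar/example.nar.xz")
```

The second request is answered by the office layer alone, so the origin sees a single request no matter how often clients ask for the same URL.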

Overview

Our private Nix cache is implemented as an Nginx server that is running on a box in our office. We have three upstream servers that we want to cache:

  1. https://cache.nixos.org - the official Nix binary cache providing prebuilt binaries for Nixpkgs and NixOS. It is used automatically by the Nix package manager to speed up builds.
  2. https://channable-public.cachix.org - a public Nix cache for our own open-source projects, hosted by https://www.cachix.org/.
  3. https://channable.cachix.org - a private Nix cache for our own software, hosted by https://www.cachix.org/.

Nginx does all the heavy lifting in our setup - we use it as a reverse proxy for the three upstreams above and have configured it to cache all of the responses that we get from these three servers. Whenever Nginx gets a request for a URL that it has already previously seen it will first look in its local file cache, and serve the cached response from there. This response will be served from our local network and will thus be faster than getting the same data over the internet. This speeds up the builds of everyone in the office, leading to happier developers. Additionally, it also saves a lot of bandwidth, since we won't be downloading the same binaries for 50 different machines over the internet.

Configuring Nix and Nginx was not entirely straightforward. We encountered the following challenges:

  1. We must configure Nix to use our private cache instead of the upstream caches
  2. We must configure Nginx to route to three different upstreams
  3. We must support authentication for our private cache
  4. We must configure Nix to skip querying the cache if it is unreachable, e.g. if a developer is using a laptop and working from home

Let's look at each of these steps in turn.

Configuring Nix to use a private cache

The configuration in this section will have to be done on each developer machine (not on the server running Nginx).

First, we will have to tell Nix where to find our local caching server. We will assume that the local cache server is called cachixcache and will have IP 10.0.0.10. We are going to point three different domain names to this IP address by editing /etc/hosts and adding for example:

10.0.0.10       channable-cachix-org.cachixcache channable-public-cachix-org.cachixcache cache-nixos-org.cachixcache

Note that you can pick any domain names here, as long as you pick a different one for each upstream server. Having these three distinct domain names is required to be able to route to different upstream servers in Nginx, as we will explain in the next section. If you have an internal DNS server in your office, you can also configure the domain names there, and avoid having to edit /etc/hosts on each dev machine.
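If you have more than a handful of upstreams, the hosts entry can be generated instead of typed by hand. The following Python sketch assumes the naming convention used in this post (dots replaced by dashes, plus a `.cachixcache` suffix); the helper names `proxy_alias` and `hosts_line` are hypothetical.

```python
from typing import List


def proxy_alias(upstream_host: str, suffix: str = "cachixcache") -> str:
    """Map an upstream host name to the alias that points at the local cache."""
    return upstream_host.replace(".", "-") + "." + suffix


def hosts_line(ip: str, upstream_hosts: List[str]) -> str:
    """Build one /etc/hosts line that maps all aliases to the cache server."""
    aliases = " ".join(proxy_alias(host) for host in upstream_hosts)
    return f"{ip}\t{aliases}"


UPSTREAMS = [
    "channable.cachix.org",
    "channable-public.cachix.org",
    "cache.nixos.org",
]

line = hosts_line("10.0.0.10", UPSTREAMS)
print(line)
```

Any other scheme works just as well, as long as every upstream ends up with its own distinct alias.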

Next, we have to configure Nix to look for binaries on our private cache server. We do that by editing ~/.config/nix/nix.conf and changing the line that starts with substituters = ... to:

substituters = http://channable-cachix-org.cachixcache?priority=10 http://channable-public-cachix-org.cachixcache?priority=10 http://cache-nixos-org.cachixcache?priority=10 https://cache.nixos.org https://channable.cachix.org https://channable-public.cachix.org

A 'substituter' in Nix terminology is an upstream server that can serve prebuilt binaries for a given derivation (it can 'substitute' the binary instead of having to build from source).

Each substituter defines a "priority", which Nix uses as a hint to determine in which order the various substituters should be queried. The priority is an integer value, where a lower value means the substituter should be queried earlier. cache.nixos.org has a default priority of 40, and Cachix advertises a priority of 41.

We want our cache to be queried before the upstream caches: if Nix were to try the upstream caches first, we would end up not using the cache at all. We can do this by overriding the priority of our substituter, which is possible by appending ?priority=10 to the substituter URL, a feature that seems to be documented only in the release notes for Nix 2.4.
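To make the effect of the priority override concrete, here is a simplified Python model of the ordering. It is not Nix's actual implementation: the `ADVERTISED_PRIORITY` table just encodes the values mentioned above, and `effective_priority` and `query_order` are names invented for this sketch.

```python
from typing import Dict, List
from urllib.parse import parse_qs, urlsplit

# Priorities advertised by the upstreams, as stated in the text above.
ADVERTISED_PRIORITY: Dict[str, int] = {
    "cache.nixos.org": 40,
    "channable.cachix.org": 41,
    "channable-public.cachix.org": 41,
}


def effective_priority(url: str) -> int:
    """Return the ?priority= override if present, else the advertised value."""
    parts = urlsplit(url)
    override = parse_qs(parts.query).get("priority")
    if override:
        return int(override[0])
    return ADVERTISED_PRIORITY[parts.hostname]


def query_order(substituters: List[str]) -> List[str]:
    """Lower priority value means the substituter is queried earlier."""
    return sorted(substituters, key=effective_priority)


# Deliberately listed with the upstreams first, to show that the priority
# value (in this model) decides the order rather than the list position.
SUBSTITUTERS = [
    "https://channable.cachix.org",
    "https://cache.nixos.org",
    "http://cache-nixos-org.cachixcache?priority=10",
]

ordered = query_order(SUBSTITUTERS)
```

In this model the local cache, with its overridden priority of 10, comes first, followed by cache.nixos.org (40) and then Cachix (41).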

If you only use public upstream caches, you are now done with the local Nix configuration and can continue reading the next section. However, if you use a private cache, with authentication, then you still need to configure Nix to know about that. This works as follows:

Open ~/.config/nix/netrc and add a line like this:

machine channable-cachix-org.cachixcache password __your_password__

With this configuration in place Nix will always first try to contact the local cache before falling back to the global caches.
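A quick way to check that a netrc entry is well-formed is to parse it with Python's standard `netrc` module. This is only a sanity check under the assumption that Python's parser accepts the same syntax; Nix itself reads the file through libcurl, not through Python. The sketch writes the example line to a temporary file rather than touching your real ~/.config/nix/netrc.

```python
import netrc
import os
import tempfile

HOST = "channable-cachix-org.cachixcache"
NETRC_LINE = f"machine {HOST} password __your_password__\n"

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "netrc")
    with open(path, "w") as handle:
        handle.write(NETRC_LINE)
    # authenticators() returns a (login, account, password) tuple,
    # or None if there is no entry for the host.
    entry = netrc.netrc(path).authenticators(HOST)

password = entry[2] if entry is not None else None
```

If the entry is malformed, `netrc.netrc` raises `netrc.NetrcParseError` with the offending line number.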

Next, we will see how we need to configure Nginx to serve as a local caching proxy.

Configuring Nginx as a caching proxy

The configuration in this section will have to be done on the local caching server running Nginx.

We assume that you have nginx installed on the server. We then need to add a configuration file that tells nginx about our three upstream servers, and also tells it to cache responses on the local filesystem somewhere.

We provide a heavily commented nginx config file below. This file can be saved in /etc/nginx/sites-available as e.g. cache-nixos-org-proxy and must then be symlinked to /etc/nginx/sites-enabled so that nginx picks it up (you need to restart Nginx after you have created the file).

Have a look at the comments in the file below to see how it works.

# Caching reverse proxy for Nix substituters.

# This file defines Nginx configuration that makes Nginx forward requests to the
# Nix substituters that we use, and then cache the responses on disk. Subsequent
# requests for the same URLs will then be served from disk instead of having to
# go over the network, which should be much faster.

# === COMMON OPTIONS FOR ALL UPSTREAMS ===
# First off, some common configuration options for all vhosts in this file.
# Note that because these options are in the `http` context these options will
# also apply to any other `proxy_pass` directives for this server!
# If you apply this to an existing Nginx server you'll want to move these
# directives into the `server` blocks.

# Tell Nginx to set up a response cache at `/data/nginx/cache`, with a maximum
# size of 800 GB or 400_000 stored files.
# Also set `inactive` to tell Nginx to keep files until they haven't been
# accessed for a year, instead of the default of removing files after they
# haven't been accessed for 10 minutes.
# See: https://nginx.org/en/docs/http/ngx_http_proxy_module.html#proxy_cache_path
proxy_cache_path /data/nginx/cache max_size=800G keys_zone=cache_zone:50m inactive=365d;

# Tell Nginx to actually use the response cache to cache requests.
# See: https://nginx.org/en/docs/http/ngx_http_proxy_module.html#proxy_cache
proxy_cache cache_zone;

# Since Nix store paths are immutable, we can cache successful responses for a
# long time.
# We only want to cache successful responses: if we get a 404 error or a 401
# error, we want the request to be retried the next time a client asks for it.
# We therefore list status code 200 explicitly: without any status codes,
# `proxy_cache_valid` would cache 200, 301 and 302 responses.
# See: https://nginx.org/en/docs/http/ngx_http_proxy_module.html#proxy_cache_valid
proxy_cache_valid 200 365d;
proxy_cache_use_stale error timeout invalid_header updating http_500 http_502 http_504 http_403 http_404 http_429;

# Important: We need to ignore all common Cache-Control headers, since nginx by default
# gives them HIGHER priority than the proxy_cache_valid directive above. We do not
# want that, since we know that Nix urls are immutable.
# See: https://nginx.org/en/docs/http/ngx_http_proxy_module.html#proxy_ignore_headers
proxy_ignore_headers X-Accel-Expires Expires Cache-Control Set-Cookie;

# Enable request deduplication for requests to the upstream servers.
# This means that if two requests come in for the same store path the second
# request will wait for the first request to complete (hopefully successfully)
# and then serve that response, instead of opening two connections to the
# upstream server.
# See: https://nginx.org/en/docs/http/ngx_http_proxy_module.html#proxy_cache_lock
proxy_cache_lock on;

# Disable IPv6 resolution: our office network does not offer IPv6 addresses, but
# Nginx will still attempt to connect to IPv6 addresses, which spams the error
# log.
# You'll probably want to remove this option if your network does support IPv6
# or you use a different DNS server.
# See: https://nginx.org/en/docs/http/ngx_http_core_module.html#resolver
resolver 8.8.8.8 ipv6=off;

# When connecting to an upstream server, do use TLS SNI to indicate which server
# to connect to. Without this option Nginx fails to connect to Cachix upstreams.
# See: https://nginx.org/en/docs/http/ngx_http_proxy_module.html#proxy_ssl_server_name
proxy_ssl_server_name on;

# When connecting to an upstream server, do verify the TLS certificate since
# this is outside of our network. Nginx defaults to not verifying certificates.
# See: https://nginx.org/en/docs/http/ngx_http_proxy_module.html#proxy_ssl_trusted_certificate
proxy_ssl_verify on;
proxy_ssl_trusted_certificate /etc/ssl/certs/ca-certificates.crt;

# === PER-UPSTREAM CONFIGURATION ===
# With the above common configuration options, we can now define the upstream
# substituters to connect to.

# For each upstream substituter we define a separate `server` block that
# forwards traffic to the actual server. Nginx will determine based on the Host
# header which upstream server is to be used, and will check if the requested
# data is present in the cache before forwarding the request to the upstream
# substituter.

server {
    listen 80;

    server_name cache-nixos-org.cachixcache;

    location / {
        proxy_set_header Host $proxy_host;
        proxy_pass https://cache.nixos.org;
    }
}


server {
    listen 80;

    server_name channable-cachix-org.cachixcache;

    location / {
        # This cache requires authentication, so forward the Authorization
        # header to Cachix.
        proxy_set_header Authorization $http_authorization;
        proxy_set_header Host $proxy_host;

        proxy_pass https://channable.cachix.org;
    }
}

server {
    listen 80;

    server_name channable-public-cachix-org.cachixcache;

    location / {
        proxy_set_header Host $proxy_host;
        proxy_pass https://channable-public.cachix.org;
    }
}

A few notes on this config file:

  • make sure to only pass the Authorization header to the private upstream to avoid accidentally leaking it to another server
  • by default Nginx gives Cache-Control headers sent by the upstream a higher priority than the proxy_cache_valid directive in this config file. We don't want that, since Nix URLs are immutable. We therefore set proxy_ignore_headers X-Accel-Expires Expires Cache-Control Set-Cookie; to ignore those cache control headers. Note that we also ignore Set-Cookie, since the presence of that header also leads to responses not being cached (and CDNs such as Cloudflare sometimes set a cookie even for static content).
  • make sure to set the inactive=365d parameter in proxy_cache_path /data/nginx/cache max_size=800G keys_zone=cache_zone:50m inactive=365d; Otherwise Nginx will drop entries after 10 minutes by default.

This config file now allows us to cache responses from both public and private Nix upstreams.
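The "400_000 stored files" figure in the config comments follows from the size of the keys zone. The Nginx documentation states that a one-megabyte zone can store about 8 thousand keys, so the limit can be estimated with a little arithmetic. The Python sketch below only redoes that calculation; the function name is made up, and the result is an approximation, not an exact limit.

```python
KEYS_PER_MEGABYTE = 8_000  # approximate figure from the Nginx documentation


def approx_max_keys(zone_megabytes: int) -> int:
    """Estimate how many cache entries fit in a keys_zone of the given size."""
    return zone_megabytes * KEYS_PER_MEGABYTE


max_keys = approx_max_keys(50)  # keys_zone=cache_zone:50m

# Average file size (in MiB) at which the 800G size limit and the key limit
# would be reached at roughly the same time.
average_mib = 800 * 1024 / max_keys

print(max_keys, average_mib)
```

If your cached files are much smaller than about 2 MiB on average, the keys zone fills up before the disk does, and you may want a larger keys_zone.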

Making Nix work both from the office and from home

However, one problem still prevented us from rolling this out to the whole dev team: most of our devs work on laptops and are in the office on some days and at home on others. The private cache is only reachable from the office network, but Nix will now always try to contact it first, even when people are at home. This makes Nix slower at home, since it retries for a few seconds before falling back to the upstream binary caches. That was not acceptable: we wanted Nix to be faster in the office without being slower at home.

To work around this issue we added another Nginx process to the pipeline, this time on our developers' dev machines. We configure this local Nginx process to do two things:

  1. Always respond to /nix-cache-info with a fixed response, which is required for Nix to consider it a binary cache, and
  2. Forward any other requests to the local cache server if it is available, and to respond with a 404 error otherwise.

A response with status code 404 causes Nix to not retry querying the substituter for the path, and Nix will instead immediately query the next substituter, if any.

The above behavior can be achieved with the following Nginx configuration:

server {
    listen 80;

    server_name cache-nixos-org.cachixcache channable-cachix-org.cachixcache channable-public-cachix-org.cachixcache;

    location /nix-cache-info {
        return 200 "StoreDir: /nix/store\nWantMassQuery: 1\nPriority: 41\n";
    }

    location @fallback {
        return 200 "404";
    }

    location / {
        # Use a very short timeout for connecting to the cache, since it should be available in the
        # local network.
        proxy_send_timeout 100ms;
        proxy_connect_timeout 100ms;

        # Serve a 404 response if the cache server cannot be reached:
        error_page 502 504 =404 @fallback;

        # Forward to the actual cache server:
        proxy_set_header Host $http_host;
        proxy_set_header Authorization $http_authorization;
        proxy_pass http://10.0.0.10;
    }
}

After this Nginx configuration is running locally it can be used by adjusting your /etc/hosts to have the *.cachixcache hosts resolve to localhost instead:

127.0.0.1    cache-nixos-org.cachixcache channable-cachix-org.cachixcache channable-public-cachix-org.cachixcache

Once this configuration is in place Nix should automatically fall back to the upstream substituters if the local cache server is unreachable.
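The decision logic of the local Nginx process can be summarised in a few lines of Python. This is a simplified model for illustration only: `handle`, `reachable` and `unreachable` are hypothetical names, the path check is an exact match where Nginx does a prefix match, and a real proxy also deals with headers and streaming.

```python
from typing import Callable, Tuple

Response = Tuple[int, str]

NIX_CACHE_INFO = "StoreDir: /nix/store\nWantMassQuery: 1\nPriority: 41\n"


def handle(path: str, forward: Callable[[str], Response]) -> Response:
    """Answer /nix-cache-info locally, forward the rest, 404 when offline."""
    if path == "/nix-cache-info":
        return 200, NIX_CACHE_INFO
    try:
        return forward(path)
    except (ConnectionError, TimeoutError):
        # The office cache is unreachable: tell Nix the path is not here,
        # so that it moves on to the next substituter right away.
        return 404, "404"


def unreachable(path: str) -> Response:
    """Models working from home: connecting to the office cache times out."""
    raise TimeoutError("office cache not reachable")


def reachable(path: str) -> Response:
    """Models working in the office: the cache answers normally."""
    return 200, f"cached response for {path}"


at_home = handle("/example.narinfo", unreachable)
in_office = handle("/example.narinfo", reachable)
info_at_home = handle("/nix-cache-info", unreachable)
```

The important property is that the offline case fails fast with a 404 instead of making Nix wait and retry.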

How Nix uses compression

Nix uses xz compression by default for Nix archives (nar files) served from cache.nixos.org. Cachix also uses xz (and there is work being done to support zstd). Nix itself also supports using the brotli and zstd compression algorithms. However, both cache.nixos.org and Cachix only offer xz-compressed files at the moment. One drawback of using xz is that compressing and decompressing files is relatively slow, and this becomes a bottleneck when the network is fast.

In our testing (on a dev laptop) we saw decompression for a single derivation max out at 21 MB/s, while we could download the Nix archive at 110 MB/s. So while the private Nix cache solves the problem of upstreams sometimes being slow, decompressing Nix archives has now become the bottleneck.

(Note that Nix does download and decompress derivations in parallel, so we can still saturate bandwidth and/or CPU if there are enough store paths to download and decompress.)
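A rough back-of-the-envelope calculation shows how many parallel decompression streams it takes to keep up with the network. The Python sketch below uses the two throughput numbers measured above and assumes, optimistically, that decompression scales linearly across streams and that enough CPU cores are available.

```python
import math

DOWNLOAD_MB_PER_S = 110   # measured download speed from the local cache
DECOMPRESS_MB_PER_S = 21  # measured xz decompression speed for one derivation

# Number of store paths that must be decompressed concurrently to keep up
# with the download speed, under the linear-scaling assumption.
streams_needed = math.ceil(DOWNLOAD_MB_PER_S / DECOMPRESS_MB_PER_S)

# For a single large archive the decompression time dominates.
archive_mb = 1000  # hypothetical archive size, chosen for illustration
download_seconds = archive_mb / DOWNLOAD_MB_PER_S
decompress_seconds = archive_mb / DECOMPRESS_MB_PER_S

print(streams_needed, round(download_seconds, 1), round(decompress_seconds, 1))
```

With fewer than about six store paths in flight, or with one very large store path, decompression rather than the network determines how long a fetch takes.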

As a next step, we are going to look into tweaking xz to use a lower compression level, to achieve faster compression and decompression speeds. And as soon as zstd is available, we will look into upgrading to that, since it offers very fast (de)compression speeds for only slightly bigger storage sizes.

Conclusion

This project started out as a fun Hackathon project with the goal of speeding up our builds. We are happy to report that our downloads are now significantly faster when served from the local Nix cache than when they are served from the internet. We also save a lot of bandwidth, since we have many developers working in the office on any given day, and many of them share build artifacts, which can now be served from the cache (after the first download).

This project has sped up our local builds considerably as we are now able to get prebuilt binaries at 1 Gbit/s and are only limited by our internal network. The latency for cache hits is also considerably lower now, leading to happy devs.

Robert Kreuzer, Co-founder & CTO
Maarten van den Berg, DevOps
Rodrigo Lourenço, DevOps
