Sandboxing nginx with systemd


nginx uses a master process and several worker processes. Normally, the master process runs as root. If you look online, the common wisdom is that there’s no way around this, because nginx needs root access to bind to low-numbered ports:

The process you noticed is the master process, the process that starts all other nginx processes. This process is started by the init script that starts nginx. The reason this process is running as root is simply because you started it as root! […]
 Most importantly; Only root processes can listen to ports below 1024. A webserver typically runs at port 80 and/or 443. That means it needs to be started as root.
 In conclusion, the master process being run by root is completely normal and in most cases necessary for normal operation.

However, Linux has a feature called capabilities, which allows a process to perform one specific privileged operation without being able to perform every kind of privileged operation. If you look through the capabilities(7) manual page, you’ll find one capability which is exactly what we need: CAP_NET_BIND_SERVICE. This allows a process to bind to a low-numbered port, despite not being root. Perfect!
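To make the idea concrete: the kernel reports a process’s capabilities as a hexadecimal bitmask in /proc/PID/status, and CAP_NET_BIND_SERVICE is bit 10 of that mask. The following is a minimal sketch for checking that bit. The function names are my own invention, not part of any tool.

```shell
#!/bin/sh
# CAP_NET_BIND_SERVICE is capability number 10 in <linux/capability.h>.
CAP_NET_BIND_SERVICE_BIT=10

# has_net_bind_service HEXMASK (hypothetical helper)
# Succeeds if the hex capability mask, as found on the CapEff line of
# /proc/PID/status, has the CAP_NET_BIND_SERVICE bit set.
has_net_bind_service() {
    mask=$(( 0x$1 ))
    [ $(( (mask >> CAP_NET_BIND_SERVICE_BIT) & 1 )) -eq 1 ]
}

# effective_caps PID (hypothetical helper)
# Prints the effective capability mask of a running process.
effective_caps() {
    awk '/^CapEff:/ { print $2 }' "/proc/$1/status"
}
```

For example, has_net_bind_service "$(effective_caps "$PID")" tests a running process. A process holding only this one capability would show a mask of 0000000000000400.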

Editing the systemd service file

Now we need a way to start nginx as an unprivileged user, with this one additional capability. You can do this with systemd. We just need to change a few configuration files.

First, stop the nginx process.

sudo systemctl stop nginx

Now, copy the system-provided nginx service file into the local configuration area.

sudo cp /lib/systemd/system/nginx.service \
/etc/systemd/system/nginx.service

Now, use your favorite text editor to edit /etc/systemd/system/nginx.service. Unit files in /etc/systemd/system take precedence, so this copy overrides the system-provided service file.

Go down to the [Service] section, and add these two lines:

User=www-data
Group=www-data

This will start nginx as an unprivileged user. However, to make this work, we need to give nginx the CAP_NET_BIND_SERVICE capability. Add this line:

AmbientCapabilities=CAP_NET_BIND_SERVICE

Next, we need to create a place for nginx to write its PID file. Currently, it writes to /run/nginx.pid, and /run is a directory owned by root, so an unprivileged nginx can’t create a file there. We need to create a directory called /run/nginx which is owned by www-data. To do this, add this line:

RuntimeDirectory=nginx

systemd will automatically create this directory with the correct ownership.

Now, we need to move the PID file. Edit the line starting with PIDFile to read:

PIDFile=/run/nginx/nginx.pid

We’ll also need to tell nginx about this new PID file.

Edit the file /etc/nginx/nginx.conf. Change the line starting with pid to read:

pid /run/nginx/nginx.pid;
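The two PID paths have to agree, or systemd will wait for a PID file that nginx never writes. Here is a rough sketch of a check that compares them. The helper names are made up for this example, and the parsing assumes a single uncommented PIDFile= line and a single pid directive.

```shell
#!/bin/sh
# unit_pidfile UNIT_FILE (hypothetical helper)
# Prints the value of the last PIDFile= line in a unit file.
unit_pidfile() {
    sed -n 's/^PIDFile=//p' "$1" | tail -n 1
}

# nginx_pidfile NGINX_CONF (hypothetical helper)
# Prints the path given to the last "pid" directive in nginx.conf.
nginx_pidfile() {
    sed -n 's/^[[:space:]]*pid[[:space:]][[:space:]]*\([^;]*\);.*$/\1/p' "$1" | tail -n 1
}

# pid_paths_match UNIT_FILE NGINX_CONF (hypothetical helper)
# Succeeds only if both files name the same, non-empty PID file path.
pid_paths_match() {
    u=$(unit_pidfile "$1")
    n=$(nginx_pidfile "$2")
    [ -n "$u" ] && [ "$u" = "$n" ]
}
```

You could run it as pid_paths_match /etc/systemd/system/nginx.service /etc/nginx/nginx.conf || echo "PID paths differ".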

Now reload systemd’s configuration and restart nginx:

sudo systemctl daemon-reload
sudo systemctl restart nginx
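Once nginx is back up, it’s worth confirming that the master process really is running as www-data. A small sketch, assuming the master shows up in the process list with its usual “nginx: master process” title (the function name is just for illustration):

```shell
#!/bin/sh
# nginx_master_user (hypothetical helper)
# Reads "USER ARGS..." lines on stdin, as printed by `ps -eo user=,args=`,
# and prints the user that owns the nginx master process.
nginx_master_user() {
    awk '$2 == "nginx:" && $3 == "master" { print $1; exit }'
}
```

Run it as ps -eo user=,args= | nginx_master_user. If the changes above took effect, it should print www-data rather than root.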

If you get an error, run this command to see a detailed error message.

sudo journalctl -u nginx

Additional sandboxing

Note: the following section assumes you have a systemd version greater than 235. To see your systemd version, run systemctl --version.
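If you want to script that check, here is one possible sketch. It assumes the first line of the systemctl --version output looks like “systemd 245 (245.4-4ubuntu3)”, and the function names are my own.

```shell
#!/bin/sh
# systemd_version (hypothetical helper)
# Reads `systemctl --version` output on stdin and prints the leading
# numeric version from the first line.
systemd_version() {
    awk 'NR == 1 { v = $2; sub(/[^0-9].*$/, "", v); print v; exit }'
}

# version_newer_than VERSION MINIMUM (hypothetical helper)
# Succeeds if VERSION is strictly greater than MINIMUM.
version_newer_than() {
    [ "$1" -gt "$2" ]
}
```

For example: version_newer_than "$(systemctl --version | systemd_version)" 235 || echo "systemd may be too old for this section".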

Running nginx as a non-root user is a good first step, but what else can we do to make this more secure? Linux has many built-in sandboxing features which systemd can make use of.

I added the following to my systemd configuration for nginx.service:

# Process may not gain any capabilities besides the one we just gave it
CapabilityBoundingSet=CAP_NET_BIND_SERVICE
# Process is not allowed to gain new privileges using SUID binaries such as sudo
NoNewPrivileges=true
# Disables use of the personality(2) system call, which may have security bugs
LockPersonality=true
# Allows only common service-related system calls
SystemCallFilter=@system-service
# When a system call is disallowed, return an error code instead of killing the process
SystemCallErrorNumber=EPERM

You can download my full systemd service file and my nginx configuration here.

Using systemd-analyze

systemd ships with a tool that analyzes how much each service on your system makes use of systemd’s security features. (Note: this report doesn’t consider non-systemd methods of sandboxing, such as a service dropping privileges using setuid.) Run this command to see the report:

SYSTEMD_EMOJI=0 systemd-analyze security

You can also get detailed information about a single unit by running

systemd-analyze security nginx

By following this guide, you can reduce systemd’s risk score for nginx from 9.5 (UNSAFE) to 5.0 (MEDIUM).
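If you’d like to track that score over time, you can pull the number out of the report. This sketch assumes the per-unit report ends with a line of the form “Overall exposure level for nginx.service: 5.0 MEDIUM”. That wording is an assumption and may differ between systemd versions, and the function name is hypothetical.

```shell
#!/bin/sh
# exposure_score (hypothetical helper)
# Reads `systemd-analyze security UNIT` output on stdin and prints the
# numeric score from the "Overall exposure level" line, if there is one.
exposure_score() {
    sed -n 's/^.*Overall exposure level for [^:]*: *\([0-9][0-9.]*\).*$/\1/p'
}
```

Usage would be something like SYSTEMD_EMOJI=0 systemd-analyze security nginx | exposure_score.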

Further work

There are several other things you could do to improve this sandbox:

  • Make the syscall filter more restrictive. The @system-service filter is very broad and over-inclusive. Using perf, you can record exactly which syscalls a service makes, and allow only those syscalls. However, keep in mind that loading new plugins into nginx, or changing its configuration, may cause your syscall list to become out-of-date. For example, an nginx configuration which serves static files will use different syscalls than one which proxies traffic to another service. Here’s a writeup on how to do this: https://prefetch.net/blog/2017/11/27/securing-systemd-services-with-seccomp-profiles/
  • Disallow nginx from changing kernel tunables and modules.
  • Disallow nginx from connecting to unix domain sockets, netlink sockets, or opening raw sockets.
  • Whitelist which devices in /dev nginx is allowed to read/write.
  • Blacklist namespace-altering syscalls.

However, I chose not to include these things. First, many of them would require an attacker to have root privilege anyway, so once the service is no longer running as root, they have little value. Second, they have some possibility of breaking someone’s configuration. The sandbox settings I show are intended to be general-purpose and work in a variety of contexts.
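If you want to experiment with them anyway, the sketch below prints a drop-in fragment that maps most of the ideas above onto systemd directives. This is untested and not part of my configuration. In particular, restricting address families to AF_INET and AF_INET6 will break setups that proxy to a unix domain socket or log to syslog. The function name and the drop-in file name are hypothetical.

```shell
#!/bin/sh
# print_extra_hardening (hypothetical helper)
# Prints an untested drop-in fragment with stricter sandbox settings.
print_extra_hardening() {
    cat <<'EOF'
[Service]
# Make kernel tunables such as /proc/sys and /sys read-only
ProtectKernelTunables=true
# Deny explicit loading and unloading of kernel modules
ProtectKernelModules=true
# Allow only IPv4 and IPv6 sockets (no unix, netlink, or packet sockets)
RestrictAddressFamilies=AF_INET AF_INET6
# Give the service a private /dev containing only pseudo-devices
PrivateDevices=true
# Deny creating or entering namespaces
RestrictNamespaces=true
EOF
}
```

To try it, create the directory /etc/systemd/system/nginx.service.d, write the output to a file such as hardening.conf inside it (for example with print_extra_hardening | sudo tee), then run daemon-reload and restart nginx as before. If nginx fails to start, remove the lines one at a time to find the culprit.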

Testing notes

I have tested this configuration on recent versions of Debian, Fedora, and Ubuntu. Here’s what I’ve found:

  • Works on Debian Buster
  • Partially works on Debian Stretch (Note: You must comment out LockPersonality and SystemCallFilter.)
  • Doesn’t work on Fedora 32. The use of NoNewPrivileges interferes with SELinux somehow. If you skip the “Additional sandboxing” step, and substitute ‘nginx’ for ‘www-data’, it will work. This is possibly fixable, but I don’t have much knowledge of SELinux.
  • Works on Ubuntu 20.04
  • Partially works on Ubuntu 18.04. (Note: You must comment out SystemCallFilter.)