Recently one of my older servers died and I decided to move its data and services to other, newer systems and get rid of the old power-hungry hardware. One of those services was my local Ubuntu mirror for all the other servers in that colo. Accelerating package updates is nice, but storing hundreds of GB of data that is rarely if ever used isn’t. Better to replace it with a small reverse proxy, basically mirroring those mirrors I regularly use and saving only the stuff that’s actually requested…
Squid seems a bit much and way too complex for this, and Varnish works but would need another small webserver beside it if I want to serve some local files (for example misc. ISO images) as well, so… let’s go with nginx: a very lightweight and fast HTTP server and proxy in one package. And let’s run it as a Docker service so I can quickly deploy it wherever I want in the future by copying just two files:
docker-compose.yml
mirror:
  image: nginx
  volumes:
    - ./mirror.conf:/etc/nginx/conf.d/default.conf
    - ./index.html:/var/www/index.html
    - ./repo-ubuntu:/var/repo_mirror/
    - ./cdimages:/var/www/cdimages/
  ports:
    - "80:80"
  command: /bin/bash -c "nginx -g 'daemon off;'"
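Once both files are in place, bringing the mirror up is a one-liner; a quick sketch, assuming docker-compose is installed and you are in the directory holding the two files:

docker-compose up -d              # start the container in the background
docker-compose logs -f mirror     # tail the nginx log to see it come up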
And the mirror.conf with the nginx configuration:
upstream ubuntu {
    # choose your nearest mirror
    server de.archive.ubuntu.com;
    server uk.archive.ubuntu.com;
    server us.archive.ubuntu.com;
    server archive.ubuntu.com backup;
}
# some generic tuning
tcp_nopush on;
tcp_nodelay on;
types_hash_max_size 2048;
# where the cache is located on disk
# to keep the data persistent, make it a docker volume
proxy_cache_path /var/repo_mirror    # defines where the cache is stashed
    # defines the cache path hierarchy
    levels=1:2
    # defines name and size of the zone where all cache keys and cache metadata are stashed
    keys_zone=repository_cache:50m
    # data access timeout - don't cache packages for more than two weeks
    inactive=14d
    # cache size limit
    max_size=10g;
server {
    listen 80;
    root /var/www/;

    # some additional ISO files on the mirror, added via docker volume
    location /cdimages/ {
        autoindex on;
    }

    # don't log in production mode, way too much info
    access_log off;

    # location directive for the /ubuntu path
    location /ubuntu {
        # cache root, see above
        root /var/repo_mirror/index_data;
        # look for files in the following order
        try_files $uri @ubuntu;
    }
    # named location, used as the try_files fallback above
    location @ubuntu {
        # map to upstream
        proxy_pass http://ubuntu;
        # two weeks of caching for http code 200 response content,
        # 15 minutes for 301 and 302,
        # one minute for everything else
        proxy_cache_valid 200 14d;
        proxy_cache_valid 301 302 15m;
        proxy_cache_valid any 1m;
        # use the "repository_cache" zone defined above
        proxy_cache repository_cache;
        # serve stale data on these error events
        proxy_cache_use_stale error timeout invalid_header updating http_500 http_502 http_503 http_504;
        # go to the backup server on these error events
        proxy_next_upstream error timeout invalid_header http_500 http_502 http_503 http_504;
        # lock parallel requests and fetch from the backend only once
        proxy_cache_lock on;
        # pass the usual proxy headers upstream
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        # set some debug headers, just in case
        add_header X-Mirror-Upstream-Status $upstream_status;
        add_header X-Mirror-Upstream-Response-Time $upstream_response_time;
        add_header X-Mirror-Status $upstream_cache_status;
    }
}
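With the proxy running, the other servers only need their apt sources pointed at it instead of the public mirrors. A minimal sketch of a client’s /etc/apt/sources.list, assuming the mirror host is reachable as mirror.example.lan (hostname and release name are placeholders, adjust to your environment):

deb http://mirror.example.lan/ubuntu focal main restricted universe multiverse
deb http://mirror.example.lan/ubuntu focal-updates main restricted universe multiverse
deb http://mirror.example.lan/ubuntu focal-security main restricted universe multiverse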
Adjust to your liking – this currently works for me.
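To check that the cache is actually doing its job, the debug headers from the config come in handy; a quick test, assuming the proxy runs on the local machine (the path is just an example):

curl -sI http://localhost/ubuntu/dists/focal/Release | grep -i x-mirror

The first request should report X-Mirror-Status: MISS, and a repeat of the same request X-Mirror-Status: HIT, served straight from the local cache.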