Installing Arch Linux on a Bare-metal Server

These are all the commands I typed when I set up Arch Linux on my new compute server.

PSA[1]: I published a toolchain for creating/testing PKGBUILDs in a clean-room Docker container: https://github.com/uetchy/archpkgs

PSA[2]: I also published cfddns (AUR), a Cloudflare DDNS client written in Rust.

Goals

  • /dev/sda - NVMe M.2 SSD
    • /dev/sda1 - EFI system partition mounted on /boot (systemd-boot)
    • /dev/sda2 - LUKS partition contains Btrfs subvolumes
      • @ -> /
      • @home -> /home
      • @srv -> /srv
        • Docker stacks directory (nginx-proxy, Mail, Nextcloud, Minio, JupyterHub, Weights & Biases, etc)
      • @log -> /var/log
      • @cache -> /var/cache
  • /dev/sdb - HDD for data vault
    • /dev/sdb1 - LUKS partition contains Btrfs mounted on /mnt/vault
  • /dev/sdc - HDD for backups
    • /dev/sdc1 - Btrfs mounted on /mnt/backups
  • /dev/sde - SSD for analytical database (intensive write-ops)
    • /dev/sde1 - XFS mounted on /mnt/analytics

Why XFS for analytical database storage? Refer to Production Notes — MongoDB Manual and Configure Scylla | Scylla Docs.

Setup

Wipe a disk

# Erase file-system magic strings (insecure but super fast, suitable when reusing a disk)
wipefs -a /dev/sdN

# or

# Write (random then zeroes) to the device (takes longer but more secure, suitable when selling a disk)
shred -v -n 1 -z /dev/sdN

Create partitions

These sgdisk commands calculate optimal sector alignment automatically. You can confirm it by running sfdisk -d /dev/sda.

# Overwrite new GPT
sgdisk -og /dev/sda

# Create 1GiB EFI system partition
sgdisk -n 1:0:+1G -c 1:boot -t 1:ef00 /dev/sda

# Fill the rest with a LUKS partition
sgdisk -n 2:0:0 -c 2:crypt -t 2:8308 /dev/sda

# Data disks
sgdisk -og /dev/sdb
sgdisk -n 1:0:0 -c 1:vault -t 1:8308 /dev/sdb # LUKS

sgdisk -og /dev/sdc
sgdisk -n 1:0:0 -c 1:backups /dev/sdc

sgdisk -og /dev/sde
sgdisk -n 1:0:0 -c 1:analytics /dev/sde

# Verify the result
sgdisk -p /dev/sdN
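# Optional: parted can also verify partition alignment explicitly (repeat per partition)
parted /dev/sda align-check optimal 1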

NOTE: Since my server has 128GB of physical memory, I would rather let the OOM killer do its job than create a swap partition. Should the need for swap come up later, I'll consider a swap file instead (in general there's no performance difference).
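Should that need arise, note that a swap file on Btrfs must have copy-on-write disabled first; a minimal sketch (4GiB is an arbitrary size):

truncate -s 0 /swapfile
chattr +C /swapfile # disable copy-on-write (required for swap files on Btrfs)
fallocate -l 4G /swapfile
chmod 600 /swapfile
mkswap /swapfile
swapon /swapfile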

Write file systems

# VFAT32 ESP
mkfs.vfat -F 32 -n ESP /dev/sda1

# LUKS2
cryptsetup luksFormat /dev/sda2
cryptsetup \
  --allow-discards \
  --perf-no_read_workqueue \
  --perf-no_write_workqueue \
  --persistent \
  open /dev/sda2 crypt

cryptsetup luksFormat /dev/sdb1
cryptsetup open /dev/sdb1 vault

# Verify the LUKS devices
cryptsetup luksDump /dev/sdN # Dump LUKS2 header
dmsetup table # Show flags for the currently opened devices

# Also, backup the LUKS headers to safe storage
cryptsetup luksHeaderBackup /dev/sdN --header-backup-file /path/to/luks_header_sdN

# Btrfs for root partition
mkfs.btrfs -L crypt /dev/mapper/crypt
mount /dev/mapper/crypt /mnt # Temporary mounted to create subvolumes
# btrfs [su]bvolume [cr]eate
btrfs su cr /mnt/@
btrfs su cr /mnt/@home
btrfs su cr /mnt/@cache
btrfs su cr /mnt/@log
btrfs su cr /mnt/@srv # Home for Docker Compose stacks
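# [!] Verify the subvolume ID of '@' before setting it as the default below
#     (it is usually 256, but can differ if subvolumes were recreated)
btrfs su list /mnt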
btrfs su set-default 256 /mnt # Make '@' (ID 256) the default subvolume; required for remote unlocking
umount /mnt

# Btrfs
mkfs.btrfs -L vault /dev/mapper/vault
mkfs.btrfs -L backups /dev/sdc1

# XFS
mkfs.xfs -L analytics /dev/sde1

See Discard/TRIM support for solid state drives (SSD) - Dm-crypt - ArchWiki for the reasoning behind these cryptsetup flags.

Mount partitions

# Root partition
mount /dev/mapper/crypt /mnt
mount -m -o subvol=@home /dev/mapper/crypt /mnt/home
mount -m -o subvol=@cache /dev/mapper/crypt /mnt/var/cache
mount -m -o subvol=@log /dev/mapper/crypt /mnt/var/log
mount -m -o subvol=@srv /dev/mapper/crypt /mnt/srv

# EFI system partition
mount -m /dev/sda1 /mnt/boot

# Extra disks
mount -m /dev/mapper/vault /mnt/mnt/vault
mount -m /dev/sde1 /mnt/mnt/analytics
mount -m /dev/sdc1 /mnt/mnt/backups

Install Linux kernel

# This is necessary for older Arch ISO images
pacman -Sy archlinux-keyring

# Choose between 'linux-lts' and 'linux'
pacstrap /mnt base linux-lts linux-firmware \
  btrfs-progs xfsprogs vim man-db man-pages

Generate fstab

# Generate fstab based on current /mnt structure
genfstab -U /mnt >> /mnt/etc/fstab

Tweak pacman

# Optimize mirrorlist (replace `country` params with your nearest countries)
pacman -S --needed pacman-contrib
curl -s 'https://archlinux.org/mirrorlist/?use_mirror_status=on&protocol=https&country=JP&country=KR&country=HK' | sed -e 's/#//' -e '/#/d' | rankmirrors -n 10 - > /mnt/etc/pacman.d/mirrorlist

# Colorize output
sed '/#Color/a Color' -i /mnt/etc/pacman.conf

# Parallel downloads
sed '/#ParallelDownloads/a ParallelDownloads = 5' -i /mnt/etc/pacman.conf

# ILoveCandy
sed '/# Misc/a ILoveCandy' -i /mnt/etc/pacman.conf

Chroot into the installation

# Chroot
arch-chroot /mnt

# Change root password
passwd

Finish structuring file systems

# Verify fstab entries
findmnt --verify --verbose

crypttab

echo "crypt UUID=$(blkid /dev/sda2 -s UUID -o value) none luks" >> /etc/crypttab
echo "vault UUID=$(blkid /dev/sdb1 -s UUID -o value) none luks" >> /etc/crypttab

cat /etc/crypttab

Remote unlocking

pacman -S --needed mkinitcpio-systemd-tool openssh cryptsetup tinyssh busybox mc python3

# crypttab for initramfs
echo "crypt UUID=$(blkid /dev/sda2 -s UUID -o value) none luks" >> /etc/mkinitcpio-systemd-tool/config/crypttab
# [!] Add every other device whose password differs from the `crypt` device,
#     so that all the passwords will be prompted for during remote unlocking

# fstab for initramfs
echo "UUID=$(blkid /dev/mapper/crypt -s UUID -o value) /sysroot auto x-systemd.device-timeout=9999h 0 1" >> /etc/mkinitcpio-systemd-tool/config/fstab

# Comment out the stock HOOKS line and add one with 'systemd systemd-tool' appended and 'udev' removed
sed -r '/^HOOKS=/s/^/#/' -i /etc/mkinitcpio.conf
sed -r '/^#HOOKS=/a HOOKS=(base autodetect modconf block filesystems keyboard fsck systemd systemd-tool)' -i /etc/mkinitcpio.conf

# Change SSH port
mkdir -p /etc/systemd/system/initrd-tinysshd.service.d
cat > /etc/systemd/system/initrd-tinysshd.service.d/override.conf <<EOD
[Service]
Environment=
Environment=SSHD_PORT=12345
EOD

# Assign static IP because we are behind NAT
cat > /etc/mkinitcpio-systemd-tool/network/initrd-network.network <<EOD
[Match]
# [!] use kernel interface name, not udev name
Name=eth0

[Network]
Address=10.0.1.2
Gateway=10.0.1.1
DNS=9.9.9.9
EOD

# Enable required services
systemctl enable initrd-cryptsetup.path
systemctl enable initrd-tinysshd
systemctl enable initrd-debug-progs
systemctl enable initrd-sysroot-mount

# Generate host SSH key pair
ssh-keygen -A

# Download SSH public keys to use ([!] tinysshd only supports ed25519)
curl -s https://github.com/<username>.keys >> /root/.ssh/authorized_keys

# Build initramfs
mkinitcpio -P

# Verify initramfs contents
lsinitcpio -l /boot/initramfs-linux-lts.img
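Once rebooted, the machine can be unlocked remotely. A sketch of the flow, assuming the static IP and SSHD_PORT configured above:

# From another machine
ssh -p 12345 root@10.0.1.2
# Inside the initramfs shell, answer the pending LUKS passphrase prompts
systemd-tty-ask-password-agent --query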

Periodic TRIM

systemctl enable fstrim.timer

Run lsblk --discard to see which devices support TRIM (a device does if both its DISC-GRAN and DISC-MAX values are non-zero).

Solid state drive - ArchWiki

SSH

vim /etc/ssh/sshd_config
# Change port
sed '/#Port /a Port 12345' -i /etc/ssh/sshd_config
# Limit to pubkey auth
sed '/#PasswordAuthentication /a PasswordAuthentication no' -i /etc/ssh/sshd_config
systemctl enable sshd
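# Sanity-check the config before the first boot (prints nothing when valid)
sshd -t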

Bootloader (systemd-boot)

Why not GRUB? Its LUKS2 support is still limited: it does not support cryptsetup's default Argon2id KDF yet (I tested it in a VM and confirmed it doesn't work).

In the end, I ended up liking systemd-boot (formerly Gummiboot) more anyway. It's refreshingly simple and easy to understand; doesn't that sound like Arch Linux?

# Install AMD microcode updates (pick `intel-ucode` for Intel CPU)
pacman -S amd-ucode

# Install systemd-boot on /boot
bootctl install

# Add bootloader config
cat > /boot/loader/loader.conf <<EOD
default arch-lts.conf
timeout 3
console-mode max
editor no
EOD

# Add an entry for `linux-lts` (omit -lts for `linux`)
cat > /boot/loader/entries/arch-lts.conf <<EOD
title Arch Linux (LTS)
initrd /amd-ucode.img
initrd /initramfs-linux-lts.img
linux /vmlinuz-linux-lts
options root=/dev/mapper/crypt
EOD

The options line holds the kernel parameters.
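To verify the result, bootctl can enumerate what it sees:

bootctl status # installed systemd-boot version and ESP state
bootctl list # registered boot entries, including arch-lts.conf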

Network

systemd-networkd

/etc/systemd/network/wired.network
[Match]
# `ip l` to find the right interface
Name=enp5s0

[Network]
Address=10.0.1.2/24
Gateway=10.0.1.1
MulticastDNS=yes
#DHCP=yes
systemctl enable systemd-networkd

systemd-resolved

mkdir /etc/systemd/resolved.conf.d
cat > /etc/systemd/resolved.conf.d/dns.conf <<EOD
[Resolve]
DNS=1.1.1.1 1.0.0.1
DNSOverTLS=yes
EOD
systemctl enable systemd-resolved

sysctl

# Increase max map count for Elasticsearch on Docker
# https://www.elastic.co/guide/en/elasticsearch/reference/current/docker.html#_linux
echo "vm.max_map_count=262144" > /etc/sysctl.d/96-map-count.conf

# Increase inotify limit to avoid "too many open files"
echo "fs.inotify.max_user_watches=1048576" > /etc/sysctl.d/97-inotify.conf

# Auto reboot after 60s of kernel panic
echo "kernel.panic=60" > /etc/sysctl.d/98-kernel-panic.conf

# Tweak swappiness value for memory-rich servers
# https://linuxhint.com/understanding_vm_swappiness/
echo "vm.swappiness=10" > /etc/sysctl.d/99-swappiness.conf

faillock

Change deny from 3 to 5:

sed '/^# deny/a deny = 5' -i /etc/security/faillock.conf

NVIDIA driver

# 'nvidia' for 'linux'
pacman -S nvidia-lts

Create operator user

# Install ZSH and sudo
pacman -S zsh sudo

# Add operator user (op) with wheel membership
useradd -m -s /bin/zsh -G wheel op

# Change operator user password
passwd op

# Populate SSH public keys
mkdir /home/op/.ssh
curl -s https://github.com/<username>.keys >> /home/op/.ssh/authorized_keys
chown -R op:op /home/op/.ssh

# [!] Don't put SSH key pairs on the server. Use SSH agent forwarding instead.

# Grant wheel group sudo priv
(umask 0337; echo "%wheel ALL=(ALL) ALL" > /etc/sudoers.d/wheel)

visudo -c # Verify sudoers
userdbctl # Verify users
userdbctl group # Verify groups

Time and locales

# Set time zone
ln -sf /usr/share/zoneinfo/Asia/Tokyo /etc/localtime

# Enable NTP
systemctl enable systemd-timesyncd

# Sync system time to hardware clock
hwclock --systohc
sed '/#en_US.UTF-8 UTF-8/s/^#//' -i /etc/locale.gen
locale-gen
echo "LANG=en_US.UTF-8" >> /etc/locale.conf

Leave chroot and reboot

exit # leave chroot

# Symlink stub resolver config (must be done outside the chroot)
ln -rsf /run/systemd/resolve/stub-resolv.conf /mnt/etc/resolv.conf

umount -R /mnt # unmount /mnt recursively
reboot

[!] From now on, run all commands as the operator user (use sudo if necessary)

Set hostname

hostnamectl set-hostname tako
hostnamectl set-chassis server
echo "127.0.0.1 tako" >> /etc/hosts

Check-ups

# Check network status
networkctl status
resolvectl status
resolvectl query uechi.io
resolvectl query -p mdns tako.local

# Verify time and NTP status
timedatectl status

# Verify sysctl values
sysctl --system

If networkctl keeps showing enp5s0 as degraded, run ip addr add 10.0.1.2/24 dev enp5s0 to manually assign the static IP address as a workaround.

S.M.A.R.T.

pacman -S smartmontools

# Needed for sending email
pacman -S s-nail

Automated disk health check-ups and reporting

/etc/smartd.conf
# Scan all but removable devices and notify any test failures
# Also, start a short self-test every day around 1-2am, and a long self-test every Saturday around 3-4am
DEVICESCAN -a -o on -S on -n standby,q -s (S/../.././02|L/../../6/03) -m [email protected]

Tip: Add `-M test` immediately after DEVICESCAN to send a test mail.

systemctl enable --now smartd

Manual testing

smartctl -t short /dev/sda
smartctl -l selftest /dev/sda

AUR Helper (yay)

pacman -S base-devel git
git clone https://aur.archlinux.org/yay.git
cd yay
makepkg -si

Docker

pacman -S docker docker-compose
yay -S nvidia-container-runtime
/etc/docker/daemon.json
{ "log-driver": "json-file", // default: "json-file" "log-opts": { "max-size": "10m", // default: -1 (unlimited) "max-file": "3" // default: 1 }, "runtimes": { // for Docker Compose "nvidia": { "path": "/usr/bin/nvidia-container-runtime", "runtimeArgs": [] } } }
systemctl enable --now docker

# Allow operator user to run docker command without sudo (less secure)
#   Re-login for the changes to take effect
usermod -aG docker op

# Enable Swarm
docker swarm init --advertise-addr $(curl -s https://ip.seeip.org)
# Create overlay network for Swarm stack
# docker network create --attachable -d overlay --subnet 10.11.0.0/24 <network>

# Verify installation
docker run --rm --gpus all nvidia/cuda:11.7.1-cudnn8-runtime-ubuntu20.04 nvidia-smi

Cross-platform build support (BuildKit, QEMU)

docker run --rm --privileged multiarch/qemu-user-static --reset --persistent yes

# Verify
docker run --rm --platform linux/arm64/v8 -t arm64v8/ubuntu uname -m # => aarch64

Tip: Use the journald log driver in Docker Compose

This is particularly useful when you want to feed container logs to fail2ban through journald.

services:
  web:
    logging:
      driver: "journald"
      options:
        tag: "{{.ImageName}}/{{.Name}}/{{.ID}}" # default: "{{.ID}}"

DNS resolver (Pi-hole + unbound)

git clone https://github.com/uetchy/docker-dns /srv/dns
cd /srv/dns
rm -rf .git
cp .env.example .env
vim .env
mkdir -p data/unbound
cp examples/unbound/forward-records.conf data/unbound/
vim data/unbound/forward-records.conf # see below
docker compose up -d

For Quad9, I chose the ECS-enabled resolver because their nearest anycast server from Tokyo is in another country (Singapore), which could confuse CDN server selection and result in higher latency.

If your favorite DNS resolver does have anycast servers near your city, you don't need ECS at all.

If you are in Japan, I would recommend IIJ Public DNS. They offer secure DoT/DoH resolvers (in fact, they don't accept "normal" unencrypted DNS queries at all, so there's no room for the "accidental fallback to an unencrypted query" scenario of Opportunistic TLS configurations).

/etc/systemd/network/dns-shim.netdev
# workaround to route local dns lookups to Docker managed MACVLAN interface
[NetDev]
Name=dns-shim
Kind=macvlan

[MACVLAN]
Mode=bridge
/etc/systemd/network/dns-shim.network
# workaround to route local dns lookups to Docker managed MACVLAN interface
[Match]
Name=dns-shim

[Network]
IPForward=yes

[Address]
Address=10.0.1.103/32
Scope=link

[Route]
Destination=10.0.1.100/30
cat >> /etc/systemd/network/wired.network <<EOD
# workaround to route local dns lookups to Docker managed MACVLAN interface
MACVLAN=dns-shim
EOD

cat > /etc/systemd/resolved.conf.d/dns.conf <<EOD
[Resolve]
DNS=10.0.1.100
EOD

If you want to do the same thing using plain ip commands:

ip link add dns-shim link enp5s0 type macvlan mode bridge # add macvlan shim interface
ip a add 10.0.1.103/32 dev dns-shim # assign the interface an ip address
ip link set dns-shim up # enable the interface
ip route add 10.0.1.100/30 dev dns-shim # route macvlan subnet (.100 - .103) to the interface

DDNS (cfddns)

Dynamic DNS for Cloudflare.

Star the GitHub repository if you like it :)

yay -S cfddns
/etc/cfddns/cfddns.yml
token: <token>
notification:
  # You'll need a local mail transfer agent such as Mailu/Mailcow
  enabled: true
  from: cfddns@localhost
  to: [email protected]
  server: localhost
/etc/cfddns/domains
example.com
dev.example.com
example.org
systemctl enable --now cfddns

Reverse proxy (nginx-proxy)

nginx-proxy serves as an ingress gateway for ports 80 and 443, as well as a TLS terminator.

git clone --recurse-submodules https://github.com/evertramos/nginx-proxy-automation.git /srv/proxy
cd /srv/proxy/bin

./fresh-start.sh --yes -e your_email@domain --skip-docker-image-check

ACME CA (step-ca)

With nginx-proxy, you can generate and auto-rotate certificates for private Docker containers from your own (self-signed) ACME CA.

/srv/ca/docker-compose.yml
version: "3" services: step-ca: image: smallstep/step-ca:0.22.1 restart: unless-stopped ports: - "9000:9000" environment: DOCKER_STEPCA_INIT_NAME: ${DOCKER_STEPCA_INIT_NAME} DOCKER_STEPCA_INIT_DNS_NAMES: ${DOCKER_STEPCA_INIT_DNS_NAMES} volumes: - "./data/step-ca:/home/step" dns: # Split horizon DNS server for private web services (also point <domain> to the server) - 10.0.1.100
/srv/ca/.env
DOCKER_STEPCA_INIT_NAME=MySign Root CA
DOCKER_STEPCA_INIT_DNS_NAMES=localhost,<hostname>,<domain>
pacman -S step-cli

# Start step-ca
docker compose up -d

# Show CA password
docker compose exec step-ca cat secrets/password

# Enable ACME module
docker compose exec step-ca step ca provisioner add acme --type ACME

# Download root cert and CA configuration
CA_FINGERPRINT=$(docker compose exec step-ca step certificate fingerprint certs/root_ca.crt)
step-cli ca bootstrap --ca-url https://localhost:9000 --fingerprint $CA_FINGERPRINT

# Test installation
step-cli certificate inspect $(step-cli path)/certs/root_ca.crt
step-cli certificate inspect https://<domain>:9000

# Install root cert system-wide
step-cli certificate install $(step-cli path)/certs/root_ca.crt

Auth gateway and identity provider (Authelia)

Authelia acts as:

  • OIDC identity provider (single sign-on)
  • Auth gateway for some self-hosted web apps lacking user authentication
/srv/authelia/docker-compose.yml
version: "3.9" secrets: JWT_SECRET: file: ./data/authelia/secrets/JWT_SECRET SESSION_SECRET: file: ./data/authelia/secrets/SESSION_SECRET STORAGE_PASSWORD: file: ./data/authelia/secrets/STORAGE_PASSWORD STORAGE_ENCRYPTION_KEY: file: ./data/authelia/secrets/STORAGE_ENCRYPTION_KEY OIDC_HMAC_SECRET: file: ./data/authelia/secrets/OIDC_HMAC_SECRET PRIVATE_KEY: file: ./data/authelia/keys/private.pem services: server: container_name: authelia image: authelia/authelia:4 restart: unless-stopped networks: - default - webproxy secrets: - JWT_SECRET - SESSION_SECRET - STORAGE_PASSWORD - STORAGE_ENCRYPTION_KEY - OIDC_HMAC_SECRET - PRIVATE_KEY environment: AUTHELIA_JWT_SECRET_FILE: /run/secrets/JWT_SECRET AUTHELIA_SESSION_SECRET_FILE: /run/secrets/SESSION_SECRET AUTHELIA_STORAGE_POSTGRES_PASSWORD_FILE: /run/secrets/STORAGE_PASSWORD AUTHELIA_STORAGE_ENCRYPTION_KEY_FILE: /run/secrets/STORAGE_ENCRYPTION_KEY AUTHELIA_IDENTITY_PROVIDERS_OIDC_HMAC_SECRET: /run/secrets/OIDC_HMAC_SECRET AUTHELIA_IDENTITY_PROVIDERS_OIDC_ISSUER_PRIVATE_KEY_FILE: /run/secrets/PRIVATE_KEY VIRTUAL_PROTO: https VIRTUAL_HOST: ${VIRTUAL_HOST} LETSENCRYPT_HOST: ${VIRTUAL_HOST} volumes: - ./data/authelia/config:/config - ${AUTHELIA_CERTS}:/certs:ro depends_on: - redis - postgres redis: image: redis:7-alpine restart: unless-stopped volumes: - ./data/redis:/data postgres: image: postgres:11-alpine restart: unless-stopped secrets: - STORAGE_PASSWORD environment: POSTGRES_USER: authelia POSTGRES_PASSWORD_FILE: /run/secrets/STORAGE_PASSWORD POSTGRES_DB: authelia volumes: - ./data/postgres:/var/lib/postgresql/data networks: webproxy: external: true
/srv/authelia/.env
VIRTUAL_HOST=auth.example.com # Use nginx-proxy managed TLS cert
AUTHELIA_CERTS=/srv/proxy/data/certs/auth.example.com

Mail server (Mailu)

See Setup a new Mailu server — Mailu, Docker based mail server

Nextcloud

git clone https://github.com/uetchy/docker-nextcloud.git /srv/cloud
cd /srv/cloud
cp .env.example .env
vim .env # fill the blank variables
make # pull, build, start
make applypatches # apply custom patches (run only once, right after an update)

Monitor (Telegraf + InfluxDB + Grafana)

Grafana + InfluxDB (Docker)

git clone https://github.com/uetchy/docker-monitor.git /srv/monitor
cd /srv/monitor
docker compose up -d

Telegraf (Host)

yay -S telegraf
/etc/telegraf/telegraf.conf
# Global tags can be specified here in key="value" format.
[global_tags]

# Configuration for telegraf agent
[agent]
  interval = "15s"
  round_interval = true
  metric_batch_size = 1000
  metric_buffer_limit = 10000
  collection_jitter = "0s"
  flush_interval = "10s"
  flush_jitter = "0s"
  precision = ""
  hostname = "tako"
  omit_hostname = false

# Read InfluxDB-formatted JSON metrics from one or more HTTP endpoints
[[outputs.influxdb]]
  urls = ["http://127.0.0.1:8086"]
  database = "<db>"
  username = "<user>"
  password = "<password>"

# Read metrics about cpu usage
[[inputs.cpu]]
  percpu = true
  totalcpu = true
  collect_cpu_time = false
  report_active = false

# Read metrics about disk usage by mount point
[[inputs.disk]]
  ignore_fs = ["tmpfs", "devtmpfs", "devfs", "iso9660", "overlay", "aufs", "squashfs"]

# Read metrics about disk IO by device
[[inputs.diskio]]

# Get kernel statistics from /proc/stat
[[inputs.kernel]]

# Read metrics about memory usage
[[inputs.mem]]

# Get the number of processes and group them by status
[[inputs.processes]]

# Read metrics about system load & uptime
[[inputs.system]]

# Read metrics about network interface usage
[[inputs.net]]
  interfaces = ["enp5s0"]

# Read metrics about docker containers, requires docker group membership for telegraf user
[[inputs.docker]]
  endpoint = "unix:///var/run/docker.sock"
  perdevice = false
  total = true

[[inputs.fail2ban]]
  interval = "15m"
  use_sudo = true

# Pulls statistics from nvidia GPUs attached to the host
[[inputs.nvidia_smi]]
  timeout = "30s"

[[inputs.http_response]]
  interval = "5m"
  urls = [ "https://example.com" ]

# Monitor sensors, requires lm-sensors package
[[inputs.sensors]]
  interval = "60s"
  remove_numbers = false
/etc/sudoers.d/telegraf
Cmnd_Alias FAIL2BAN = /usr/bin/fail2ban-client status, /usr/bin/fail2ban-client status *
telegraf ALL=(root) NOEXEC: NOPASSWD: FAIL2BAN
Defaults!FAIL2BAN !logfile, !syslog, !pam_session
chmod 440 /etc/sudoers.d/telegraf
chown -R telegraf /etc/telegraf
usermod -aG docker telegraf

# Verify config
telegraf -config /etc/telegraf/telegraf.conf -test

systemctl enable --now telegraf

Brute-force attack mitigation (fail2ban)

pacman -S fail2ban
/etc/fail2ban/jail.local
[DEFAULT]
ignoreip = 127.0.0.1/8 10.0.1.0/24 10.0.10.0/24

[sshd]
enabled = true
port = 12345
bantime = 1h
mode = aggressive

# https://mailu.io/1.9/faq.html?highlight=fail2ban#do-you-support-fail2ban
[mailu]
enabled = true
backend = systemd
filter = mailu
action = docker-action
findtime = 15m
maxretry = 10
bantime = 1w

[gitea]
enabled = true
backend = systemd
filter = gitea
action = docker-action
findtime = 30m
maxretry = 5
bantime = 1w
/etc/fail2ban/filter.d/mailu.conf
[INCLUDES]
before = common.conf

[Definition]
__date = \d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}
__mailu_prefix = ^%(__prefix_line)s%(__date)s \[info\] \d+#\d+: \*\d+ client login failed:
__mailu_suffix = while in http auth state, client: <HOST>,
failregex = %(__mailu_prefix)s "AUTH not supported" %(__mailu_suffix)s
            %(__mailu_prefix)s "Authentication credentials invalid" %(__mailu_suffix)s
journalmatch = CONTAINER_NAME=mail-front-1
/etc/fail2ban/filter.d/gitea.conf
[INCLUDES]
before = common.conf

[Definition]
failregex = ^%(__prefix_line)sDisconnected from invalid user \S+ <HOST> port \d+ \[preauth\]
journalmatch = CONTAINER_NAME=gitea
/etc/fail2ban/action.d/docker-action.conf
[Definition]
actionstart = iptables -N f2b-bad-auth
              iptables -A f2b-bad-auth -j RETURN
              iptables -I DOCKER-USER -p tcp -j f2b-bad-auth
actionstop = iptables -D DOCKER-USER -p tcp -j f2b-bad-auth
             iptables -F f2b-bad-auth
             iptables -X f2b-bad-auth
actioncheck = iptables -n -L DOCKER-USER | grep -q 'f2b-bad-auth[ \t]'
actionban = iptables -I f2b-bad-auth 1 -s <ip> -j DROP
actionunban = iptables -D f2b-bad-auth -s <ip> -j DROP
# Test regex pattern or specific filter against journald logs
fail2ban-regex systemd-journal -m 'CONTAINER_NAME=gitea' ': Disconnected from invalid user .+ <HOST> port \d+ \[preauth\]'
fail2ban-regex systemd-journal -m 'CONTAINER_NAME=gitea' gitea --print-all-matched

# Test config
fail2ban-client --test

systemctl enable --now fail2ban
fail2ban-client status
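# Drill into a specific jail to see banned IPs and failure counters
fail2ban-client status sshd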

Firewall (ufw)

pacman -S ufw
systemctl enable --now ufw
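ufw starts with an empty ruleset; a minimal baseline might look like this (a sketch, assuming the SSH port chosen earlier; add the SSH rule before enabling enforcement):

ufw default deny incoming
ufw default allow outgoing
ufw allow 12345/tcp # SSH
ufw enable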

VPN (WireGuard)

pacman -S wireguard-tools

# gen private key
(umask 0077; wg genkey > server.key)

# gen public key
wg pubkey < server.key > server.pub

# gen preshared key for each client
(umask 0077; wg genpsk > secret1.psk)
(umask 0077; wg genpsk > secret2.psk)
...
/etc/wireguard/wg0.conf
[Interface]
Address = 10.0.10.1/24
ListenPort = 121212
PrivateKey = <content of server.key>
PostUp = iptables -A FORWARD -i %i -j ACCEPT; iptables -t nat -A POSTROUTING -o dns-shim -d 10.0.1.100/32 -j MASQUERADE; iptables -t nat -A POSTROUTING -o enp5s0 ! -d 10.0.1.100/32 -j MASQUERADE
PostDown = iptables -D FORWARD -i %i -j ACCEPT; iptables -t nat -D POSTROUTING -o dns-shim -d 10.0.1.100/32 -j MASQUERADE; iptables -t nat -D POSTROUTING -o enp5s0 ! -d 10.0.1.100/32 -j MASQUERADE

[Peer]
PublicKey = <public key>
PresharedKey = <content of secret1.psk>
AllowedIPs = 10.0.10.2/32

[Peer]
PublicKey = <public key>
PresharedKey = <content of secret2.psk>
AllowedIPs = 10.0.10.3/32
ufw allow 121212/udp # If ufw is running

sysctl -w net.ipv4.ip_forward=1
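# Persist the setting across reboots (same pattern as the sysctl section above)
echo "net.ipv4.ip_forward=1" > /etc/sysctl.d/99-ipforward.conf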

systemctl enable --now wg-quick@wg0

# Show active settings
wg show
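For reference, the matching client-side wg0.conf might look like this (a sketch for the first peer above; keys and endpoint are placeholders):

[Interface]
Address = 10.0.10.2/32
PrivateKey = <content of client.key>
DNS = 10.0.1.100

[Peer]
PublicKey = <content of server.pub>
PresharedKey = <content of secret1.psk>
Endpoint = <server address>:121212
AllowedIPs = 0.0.0.0/0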

Backup (restic)

pacman -S restic
/etc/restic/systemd/restic.service
[Unit]
Description=Daily Backup Service

[Service]
Nice=19
IOSchedulingClass=idle
KillSignal=SIGINT
ExecStart=/etc/restic/cmd/run
/etc/restic/systemd/restic.timer
[Unit]
Description=Daily Backup Timer

[Timer]
OnCalendar=*-*-* 0,6,12,18:0:0
RandomizedDelaySec=15min
Persistent=true

[Install]
WantedBy=timers.target
/etc/restic/cmd/config
export RESTIC_REPOSITORY=/mnt/backups/restic
export RESTIC_PASSWORD_FILE=/etc/restic/key # a file that contains the password
export RESTIC_CACHE_DIR=/var/cache/restic
export RESTIC_PROGRESS_FPS=1
/etc/restic/cmd/run
#!/bin/bash -ue
# https://restic.readthedocs.io/en/latest/040_backup.html#

DIR=$(dirname "$(readlink -f "$0")")
source "$DIR/config"

date

# system
echo "> system"
restic backup --tag system -v \
  --one-file-system \
  --exclude .cache \
  --exclude .vscode-server \
  --exclude TabNine \
  --exclude /swapfile \
  --exclude "/lost+found" \
  --exclude "/var/lib/docker/overlay2/*" \
  / /boot /home /srv

# vault
echo "> vault"
restic backup --tag vault -v \
  --one-file-system \
  --exclude 'appdata_*/preview' \
  --exclude 'appdata_*/dav-photocache' \
  /mnt/vault

echo "! prune"
restic forget --prune --group-by tags \
  --keep-last 4 \
  --keep-within-daily 7d \
  --keep-within-weekly 1m \
  --keep-within-monthly 3m

echo "! check"
restic check
/etc/restic/cmd/show
#!/bin/bash -ue

DIR=$(dirname "$(readlink -f "$0")")
source "$DIR/config"

TAG=${TAG:-system}
ID=$(restic snapshots --tag $TAG --json | jq -r ".[] | [.time, .short_id] | @tsv" | fzy | awk '{print $2}')
TARGET=${1:-$(pwd)}
MODE="ls -l"

if [[ -f $TARGET ]]; then
  TARGET=$(realpath ${TARGET})
  MODE=dump
fi

>&2 echo "Command: restic ${MODE} ${ID} ${TARGET}"
restic $MODE $ID ${TARGET}
/etc/restic/cmd/restore
#!/bin/bash -ue
# https://restic.readthedocs.io/en/latest/050_restore.html

DIR=$(dirname "$(readlink -f "$0")")
source "$DIR/config"

TARGET=${1:?Specify TARGET}
TARGET=$(realpath ${TARGET})
TAG=$(restic snapshots --json | jq -r '[.[].tags[0]]|unique|.[]' | fzy)
ID=$(restic snapshots --tag $TAG --json | jq -r ".[] | [.time, .short_id] | @tsv" | fzy | awk '{print $2}')

>&2 echo "Command: restic restore ${ID} -i ${TARGET} -t /"
read -p "Press enter to continue"
restic restore $ID -i ${TARGET} -t /
(umask 0377; echo -n "<password>" > /etc/restic/key)
chmod 700 /etc/restic/cmd/config
ln -sf /etc/restic/systemd/restic.{service,timer} /etc/systemd/system/
systemctl enable --now restic.timer
systemctl status restic.timer
systemctl status restic
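# Trigger a backup manually and follow its logs
systemctl start restic.service
journalctl -u restic -f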

Miscellaneous stuff

Kubernetes

pacman -S minikube

# see https://github.com/kubernetes/minikube/issues/4172#issuecomment-1267069635
#   for why `--kubernetes-version=v1.23.1` is pinned
minikube start \
  --driver=docker \
  --cpus=max \
  --disable-metrics=true \
  --subnet=10.100.0.0/16 \
  --kubernetes-version=v1.23.1

alias kubectl="minikube kubectl --"

# Allow the control plane to allocate pods to itself
kubectl taint nodes --all node-role.kubernetes.io/control-plane:NoSchedule-

# NGINX Ingress
minikube addons enable ingress
minikube service list

# Verify
docker network inspect minikube
minikube ip # => should be 10.100.0.2
kubectl cluster-info
kubectl get cm -n kube-system kubeadm-config -o json | jq .data.ClusterConfiguration -r | yq
kubectl get nodes
kubectl get po -A

# Hello world
kubectl create deployment web --image=gcr.io/google-samples/hello-app:1.0
kubectl expose deployment web --type=NodePort --port=8080
kubectl get service web
curl $(minikube service web --url)

# Hello world through ingress
kubectl apply -f https://k8s.io/examples/service/networking/example-ingress.yaml
kubectl get ingress
curl -H "Host: hello-world.info" http://$(minikube ip)

Install useful tools

# Tip: to find the package that provides a specific command, say `pygmentize`:
pacman -Fy pygmentize # => python-pygments

yay -S --needed htop mosh tmux direnv ncdu fx jq yq fd ripgrep exa bat fzy peco fastmod rsync \
  antibody-bin hub lazygit git-lfs git-delta difftastic ghq-bin ghq-gst iperf gptfdisk lsof lshw lostfiles \
  ffmpeg yt-dlp prettier age gum pyenv neofetch pqrs tea

Make SSH forwarding work with tmux + sudo

/home/op/.ssh/rc
if [ ! -S ~/.ssh/ssh_auth_sock ] && [ -S "$SSH_AUTH_SOCK" ]; then
  ln -sf $SSH_AUTH_SOCK ~/.ssh/ssh_auth_sock
fi
/home/op/.tmux.conf
set -g update-environment -r
setenv -g SSH_AUTH_SOCK $HOME/.ssh/ssh_auth_sock
(umask 0337; echo "Defaults env_keep += SSH_AUTH_SOCK" > /etc/sudoers.d/ssh)

See also: Happy ssh agent forwarding for tmux/screen · Reboot and Shine

Temperature sensors

pacman -S lm_sensors
sensors-detect
systemctl enable --now lm_sensors
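# Print current readings to verify detection
sensors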

# Now you can configure htop to show the CPU temps
htop

Telegram notifier

/usr/local/bin/telegram-notifier
#!/bin/bash

BOT_TOKEN=<your bot token>
CHAT_ID=<your chat id>

PAYLOAD=$(ruby -r json -e "print ({text: ARGF.to_a.join, chat_id: $CHAT_ID}).to_json" </dev/stdin)
OK=$(curl -s -X "POST" \
  -H "Content-Type: application/json; charset=utf-8" \
  -d "$PAYLOAD" \
  https://api.telegram.org/bot${BOT_TOKEN}/sendMessage | jq .ok)

if [[ $OK == true ]]; then
  exit 0
else
  exit 1
fi
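Make it executable, then pipe any text to it (assuming the bot token and chat ID are filled in):

chmod +x /usr/local/bin/telegram-notifier
echo "Hello from tako" | telegram-notifier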

Audio

pacman -S alsa-utils # may require rebooting system

# Grant op user audio priv
usermod -aG audio op

# List devices as root
aplay -l
arecord -L
cat /proc/asound/cards

# Test speaker
speaker-test -c2

# Test mic
arecord -vv -Dhw:2,0 -fS32_LE mic.wav
aplay mic.wav

# GUI mixer
alsamixer

# For Mycroft.ai
pacman -S pulseaudio pulsemixer
pulseaudio --start
pacmd list-cards
/etc/pulse/default.pa
# INPUT/RECORD
load-module module-alsa-source device="default" tsched=1

# OUTPUT/PLAYBACK
load-module module-alsa-sink device="default" tsched=1

# Accept clients -- very important
load-module module-native-protocol-unix
load-module module-native-protocol-tcp
/etc/asound.conf
pcm.mic {
  type hw
  card M96k
  rate 44100
  format S32_LE
}

pcm.speaker {
  type plug
  slave {
    pcm "hw:1,0"
  }
}

pcm.!default {
  type asym
  capture.pcm "mic"
  playback.pcm "speaker"
}

#defaults.pcm.card 1
#defaults.ctl.card 1

Maintenance

Quick checkups

htop # show task overview
systemctl --failed # show failed units
free -h # show memory usage
lsblk -f # show disk usage
networkctl status # show network status
userdbctl # show users
nvidia-smi # verify nvidia cards
ps aux | grep "defunct" # find zombie processes

Delve into system logs

journalctl -p err -b-1 -r # show error logs from previous boot in reverse order
journalctl -u sshd -f # tail logs from sshd unit
journalctl --no-pager -n 25 -k # show latest 25 logs from the kernel without pager
journalctl --since="6 hours ago" --until "2020-07-10 15:10:00" # show logs within specific time range
journalctl CONTAINER_NAME=service_web_1 # show error from the docker container named 'service_web_1'
journalctl _PID=2434 -e # filter logs based on PID and jump to the end of the logs
journalctl -g 'timed out' # filter logs with a regular expression; an all-lowercase pattern makes matching case-insensitive

Pager shortcuts:
  • g - go to the first line
  • G - go to the last line
  • / - search for the string

Force overriding installation

pacman -S <pkg> --overwrite '*'

Check memory modules

pacman -S lshw dmidecode

lshw -short -C memory # lists installed mems
dmidecode # shows configured clock speed
smartctl -a /dev/sdN

# via USB bridge
smartctl -a -d sat /dev/sdN

Ext4

# e2fsck with badblocks (non-destructive read-write test) and preen enabled
# [!] Unmount the drive before this operation
# [!] Never run this against a raw LUKS partition (use the opened /dev/mapper device instead), as it may lead to data loss
e2fsck -vcckp /dev/sdNn
  • -v: Be verbose
  • -cc: This option causes e2fsck to use badblocks(8) program to do a read-only scan of the device in order to find any bad blocks. If any bad blocks are found, they are added to the bad block inode to prevent them from being allocated to a file or directory. If this option is specified twice, then the bad block scan will be done using a non-destructive read-write test.
  • -k: When combined with the -c option, any existing bad blocks in the bad blocks list are preserved, and any new bad blocks found by running badblocks(8) will be added to the existing bad blocks list.
  • -p: Automatically repair ("preen") the file system. This option will cause e2fsck to automatically fix any file system problems that can be safely fixed without human intervention. If e2fsck discovers a problem which may require the system administrator to take additional corrective action, e2fsck will print a description of the problem and then exit with the value 4 logically or'ed into the exit code. This option is normally used by the system's boot scripts. It may not be specified at the same time as the -n or -y options.

Fix broken file system headers

testdisk /dev/sdN

Troubleshooting

Slow SSH login (D-Bus glitch)

systemctl restart systemd-logind
systemctl restart polkit

Annoying "systemd-homed is not available" messages flooding journald logs

Move pam_unix before pam_systemd_home.

/etc/pam.d/system-auth
#%PAM-1.0

auth       required                    pam_faillock.so      preauth
# Optionally use requisite above if you do not want to prompt for the password
# on locked accounts.
auth       [success=2 default=ignore]  pam_unix.so          try_first_pass nullok
-auth      [success=1 default=ignore]  pam_systemd_home.so
auth       [default=die]               pam_faillock.so      authfail
auth       optional                    pam_permit.so
auth       required                    pam_env.so
auth       required                    pam_faillock.so      authsucc
# If you drop the above call to pam_faillock.so the lock will be done also
# on non-consecutive authentication failures.

account    [success=1 default=ignore]  pam_unix.so
-account   required                    pam_systemd_home.so
account    optional                    pam_permit.so
account    required                    pam_time.so

password   [success=1 default=ignore]  pam_unix.so          try_first_pass nullok shadow
-password  required                    pam_systemd_home.so
password   optional                    pam_permit.so

session    required                    pam_limits.so
session    required                    pam_unix.so
session    optional                    pam_permit.so

Annoying systemd-journald-audit logs

/etc/systemd/journald.conf
Audit=no

Missing /dev/nvidia-{uvm*,modeset}

This usually happens right after updating the Linux kernel.

  • Run docker run --rm --gpus all --device /dev/nvidia0 --device /dev/nvidiactl --device /dev/nvidia-modeset --device /dev/nvidia-uvm --device /dev/nvidia-uvm-tools -it nvidia/cuda:10.2-cudnn7-runtime nvidia-smi once.

sudo says "incorrect password" while the password is correct

faillock --reset

Useful links