dev server tasks

Implement systemd units for the server and the workers. This improves resiliency: e.g. a large phylo job can kill all workers, and systemd will then restart them without human intervention.

A simple systemd file for workers:

[Unit]
Description=GGTX workers
After=network.target postgresql.service
Requires=postgresql.service

[Service]
Type=simple
WorkingDirectory=%h/haskell-gargantext
ExecStart=/nix/var/nix/profiles/default/bin/nix-shell --run "cabal v2-run gargantext -- worker run --name default --settings-path gargantext-settings.toml --count 4 +RTS -N4"
Restart=on-failure
RestartSec=5s

[Install]
WantedBy=default.target

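This should be placed in $HOME/.config/systemd/user/ggtx-workers.service. Note that the user manager does not see system units such as postgresql.service, so if the unit refuses to start, the Requires= line is the first thing to check.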

A similar unit should be defined for the server, running cabal v2-run gargantext -- server start -m Prod.
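As a minimal sketch of such a server unit, the helper below writes one into a given directory. It is modelled on the worker unit above. The unit name ggtx-server.service and the function name write_server_unit are assumptions, and the Nix and checkout paths are simply copied from the worker unit. The Requires= line is left out on purpose, since it should first be checked that a user unit can require the system-level PostgreSQL service.

```shell
# Sketch: write a user unit for the server, modelled on the worker unit.
# The file name ggtx-server.service is an assumption, not an existing convention.
write_server_unit() {
  dir="${1:-$HOME/.config/systemd/user}"
  mkdir -p "$dir"
  # The quoted heredoc delimiter keeps %h and the inner double quotes literal.
  cat > "$dir/ggtx-server.service" <<'EOF'
[Unit]
Description=GGTX server
After=network.target postgresql.service

[Service]
Type=simple
WorkingDirectory=%h/haskell-gargantext
ExecStart=/nix/var/nix/profiles/default/bin/nix-shell --run "cabal v2-run gargantext -- server start -m Prod"
Restart=on-failure
RestartSec=5s

[Install]
WantedBy=default.target
EOF
}
```

Calling write_server_unit with no argument writes to $HOME/.config/systemd/user. After that, systemctl --user daemon-reload followed by systemctl --user enable --now ggtx-server should start it, assuming that unit name.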

To enable these on startup and run them right away: systemctl --user enable --now ggtx-workers (--now also starts the service immediately).
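One caveat for user units: by default the per-user systemd instance starts at login and stops when the last session ends, so enabled units do not come up at boot unless lingering is turned on for the account. A sketch, assuming the services run under the current user (the function name enable_ggtx_at_boot is hypothetical):

```shell
# Sketch: make user units start at boot and survive logout.
enable_ggtx_at_boot() {
  # Lingering keeps the per-user systemd instance running without a login session.
  loginctl enable-linger "$(id -un)"
  # Reload unit files, then enable and start the workers in one step.
  systemctl --user daemon-reload
  systemctl --user enable --now ggtx-workers.service
  # Show the current state without invoking a pager.
  systemctl --user --no-pager status ggtx-workers.service
}
```

The function is only defined here. Run enable_ggtx_at_boot once on the target machine, as the user that owns the units.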

Currently we use Debian bookworm (v12). However, Debian trixie was released recently and is now the recommended version (https://en.wikipedia.org/wiki/Debian_version_history). Security support for bookworm ends on 10 June 2026, so I suggest migrating. I have migrated 2 servers so far and it went smoothly. Just follow this guide:

https://www.debian.org/releases/trixie/release-notes/upgrading.html

In fact, for standard setups you don't need most of these steps.

Another way to spawn multiple workers is to use a template unit (in $HOME/.config/systemd/user/ggtx-worker@.service):

[Unit]
Description=GGTX worker %i
After=network.target postgresql.service
Requires=postgresql.service

[Service]
Type=simple
WorkingDirectory=%h/haskell-gargantext
ExecStart=/nix/var/nix/profiles/default/bin/nix-shell --run "cabal v2-run gargantext -- worker run --name default --settings-path gargantext-settings.toml +RTS -N4"
Restart=on-failure
RestartSec=5s

[Install]
WantedBy=default.target

(it's very similar to the one above, but has no --count).

Then, let systemd manage individual workers:

for i in $(seq 10); do
  systemctl --user enable --now "ggtx-worker@${i}.service"
done

Restart is easy:

systemctl --user restart 'ggtx-worker@*.service'
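To avoid repeating the loop, the instance names can be generated by a small helper and passed to systemctl in one call. This is only a sketch; the helper name worker_units is hypothetical.

```shell
# Sketch: print the unit names for workers 1..N, one per line.
worker_units() {
  for i in $(seq "$1"); do
    printf 'ggtx-worker@%s.service\n' "$i"
  done
}

# Usage (not run here); word splitting of the unquoted substitution is intended:
#   systemctl --user enable --now $(worker_units 10)
#   systemctl --user list-units 'ggtx-worker@*'
```

The list-units call is a quick way to see which instances are currently loaded and whether any of them failed.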

This has a couple of advantages over having one process with --count:

- Individual worker logs are available via journalctl --user -xu ggtx-worker@3.service.
- When a large phylo job kills worker n, the other workers keep working (unless the system kills them as well to reclaim more RAM). In contrast, if a process spawned with --count is killed by a large phylo job, all workers die with it. Their messages stay hidden for some time (depending on their individual timeouts), so other jobs that were running might seem stalled for a while (sometimes around 1h).

Phylo issues could be mitigated with a systemd quota:

[Service]
CPUQuota=20%
MemoryMax=500M

Or you can set quotas on ggtx-worker.slice so that they cover all workers together.
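A sketch of the slice variant: one slice unit carrying the limits, plus a drop-in that assigns every ggtx-worker@ instance to it. The function name, the drop-in file name and the limit values are placeholders, not recommendations. Whether CPUQuota takes effect for user units depends on which cgroup controllers are delegated to the user manager on the host.

```shell
# Sketch: put all worker instances under one slice with shared limits.
# The limit values below are placeholders, not tuned recommendations.
write_worker_slice() {
  dir="${1:-$HOME/.config/systemd/user}"
  mkdir -p "$dir/ggtx-worker@.service.d"

  # Limits set on a slice apply to all units inside it combined.
  cat > "$dir/ggtx-worker.slice" <<'EOF'
[Unit]
Description=Slice for all GGTX workers

[Slice]
CPUQuota=400%
MemoryMax=8G
EOF

  # Drop-in for the template: every ggtx-worker@N instance joins the slice.
  cat > "$dir/ggtx-worker@.service.d/slice.conf" <<'EOF'
[Service]
Slice=ggtx-worker.slice
EOF
}
```

After writing the files, systemctl --user daemon-reload and a restart of the workers are needed for the new slice assignment to apply.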

We should change the default PostgreSQL config to match our CPU count and RAM size. Suggestion:

work_mem = 512MB
shared_buffers = 8GB
max_parallel_workers_per_gather = 4
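For sizing shared_buffers, the PostgreSQL documentation suggests about 25% of system memory as a reasonable starting value on a dedicated database server. The sketch below computes that figure from /proc/meminfo. The helper name shared_buffers_mb is hypothetical, and on a machine that also runs the workers a smaller share may be more appropriate.

```shell
# Sketch: 25% of total RAM, in MB, as a starting point for shared_buffers.
# An explicit total in kB can be passed as $1; otherwise MemTotal is read from /proc/meminfo.
shared_buffers_mb() {
  total_kb="${1:-$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)}"
  echo $(( total_kb / 4 / 1024 ))
}

# Usage (not run here):
#   echo "shared_buffers = $(shared_buffers_mb)MB"
```

For example, a total of 33554432 kB (32 GiB) gives 8192, i.e. 8GB.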

On the dev server, for example, it would also be nice to log slow queries and check them from time to time:

log_min_duration_statement = 500   # example: log statements > 500 ms

After a config change, most of these settings only need a config reload, not a PostgreSQL restart (shared_buffers is the exception: it takes effect only after a restart):

# Adjust the directory
pg_ctl reload -D /var/lib/postgresql/15/main
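To check what a reload actually applied, the pg_settings view can be queried: its context column shows when a parameter may change (postmaster means only at server start), and pending_restart flags values that are waiting for a restart. A sketch, assuming local access as the postgres OS user (the variable and function names are hypothetical):

```shell
# Sketch: list the tuned settings, when they may change, and whether a restart is pending.
PG_SETTINGS_QUERY="SELECT name, setting, unit, context, pending_restart
FROM pg_settings
WHERE name IN ('work_mem', 'shared_buffers',
               'max_parallel_workers_per_gather',
               'log_min_duration_statement');"

check_pg_settings() {
  # Assumes peer authentication for the postgres OS user; -X skips ~/.psqlrc.
  sudo -u postgres psql -X -c "$PG_SETTINGS_QUERY"
}
```

As an alternative to pg_ctl reload, running SELECT pg_reload_conf(); from a superuser psql session triggers the same reload.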

Edited Sep 11, 2025 by Fabien MANIERE