The api deploy was written to run locally on the black CI runner; from plum it
broke two ways:
- run_remote_cmd passed the command unquoted through ssh, so the remote shell
re-split it: `bash -c "mkdir -p X"` arrived as `bash -c mkdir` (-p/X became
positional args) and mkdir errored "missing operand". %q-quote the command so
it survives the remote re-parse as one -c argument.
- the health check curled 127.0.0.1:3030 on the DEPLOYING host, which is empty on
a remote deploy. Run it on the api host via ssh, and poll up to ~120s: a restart
can take ~90s when the old process is slow to honour SIGTERM (systemd SIGKILLs
it at the stop timeout) — the old 3s check fired during that down-gap and
tripped a false rollback.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>