The restore procedure from backup for the nethsecurity-controller app cannot be completed.
The process hangs indefinitely during the database import step, restore-module/30restore_database.
The restore action keeps polling for the timescale container to exist and be running, but at this stage no container is available.
Relevant logs:
May 06 10:15:26 rl1 agent@nethsecurity-controller1[41205]: task/module/nethsecurity-controller1/8b6433e2-5e9f-4822-b196-f879e6ea93bb: restore-module/30restore_database is starting
May 06 10:15:26 rl1 agent@nethsecurity-controller1[41205]: Error: no container with name or ID "timescale" found: no such container
May 06 10:15:31 rl1 agent@nethsecurity-controller1[41205]: Error: no container with name or ID "timescale" found: no such container
May 06 10:15:36 rl1 agent@nethsecurity-controller1[41205]: Error: no container with name or ID "timescale" found: no such container
May 06 10:15:41 rl1 agent@nethsecurity-controller1[41205]: Error: no container with name or ID "timescale" found: no such container
May 06 10:15:46 rl1 agent@nethsecurity-controller1[41205]: Error: no container with name or ID "timescale" found: no such container
At this point, the timescale container has not been started, and no container exists for the module:
~]# runagent -m nethsecurity-controller1 podman ps -a
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
The restore action remains indefinitely blocked:
~]# runagent -m nethsecurity-controller1 ps xf
PID TTY STAT TIME COMMAND
...
41205 ? Ssl 0:00 \_ /usr/local/bin/agent --agentid=module/nethsecurity-controller1 --actionsdir=/usr/local/agent/actions --actionsdir=/home/ne
42839 ? S 0:00 | \_ /bin/sh /home/nethsecurity-controller1/.config/actions/restore-module/30restore_database
43278 ? S 0:00 | \_ sleep 5
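The `sleep 5` child above suggests the action polls for the container in an unbounded loop. As a minimal sketch of how such a wait could fail after a deadline instead of hanging, assuming a hypothetical helper named `wait_for` (not taken from the actual 30restore_database script):

```shell
#!/bin/sh
# Hypothetical helper, not part of the real action script:
# poll a check command until it succeeds or a timeout is reached.
# Usage: wait_for TIMEOUT_SECONDS INTERVAL_SECONDS COMMAND [ARGS...]
wait_for() {
    timeout_s=$1
    interval_s=$2
    shift 2
    elapsed=0
    # Run the check command; keep retrying while it fails.
    until "$@" >/dev/null 2>&1; do
        if [ "$elapsed" -ge "$timeout_s" ]; then
            # Deadline reached: report failure instead of looping forever.
            return 1
        fi
        sleep "$interval_s"
        elapsed=$((elapsed + interval_s))
    done
    return 0
}

# Assumed usage inside a restore step (the 60-second limit is illustrative):
# wait_for 60 5 podman container exists timescale || {
#     echo "timescale container did not appear" >&2
#     exit 1
# }
```

With a bounded wait like this, the restore task would end with an error that is visible in the task log rather than staying blocked. It does not address the underlying problem that nothing starts the timescale container before the import step runs.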
A temporary manual workaround is possible, but it requires several manual steps:
- Kill the blocked `30restore_database` action
- Temporarily modify the `timescale` systemd unit so that it does not start its dependencies, then start it manually
- Re-run the `30restore_database` action manually
- Stop `timescale` and restore the default `timescale` systemd unit
- Run `configure-module` to put the restored controller into production
This workaround allows the restore to complete, but it should not be required during a normal restore procedure.
Components
nethsecurity-controller:2.2.3