Lifecycle Guard
The lifecycle guard monitors your pods and takes action when they’re idle — preventing overnight bills on forgotten GPUs.
How it works
A lightweight Python script runs on the pod, monitoring:
- SSH sessions — active interactive logins
- GPU utilization — via nvidia-smi
- Filesystem writes — inotify-based change tracking
- Transfer locks — active s5cmd/pip/scp operations
- Busy processes — pip install, model downloads, training runs
- Load average — system load from /proc/loadavg
A local daemon on your machine polls these signals and takes action.
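As a rough illustration of two of the signals above, the sketch below parses the 1-minute load average from /proc/loadavg and per-GPU utilization from nvidia-smi. The function names and the returned dictionary are hypothetical; the guard's actual script may collect these signals differently.

```python
import subprocess
from pathlib import Path


def parse_loadavg(text: str) -> float:
    """Return the 1-minute load average (first field of /proc/loadavg)."""
    return float(text.split()[0])


def parse_gpu_utilization(csv_text: str) -> list[int]:
    """Parse one utilization percentage per line, one line per GPU.

    Expects the output of:
      nvidia-smi --query-gpu=utilization.gpu --format=csv,noheader,nounits
    Non-numeric entries (for example "[N/A]") are skipped.
    """
    values = []
    for line in csv_text.splitlines():
        field = line.strip()
        if field.isdigit():
            values.append(int(field))
    return values


def read_signals() -> dict:
    """Collect both signals; needs Linux and an installed NVIDIA driver."""
    load = parse_loadavg(Path("/proc/loadavg").read_text())
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=utilization.gpu",
         "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    )
    return {"load_1m": load, "gpu_util": parse_gpu_utilization(result.stdout)}
```

Keeping the parsers separate from the subprocess call makes them easy to check against captured output without a GPU present.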
| Mode | Action when idle |
|---|---|
| manual | No automation |
| remind | Print a cost warning (30-min cooldown) |
| auto-stop | Stop the pod (preserves volume, stops billing) |
| auto-down | Push workspace to S3, then terminate |
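One way to picture the table is as a small decision function. This is a minimal sketch, not the guard's actual code: `GuardState`, `decide_action`, and the returned action names are hypothetical, and it assumes the idle timeout is measured in minutes.

```python
from dataclasses import dataclass
from typing import Optional

REMIND_COOLDOWN_MIN = 30  # from the table: reminders have a 30-min cooldown


@dataclass
class GuardState:
    mode: str                  # "manual", "remind", "auto-stop", "auto-down"
    idle_timeout_min: float    # assumption: timeout is expressed in minutes
    last_reminder_min: Optional[float] = None  # time of the last reminder


def decide_action(state: GuardState, idle_min: float, now_min: float) -> str:
    """Return a hypothetical action name for a pod idle for idle_min minutes."""
    if state.mode == "manual" or idle_min < state.idle_timeout_min:
        return "none"
    if state.mode == "remind":
        last = state.last_reminder_min
        if last is not None and now_min - last < REMIND_COOLDOWN_MIN:
            return "none"  # still inside the cooldown window
        state.last_reminder_min = now_min
        return "warn"
    if state.mode == "auto-stop":
        return "stop"
    if state.mode == "auto-down":
        return "push-then-terminate"
    raise ValueError(f"unknown mode: {state.mode!r}")
```

For example, a pod in remind mode that stays idle would get a warning once it crosses the timeout, then nothing further until the cooldown has elapsed.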
Configuration

    # Per-pod
    swm guard set runpod:abc123 --mode auto-down --idle-timeout 30

    # Global defaults
    swm guard defaults --mode auto-down --idle-timeout 60

    # Disable for a pod
    swm guard disable runpod:abc123

Monitoring

    swm guard list   # show all guarded pods with live status
    swm guard run    # run guard loop manually (--once for single pass)

Transfer awareness
The guard won’t trigger during active transfers. It detects running s5cmd, pip install, huggingface-cli download, tar, scp, rsync, and uv pip install processes.
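A process check along these lines could be sketched as follows. The matching rule (command line must start with one of the listed commands) and the names `TRANSFER_COMMANDS`, `is_transfer`, and `transfer_in_progress` are assumptions for illustration; the guard's real detection logic is not documented here and may be more thorough.

```python
import os
import subprocess

# Commands named in the paragraph above, split into argv-style tokens.
TRANSFER_COMMANDS = (
    ("s5cmd",),
    ("pip", "install"),
    ("huggingface-cli", "download"),
    ("tar",),
    ("scp",),
    ("rsync",),
    ("uv", "pip", "install"),
)


def is_transfer(cmdline: str) -> bool:
    """True if the command line starts with one of the listed commands."""
    tokens = cmdline.split()
    if not tokens:
        return False
    # Compare on the executable's basename so /usr/bin/rsync matches "rsync".
    tokens[0] = os.path.basename(tokens[0])
    return any(tuple(tokens[:len(p)]) == p for p in TRANSFER_COMMANDS)


def transfer_in_progress() -> bool:
    """Scan every running process's command line (ps without a header row)."""
    result = subprocess.run(
        ["ps", "-eo", "args="], capture_output=True, text=True, check=True
    )
    return any(is_transfer(line) for line in result.stdout.splitlines())
```

Because this sketch only matches on the start of the command line, an invocation such as `python -m pip install` would not be detected by it.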