m3-infra

Infrastructure-as-code for M3 (MidWit Money Management) — a sole proprietorship focused on proprietary algorithmic trading.

What this manages

  • Incus resources (terraform/*.tf) — storage pools, data volumes, containers, profiles, pinned images
  • Tailscale ACL policy (terraform/acl.hujson) — network access controls, tag ownership, SSH policy
  • NixOS container configs (containers/*/configuration.nix) — declarative system configuration
  • Grafana Cloud dashboards (terraform/grafana.tf + terraform/dashboards/) — observability dashboards
  • CI/CD pipeline (.forgejo/workflows/ci.yml) — validate ACL, plan, apply, deploy

Infrastructure

Hardware

m3-incus-os: IncusOS on Intel i7-14700KF, 192GB DDR5, RTX 5070, 6 ZFS storage pools (TPM-encrypted).

IP Scheme

10.121.25.1      host/gateway
10.121.25.11     exit node (tailnet egress)
10.121.25.2x     crypto nodes
10.121.25.3x     development
10.121.25.4x     trade (production)
10.121.25.5x     observability
10.121.25.6x     AI agents
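
The scheme above maps the last octet of an address to a role. A minimal shell sketch of that lookup follows; the function name role_for_ip and the "unassigned" fallback are illustrative assumptions, not something defined in this repository.

```shell
# Map an address in 10.121.25.0/24 to its role, following the IP scheme above.
# role_for_ip is a hypothetical helper name used only for illustration.
role_for_ip() {
  ip=$1
  case "$ip" in
    10.121.25.*) ;;                          # inside the managed subnet
    *) echo "outside subnet"; return 1 ;;
  esac
  octet=${ip##*.}                            # keep only the last octet
  case "$octet" in
    1)      echo "host/gateway" ;;
    11)     echo "exit node" ;;
    2[0-9]) echo "crypto nodes" ;;
    3[0-9]) echo "development" ;;
    4[0-9]) echo "trade (production)" ;;
    5[0-9]) echo "observability" ;;
    6[0-9]) echo "AI agents" ;;
    *)      echo "unassigned" ;;             # assumption: anything else is unallocated
  esac
}
```

For example, role_for_ip 10.121.25.21 reports the crypto range that eth-node lives in.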

Containers

Container            IP    Purpose
m3-incus-exit-node   .11   Dedicated Tailscale exit node (egress via incusbr0 NAT)
btc-node             .20   Bitcoin Core full archival node
eth-node             .21   Erigon + embedded Caplin CL
dev                  .30   Development environment
trade                .40   Trading engine runtime
monitoring           .50   Prometheus + Grafana + Loki

Networking

  • Tailscale with subnet routing (10.121.25.0/24)
  • Dedicated exit node (m3-incus-exit-node) advertises tailnet egress, forwarding via incusbr0 NAT
  • Tailnet Lock (signing on MacBook + iPhone only)
  • Static IPs via incusbr0 bridge
  • Grafana accessible via Tailscale Serve SSO at https://m3-monitoring.taild30b6f.ts.net

Secrets Management

All CI/CD secrets are stored in Setec (a Tailscale-native secrets manager backed by AWS KMS) and fetched at runtime by Forgejo Actions steps with setec get ci/<name>. The runner reaches Setec over the tailnet: it holds tag:ci, and the ACL grants that tag read access to the ci/* prefix via the tag:ci → tag:m3-keys rule. Fetched values are masked in logs with ::add-mask::. No secrets are stored in the repository or in CI config.
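
A per-step fetch of this kind might look like the sketch below. It assumes the setec CLI is on PATH and already configured to reach the server. The helper name fetch_ci_secret and the CI_SECRET variable are hypothetical. Masking line by line is a precaution for multi-line values such as certificates, on the assumption that the runner masks one line per ::add-mask:: command.

```shell
# Fetch one secret under the ci/ prefix and mask it in the job log.
# fetch_ci_secret and CI_SECRET are illustrative names, not part of this repo.
fetch_ci_secret() {
  name=$1
  # `setec get ci/<name>` is the invocation described above.
  value=$(setec get "ci/$name") || return 1
  # Emit one mask command per non-empty line so multi-line values are covered.
  printf '%s\n' "$value" | while IFS= read -r line; do
    if [ -n "$line" ]; then
      echo "::add-mask::$line"
    fi
  done
  # Hand the value back through a variable rather than stdout.
  CI_SECRET=$value
}
```

A step would call fetch_ci_secret with a secret name and then write "$CI_SECRET" to wherever the next command expects it, so the value itself is never echoed.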

CI/CD

  • Forge: Forgejo (forgejo.midwitmoneymgmt.com)
  • CI: Forgejo Actions — native:host runner on m3-cloud-00 (no Docker; executes directly on the NixOS host)
  • Secrets: Setec ci/* prefix fetched per-step via setec get, masked with ::add-mask::
  • State: Local backend at /var/lib/tofu/m3-infra/terraform.tfstate on m3-cloud-00

Pipeline

Event          Steps
Pull request   checkout → write certs (Setec) → ACL validate → tofu init → tofu plan
Push to main   checkout → write certs (Setec) → tofu init → tofu plan → tofu apply → deploy changed containers

ACL validation on PRs uses OAuth token exchange against the Tailscale API. Apply is automatic on merge to main; container deploys run only for containers whose containers/<name>/ files changed.
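
Selecting which containers to deploy can be sketched as a filter over the changed paths. This is an illustration under assumptions, not the contents of the repository's deploy script: the function name changed_containers is hypothetical, and the list of changed files is assumed to arrive on stdin, for example from git diff --name-only.

```shell
# Read changed file paths on stdin and print each affected container name once.
# Only paths of the form containers/<name>/<file> count as a container change.
changed_containers() {
  awk -F/ '$1 == "containers" && NF >= 3 { print $2 }' | sort -u
}
```

A deploy step could then loop over the output, for instance git diff --name-only between the previous and current commit piped into changed_containers, and run the deploy for each printed name. Changes under terraform/ or at the repository root produce no output, so no container is redeployed.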

Workflow

  1. Create a branch from main
  2. Edit Terraform resources, NixOS configs, or ACL policy
  3. Push and open a pull request — CI runs ACL validation + tofu plan
  4. Review the plan output
  5. Merge to main (squash commit) — CI runs tofu apply automatically
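
The ACL validation in step 3 can be sketched with curl against the Tailscale API's policy validation endpoint. The sketch assumes an access token has already been obtained through the OAuth exchange, and the function name validate_acl is hypothetical. To my understanding the endpoint can answer 200 even when the policy is invalid, reporting the problem in a "message" field, so the body is checked as well as the HTTP status.

```shell
# Validate a policy file against the Tailscale API without applying it.
# $1: OAuth access token (obtained beforehand), $2: path to the policy file.
# validate_acl is an illustrative name, not a function from this repository.
validate_acl() {
  token=$1
  policy=$2
  # "-" refers to the default tailnet of the authenticated token.
  resp=$(curl --silent --show-error --fail \
    --request POST \
    --header "Authorization: Bearer $token" \
    --data-binary "@$policy" \
    "https://api.tailscale.com/api/v2/tailnet/-/acl/validate") || return 1
  # Assumption: a body containing a "message" field signals a failed validation.
  case "$resp" in
    *'"message"'*) echo "$resp" >&2; return 1 ;;
  esac
}
```

Called as validate_acl "$TOKEN" terraform/acl.hujson, where TOKEN is a placeholder variable, the function returns non-zero on a transport error or a reported policy problem, which is enough to fail the pull request check.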

Key Design Decisions

  • Per-container .tf files — self-describing, clean diffs, easy to reason about
  • Terraform subdirectory — terraform/ keeps IaC separate from NixOS configs and CI
  • Shared profile — nixos-container profile handles common config; containers declare only unique devices
  • Pinned image — incus_image resource ensures all containers use the same NixOS base; upgrades are intentional
  • Data separation — container root on local pool, persistent data on dedicated pools; rebuilds preserve data
  • Custom systemd services over NixOS modules — LoadCredential is broken in containers
  • Caplin over Lighthouse — Erigon's embedded CL eliminates JWT/Engine API complexity
  • Tailscale Serve SSO — Grafana authenticated via Tailscale identity headers, no passwords
  • Forgejo Actions, native runner — trusted code on a NixOS host; the runner executes directly (native:host), no Docker, NixOS provides reproducibility
  • Inline Setec fetch with log masking — secrets pulled per-step with setec get and masked via ::add-mask::; no standing secret-bridge service to maintain
  • gitops-pusher pattern — ACL tested via Tailscale API on PR, applied via tailscale_acl resource on merge
  • Via negativa — remove complexity rather than add
  • Stability hierarchy — stability → determinism → capacity → speed