{"id":1802,"title":"Review: board-landscaping-observability#1313 -- Pyrra + Falco scope review","slug":"review-1313-2026-06-04","html_content":"<h2 id=\"verdict-block\">Verdict: BLOCK</h2>\n<p><strong>Board item:</strong> board-landscaping-observability#1313 (backlog, 5 points)<br/>\n<strong>Forgejo issue:</strong> ldraney/landscaping-assistant #90 -- \"Deploy Pyrra for SLO tracking and Falco for runtime security\"<br/>\n<strong>Reviewer:</strong> QA Agent<br/>\n<strong>Date:</strong> 2026-06-04</p>\n<h3 id=\"template-completeness\">Template Completeness</h3>\n<p>Checked against <code>template-issue-feature</code>:</p>\n<ul>\n<li>[x] Type -- Feature</li>\n<li>[x] Lineage -- Child of #43, Phase 6 of observability roadmap</li>\n<li>[ ] Repo -- States <code>forgejo_admin/pal-e-platform</code> but issue is filed in <code>ldraney/landscaping-assistant</code>. Mismatch.</li>\n<li>[ ] User Story -- Combines two unrelated capabilities in one \"I want\" clause. \"SLO tracking with error budgets\" and \"runtime container security monitoring\" are distinct user needs for distinct personas (SRE vs Security).</li>\n<li>[x] Context -- Present. Acknowledges \"Two independent capabilities bundled\" but does not justify why.</li>\n<li>[ ] File Targets -- Three targets listed, all in pal-e-platform. SLO YAML location is ambiguous: \"pal-e-platform or pal-e-deployments\" is a question, not a spec.</li>\n<li>[x] Acceptance Criteria -- 5 criteria covering both components.</li>\n<li>[x] Test Expectations -- 3 items, all kubectl-based manual verification. No automated tests.</li>\n<li>[ ] Constraints -- \"SLO targets need to be decided per-service before deployment\" is an unresolved prerequisite, not a constraint. 
This is a blocker disguised as a constraint.</li>\n<li>[x] Checklist -- Present.</li>\n<li>[x] Related -- References project-pal-e-platform and parent #43.</li>\n</ul>\n<h3 id=\"traceability\">Traceability</h3>\n<ul>\n<li>[x] story:observability -- present on board item labels</li>\n<li>[x] arch:platform -- present on board item labels</li>\n<li>[x] Forgejo issue -- #90 exists and is open</li>\n<li>[x] parent:43 -- present on board item labels, matches Lineage</li>\n</ul>\n<p>The traceability triangle is intact for the board item. However, traceability to the platform plan is broken (see Decomposition Assessment below).</p>\n<h3 id=\"file-targets\">File Targets</h3>\n<ul>\n<li>[ ] <code>terraform/pyrra.tf</code> (pal-e-platform) -- Cannot verify from landscaping-assistant repo. File would be new. Parent directory <code>terraform/</code> exists in pal-e-platform per project page.</li>\n<li>[ ] <code>terraform/falco.tf</code> (pal-e-platform) -- Same. New file in pal-e-platform.</li>\n<li>[ ] SLO YAML definitions -- Location undecided (\"pal-e-platform or pal-e-deployments\"). Agent cannot act on this without a decision.</li>\n</ul>\n<p>Assessment: Targets are in a different repo than where the issue is filed. An agent dispatched against landscaping-assistant would not find these files.</p>\n<h3 id=\"repo-placement\">Repo Placement</h3>\n<p><strong>PROBLEM:</strong> The issue is filed in <code>ldraney/landscaping-assistant</code> but all work targets are in <code>forgejo_admin/pal-e-platform</code>. This is a cross-repo mismatch. The issue body even says <code>Repo: forgejo_admin/pal-e-platform</code>. 
Either:</p>\n<ul>\n<li>Move the issue to pal-e-platform (where the Terraform files live), or</li>\n<li>Make #90 a tracking issue in landscaping-assistant with child issues in pal-e-platform (but this adds unnecessary indirection for what is fundamentally platform work)</li>\n</ul>\n<h3 id=\"dependencies\">Dependencies</h3>\n<p>Per <code>plan-pal-e-platform</code>, the two components have distinct dependency chains:</p>\n<ul>\n<li>[ ] <strong>SLO engine (Phase 16)</strong> depends on Phase 14 (synthetic monitoring) and Phase 15 (DORA re-baseline) -- status of both is unclear from available data</li>\n<li>[ ] <strong>Falco (Phase 20b)</strong> depends on Phase 10 (vulnerability scanning -- COMPLETED) and Phase 19 (Kyverno -- NOT STARTED)</li>\n</ul>\n<p>The plan explicitly states \"Tier 1.5 gates Tier 2.\" Falco is Tier 2. Phase 19 (Kyverno) is a prerequisite for Phase 20 and is NOT STARTED. The issue does not acknowledge or justify skipping this dependency.</p>\n<h3 id=\"acceptance-criteria\">Acceptance Criteria</h3>\n<p>Evaluating each AC:</p>\n<ol>\n<li>\"Pyrra deployed and generating PrometheusRules from SLO definitions\" -- Testable via <code>kubectl get prometheusrules</code>. Specific enough.</li>\n<li>\"Grafana dashboard shows error budget remaining and burn rate\" -- Testable but requires manual verification. No dashboard JSON path specified.</li>\n<li>\"At least one SLO defined (e.g., landscaping-assistant 99.5% availability)\" -- Testable. But the constraint says \"SLO targets need to be decided per-service before deployment\" -- so this AC depends on an unresolved prerequisite.</li>\n<li>\"Falco DaemonSet running and detecting test anomaly\" -- Testable via kubectl exec test. Specific enough.</li>\n<li>\"Falco alerts route through Alertmanager to Telegram\" -- Testable. 
Requires Alertmanager config changes not listed in file targets.</li>\n</ol>\n<p>AC quality is adequate individually, but conflated across two independent capabilities.</p>\n<h3 id=\"blast-radius\">Blast Radius</h3>\n<ul>\n<li><strong>Repos touched:</strong> 1-2 (pal-e-platform, possibly pal-e-deployments for SLO YAMLs)</li>\n<li><strong>Services affected:</strong> All (Falco DaemonSet runs on every node; SLO definitions reference all monitored services)</li>\n<li><strong>Risk:</strong> Falco DaemonSet is medium-risk -- it monitors syscalls on every node. A misconfigured Falco can generate alert storms or consume node resources. SLO engine is low-risk -- namespace-scoped deployment.</li>\n<li><strong>Rollback:</strong> Both are independent Helm releases, so rollback is straightforward per-component. But bundling them in one PR makes selective rollback harder.</li>\n</ul>\n<h3 id=\"decomposition-assessment\">Decomposition Assessment</h3>\n<p><strong>Three-thing limit: VIOLATED.</strong> This ticket has at least 5 discrete changes:</p>\n<ol>\n<li>Deploy Pyrra/Sloth Helm release</li>\n<li>Write SLO YAML definitions</li>\n<li>Create Grafana SLO dashboard</li>\n<li>Deploy Falco Helm release (DaemonSet)</li>\n<li>Configure Falco alerting through Alertmanager</li>\n</ol>\n<p><strong>Five-minute rule: VIOLATED.</strong> An agent would need substantial time to: resolve the Sloth/Pyrra decision, decide SLO YAML location, write SLO definitions for services, create Grafana dashboard JSON, deploy and test two independent Helm releases, and configure Alertmanager routing.</p>\n<p><strong>Critical conflict -- Sloth vs Pyrra:</strong></p>\n<ul>\n<li>The platform plan Phase 16 (<code>phase-pal-e-platform-16-slo-error-budgets</code>) is titled \"SLO Governance (<strong>Sloth</strong>)\" and explicitly scopes deploying Sloth.</li>\n<li>The <code>plan-pal-e-platform</code> summary references \"SLO governance/Sloth (16)\".</li>\n<li>The validation pipeline diagram in 
<code>project-pal-e-platform</code> shows \"Sloth -- SLO YAML to Recording Rules\".</li>\n<li>This issue says \"Pyrra uses Prometheus-native SLOs (not Sloth)\" -- directly contradicting the plan.</li>\n<li>The <code>docs/observability-roadmap.md</code> in landscaping-assistant references Pyrra.</li>\n</ul>\n<p>Two sources of truth disagree. An agent cannot execute this without a human decision.</p>\n<p><strong>Recommendation: Split into two issues aligned with existing plan phases:</strong></p>\n<ul>\n<li><strong>Issue A:</strong> SLO engine -- aligned with Phase 16. Filed in pal-e-platform. Resolve Sloth vs Pyrra first.</li>\n<li><strong>Issue B:</strong> Falco runtime security -- aligned with Phase 20b. Filed in pal-e-platform. Acknowledge Kyverno (Phase 19) dependency.</li>\n</ul>\n<h3 id=\"helm-chart-verification\">Helm Chart Verification</h3>\n<ul>\n<li><strong>Pyrra:</strong> Official Helm chart repo is <code>pyrra-dev/helm-charts</code> (not <code>pyrra-dev/pyrra</code> which is the project repo). Chart name: <code>pyrra</code>. ArtifactHub listing exists.</li>\n<li><strong>Falco:</strong> Official Helm chart repo is <code>falcosecurity/charts</code> (not <code>falcosecurity/falco</code> which is the project repo). Chart name: <code>falco</code>. Current version 9.0.0 confirmed.</li>\n</ul>\n<p>The observability-roadmap.md references project repos, not chart repos. Terraform helm_release blocks will need the correct chart repository URLs.</p>\n<h3 id=\"recommendation\">Recommendation</h3>\n<ol>\n<li><strong>Resolve Sloth vs Pyrra.</strong> Human decision required. Update the losing side's documentation (either Phase 16 + project-pal-e-platform validation diagram, or observability-roadmap.md + this issue).</li>\n<li><strong>Split into two issues</strong> aligned with existing plan phases: Phase 16 (SLO engine) and Phase 20b (Falco). File both in <code>forgejo_admin/pal-e-platform</code>.</li>\n<li><strong>Address dependency chains</strong> in each new issue. 
Phase 16 depends on 14+15. Phase 20b depends on 10+19. Either complete prerequisites or document why they can be skipped.</li>\n<li><strong>Decide SLO YAML location</strong> before the SLO issue moves forward. \"pal-e-platform or pal-e-deployments\" is not actionable.</li>\n<li><strong>Add Alertmanager config to file targets</strong> for the Falco issue -- routing Falco alerts requires Alertmanager config changes not currently listed.</li>\n</ol>\n<h3 id=\"related\">Related</h3>\n<ul>\n<li><code>plan-pal-e-platform</code> -- authoritative platform hardening plan</li>\n<li><code>phase-pal-e-platform-16-slo-error-budgets</code> -- Phase 16: SLO Governance (Sloth)</li>\n<li><code>phase-platform-20-security-deepening</code> -- Phase 20: Security Deepening (Falco is subphase 20b)</li>\n<li><code>template-issue-feature</code> -- issue template validated against</li>\n<li><code>template-review</code> -- review template used for this note</li>\n</ul>","is_public":true,"note_type":"doc","status":null,"parent_slug":null,"position":null,"created_by_sub":null,"created_by_name":null,"updated_by_sub":null,"updated_by_name":null,"project":{"id":62,"name":"Landscaping Observability","slug":"landscaping-observability","platform":"forgejo","repo_url":"https://forgejo.tail5b443a.ts.net/ldraney/landscaping-assistant","is_public":true,"page_note":null,"created_at":"2026-06-04T05:48:18.704474","updated_at":"2026-06-04T11:56:08.092486"},"tags":[{"id":84,"name":"review"},{"id":104,"name":"block"}],"created_at":"2026-06-04T11:54:29.981445","updated_at":"2026-06-04T11:54:29.981445"}