Week 20 · 2026
May 4 – 10
Headlines of the Week
Tradebot shared-IP pilot validated at N=2 · Inventory reservation moved to per-doc timestamps · 3 foundation deep-dives shipped
11 work items shipped · 42 PRs merged org-wide · 4 systems documented
Architecture & Performance
reserved-until timestamp directly, and the query just asks "exclude reserved-until > now". Constant cost regardless of reservation count.
useOsReservation flag · instant SQL rollback · PRs: backend #1300, parser #147
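A minimal sketch of what the per-doc timestamp check could look like, assuming an OpenSearch-style bool query. The field name `reservedUntil` and both helper functions are hypothetical, not taken from the backend or parser PRs.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical field name; the real document field may differ.
RESERVED_UNTIL_FIELD = "reservedUntil"


def build_available_items_query(now=None):
    """Build a bool query that excludes docs whose reservation is still active."""
    now = now or datetime.now(timezone.utc)
    return {
        "query": {
            "bool": {
                # "exclude reserved-until > now": one range clause,
                # independent of how many items are currently reserved.
                "must_not": [
                    {"range": {RESERVED_UNTIL_FIELD: {"gt": now.isoformat()}}}
                ]
            }
        }
    }


def is_available(doc, now):
    """Local equivalent of the filter, handy for unit tests."""
    reserved_until = doc.get(RESERVED_UNTIL_FIELD)
    return reserved_until is None or reserved_until <= now


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    docs = [
        {"id": "a"},                                                    # never reserved
        {"id": "b", RESERVED_UNTIL_FIELD: now + timedelta(minutes=5)},  # still reserved
        {"id": "c", RESERVED_UNTIL_FIELD: now - timedelta(minutes=5)},  # reservation expired
    ]
    print([d["id"] for d in docs if is_available(d, now)])
```

Because the reservation lives on each document, an expired reservation needs no cleanup step: once the timestamp passes, the item simply stops matching the exclusion.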
analytics-mcp.tradeit.gg behind Cloudflare Tunnel + Google SSO, read-only at 3 layers, ~$8/mo. Remaining gap: Claude Desktop OAuth flow incompatible with CF Access SSO.
Tools & Automation
Documentation
finishedtrades has no subscribers) · foundation for auto-QA
must_not filter, marketplace fanout naming, and all 4 OS-consuming services. Every claim cited with file:line. Published as 32.7K-char wiki page + ops.tradeit.gg HTML mirror with 3 Mermaid diagrams.
Infra & Cost
Monitoring & Observability
pricing-replicator but the actual container is replicator. Nothing crashed loudly; just silent FailedInvocations in CloudWatch. 5 analytics tables (reserved_items, trade_revert_reserved_items, banned_users, user_favorite_items, guess_questions) had been stuck at their setup-time snapshot.
FailedInvocations alarms on both schedule rules (15-min + 4h) wired to existing Slack pipeline · next silent rejection pages in ~1 min, not 36h
Lesson: any new EventBridge → ECS rule needs real-task verification, not just config-on-paper.
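One way to catch this class of mismatch before deploy is a small config check that compares override names against the task definition. The sketch below is illustrative only: it assumes the rule's override input and the task definition are available as plain dicts in the ECS `containerOverrides` / `containerDefinitions` shape, and the function name is hypothetical. It complements real-task verification rather than replacing it.

```python
def unknown_override_names(task_definition, overrides):
    """Return override container names that the task definition does not define."""
    defined = {
        container["name"]
        for container in task_definition.get("containerDefinitions", [])
    }
    return [
        override.get("name")
        for override in overrides.get("containerOverrides", [])
        if override.get("name") not in defined
    ]


if __name__ == "__main__":
    # The container names come from the incident; the command is a placeholder.
    task_definition = {"containerDefinitions": [{"name": "replicator"}]}
    overrides = {
        "containerOverrides": [
            {"name": "pricing-replicator", "command": ["run"]}
        ]
    }
    bad = unknown_override_names(task_definition, overrides)
    if bad:
        print(f"Override targets unknown container(s): {bad}")
```

Run against the incident's names, the check flags `pricing-replicator` because only `replicator` is defined.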
Team Activity
May 11 – 17 · live from Swarmia · DORA metrics are org-wide
CFR elevated this week — worth a look at what tripped the 4 failed deploys.
Observability
Live from Datadog · what we can see right now
Host/service counts pending live Datadog MCP fetch — auth flow in progress.
Next Up
High Priority
useOsReservation in Production
Medium Priority
analytics-mcp.tradeit.gg. OPS-227.
Number of the Week
1 character
The difference between pricing-replicator and replicator
One typo in an EventBridge override silently failed our analytics replicator
for 36 hours. AWS rejected every fire before the task could even start
— no crash, no alert, just FailedInvocations ticking up in CloudWatch.
We caught it, fixed it, and wired alarms so the next silent rejection pages on-call in ~1 minute.
That's the week. Onward.