Agent Learning Daily Digest #22 — 2026-05-21

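⚠️ Automated collection: GitHub succeeded (82 items), HN Show RSS failed (502), and every arXiv request failed (429/timeout). The gap was filled by querying the HN Algolia API manually with 5 keyword groups (agent+LLM, Claude+Code, coding+agent, MCP+server, context+engineering), then bulk-verifying 15 URLs in a browser via delegate_task; all were confirmed valid.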

Today's High-Signal Items

1. Forge: guardrails sharply improve 8B-model accuracy on agentic tasks (HN 643 pts)

Source: https://github.com/antoinezambelli/forge
⭐ ~1,200 (verified), Python, 57 forks
Key points: a reliability layer for tool-calling with self-hosted LLMs (rescue parsing, retry nudges, step enforcement, VRAM-aware context budgets, tiered compaction). Three usage modes: WorkflowRunner (full agent loop), Guardrails middleware (composable), and Proxy server (OpenAI-compatible).
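To make "rescue parsing" and "retry nudges" concrete, here is a minimal sketch of the general idea. It is not Forge's API: rescue_parse, call_with_nudges, and the {"tool": ..., "args": ...} shape are hypothetical names chosen for illustration, and the model is assumed to be any callable that maps a prompt string to a reply string.

```python
import json
import re
from typing import Callable, Optional


def rescue_parse(raw: str) -> Optional[dict]:
    """Try strict JSON first, then fall back to the outermost {...} span in the reply."""
    try:
        parsed = json.loads(raw)
        return parsed if isinstance(parsed, dict) else None
    except json.JSONDecodeError:
        pass
    match = re.search(r"\{.*\}", raw, re.DOTALL)
    if match is None:
        return None
    try:
        parsed = json.loads(match.group(0))
    except json.JSONDecodeError:
        return None
    return parsed if isinstance(parsed, dict) else None


def call_with_nudges(model: Callable[[str], str], prompt: str, max_retries: int = 2) -> dict:
    """Ask the model for a tool call; on an unparseable reply, retry with a format reminder."""
    message = prompt
    for _ in range(max_retries + 1):
        call = rescue_parse(model(message))
        if call is not None and "tool" in call:
            return call
        # Retry nudge (hypothetical wording): restate the expected output format.
        message = prompt + '\nReply with only a JSON object like {"tool": ..., "args": ...}.'
    raise ValueError("model never produced a parseable tool call")
```

A small model that wraps its JSON in chatter is rescued on the first pass; one that forgets the format entirely gets a bounded number of nudges before the caller sees an error.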

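2. Semble: code search for agents that uses 98% fewer tokens than grep (HN 441 pts)

Source: https://github.com/MinishLab/semble
⭐ ~3,300 (verified), Python, 122 forks
Key points: a code search engine optimized for AI coding agents. Indexing takes ~250 ms and a query ~1.5 ms, entirely on CPU with no API or GPU required. NDCG@10 reaches 0.854. Runs as an MCP server (Claude Code, Cursor, Codex, OpenCode) or as a CLI.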
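For readers unfamiliar with the NDCG@10 metric quoted above, the following sketch shows how it is commonly computed. It assumes the linear-gain variant and assumes the input list holds the relevance label of every judged result in ranked order; how Semble's benchmark was actually scored is not stated in the source.

```python
import math


def dcg_at_k(relevances, k):
    """Discounted cumulative gain: each result's relevance is divided by log2(rank + 1), rank starting at 1."""
    return sum(rel / math.log2(index + 2) for index, rel in enumerate(relevances[:k]))


def ndcg_at_k(relevances, k=10):
    """Normalize DCG by the DCG of the best possible ordering of the same labels."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0


# Relevant results at ranks 1 and 3, an irrelevant one at rank 2.
score = ndcg_at_k([1, 0, 1])
```

A score of 1.0 means the relevant results are already in the best possible order; anything lower means a relevant result was ranked below a less relevant one.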

3. Lessons from 100,000 lines of AI-assisted Rust: Code Contracts and Spec-Driven Development (HN 125 pts)

Source: https://zfhuang99.github.io/rust/claude%20code/codex/contracts/spec-driven%20development/2025/12/01/rust-with-ai.html
The author summarizes lessons from building a Rust multi-Paxos consensus engine with Claude Code, Codex, Copilot, Augment Code, Kiro, and Trae.
Key points: "Code Contracts: By AI, For AI" uses design-by-contract specifications so that AI agents can verify their own work. Lightweight Spec-Driven Development: write a spec for the AI agent to follow, including interactive Q&A to settle design decisions.
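As an illustration of the design-by-contract idea, here is a small sketch in Python (the original project is in Rust, and its contract mechanism is not shown in this digest). The contract decorator, ContractViolation, and next_promised_ballot are all hypothetical; the invariant "an acceptor never lowers its promised ballot" is a standard Paxos property used only as an example.

```python
from functools import wraps


class ContractViolation(Exception):
    """Raised when a precondition or postcondition does not hold."""


def contract(pre=None, post=None):
    """Attach a precondition on the arguments and a postcondition on the result."""
    def decorate(fn):
        @wraps(fn)
        def wrapper(*args):
            if pre is not None and not pre(*args):
                raise ContractViolation(f"precondition of {fn.__name__} failed for {args}")
            result = fn(*args)
            if post is not None and not post(result, *args):
                raise ContractViolation(f"postcondition of {fn.__name__} failed for {args}")
            return result
        return wrapper
    return decorate


@contract(
    pre=lambda promised, proposed: promised >= 0 and proposed >= 0,
    post=lambda result, promised, proposed: result >= promised,
)
def next_promised_ballot(promised: int, proposed: int) -> int:
    """Return the ballot the acceptor promises after seeing a proposal."""
    return max(promised, proposed)
```

Because the contract fails loudly at the call site, an agent that breaks the invariant gets an immediate, machine-readable signal instead of relying on a human to spot the regression in review.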

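4. Formal Verification Gates > Smarter Agents: structural backpressure beats stronger models (HN 93 pts)

Source: https://reubenbrooks.dev/blog/structural-backpressure-beats-smarter-agents/
Key argument: behavioral gates (prompts/instructions) are unreliable at scale, whereas structural gates (compilers, type systems, linters) produce a deterministic pass/fail. Specs are written in Shen (a statically typed Lisp) and lowered to Go/TypeScript through a code generator.
Core insight: existing models can already write most of the code. The bottleneck is confirming that the code is correct, and that confirmation comes from the substrate rather than from a stronger model.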
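The distinction between behavioral and structural gates can be sketched with a toy example. The article uses Shen specs lowered to Go/TypeScript; the snippet below is only a stand-in that uses Python's own parser as the "substrate", and structural_gate and its required_functions parameter are hypothetical.

```python
import ast


def structural_gate(source: str, required_functions: set) -> tuple:
    """Deterministic pass/fail: the code must parse and must define every required function."""
    try:
        tree = ast.parse(source)
    except SyntaxError as exc:
        return False, f"syntax error on line {exc.lineno}"
    defined = {node.name for node in ast.walk(tree) if isinstance(node, ast.FunctionDef)}
    missing = required_functions - defined
    if missing:
        return False, f"missing functions: {sorted(missing)}"
    return True, "ok"


# The same input always yields the same verdict, regardless of how the code was prompted for.
verdict = structural_gate("def propose(ballot):\n    return ballot\n", {"propose", "accept"})
```

The gate never asks the model whether it followed the instructions; it checks the artifact, so the verdict cannot drift with prompt wording or model mood.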

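5. InsForge: an open-source Heroku for coding agents (HN 59 pts, ⭐10.4k)

Source: https://github.com/InsForge/InsForge
⭐ ~10,400 (verified), 872 forks, 3,986 commits
Key points: gives coding agents a database (Postgres), authentication, storage (S3-compatible), Edge Functions, compute, site deployment, and an AI model gateway. Exposed through an MCP Server or a CLI + Skills interface, so a coding agent can ship a full-stack application directly.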

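6. GSD-2 keeps growing: a complete Context Engineering + Spec-Driven path (⭐~7.7k)

Source: https://github.com/gsd-build/gsd-2
⭐ ~7,700 (verified), 780 forks, 6,600 commits
Key points: evolved from a viral prompt into a standalone CLI (Pi SDK). It clears context between tasks, injects precise context, and provides programmatic session management, git branch management, cost/token tracking, stuck-loop detection, and crash recovery.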
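Stuck-loop detection can be approximated in a few lines. The sketch below is not GSD-2's implementation, which is not described in this digest; StuckLoopDetector, its window size, and its repeat threshold are assumptions chosen for illustration. It flags an agent that issues the same tool call with the same arguments several times within a short window.

```python
import hashlib
import json
from collections import deque


class StuckLoopDetector:
    """Flag an agent that keeps repeating an identical tool call."""

    def __init__(self, window: int = 6, max_repeats: int = 3):
        self.recent = deque(maxlen=window)  # fingerprints of the most recent calls
        self.max_repeats = max_repeats

    def record(self, tool: str, args: dict) -> bool:
        """Record one call; return True once it has appeared max_repeats times in the window."""
        payload = json.dumps([tool, args], sort_keys=True).encode("utf-8")
        fingerprint = hashlib.sha256(payload).hexdigest()
        self.recent.append(fingerprint)
        return self.recent.count(fingerprint) >= self.max_repeats


detector = StuckLoopDetector()
flags = [detector.record("run_tests", {"path": "tests/"}) for _ in range(3)]
```

Hashing the sorted JSON form means argument order does not matter, and the bounded window keeps an occasional legitimate repeat from tripping the detector.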

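7. antirez's alternative to the EDIT tool: tag-based editing that saves tokens (HN 6 pts)

Source: https://antirez.com/news/166
Author: antirez (Salvatore Sanfilippo, creator of Redis)
Key points: current EDIT tools require the LLM to output the old text verbatim, which is token-expensive. Proposal: READ/SEARCH return each line number plus a CRC checksum tag (~2.5 LLM tokens), so an edit only needs to specify line + tag + new_text, which saves a large number of tokens.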
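The line + tag + new_text scheme can be sketched as follows. This is one reading of the proposal, not antirez's code: the 16-bit tag width, the hex rendering, and the function names tag_for, read_with_tags, and apply_edit are assumptions made for the example. The key property is that an edit is rejected when the tag no longer matches the current line content.

```python
import zlib


def tag_for(line: str) -> str:
    """Short checksum tag for one line (assumption: low 16 bits of CRC-32, as 4 hex digits)."""
    return format(zlib.crc32(line.encode("utf-8")) & 0xFFFF, "04x")


def read_with_tags(lines):
    """What a READ tool might return: (line number, tag, text) for every line."""
    return [(number, tag_for(text), text) for number, text in enumerate(lines, start=1)]


def apply_edit(lines, line_no: int, tag: str, new_text: str):
    """Replace one line, but only if the caller's tag matches the line's current content."""
    if not 1 <= line_no <= len(lines):
        raise IndexError(f"line {line_no} is out of range")
    if tag_for(lines[line_no - 1]) != tag:
        raise ValueError(f"stale tag for line {line_no}; re-read the file before editing")
    updated = list(lines)
    updated[line_no - 1] = new_text
    return updated


source = ["timeout = 30", "retries = 3"]
number, tag, _text = read_with_tags(source)[0]
edited = apply_edit(source, number, tag, "timeout = 60")
```

The model never has to reproduce the old line; the short tag stands in for it and still protects against editing a line that changed since the last read.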

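8. Stash: shared memory for a team's coding agents (⭐96)

Source: https://github.com/Fergana-Labs/stash
⭐96 (verified), 30 forks, 1,196 commits
Key points: captures the run history of every coding agent on a team and makes it searchable, organizable, and shareable. When 5 engineers run Claude Code on the same repo, each agent gets context from the other sessions. The project claims a 49% speedup for long-running Claude Code instances.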

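9. BerriAI/litellm-agent-platform: a multi-agent sandbox platform (⭐455)

Source: https://github.com/BerriAI/litellm-agent-platform
⭐455 (verified), 45 forks, 777 commits
Key points: a self-hosted platform that runs Claude Code, Codex, and Hermes in isolated K8s pods with stub credentials; a vault swaps in the real secrets at outbound TLS. Supports the lap CLI, a Web UI, and direct API access.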

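10. Agent Braille: 8-bit state encoding that saves ~92% of tokens

Source: https://github.com/Tetrahedroned/Agent-Braille
⭐3 (new project, 3 days old), Python, includes an arXiv paper
Key points: encodes agent state with Unicode Braille Patterns (U+2800–U+28FF), where 1 code point = 8 state dimensions. A vocabulary extension makes each Braille cell exactly 1 token (native tokenizers spend ~3 tokens/cell). Includes error-detection codes (parity check, Hamming codes).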
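The encoding works because the Braille Patterns block holds exactly 256 code points, one per 8-bit value. The sketch below shows that mapping plus a simple even-parity check; it is not the project's code. The bit order (flag i maps to bit i) and the choice to spend the eighth flag on parity are assumptions made for illustration.

```python
BRAILLE_BASE = 0x2800  # first code point of the Unicode Braille Patterns block


def encode_state(flags) -> str:
    """Pack 8 boolean flags into one Braille character (flag i sets bit i)."""
    if len(flags) != 8:
        raise ValueError("expected exactly 8 flags")
    bits = sum(1 << index for index, flag in enumerate(flags) if flag)
    return chr(BRAILLE_BASE + bits)


def decode_state(cell: str) -> list:
    """Unpack one Braille character back into 8 boolean flags."""
    bits = ord(cell) - BRAILLE_BASE
    if not 0 <= bits <= 0xFF:
        raise ValueError("not a Braille Patterns character")
    return [bool((bits >> index) & 1) for index in range(8)]


def encode_with_parity(flags7) -> str:
    """Use 7 flags for state and the 8th as an even-parity bit."""
    parity = sum(bool(flag) for flag in flags7) % 2 == 1
    return encode_state(list(flags7) + [parity])


def parity_ok(cell: str) -> bool:
    """True when the cell has an even number of set bits."""
    return sum(decode_state(cell)) % 2 == 0


cell = encode_with_parity([True, False, True, True, False, False, False])
```

A single flipped bit changes the parity, so parity_ok detects it; correcting the error rather than just detecting it would need a stronger code such as the Hamming codes the project mentions.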

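11. Lapdog: Datadog's local observability tool for Claude Code (HN 9 pts)

Source: https://lapdog.datadoghq.com/
Key points: running brew install datadog/lapdog/lapdog && lapdog claude lets you watch every interaction in a Claude Code session in real time. It shows sessions, traces, and spans, with cost tracking.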

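12. Context Window Fallacy: more context is not necessarily better

Source: https://arizenai.com/context-window-fallacy/
Key points: the "Context Window Fallacy" is the assumption that adding context tokens improves performance; it does not necessarily do so. Three structural problems are named: attention decay, control-boundary collapse, and premature convergence. The author argues for "budget → compress → reconstruct" between steps rather than stuffing in more tokens.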
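One possible reading of "budget → compress → reconstruct" is sketched below. The article's actual procedure is not described in this digest, so everything here is an assumption: tokens are approximated by whitespace-separated words, compression is plain truncation, and build_step_context is a hypothetical name.

```python
def count_tokens(text: str) -> int:
    """Crude stand-in for a tokenizer: count whitespace-separated words."""
    return len(text.split())


def truncate(text: str, max_tokens: int) -> str:
    """Compress an entry by keeping its first max_tokens words."""
    words = text.split()
    if len(words) <= max_tokens:
        return text
    return " ".join(words[:max_tokens]) + " [...]"


def build_step_context(goal: str, history, budget: int, per_entry_cap: int = 6) -> str:
    """Rebuild the context for the next step instead of appending to the old one."""
    remaining = budget - count_tokens(goal)        # budget: the goal is always kept
    selected = []
    for entry in reversed(history):                # newest entries are considered first
        compressed = truncate(entry, per_entry_cap)  # compress
        cost = count_tokens(compressed)
        if cost > remaining:
            break
        selected.append(compressed)
        remaining -= cost
    return "\n".join([goal] + list(reversed(selected)))  # reconstruct in original order


history = [
    "ran pytest and saw three failures in parser module tests",
    "patched tokenizer",
    "reran pytest one failure remains",
]
context = build_step_context("Fix the failing test", history, budget=16)
```

Each step starts from a context that was rebuilt to fit the budget, so stale detail is dropped rather than carried along indefinitely.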

Watchlist

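Project/Signal	Status	Notes
Forge guardrails	⭐1.2k, growing fast	Standout project in the small-model agent space
Semble code search	⭐3.3k, mature MCP integration	Candidate for integration as an MCP server for Hermes
InsForge	⭐10.4k, 3.9k commits	Highly mature agent-native infrastructure
GSD-2	⭐7.7k, iterating steadily	Track daily
Claude Code sandbox bypass	2 reports	Security issues keep surfacing; Claude Code's network allowlist can be bypassed
Lapdog (Datadog)	Newly released	A large vendor formally entering the agent observability market
Agent Braille	Experimental, ⭐3	An extreme exploration of token optimization; wait and see
Claude Soul (cross-session learning)	HN 10 pts	Cross-session learning engine for Claude Code; not yet verified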