FPF Agent Stack

04-Dev-Method-and-Plan.md•2.8 KiB

# Development Method (FPF-first): FPF Agent Stack

Method + ceremonies + gates.

## 1. Core working agreements

- Describe, then test: do not call something a Spec until a harness can falsify it.
- Keep plan vs run separate: WorkPlan (intended schedule) is not Work (executed reality).
- Treat all model output as untrusted input; the executor is deterministic and schema-driven.
- Guards are tri-state: pass | degrade | abstain. Unknown never becomes pass.

## 2. Artifact flow (I/D/S style)

Keep the pipeline short and falsifiable:

| Stage | Artifact | Gate |
| :--- | :--- | :--- |
| **D** | Functional Description (this stack) | Stakeholder review: is it useful? |
| **S** | BDD feature files (Cucumber) | Cucumber green on CI |
| **Impl** | Skills + runtime code | Unit tests + schema checks |
| **E** | Evidence bundle (AgentFS audit + test reports) | Auditor can reproduce from snapshot |

## 3. Skill authoring pattern (Atomize + Route)

Every skill/tool must ship four parts (lightweight but explicit):

- **L-TOOL-xx:** Definition (what it means / contract).
- **A-TOOL-xx:** Admissibility (guard predicate; what must be true to run).
- **D-TOOL-xx:** Duties (who must do what; retention, reviews).
- **E-TOOL-xx:** Evidence carriers (logs/traces used to adjudicate A-TOOL-xx).

Put these in `SKILL.md` so the agent and humans see the same contract.

## 4. Quality gates (CI)

- **Static:** lint, typecheck, schema validation for every tool interface.
- **Dynamic:** cucumber (acceptance) + unit tests (contract-level).
- **Audit:** each cucumber run stores an AgentFS session DB + RunTrace artifact.
- **Freshness:** evidence artifacts carry valid-until dates; stale evidence accrues epistemic debt and triggers review.

## 5. Proxy audit loop (avoid Goodhart)

Whenever you introduce a metric (latency, pass rate, etc.), explicitly declare what objective it proxies and schedule periodic review.

## Appendix A. Dev Plan (6 milestones)

| Milestone | Outcome | Acceptance evidence |
| :--- | :--- | :--- |
| **M0: Scaffold** | Repo layout, skill loader skeleton, cucumber runner wired | Cucumber executes a dummy feature |
| **M1: Minimal agent loop** | FunctionGemma selects one tool; schema validation; abstain on errors | Features: tool selection + schema-fail behavior pass |
| **M2: AgentFS integration** | All tool execution inside AgentFS session; diff + audit captured | Features: overlay isolation + audit log pass |
| **M3: Skill pack v1** | 3-5 core skills (e.g., file ops, repo search, doc generation) | Feature: skill discovery + version pinning pass |
| **M4: Guard + evidence discipline** | Per-tool guards + evidence links; tri-state decisions enforced | Features: unknown never pass; degrade/abstain rules pass |
| **M5: Train/evaluate loop** | Dataset from traces; fine-tune; regressions prevented | Cucumber green + lower schema-error rate |
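
The tri-state guard rule ("Unknown never becomes pass") and the freshness gate on evidence can be illustrated with a small sketch. This is not the stack's actual implementation: the names `Decision`, `Evidence`, and `decide` are hypothetical, and mapping stale evidence to `degrade` and missing evidence to `abstain` is an assumption made here for illustration. The document itself only says that stale evidence accrues epistemic debt and triggers review.

```python
from dataclasses import dataclass
from datetime import date
from enum import Enum
from typing import Optional


class Decision(Enum):
    """Tri-state guard outcome: pass | degrade | abstain."""
    PASS = "pass"
    DEGRADE = "degrade"
    ABSTAIN = "abstain"


@dataclass(frozen=True)
class Evidence:
    """Hypothetical evidence carrier (E-TOOL-xx) with a valid-until date."""
    source: str
    valid_until: date


def decide(admissible: Optional[bool],
           evidence: Optional[Evidence],
           today: date) -> Decision:
    """Adjudicate a guard predicate (A-TOOL-xx) against its evidence.

    `admissible` is None when the predicate could not be evaluated.
    """
    # Unknown (None) or a failed predicate never becomes pass.
    if admissible is not True:
        return Decision.ABSTAIN
    # Assumption: with no evidence carrier the guard cannot be adjudicated.
    if evidence is None:
        return Decision.ABSTAIN
    # Assumption: stale evidence downgrades the outcome instead of passing.
    if evidence.valid_until < today:
        return Decision.DEGRADE
    return Decision.PASS


if __name__ == "__main__":
    today = date(2026, 1, 26)
    fresh = Evidence("agentfs-audit", date(2026, 2, 1))
    stale = Evidence("agentfs-audit", date(2026, 1, 1))
    for admissible, ev in [(True, fresh), (True, stale), (None, fresh)]:
        print(admissible, ev.valid_until, decide(admissible, ev, today).value)
```

Keeping the function total over `None` inputs is the design point: every path that lacks positive confirmation falls through to `abstain` or `degrade`, so a caller cannot obtain `pass` by omission.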
