Glama

velocity_end_task

Stop task timers and record outcomes to track performance against historical medians. Updates duration, status, and captures git changes for accurate productivity analysis.

Instructions

Stop a task timer started with velocity_start_task, record the outcome, and return the actual duration alongside a comparison to the historical median for similar tasks.

When to use: immediately after finishing — or abandoning — any task started with velocity_start_task. Always call, even on failed or abandoned outcomes; skipping leaves orphaned active rows that pollute future predictions and stats.

Side effects: updates the task row in `~/.velocity-mcp/tasks.db` with end timestamp, duration, status, optional file/line counts, and any telemetry passed in. Shells out to `git diff --stat HEAD~1` and `git log --since` (5s timeout each) to capture diff stats; safely no-ops outside a git repo. On completed status: computes a semantic embedding for similarity matching, records a calibration residual, and — if the task belonged to a plan — seals the plan when its last active task ends.
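The git capture described above can be sketched as follows. This is an illustrative reconstruction, not the server's actual implementation: the function name, return shape, and the `rev-parse` repo check are assumptions; only the commands, the 5-second timeout, and the no-op-outside-a-repo behavior come from the description.

```python
import subprocess
from typing import Optional


def capture_git_stats(start_iso: str) -> Optional[dict]:
    """Collect diff stats and commits for the task window (hypothetical sketch)."""

    def run(args: list) -> Optional[str]:
        # Each git command gets its own 5s timeout, per the disclosed behavior.
        try:
            out = subprocess.run(
                ["git"] + args,
                capture_output=True, text=True, timeout=5,
            )
        except (subprocess.TimeoutExpired, FileNotFoundError):
            return None
        return out.stdout if out.returncode == 0 else None

    # `git rev-parse` exits non-zero outside a repository, giving the safe no-op.
    if run(["rev-parse", "--is-inside-work-tree"]) is None:
        return None

    return {
        "diff_stat": run(["diff", "--stat", "HEAD~1"]),
        "commits": run(["log", "--oneline", "--since=" + start_iso]),
    }
```

Returning `None` (rather than raising) outside a repo matches the "safely no-ops" claim: the caller can simply omit the `git_diff` block from the response.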

Returns: JSON with `task_id`, `duration_seconds` (numeric), `duration_human` (formatted), `category`, `tags`, and a message that compares this run's duration to the historical median for the category+tags combination ("you were 23% faster", "right on pace", etc.). Includes a `git_diff` block with lines added/removed, files changed, and commits made during the task when a git repo is detected.
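The pace message can be reconstructed roughly as below. The thresholds and exact wording are assumptions for illustration; the description only guarantees phrasings like "you were 23% faster" and "right on pace".

```python
def pace_message(duration_s: float, median_s: float) -> str:
    """Phrase this run's duration against the historical median (sketch)."""
    if median_s <= 0:
        # No history yet for this category+tags combination.
        return "no historical data for this category yet"
    # Positive delta means this run beat the median.
    delta = (median_s - duration_s) / median_s
    if abs(delta) < 0.05:  # assumed tolerance band
        return "right on pace"
    pct = round(abs(delta) * 100)
    direction = "faster" if delta > 0 else "slower"
    return "you were %d%% %s than the median" % (pct, direction)
```

For example, a 77-second run against a 100-second median yields "you were 23% faster than the median".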

Input Schema

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| `task_id` | Yes | Identifier of the active task to end. Must match a `task_id` returned by an earlier `velocity_start_task` call that has not already been ended. | |
| `status` | Yes | Outcome of the task: "completed" (succeeded as planned), "failed" (attempted but did not produce the intended result), or "abandoned" (intentionally stopped — e.g. requirements changed mid-task). Affects whether calibration residuals and embeddings are recorded. | |
| `actual_files` | No | Number of files actually modified during the task. Compared against `estimated_files` from start-task to feed accuracy metrics. | |
| `notes` | No | Free-form context about what happened, surprises, or follow-ups. Stored as plain text for later review; does not affect predictions. | |
| `tools_used` | No | Names of the tools invoked during the task (e.g. ["Edit", "Bash", "Grep"]). Used for telemetry; ordering does not matter, and duplicates are deduplicated. | |
| `tool_call_count` | No | Total number of individual tool invocations during the task. Useful for diagnosing tasks that took many small steps versus few large ones. | |
| `turn_count` | No | Number of assistant turns (request/response cycles) the task spanned. Helps correlate task duration with conversational verbosity. | |
| `retry_count` | No | Number of times an operation had to be retried (e.g. a failing test re-run after a fix). Higher counts often correlate with under-estimated tasks. | |
| `tests_passed_first_try` | No | When tests were run as part of the task, whether they passed on the very first execution. Useful signal for code-quality dashboards. | |
| `model_id` | No | Identifier of the model that handled this task, if not already set at start. Required for model-segmented calibration to take effect. | |
| `context_tokens` | No | Approximate tokens in the context window at task end. Stored alongside the start-time value to track context growth across the task. | |
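A call satisfying the schema might look like the following. All values are hypothetical; only `task_id` and `status` are required, and the telemetry fields may be omitted freely.

```python
# Hypothetical arguments for a velocity_end_task call (values illustrative).
arguments = {
    "task_id": "task_8f3a",          # from the earlier velocity_start_task call
    "status": "completed",           # one of: completed | failed | abandoned
    "actual_files": 4,               # compared against estimated_files
    "notes": "Refactor touched one more module than planned.",
    "tools_used": ["Edit", "Bash", "Grep"],
    "tool_call_count": 17,
    "turn_count": 6,
    "retry_count": 1,
    "tests_passed_first_try": False,
}
```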
Behavior: 5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries the full burden of behavioral disclosure and does so comprehensively. It details side effects (updates task row, shells out to git commands, computes embeddings, records calibration residuals, seals plans), execution constraints (5s timeout, safely no-ops outside git repo), and conditional behaviors based on status. This provides rich behavioral context beyond basic functionality.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is well-structured with clear sections (purpose, usage guidelines, side effects, return values) and front-loads the core functionality. While comprehensive, a few sentences could be tightened; the git command explanations run long, though arguably they earn their length. Overall, most content earns its place by providing essential behavioral context.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a complex mutation tool with 11 parameters and no annotations or output schema, the description provides exceptional completeness. It covers purpose, usage guidelines, side effects, execution details, conditional behaviors, and return format. The detailed explanation of what happens for different status values and the comprehensive return value description compensate for the lack of structured output schema.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The schema description coverage is 100%, so the schema already documents all 11 parameters thoroughly. The description doesn't add significant parameter-specific information beyond what's in the schema descriptions. It mentions some parameters indirectly (like status affecting embeddings), but doesn't provide additional syntax, format, or usage details for individual parameters.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description explicitly states the tool's purpose with specific verbs ('Stop a task timer', 'record the outcome', 'return the actual duration') and clearly distinguishes it from its sibling 'velocity_start_task'. It identifies the exact resource being operated on (task timer started with velocity_start_task) and the comprehensive actions taken.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides explicit guidance on when to use this tool ('immediately after finishing — or abandoning — any task started with velocity_start_task') and when not to skip it ('Always call, even on failed or abandoned outcomes; skipping leaves orphaned active rows'). It clearly references the alternative/sibling tool (velocity_start_task) and explains the consequences of misuse.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
