velocity_start_task

Start tracking a coding task and get a time estimate based on similar past work. Use before beginning bug fixes, features, or refactors to predict duration from historical data.

Instructions

Start a timer for a discrete coding task and, when historical data is available, return a duration estimate derived from similar past tasks.

When to use: before starting any distinct unit of work — a bug fix, a feature, a refactor, a test-writing pass. Use one task per logical unit; do not batch unrelated changes under a single task. Always pair with velocity_end_task so the task row is closed and the dataset stays clean.
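The start/end pairing described above can be sketched as a call sequence. This is an illustrative wrapper, not part of the server; `call_tool` is a hypothetical MCP client function, and the argument keys follow this document's schema.

```python
def run_tracked_task(call_tool, category, description):
    """Sketch of the intended discipline: one start, one matching end.

    call_tool(name, args) is a stand-in for however your MCP client
    invokes a tool; the tool names come from this documentation.
    """
    started = call_tool("velocity_start_task",
                        {"category": category, "description": description})
    try:
        ...  # do the actual unit of work here
    finally:
        # Always close the task so the row is sealed and the dataset stays clean.
        call_tool("velocity_end_task", {"task_id": started["task_id"]})
    return started["task_id"]
```

Wrapping the work in `try/finally` guarantees `velocity_end_task` runs even if the task itself fails, which is what keeps the historical dataset clean.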

Side effects: inserts a new row into the local SQLite database at ~/.velocity-mcp/tasks.db (override via HOME). Computes a best-effort duration prediction by querying historical rows of the same category/tags; predictions run locally and are cached per-task. Federated upload is disabled unless the user has explicitly opted in via velocity-mcp federation enable.
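The database location and `HOME` override described above amount to a simple path resolution, which can be sketched as follows; this is a reconstruction from the stated behavior, not the server's actual code.

```python
import os

def velocity_db_path() -> str:
    """Resolve the task database path as documented:
    ~/.velocity-mcp/tasks.db, honoring an overridden HOME variable."""
    home = os.environ.get("HOME", os.path.expanduser("~"))
    return os.path.join(home, ".velocity-mcp", "tasks.db")
```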

Returns: JSON with task_id (pass this to velocity_end_task), started_at ISO timestamp, message, and — when enough historical data exists — a prediction block containing point estimate in seconds, p25/p75 range, confidence (low/medium/high), whether the estimate was calibrated, and whether it drew on federated data.
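A caller might unpack the result along these lines. The top-level fields match the return description above; the exact key names inside the prediction block are assumptions, and every value is invented for illustration.

```python
import json

# Illustrative response shaped per the description; all values are made up,
# and the prediction key names are assumed, not confirmed by the server.
raw = json.dumps({
    "task_id": "demo-task-001",
    "started_at": "2025-01-01T12:00:00Z",
    "message": "Task started.",
    "prediction": {
        "point_estimate_seconds": 900,
        "p25_seconds": 600,
        "p75_seconds": 1500,
        "confidence": "medium",
        "calibrated": True,
        "federated": False,
    },
})

result = json.loads(raw)
task_id = result["task_id"]            # pass this to velocity_end_task later
prediction = result.get("prediction")  # absent when history is too thin
```

Treating the prediction block as optional (`.get` rather than indexing) matters, since the description says it only appears when enough historical data exists.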

Input Schema

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| `task_id` | No | Stable unique identifier for this task. Pass the same id later to `velocity_end_task`. | Auto-generated (UUID v4) |
| `category` | Yes | High-level category of the work: scaffold, implement, refactor, debug, test, config, docs, or deploy. Used for historical matching — pick the closest fit rather than inventing new categories. | |
| `description` | Yes | One-sentence description of the task, specific enough that semantic-similarity matching can find comparable historical tasks (e.g. "wire sqlite migrations into the startup path" beats "db work"). | |
| `tags` | No | Free-form tags that describe the technical surface area (e.g. `["typescript", "react", "sqlite"]`). Reuse tags across sessions — consistency improves the quality of historical-similarity matches. | |
| `estimated_files` | No | Your a-priori guess for how many files you expect to touch. Used both as a similarity signal and to compute an accuracy residual when `velocity_end_task` supplies `actual_files`. | |
| `project` | No | Project identifier (typically the repo name or directory basename). | Auto-detected from the git remote or cwd |
| `model_id` | No | Identifier of the model running this task (e.g. "claude-opus-4-7"). Used to segment calibration residuals by model so predictions adapt to model-specific pacing. | |
| `context_tokens` | No | Approximate tokens already in the context window at task start. Stored as telemetry to correlate context pressure with task duration. | |
| `parent_task_id` | No | If this task is a sub-task spawned from another, pass the parent task's id here so the hierarchy is preserved. | |
| `parent_plan_id` | No | If this task is part of a larger plan being tracked as a unit, pass the plan run id so plan-level metrics can be sealed when the last task in the plan completes. | |
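Putting the schema together, a minimal call might carry arguments like the following. The parameter names come from the schema above; the concrete values are invented for illustration.

```python
# Hypothetical arguments for velocity_start_task. Only category and
# description are required; everything else is optional.
args = {
    "category": "implement",
    "description": "wire sqlite migrations into the startup path",
    "tags": ["typescript", "sqlite"],
    "estimated_files": 3,
    "project": "velocity-mcp",
}

# The schema marks exactly these two parameters as required.
required = {"category", "description"}
assert required <= args.keys()
```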
Behavior 5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries full burden and excels at disclosing behavioral traits. It describes side effects ('inserts a new row into the local SQLite database'), computational behavior ('Computes a best-effort duration prediction by querying historical rows'), caching behavior ('predictions run locally and are cached per-task'), and privacy/configuration details ('Federated upload is disabled unless the user has explicitly opted in').

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is well-structured and appropriately sized. It begins with the core purpose, then provides usage guidelines, side effects, and return values in logical sections. Every sentence adds value: the first explains what the tool does, the second provides usage context, the third details side effects and computational behavior, and the fourth specifies return values.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a complex tool with 10 parameters, no annotations, and no output schema, the description provides excellent completeness. It explains the tool's purpose, when to use it, behavioral characteristics, side effects, computational approach, privacy considerations, and detailed return structure. The description compensates fully for the lack of structured metadata about outputs and behavior.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so the baseline is 3. The description adds value by explaining the overall purpose of parameters ('return a duration estimate derived from similar past tasks') and providing context about how parameters like category and tags affect historical matching. However, it doesn't provide specific guidance on parameter interactions or advanced usage patterns beyond what's in the schema descriptions.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose with specific verbs ('Start a timer', 'return a duration estimate') and resources ('discrete coding task', 'historical data'). It distinguishes from sibling 'velocity_end_task' by explaining this starts tasks while the other ends them, and from other siblings by focusing on time tracking with predictions.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines 5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides explicit usage guidance: 'before starting any distinct unit of work' with examples (bug fix, feature, refactor), advises 'one task per logical unit', warns against batching unrelated changes, and explicitly states to 'Always pair with `velocity_end_task`'. It also flags the privacy default: federated upload stays disabled unless the user explicitly opts in.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
