roadmap.md•3.77 kB
# Roadmap
This document gives an overview of the ongoing and future development of Serena.
If you have a proposal or want to discuss something, feel free to open a discussion
on Github. For a summary of the past development, see the [changelog](/CHANGELOG.md).
Want to see us reach our goals faster? You can help out with an issue, start a discussion, or
inform us about funding opportunities so that we can devote more time to the project.
## Overall Goals
Serena has the potential to be the go-to tool for most LLM coding tasks, since it is
unique in its ability to be used as MCP Server in any kind of environment
while still being a capable agent. We want to achieve the following goals in terms of functionality:
1. Top performance (comparable to API-based coding agents) when used through official (free) clients like Claude Desktop.
1. Lowering API costs and potentially improving performance of coding clients (Claude Code, Codex, Cline, Roo, Cursor/Windsurf/VSCode etc).
1. Transparency and simplicity of use. Achieved through the dashboard/logging GUI.
1. Integrations with major frameworks that don't accept MCP. Usable as a library.
Apart from the functional goals, we have the goal of having great code design, so that Serena can be viewed
as a reference for how to implement MCP Servers. Such projects are an emerging technology, and
best practices are yet to be determined. We will share our experiences in [lessons learned](/lessons_learned.md).
## Immediate/Ongoing
- Support for projects using multiple programming languages.
- Evaluate whether `ReplaceLinesTool` can be removed in favor of a more reliable and performant editing approach.
- Generally experiment with various approaches to editing tools
- Manual evaluation on selected tasks from SWE-verified
- Manual evaluation of cost-lowering and performance when used within popular non-MCP agents
- Improvements in prompts, in particular giving examples and extending modes and contexts
## Upcoming
- Publishing Serena as a package that can also be used as library
- Use linting and type-hierarchy from the LSP in tools
- Tools for refactoring (rename, move) - speculative, maybe won't do this.
- Tracking edits and rolling them back with the dashboard
- Improve configurability and safety of shell tool. Maybe autogeneration of tools from a list of commands and descriptions.
- Transparent comparison with DesktopCommander and ...
- Automatic evaluation using OpenHands, submission to SWE-Bench
- Evaluation whether incorporating other MCPs increases performance or usability (memory bank is a candidate)
- More documentation and best practices
## Stretch
- Allow for sandboxing and parallel instances of Serena, maybe use openhands or codex for that
- Incorporate a verifier model or generally a second model (maybe for applying edits) as a tool.
- Building on the above, allow for the second model itself to be reachable through an MCP server, so it can be used for free
- Tracking edits performed with shell tools
## Beyond Serena
The technologies and approaches taken in Serena can be used for various research and service ideas. Some thought that we had are:
- PR and issue assistant working with GitHub, similar to how [OpenHands](https://github.com/All-Hands-AI/OpenHands)
and [qodo](https://github.com/qodo-ai/pr-agent) operate. Should be callable through @serena
- Tuning a coding LLM with Serena's tools with RL on one-shot tasks. We would need compute-funding for that
- Develop a web app to quantitatively compare the performance of various agents by scraping PRs and manually crafted metadata.
The main metric for coding agents should be *developer experience*, and that is hard to grasp and is poorly correlated with
performance on current benchmarks.