
Serena MCP Server

by lin2000wl
lessons_learned.md (4.55 kB)
# Lessons Learned

In this document we briefly collect what we have learned while developing and using Serena, what works well and what doesn't.

## What Worked

### Separate Tool Logic From MCP Implementation

MCP is just another protocol; one should not let its details creep into the application logic. The official docs suggest using function annotations to define tools and prompts. While that may be useful for small projects to get going fast, it is not wise for more serious projects. In Serena, all tools are defined independently and then converted to instances of `MCPTool` using our `make_tool` function.

### Autogenerated PromptFactory

Prompt templates are central to most LLM applications, so one needs good representations of them in the code, while at the same time they often need to be customizable and exposed to users. In Serena we address these conflicting needs by defining prompt templates (in Jinja format) in separate YAML files that users can easily modify, and by autogenerating a `PromptFactory` class with meaningful method and parameter names from these YAML files. The generated class is committed to our code. We separated out the generation logic into the [interprompt](/src/interprompt/README.md) subpackage, which can be used as a library.

### Tempfiles and Snapshots for Testing of Editing Tools

We test most aspects of Serena by having a small "project" for each supported language in `tests/resources`. For the editing tools, which would change the code in these projects, we copy the code into tempfiles and edit the copies. The pretty awesome [syrupy](https://github.com/syrupy-project/syrupy) pytest plugin helped in developing snapshot tests.

### Dashboard and GUI for Logging

It is very useful to know what the MCP server is doing. We collect and display logs in a GUI or a web dashboard, which helps a lot in seeing what's going on and in identifying any issues.

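As an illustration of the log collection described under "Dashboard and GUI for Logging", here is a minimal sketch of a handler that keeps recent log lines in memory so that a GUI or dashboard could poll them. The class `MemoryLogHandler` and its method `recent_lines` are hypothetical names chosen for this example; this is not Serena's actual implementation.

```python
import logging
from collections import deque


class MemoryLogHandler(logging.Handler):
    """Hypothetical helper: keeps the most recent formatted log lines in memory."""

    def __init__(self, max_lines: int = 1000) -> None:
        super().__init__()
        # A bounded deque silently drops the oldest line once it is full.
        self._lines = deque(maxlen=max_lines)

    def emit(self, record: logging.LogRecord) -> None:
        # logging.Handler.handle() already holds the handler lock while calling emit().
        self._lines.append(self.format(record))

    def recent_lines(self) -> list[str]:
        # Take the handler lock so we copy a consistent snapshot.
        self.acquire()
        try:
            return list(self._lines)
        finally:
            self.release()


handler = MemoryLogHandler(max_lines=500)
handler.setFormatter(logging.Formatter("%(levelname)s %(name)s: %(message)s"))

logger = logging.getLogger("demo_server")  # assumed logger name
logger.setLevel(logging.INFO)
logger.propagate = False  # keep the demo output out of the root logger
logger.addHandler(handler)

logger.info("tool call started")
logger.warning("tool call took longer than expected")
```

A dashboard endpoint or GUI refresh loop could then call `handler.recent_lines()` periodically and render the result.
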
### Unrestricted Bash Tool

We know it's not particularly safe to permit unlimited shell commands outside a sandbox, but we ran quite a few evaluations and so far... nothing bad has happened. Seems like the current versions of the AI overlords rarely want to execute `sudo rm -rf /`. Still, we are working on a safer approach as well as better integration with sandboxing.

### Multilspy

The [multilspy](https://github.com/microsoft/multilspy/) project helped us a lot in getting started and stands at the core of Serena. Many better-known Python implementations of language servers were subpar in code quality and design (for example, missing types).

### Developing Serena with Serena

We clearly notice that the better the tool gets, the easier it is to make it even better.

## Prompting

### Shouting and Emotive Language May Be Needed

When developing the `ReplaceRegexTool`, we were initially not able to make Claude 4 (in Claude Desktop) use wildcards to save on output tokens. Neither examples nor explicit instructions helped. It was only after adding

```
IMPORTANT: REMEMBER TO USE WILDCARDS WHEN APPROPRIATE! I WILL BE VERY UNHAPPY IF YOU WRITE LONG REGEXES WITHOUT USING WILDCARDS INSTEAD!
```

to the initial instructions and to the tool description that Claude finally started following the instructions.

## What Didn't Work

### Lifespan Handling by MCP Clients

The MCP technology is clearly very green. Even though there is a lifespan context in the MCP SDK, many clients, including Claude Desktop, fail to clean up properly, leaving zombie processes behind. We mitigate this through the GUI window and the dashboard, so the user sees whether Serena is running and can terminate it there.

### Trusting Asyncio

Running multiple asyncio apps led to non-deterministic event loop contamination and deadlocks, which were very hard to debug and understand. We solved this with a large hammer, by putting all asyncio apps into a separate process.
It made the code much more complex and slightly increased RAM requirements, but it seems that this was the only way to reliably overcome the asyncio deadlock issues.

### Cross-OS Tkinter GUI

Different operating systems have different limitations when it comes to opening a window or dealing with Tkinter installations. This was so messy to get right that we pivoted to a web dashboard instead.

### Editing Based on Line Numbers

Not only are LLMs notoriously bad at counting, but line numbers also change after edit operations, and LLMs are often too dumb to understand that they should update the line number information they received before. We pivoted to string-matching and symbol-name-based editing.
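
To illustrate why string matching is more robust than line numbers, here is a minimal sketch of an edit that locates its target by content. The function `replace_unique` is a hypothetical helper written for this example and is not taken from Serena's code; it assumes that an edit should only be applied when the search text occurs exactly once.

```python
def replace_unique(content: str, old: str, new: str) -> str:
    """Hypothetical helper: replace `old` with `new`, requiring exactly one match."""
    if not old:
        raise ValueError("search text must not be empty")
    count = content.count(old)
    if count != 1:
        # Refuse ambiguous or missing matches instead of guessing.
        raise ValueError(f"expected exactly one match, found {count}")
    return content.replace(old, new, 1)


source = "def add(a, b):\n    return a - b\n\ndef double(x):\n    return x + x\n"

# The first edit inserts a line, which shifts every later line number ...
source = replace_unique(source, "def add(a, b):\n", "def add(a, b):\n    # sum of a and b\n")
# ... but the second edit still finds its target, because it matches on content.
source = replace_unique(source, "return a - b", "return a + b")
```

Because each edit is anchored on text rather than position, earlier edits cannot invalidate later ones as long as the matched text itself is unchanged.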
