MCP
MCP has 2 components:
- Server
- Client
I added a third component, Ingest, where the data is ingested into Pinecone.
The server is the backbone. It can be hosted locally for a client such as Claude Desktop, or hosted at an IP address so that a client like the Streamlit app can work with it.
Both examples are shown here. Instructions on how to run them are below.
Excuse the lack of proper software engineering practices and structure. I was on a time crunch.
SEC Filings Chatbot
There are 4 folders in this repo:
- ingest
- data
- server
- client
Ingest
This is where the data is ingested into Pinecone.
First we summarize
There are two agents, stored in oai_agents.py:
- Data Summarizer Agent
- Data Verifier Agent
The Summarizer agent makes a summary of each of the SEC filing documents.
The Verifier agent verifies the summary.
The prompts are stored in prompt.py.
At the end of this step, which can be run with python summarizer.py, we have a summary of each file for every company.
The quarterly file gets 10 bullet points, and the yearly file gets 20 bullet points.
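The summarize-then-verify step above can be sketched roughly as follows. This is not the code in oai_agents.py: the `summarizer` and `verifier` callables are hypothetical stand-ins for the two agents, and retrying when verification fails (`max_attempts`) is an assumption, since the repo does not describe what happens on a failed check.

```python
from typing import Callable

# Bullet counts stated in this README: 10 for quarterly filings, 20 for yearly ones.
BULLETS_BY_FILING_TYPE = {"quarterly": 10, "yearly": 20}


def bullet_budget(filing_type: str) -> int:
    """Return how many bullet points a summary of this filing type should have."""
    return BULLETS_BY_FILING_TYPE[filing_type.lower()]


def summarize_with_verification(
    filing_text: str,
    filing_type: str,
    summarizer: Callable[[str, int], str],  # hypothetical stand-in for the Summarizer agent
    verifier: Callable[[str, str], bool],   # hypothetical stand-in for the Verifier agent
    max_attempts: int = 3,                  # assumption: retry policy is not documented
) -> str:
    """Request a summary and accept it once the verifier approves it."""
    n_bullets = bullet_budget(filing_type)
    for _ in range(max_attempts):
        summary = summarizer(filing_text, n_bullets)
        if verifier(filing_text, summary):
            return summary
    raise RuntimeError(f"no verified summary after {max_attempts} attempts")
```

Passing the agents in as plain callables keeps the control flow easy to test without any API calls.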
Then we ingest.
Each file is split into chunks of 1024 with an overlap of 128.
The summary of each file is added to the respective chunks.
Finally all of this is ingested to Pinecone.
This can be replicated by running python ingest.py
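As a rough illustration of the chunking described above (not the actual ingest.py), here is a minimal sliding-window sketch. It assumes the 1024 and 128 figures count characters, which this README does not confirm (they could equally be tokens), and it assumes the summary is stored as a field next to each chunk. The names `chunk_text` and `attach_summary` are made up for this example.

```python
def chunk_text(text: str, chunk_size: int = 1024, overlap: int = 128) -> list[str]:
    """Split text into windows of chunk_size that share `overlap` characters with the previous one."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")
    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
    return chunks


def attach_summary(chunks: list[str], summary: str) -> list[dict]:
    """Pair every chunk with the summary of the file it came from (assumed record layout)."""
    return [
        {"chunk_index": i, "text": chunk, "summary": summary}
        for i, chunk in enumerate(chunks)
    ]
```

Each record would then be embedded and upserted to Pinecone. Carrying the file-level summary on every chunk gives a retrieved chunk some context about the filing it belongs to.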
Data
This is where the SEC filing text files, as well as the summary of each file, live.
Server
This is the MCP server.
- prompt.py has all the prompts.
- chatbot.py has all the chatbot logic, using OpenAI's Responses API with structured outputs.
- pc.py has all the logic to retrieve the docs from Pinecone. It also supports reranking, but that is off by default and needs to be turned on.
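The optional reranking in pc.py follows the usual two-stage idea: fetch a wider set of candidates, then reorder them and keep the best few. The sketch below only illustrates that toggle. `search` and `rerank` are hypothetical callables standing in for the Pinecone query and the reranker, and the `use_rerank`, `top_k`, and `top_n` names and defaults are assumptions rather than the actual settings in this repo.

```python
from typing import Callable, Optional

Scored = tuple[str, float]  # (document text, relevance score)


def retrieve(
    query: str,
    search: Callable[[str, int], list[Scored]],  # stand-in for the vector search
    rerank: Optional[Callable[[str, list[str]], list[Scored]]] = None,  # stand-in reranker
    use_rerank: bool = False,  # assumed flag name; reranking is off unless enabled
    top_k: int = 20,
    top_n: int = 5,
) -> list[Scored]:
    """Fetch top_k candidates, optionally rescore them, and return the best top_n."""
    candidates = search(query, top_k)
    if use_rerank and rerank is not None:
        # Replace the vector-search scores with the reranker's scores.
        candidates = rerank(query, [text for text, _ in candidates])
    ranked = sorted(candidates, key=lambda pair: pair[1], reverse=True)
    return ranked[:top_n]
```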
There are two servers:
- server.py: This is a local server. You can run it on something like Claude Desktop.
- server_host.py: This is a hosted server. You can run it with the front-end client.
To run the Claude Desktop server, add its JSON configuration entry to claude_desktop_config.json.
Restart Claude, and the tools and resources should appear.
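The exact JSON entry for this repo is not reproduced here, so the following is only a sketch of the general shape. Claude Desktop reads local servers from an `mcpServers` object whose entries carry a `command` and `args`. The helper name `add_mcp_server`, the server name, and the path used in the note below are placeholders, not values from this repo.

```python
import json
from pathlib import Path


def add_mcp_server(config_path: Path, name: str, command: str, args: list[str]) -> dict:
    """Add (or overwrite) one entry under "mcpServers" and write the file back."""
    raw = config_path.read_text() if config_path.exists() else ""
    config = json.loads(raw) if raw.strip() else {}
    # Keep any existing settings and servers; only touch the named entry.
    config.setdefault("mcpServers", {})[name] = {"command": command, "args": args}
    config_path.write_text(json.dumps(config, indent=2))
    return config
```

For example, calling `add_mcp_server(Path("claude_desktop_config.json"), "sec-filings", "python", ["/absolute/path/to/server/server.py"])` leaves an `mcpServers` object with one `sec-filings` entry. Substitute your own server name and the absolute path to server.py.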
Resources:
- Greetings: resource://greeting
- List of companies: resource://companies
Tools:
- get_sec_files: Takes a company ticker and returns the SEC files available for that company
- query_sec: Takes the user's query and answers the question
Client
This is a Streamlit client that simulates a server-client relationship.
There is a Pydantic AI agent that uses the tools from the MCP server to make it run. That's in client_runner.py.
The Streamlit app was completely vibecoded.
To run the client, first start the server by running python server_host.py in the server folder.
Then start the client by running streamlit run app.py in the client folder.
Enables querying and analysis of SEC filing documents through natural language. Uses Pinecone vector search with document summarization to help users retrieve and understand financial filings for various companies.