USAGE.md•5.92 kB
# Usage Guide
This guide shows you how to use the Investment Statement MCP Server.
## Quick Start
### 1. Install Dependencies
```bash
pip install -e ".[dev]"
```
This will install all required packages including:
- FastMCP (MCP server framework)
- pymupdf4llm (PDF parsing)
- LanceDB (vector database)
- sentence-transformers (embeddings)
- aiosqlite (async SQLite)
### 2. Index Your First Statement
You can index statements in two ways:
#### Option A: Using the Example Script
```bash
python examples/index_and_query.py
```
This will:
1. Index the sample Questrade statement
2. Run example searches and queries
3. Display results
#### Option B: Using the MCP Server with Claude Desktop
1. Add the server to your Claude Desktop configuration:
```json
{
"mcpServers": {
"investment-statements": {
"command": "python",
"args": ["-m", "src.server"],
"cwd": "/Users/jasonlo/Repos/AI/investing-mcp",
"env": {
"LANCEDB_PATH": "./data/lancedb",
"SQLITE_PATH": "./data/statements.db",
"PDF_ARCHIVE_PATH": "./data/pdfs",
"JSON_ARCHIVE_PATH": "./data/json",
"MODEL_NAME": "all-MiniLM-L6-v2"
}
}
}
}
```
2. Restart Claude Desktop
3. Use the tools through Claude:
```
You: Index this statement: /path/to/statement.pdf
Claude: [Uses index_statement tool]
You: What were my SPY holdings in September 2025?
Claude: [Uses search_statements or get_holdings_by_symbol]
```
## Available MCP Tools
### Indexing
**`index_statement(file_path: str)`**
Index a new PDF statement. This will:
- Parse the PDF and extract data
- Store structured data in SQLite
- Generate embeddings and store in LanceDB
- Archive the PDF and JSON
```python
await index_statement("/path/to/statement.pdf")
```
**`reindex_statement(statement_id: str)`**
Reindex an existing statement from its original PDF.
**`get_indexing_stats()`**
Get statistics about indexed data.
### Search
**`search_statements(query: str, limit: int = 10)`**
Search using natural language. Examples:
```
"What were my total dividends in Q3 2025?"
"Show me all SPY holdings"
"When did I make deposits?"
```
### Structured Queries
**`get_holdings_by_symbol(symbol: str)`**
Get all holdings for a specific symbol across all statements.
```python
await get_holdings_by_symbol("SPY")
```
**`get_transactions_by_date_range(start_date: str, end_date: str, account_number: str = None)`**
Get transactions in a date range.
```python
await get_transactions_by_date_range("2025-01-01", "2025-12-31")
```
**`get_account_balance(account_number: str, date: str)`**
Get account balance at a specific date.
```python
await get_account_balance("51516162", "2025-09-30")
```
## Available MCP Resources
Resources provide read-only access to data:
**`statements://catalog`**
List all indexed statements.
**`statements://{statement_id}`**
Get specific statement data.
**`statements://{statement_id}/summary`**
Get account summary for a statement.
## Example Queries
### Natural Language Search
```
"What is my current SPY position?"
"Show me all dividends received"
"What was my account balance in September?"
"How much cash do I have in USD?"
```
### Structured Queries
```python
# Track a position over time
holdings = await get_holdings_by_symbol("SPY")
for h in holdings:
print(f"{h['statement_date']}: {h['quantity']} shares @ ${h['market_price']}")
# Find all trades in a period
transactions = await get_transactions_by_date_range("2025-01-01", "2025-12-31")
dividends = [t for t in transactions if "DIV" in t['activity_type']]
# Get historical balances
balance = await get_account_balance("51516162", "2025-09-30")
```
## Data Storage
The MCP server uses a hybrid storage approach:
### SQLite (./data/statements.db)
- Structured data (statements, holdings, transactions)
- Fast indexed queries
- Metadata and relationships
### LanceDB (./data/lancedb/)
- Vector embeddings for semantic search
- Document chunks with metadata
- Similarity search
### Archives
- `./data/pdfs/` - Original PDF files
- `./data/json/` - Parsed JSON data
## Supported Institutions
### Currently Supported
**Questrade** ✅
- Account summary
- Holdings/positions
- Transactions
- TFSA-specific data
### Coming Soon
**Interactive Brokers** ⏳
**Scotia** ⏳
## Troubleshooting
### PDF Parsing Issues
If a statement fails to parse:
1. Check the PDF is readable: `pdftotext statement.pdf`
2. Verify it's a Questrade statement
3. Try reindexing: `await reindex_statement(statement_id)`
### Database Issues
Reset the databases:
```bash
rm -rf data/
python examples/index_and_query.py
```
### Search Not Finding Results
1. Check if statement is indexed: `await get_indexing_stats()`
2. Try broader search terms
3. Use structured queries for specific data
## Advanced Usage
### Batch Indexing
```python
import glob
pdf_files = glob.glob("/path/to/statements/*.pdf")
for pdf in pdf_files:
result = await index_statement(pdf)
print(f"Indexed: {result['statement_id']}")
```
### Custom Queries
You can directly access the database clients:
```python
from src.database.sqlite_client import SQLiteClient
db = SQLiteClient()
await db.initialize()
# Custom SQL query
async with aiosqlite.connect(db.db_path) as conn:
cursor = await conn.execute("SELECT * FROM holdings WHERE symbol = ?", ("SPY",))
results = await cursor.fetchall()
```
## Performance Tips
1. **Index in batches**: Index multiple statements before querying
2. **Use structured queries when possible**: They're faster than semantic search
3. **Limit search results**: Use the `limit` parameter to control result size
4. **Archive old statements**: Keep only recent statements in the active database
## Next Steps
1. Index your historical statements
2. Explore natural language queries
3. Build custom analysis scripts
4. Add support for other institutions (contributions welcome!)