Skip to main content
Glama
knishioka

IB Analytics MCP Server

by knishioka
validate-data.md7.43 kB
--- description: Validate CSV data integrity and format compliance allowed-tools: Read, Bash(python:*), Grep, Glob argument-hint: [csv-file-path|--latest] --- Perform comprehensive data validation on IB Flex Query CSV files to ensure data integrity and format compliance. ## Task Validate CSV data structure, content integrity, and business logic constraints. ### Validation Scope **File Selection**: - If $ARGUMENTS provides path: Validate specified CSV - If $ARGUMENTS contains `--latest`: Validate most recent file in `data/raw/` - Otherwise: Validate all CSVs in `data/raw/` ### Validation Checks **1. File Structure Validation** ```python # Check file exists and readable assert csv_file.exists() assert csv_file.stat().st_size > 0 # Check CSV format with open(csv_file) as f: first_line = f.readline() assert first_line.strip() # Not empty ``` **2. Section Detection** ```python # Verify expected sections present expected_sections = [ "Trades", "Open Positions", "Cash Report" ] for section in expected_sections: assert section in sections_found ``` **3. Date Format Validation** ```python # Trade dates in YYYYMMDD format for trade_date in trade_dates: assert len(trade_date) == 8 assert trade_date.isdigit() # Valid date range year = int(trade_date[:4]) month = int(trade_date[4:6]) day = int(trade_date[6:8]) assert 2000 <= year <= 2030 assert 1 <= month <= 12 assert 1 <= day <= 31 ``` **4. Numeric Field Validation** ```python # Prices, quantities must be numeric for field in numeric_fields: value = Decimal(field) assert value is not None # Financial values shouldn't be extremely large/small assert abs(value) < Decimal("1000000000") # $1B limit ``` **5. Account ID Consistency** ```python # Account ID format (e.g., U1234567) assert account_id.startswith("U") assert account_id[1:].isdigit() assert len(account_id) >= 8 # Consistent across all sections for section in sections: assert section.account_id == account_id ``` **6. Business Logic Validation** ```python # P&L calculations for trade in trades: calculated_pnl = (trade.sell_price - trade.buy_price) * trade.quantity assert abs(calculated_pnl - trade.pnl) < Decimal("0.01") # Allow rounding # Position consistency total_positions = sum(p.quantity for p in positions) assert total_positions > 0 or len(positions) == 0 # Cash balances total_cash = sum(cb.amount for cb in cash_balances) # Cash can be negative (margin account) ``` **7. Data Completeness** ```python # Required fields present required_fields = ["Symbol", "Quantity", "TradePrice", "TradeDate"] for field in required_fields: assert field in headers # No missing critical data for trade in trades: assert trade.symbol is not None assert trade.quantity != 0 assert trade.price > 0 ``` **8. Bond-Specific Validation** ```python # Bond symbols format (CUSIP) for position in bond_positions: assert len(position.symbol) == 9 # CUSIP length assert position.symbol[:6].isdigit() # Maturity date validation if position.maturity_date: assert position.maturity_date > date.today() ``` ### Expected Output ``` === Data Validation Report === File: data/raw/U1234567_2025-01-01_2025-10-05.csv Size: 2.4 MB Validated: 2025-10-11 14:45:00 ✅ FILE STRUCTURE - File exists and readable - Non-empty content (2,456,789 bytes) - Valid CSV format - UTF-8 encoding ✅ SECTION DETECTION Found all expected sections: - ✅ Trades (1,234 rows) - ✅ Open Positions (45 rows) - ✅ Cash Report (3 rows) - ✅ Account Information (1 row) ✅ DATE FORMAT VALIDATION - Trade dates: 1,234 validated - Format: YYYYMMDD (all valid) - Date range: 2025-01-01 to 2025-10-05 - ✅ No invalid dates - ✅ No future dates ✅ NUMERIC FIELD VALIDATION - Price fields: 1,234 validated (all Decimal) - Quantity fields: 1,234 validated - P&L fields: 1,234 validated - ✅ No invalid numeric values - ✅ No extreme outliers ✅ ACCOUNT ID CONSISTENCY - Account ID: U1234567 - Format: Valid (U + 8 digits) - ✅ Consistent across all sections ✅ BUSINESS LOGIC VALIDATION P&L Calculations: - Validated: 1,234 trades - Matches: 1,232 (99.8%) - Discrepancies: 2 (within rounding tolerance <$0.01) Position Consistency: - Total positions: 45 - Total quantity: 12,345 shares - ✅ All positions have valid quantities Cash Balances: - USD: $234,567.89 - Total: $234,567.89 - ✅ Cash balance is reasonable ✅ DATA COMPLETENESS Required Fields: - Symbol: 100% present - Quantity: 100% present - TradePrice: 100% present - TradeDate: 100% present - ✅ No missing critical data Optional Fields: - Commission: 98% present (28 missing, acceptable) - IBOrderID: 100% present ✅ BOND-SPECIFIC VALIDATION Bond Positions: 12 found - CUSIP format: ✅ All valid (9 characters) - Maturity dates: 10 present, 2 missing (acceptable) - ✅ Future maturity dates (2028-2030) ⚠️ WARNINGS (Non-Critical) 1. Commission Missing (28 trades) - Trades: #145, #267, #389, ... (25 more) - Impact: Cost analysis may be incomplete - Severity: LOW - Action: Review with broker if needed 2. Bond Maturity Missing (2 positions) - Symbols: 912810XX, 912810YY - Impact: YTM calculation not possible - Severity: MEDIUM - Action: Manually add maturity dates === VALIDATION SUMMARY === Status: ✅ PASSED WITH WARNINGS Critical Checks: 8/8 passed (100%) - File Structure: ✅ - Section Detection: ✅ - Date Format: ✅ - Numeric Fields: ✅ - Account ID: ✅ - Business Logic: ✅ - Data Completeness: ✅ - Bond Validation: ✅ Warnings: 2 non-critical issues - Missing commission data (28 trades, 2%) - Missing maturity dates (2 bonds, 17%) Data Quality Score: 96/100 Recommended Actions: 1. ✅ Safe to use for analysis 2. 💡 Review missing commission data if cost analysis needed 3. 💡 Add maturity dates for complete bond analysis === DETAILED ERROR LOG === (No critical errors found) === DATA STATISTICS === Trades: - Total: 1,234 - Date range: 186 days - Avg per day: 6.6 trades - Symbols: 67 unique Positions: - Total: 45 - Asset types: Stocks (33), Bonds (12) - Total value: $1,234,567 (estimated) Cash: - USD: $234,567.89 - Other currencies: None Date Coverage: - Start: 2025-01-01 - End: 2025-10-05 - Days: 186 - Trading days: 124 (estimated) === VALIDATION COMPLETE === File ready for analysis ✅ ``` ### Validation Levels **Quick Validation** (default): - File structure - Section detection - Date formats - Account ID consistency **Full Validation** (if $ARGUMENTS contains `--full`): - All quick checks - Business logic validation - P&L calculations - Statistical analysis - Outlier detection **Strict Validation** (if $ARGUMENTS contains `--strict`): - Zero tolerance for warnings - All fields must be present - Perfect P&L matching required ### Error Reporting **Critical Errors** (block analysis): - File not found - Corrupted CSV structure - Invalid account ID - Missing required sections - Invalid date formats **Warnings** (allow analysis with caution): - Missing optional fields - Minor P&L discrepancies (< $0.01) - Missing maturity dates for some bonds ### Examples ``` /validate-data /validate-data --latest /validate-data data/raw/U1234567_2025-01-01_2025-10-05.csv /validate-data --full /validate-data --strict ``` Run this command before analysis to ensure data quality and identify potential issues early.

Latest Blog Posts

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/knishioka/ib-sec-mcp'

If you have feedback or need assistance with the MCP directory API, please join our Discord server