Skip to main content
Glama
TRAINING_DATA_SOURCES.md3.02 kB
# Training Data Sources & Import Guide ## Pre-Loaded Training Data The MCP server now includes pre-loaded training data extracted from: ### 1. Intigriti CSRF Guide **Source:** https://www.intigriti.com/researchers/blog/hacking-tools/csrf-a-complete-guide-to-exploiting-advanced-csrf-vulnerabilities **Techniques Included:** - Basic CSRF (form-based attacks) - Content-Type bypass (text/plain for JSON APIs) - Method-based CSRF (HTTP method changes) - Anti-CSRF token bypass (4 methods) - Referrer-based CSRF bypass **Import:** Already included in `training.import_all` ### 2. PortSwigger Research **Sources:** - Top 10 Web Hacking Techniques of 2023 - PortSwigger Academy labs - Research articles **Patterns Included:** - XSS (Reflected, DOM-based) - SQL Injection (Basic, Time-based) - Advanced exploitation techniques ### 3. InfoSec Writeups **Source:** https://infosecwriteups.com/ **Patterns Included:** - User Registration vulnerabilities - Email/Username enumeration - Weak input validation ### 4. Google Dorking **Source:** Medium article on Google Dorking **Patterns Included:** - Filetype dorking - InURL dorking - InTitle dorking - Custom dork crafting ### 5. HackerOne Writeups **Pattern:** GET-based CSRF leading to PII leak **Example from Image:** - Technique: GET request CSRF - Impact: One-click customer data leak - Pattern: `/?corelationID=xxx&email=victim@mail.com&accept=true` ## How to Import ### Quick Import (All Sources) ``` training.import_all ``` ### Import Specific Sources ``` training.import_all { "sources": ["csrf", "xss", "sqli"] } ``` ### Extract from Writeup Text ``` training.extract_from_writeup { "writeupText": "Full writeup text here...", "vulnerabilityType": "CSRF", "source": "hackerone" } ``` ## Training Data Structure Each training entry contains: - `source`: Where it came from (intigriti, portswigger, etc.) - `sourceId`: Unique identifier - `vulnerabilityType`: Type of vulnerability - `targetPattern`: URL/endpoint pattern to match - `payloadPattern`: Payload pattern that works - `successPattern`: Response pattern indicating success - `failurePattern`: Response pattern indicating failure - `contextData`: Additional context and examples - `score`: Severity score (0-10) ## Pattern Matching The system uses simple but effective pattern matching: 1. **Target Matching**: Checks if target URL matches learned patterns 2. **Payload Matching**: Compares payloads against learned patterns 3. **Response Matching**: Looks for success/failure indicators 4. **Confidence Scoring**: Returns confidence (0-1) for matches ## Continuous Learning As you: - Import more training data - Run tests (automatically saved) - Extract from writeups The pattern matcher improves its detection accuracy! ## No LLM Required This system uses: - ✅ Pattern matching (fast, local) - ✅ Statistical learning - ✅ Rule-based systems - ❌ No large language models - ❌ No API costs - ❌ No training needed Just import data and start using it!

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/telmon95/VulneraMCP'

If you have feedback or need assistance with the MCP directory API, please join our Discord server