A comprehensive legal research tool for accessing and analyzing Polish legal acts from the Sejm API, covering both Dziennik Ustaw (Official Journal of Laws) and Monitor Polski (Polish Monitor).
Search & Discovery - Advanced multi-criteria search by year, title, keywords, document type, effectiveness dates, and active status; browse complete annual collections by publisher with pagination support
Document Analysis - Retrieve full metadata, content (PDF/HTML), hierarchical table of contents, legal relationships (references and amendments), and document lifecycle tracking
Reference Data - Access legal keywords, publishers, document statuses (active/repealed/consolidated), document types (laws/regulations/ordinances), and involved institutions (ministries, authorities, organizations)
Date Utilities - Get current date in legal format (YYYY-MM-DD) and calculate date offsets (days/months/years) for legal periods and deadlines
Utilizes Git version control for repository management and development workflow
Hosted on GitHub for source code management, distribution, and collaborative development
Built in Python with the FastMCP framework for the MCP server implementation
Law Scrapper MCP
A comprehensive Model Context Protocol (MCP) server for accessing and analyzing Polish legal acts from the Sejm API, enabling AI-powered legal research and document analysis.
Features
Comprehensive legal act access - Full access to Polish legal acts from Dziennik Ustaw (DU) and Monitor Polski (MP)
Advanced search and filtering - Multi-criteria search by date, type, keywords, publisher, and status
Result Store with chained filtering - Store search results and filter with regex, type/status/year match, date ranges, sorting
Document Store pattern - Load acts into memory for efficient section-level navigation and search
Detailed document analysis - Metadata, structure, references, and content retrieval
Content processing - Automatic PDF-to-text and HTML-to-Markdown conversion
Date calculations - Specialized date utilities for legal document analysis
System metadata - Keywords, statuses, document types, and institution data
FastMCP integration - Built with FastMCP framework, flexible transport options
Async HTTP client - Efficient httpx client with retry logic and connection pooling
TTL caching - Intelligent response caching with configurable TTL
Structured logging - JSON and text log formats for easy debugging
Docker support - Containerized deployment with docker-compose
Comprehensive documentation - Examples and clear parameter descriptions
Requirements
Python: 3.13 or higher
Package manager: uv (recommended) or pip
Internet connection: Required for accessing Sejm API endpoints
MCP-compatible tool: Cursor IDE, Claude Code, or other MCP clients
Installation
Using uv (recommended)
Using pip
Using uvx (no installation required)
For quick testing without cloning the repository:
Quick start
STDIO transport (default)
STDIO is the default transport for MCP communication. Start the server and connect from your MCP client:
Configure in your MCP client (e.g., Cursor .cursor/mcp.json):
For Claude Code:
HTTP transport (streamable-http)
Run the server on HTTP with streamable-http transport:
Configure in your MCP client:
Note: The URL must include the /mcp path. FastMCP exposes the streamable-http endpoint at /mcp, not at the root. Using http://localhost:7683 without /mcp results in 404 (Not Found).
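A small helper can guard against the missing-path mistake described in the note. This is only a sketch: `mcp_endpoint` is a hypothetical name, not part of the project, and it only appends `/mcp` when the path is absent.

```python
from urllib.parse import urlsplit, urlunsplit

MCP_PATH = "/mcp"  # endpoint path stated in the note above


def mcp_endpoint(base_url: str) -> str:
    """Return base_url with the /mcp path appended if it is missing.

    Hypothetical helper for illustration; not part of Law Scrapper MCP.
    """
    parts = urlsplit(base_url)
    path = parts.path.rstrip("/")
    if not path.endswith(MCP_PATH):
        path += MCP_PATH
    return urlunsplit((parts.scheme, parts.netloc, path, parts.query, parts.fragment))


print(mcp_endpoint("http://localhost:7683"))       # http://localhost:7683/mcp
print(mcp_endpoint("http://localhost:7683/mcp/"))  # http://localhost:7683/mcp
```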
Docker
Build and run with Docker:
Or use docker-compose:
Configuration
All settings are configured via environment variables with the LAW_MCP_ prefix:
| Variable | Default | Description |
| | | Transport: STDIO (default) or streamable-http |
| | | HTTP server host (when using streamable-http) |
| | 7683 | HTTP server port (when using streamable-http) |
| | | HTTP request timeout in seconds |
| | | Maximum concurrent API requests |
| | | Maximum API request retries |
| | | Metadata cache TTL (24 hours) |
| | | Search results cache TTL (10 minutes) |
| | | Browse results cache TTL (1 hour) |
| | | Act details cache TTL (1 hour) |
| | | Changes tracking cache TTL (5 minutes) |
| | | Maximum cache entries |
| LAW_MCP_DOC_STORE_MAX_DOCUMENTS | 10 | Maximum documents in Document Store |
| LAW_MCP_DOC_STORE_MAX_SIZE_BYTES | | Maximum Document Store size (5 MB) |
| LAW_MCP_DOC_STORE_TTL | | Document Store TTL (2 hours) |
| | | Failures before circuit breaker opens |
| | | Seconds before trying recovery |
| | | Test calls in half-open state |
| | | Log level: DEBUG, INFO, WARNING, ERROR |
| | | Log format: JSON or text |
Example environment configuration:
Tools reference
Law Scrapper MCP provides 13 tools for legal research and analysis:
1. get_system_metadata(category)
Retrieve system metadata for filtering and searching legal acts.
Parameters:
category (string, default: "all") - Metadata category: "keywords", "publishers", "statuses", "types", "institutions", or "all"
Returns: Keywords, publishers, document types, statuses, and institutions available in the system
Examples:
2. search_legal_acts(publisher, year, keywords, detail_level, status, type)
Search for legal acts with advanced filtering options.
Parameters:
publisher (string) - Publisher code: "DU" (Dziennik Ustaw) or "MP" (Monitor Polski)
year (integer) - Publication year (e.g., 2024)
keywords (string) - Search keywords (AND logic - use multiple searches for OR)
detail_level (string, default: "standard") - Response detail: "minimal", "standard", or "full"
status (string, optional) - Document status filter
type (string, optional) - Document type filter
Returns: List of matching legal acts with metadata
Search note: Multiple keywords use AND logic. Search one keyword at a time for OR behavior.
Examples:
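Because multiple keywords are combined with AND, OR behavior has to be assembled on the client side. The sketch below shows one way to do that: run one search per keyword and merge the results by ELI. `search_any` and `fake_search` are hypothetical names, the result shape (a dict with an "eli" key) is an assumption, and the data is made up; in practice `search_one` would wrap a real `search_legal_acts` call.

```python
from typing import Callable, Iterable

Act = dict  # assumed shape: a plain dict with at least an "eli" key


def search_any(search_one: Callable[[str], list[Act]], keywords: Iterable[str]) -> list[Act]:
    """Emulate OR: one search per keyword, merged and de-duplicated by ELI."""
    seen: set[str] = set()
    merged: list[Act] = []
    for keyword in keywords:
        for act in search_one(keyword):
            if act["eli"] not in seen:
                seen.add(act["eli"])
                merged.append(act)
    return merged


# Stand-in for a real search_legal_acts call; the index below is invented.
FAKE_INDEX = {
    "podatek": [{"eli": "DU/2024/1"}, {"eli": "DU/2024/2"}],
    "VAT": [{"eli": "DU/2024/2"}, {"eli": "DU/2024/3"}],
}


def fake_search(keyword: str) -> list[Act]:
    return FAKE_INDEX.get(keyword, [])


print([act["eli"] for act in search_any(fake_search, ["podatek", "VAT"])])
# ['DU/2024/1', 'DU/2024/2', 'DU/2024/3']
```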
3. browse_acts(publisher, year, detail_level)
Browse all legal acts published in a specific year by publisher.
Parameters:
publisher (string) - Publisher code: "DU" or "MP"
year (integer) - Publication year
detail_level (string, default: "standard") - Response detail: "minimal", "standard", or "full"
Returns: Complete list of acts published in the specified year
Examples:
4. filter_results(result_set_id, pattern, field, type_equals, ...)
Filter and narrow down previously retrieved search/browse/changes results.
Parameters:
result_set_id (string) - Result set ID from a previous search/browse/changes call (e.g., "rs_1")
pattern (string, optional) - Regex pattern for text search (supports OR: "podatek|VAT|akcyza")
field (string, default: "title") - Field to search: "title", "eli", "status", "type", "publisher"
type_equals (string, optional) - Exact match on document type (e.g., "Ustawa", "Rozporządzenie")
status_equals (string, optional) - Exact match on status (e.g., "akt obowiązujący", "akt uchylony")
year_equals (integer, optional) - Exact match on publication year
date_field (string, optional) - Date field for range filter: "promulgation_date" or "effective_date"
date_from / date_to (string, optional) - Date range (YYYY-MM-DD)
sort_by (string, optional) - Sort field: "title", "year", "pos", "promulgation_date", etc.
sort_desc (boolean, default: false) - Sort descending
limit (integer, optional) - Maximum results to return
Returns: Filtered results with a new result_set_id for chained filtering
Examples:
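To make the filter semantics concrete, here is a local emulation of a few of the parameters (pattern, field, type_equals, sort_by, sort_desc, limit) over a list of dicts. It is a sketch of the described behavior, not the server's code: `filter_acts` is a hypothetical function, the sample acts are invented, and case-sensitive matching is an assumption.

```python
import re


def filter_acts(acts, pattern=None, field="title", type_equals=None,
                sort_by=None, sort_desc=False, limit=None):
    """Apply regex, exact-type, sort and limit steps to a list of act dicts."""
    regex = re.compile(pattern) if pattern else None
    out = [
        act for act in acts
        if (regex is None or regex.search(str(act.get(field, ""))))
        and (type_equals is None or act.get("type") == type_equals)
    ]
    if sort_by:
        out.sort(key=lambda act: act.get(sort_by), reverse=sort_desc)
    return out[:limit] if limit is not None else out


# Invented sample data standing in for a stored result set.
acts = [
    {"title": "Ustawa o podatku VAT", "type": "Ustawa", "year": 2023},
    {"title": "Rozporządzenie w sprawie akcyzy", "type": "Rozporządzenie", "year": 2024},
    {"title": "Ustawa o ochronie danych", "type": "Ustawa", "year": 2024},
]

# Chained filtering: each step works on the output of the previous one.
step1 = filter_acts(acts, pattern="VAT|akcyz")
step2 = filter_acts(step1, sort_by="year", sort_desc=True, limit=1)
print(len(step1), step2[0]["title"])
```

In the real tool each step returns a new result_set_id rather than a list, so the chain is expressed by passing that ID to the next call.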
5. get_act_details(eli, load_content, detail_level)
Retrieve detailed information about a specific legal act and optionally load its content.
Parameters:
eli (string) - Act identifier in format "PUBLISHER/YEAR/NUMBER" (e.g., "DU/2024/1")
load_content (boolean, default: false) - Load act content into Document Store for section reading
detail_level (string, default: "standard") - Response detail: "minimal", "standard", or "full"
Returns: Act metadata (title, publication date, status, type, etc.), plus the table of contents when load_content=true
Examples:
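The "PUBLISHER/YEAR/NUMBER" identifier is easy to validate before calling the tool. The following is a minimal sketch under the assumption that only the "DU" and "MP" publisher codes mentioned in this document are valid; `Eli` and `parse_eli` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Eli:
    publisher: str
    year: int
    number: int


def parse_eli(eli: str) -> Eli:
    """Split "PUBLISHER/YEAR/NUMBER" into typed parts; raise ValueError on bad input."""
    parts = eli.strip().split("/")
    if len(parts) != 3:
        raise ValueError(f"expected PUBLISHER/YEAR/NUMBER, got {eli!r}")
    publisher, year, number = parts
    if publisher not in {"DU", "MP"}:  # assumption: only these two codes exist
        raise ValueError(f"unknown publisher code {publisher!r}")
    return Eli(publisher, int(year), int(number))


print(parse_eli("DU/2024/1"))
```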
6. read_act_content(eli, section)
Read content from a specific section of a loaded legal act.
Parameters:
eli (string) - Act identifier (must be loaded first via get_act_details with load_content=true)
section (string) - Section to read (e.g., "Art. 1", "Chapter 2", "Preamble")
Returns: Content of the requested section
Workflow note: Call get_act_details(eli="...", load_content=true) first, then use this tool.
Examples:
7. search_in_act(eli, query)
Search for specific terms within a loaded legal act.
Parameters:
eli (string) - Act identifier (must be loaded first via get_act_details with load_content=true)
query (string) - Search term or phrase
Returns: Matching sections with context and location
Examples:
8. analyze_act_relationships(eli, relationship_type)
Analyze legal relationships and references of an act (amendments, references, etc.).
Parameters:
eli (string) - Act identifier
relationship_type (string, default: "all") - Type: "amends", "amended_by", "references", "referenced_by", or "all"
Returns: List of related acts and their relationships
Examples:
9. track_legal_changes(date_from, date_to, publisher, keywords)
Track legal changes and new acts within a date range.
Parameters:
date_from (string) - Start date (YYYY-MM-DD format)
date_to (string) - End date (YYYY-MM-DD format)
publisher (string, optional) - Filter by publisher: "DU" or "MP"
keywords (string, optional) - Filter by keywords
Returns: Legal acts published in the date range
Examples:
10. calculate_legal_date(days, months, years, base_date)
Calculate legal dates with intuitive sign convention.
Parameters:
days (integer, default: 0) - Days offset (+future, -past)
months (integer, default: 0) - Months offset (+future, -past)
years (integer, default: 0) - Years offset (+future, -past)
base_date (string, optional) - Base date (YYYY, YYYY-MM, or YYYY-MM-DD format, defaults to today)
Returns: Calculated date and relative description
Sign convention: Positive = future, Negative = past
Examples:
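The sign convention can be illustrated with the standard library alone. This sketch is not the server's implementation: `parse_base` and `offset_date` are hypothetical names, and two choices are assumptions, namely that a partial base date defaults to the first month/day and that a month or year shift clamps to the last valid day of the target month.

```python
import calendar
from datetime import date, timedelta


def parse_base(text: str) -> date:
    """Accept YYYY, YYYY-MM or YYYY-MM-DD; missing parts default to 1 (assumption)."""
    parts = [int(p) for p in text.split("-")]
    parts += [1] * (3 - len(parts))
    return date(*parts)


def offset_date(base: date, days: int = 0, months: int = 0, years: int = 0) -> date:
    """Shift by years and months first (clamping the day), then by days."""
    total = base.year * 12 + (base.month - 1) + years * 12 + months
    year, month0 = divmod(total, 12)
    month = month0 + 1
    day = min(base.day, calendar.monthrange(year, month)[1])
    return date(year, month, day) + timedelta(days=days)


print(offset_date(date(2024, 1, 31), months=1))   # 2024-02-29 (day clamped)
print(offset_date(date(2024, 3, 15), days=-30))   # 2024-02-14 (negative = past)
print(offset_date(parse_base("2024"), years=-2))  # 2022-01-01
```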
11. compare_acts(eli_a, eli_b)
Compare metadata of two legal acts.
Parameters:
eli_a (string) - ELI identifier of the first act (e.g., "DU/2024/1692")
eli_b (string) - ELI identifier of the second act (e.g., "DU/2024/1716")
Returns: Comparison of titles, types, statuses, dates, keywords overlap and differences
Examples:
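The kind of comparison described above (field differences plus keyword overlap) can be sketched with plain dicts and sets. `compare_metadata` is a hypothetical function, the field names are assumptions, and the two sample acts are invented rather than taken from the Sejm API.

```python
def compare_metadata(act_a: dict, act_b: dict, fields=("title", "type", "status")) -> dict:
    """Report differing fields and the overlap between two keyword lists."""
    differing = {
        name: (act_a.get(name), act_b.get(name))
        for name in fields
        if act_a.get(name) != act_b.get(name)
    }
    kw_a = set(act_a.get("keywords", []))
    kw_b = set(act_b.get("keywords", []))
    return {
        "differing_fields": differing,
        "shared_keywords": sorted(kw_a & kw_b),
        "only_a": sorted(kw_a - kw_b),
        "only_b": sorted(kw_b - kw_a),
    }


# Invented metadata for two acts.
act_a = {"title": "Act A", "type": "Ustawa", "status": "akt obowiązujący",
         "keywords": ["podatki", "VAT"]}
act_b = {"title": "Act B", "type": "Ustawa", "status": "akt uchylony",
         "keywords": ["VAT", "akcyza"]}

result = compare_metadata(act_a, act_b)
print(result["shared_keywords"], sorted(result["differing_fields"]))
```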
12. list_result_sets()
Display active result sets stored in memory.
Returns: List of result sets with IDs, query summaries, counts, and creation times
13. list_loaded_documents()
Display documents loaded into the Document Store.
Returns: List of loaded documents with ELIs, sizes, section counts, and timestamps
Document Store workflow
The Document Store pattern enables efficient content navigation and search within legal acts:
Workflow steps
Load an act - Call get_act_details(eli="DU/2024/1", load_content=true) to load the act into the Document Store
Read sections - Use read_act_content(eli="DU/2024/1", section="Art. 1") to read specific sections
Search within act - Use search_in_act(eli="DU/2024/1", query="penalty") to find terms
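The load-then-read ordering above can be modeled with a small in-memory store. This is a simplified sketch, not the project's Document Store: the `DocumentStore` class, its method names, the oldest-first eviction policy, and the case-insensitive substring search are all assumptions made for illustration.

```python
class DocumentStore:
    """Toy in-memory store: acts must be loaded before they can be read or searched."""

    def __init__(self, max_documents: int = 10):
        self.max_documents = max_documents
        self._docs: dict[str, dict[str, str]] = {}

    def load(self, eli: str, sections: dict[str, str]) -> None:
        # Assumed policy: evict the oldest loaded act when the store is full.
        if eli not in self._docs and len(self._docs) >= self.max_documents:
            del self._docs[next(iter(self._docs))]
        self._docs[eli] = dict(sections)

    def _require(self, eli: str) -> dict[str, str]:
        if eli not in self._docs:
            raise LookupError(f"{eli} is not loaded; load it first")
        return self._docs[eli]

    def read(self, eli: str, section: str) -> str:
        return self._require(eli)[section]

    def search(self, eli: str, query: str) -> list[str]:
        needle = query.lower()
        return [name for name, text in self._require(eli).items() if needle in text.lower()]


store = DocumentStore(max_documents=1)
store.load("DU/2024/1", {"Art. 1": "General provisions.", "Art. 2": "A penalty applies."})
print(store.read("DU/2024/1", "Art. 1"))
print(store.search("DU/2024/1", "penalty"))
```

Reading or searching an act that was never loaded (or has been evicted) raises `LookupError` here, which mirrors the requirement to call get_act_details with load_content=true first.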
Benefits
Efficient memory usage (configurable max documents and TTL)
Fast section-level navigation without refetching
Search within loaded acts without API calls
Automatic content processing (PDF→text, HTML→Markdown)
Configuration
LAW_MCP_DOC_STORE_MAX_DOCUMENTS - How many acts to keep in memory (default: 10)
LAW_MCP_DOC_STORE_MAX_SIZE_BYTES - Maximum memory usage (default: 5 MB)
LAW_MCP_DOC_STORE_TTL - How long to keep acts in memory (default: 2 hours)
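As an illustration of how these three variables might be read, the sketch below parses them from an environment mapping. The variable names come from this document; everything else is an assumption, including the `DocStoreSettings` class, the TTL being expressed in seconds, and 5 MB meaning 5 * 1024 * 1024 bytes.

```python
import os
from collections.abc import Mapping
from dataclasses import dataclass


@dataclass(frozen=True)
class DocStoreSettings:
    max_documents: int
    max_size_bytes: int
    ttl_seconds: int


def load_doc_store_settings(env: Mapping[str, str] = os.environ) -> DocStoreSettings:
    """Read Document Store limits, falling back to the documented defaults."""
    return DocStoreSettings(
        max_documents=int(env.get("LAW_MCP_DOC_STORE_MAX_DOCUMENTS", 10)),
        # Assumption: "5 MB" is 5 * 1024 * 1024 bytes.
        max_size_bytes=int(env.get("LAW_MCP_DOC_STORE_MAX_SIZE_BYTES", 5 * 1024 * 1024)),
        # Assumption: the TTL is given in seconds, so 2 hours is 7200.
        ttl_seconds=int(env.get("LAW_MCP_DOC_STORE_TTL", 2 * 60 * 60)),
    )


print(load_doc_store_settings({"LAW_MCP_DOC_STORE_MAX_DOCUMENTS": "3"}))
```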
Project structure
Docker
Dockerfile
The included Dockerfile builds a containerized Law Scrapper MCP server:
Build and run:
docker-compose.yml
Deployment with docker-compose:
Migration guide (v1 to v2)
If upgrading from v1.0.2, note these breaking changes:
| v1.0.2 (old) | v2.0.0 (new) | Notes |
| | | Call with no parameters for current date |
| | | Use intuitive +future/-past sign convention |
| | | Consolidated into one tool |
| | | Consolidated into one tool |
| | | Consolidated into one tool |
| | | Consolidated into one tool |
| | | Consolidated into one tool |
| | N/A | Use |
| | | Enhanced with |
| | | Renamed for clarity |
| | | Added |
| | | Requires pre-loading with |
| | | TOC included in details response |
| | | Renamed for clarity |
| ELI format | Single string "DU/2024/1" | Changed from separate parameters |
| SSE transport | STDIO (default) | STDIO is default, HTTP via streamable-http |
| Port 7683 | Port 7683 | Same default HTTP port |
What's new in v2.3.0
3 new tools - compare_acts, list_result_sets, list_loaded_documents (total: 13 tools)
Circuit breaker - Protects against cascading failures when the Sejm API is unavailable
Centralized error handling - @handle_tool_errors decorator with error classification and full tracebacks
asyncio.Lock migration - All stores use asyncio.Lock for proper async compatibility
Default search limit - Search/browse return max 20 results by default to limit token usage
Health endpoint - /health for Docker deployments with streamable-http transport
Polish error messages - All exception messages in Polish for consistent user experience
Decision tree docstrings — "When to use" / "When NOT to use" for all tools
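For readers unfamiliar with the circuit breaker pattern mentioned in this list, here is a minimal synchronous sketch of the closed, open, and half-open states. It is not the project's implementation: the class and exception names are hypothetical, the default threshold and timeout are placeholders, the limit on half-open test calls is left out for brevity, and the real server is async.

```python
import time


class CircuitOpenError(RuntimeError):
    """Raised when a call is rejected because the breaker is open."""


class CircuitBreaker:
    def __init__(self, failure_threshold=5, recovery_timeout=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold  # placeholder default
        self.recovery_timeout = recovery_timeout    # placeholder default, in seconds
        self._clock = clock
        self._failures = 0
        self._opened_at = None  # None means the breaker is closed

    @property
    def state(self) -> str:
        if self._opened_at is None:
            return "closed"
        if self._clock() - self._opened_at >= self.recovery_timeout:
            return "half-open"
        return "open"

    def call(self, func, *args, **kwargs):
        if self.state == "open":
            raise CircuitOpenError("upstream API marked unavailable")
        try:
            result = func(*args, **kwargs)
        except Exception:
            self._failures += 1
            # A failed half-open probe, or too many failures, (re)opens the breaker.
            if self._opened_at is not None or self._failures >= self.failure_threshold:
                self._opened_at = self._clock()
            raise
        # Any success closes the breaker and resets the failure count.
        self._failures = 0
        self._opened_at = None
        return result
```

While the breaker is open, calls fail immediately instead of waiting on an unavailable API; after the recovery timeout a single successful probe closes it again.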
Development
Setup
Running tests
Code quality
The project follows FastMCP best practices:
Modular architecture - Separated concerns (models, client, services, tools)
Type hints - Full type annotation with Pydantic models
Async throughout - Async/await for all I/O operations
Comprehensive examples - Minimum 5 examples per tool
Tagged tools - Organized by category for easy discovery
Annotated parameters - Clear descriptions for all inputs
Structured logging - Configurable JSON/text formats
Running the server
Contributing
Fork the repository
Create your feature branch (git checkout -b feature/amazing-feature)
Commit your changes using Conventional Commits format
Add tests for new functionality
Ensure all tests pass and coverage is maintained
Push to the branch (git push origin feature/amazing-feature)
Open a Pull Request
Development guidelines
Follow FastMCP best practices for tool definitions
Include comprehensive examples and parameter descriptions
Add appropriate tags for tool categorization
Write async code throughout
Add tests for all new functionality
Update CHANGELOG.md with your changes
Use English for all code comments and documentation
License
This project is licensed under the MIT License. See the LICENSE file for details.
Author
Developed with help from:
And with models:
Legal disclaimer: This tool provides access to Polish legal documents for research purposes. Always consult with qualified legal professionals for legal advice and interpretation of laws.