AusLaw MCP

MIT License


auslaw-mcp

AGENTS.md•7.62 kB

# AI Agent Instructions for AusLaw MCP

This document provides guidance for AI agents (Claude Code, Cursor, etc.) working on this project.

## Project Overview

**AusLaw MCP** is a Model Context Protocol (MCP) server that searches Australian legal databases (AustLII, jade.io) and retrieves full-text judgements with citation support. It's designed for legal research workflows requiring primary source authorities.

**Primary users**: Legal researchers, law students, lawyers conducting case research

**Key requirement**: Always return the most authoritative, relevant version of legal materials

## Architecture

```
src/
├── index.ts                 # MCP server setup & tool registration
├── services/
│   ├── austlii.ts           # AustLII search integration
│   └── fetcher.ts           # Document text retrieval (HTML/PDF/OCR)
├── utils/
│   └── formatter.ts         # Result formatting (JSON/text/markdown/html)
└── test/
    └── scenarios.test.ts    # Real-world integration tests
```

## Core Principles

### 1. Primary Sources Only

- **NEVER** return journal articles, commentary, or secondary sources
- **ALWAYS** filter URLs containing `/journals/`
- Focus: Cases from `/cases/` and legislation from `/legis/`

### 2. Citation Accuracy

- Extract and preserve neutral citations: `[2025] HCA 26`
- Preserve paragraph numbers in `[N]` format
- Future: Extract page numbers for reported citations

### 3. Search Quality

- ✅ **FIXED**: Intelligent sorting now returns the actual case being searched for
- **Implementation**: Auto-detects case name queries vs topic searches
  - Case names ("X v Y", "Re X", citations) → relevance sorting
  - Topics ("negligence duty of care") → date sorting for recent cases
- **Configuration**: `sortBy` parameter supports "auto" (default), "relevance", "date"

### 4. Real-World Testing

- Tests hit live AustLII API (non-deterministic)
- Validate with actual legal queries (e.g., "negligence duty of care")
- Test scenarios in `src/test/scenarios.test.ts` must pass
- 14 test scenarios covering search quality, relevance, and sorting modes

## Development Guidelines

### When Adding Features

1. **Check existing issues**: See GitHub Issues for planned work
2. **Update tests**: Add test scenarios for new functionality
3. **Maintain filtering**: Ensure journal articles remain excluded
4. **Preserve structure**: Keep paragraph numbers intact in text extraction
5. **Update docs**: Modify README.md and ROADMAP.md as needed

### Code Style

- **TypeScript strict mode**: All code must type-check with `npm run build`
- **Error handling**: Wrap network calls in try/catch with descriptive errors
- **Interfaces first**: Define TypeScript interfaces before implementation
- **No magic strings**: Use enums/constants for repeated values

### Testing Requirements

Every PR must include:

- ✅ TypeScript compilation passes (`npm run build`)
- ✅ All tests pass (`npm test`)
- ✅ New tests for new features
- ✅ Tests validate real behaviour (not static mocks)

### Search Implementation Notes

**Current AustLII search** (`src/services/austlii.ts`):

- Uses `https://classic.austlii.edu.au/cgi-bin/sinosrch.cgi`
- Parameters: `method=boolean`, `query=...`, `meta=/austlii`, `view=date|relevance`
- Parses `<ol><li>` result structure with Cheerio
- **Smart query detection**:
  - `isCaseNameQuery()`: Detects "X v Y", "Re X", citation patterns, quoted strings
  - `determineSortMode()`: Auto-selects appropriate sorting
  - `boostTitleMatches()`: Re-ranks results by title match score for case name queries
- **Configurable sorting**: Explicit control via `sortBy` parameter when needed

**Document fetching** (`src/services/fetcher.ts`):

- Handles HTML, PDF, and OCR fallback (Tesseract)
- Extracts text while preserving `[N]` paragraph markers
- **Limitation**: Page numbers from reported judgements not extracted

## Common Tasks

### Adding a New Search Source

```typescript
// 1. Create new service file
// src/services/jade.ts
export async function searchJade(
  query: string,
  options: SearchOptions
): Promise<SearchResult[]> {
  // Implementation
}

// 2. Update search to merge sources
const [austliiResults, jadeResults] = await Promise.all([
  searchAustLii(query, options),
  searchJade(query, options),
]);

// 3. Deduplicate by citation
const merged = deduplicateResults([...austliiResults, ...jadeResults]);

// 4. Add tests
it("should merge results from multiple sources", async () => {
  // Test implementation
});
```

### Improving Text Extraction

```typescript
import * as cheerio from "cheerio";

// When adding new structural preservation:
function extractTextFromHtml(html: string): string {
  const $ = cheerio.load(html);

  // Preserve paragraph numbers
  $('[class*="para"]').each((_, el) => {
    const paraNum = $(el).attr('data-para-num');
    if (paraNum) {
      $(el).prepend(`[${paraNum}] `);
    }
  });

  // Extract preserving structure
  return $('body').text();
}
```

### Adding Search Parameters

**Example: The `sortBy` parameter (already implemented)**

```typescript
// 1. Update SearchOptions interface in src/services/austlii.ts
export interface SearchOptions {
  jurisdiction?: "cth" | "vic" | "federal" | "other";
  limit?: number;
  type: "case" | "legislation";
  sortBy?: "relevance" | "date" | "auto"; // ✅ IMPLEMENTED
}

// 2. Update Zod schema in src/index.ts
const sortByEnum = z.enum(["relevance", "date", "auto"]).default("auto");
const searchCasesShape = {
  query: z.string().min(1),
  jurisdiction: jurisdictionEnum.optional(),
  limit: z.number().int().min(1).max(50).optional(),
  sortBy: sortByEnum.optional(), // ✅ IMPLEMENTED
};

// 3. Implement smart detection in src/services/austlii.ts
const sortMode = determineSortMode(query, options);
if (sortMode === "relevance") {
  searchUrl.searchParams.set("view", "relevance");
} else {
  searchUrl.searchParams.set("view", "date");
}

// 4. Add post-processing for better results
if (sortMode === "relevance" && isCaseNameQuery(query)) {
  finalResults = boostTitleMatches(results, query);
}
```

## Known Issues & Workarounds

### ~~Issue: Search returns citing cases, not target case~~ ✅ FIXED

**Status**: Resolved in this PR

**Solution**: Implemented intelligent sorting with auto-detection and title matching

**Details**: See Phase 1 implementation in ROADMAP.md

### Issue: Page numbers lost in extraction

**Workaround**: Use paragraph numbers for pinpoints

**Fix planned**: Parse page markers from reported judgement HTML

### Issue: No deduplication across sources

**Workaround**: Currently only one source (AustLII)

**Fix planned**: Citation-based deduplication in Phase 2

## Resources

- **AustLII Search Help**: https://www.austlii.edu.au/austlii/help/search.html
- **MCP Specification**: https://modelcontextprotocol.io/
- **Project Roadmap**: `docs/ROADMAP.md`
- **Test Coverage**: Run `npm test` to see real-world scenarios

## Critical Reminders

⚠️ **NEVER commit without building**: Run `npm run build` before committing

⚠️ **NEVER skip tests**: All tests must pass before pushing

⚠️ **ALWAYS preserve paragraph numbers**: They're critical for citations

⚠️ **NEVER include journal articles**: Primary sources only

## Getting Help

- Check `docs/ROADMAP.md` for planned features
- Review existing issues on GitHub
- Run tests to understand expected behaviour
- Check test scenarios for usage examples

---

**Remember**: This is a legal research tool. Accuracy and authority of sources are paramount. When in doubt, prioritise returning the most authoritative version of a judgement over returning more results.
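
## Appendix: Query Detection Sketch

The smart query detection and primary-source filtering described above could look roughly like the sketch below. It is an illustration only: the regular expressions, the `SortBy`/`SortMode` type names, the `isPrimarySource()` helper, and the example URLs are assumptions made for this example, and the real `isCaseNameQuery()` and `determineSortMode()` in `src/services/austlii.ts` may use different patterns and signatures.

```typescript
// Hypothetical sketch; not the project's actual implementation.
type SortBy = "relevance" | "date" | "auto";
type SortMode = "relevance" | "date";

// Assumed patterns covering the cases named in the notes above.
const PARTY_PATTERN = /\S\s+v\.?\s+\S/i; // "X v Y"
const RE_PATTERN = /^\s*(?:in\s+)?re\s+\S/i; // "Re X"
const NEUTRAL_CITATION = /\[\d{4}\]\s+[A-Za-z]+\s+\d+/; // "[2025] HCA 26"
const QUOTED_STRING = /"[^"]+"/; // "exact phrase"

// True when the query looks like a request for one specific case.
function isCaseNameQuery(query: string): boolean {
  return [PARTY_PATTERN, RE_PATTERN, NEUTRAL_CITATION, QUOTED_STRING].some(
    (pattern) => pattern.test(query)
  );
}

// An explicit sortBy wins; "auto" falls back to query detection.
function determineSortMode(
  query: string,
  options: { sortBy?: SortBy } = {}
): SortMode {
  const sortBy = options.sortBy ?? "auto";
  if (sortBy !== "auto") {
    return sortBy;
  }
  return isCaseNameQuery(query) ? "relevance" : "date";
}

// Hypothetical helper: reject journals, accept only cases and legislation.
function isPrimarySource(url: string): boolean {
  if (url.includes("/journals/")) {
    return false;
  }
  return url.includes("/cases/") || url.includes("/legis/");
}

console.log(determineSortMode("Mabo v Queensland")); // "relevance"
console.log(determineSortMode("negligence duty of care")); // "date"
console.log(determineSortMode("Mabo v Queensland", { sortBy: "date" })); // "date"
console.log(isPrimarySource("https://example.org/au/journals/X/2020/1.html")); // false
```

Keeping the explicit `sortBy` check ahead of detection means a caller can always override the heuristic, which matters because pattern-based detection will misclassify some queries.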
