search_within_teblig
Search within a specific Turkish communiqué (Tebliğ) by number using keyword or semantic queries. Supports Boolean operators (AND/OR/NOT) and natural language.
Instructions
Search within a specific communiqué's (Tebliğ) content using keyword or semantic search.
Tries article-based splitting first; if no articles found, falls back to chunk-based search.
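The two-stage strategy (article split first, chunk fallback) can be illustrated with a minimal sketch. This is a hypothetical `split_or_chunk` helper, not the server's actual `split_into_articles` implementation; the heading regex and chunk size are assumptions for illustration only:

```python
import re

def split_or_chunk(markdown: str, chunk_size: int = 2000):
    """Hypothetical illustration of the two-stage strategy: split on Turkish
    article headings ('MADDE 1 ...', 'Madde 2 ...') when present, otherwise
    fall back to fixed-size character chunks."""
    # Split just before each article heading, keeping the heading with its body.
    parts = re.split(r"(?im)^(?=madde\s+\d+)", markdown)
    articles = [p.strip() for p in parts if re.match(r"(?i)^madde\s+\d+", p.strip())]
    if articles:
        return ("articles", articles)
    chunks = [markdown[i:i + chunk_size] for i in range(0, len(markdown), chunk_size)]
    return ("chunks", chunks)
```

With article headings present, the first branch wins; plain running text falls through to fixed-size chunks.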
Modes:
- semantic=False (default): Keyword search with Boolean operators (AND/OR/NOT, uppercase required)
- semantic=True: Natural language semantic search using AI embeddings (requires OPENROUTER_API_KEY)

Keyword examples: "vergi AND muafiyet", '"katma değer"', "istisna OR muafiyet"
Semantic examples: "vergi muafiyeti koşulları", "KDV iade işlemleri"
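The Boolean keyword mode described above can be approximated with a small evaluator. The `matches_boolean_query` function below is a hypothetical sketch (the server's own matcher is not shown here); it handles flat queries with a single operator type and quoted phrases, without operator precedence:

```python
import shlex

def matches_boolean_query(text: str, query: str, case_sensitive: bool = False) -> bool:
    """Hypothetical evaluator for flat AND/OR/NOT queries like 'vergi AND muafiyet'.

    Operators must be uppercase; quoted phrases are kept together as one term.
    Only one operator type per query is supported (no mixing, no precedence).
    """
    haystack = text if case_sensitive else text.lower()

    def present(term: str) -> bool:
        return (term if case_sensitive else term.lower()) in haystack

    tokens = shlex.split(query)  # shlex keeps '"katma değer"' as a single token
    if "OR" in tokens:
        return any(present(t) for t in tokens if t != "OR")
    if "NOT" in tokens:
        i = tokens.index("NOT")
        required = [t for t in tokens[:i] if t != "AND"]
        excluded = tokens[i + 1:]
        return all(present(t) for t in required) and not any(present(t) for t in excluded)
    return all(present(t) for t in tokens if t != "AND")  # bare terms behave like AND
```

For example, `matches_boolean_query("vergi muafiyeti uygulanır", "vergi AND muafiyet")` matches, while a NOT clause excludes segments containing the negated term.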
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| mevzuat_no | Yes | The communiqué number to search within (e.g., '42331') | |
| keyword | Yes | Search query. For keyword mode: supports AND/OR/NOT operators (uppercase). For semantic mode: use natural language. | |
| mevzuat_tertip | No | Communiqué series from search results (e.g., '5') | 5 |
| case_sensitive | No | Whether to match case when searching. Only used in keyword mode. | False |
| max_results | No | Maximum number of matching segments to return (1-50) | 25 |
| semantic | No | True: semantic search (natural language query, requires OPENROUTER_API_KEY). False: keyword search (Boolean operators AND/OR/NOT). | False |
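Two illustrative argument payloads for this schema. The tool name and parameter names come from the table above; how the payload is sent depends on your MCP client, so only the argument dictionaries are shown:

```python
# Keyword-mode call: Boolean operators, case-insensitive, capped at 10 segments.
keyword_call = {
    "mevzuat_no": "42331",
    "keyword": "istisna OR muafiyet",
    "mevzuat_tertip": "5",
    "case_sensitive": False,
    "max_results": 10,
    "semantic": False,
}

# Semantic-mode call: natural-language query; requires OPENROUTER_API_KEY
# on the server. Optional parameters fall back to their defaults.
semantic_call = {
    "mevzuat_no": "42331",
    "keyword": "KDV iade işlemleri",
    "semantic": True,
}
```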
Output Schema
| Name | Required | Description | Default |
|---|---|---|---|
| result | Yes | Formatted search results, returned as a single text block | |
Implementation Reference
- mevzuat_mcp_server.py:1647-1648 (registration): The @app.tool() decorator registering search_within_teblig as an MCP tool.

```python
@app.tool()
async def search_within_teblig(
```

- mevzuat_mcp_server.py:1648-1729 (handler): The search_within_teblig handler function. It receives a communiqué number (mevzuat_no), keyword, tertip, case_sensitive, max_results, and semantic flag. For semantic search, it delegates to _semantic_search_within (lines 1694-1696). For keyword search, it fetches content (lines 1700-1702), tries article-based search first (lines 1709-1718), and falls back to chunk-based keyword search via _keyword_search_chunks (lines 1720-1725).

```python
async def search_within_teblig(
    mevzuat_no: str = Field(
        ...,
        description="The communiqué number to search within (e.g., '42331')"
    ),
    keyword: str = Field(
        ...,
        description='Search query. For keyword mode: supports AND/OR/NOT operators (uppercase). For semantic mode: use natural language.'
    ),
    mevzuat_tertip: str = Field(
        "5",
        description="Communiqué series from search results (e.g., '5')"
    ),
    case_sensitive: bool = Field(
        False,
        description="Whether to match case when searching (default: False). Only used in keyword mode."
    ),
    max_results: int = Field(
        25, ge=1, le=50,
        description="Maximum number of matching segments to return (1-50, default: 25)"
    ),
    semantic: bool = Field(
        False,
        description="True: semantic search (natural language query, requires OPENROUTER_API_KEY). False: keyword search (Boolean operators AND/OR/NOT)."
    )
) -> str:
    """
    Search within a specific communiqué's (Tebliğ) content using keyword or semantic search.

    Tries article-based splitting first; if no articles found, falls back to chunk-based search.

    Modes:
    - semantic=False (default): Keyword search with Boolean operators (AND/OR/NOT, uppercase required)
    - semantic=True: Natural language semantic search using AI embeddings (requires OPENROUTER_API_KEY)

    Keyword examples: "vergi AND muafiyet", '"katma değer"', "istisna OR muafiyet"
    Semantic examples: "vergi muafiyeti koşulları", "KDV iade işlemleri"
    """
    logger.info(f"Tool 'search_within_teblig' called: {mevzuat_no}, keyword: '{keyword}', semantic: {semantic}")
    try:
        if semantic:
            if not SEMANTIC_SEARCH_AVAILABLE:
                return "Error: Semantic search requires OPENROUTER_API_KEY environment variable."
            return await _semantic_search_within(
                mevzuat_no=mevzuat_no,
                query=keyword,
                mevzuat_tur=9,
                mevzuat_tertip=mevzuat_tertip,
                max_results=max_results
            )

        # Keyword search
        content_result = await mevzuat_client.get_content(
            mevzuat_no=mevzuat_no,
            mevzuat_tur=9,
            mevzuat_tertip=mevzuat_tertip
        )
        if content_result.error_message:
            return f"Error fetching communiqué content: {content_result.error_message}"
        if not content_result.markdown_content:
            return f"Error: No content found for Tebliğ {mevzuat_no}"

        # Try article-based search first
        matches = search_articles_by_keyword(
            markdown_content=content_result.markdown_content,
            keyword=keyword,
            case_sensitive=case_sensitive,
            max_results=max_results
        )
        if matches:
            result = ArticleSearchResult(
                mevzuat_no=mevzuat_no,
                mevzuat_tur=9,
                keyword=keyword,
                total_matches=len(matches),
                matching_articles=matches
            )
            return format_search_results(result)

        # Fallback to chunk-based keyword search
        return await _keyword_search_chunks(
            content=content_result.markdown_content,
            keyword=keyword,
            mevzuat_no=mevzuat_no,
            mevzuat_tur=9,
            case_sensitive=case_sensitive,
            max_results=max_results
        )
    except Exception as e:
        logger.exception(f"Error in tool 'search_within_teblig' for {mevzuat_no}")
        return f"An unexpected error occurred: {str(e)}"
```

- mevzuat_mcp_server.py:185-247 (helper): The _keyword_search_chunks helper used by search_within_teblig as a fallback when article-based search finds no results (lines 1720-1725). It handles chunk-based keyword search, with special logic for Tebliğ (mevzuat_tur == 9) to try article splitting first.
```python
async def _keyword_search_chunks(
    content: str,
    keyword: str,
    mevzuat_no: str,
    mevzuat_tur: int,
    case_sensitive: bool = False,
    max_results: int = 25,
) -> str:
    """Keyword search for chunk-based content (no article structure)."""
    # Try article split first for Teblig
    if mevzuat_tur == 9:
        from article_search import split_into_articles as _split
        articles = _split(content)
        if articles:
            matches = search_articles_by_keyword(content, keyword, case_sensitive, max_results)
            if matches:
                result = ArticleSearchResult(
                    mevzuat_no=mevzuat_no,
                    mevzuat_tur=mevzuat_tur,
                    keyword=keyword,
                    total_matches=len(matches),
                    matching_articles=matches
                )
                return format_search_results(result)

    # Chunk-based keyword search
    from semantic_search.processor import MevzuatProcessor as _MevzuatProcessor
    processor = _processor if SEMANTIC_SEARCH_AVAILABLE else _MevzuatProcessor()
    chunks = processor.process_legislation(content, mevzuat_no, mevzuat_tur)
    if not chunks:
        return f"Error: Could not split content into searchable segments for mevzuat {mevzuat_no}"

    scored_chunks = []
    for chunk in chunks:
        matches, score = _matches_query(chunk.text, keyword, case_sensitive)
        if matches and score > 0:
            scored_chunks.append((chunk, score))
    scored_chunks.sort(key=lambda x: x[1], reverse=True)
    scored_chunks = scored_chunks[:max_results]

    if not scored_chunks:
        return f"No matches found for '{keyword}' in mevzuat {mevzuat_no}"

    output = []
    output.append(f"Keyword: '{keyword}'")
    output.append(f"Total matching segments: {len(scored_chunks)}")
    output.append("")
    for chunk, score in scored_chunks:
        chunk_type = chunk.metadata.get('type', 'chunk')
        if chunk_type == 'article':
            madde_no = chunk.metadata.get('madde_no', '?')
            output.append(f"=== MADDE {madde_no} ===")
        else:
            chunk_idx = chunk.metadata.get('chunk_index', 0)
            total = chunk.metadata.get('total_chunks', 0)
            output.append(f"=== Chunk {chunk_idx + 1}/{total} ===")
        output.append(f"Matches: {score}")
        output.append("")
        output.append("Full content:")
        output.append(chunk.text)
        output.append("")
    return "\n".join(output)
```

- mevzuat_mcp_server.py:94-182 (helper): The _semantic_search_within shared helper used when semantic=True (lines 1694-1696). It handles embedding-based semantic search for any legislation type, including Tebliğ.
```python
async def _semantic_search_within(
    mevzuat_no: str,
    query: str,
    mevzuat_tur: int,
    mevzuat_tertip: str = "5",
    max_results: int = 10,
    threshold: float = 0.3,
    resmi_gazete_tarihi: Optional[str] = None,
) -> str:
    """Shared helper for semantic search within any legislation type."""
    # 1. Get content with tertip fallback (already cached by mevzuat_client)
    content_result = await _get_content_with_tertip_fallback(
        mevzuat_no=mevzuat_no,
        mevzuat_tur=mevzuat_tur,
        mevzuat_tertip=mevzuat_tertip,
        resmi_gazete_tarihi=resmi_gazete_tarihi,
    )
    if content_result.error_message:
        return f"Error fetching content: {content_result.error_message}"
    if not content_result.markdown_content:
        return f"Error: No content found for mevzuat {mevzuat_no}"
    content = content_result.markdown_content

    # 2. Check embedding cache
    cached = _embedding_cache.get(mevzuat_tur, mevzuat_tertip, mevzuat_no, content)
    if cached:
        vector_store, chunks = cached
    else:
        # 3. Process into chunks
        chunks = _processor.process_legislation(content, mevzuat_no, mevzuat_tur)
        if not chunks:
            return f"Error: Could not split content into searchable segments for mevzuat {mevzuat_no}"

        # 4. Encode documents
        texts = [c.text for c in chunks]
        titles = [c.title for c in chunks]
        embeddings = _embedder.encode_documents(texts, titles)

        # 5. Build vector store
        vector_store = VectorStore(dimension=_embedder.dimension)
        vector_store.add_documents(
            ids=[c.chunk_id for c in chunks],
            texts=texts,
            embeddings=embeddings,
            metadata=[c.metadata for c in chunks],
        )

        # 6. Cache
        _embedding_cache.put(mevzuat_tur, mevzuat_tertip, mevzuat_no, content, vector_store, chunks)

    # 7. Search
    query_embedding = _embedder.encode_query(query)
    results = vector_store.search(query_embedding, top_k=max_results, threshold=threshold)
    if not results:
        return f"No semantically similar content found for '{query}' in mevzuat {mevzuat_no}"

    # 8. Format results
    # Determine method description
    chunk_type = chunks[0].metadata.get('type', 'chunk') if chunks else 'chunk'
    method = "Article-based semantic search" if chunk_type == 'article' else "Chunk-based semantic search"

    output = []
    output.append("Semantic Search Results")
    output.append(f"Query: \"{query}\"")
    output.append(f"Legislation: {mevzuat_no} (type: {mevzuat_tur})")
    output.append(f"Method: {method} | Results: {len(results)}")
    output.append("")
    for doc, score in results:
        if chunk_type == 'article':
            madde_no = doc.metadata.get('madde_no', '?')
            madde_title = doc.metadata.get('madde_title', '')
            output.append(f"=== MADDE {madde_no} === (Similarity: {score:.2f})")
            if madde_title:
                output.append(f"Title: {madde_title}")
        else:
            chunk_idx = doc.metadata.get('chunk_index', 0)
            total = doc.metadata.get('total_chunks', 0)
            output.append(f"=== Chunk {chunk_idx + 1}/{total} === (Similarity: {score:.2f})")
        output.append("")
        output.append(doc.text)
        output.append("")
    return "\n".join(output)
```

- mevzuat_mcp_server.py:1648-1675 (schema): The function signature and Field definitions serving as the input schema for search_within_teblig, defining the parameters mevzuat_no, keyword, mevzuat_tertip, case_sensitive, max_results, and semantic.
```python
async def search_within_teblig(
    mevzuat_no: str = Field(
        ...,
        description="The communiqué number to search within (e.g., '42331')"
    ),
    keyword: str = Field(
        ...,
        description='Search query. For keyword mode: supports AND/OR/NOT operators (uppercase). For semantic mode: use natural language.'
    ),
    mevzuat_tertip: str = Field(
        "5",
        description="Communiqué series from search results (e.g., '5')"
    ),
    case_sensitive: bool = Field(
        False,
        description="Whether to match case when searching (default: False). Only used in keyword mode."
    ),
    max_results: int = Field(
        25, ge=1, le=50,
        description="Maximum number of matching segments to return (1-50, default: 25)"
    ),
    semantic: bool = Field(
        False,
        description="True: semantic search (natural language query, requires OPENROUTER_API_KEY). False: keyword search (Boolean operators AND/OR/NOT)."
    )
) -> str:
```
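The vector_store.search step (step 7 of _semantic_search_within) can be illustrated with a minimal cosine-similarity ranking. This `cosine_top_k` function is a hypothetical stand-in, not the actual VectorStore implementation; it assumes plain Python lists as embeddings:

```python
from math import sqrt

def cosine_top_k(query_vec, doc_vecs, top_k=10, threshold=0.3):
    """Hypothetical stand-in for VectorStore.search: rank documents by cosine
    similarity to the query embedding, drop scores below `threshold`, and
    return the best `top_k` as (index, score) pairs."""
    def cos(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        na = sqrt(sum(x * x for x in a))
        nb = sqrt(sum(y * y for y in b))
        return dot / (na * nb) if na and nb else 0.0

    scored = [(i, cos(query_vec, v)) for i, v in enumerate(doc_vecs)]
    scored = [(i, s) for i, s in scored if s >= threshold]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]
```

The `threshold=0.3` default mirrors the threshold parameter of _semantic_search_within; results below it are treated as "no semantically similar content".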