search_kaggle_datasets
Find datasets on Kaggle by entering a search query. This tool uses the Kaggle API to retrieve matching datasets and returns up to 10 results as JSON, for use in analysis or machine learning projects.
Instructions
Searches for datasets on Kaggle matching the query using the Kaggle API.
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| query | Yes | | |
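To make the schema concrete, here is a minimal sketch of the JSON-RPC request an MCP client would send to invoke this tool. The helper name `build_tool_call` and the example query `"titanic"` are illustrative assumptions, not part of the server; only the tool name and the single required `query` argument come from the listing above.

```python
import json


def build_tool_call(query: str, request_id: int = 1) -> dict:
    """Build an MCP 'tools/call' request for search_kaggle_datasets.

    Hypothetical helper for illustration; the server itself does not define it.
    """
    if not isinstance(query, str):
        raise TypeError("query must be a string")
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {
            "name": "search_kaggle_datasets",
            # 'query' is the only argument, and it is required
            "arguments": {"query": query},
        },
    }


request = build_tool_call("titanic")
print(json.dumps(request, indent=2))
```

In practice an MCP client library builds this envelope for you; the sketch only shows where the `query` value ends up.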
Implementation Reference
- src/server.py:30-60 (handler): The main handler function for the 'search_kaggle_datasets' tool, decorated with @mcp.tool() for registration. It uses the Kaggle API to search datasets by query, formats the top 10 results as JSON, handles errors, and relies on the globally initialized 'api'.

```python
@mcp.tool()
async def search_kaggle_datasets(query: str) -> str:
    """Searches for datasets on Kaggle matching the query using the Kaggle API."""
    if not api:
        # Return an informative error if API is not available
        return json.dumps({"error": "Kaggle API not authenticated or available."})

    print(f"Searching datasets for: {query}")
    try:
        search_results = api.dataset_list(search=query)
        if not search_results:
            return "No datasets found matching the query."

        # Format results as JSON string for the tool output
        results_list = [
            {
                "ref": getattr(ds, 'ref', 'N/A'),
                "title": getattr(ds, 'title', 'N/A'),
                "subtitle": getattr(ds, 'subtitle', 'N/A'),
                "download_count": getattr(ds, 'downloadCount', 0),  # Adjusted attribute name
                "last_updated": str(getattr(ds, 'lastUpdated', 'N/A')),  # Adjusted attribute name
                "usability_rating": getattr(ds, 'usabilityRating', 'N/A')  # Adjusted attribute name
            }
            for ds in search_results[:10]  # Limit to 10 results
        ]
        return json.dumps(results_list, indent=2)
    except Exception as e:
        # Log the error potentially
        print(f"Error searching datasets for '{query}': {e}")
        # Return error information as part of the tool output
        return json.dumps({"error": f"Error processing search: {str(e)}"})
```
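The formatting step can be tried without Kaggle credentials. The sketch below copies the handler's field mapping into a standalone function and feeds it stand-in objects; `format_results` and `fake_results` are hypothetical names introduced for illustration, and the `SimpleNamespace` objects only imitate the attributes the handler reads from `api.dataset_list` results.

```python
import json
from types import SimpleNamespace


def format_results(search_results, limit=10):
    """Mirror the handler's formatting step for a list of dataset-like objects."""
    results_list = [
        {
            "ref": getattr(ds, "ref", "N/A"),
            "title": getattr(ds, "title", "N/A"),
            "subtitle": getattr(ds, "subtitle", "N/A"),
            "download_count": getattr(ds, "downloadCount", 0),
            "last_updated": str(getattr(ds, "lastUpdated", "N/A")),
            "usability_rating": getattr(ds, "usabilityRating", "N/A"),
        }
        for ds in search_results[:limit]  # keep at most `limit` entries
    ]
    return json.dumps(results_list, indent=2)


# Hypothetical stand-ins for Kaggle dataset objects; some attributes are
# deliberately missing to show the getattr fallbacks.
fake_results = [
    SimpleNamespace(ref=f"user/dataset-{i}", title=f"Dataset {i}", downloadCount=i * 100)
    for i in range(12)
]

parsed = json.loads(format_results(fake_results))
print(len(parsed))
print(parsed[0])
```

Because every field is read with `getattr` and a default, an object that lacks `subtitle` or `lastUpdated` still yields a complete record rather than raising `AttributeError`.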