MCP server for finding research data and models for AI/ML training

Glama

Why this server?
This server is highly effective for gathering raw data, as it can scrape and extract structured data from any website globally, bypassing anti-bot systems. This directly fulfills the need to pull information from 'websites, articles' for training data.
Thordata MCP Server
xja1023789-collab
security: -, license: -, quality: -
Enables AI models to scrape and extract data from any website globally using Thordata's 195+ country proxy network. Bypasses anti-bot systems and renders JavaScript content, outputting structured data in Markdown, HTML, or Links format.
Last updated -
MIT License
Why this server?
Provides a direct interface to the Kaggle API, enabling the user to search and access datasets and kernels, which are crucial sources for finding data and models mentioned in the request ('kaggle').
Kaggle-MCP
realbytecode
security: -, license: A, quality: -
Connects Claude AI to the Kaggle API through the Model Context Protocol, enabling users to browse competitions, search and download datasets, analyze kernels, and access pre-trained models through natural language interactions.
Last updated -
MIT License
Why this server?
Allows access to the Hugging Face Hub API to retrieve information about machine learning models and datasets. This is essential for finding existing models or data resources for training AI/ML models.
Hugging Face Hub MCP Server
michaelwaves
security: A, license: F, quality: A
Enables access to the Hugging Face Hub API to search and retrieve information about machine learning models, datasets, and their metadata. Provides comprehensive tools for exploring the Hugging Face ecosystem including model details, dataset information, and parquet file access.
Last updated -
8
Why this server?
Specifically designed to search, filter, and export Software Engineering papers on arXiv, directly addressing the requirement to find information in 'research papers'.
ArxivSearcher MCP Server
emi-dm
security: A, license: F, quality: A
An MCP server that enables intelligent searching, filtering, and exporting of Software Engineering papers on arXiv with tools for querying by keywords, authors, analyzing trends, and finding related research.
Last updated -
8
5
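As a rough illustration of the kind of query such a server might issue, here is a minimal sketch that builds a request URL for the public arXiv API, restricted to the Software Engineering category (`cs.SE`). The function name `build_arxiv_url` and its parameters are hypothetical and are not taken from ArxivSearcher itself; only the endpoint and the `search_query`, `start`, `max_results`, `sortBy`, and `sortOrder` parameters come from the arXiv API.

```python
from urllib.parse import urlencode

# Public arXiv API endpoint (returns an Atom feed).
ARXIV_API_URL = "http://export.arxiv.org/api/query"


def build_arxiv_url(keywords, author=None, max_results=10):
    """Build an arXiv API URL for Software Engineering (cs.SE) papers.

    Hypothetical helper for illustration; not part of any listed server.
    """
    # Restrict to the cs.SE category, then require every keyword.
    clauses = ["cat:cs.SE"]
    clauses += [f'all:"{kw}"' for kw in keywords]
    if author:
        clauses.append(f'au:"{author}"')
    params = {
        "search_query": " AND ".join(clauses),
        "start": 0,
        "max_results": max_results,
        "sortBy": "submittedDate",
        "sortOrder": "descending",
    }
    return f"{ARXIV_API_URL}?{urlencode(params)}"


if __name__ == "__main__":
    print(build_arxiv_url(["mutation testing"], author="Smith", max_results=5))
```

Fetching the resulting URL and parsing the Atom response is left out, so the sketch runs without network access.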
Why this server?
Enables searching and retrieving detailed information from PubMed articles using the NCBI Entrez API, providing access to biomedical 'research papers' and scientific data for LLMs.
PubMed MCP Server
emi-dm
security: A, license: F, quality: A
Enables searching and retrieving detailed information from PubMed articles using the NCBI Entrez API. Supports configurable search parameters including title/abstract filtering and keyword expansion to find relevant scientific publications.
Last updated -
1
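For readers unfamiliar with the NCBI Entrez API mentioned above, the following is a minimal sketch of how a title/abstract-filtered PubMed search URL can be built against the E-utilities `esearch` endpoint. The helper names are hypothetical, and joining the keywords with `OR` is only an assumed stand-in for "keyword expansion"; the listing does not say how the server actually expands keywords.

```python
from urllib.parse import urlencode

# NCBI E-utilities search endpoint.
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_pubmed_query(keywords, title_abstract_only=True):
    """Join keywords with OR, optionally limiting each to title/abstract.

    Hypothetical helper; the OR-join is an assumption, not the server's logic.
    """
    # [tiab] is PubMed's field tag for the Title/Abstract fields.
    tag = "[tiab]" if title_abstract_only else ""
    return " OR ".join(f'"{kw}"{tag}' for kw in keywords)


def build_esearch_url(keywords, max_results=20, title_abstract_only=True):
    """Build an esearch URL that returns matching PubMed IDs as JSON."""
    params = {
        "db": "pubmed",
        "term": build_pubmed_query(keywords, title_abstract_only),
        "retmax": max_results,
        "retmode": "json",
    }
    return f"{ESEARCH_URL}?{urlencode(params)}"


if __name__ == "__main__":
    print(build_esearch_url(["crispr", "gene editing"], max_results=5))
```

The sketch only constructs the URL; retrieving article details would need a further call to the `efetch` or `esummary` endpoints with the returned IDs.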
Why this server?
Enables web scraping and extraction from any website globally, supporting dynamic content and outputting structured data, perfect for gathering broad information from 'websites, articles' and 'anywhere'.
AnyCrawl MCP Server
any4ai
security: -, license: -, quality: -
Enables web scraping and crawling capabilities for LLM clients, supporting single-page scraping, multi-page website crawling, and web search with multiple engines (Playwright, Cheerio, Puppeteer) and flexible output formats including markdown, HTML, text, and screenshots.
Last updated -
16
4
Why this server?
Facilitates comprehensive web research by leveraging Tavily's APIs to gather and structure data for high-quality markdown document creation, an excellent tool for compiling research from various 'websites' and 'articles'.
Deep Research MCP
ali-kh7
security: A, license: -, quality: A
A Model Context Protocol compliant server that facilitates comprehensive web research by utilizing Tavily's Search and Crawl APIs to gather and structure data for high-quality markdown document creation.
Last updated -
1
48
12
MIT License
Why this server?
A multipurpose tool focused on Retrieval-Augmented Generation that searches, indexes, and processes documents (PDF, DOCX, etc.), ideal for handling and making sense of the raw data collected from research papers and articles for LLM consumption.
MCP RAG
kalicyh
security: A, license: F, quality: A
Intelligent knowledge base system that lets users process documents in 25+ formats and perform semantic search and Q&A through vector retrieval. Supports multiple AI models, including OpenAI and DouBao, with local processing capabilities.
Last updated -
10
3
Why this server?
Offers access to a vast array of public datasets, which directly addresses the need to find 'data' for training AI/ML models from diverse and accessible sources.
Open Data Model Context Protocol (official)
OpenDataMCP
security: -, license: A, quality: -
Access to many public datasets right from your LLM application.
Last updated -
141
MIT License
Why this server?
Provides highly capable web search through proxy servers, ensuring the LLM can find up-to-date information and source material from across the web ('websites, articles') effectively.
Tavily MCP Server with Proxy Support
tulong66
security: -, license: F, quality: -
Enables LLMs to perform sophisticated web searches through proxy servers using Tavily's API, supporting comprehensive web searches, direct question answering, and recent news article retrieval with AI-extracted content.
Last updated -
2
