bio-mcp
by bio-mcp

Bio-MCP FastQC Server 🔬

Quality Control Analysis via Model Context Protocol

An MCP server that enables AI assistants to run FastQC and MultiQC quality control analysis on sequencing data. Part of the Bio-MCP ecosystem.

🎯 Purpose

FastQC is essential for quality assessment of high-throughput sequencing data. This MCP server allows AI assistants to:

  • Analyze single files - Get detailed QC reports for individual FASTQ/FASTA files

  • Batch process - Run QC on multiple files simultaneously

  • Generate summary reports - Create MultiQC reports combining multiple analyses

  • Handle large datasets - Queue system support for computationally intensive jobs

🚀 Quick Start

Prerequisites

Install FastQC and MultiQC:

    # Via conda (recommended)
    conda install -c bioconda fastqc multiqc

    # Via package managers
    # Ubuntu/Debian
    sudo apt-get install fastqc
    pip install multiqc

    # macOS
    brew install fastqc
    pip install multiqc

Installation

    # Clone and install
    git clone https://github.com/bio-mcp/bio-mcp-fastqc.git
    cd bio-mcp-fastqc
    pip install -e .

    # Or install directly
    pip install git+https://github.com/bio-mcp/bio-mcp-fastqc.git

Claude Desktop Configuration

Add to your claude_desktop_config.json:

    {
      "mcpServers": {
        "bio-fastqc": {
          "command": "python",
          "args": ["-m", "src.server"],
          "cwd": "/path/to/bio-mcp-fastqc"
        }
      }
    }

🔧 Available Tools

Core Analysis Tools

fastqc_single

Run FastQC on a single FASTQ/FASTA file.

Parameters:

  • input_file (required): Path to FASTQ or FASTA file

  • threads (optional): Number of threads (default: 1)

  • contaminants (optional): Path to custom contaminants file

  • adapters (optional): Path to custom adapters file

  • limits (optional): Path to custom limits file

Example:

    User: "Run quality control on my_sample.fastq.gz"
    AI: [calls fastqc_single] → Returns detailed QC report with pass/warn/fail status for each module
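To illustrate how these parameters could map onto the FastQC command line, here is a minimal sketch. The function name `build_fastqc_command` and the `output_dir` argument are hypothetical and not taken from this server's source; the flags themselves (`--outdir`, `--threads`, `--contaminants`, `--adapters`, `--limits`) are standard FastQC options.

```python
from typing import List, Optional


def build_fastqc_command(
    input_file: str,
    output_dir: str,
    threads: int = 1,
    contaminants: Optional[str] = None,
    adapters: Optional[str] = None,
    limits: Optional[str] = None,
    fastqc_path: str = "fastqc",
) -> List[str]:
    """Translate fastqc_single-style parameters into a FastQC argument list.

    Hypothetical helper: the real server may assemble its command differently.
    """
    if threads < 1:
        raise ValueError("threads must be at least 1")

    cmd = [fastqc_path, "--outdir", output_dir, "--threads", str(threads)]

    # Optional files are only passed when the caller supplied them.
    optional_flags = (
        ("--contaminants", contaminants),
        ("--adapters", adapters),
        ("--limits", limits),
    )
    for flag, value in optional_flags:
        if value is not None:
            cmd += [flag, value]

    cmd.append(input_file)
    return cmd
```

The resulting list can be passed to `subprocess.run(cmd, check=True)`; using a list rather than a shell string avoids quoting problems with unusual file names.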

fastqc_batch

Run FastQC on multiple files in a directory.

Parameters:

  • input_dir (required): Directory containing FASTQ/FASTA files

  • file_pattern (optional): File pattern to match (default: ".fastq")

  • threads (optional): Number of threads (default: 4)

Example:

    User: "Analyze all fastq files in the data/ directory"
    AI: [calls fastqc_batch] → Processes all files and returns summary statistics
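File selection for a batch run might look like the sketch below. It assumes `file_pattern` is treated as a shell-style glob and uses `*.fastq*` as an illustrative default; how the server actually interprets the pattern is not stated here, and `find_sequence_files` is a hypothetical name.

```python
from fnmatch import fnmatch
from pathlib import Path
from typing import List


def find_sequence_files(input_dir: str, file_pattern: str = "*.fastq*") -> List[Path]:
    """Return files directly inside input_dir whose names match file_pattern.

    Assumption: the pattern is a glob. The search is not recursive.
    """
    directory = Path(input_dir)
    if not directory.is_dir():
        raise NotADirectoryError(input_dir)

    # Sort so that batch results come back in a stable, predictable order.
    return sorted(
        path
        for path in directory.iterdir()
        if path.is_file() and fnmatch(path.name, file_pattern)
    )
```

With this pattern, both `sample01_R1.fastq` and `sample01_R1.fastq.gz` are picked up, while unrelated files in the same directory are ignored.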

multiqc_report

Generate MultiQC report from FastQC results.

Parameters:

  • input_dir (required): Directory containing FastQC and other analysis results

  • title (optional): Custom title for the report

  • comment (optional): Comment to add to the report

  • template (optional): Report template (default, simple, sections, gathered)

Example:

    User: "Create a summary report from all the QC results"
    AI: [calls multiqc_report] → Generates interactive HTML report combining all analyses
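As a rough sketch, the parameters above could be turned into a MultiQC invocation as follows. `build_multiqc_command` and `output_dir` are hypothetical names, and validating the template against the four values listed above is an assumption; `--title`, `--comment`, `--template` and `--outdir` are regular MultiQC options.

```python
from typing import List, Optional

# Template names as listed in the parameter description above.
KNOWN_TEMPLATES = ("default", "simple", "sections", "gathered")


def build_multiqc_command(
    input_dir: str,
    output_dir: str,
    title: Optional[str] = None,
    comment: Optional[str] = None,
    template: Optional[str] = None,
    multiqc_path: str = "multiqc",
) -> List[str]:
    """Translate multiqc_report-style parameters into a MultiQC argument list.

    Hypothetical helper: the real server may assemble its command differently.
    """
    if template is not None and template not in KNOWN_TEMPLATES:
        raise ValueError(f"unknown template: {template!r}")

    cmd = [multiqc_path, input_dir, "--outdir", output_dir]
    if title:
        cmd += ["--title", title]
    if comment:
        cmd += ["--comment", comment]
    if template:
        cmd += ["--template", template]
    return cmd
```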

Queue System Tools (when queue enabled)

For large datasets or batch processing:

  • fastqc_single_async - Queue single file analysis

  • fastqc_batch_async - Queue batch analysis

  • multiqc_report_async - Queue report generation

  • get_job_status - Check job progress

  • get_job_result - Retrieve completed results

  • cancel_job - Cancel running jobs
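A client of the async tools typically submits a job and then polls until it finishes. The sketch below shows one way to do that. It is not this server's client API: `wait_for_job`, the `state` key, and the terminal state names are assumptions, and `get_status` stands in for whatever call wraps `get_job_status`.

```python
import time
from typing import Any, Callable, Dict

# Assumed terminal state names; check the queue's actual responses.
TERMINAL_STATES = {"completed", "failed", "cancelled"}


def wait_for_job(
    get_status: Callable[[str], Dict[str, Any]],
    job_id: str,
    poll_interval: float = 5.0,
    timeout: float = 1800.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Poll get_status(job_id) until the job reaches a terminal state.

    Raises TimeoutError if the job is still running after `timeout` seconds.
    """
    deadline = time.monotonic() + timeout
    while True:
        status = get_status(job_id)
        if status.get("state") in TERMINAL_STATES:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job {job_id} did not finish within {timeout} s")
        sleep(poll_interval)
```

Once the returned state indicates success, the result can be fetched with `get_job_result`. Passing `sleep` as a parameter keeps the loop easy to test without real delays.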

💡 Usage Examples

Basic Quality Assessment

    User: "What's the quality of my sequencing data in sample1.fastq.gz?"

    AI Response:
    ✅ FastQC Analysis Complete for sample1

    Module Status Summary:
    ✅ Basic Statistics: PASS
    ✅ Per base sequence quality: PASS
    ⚠️ Per sequence quality scores: WARN
    ✅ Per base sequence content: PASS
    ✅ Per sequence GC content: PASS
    ✅ Per base N content: PASS
    ✅ Sequence Length Distribution: PASS
    ❌ Sequence Duplication Levels: FAIL
    ✅ Overrepresented sequences: PASS
    ✅ Adapter Content: PASS

    Basic Statistics:
    • Total Sequences: 2,450,000
    • Sequences flagged as poor quality: 0
    • Sequence length: 150
    • %GC: 47

Batch Analysis

    User: "Analyze all FASTQ files in my project/raw_data/ folder"

    AI Response:
    FastQC Batch Analysis Complete
    ========================================
    Processed 12 files:

    ✅ sample01_R1.fastq.gz: 8P/2W/0F
    ✅ sample01_R2.fastq.gz: 8P/2W/0F
    ⚠️ sample02_R1.fastq.gz: 7P/2W/1F
    ⚠️ sample02_R2.fastq.gz: 7P/2W/1F
    ...

    Overall Summary:
    • Total PASS: 94
    • Total WARN: 24
    • Total FAIL: 2

    Tip: Run multiqc_report on this directory to generate a combined report!

Complete Workflow

    User: "I have a directory of paired-end FASTQ files. Can you run quality control and create a summary report?"

    AI: I'll run a complete QC workflow on your paired-end data:

    1. First, let me analyze all FASTQ files in batch:
       [runs fastqc_batch on directory]

    2. Now I'll generate a MultiQC summary report:
       [runs multiqc_report on results]

    ✅ Complete QC workflow finished!

    Summary:
    - 24 FASTQ files processed (12 samples, paired-end)
    - Average quality score: 32.5
    - 2 samples have adapter contamination warnings
    - 1 sample shows high duplication levels
    - Interactive HTML report generated: multiqc_report.html

    The MultiQC report provides detailed visualizations of:
    - Quality score distributions across all samples
    - GC content comparison
    - Sequence length distributions
    - Adapter content analysis
    - Sample correlation analysis

🐳 Docker Usage

Build and Run

    # Build the image
    docker build -t bio-mcp-fastqc .

    # Run with data mounting
    docker run -v /path/to/data:/data bio-mcp-fastqc

Docker Compose (with Queue System)

    services:
      fastqc-server:
        build: .
        volumes:
          - ./data:/data
        environment:
          - BIO_MCP_QUEUE_URL=http://queue-api:8000
        depends_on:
          - queue-api

⚙️ Configuration

Environment Variables

  • BIO_MCP_FASTQC_PATH - Path to FastQC executable (default: "fastqc")

  • BIO_MCP_MULTIQC_PATH - Path to MultiQC executable (default: "multiqc")

  • BIO_MCP_MAX_FILE_SIZE - Maximum file size in bytes (default: 10GB)

  • BIO_MCP_TIMEOUT - Command timeout in seconds (default: 1800)

  • BIO_MCP_TEMP_DIR - Temporary directory for processing
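The variables above could be read into a single settings object along these lines. This is a sketch, not the server's actual configuration code: `FastQCConfig` and `load_config` are hypothetical names, and interpreting the 10GB default as 10 × 1024³ bytes is an assumption.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

# Assumption: "10GB" is read as 10 * 1024**3 bytes.
DEFAULT_MAX_FILE_SIZE = 10 * 1024**3


@dataclass(frozen=True)
class FastQCConfig:
    fastqc_path: str
    multiqc_path: str
    max_file_size: int  # bytes
    timeout: int  # seconds
    temp_dir: Optional[str]  # None means "use the system default"


def load_config(env: Mapping[str, str] = os.environ) -> FastQCConfig:
    """Build a config from environment variables, falling back to the documented defaults."""
    return FastQCConfig(
        fastqc_path=env.get("BIO_MCP_FASTQC_PATH", "fastqc"),
        multiqc_path=env.get("BIO_MCP_MULTIQC_PATH", "multiqc"),
        max_file_size=int(env.get("BIO_MCP_MAX_FILE_SIZE", str(DEFAULT_MAX_FILE_SIZE))),
        timeout=int(env.get("BIO_MCP_TIMEOUT", "1800")),
        temp_dir=env.get("BIO_MCP_TEMP_DIR"),
    )
```

Accepting the environment as a parameter makes it possible to test the defaults with a plain dictionary, for example `load_config({"BIO_MCP_TIMEOUT": "60"})`.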

Queue System Integration

To enable async processing for large datasets:

    from src.server_with_queue import FastQCServerWithQueue

    server = FastQCServerWithQueue(queue_url="http://localhost:8000")

📊 Output Files

FastQC generates several output files:

  • HTML Report (*_fastqc.html) - Interactive quality report

  • Data File (fastqc_data.txt) - Raw metrics and statistics

  • Summary File (summary.txt) - Pass/warn/fail status for each module

  • Plots - Various quality plots and charts
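FastQC's `summary.txt` holds one tab-separated line per module in the form status, module name, file name. The sketch below tallies those statuses into the compact `8P/2W/0F` style shown in the batch example. Whether the server builds its summary this way is an assumption, and both function names are hypothetical.

```python
from collections import Counter


def tally_summary(summary_text: str) -> Counter:
    """Count PASS/WARN/FAIL lines in the contents of a FastQC summary.txt."""
    counts: Counter = Counter()
    for line in summary_text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        # The status is the first tab-separated field on each line.
        status = line.split("\t", 1)[0].strip()
        counts[status] += 1
    return counts


def format_tally(counts: Counter) -> str:
    """Render counts as e.g. '8P/2W/0F'; missing statuses count as zero."""
    return f"{counts['PASS']}P/{counts['WARN']}W/{counts['FAIL']}F"
```

For a single report, `format_tally(tally_summary(text))` gives the per-file string; adding the counters from several files together yields the overall totals.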

MultiQC combines these into:

  • MultiQC Report (multiqc_report.html) - Combined interactive report

  • Data Directory (multiqc_data/) - Processed data and statistics

  • General Stats (multiqc_general_stats.txt) - Summary table

🔍 Quality Metrics Explained

FastQC analyzes multiple quality aspects:

Key Modules

  • Per base sequence quality - Quality scores across read positions

  • Per sequence quality scores - Distribution of mean quality scores

  • Per base sequence content - A/T/G/C content across positions

  • Per sequence GC content - GC% distribution vs expected

  • Sequence duplication levels - PCR duplication assessment

  • Adapter content - Contaminating adapter sequences

Status Interpretation

  • ✅ PASS - Analysis indicates no problems

  • ⚠️ WARN - Slightly unusual, may not be problematic

  • ❌ FAIL - Likely problematic, requires attention

🧬 Integration with Bio-MCP Ecosystem

The FastQC tools can be chained with other Bio-MCP tools, for example in a preprocessing pipeline:

    User: "Run the complete preprocessing pipeline on my samples"

    AI Workflow:
    1. fastqc_batch → Initial quality assessment
    2. trimmomatic → Trim low-quality bases and adapters
    3. fastqc_batch → Post-trimming QC
    4. multiqc_report → Combined before/after report

🤝 Contributing

We welcome contributions! See the Bio-MCP contributing guide.

Development Setup

    git clone https://github.com/bio-mcp/bio-mcp-fastqc.git
    cd bio-mcp-fastqc
    pip install -e ".[dev]"
    pytest

📄 License

MIT License - see LICENSE file.

🙏 Acknowledgments

  • FastQC by Simon Andrews at Babraham Bioinformatics

  • MultiQC by Phil Ewels and the MultiQC community

  • Bio-MCP project and contributors


Part of the Bio-MCP ecosystem - Making bioinformatics accessible to AI assistants.

For more tools: Bio-MCP Organization
