SciAgentKit
What is SciAgentKit?
SciAgentKit is a local-first toolkit that gives AI coding agents real scientific tools instead of letting them invent molecular properties with the confidence of a reviewer who did not read the supplement.
It combines:
Deterministic scientific skills for molecule-library audit, descriptor profiling, scaffold diversity, protein/PDB ranking, ligand preparation, docking planning, MD job generation, trajectory analysis, and report writing.
MCP tool server so Claude Code, Codex, Gemini CLI, Cursor, and other agent runtimes can call the same local scientific tools.
Agent skills that teach the model when to call each tool, what to ask the user, what to refuse to invent, and where to stop when scientific inputs are missing.
Reproducibility layer with run_manifest.json, input/output hashes, structured CSV files, reports, and methods text.
The philosophy is simple:
Skills decide the workflow. MCP runs the science. Reports preserve the evidence.
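To make the reproducibility layer concrete, here is a minimal sketch of how a run manifest with input/output hashes could be assembled. The key names and the helper functions (sha256_of, build_manifest, write_manifest) are assumptions for illustration; the actual run_manifest.json schema written by SciAgentKit may differ.

```python
import hashlib
import json
import platform
import sys
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(command: list[str], parameters: dict,
                   inputs: list[Path], outputs: list[Path]) -> dict:
    """Assemble a manifest dict. Key names are illustrative, not SciAgentKit's schema."""
    return {
        "command": command,
        "parameters": parameters,
        "environment": {
            "python": sys.version.split()[0],
            "platform": platform.platform(),
        },
        "inputs": {str(p): sha256_of(p) for p in inputs},
        "outputs": {str(p): sha256_of(p) for p in outputs},
    }


def write_manifest(run_dir: Path, manifest: dict) -> Path:
    """Write the manifest next to the run outputs and return its path."""
    target = run_dir / "run_manifest.json"
    target.write_text(json.dumps(manifest, indent=2, sort_keys=True), encoding="utf-8")
    return target
```

Hashing both inputs and outputs lets a reader confirm later that a report was produced from exactly the files the manifest names.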
Why this exists
AI agents are getting very good at editing files and calling tools. They are still terrible at knowing when a docking score is not biology, when scaffold novelty matters more than SMILES novelty, or when a missing cofactor turns a structure workflow into decorative nonsense.
SciAgentKit is built for researchers working on:
computational biology
AI drug discovery
molecule generation
QSAR / DTI workflows
docking and MD pipelines
protein target selection
scientific figure and report generation
It is not another chatbot. It is a scientific skill layer for agents.
Workflow overview
Stage | What it does | Main outputs |
Literature search | Search literature around the target, mutation, binding site, and assay context | |
Protein selection | Fetch UniProt sequence, apply mutation if requested, rank PDB chains by mutation match, coverage/span, and resolution | |
Ligand preparation | Canonicalize SMILES, remove duplicates, generate 3D SDF at the requested pH, recommend or accept a force field | |
Pocket detection | Use crystal ligand center, known residues, or external pocket tools | |
Docking | Generate Vina configs, run docking when receptor/ligand PDBQT inputs exist, rank top ligands | |
MD planning | Build OpenMM job folders with neutralization, 0.15 M salt, NVT/NPT/equilibration, and seeded replicas | |
Trajectory analysis | RMSD, RMSF, and ProLIF residue interaction frequencies when trajectories are available | |
Reporting | Human-readable and machine-readable outputs | |
What it can analyze
Molecule-library audit
For generated molecules, RL molecules, screening libraries, or known actives:
SMILES validation
canonicalization
internal duplicate removal
cross-duplicate detection against reference/training libraries
novelty fraction
scaffold novelty
Bemis-Murcko scaffold diversity
QED, MW, logP, TPSA, HBD, HBA, rotatable bonds
property-distribution figures
report + methods section + reproducibility manifest
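As a rough illustration of the duplicate and novelty bookkeeping listed above, the sketch below uses one common definition of novelty fraction: unique query molecules absent from the reference set, divided by unique query molecules. The function names are hypothetical and the toolkit's exact definition may differ. The canonicalize argument is a placeholder; in practice it would be an RDKit-based canonicalizer rather than the whitespace-stripping default used here.

```python
from typing import Callable, Iterable


def dedupe(smiles: Iterable[str], canonicalize: Callable[[str], str]) -> list[str]:
    """Canonicalize and drop internal duplicates, preserving first-seen order."""
    seen: dict[str, None] = {}
    for raw in smiles:
        text = raw.strip()
        if text:
            seen.setdefault(canonicalize(text), None)
    return list(seen)


def novelty_summary(query: Iterable[str], reference: Iterable[str],
                    canonicalize: Callable[[str], str] = str.strip) -> dict:
    """Summarize internal duplicates, cross-duplicates, and novelty fraction.

    The default canonicalizer only strips whitespace (an assumption for this
    sketch); real SMILES need a chemistry-aware canonicalizer.
    """
    query_list = [s for s in query if s.strip()]
    unique_query = dedupe(query_list, canonicalize)
    reference_set = set(dedupe(reference, canonicalize))
    cross_duplicates = [s for s in unique_query if s in reference_set]
    novel = [s for s in unique_query if s not in reference_set]
    fraction = len(novel) / len(unique_query) if unique_query else 0.0
    return {
        "total_query": len(query_list),
        "unique_query": len(unique_query),
        "internal_duplicates": len(query_list) - len(unique_query),
        "cross_duplicates": cross_duplicates,
        "novel": novel,
        "novelty_fraction": fraction,
    }
```

Comparing canonical strings rather than raw input is what keeps two spellings of the same molecule from being counted as two novel compounds.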
Protein/PDB selection
For target-aware workflows:
UniProt target sequence retrieval
mutation application to target sequence
PDB cross-reference discovery
chain-sequence alignment
mutation-aware structure filtering
ranking by residue coverage/span and resolution
selected PDB/chain summary with limitations
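The mutation and ranking steps above can be sketched as follows. This is an assumed ordering (mutation match first, then coverage, then resolution) based on the list above, not the toolkit's confirmed scoring. The Candidate fields and both function names are hypothetical.

```python
import re
from dataclasses import dataclass


def apply_mutation(sequence: str, mutation: str) -> str:
    """Apply a point mutation such as 'L858R' (1-based position) to a sequence."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", mutation)
    if match is None:
        raise ValueError(f"Unrecognized mutation format: {mutation!r}")
    wild_type, position, mutant = match.group(1), int(match.group(2)), match.group(3)
    if position < 1 or position > len(sequence):
        raise ValueError(f"Position {position} is outside the sequence")
    if sequence[position - 1] != wild_type:
        # Refuse to mutate when the reference residue does not match.
        raise ValueError(
            f"Expected {wild_type} at {position}, found {sequence[position - 1]}"
        )
    return sequence[: position - 1] + mutant + sequence[position:]


@dataclass(frozen=True)
class Candidate:
    pdb_id: str
    chain: str
    has_mutation: bool   # chain sequence carries the requested mutation
    coverage: float      # fraction of target residues resolved, 0..1
    resolution: float    # in angstroms; lower is better


def rank_candidates(candidates: list[Candidate]) -> list[Candidate]:
    """Sort by mutation match, then higher coverage, then lower resolution."""
    return sorted(
        candidates,
        key=lambda c: (not c.has_mutation, -c.coverage, c.resolution),
    )
```

Checking the wild-type residue before substituting catches numbering mismatches between a mutation label and the retrieved sequence, which would otherwise silently produce the wrong target.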
Docking and MD workflow support
For structure-based screening:
ligand 3D SDF generation
pH-aware preparation when Open Babel is available
force-field recommendation or user-specified force-field path
pocket center extraction from crystal ligand or known residues
AutoDock Vina wrapper and score parsing
OpenMM job skeletons for top ligands
replica planning with different seeds
MDAnalysis RMSD/RMSF
ProLIF interaction frequency tables
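For the pocket and docking items above, a minimal sketch of three small pieces: a box center taken as the centroid of a crystal ligand's HETATM atoms, a Vina configuration file built from that center, and score parsing from Vina's REMARK VINA RESULT lines. The function names and the 20 Å default box size are assumptions for illustration; SciAgentKit's own wrapper may choose the box and parse results differently.

```python
def ligand_center(pdb_text: str, ligand_resname: str) -> tuple[float, float, float]:
    """Centroid of HETATM atoms for one residue name, using PDB fixed columns."""
    coords = []
    for line in pdb_text.splitlines():
        # Residue name is in columns 18-20; x, y, z are in columns 31-54.
        if line.startswith("HETATM") and line[17:20].strip() == ligand_resname:
            coords.append((float(line[30:38]), float(line[38:46]), float(line[46:54])))
    if not coords:
        raise ValueError(f"No HETATM records found for {ligand_resname!r}")
    xs, ys, zs = zip(*coords)
    n = len(coords)
    return (sum(xs) / n, sum(ys) / n, sum(zs) / n)


def vina_config(receptor: str, ligand: str, center: tuple[float, float, float],
                size: tuple[float, float, float] = (20.0, 20.0, 20.0),
                exhaustiveness: int = 8) -> str:
    """Render a Vina config file. The default box size is an arbitrary assumption."""
    lines = [f"receptor = {receptor}", f"ligand = {ligand}"]
    for axis, value in zip("xyz", center):
        lines.append(f"center_{axis} = {value:.3f}")
    for axis, value in zip("xyz", size):
        lines.append(f"size_{axis} = {value:.3f}")
    lines.append(f"exhaustiveness = {exhaustiveness}")
    return "\n".join(lines) + "\n"


def parse_vina_scores(pdbqt_text: str) -> list[float]:
    """Extract binding scores (kcal/mol) from 'REMARK VINA RESULT:' lines."""
    scores = []
    for line in pdbqt_text.splitlines():
        if line.startswith("REMARK VINA RESULT:"):
            scores.append(float(line.split()[3]))
    return scores
```

A ligand-centroid box is only a starting point; the box should still be checked against the known binding-site residues before any docking run.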
Install
Recommended local install
git clone https://github.com/ehsansyh/sciagentkit.git
cd sciagentkit
bash scripts/install.sh --all
This installs the full Python stack, attempts to install external scientific tools through conda-forge, generates agent configuration files, registers the Claude Code plugin, runs sciagent doctor, runs tests, and executes the bundled demo. External CLI tools are installed one at a time; if AutoDock Vina or fpocket cannot be installed automatically, see docs/EXTERNAL_TOOLS.md for direct installation links.
Core-only install
For molecule audit and descriptor/scaffold analysis only:
bash scripts/install.sh --core --agents
Docker
docker compose up --build sciagentkit
For MCP server mode:
docker compose up --build sciagentkit-mcp
Quick demo
source scripts/activate.sh
sciagent doctor
sciagent demo
Expected demo output includes:
runs/install_demo/
├── query/
├── reference/
├── novelty_report.csv
├── descriptor_comparison.csv
├── scaffold_comparison.csv
├── figures/
├── report.md
└── run_manifest.json
Use from the CLI
Analyze one SMILES library
sciagent analyze-smiles examples/egfr_demo/ligands.smi \
  --out runs/egfr_profile
Audit generated molecules against known actives
sciagent audit-generated \
  examples/hiv_demo/generated_molecules.smi \
  examples/hiv_demo/known_actives.smi \
  --out runs/hiv_generated_audit
Select the best PDB structure for a mutated target
sciagent protein-select EGFR \
  --mutation L858R \
  --out runs/egfr_protein
Prepare ligands at pH 7.4
sciagent prepare-ligands examples/egfr_demo/ligands.smi \
  --ph 7.4 \
  --target-family kinase \
  --out runs/egfr_ligands
Create a target-screening project skeleton
sciagent target-screen-project EGFR examples/egfr_demo/ligands.smi \
  --mutation L858R \
  --ph 7.4 \
  --target-family kinase \
  --out runs/egfr_project
Use with AI agents
SciAgentKit is designed to be used as:
Agent skill / workflow instruction + local MCP scientific tool server
The agent selects the stage and the MCP server computes the values, so descriptors, docking scores, protein coverage, RMSD, and interaction frequencies come from deterministic tool outputs rather than model estimates.
One-command setup for any agent
Fresh GitHub checkout:
git clone https://github.com/ehsansyh/sciagentkit.git
cd sciagentkit
bash scripts/install.sh --all
For an existing checkout, point SciAgentKit at your agent of choice:
sciagent init-agent claude # CLAUDE.md + project .mcp.json
sciagent init-agent codex # AGENTS.md + codex_config.fragment.toml + .codex/skills/
sciagent init-agent gemini # GEMINI.md + .gemini/settings.json + .gemini/skills/
sciagent init-agent cursor # .cursor rule + .mcp/sciagentkit.json
sciagent init-agent all # everything above
Every agent integration uses the same cross-platform launcher (scripts/start_mcp.py, no bash required) and the same MCP server, so behavior is aligned across Claude Code, Codex, Gemini CLI, and Cursor.
Skill source of truth. All skill definitions live in .claude/skills/. The plugin copy and any .gemini/skills/ are generated from that directory; never edit them directly. After changing a skill, run sciagent sync-skills to propagate the change everywhere.
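The source-of-truth rule can be pictured as a one-way copy from the canonical directory into each generated location. The sketch below is an assumption about what such a sync might look like, not the implementation of sciagent sync-skills; the sync_skills function and its behavior of replacing each target wholesale are hypothetical.

```python
import shutil
from pathlib import Path


def sync_skills(source: Path, targets: list[Path]) -> list[Path]:
    """Replace each target directory with a fresh copy of the canonical skills."""
    if not source.is_dir():
        raise FileNotFoundError(f"Canonical skills directory not found: {source}")
    written = []
    for target in targets:
        if target.resolve() == source.resolve():
            raise ValueError("A target must not be the canonical source itself")
        if target.exists():
            # Remove the stale generated copy so deleted skills do not linger.
            shutil.rmtree(target)
        shutil.copytree(source, target)  # also creates missing parent directories
        written.append(target)
    return written
```

Replacing the target wholesale, rather than merging into it, is what makes hand edits in generated copies disappear on the next sync, which is why those copies should never be edited directly.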
Claude Code
Claude Code has the best experience because SciAgentKit ships as a plugin with namespaced skills and a bundled MCP launcher.
Install:
bash scripts/install.sh --full --agents --external --claude-plugin
Open Claude Code from the repository root:
claude
Inside Claude Code:
/reload-plugins
/plugin list
/mcp
Use the stage skills:
/sciagentkit:smiles-analysis Analyze examples/egfr_demo/ligands.smi and save outputs to runs/claude_egfr_profile.
/sciagentkit:generated-molecule-audit Audit examples/hiv_demo/generated_molecules.smi against examples/hiv_demo/known_actives.smi and save outputs to runs/claude_hiv_audit.
/sciagentkit:protein-structure-selection Select the best PDB structure for EGFR L858R using UniProt, mutation-aware alignment, residue coverage, and resolution ranking. Save outputs to runs/egfr_protein.
/sciagentkit:full-target-screening Run a full target-screening project for EGFR L858R using examples/egfr_demo/ligands.smi at pH 7.4. Stop before docking or MD if required external inputs are missing. Save outputs to runs/egfr_full.
Available Claude Code skills:
/sciagentkit:smiles-analysis
/sciagentkit:generated-molecule-audit
/sciagentkit:literature-search
/sciagentkit:protein-structure-selection
/sciagentkit:ligand-preparation
/sciagentkit:pocket-and-docking
/sciagentkit:md-workflow
/sciagentkit:trajectory-analysis
/sciagentkit:report
/sciagentkit:full-target-screening
OpenAI Codex
Codex can use SciAgentKit through MCP + Agent Skills + AGENTS.md.
Option A: ask Codex to install it
Open Codex in any directory and ask:
Clone https://github.com/ehsansyh/sciagentkit, run the installer, configure SciAgentKit MCP and skills for Codex, verify with codex mcp list, run sciagent doctor, and run the demo.
Option B: one command (recommended)
After installing SciAgentKit (pip install -e . or bash scripts/install.sh), generate Codex config from a single source:
sciagent init-agent codex
This writes AGENTS.md (operating manual), .codex/skills/, and codex_config.fragment.toml with a project-local cross-platform MCP entry. Add the fragment to your Codex config if your Codex surface does not auto-read project MCP fragments.
Option C: configure manually
git clone https://github.com/ehsansyh/sciagentkit.git
cd sciagentkit
bash scripts/install.sh --full --agents --external
Add the MCP server:
codex mcp add sciagentkit \
  --env PYTHONPATH="$(pwd)/src" \
  -- python scripts/start_mcp.py
The launcher scripts/start_mcp.py is cross-platform (Windows/macOS/Linux) and needs no bash. On Windows, use python scripts\start_mcp.py.
Verify inside Codex:
/mcp
Use natural prompts:
Use the SciAgentKit generated-molecule-audit skill to audit examples/hiv_demo/generated_molecules.smi against examples/hiv_demo/known_actives.smi. Save outputs to runs/codex_hiv_audit.
Use SciAgentKit to select the best PDB structure for EGFR L858R. Explain mutation match, sequence coverage, residue span, resolution, and limitations.
Codex should treat AGENTS.md as the project-level operating manual and the MCP server as the only source for scientific calculations.
Gemini CLI
Gemini CLI can use SciAgentKit through MCP + Agent Skills + GEMINI.md / skill folders.
One command (recommended)
After installing SciAgentKit, run:
sciagent init-agent gemini
This writes GEMINI.md, adds the sciagentkit MCP server to .gemini/settings.json (merging into any existing file), and copies all skills into .gemini/skills/ from the canonical source. Then just open gemini.
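Merging into an existing settings file, rather than overwriting it, might look like the sketch below. It assumes a JSON settings file with a top-level mcpServers object; the merge_mcp_server function is hypothetical and is not taken from the SciAgentKit source.

```python
import json
from pathlib import Path


def merge_mcp_server(settings_path: Path, name: str, entry: dict) -> dict:
    """Add or update one MCP server entry while preserving all other settings."""
    settings = {}
    if settings_path.exists():
        settings = json.loads(settings_path.read_text(encoding="utf-8"))
    # Only the named server is touched; other servers and keys stay as they were.
    settings.setdefault("mcpServers", {})[name] = entry
    settings_path.parent.mkdir(parents=True, exist_ok=True)
    settings_path.write_text(json.dumps(settings, indent=2) + "\n", encoding="utf-8")
    return settings
```

Running the same merge twice leaves the file unchanged, so re-running the setup step does not duplicate entries or disturb unrelated settings.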
Manual setup
Install SciAgentKit:
git clone https://github.com/ehsansyh/sciagentkit.git
cd sciagentkit
bash scripts/install.sh --full --agents --external
Add SciAgentKit to ~/.gemini/settings.json or project-level .gemini/settings.json:
{
  "mcpServers": {
    "sciagentkit": {
      "command": "python",
      "args": ["/ABSOLUTE/PATH/TO/sciagentkit/scripts/start_mcp.py"],
      "cwd": "/ABSOLUTE/PATH/TO/sciagentkit",
      "timeout": 30000,
      "trust": true
    }
  }
}
Copy skills from the canonical source (.claude/skills/ holds the real SKILL.md files):
mkdir -p .gemini/skills
cp -r .claude/skills/* .gemini/skills/
Open Gemini CLI:
gemini
Verify:
/mcp list
/skills
Use prompts:
Use SciAgentKit generated-molecule-audit to audit examples/hiv_demo/generated_molecules.smi against examples/hiv_demo/known_actives.smi. Save outputs to runs/gemini_hiv_audit.
Use SciAgentKit full-target-screening for EGFR L858R using examples/egfr_demo/ligands.smi at pH 7.4. Ask whether to use the recommended force field or a user-specified force field before ligand preparation.
MCP tools exposed by SciAgentKit
When the MCP server is active, agents can call:
canonicalize
descriptors
bemis_murcko_scaffold
analyze_library
compare_libraries
audit_generated_library
literature_search
protein_select
prepare_ligands
detect_pocket
run_docking
md_plan
analyze_trajectory
target_screen_project
write_report
Start manually:
python -m sciagentkit.servers.rdkit_server
or, cross-platform (Windows/macOS/Linux, no bash required):
python scripts/start_mcp.py
Security model
SciAgentKit is local-first, but local-first still requires explicit boundaries for filesystem writes, external tools, and high-cost workflows.
Safety principles
No invented science: agents must call MCP tools for molecular descriptors, UniProt/PDB ranking, docking results, RMSD/RMSF, and interaction frequencies.
No blind shell execution: scientific workflows should go through SciAgentKit CLI/MCP tools, not arbitrary model-written shell commands.
Runs stay under runs/: outputs are expected to be written into controlled run directories.
Manifest required: each workflow writes run_manifest.json with command, parameters, hashes, environment information, and output files.
External binaries are explicit: Open Babel, AutoDock Vina, fpocket, OpenMM, MDAnalysis, and ProLIF are checked and reported by sciagent doctor.
No secret harvesting: skills must not ask for API keys unless the user explicitly configures a relevant optional service.
Human approval for high-impact actions: docking/MD execution, large compute jobs, and external downloads should remain user-visible and interruptible.
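One way to enforce the "runs stay under runs/" principle is to resolve every requested output path and reject anything that escapes the run root. This is a sketch under that assumption; the resolve_inside_runs function is hypothetical and does not describe how SciAgentKit validates paths.

```python
from pathlib import Path


def resolve_inside_runs(requested: str, runs_root: Path) -> Path:
    """Resolve an output path and refuse it if it falls outside the run root."""
    root = runs_root.resolve()
    # Joining an absolute path replaces the root, so the check below rejects it too.
    candidate = (root / requested).resolve()
    if candidate != root and root not in candidate.parents:
        raise ValueError(f"Output path escapes {root}: {requested!r}")
    return candidate
```

Resolving before comparing matters because a path such as ../outside looks harmless as a string yet points outside the run directory once normalized.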
Recommended agent permissions
Workflow | Suggested approval level |
SMILES audit / descriptors / figures | Low to medium |
Literature search | Medium, because network calls and citations matter |
Protein/PDB selection | Medium |
Ligand preparation | Medium |
Docking execution | High |
MD execution | High |
Report generation from existing outputs | Low |
Trust checklist before enabling a third-party skill
Read SKILL.md
Check scripts/ for shell commands
Check MCP config for external servers
Run sciagent doctor
Use a disposable test folder first
Do not expose private compound libraries to untrusted remote tools
Keep Claude/Codex/Gemini approval prompts enabled for heavy workflows
Scientific limitations
SciAgentKit is a workflow scaffold and tool layer. It does not replace expert review for protein preparation, docking validation, force-field parameterization, or MD interpretation.
Current limitations:
receptor protonation needs expert review
missing loops, cofactors, metals, and crystal waters may require manual handling
docking scores are not biological activity
ligand parameterization must be validated before serious MD
1–5 ns MD is a screening-level sanity check, not production evidence
MM/GBSA and free-energy workflows are not yet production-grade in this repository
membrane proteins, covalent inhibitors, metalloenzymes, and unusual cofactors need custom preparation
Example outputs
A generated-molecule audit produces:
runs/hiv_generated_audit/
├── query/
│ ├── canonicalization_report.csv
│ ├── cleaned_molecules.csv
│ ├── descriptors.csv
│ ├── descriptor_summary.csv
│ ├── scaffolds.csv
│ └── scaffold_summary.csv
├── reference/
├── novelty_report.csv
├── cross_duplicates.csv
├── novel_query_molecules.csv
├── descriptor_comparison.csv
├── scaffold_comparison.csv
├── figures/
│ ├── qed_comparison.png
│ ├── mw_comparison.png
│ ├── logp_comparison.png
│ └── tpsa_comparison.png
├── report.md
└── run_manifest.json
A protein-selection run produces:
runs/egfr_protein/
├── uniprot_target.csv
├── target_sequence.fasta
├── pdb_candidates.csv
├── ranked_structures.csv
└── run_manifest.json
A docking/MD project can produce:
runs/egfr_project/
├── literature/
├── protein/
├── ligands/
├── pocket/
├── docking/
├── md/
├── analysis/
├── final_report.md
├── final_report.docx
├── final_report.pdf
└── run_manifest.json
Example figures generated by SciAgentKit
These examples are produced by the bundled demo workflows and show the kind of plots the toolkit writes automatically under each run directory.
Repository layout
sciagentkit/
├── src/sciagentkit/
│ ├── core/ # deterministic scientific utilities
│ ├── skills/ # workflow functions
│ ├── servers/ # MCP server
│ ├── agents/ # reference agent prompts/plans
│ └── plotting/ # figures
├── plugins/
│ └── claude-code-sciagentkit/
├── skills/
│ ├── claude/
│ ├── codex/
│ ├── cursor/
│ └── openclaw/
├── examples/
│ ├── hiv_demo/
│ └── egfr_demo/
├── scripts/
│ ├── install.sh
│ ├── start_mcp.sh
│ └── setup_agents.py
├── docs/
├── tests/
└── runs/
Roadmap
v1.1
stronger Codex plugin packaging
Gemini CLI skill installer
better receptor preparation checks
PAINS/Brenk filters
SA score integration
richer report templates
v1.2
Meeko / PDBQT preparation helpers
improved force-field recommendation database
docking pose clustering
interaction heatmaps
publication-ready figure themes
v2.0
hosted/private SciAgentKit Cloud
team workspaces
secure MCP gateway
private compound-library audit
managed docking/MD compute queues
pharma-style audit logs and report templates
Citation
If you use SciAgentKit in a project, cite the repository:
@software{sciagentkit,
  title = {SciAgentKit: MCP-native scientific skills for reproducible computational biology and drug-discovery agents},
  author = {Ehsan Sayyah},
  year = {2026},
  url = {https://github.com/ehsansyh/sciagentkit}
}
License
Apache-2.0.
Final warning from the tiny guardian of reproducibility
If an agent claims it docked a ligand, ran MD, calculated RMSD, and found a drug without producing structured outputs and a manifest, it did not do science. It performed theater with a GPU costume.