Schema | Gigwa-MCP

Gigwa-MCP


Server Configuration

Environment variables that configure the server. All are optional; defaults are shown where they exist.

Name	Required	Description	Default
`GIGWA_URL`	No	The base URL of the Gigwa server (without /rest suffix).	https://gigwa.icarda.org:8443/gigwa/
`GIGWA_PASS`	No	The password for authentication.
`GIGWA_USER`	No	The username for authentication.
`GIGWA_TIMEOUT`	No	Read/request timeout in seconds.	120
`GIGWA_URL_OTHER`	No	The base URL of the "other" Gigwa server.	https://gigwa.icarda.org:8443/gigwa/
`GIGWA_PASS_OTHER`	No	The password for authentication.
`GIGWA_USER_OTHER`	No	The username for authentication.
`GIGWA_CONNECT_TIMEOUT`	No	TCP connect timeout in seconds.	10
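
The table above can be read as a small configuration loader. The following is a minimal sketch, not the server's actual code: the function name `load_config`, the `suffix` parameter and the returned dictionary layout are assumptions for illustration. The `(connect, read)` timeout tuple follows the convention used by HTTP clients such as `requests`, which is also an assumption about how the two timeout variables are combined.

```python
import os

# Default taken from the configuration table above.
DEFAULT_URL = "https://gigwa.icarda.org:8443/gigwa/"


def load_config(env=None, suffix=""):
    """Collect connection settings from environment variables.

    suffix="" reads GIGWA_URL / GIGWA_USER / GIGWA_PASS;
    suffix="_OTHER" reads the variables for the "other" server.
    (Hypothetical helper; names are illustrative only.)
    """
    env = os.environ if env is None else env
    return {
        "url": env.get("GIGWA_URL" + suffix, DEFAULT_URL),
        # Empty strings are treated as "not set".
        "user": env.get("GIGWA_USER" + suffix) or None,
        "password": env.get("GIGWA_PASS" + suffix) or None,
        # (connect timeout, read timeout) in seconds.
        "timeout": (
            float(env.get("GIGWA_CONNECT_TIMEOUT", "10")),
            float(env.get("GIGWA_TIMEOUT", "120")),
        ),
    }
```

Calling `load_config()` with no arguments reads the process environment; passing a plain dictionary makes the defaults easy to check without touching real settings.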

Capabilities

Features and capabilities supported by this server

Capability	Details
`tools`	{ "listChanged": false }
`prompts`	{ "listChanged": false }
`resources`	{ "subscribe": false, "listChanged": false }
`experimental`	{}

Tools

Functions exposed to the LLM to take actions

Name	Description
`gigwa_connect`	Switch the active Gigwa server at runtime, with no restart needed. Re-points every subsequent tool (and the gigwa:// resources) at `url` for the rest of the session. Credentials are never passed through the chat; they are resolved from the environment: a named profile reads the `GIGWA_USER_`/`GIGWA_PASS_` variables carrying that profile's suffix (for example `GIGWA_USER_OTHER`/`GIGWA_PASS_OTHER` from the Server Configuration table), and `anonymous=True` sends none. With neither, the default `GIGWA_USER`/`GIGWA_PASS` are used only when reconnecting to the configured `GIGWA_URL`; switching to any other server without a profile connects anonymously, so your home credentials are never transmitted to a different host unasked. The new connection is verified with a live round-trip before the tool returns; on failure the previous connection is restored.
`gigwa_server_info`	Check connectivity to the configured Gigwa server. Generates an auth token with the configured credentials and reports the server URL and (best-effort) version. Use this first to confirm the connection works before importing data.
`list_content`	List the databases, projects and runs currently hosted on the Gigwa server.
`import_dartseq`	Import DArTseq data from xlsx report(s) into Gigwa. Converts the DArTseq SNP and/or Silico-DArT xlsx report(s) to a standard VCF, doing the 2-row genotype calling in Python (so reference homozygotes are not mis-imported as heterozygous, as Gigwa's built-in DArT parser does), and uploads it to create/append a database (`module`), `project` and `run`. Provide at least one of `snp_xlsx` / `silico_xlsx` (absolute paths). SNP and Silico use different allele models; importing both into the same run is unusual, so prefer separate runs unless you specifically intend to combine them. If `reference_fasta` is given (a reference genome FASTA or a prebuilt minimap2 `.mmi` index; an `.mmi` is loaded directly with no re-indexing, preferred for large genomes), the SNP markers' tag sequences are aligned to it and uniquely-mapped markers (mapq ≥ `min_mapq`) are imported genome-anchored (real chromosome/position); the rest stay on an `Unmapped` contig. Without it, all markers go on `Unmapped`. `positions_csv` reuses a mapping already produced by `map_dartseq_to_reference` (its `dartseq_positions.csv`) instead of re-aligning, which is much faster when you've already inspected the mapping. Provide either `reference_fasta` or `positions_csv`, not both. Set `clear_project_data=True` to replace any existing data in the project, `skip_monomorphic=True` to drop non-variant markers, and `wait=False` to return immediately with a progress token instead of blocking until done.
`import_vcf`	Import a VCF file (`.vcf` or `.vcf.gz`) into Gigwa. Uploads the VCF to create/append a database (`module`), `project` and `run`. `technology` is optional free-text (e.g. 'WGS', 'GBS'). Use `clear_project_data=True` to replace existing project data and `wait=False` to return a progress token instead of blocking.
`map_dartseq_to_reference`	Infer genomic positions for DArTseq SNP markers by aligning their tag sequences. Aligns each marker's ~69 bp `AlleleSequence` tag to `reference_fasta` (a reference genome FASTA, or a prebuilt minimap2 `.mmi` index) and reports the inferred chromosome, position and strand of each SNP. Writes `dartseq_positions.csv` (allele_id, chrom, pos, strand, mapq, ref, alt, status). The result can be passed to `import_dartseq` (`positions_csv=`) to import the data genome-anchored instead of on an `Unmapped` contig. `backend`: `"auto"` uses the minimap2 CLI when available (streams over multi-part indexes → bounded RAM, best for large multi-gigabase genomes), falling back to the in-process `mappy` binding. Markers are classified `unique` (mapq ≥ `min_mapq`), `multi` (ambiguous), or `unmapped`.
`get_import_progress`	Report the current status of a running import, given its progress token.
`abort_import`	Abort a running import (or other long process), given its progress token. Asks Gigwa to cancel the process identified by `progress_token` (the token returned by `import_dartseq` / `import_vcf` when run with `wait=False`). Returns whether the abort request was accepted; poll `get_import_progress` afterwards to confirm it stopped.
`validate_metadata`	Validate an individual-metadata file against a Gigwa database without importing. `metadata_type` is the name of the ID column in the file that links rows to genotype entities; for individual metadata this is the `individual` column (the header must match exactly, case-sensitive). `tsv_path` is a TSV whose first column header equals `metadata_type`.
`import_metadata`	Import individual metadata (per-individual attributes) into an existing Gigwa database. The file is a TSV whose first column header equals `metadata_type` (`individual` for individual metadata) and whose values match the individual/sample names already present in the database. Remaining columns become searchable attributes. By default the file is validated first; set `validate_first=False` to skip that check.
`get_germplasm_metadata`	Fetch server-stored per-individual metadata (germplasm attributes) for a database. Reads the attributes already stored in Gigwa (imported earlier via `import_metadata` or a BrAPI source) for the module of `variant_set_db_id`, via BrAPI germplasm. Writes `germplasm_metadata.csv` (one row per accession, attribute columns) that can be fed back to the grouping tools (`diversity_fst` / `diversity_by_group` via `metadata_tsv`). Returns an empty-result note when the Gigwa build does not expose germplasm attributes (some 2.12 builds do not).
`qc_call_rate`	Per-sample and per-marker call rate (missingness) QC for a variant set. Flags samples/markers below the given thresholds. Writes `call_rate_samples.csv` and `call_rate_markers.csv` and returns a summary with the overall call rate and the worst offenders. `variant_set_db_id` is a BrAPI variantSetDbId (from `list_content` / BrAPI variantsets). For large production sets pass `method="allelematrix"` with `max_markers` (e.g. 20000) to estimate from a server-side marker subset instead of a full VCF export. `region` (`"chrom"` or `"chrom:start-end"`, 1-based; from `list_sequences`) restricts the analysis to one genomic window; this option is available on every QC/diversity tool.
`qc_heterozygosity`	Per-sample observed heterozygosity QC, flagging outliers. High Ho relative to the cohort suggests contamination or off-types; very low Ho suggests selfed/inbred or duplicated material. Flags samples more than `outlier_sd` standard deviations from the mean. Writes `heterozygosity_samples.csv`. For large sets pass `method="allelematrix"` + `max_markers` to avoid a full VCF export.
`qc_duplicate_accessions`	Detect duplicate / clonal accessions via pairwise identity-by-state (IBS). Computes IBS allele-sharing similarity between every pair of samples and groups pairs at or above `similarity_threshold` into duplicate sets; this is the core genebank "cleaning" check for mislabelled duplicates and clones. By default subsamples to `max_markers` evenly-spaced markers for speed (set to 0/None to use all). Writes `duplicate_pairs.csv` and `duplicate_groups.csv`. For large sets pass `method="allelematrix"` to fetch the marker subset without a full export.
`qc_maf_filter`	Report markers that would be filtered by MAF / missingness (no changes applied). Computes per-marker minor-allele frequency and missing rate, and counts how many markers are monomorphic, below `maf_threshold`, or above `max_missing` missing. Writes `marker_filter_stats.csv`. This is a report only; it does not modify Gigwa. For large sets pass `method="allelematrix"` + `max_markers` to sample server-side.
`diversity_summary`	Per-marker diversity statistics (MAF, He, Ho, PIC) and dataset means. He is Nei's gene diversity (1 - Σpᵢ²), Ho is observed heterozygosity, PIC is polymorphism information content. Writes `diversity_markers.csv`. For large sets pass `method="allelematrix"` + `max_markers` to sample server-side.
`diversity_pca`	Principal component analysis of population structure. Runs PCA on the alt-allele dosage matrix (monomorphic markers dropped, missing mean-imputed, Patterson scaling). Writes `pca_coords.csv` (per-sample PC coordinates) and reports variance explained plus any PC1/PC2 outlier samples (beyond `outlier_sd` SD). Pass `metadata_tsv` + `group_column` to add a `group` column (population label per sample) for colouring the PC plot. For large sets pass `method="allelematrix"` + `max_markers` to avoid a full VCF export.
`diversity_kinship`	VanRaden genomic relationship (kinship) matrix. Computes G = ZZ'/(2 Σp(1-p)) from alt dosage. Writes the full matrix as `kinship_matrix.csv` (samples × samples) and reports the most-related pairs and the diagonal (self-relationship / inbreeding) range. For large sets pass `method="allelematrix"` + `max_markers` to avoid a full VCF export.
`diversity_fst`	Pairwise Weir & Cockerham Fst between groups of samples. Define the groups one of two ways: (1) `groups_json`: a JSON object mapping each group name to a list of accession names (or callset ids), e.g. `{"north": ["112","156"], "south": ["11","42"]}`; or (2) `metadata_tsv` + `group_column`: read groups from a metadata TSV (the same file format used by `import_metadata`), keyed on `id_column` (default `individual`) and grouped by `group_column`. Writes `fst_pairwise.csv` with the Fst for every group pair. (Server-side BrAPI attributes are not used for grouping because that endpoint is unavailable on the target Gigwa 2.12 build.)
`diversity_by_group`	Per-population diversity: He, Ho, Fis, MAF, % polymorphic, allelic richness. Define groups the same way as `diversity_fst`: either `groups_json` `{group: [names]}` or `metadata_tsv` + `group_column`. For each group computes n, % polymorphic markers, mean MAF, Nei's He, observed Ho, Fis (1−Ho/He), mean observed allelic richness, and rarefied allelic richness (rarefied to the smallest group's gene-copy count so unequal group sizes are comparable). Writes `diversity_by_group.csv`.
`diversity_core_collection`	Select a core collection that maximises captured allelic diversity. Greedy allele-coverage selection (Core-Hunter style): repeatedly add the accession that contributes the most not-yet-captured marker-alleles. Pick the core `size` directly, or as `fraction` of all accessions (default 10%). Writes `core_collection.csv` (rank, accession, cumulative allele coverage) and reports the fraction of total allelic diversity the core captures.
`diversity_structure`	Lightweight population-structure clustering (PCA + K-means, in-Python). Reduces the alt-dosage matrix with PCA (Patterson scaling), then runs K-means for K in `k_min..k_max` and picks the K with the highest pseudo-F (Calinski-Harabasz) between/within variance ratio, which shows a clear maximum when groups are well separated. Writes `structure_clusters.csv` (sample, assigned cluster at the best K, PC coords) and reports the chosen K with cluster sizes. (No external ADMIXTURE binary is needed; everything is computed in Python, consistent with the rest of the analysis layer.)
`diversity_tree`	UPGMA dendrogram of accessions from IBS allele-sharing distance (Newick). Builds a pairwise IBS similarity matrix, converts to distance (1 − IBS), and writes a UPGMA tree as `tree.nwk` (standard Newick, loadable in FigTree / iTOL / ape). Marker subsampling (`max_markers`) keeps it tractable on large sets.
`audit_import_quality`	Scan a Gigwa instance for databases imported with genotype-encoding artifacts. With no `variant_set_db_id` this audits every run on the instance; pass one to audit a single variant set. For each run it pulls a bounded genotype sample (up to `max_markers` markers × `max_samples` callsets) via paged BrAPI `search/allelematrix`, which is cheap and constant-cost regardless of how large the variant set is, so it is safe to run across a whole production instance without exporting multi-GB VCFs. The aggregate genotype-class fractions it needs are estimated tightly from the sample (a true zero hom-alt class stays zero; a rare-but-real one shows up). It flags two import failure modes plus two weaker signals. BROKEN: cohort mean Ho above `het_threshold` (DArT 2-row mis-call), or homozygous-alt genotypes far below their HWE expectation given the alt-allele frequency (lost hom-alt class; the HWE test avoids false positives on low-MAF / mostly-monomorphic panels where near-zero hom-alt is genuine). SUSPECT: call rate above `complete_call_rate` (no missing data, often missing forced to 0/0), monomorphic fraction above `monomorphic_threshold`, or AD/DP depth fields present but uniformly zero (a VCF synthesised from genotype calls with fabricated depth/likelihoods; the same converter often miscalls GT too). Writes `import_quality_scan.csv` (one row per run) under `output_dir` (default `./gigwa_results/`) and returns a summary ranked worst-first. Read-only: it never modifies Gigwa.
`count_variants`	Count variants matching filters, computed server-side (nothing is downloaded). Fast way to size a query before pulling data. Filter by genomic region (`reference_name` + optional `start`/`end`, from `list_sequences`), minor-allele frequency (`min_maf`/`max_maf`) and/or `max_missing_data` (0–1 fraction). With no filters this returns the total variant count of the set. `variant_set_db_id` is a BrAPI variantSetDbId (from `list_variant_sets` / `list_content`).
`search_variants`	Search variants matching filters server-side and write the matching list to CSV. Same filters as `count_variants` (region / MAF / missing-data). Returns variant metadata only (id, chrom, pos, ref, alt), with no genotypes fetched, and writes `variant_search.csv`. Use `count_variants` first to size the result; `max_variants` caps how many are retrieved. For downstream genotype analysis on a filtered subset, use the `region`/`min_maf` options on the QC/diversity tools instead.
`list_sequences`	List the reference sequences (chromosomes/contigs) available in a variant set. Use this to discover valid `reference_name` values for the region filters on `count_variants` / `search_variants` / the QC & diversity tools.
`list_variant_sets`	List every variant set (run) with its exact BrAPI variantSetDbId. The other tools take a `variant_set_db_id`; this returns those ids directly (plus name and variant/callset counts when the server provides them), complementing the human-readable database/project/run view from `list_content`.
`export_genotypes`	Export a variant set to a file in the given format. `format` is one of Gigwa's export formats. Which are available depends on the Gigwa build: `VCF` (default), `PLINK` and `FLAPJACK` are commonly supported; others (`HAPMAP`, `DARWIN`, …) may not be, in which case the tool reports the formats this instance actually offers. The export runs server-side and is streamed to `output_path`. For large sets this can take a while; raise `timeout` (seconds).
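
The credential rules described for the connect tool in the table above can be sketched as a small decision function. This is an illustration under stated assumptions, not the server's actual code: `resolve_credentials`, its parameters, the upper-cased profile suffix and the trailing-slash-insensitive URL comparison are all hypothetical.

```python
import os


def resolve_credentials(url, home_url, profile=None, anonymous=False, env=None):
    """Return (user, password) to send to `url`, or None for anonymous access.

    Hypothetical helper mirroring the documented rules:
      1. anonymous=True      -> never send credentials
      2. named profile       -> read GIGWA_USER_<PROFILE> / GIGWA_PASS_<PROFILE>
                                (upper-casing the profile is an assumption)
      3. url == home server  -> default GIGWA_USER / GIGWA_PASS
      4. any other server    -> anonymous, so home credentials never leak
    """
    env = os.environ if env is None else env
    if anonymous:
        return None
    if profile:
        suffix = "_" + profile.upper()
    elif url.rstrip("/") == home_url.rstrip("/"):
        suffix = ""
    else:
        return None
    user = env.get("GIGWA_USER" + suffix)
    password = env.get("GIGWA_PASS" + suffix)
    # Treat a missing username as "no credentials configured".
    return (user, password) if user else None
```

The key design point is the final branch: without an explicit profile, credentials are only ever returned for the configured home URL.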

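As a worked example of the per-marker statistics named for the diversity summary tool (MAF, He, Ho, PIC), the sketch below computes them for one biallelic marker from alt-allele dosages. The function name `marker_stats` and the dosage encoding (0, 1 or 2 alt alleles, `None` for a missing call) are assumptions for illustration; the PIC expression is the standard biallelic form, He − 2p²q², and is not taken from the server's code.

```python
def marker_stats(dosages):
    """Diversity statistics for one biallelic marker.

    dosages: list of alt-allele counts per sample (0, 1, 2) or None if missing.
    Returns None when no sample has a call.
    """
    called = [d for d in dosages if d is not None]
    n = len(called)
    if n == 0:
        return None
    p = sum(called) / (2 * n)          # alt-allele frequency
    q = 1.0 - p                        # ref-allele frequency
    he = 1.0 - (p * p + q * q)         # Nei's gene diversity, 1 - sum(p_i^2)
    ho = sum(1 for d in called if d == 1) / n  # observed heterozygosity
    pic = he - 2.0 * p * p * q * q     # biallelic PIC
    return {
        "maf": min(p, q),
        "he": he,
        "ho": ho,
        "pic": pic,
        "call_rate": n / len(dosages),
    }


stats = marker_stats([0, 1, 2, 1, None])
```

For the five samples above (one missing), both alleles have frequency 0.5, so He and Ho are 0.5 and PIC is 0.375, the maximum for a biallelic marker. The per-group inbreeding coefficient listed for `diversity_by_group` follows directly as 1 − Ho/He.
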
Prompts

Interactive templates invoked by user choice

Name	Description
`import_and_qc`	Import a genotype dataset (DArTseq xlsx or VCF) into Gigwa, then run standard QC.
`diversity_report`	Produce a population diversity / structure report for a variant set.
`qc_triage`	Run the full QC suite on a variant set and give a go/no-go verdict.
`explore_instance`	Get an overview of the whole Gigwa instance and flag anything that needs attention.
`region_scan`	Characterise variants and diversity within one genomic region.

Resources

Contextual data attached and managed by the client

Name	Description
`Tool catalog`	A categorised catalog of every tool with its EDAM operation/topic annotations.
`Gigwa server info`	Configured connection info: the target URL and auth mode. Deliberately makes no network call: reading a resource must be side-effect-free, and the server should not generate outbound traffic during directory inspection. Use the `gigwa_server_info` tool to actually test the live connection and fetch the server version.


MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/gkanogiannis/Gigwa-MCP'
