diversity_by_group
Compute per-population genetic diversity metrics, including heterozygosity, Fis, MAF, and rarefied allelic richness. Groups can be defined via JSON or a metadata file, and results are saved as a CSV.
Instructions
Per-population diversity: He, Ho, Fis, MAF, % polymorphic, allelic richness.
Define groups the same way as diversity_fst: either groups_json
({group: [names]}) or metadata_tsv + group_column. For each group, computes
n, % polymorphic markers, mean MAF, Nei's He, observed Ho, Fis (1−Ho/He), mean
observed allelic richness, and rarefied allelic richness (rarefied to the
smallest group's gene-copy count, so that groups of unequal size are
comparable). Writes diversity_by_group.csv.
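The metrics listed above can be sketched for a single group as follows. This is a minimal illustration, not the tool's actual implementation. It assumes biallelic markers coded as alt-allele counts (0/1/2, NaN for missing), He computed as 2p(1−p) without a small-sample correction, Fis taken from the means of Ho and He across markers, and hypergeometric rarefaction for allelic richness. The function names `group_diversity` and `rarefied_richness` are hypothetical.

```python
from math import comb

import numpy as np


def rarefied_richness(allele_counts, g):
    """Expected number of distinct alleles in a random draw of g gene copies."""
    total = sum(allele_counts)
    g = min(g, total)  # cannot draw more copies than were observed
    return sum(1 - comb(total - c, g) / comb(total, g)
               for c in allele_counts if c > 0)


def group_diversity(genotypes, g):
    """genotypes: (individuals x markers) alt-allele counts; g: gene copies to rarefy to."""
    geno = np.asarray(genotypes, dtype=float)
    n_called = (~np.isnan(geno)).sum(axis=0)
    keep = n_called > 0                      # drop markers with no calls at all
    geno, n_called = geno[:, keep], n_called[keep]

    copies = 2 * n_called                    # gene copies per marker (diploid assumption)
    alt = np.nansum(geno, axis=0)            # alt-allele copies per marker
    p = alt / copies
    maf = np.minimum(p, 1 - p)
    ho = (geno == 1).sum(axis=0) / n_called  # NaN never equals 1, so missing calls are skipped
    he = 2 * p * (1 - p)                     # expected heterozygosity, uncorrected

    mean_ho, mean_he = float(ho.mean()), float(he.mean())
    observed = (alt > 0).astype(int) + (alt < copies).astype(int)
    rarefied = [rarefied_richness([int(a), int(c - a)], g)
                for a, c in zip(alt, copies)]
    return {
        "n": geno.shape[0],
        "pct_polymorphic": 100 * float((maf > 0).mean()),
        "mean_maf": float(maf.mean()),
        "he": mean_he,
        "ho": mean_ho,
        "fis": 1 - mean_ho / mean_he if mean_he > 0 else float("nan"),
        "allelic_richness": float(observed.mean()),
        "rarefied_richness": float(np.mean(rarefied)),
    }


# Four individuals, three markers; the third marker is monomorphic.
stats = group_diversity([[0, 1, 0], [1, 1, 0], [2, 1, 0], [1, 1, 0]], g=4)
```

In this sketch `g` would be set to the smallest group's gene-copy count (twice its number of individuals for diploids), so every group is rarefied to the same depth.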
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| method | No | Genotype source: 'vcf' (full export, cached) or 'allelematrix' (paged, server-side subset). | vcf |
| region | No | Restrict analysis to a genomic window: 'chrom' or 'chrom:start-end' (1-based). | |
| id_column | No | Column in the metadata TSV holding the individual/accession id (default 'individual'). | individual |
| output_dir | No | Directory for the output CSV(s). | ./gigwa_results/<module>/ |
| groups_json | No | JSON object mapping each group name to a list of accession names/ids. | |
| max_markers | No | Cap the number of markers analysed (evenly-spaced subsample); omit to use all. | |
| group_column | No | Column in the metadata TSV holding the group/population label. | |
| metadata_tsv | No | Path to a metadata TSV (import_metadata format) used to define groups. | |
| variant_set_db_id | Yes | BrAPI variantSetDbId identifying the run (MODULE§project§run); from list_variant_sets / list_content. | |
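The two ways of defining groups carry the same information. The sketch below shows how a metadata TSV with an id column and a group column reduces to the {group: [names]} mapping that groups_json expects. It assumes a plain tab-separated file with a header row; the helper `groups_from_metadata`, the `population` column, and the accession names are hypothetical, and whether the tool takes groups_json as a serialized string or as a structured object is not stated here.

```python
import csv
import io
import json
from collections import defaultdict


def groups_from_metadata(tsv_text, group_column, id_column="individual"):
    """Build a {group: [ids]} mapping from tab-separated metadata text."""
    groups = defaultdict(list)
    for row in csv.DictReader(io.StringIO(tsv_text), delimiter="\t"):
        if row[group_column]:  # skip individuals with no group label
            groups[row[group_column]].append(row[id_column])
    return dict(groups)


# Hypothetical metadata: three accessions in two populations.
metadata = "individual\tpopulation\nacc1\tnorth\nacc2\tnorth\nacc3\tsouth\n"
groups = groups_from_metadata(metadata, group_column="population")

# Serialized form, in case groups_json is passed as a string.
groups_json = json.dumps(groups)
```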
Output Schema
| Name | Required | Description | Default |
|---|---|---|---|
| result | Yes | | |