map_dartseq_to_reference
Aligns DArTseq SNP marker tag sequences to a reference genome to infer chromosome, position, and strand for each marker, enabling genome-anchored data import.
Instructions
Infer genomic positions for DArTseq SNP markers by aligning their tag sequences.
Aligns each marker's ~69 bp AlleleSequence tag to reference_fasta (a
reference genome FASTA, or a prebuilt minimap2 .mmi index) and reports the
inferred chromosome, position and strand of each SNP. Writes
dartseq_positions.csv (allele_id, chrom, pos, strand, mapq, ref, alt, status).
The result can be passed to import_dartseq (reference_fasta=) so that the
data are imported genome-anchored rather than placed on an Unmapped contig.
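As a rough illustration of how the output might be consumed, the sketch below reads a file with the column layout listed above and tallies markers by status. The sample rows, the marker IDs, and the assumption that unmapped markers leave chrom, pos, strand and mapq empty are made up for the example; the real file may differ.

```python
import csv
import io
from collections import Counter

# Column names as listed in the tool description.
COLUMNS = ["allele_id", "chrom", "pos", "strand", "mapq", "ref", "alt", "status"]


def load_positions(handle):
    """Read dartseq_positions.csv rows into dicts, converting numeric fields."""
    reader = csv.DictReader(handle)
    missing = set(COLUMNS) - set(reader.fieldnames or [])
    if missing:
        raise ValueError(f"missing columns: {sorted(missing)}")
    rows = []
    for row in reader:
        # Assumption: pos and mapq are empty strings for unmapped markers.
        row["pos"] = int(row["pos"]) if row["pos"] else None
        row["mapq"] = int(row["mapq"]) if row["mapq"] else None
        rows.append(row)
    return rows


# Made-up sample content, for illustration only.
SAMPLE = """allele_id,chrom,pos,strand,mapq,ref,alt,status
m1,chr1,1200,+,60,A,G,unique
m2,chr3,88,-,0,C,T,multi
m3,,,,,G,A,unmapped
"""

rows = load_positions(io.StringIO(SAMPLE))
status_counts = Counter(r["status"] for r in rows)
anchored = [r for r in rows if r["status"] == "unique"]
```

To read a real result, pass an open file handle for dartseq_positions.csv in place of the `io.StringIO` sample.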
backend: "auto" uses the minimap2 CLI when available (it streams over
multi-part indexes, so RAM stays bounded; best for large multi-gigabase
genomes) and otherwise falls back to the in-process mappy binding. Markers
are classified as unique (mapq ≥ min_mapq), multi (ambiguous), or unmapped.
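The backend choice and the three-way classification can be sketched as follows. This is a minimal sketch, not the tool's actual code: the function names `resolve_backend` and `classify` are hypothetical, the threshold of 20 in the usage lines is an arbitrary example value, and treating every mapped tag below min_mapq as "multi" is an assumption (the real rule may also inspect secondary alignments).

```python
import shutil

VALID_BACKENDS = ("auto", "cli", "mappy")


def resolve_backend(backend="auto"):
    """Resolve 'auto' to 'cli' when a minimap2 executable is on PATH, else 'mappy'."""
    if backend not in VALID_BACKENDS:
        raise ValueError(f"backend must be one of {VALID_BACKENDS}")
    if backend != "auto":
        return backend
    return "cli" if shutil.which("minimap2") else "mappy"


def classify(mapq, min_mapq):
    """Classify one marker from its best alignment's mapping quality.

    mapq is None when the tag produced no alignment at all.
    """
    if mapq is None:
        return "unmapped"
    # Assumption: any mapped tag below the threshold counts as ambiguous.
    return "unique" if mapq >= min_mapq else "multi"


# Example threshold only; the tool's default min_mapq is not stated here.
labels = [classify(q, min_mapq=20) for q in (60, 3, None)]
chosen = resolve_backend("auto")
```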
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| preset | No | minimap2 preset (default 'sr' for short reads). | sr |
| backend | No | Aligner backend: 'auto' (minimap2 CLI if available, else mappy), 'cli', or 'mappy'. | auto |
| min_mapq | No | Minimum mapping quality for a tag to count as uniquely mapped. | |
| snp_xlsx | Yes | Path to a DArTseq SNP xlsx report. | |
| output_dir | No | Directory for the output CSV(s) (default ./gigwa_results/<module>/). | |
| reference_fasta | Yes | Path to a reference genome FASTA or a prebuilt minimap2 .mmi index, for genome-anchoring. | |
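For orientation, a set of arguments matching the table above might be assembled and checked like this. The file paths are hypothetical, `check_arguments` is an illustrative helper rather than part of the tool, and how the arguments are actually sent depends on the client in use.

```python
import json

REQUIRED = {"snp_xlsx", "reference_fasta"}
ALLOWED = REQUIRED | {"preset", "backend", "min_mapq", "output_dir"}
BACKENDS = {"auto", "cli", "mappy"}


def check_arguments(args):
    """Validate argument names against the input schema table."""
    unknown = set(args) - ALLOWED
    missing = REQUIRED - set(args)
    if unknown:
        raise ValueError(f"unknown arguments: {sorted(unknown)}")
    if missing:
        raise ValueError(f"missing required arguments: {sorted(missing)}")
    if args.get("backend", "auto") not in BACKENDS:
        raise ValueError("backend must be 'auto', 'cli', or 'mappy'")
    return args


arguments = check_arguments({
    "snp_xlsx": "reports/Report_SNP.xlsx",        # hypothetical path
    "reference_fasta": "genomes/reference.mmi",   # hypothetical path
    "preset": "sr",
    "backend": "auto",
})
payload = json.dumps(arguments, indent=2)
```

min_mapq and output_dir are left out here so that the tool's own defaults apply.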
Output Schema
| Name | Required | Description | Default |
|---|---|---|---|
| result | Yes | | |