ensembl_sequence
Retrieve DNA, RNA, or protein sequences for genes, transcripts, or genomic regions. Specify species, sequence type, and output format to extract precise genomic data from the Ensembl database.
Instructions
Retrieve DNA, RNA, or protein sequences for genes, transcripts, or genomic regions. Covers the Ensembl REST /sequence/id and /sequence/region endpoints.
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| format | No | Output format: `json` or `fasta` | json |
| identifier | Yes | Feature ID (gene, transcript, etc.) OR genomic region in format 'chr:start-end' (e.g., 'ENSG00000141510', 'ENST00000288602', '17:7565096-7590856', 'X:1000000-2000000') | |
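| mask | No | Mask repeats: `soft` (lowercase) or `hard` (replaced with N) | |
| sequence_type | No | Type of sequence to retrieve: `genomic`, `cdna`, `cds`, or `protein`. Applied only when `identifier` is a feature ID; region requests ignore it. | genomic |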
| species | No | Species name (e.g., 'homo_sapiens', 'mus_musculus') | homo_sapiens |
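
As an illustration of the table above, the arguments can be written as a typed object. This is a minimal sketch, not part of the server: the `EnsemblSequenceArgs` type and the `withDefaults` helper are hypothetical names introduced here, and the defaults are the ones listed in the table.

```typescript
// Hypothetical type mirroring the input schema table; not defined in the source repo.
interface EnsemblSequenceArgs {
  identifier: string; // feature ID or 'chr:start-end' region
  sequence_type?: "genomic" | "cdna" | "cds" | "protein";
  species?: string;
  format?: "json" | "fasta";
  mask?: "soft" | "hard";
}

// Hypothetical helper: fill in the documented defaults for omitted optional fields.
function withDefaults(args: EnsemblSequenceArgs): EnsemblSequenceArgs {
  return {
    sequence_type: "genomic",
    species: "homo_sapiens",
    format: "json",
    ...args,
  };
}

// Protein sequence for a transcript ID, returned as FASTA.
const proteinById: EnsemblSequenceArgs = {
  identifier: "ENST00000288602",
  sequence_type: "protein",
  format: "fasta",
};

// Soft-masked genomic sequence for a region on chromosome 17.
const maskedRegion: EnsemblSequenceArgs = {
  identifier: "17:7565096-7590856",
  mask: "soft",
};
```

Only `identifier` is required, so `withDefaults(maskedRegion)` yields a genomic, JSON-formatted request for `homo_sapiens` with soft masking.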
Implementation Reference
- src/handlers/tools.ts:198-235 (registration): Tool registration definition for 'ensembl_sequence', including name, description, and input schema.

  ```typescript
  {
    name: "ensembl_sequence",
    description:
      "Retrieve DNA, RNA, or protein sequences for genes, transcripts, regions. Covers /sequence/id and /sequence/region endpoints.",
    inputSchema: {
      type: "object",
      properties: {
        identifier: {
          type: "string",
          description:
            "Feature ID (gene, transcript, etc.) OR genomic region in format 'chr:start-end' (e.g., 'ENSG00000141510', 'ENST00000288602', '17:7565096-7590856', 'X:1000000-2000000')",
        },
        sequence_type: {
          type: "string",
          enum: ["genomic", "cdna", "cds", "protein"],
          description: "Type of sequence to retrieve",
          default: "genomic",
        },
        species: {
          type: "string",
          description: "Species name (e.g., 'homo_sapiens', 'mus_musculus')",
          default: "homo_sapiens",
        },
        format: {
          type: "string",
          enum: ["json", "fasta"],
          description: "Output format",
          default: "json",
        },
        mask: {
          type: "string",
          enum: ["soft", "hard"],
          description: "Mask repeats (soft=lowercase, hard=N)",
        },
      },
      required: ["identifier"],
    },
  },
  ```
- src/handlers/tools.ts:501-511 (handler): Handler function that normalizes inputs and delegates to EnsemblApiClient.getSequenceData for execution.

  ```typescript
  export async function handleSequence(args: any) {
    try {
      const normalizedArgs = normalizeEnsemblInputs(args);
      return await ensemblClient.getSequenceData(normalizedArgs);
    } catch (error) {
      return {
        error: error instanceof Error ? error.message : "Unknown error",
        success: false,
      };
    }
  }
  ```
- index.ts:119-127 (handler): Dispatch handler in the main server that routes 'ensembl_sequence' calls to the handleSequence function.

  ```typescript
  case "ensembl_sequence":
    return {
      content: [
        {
          type: "text",
          text: JSON.stringify(await handleSequence(args), null, 2),
        },
      ],
    };
  ```
- src/types/ensembl.ts:42-47 (schema): TypeScript interface defining the expected output structure for sequence responses.

  ```typescript
  export interface EnsemblSequence {
    id: string;
    desc: string;
    molecule: string;
    seq: string;
  }
  ```
- src/utils/ensembl-api.ts:235-264 (helper): Core method of EnsemblApiClient that performs the API request, choosing the endpoint by identifier type (region or feature ID).

  ```typescript
  async getSequenceData(args: any): Promise<any> {
    const {
      identifier,
      sequence_type = "genomic",
      species = "homo_sapiens",
      format = "json",
      mask,
    } = args;

    const params: Record<string, string> = {};
    if (mask) {
      params.mask = mask;
    }
    if (format === "fasta") {
      params.content_type = "text/x-fasta";
    }

    // Check if identifier looks like a region (contains :)
    if (identifier.includes(":")) {
      return this.makeRequest(
        `/sequence/region/${species}/${identifier}`,
        params
      );
    } else {
      // It's a feature ID
      const typeParam =
        sequence_type !== "genomic" ? `?type=${sequence_type}` : "";
      return this.makeRequest(`/sequence/id/${identifier}${typeParam}`, params);
    }
  }
  ```
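
The helper's routing rule (an identifier containing `:` is a region, anything else is a feature ID) can be sketched as a pure URL builder. This is an illustrative sketch, not the repository's code: `buildSequenceUrl`, `SequenceRequest`, and `ENSEMBL_REST_BASE` are hypothetical names, the base URL is assumed to be the public Ensembl REST server, and the FASTA case is assumed to use Ensembl's `content-type` query parameter. How `makeRequest` merges the `?type=` suffix with its `params` argument is not shown in the source, so the sketch sidesteps that by collecting every parameter in one query object.

```typescript
// Assumption: the public Ensembl REST server; the source does not show the base URL.
const ENSEMBL_REST_BASE = "https://rest.ensembl.org";

// Hypothetical request shape matching the tool's input schema.
interface SequenceRequest {
  identifier: string;
  sequence_type?: "genomic" | "cdna" | "cds" | "protein";
  species?: string;
  format?: "json" | "fasta";
  mask?: "soft" | "hard";
}

// Hypothetical helper: build the request URL without performing any network call.
function buildSequenceUrl(req: SequenceRequest): string {
  const {
    identifier,
    sequence_type = "genomic",
    species = "homo_sapiens",
    format = "json",
    mask,
  } = req;

  const query = new URLSearchParams();
  if (mask) query.set("mask", mask);
  // Assumption: FASTA output is requested via the content-type query parameter.
  if (format === "fasta") query.set("content-type", "text/x-fasta");

  // Same rule as the helper above: a colon marks a genomic region.
  const isRegion = identifier.includes(":");
  // As in the helper, sequence_type only applies to feature IDs.
  if (!isRegion && sequence_type !== "genomic") query.set("type", sequence_type);

  const path = isRegion
    ? `/sequence/region/${species}/${identifier}`
    : `/sequence/id/${identifier}`;
  const qs = query.toString();
  return `${ENSEMBL_REST_BASE}${path}${qs ? `?${qs}` : ""}`;
}
```

For example, `buildSequenceUrl({ identifier: "17:7565096-7590856" })` targets `/sequence/region/homo_sapiens/17:7565096-7590856` with no query string, while a transcript ID with `sequence_type: "cds"` targets `/sequence/id/...` with `type=cds` in the query.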