batch_translate
Translate multiple biological identifiers in a single batch operation in Biomart MCP, mapping a specified source attribute to a target attribute. Returns the successful translations, the IDs that could not be translated, and a count of each.
Instructions
Translates multiple identifiers in a single batch operation.
This function is more efficient than multiple calls to get_translation when
you need to translate many identifiers at once.
Args:
- mart (str): The mart identifier (e.g., "ENSEMBL_MART_ENSEMBL")
- dataset (str): The dataset identifier (e.g., "hsapiens_gene_ensembl")
- from_attr (str): The source attribute name (e.g., "hgnc_symbol")
- to_attr (str): The target attribute name (e.g., "ensembl_gene_id")
- targets (list[str]): List of identifier values to translate (e.g., ["TP53", "BRCA1", "BRCA2"])
Returns:
dict: A dictionary containing:
- translations: Dictionary mapping input IDs to translated IDs
- not_found: List of IDs that could not be translated
- found_count: Number of successfully translated IDs
- not_found_count: Number of IDs that could not be translated
Example:
batch_translate("ENSEMBL_MART_ENSEMBL", "hsapiens_gene_ensembl", "hgnc_symbol", "ensembl_gene_id", ["TP53", "BRCA1", "BRCA2"])
>>> {"translations": {"TP53": "ENSG00000141510", "BRCA1": "ENSG00000012048"}, "not_found": ["BRCA2"], "found_count": 2, "not_found_count": 1}
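A caller will usually want to check the counts and report anything that was missed. The following is a minimal sketch of that, using the result dictionary from the example above; `summarize` is a hypothetical helper and not part of Biomart MCP.

```python
# Result dictionary copied from the documented example above.
result = {
    "translations": {"TP53": "ENSG00000141510", "BRCA1": "ENSG00000012048"},
    "not_found": ["BRCA2"],
    "found_count": 2,
    "not_found_count": 1,
}


def summarize(result: dict) -> str:
    """Hypothetical helper: build a one-line summary of a batch_translate result."""
    total = result["found_count"] + result["not_found_count"]
    line = f"{result['found_count']}/{total} identifiers translated"
    if result["not_found"]:
        # List the identifiers that had no entry in the translation table.
        line += f"; missing: {', '.join(result['not_found'])}"
    return line


print(summarize(result))  # prints "2/3 identifiers translated; missing: BRCA2"
```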
Input Schema
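| Name | Required | Description | Default |
|---|---|---|---|
| dataset | Yes | The dataset identifier (e.g., "hsapiens_gene_ensembl") | |
| from_attr | Yes | The source attribute name (e.g., "hgnc_symbol") | |
| mart | Yes | The mart identifier (e.g., "ENSEMBL_MART_ENSEMBL") | |
| targets | Yes | List of identifier values to translate (e.g., ["TP53", "BRCA1", "BRCA2"]) | |
| to_attr | Yes | The target attribute name (e.g., "ensembl_gene_id") | |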
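Since all five parameters are required, a client may want to confirm that an arguments object is complete before sending it. This sketch shows one way to do that with the values from the documented example; `REQUIRED` and `missing_arguments` are hypothetical names used only for illustration.

```python
# The five required parameter names from the input schema table.
REQUIRED = ("mart", "dataset", "from_attr", "to_attr", "targets")

# Arguments taken from the documented example call.
arguments = {
    "mart": "ENSEMBL_MART_ENSEMBL",
    "dataset": "hsapiens_gene_ensembl",
    "from_attr": "hgnc_symbol",
    "to_attr": "ensembl_gene_id",
    "targets": ["TP53", "BRCA1", "BRCA2"],
}


def missing_arguments(arguments: dict) -> list[str]:
    """Hypothetical helper: return the required names absent from `arguments`."""
    return [name for name in REQUIRED if name not in arguments]


print(missing_arguments(arguments))  # prints []
```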
Implementation Reference
- biomart-mcp.py:307-356 (handler) The main handler function for the batch_translate tool, decorated with @mcp.tool() for registration. It uses a cached translation dictionary to map multiple input identifiers to their corresponding targets efficiently.

      def batch_translate(mart: str, dataset: str, from_attr: str, to_attr: str, targets: list[str]):
          """
          Translates multiple identifiers in a single batch operation.

          This function is more efficient than multiple calls to get_translation when
          you need to translate many identifiers at once.

          Args:
              mart (str): The mart identifier (e.g., "ENSEMBL_MART_ENSEMBL")
              dataset (str): The dataset identifier (e.g., "hsapiens_gene_ensembl")
              from_attr (str): The source attribute name (e.g., "hgnc_symbol")
              to_attr (str): The target attribute name (e.g., "ensembl_gene_id")
              targets (list[str]): List of identifier values to translate (e.g., ["TP53", "BRCA1", "BRCA2"])

          Returns:
              dict: A dictionary containing:
                  - translations: Dictionary mapping input IDs to translated IDs
                  - not_found: List of IDs that could not be translated
                  - found_count: Number of successfully translated IDs
                  - not_found_count: Number of IDs that could not be translated

          Example:
              batch_translate("ENSEMBL_MART_ENSEMBL", "hsapiens_gene_ensembl", "hgnc_symbol", "ensembl_gene_id", ["TP53", "BRCA1", "BRCA2"])
              >>> {"translations": {"TP53": "ENSG00000141510", "BRCA1": "ENSG00000012048"}, "not_found": ["BRCA2"], "found_count": 2, "not_found_count": 1}
          """
          # Use the cached helper function to get the translation dictionary
          result_dict = _get_translation_dict(mart, dataset, from_attr, to_attr)

          translations = {}
          not_found = []

          for target in targets:
              if target in result_dict:
                  translations[target] = result_dict[target]
              else:
                  not_found.append(target)

          if not_found:
              print(
                  f"The following targets were not found: {', '.join(not_found)}",
                  file=sys.stderr,
              )

          return {
              "translations": translations,
              "not_found": not_found,
              "found_count": len(translations),
              "not_found_count": len(not_found),
          }
- biomart-mcp.py:259-271 (helper) Cached helper function that queries Biomart to build a translation dictionary from from_attr to to_attr, used by both the get_translation and batch_translate tools.

      def _get_translation_dict(mart: str, dataset: str, from_attr: str, to_attr: str):
          """
          Helper function to get and cache a translation dictionary.
          """
          try:
              server = get_server()
              dataset_obj = server[mart][dataset]
              df = dataset_obj.query(attributes=[from_attr, to_attr])
              return dict(zip(df.iloc[:, 0], df.iloc[:, 1]))
          except Exception as e:
              print(f"Error getting translation dictionary: {str(e)}", file=sys.stderr)
              return {}
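The two listings above have a few behaviors that are easy to miss, and the sketch below reproduces them without a Biomart connection. It is an illustration under stated assumptions, not the project's code. The rows and identifiers are made up, `build_lookup` and `partition` are hypothetical stand-ins for the helper and the handler loop, and `functools.lru_cache` is only one possible caching mechanism, since the listing does not show how the real helper is cached.

```python
from functools import lru_cache

# Made-up (source, target) rows standing in for the two-column query result.
FAKE_ROWS = [("GENE_A", "ID_1"), ("GENE_A", "ID_2"), ("GENE_B", "ID_3")]
query_count = 0  # counts how often the stubbed "query" actually runs


@lru_cache(maxsize=None)  # assumption: the real caching mechanism is not shown
def build_lookup(mart: str, dataset: str, from_attr: str, to_attr: str) -> dict:
    """Hypothetical stand-in for the helper: zip two columns into a dict."""
    global query_count
    query_count += 1
    sources = [source for source, _ in FAKE_ROWS]
    translated = [target for _, target in FAKE_ROWS]
    # dict() keeps the last value for a repeated key, so GENE_A maps to ID_2.
    return dict(zip(sources, translated))


def partition(lookup: dict, targets: list[str]) -> dict:
    """Same loop as the handler: split targets into found and not found."""
    translations, not_found = {}, []
    for target in targets:
        if target in lookup:
            translations[target] = lookup[target]
        else:
            not_found.append(target)
    return {
        "translations": translations,
        "not_found": not_found,
        "found_count": len(translations),
        "not_found_count": len(not_found),
    }


lookup = build_lookup("mart", "dataset", "from", "to")
build_lookup("mart", "dataset", "from", "to")  # served from the cache
result = partition(lookup, ["GENE_A", "GENE_A", "GENE_C", "GENE_C"])

print(query_count)                # prints 1
print(result["translations"])     # prints {'GENE_A': 'ID_2'}
print(result["found_count"])      # prints 1
print(result["not_found_count"])  # prints 2
```

Two things follow from this. Because the dictionary is built with `dict(zip(...))`, a source value with several target values keeps only the last one. Because `found_count` is the size of a dictionary while `not_found` is a list, repeated inputs are counted once when found but once per occurrence when missing, so the two counts do not always sum to `len(targets)`.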