batch_translate

Translate multiple biological identifiers in a single batch operation, from a specified source attribute to a specified target attribute. Returns the successful translations, the IDs that could not be translated, and a count of each, for efficient processing in Biomart MCP.

Instructions

Translates multiple identifiers in a single batch operation.

This function is more efficient than multiple calls to get_translation when
you need to translate many identifiers at once.

Args:
    mart (str): The mart identifier (e.g., "ENSEMBL_MART_ENSEMBL")
    dataset (str): The dataset identifier (e.g., "hsapiens_gene_ensembl")
    from_attr (str): The source attribute name (e.g., "hgnc_symbol")
    to_attr (str): The target attribute name (e.g., "ensembl_gene_id")
    targets (list[str]): List of identifier values to translate (e.g., ["TP53", "BRCA1", "BRCA2"])

Returns:
    dict: A dictionary containing:
        - translations: Dictionary mapping input IDs to translated IDs
        - not_found: List of IDs that could not be translated
        - found_count: Number of successfully translated IDs
        - not_found_count: Number of IDs that could not be translated

Example:
    >>> batch_translate("ENSEMBL_MART_ENSEMBL", "hsapiens_gene_ensembl", "hgnc_symbol", "ensembl_gene_id", ["TP53", "BRCA1", "BRCA2"])
    {"translations": {"TP53": "ENSG00000141510", "BRCA1": "ENSG00000012048"}, "not_found": ["BRCA2"], "found_count": 2, "not_found_count": 1}
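
As an illustration of how a caller might consume the dictionary described under Returns, here is a minimal sketch. `summarize` is a hypothetical helper, not part of Biomart MCP, and the `result` literal simply copies the example output above. It assumes the result carries exactly the four documented keys.

```python
def summarize(result: dict, targets: list[str]) -> dict:
    """Derive a coverage ratio and an input-ordered list from a batch_translate result."""
    total = result["found_count"] + result["not_found_count"]
    coverage = result["found_count"] / total if total else 0.0
    # Keep the caller's original order; untranslated IDs become None.
    ordered = [result["translations"].get(target) for target in targets]
    return {"coverage": coverage, "ordered": ordered}


targets = ["TP53", "BRCA1", "BRCA2"]
# Copied from the documented example output (illustrative data only).
result = {
    "translations": {"TP53": "ENSG00000141510", "BRCA1": "ENSG00000012048"},
    "not_found": ["BRCA2"],
    "found_count": 2,
    "not_found_count": 1,
}

summary = summarize(result, targets)
print(summary["ordered"])
```

Looking IDs up with `.get` rather than indexing keeps the sketch safe for targets that appear in `not_found`.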

Input Schema

Name       Required  Description  Default
dataset    Yes
from_attr  Yes
mart       Yes
targets    Yes
to_attr    Yes
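
All five parameters are required and none has a default, so a caller has to supply the complete set. A minimal sketch of assembling and checking the arguments before sending them might look like this. `build_arguments` is a hypothetical helper, the type checks are inferred from the Args section rather than from the published schema, and how an MCP client actually transmits the object is not shown here.

```python
import json

# The four parameters documented as plain strings in the Args section.
STRING_FIELDS = ("mart", "dataset", "from_attr", "to_attr")


def build_arguments(mart, dataset, from_attr, to_attr, targets) -> dict:
    """Assemble the batch_translate arguments and reject obviously bad input."""
    args = {
        "mart": mart,
        "dataset": dataset,
        "from_attr": from_attr,
        "to_attr": to_attr,
        "targets": targets,
    }
    for name in STRING_FIELDS:
        if not isinstance(args[name], str) or not args[name]:
            raise ValueError(f"{name} must be a non-empty string")
    # A bare string would pass an "iterable of str" check, so require a list.
    if not isinstance(targets, list) or not all(isinstance(t, str) for t in targets):
        raise ValueError("targets must be a list of strings")
    return args


arguments = build_arguments(
    "ENSEMBL_MART_ENSEMBL",
    "hsapiens_gene_ensembl",
    "hgnc_symbol",
    "ensembl_gene_id",
    ["TP53", "BRCA1", "BRCA2"],
)
print(json.dumps(arguments, indent=2))
```

Rejecting a bare string for `targets` guards against a common slip, where `"TP53"` would otherwise be iterated character by character.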

Implementation Reference

  • The main handler function for the batch_translate tool, decorated with @mcp.tool() for registration. It fetches a cached translation dictionary once, then looks up each input identifier in it, collecting the translated values and the identifiers that have no match.
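    # `mcp` (the server instance) and `get_server` are defined elsewhere in the module.
    import sys
    
    @mcp.tool()
    def batch_translate(mart: str, dataset: str, from_attr: str, to_attr: str, targets: list[str]) -> dict: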
        """
        Translates multiple identifiers in a single batch operation.
    
        This function is more efficient than multiple calls to get_translation when
        you need to translate many identifiers at once.
    
        Args:
            mart (str): The mart identifier (e.g., "ENSEMBL_MART_ENSEMBL")
            dataset (str): The dataset identifier (e.g., "hsapiens_gene_ensembl")
            from_attr (str): The source attribute name (e.g., "hgnc_symbol")
            to_attr (str): The target attribute name (e.g., "ensembl_gene_id")
            targets (list[str]): List of identifier values to translate (e.g., ["TP53", "BRCA1", "BRCA2"])
    
        Returns:
            dict: A dictionary containing:
                - translations: Dictionary mapping input IDs to translated IDs
                - not_found: List of IDs that could not be translated
                - found_count: Number of successfully translated IDs
                - not_found_count: Number of IDs that could not be translated
    
        Example:
            >>> batch_translate("ENSEMBL_MART_ENSEMBL", "hsapiens_gene_ensembl", "hgnc_symbol", "ensembl_gene_id", ["TP53", "BRCA1", "BRCA2"])
            {"translations": {"TP53": "ENSG00000141510", "BRCA1": "ENSG00000012048"}, "not_found": ["BRCA2"], "found_count": 2, "not_found_count": 1}
        """
        # Use the cached helper function to get the translation dictionary
        result_dict = _get_translation_dict(mart, dataset, from_attr, to_attr)
    
        translations = {}
        not_found = []
    
        for target in targets:
            if target in result_dict:
                translations[target] = result_dict[target]
            else:
                not_found.append(target)
    
        if not_found:
            print(
                f"The following targets were not found: {', '.join(not_found)}",
                file=sys.stderr,
            )
    
        return {
            "translations": translations,
            "not_found": not_found,
            "found_count": len(translations),
            "not_found_count": len(not_found),
        }
  • Cached helper function that queries Biomart to build a translation dictionary from from_attr to to_attr, used by both get_translation and batch_translate tools.
    def _get_translation_dict(mart: str, dataset: str, from_attr: str, to_attr: str):
        """
        Helper function to get and cache a translation dictionary.
        """
        try:
            server = get_server()
            dataset_obj = server[mart][dataset]
            df = dataset_obj.query(attributes=[from_attr, to_attr])
            return dict(zip(df.iloc[:, 0], df.iloc[:, 1]))
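        except Exception as e:
            # On any failure, log to stderr and return an empty dictionary;
            # batch_translate then reports every target as not found.
            print(f"Error getting translation dictionary: {e}", file=sys.stderr)
            return {}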
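
The listing describes the helper as cached but does not show the caching mechanism. One way to get that behaviour is `functools.lru_cache`, sketched below under stated assumptions: `fake_query`, `get_translation_dict`, and `batch_lookup` are hypothetical stand-ins, the rows are the two mappings from the example above, and nothing here confirms how the actual server caches its results.

```python
from functools import lru_cache

# Counts how often the (pretend) remote query runs.
QUERY_CALLS = 0


def fake_query(from_attr: str, to_attr: str) -> list[tuple[str, str]]:
    """Hypothetical stand-in for the Biomart query; returns (source, target) rows."""
    global QUERY_CALLS
    QUERY_CALLS += 1
    return [("TP53", "ENSG00000141510"), ("BRCA1", "ENSG00000012048")]


@lru_cache(maxsize=None)
def get_translation_dict(from_attr: str, to_attr: str) -> dict[str, str]:
    """Build the lookup table once per (from_attr, to_attr) pair."""
    return dict(fake_query(from_attr, to_attr))


def batch_lookup(from_attr: str, to_attr: str, targets: list[str]) -> dict[str, str]:
    """Translate every target that has an entry in the cached table."""
    table = get_translation_dict(from_attr, to_attr)
    return {t: table[t] for t in targets if t in table}


first = batch_lookup("hgnc_symbol", "ensembl_gene_id", ["TP53"])
second = batch_lookup("hgnc_symbol", "ensembl_gene_id", ["BRCA1", "XYZ"])
print(first, second, QUERY_CALLS)
```

Two caveats apply to this approach. `lru_cache` hands back the same dictionary object on every hit, so callers should treat it as read-only. It also caches whatever the function returns, so a helper that returns `{}` on error would keep serving that empty result until the cache is cleared.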

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/jzinno/biomart-mcp'

If you have feedback or need assistance with the MCP directory API, please join our Discord server.