Glama

batch_translate

Translate multiple biological identifiers in one batch operation, from a specified source attribute to a specified target attribute. Returns the successful translations, the IDs that could not be translated, and a count of each.

Instructions

Translates multiple identifiers in a single batch operation.

This function is more efficient than multiple calls to get_translation when
you need to translate many identifiers at once.

Args:
    mart (str): The mart identifier (e.g., "ENSEMBL_MART_ENSEMBL")
    dataset (str): The dataset identifier (e.g., "hsapiens_gene_ensembl")
    from_attr (str): The source attribute name (e.g., "hgnc_symbol")
    to_attr (str): The target attribute name (e.g., "ensembl_gene_id")
    targets (list[str]): List of identifier values to translate (e.g., ["TP53", "BRCA1", "BRCA2"])

Returns:
    dict: A dictionary containing:
        - translations: Dictionary mapping input IDs to translated IDs
        - not_found: List of IDs that could not be translated
        - found_count: Number of successfully translated IDs
        - not_found_count: Number of IDs that could not be translated

Example:
    >>> batch_translate("ENSEMBL_MART_ENSEMBL", "hsapiens_gene_ensembl", "hgnc_symbol", "ensembl_gene_id", ["TP53", "BRCA1", "BRCA2"])
    {"translations": {"TP53": "ENSG00000141510", "BRCA1": "ENSG00000012048"}, "not_found": ["BRCA2"], "found_count": 2, "not_found_count": 1}
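
A caller will usually want to separate the resolved IDs from the unresolved ones. The sketch below works on a result dictionary of the shape documented under Returns; `summarize_batch_result` is a hypothetical helper written for illustration and is not part of Biomart MCP.

```python
def summarize_batch_result(result: dict) -> str:
    """Render a one-line summary of a batch_translate-style result dict."""
    total = result["found_count"] + result["not_found_count"]
    summary = f"{result['found_count']}/{total} identifiers translated"
    if result["not_found"]:
        # List the inputs that had no entry in the translation dictionary.
        summary += "; missing: " + ", ".join(result["not_found"])
    return summary


# Result dictionary copied from the documented example.
example = {
    "translations": {"TP53": "ENSG00000141510", "BRCA1": "ENSG00000012048"},
    "not_found": ["BRCA2"],
    "found_count": 2,
    "not_found_count": 1,
}

print(summarize_batch_result(example))
# prints: 2/3 identifiers translated; missing: BRCA2
```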

Input Schema

Name       Required  Description  Default
dataset    Yes
from_attr  Yes
mart       Yes
targets    Yes
to_attr    Yes
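
Since all five parameters are required and none has a default, a client may want to check an arguments object before sending it. The following is a minimal sketch; `REQUIRED_FIELDS`, `build_arguments`, and `missing_fields` are hypothetical names introduced here, and the parameter types are taken from the Args section above.

```python
import json

# The five required parameters listed in the input schema.
REQUIRED_FIELDS = ("mart", "dataset", "from_attr", "to_attr", "targets")


def build_arguments(mart: str, dataset: str, from_attr: str,
                    to_attr: str, targets: list[str]) -> dict:
    """Assemble the arguments object for a batch_translate tool call."""
    return {
        "mart": mart,
        "dataset": dataset,
        "from_attr": from_attr,
        "to_attr": to_attr,
        "targets": list(targets),
    }


def missing_fields(arguments: dict) -> list[str]:
    """Return the required parameter names absent from `arguments`."""
    return [name for name in REQUIRED_FIELDS if name not in arguments]


arguments = build_arguments(
    "ENSEMBL_MART_ENSEMBL",
    "hsapiens_gene_ensembl",
    "hgnc_symbol",
    "ensembl_gene_id",
    ["TP53", "BRCA1", "BRCA2"],
)
print(json.dumps(arguments, indent=2))
print(missing_fields({"mart": "ENSEMBL_MART_ENSEMBL"}))
```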

Implementation Reference

  • The main handler function for the batch_translate tool, decorated with @mcp.tool() for registration. It uses a cached translation dictionary to map multiple input identifiers to their corresponding targets efficiently.
    @mcp.tool()
    def batch_translate(mart: str, dataset: str, from_attr: str, to_attr: str, targets: list[str]):
        """
        Translates multiple identifiers in a single batch operation.
    
        This function is more efficient than multiple calls to get_translation when
        you need to translate many identifiers at once.
    
        Args:
            mart (str): The mart identifier (e.g., "ENSEMBL_MART_ENSEMBL")
            dataset (str): The dataset identifier (e.g., "hsapiens_gene_ensembl")
            from_attr (str): The source attribute name (e.g., "hgnc_symbol")
            to_attr (str): The target attribute name (e.g., "ensembl_gene_id")
            targets (list[str]): List of identifier values to translate (e.g., ["TP53", "BRCA1", "BRCA2"])
    
        Returns:
            dict: A dictionary containing:
                - translations: Dictionary mapping input IDs to translated IDs
                - not_found: List of IDs that could not be translated
                - found_count: Number of successfully translated IDs
                - not_found_count: Number of IDs that could not be translated
    
        Example:
            >>> batch_translate("ENSEMBL_MART_ENSEMBL", "hsapiens_gene_ensembl", "hgnc_symbol", "ensembl_gene_id", ["TP53", "BRCA1", "BRCA2"])
            {"translations": {"TP53": "ENSG00000141510", "BRCA1": "ENSG00000012048"}, "not_found": ["BRCA2"], "found_count": 2, "not_found_count": 1}
        """
        # Use the cached helper function to get the translation dictionary
        result_dict = _get_translation_dict(mart, dataset, from_attr, to_attr)
    
        translations = {}
        not_found = []
    
        for target in targets:
            if target in result_dict:
                translations[target] = result_dict[target]
            else:
                not_found.append(target)
    
        if not_found:
            print(
                f"The following targets were not found: {', '.join(not_found)}",
                file=sys.stderr,
            )
    
        return {
            "translations": translations,
            "not_found": not_found,
            "found_count": len(translations),
            "not_found_count": len(not_found),
        }
  • Cached helper function that queries Biomart to build a translation dictionary from from_attr to to_attr, used by both get_translation and batch_translate tools.
    def _get_translation_dict(mart: str, dataset: str, from_attr: str, to_attr: str):
        """
        Helper function to get and cache a translation dictionary.
        """
        try:
            server = get_server()
            dataset_obj = server[mart][dataset]
            df = dataset_obj.query(attributes=[from_attr, to_attr])
            return dict(zip(df.iloc[:, 0], df.iloc[:, 1]))
        except Exception as e:
            print(f"Error getting translation dictionary: {str(e)}", file=sys.stderr)
            return {}
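
One property of the helper above is worth noting: it builds the mapping with `dict(zip(...))`, so if a query returns several rows for the same source value, only the last row survives. The sketch below illustrates that behaviour on plain tuples and shows one possible alternative that keeps every target. `build_multi_map` is a hypothetical function, not part of Biomart MCP, and the identifiers are made-up placeholders rather than real query results.

```python
from collections import defaultdict


def build_multi_map(pairs):
    """Group (source_id, target_id) rows into {source_id: [target_id, ...]}."""
    grouped = defaultdict(list)
    for source_id, target_id in pairs:
        # Skip exact duplicate rows while preserving first-seen order.
        if target_id not in grouped[source_id]:
            grouped[source_id].append(target_id)
    return dict(grouped)


# Placeholder rows: GENE_A maps to two different targets.
rows = [("GENE_A", "ID_1"), ("GENE_A", "ID_2"), ("GENE_B", "ID_3")]

# Same construction as dict(zip(...)): the later row overwrites the earlier one.
print(dict(rows))
# prints: {'GENE_A': 'ID_2', 'GENE_B': 'ID_3'}

print(build_multi_map(rows))
# prints: {'GENE_A': ['ID_1', 'ID_2'], 'GENE_B': ['ID_3']}
```

Whether one-to-many rows actually occur depends on the attribute pair being queried, so this matters only for mappings that are not strictly one-to-one.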

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/jzinno/biomart-mcp'
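
The same request can be made from Python with the standard library. This is a rough sketch: `server_url` and `fetch_server_info` are hypothetical helpers, the owner/repository URL pattern is inferred from the single example above, and the code assumes the endpoint responds with a JSON body, which this page does not state.

```python
import json
import urllib.request

# Base path taken from the curl example; treating it as a general
# owner/repository pattern is an assumption.
API_BASE = "https://glama.ai/api/mcp/v1/servers"


def server_url(owner: str, repo: str) -> str:
    """Build the directory API URL for one MCP server."""
    return f"{API_BASE}/{owner}/{repo}"


def fetch_server_info(owner: str, repo: str, timeout: float = 10.0) -> dict:
    """GET the server record; assumes the response body is JSON."""
    with urllib.request.urlopen(server_url(owner, repo), timeout=timeout) as response:
        return json.load(response)


# Usage (performs a network request):
# info = fetch_server_info("jzinno", "biomart-mcp")
# print(json.dumps(info, indent=2))
```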

If you have feedback or need assistance with the MCP directory API, please join our Discord server.