Glama

PubTator-MCP-Server

MIT License
  • Linux
  • Apple
PubTator_search.cpython-310.pyc (12.3 kB)
This file is compiled CPython 3.10 bytecode for pubtator_search.py (source path recorded in the file: D:\code\github\mcp\PubTator-MCP-Server\pubtator_search.py). The binary does not render as text; the names, signatures, and string constants listed here are what can be recovered from it. Docstrings are translated from Chinese.

Imports: requests, json, time, urllib.parse, and List, Dict, Optional, Union, Generator from typing.

class PubTator3API
  • BASE_URL = "https://www.ncbi.nlm.nih.gov/research/pubtator3-api"
  • ANNOTATE_URL = "https://www.ncbi.nlm.nih.gov/CBBresearch/Lu/Demo/RESTful"

__init__(self, max_retries: int, timeout: int)
  Creates a requests.Session, sets the headers User-Agent: "PubTator3API Python Client/2.1" and Accept: "application/json", and stores request_delay, max_retries, timeout, and _last_request_time. The default argument values are not recoverable from the dump.

_rate_limited_request(self, method, *args, **kwargs)
  "Request method with rate limiting and retry mechanism."
  Appears to sleep when less than request_delay has elapsed since the previous request, sets the timeout keyword argument, then calls session.request up to max_retries times. On requests.exceptions.RequestException it re-raises on the final attempt; otherwise it sleeps for a capped backoff (a min(...) expression) and retries.

export_publications(self, ids: List[str], id_type: str = "pmid", format: str = "biocjson", full_text: bool = False) -> Union[Dict, str]
  Exports the annotation results for a set of publications.
  Parameters:
    ids: list of publication IDs (PMIDs or PMCIDs)
    id_type: ID type, either "pmid" or "pmcid"
    format: return format ("pubtator", "biocxml", or "biocjson")
    full_text: whether to fetch the full text (applies only to the biocxml and biocjson formats)
  Returns: the annotation results (a JSON dictionary or an XML string).
  Validation: raises ValueError("IDs list cannot be empty"), ValueError("id_type must be 'pmid' or 'pmcid'"), or ValueError("format must be one of: pubtator, biocxml, biocjson").
  Request: GET {BASE_URL}/publications/pmc_export/{format} for PMCIDs, otherwise {BASE_URL}/publications/export/{format}. The IDs are sent comma-joined under the key id_type + "s", and full=true is added when full text is requested for a non-pubtator format.
  Result handling: for biocjson, a list response is wrapped as {"documents": [...]}; other formats return response.text. A request failure is re-raised as Exception("Failed to export publications: ...").
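The rate limiting and retry behavior attributed to _rate_limited_request can be illustrated with a minimal sketch. This is not the module's code: the class name RateLimitedCaller, the delay, the retry count, and the backoff cap are all hypothetical values chosen for illustration, and the wrapped callable stands in for session.request so the example runs without network access.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class RateLimitedCaller:
    """Spaces calls at least `delay` seconds apart and retries failures."""

    def __init__(self, delay: float = 0.05, max_retries: int = 3,
                 backoff_cap: float = 0.2) -> None:
        # All three defaults are illustrative assumptions, not source values.
        self.delay = delay
        self.max_retries = max_retries
        self.backoff_cap = backoff_cap
        self._last_call = 0.0

    def call(self, func: Callable[[], T]) -> T:
        # Wait until at least `delay` seconds have passed since the last call.
        elapsed = time.monotonic() - self._last_call
        if elapsed < self.delay:
            time.sleep(self.delay - elapsed)

        for attempt in range(self.max_retries):
            try:
                result = func()
                self._last_call = time.monotonic()
                return result
            except Exception:
                # Final attempt: let the caller see the original error.
                if attempt == self.max_retries - 1:
                    raise
                # Exponential backoff, capped so the wait stays bounded.
                time.sleep(min(0.01 * 2 ** attempt, self.backoff_cap))
        raise RuntimeError("max_retries must be at least 1")
```

A caller would wrap each outgoing request, for example caller.call(lambda: session.get(url)), so that every request shares the same spacing and retry policy.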
find_entity_id(self, query: str, concept: Optional[str] = None, limit: Optional[int] = None) -> Dict
  Finds identifiers for a biological concept from a free-text query.
  Parameters:
    query: the query text
    concept: optional concept type ("gene", "disease", "chemical", "species", or "mutation"); any other value raises ValueError("Invalid concept type")
    limit: optional cap on the number of results
  Returns: a JSON dictionary containing entity IDs.
  Request: GET {BASE_URL}/entity/autocomplete/ with the parameters query, concept, and limit.

find_related_entities(self, entity_id: str, relation_type: Optional[str] = None, target_entity_type: Optional[str] = None, max_results: Optional[int] = None) -> Dict
  Finds related entities.
  Parameters:
    entity_id: entity ID (obtained from find_entity_id)
    relation_type: optional relation type (for example "treat", "cause", "interact", "associate")
    target_entity_type: optional target entity type (for example "gene", "disease", "chemical")
    max_results: optional cap on the number of results
  Returns: a JSON dictionary of related-entity results.
  Validation:
    entity_id must start with "@"; otherwise ValueError("Invalid entity ID format, should start with '@', e.g., '@CHEMICAL_remdesivir' or '@DISEASE_Neoplasms'").
    relation_type must be "ANY" or one of: treat, cause, cotreat, convert, compare, interact, associate, positive_correlate, negative_correlate, prevent, inhibit, stimulate, drug_interact; otherwise ValueError("Invalid relation type").
    target_entity_type must be one of: gene, disease, chemical, variant; otherwise ValueError("Invalid target entity type").
  Request: GET {BASE_URL}/relations with the parameters e1 (entity ID), type (relation type), e2 (target entity type), and limit.
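The parameter checks recoverable for find_related_entities can be expressed as a small pure function. The sketch below reuses the relation names, target types, parameter keys (e1, type, e2, limit), and error messages visible in the dump; the function name build_relation_params and its standalone form are assumptions made so the logic can run without the HTTP client.

```python
from typing import Dict, Optional, Union

VALID_RELATIONS = {
    "treat", "cause", "cotreat", "convert", "compare", "interact",
    "associate", "positive_correlate", "negative_correlate", "prevent",
    "inhibit", "stimulate", "drug_interact",
}
VALID_TARGETS = {"gene", "disease", "chemical", "variant"}


def build_relation_params(
    entity_id: str,
    relation_type: Optional[str] = None,
    target_entity_type: Optional[str] = None,
    max_results: Optional[int] = None,
) -> Dict[str, Union[str, int]]:
    """Validate the arguments and return the query parameters (hypothetical helper)."""
    if not entity_id.startswith("@"):
        raise ValueError("Invalid entity ID format, should start with '@'")

    params: Dict[str, Union[str, int]] = {"e1": entity_id}

    if relation_type:
        # "ANY" is accepted in addition to the named relation types.
        if relation_type not in VALID_RELATIONS and relation_type != "ANY":
            raise ValueError("Invalid relation type")
        params["type"] = relation_type

    if target_entity_type:
        if target_entity_type not in VALID_TARGETS:
            raise ValueError("Invalid target entity type")
        params["e2"] = target_entity_type

    if max_results:
        params["limit"] = max_results

    return params
```

Keeping validation separate from the request makes the accepted values easy to test and lets invalid input fail before any network call is made.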
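search(self, query: str, page: int, max_pages: Optional[int] = None, batch_size: int) -> Generator[Dict, None, None]
  Enhanced search with automatic pagination and error retry.
  Parameters:
    query: the query (free text, an entity ID, or a relation query)
    page: starting page number
    max_pages: maximum number of pages to fetch (None means no limit)
    batch_size: number of PMIDs processed per batch
  Returns: a generator that yields search results page by page.
  Request: GET {BASE_URL}/search/ with the parameters text and page.
  Behavior: the generator stops when a page has no "results" entry. Request errors and JSON decoding errors are counted; the counter resets after a successful page, and the method sleeps for a capped backoff between attempts. When the limit is reached it raises Exception("Search terminated after N consecutive request failures") or Exception("Search terminated after N consecutive response parsing failures"). The defaults for page and batch_size are not readable in the dump.

search_relations(self, entity1: str, relation_type: str, entity2: Optional[str] = None, max_pages: Optional[int] = None) -> Generator[Dict, None, None]
  Dedicated relation query method.
  Parameters:
    entity1: ID of the first entity
    relation_type: relation type (ANY, treat, cause, and so on)
    entity2: ID or type of the second entity (for example "DISEASE")
    max_pages: maximum number of pages to fetch
  Returns: a generator that yields relation search results page by page.
  Behavior: builds a query string that starts with "relations:" and joins its parts with "|" (two parts when entity2 is None, three otherwise), then delegates to search(query, max_pages=max_pages).

extract_pmids_from_results(self, results: Dict) -> List[str]
  Extracts the list of PMIDs from one page of search results.
  Returns str(result["pmid"]) for each entry of results.get("results", []) that contains a "pmid" key.

batch_export_from_search(self, query: str, format: str = "biocjson", max_pages: Optional[int], full_text: bool = False, batch_size: int) -> Generator[Union[Dict, str], None, None]
  Searches and exports publications in batches, with batch processing and error retry.
  Parameters:
    query: the search query
    format: export format
    max_pages: maximum number of search pages
    full_text: whether to export the full text
    batch_size: number of PMIDs processed per batch
  Returns: a generator that yields the exported publication content.
  Behavior: collects PMIDs from each search page; whenever at least batch_size PMIDs are pending, it exports the first batch_size of them with export_publications(batch, "pmid", format, full_text) and yields the result. On a request error it prints "Batch export failed (batch size: N): <error>" and appears to reduce the batch size. After the last page it exports any remaining PMIDs; a failure there raises Exception("Failed to process remaining PMIDs (count: N): <error>"). Any other failure is re-raised as Exception("Error during batch export: <error>").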
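The batching step of batch_export_from_search (collect PMIDs page by page, emit a batch whenever enough are pending, then flush the remainder) can be sketched independently of the API. The function names extract_pmids and batched_pmids are hypothetical, and the pages are plain dictionaries shaped like the search results described for extract_pmids_from_results.

```python
from typing import Dict, Iterable, Iterator, List


def extract_pmids(page: Dict) -> List[str]:
    """Return the PMIDs of one result page as strings, skipping entries without one."""
    return [str(item["pmid"]) for item in page.get("results", []) if "pmid" in item]


def batched_pmids(pages: Iterable[Dict], batch_size: int) -> Iterator[List[str]]:
    """Yield lists of at most `batch_size` PMIDs gathered across pages."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")

    pending: List[str] = []
    for page in pages:
        pending.extend(extract_pmids(page))
        # Emit full batches as soon as enough PMIDs have accumulated.
        while len(pending) >= batch_size:
            yield pending[:batch_size]
            pending = pending[batch_size:]

    # Flush whatever is left after the last page.
    if pending:
        yield pending
```

Because the function is a generator, each batch can be passed to an export call as soon as it is ready, without holding every PMID in memory first.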
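Test program (runs when the module is executed as __main__)
  Additional imports: argparse, logging, and datetime from datetime.
  Logging: logging.basicConfig at level INFO with the format "%(asctime)s - %(levelname)s - %(message)s".
  Command line: argparse.ArgumentParser(description="PubTator3API Test Program") with one option, --test, whose choices are all, export, entity, relation, annotate, and search (default: all; help text: "Select test case to run").
  Client: a PubTator3API instance created with explicit max_retries and timeout values (not readable in the dump).

  Test functions:
    test_export_publications: exports PMIDs 25359968 and 25359969 in biocjson format and logs "Successfully exported N publications".
    test_find_entity_id: calls find_entity_id with the query "diabetes" and concept "disease", plus a limit, and logs "Successfully found entity: ...".
    test_find_related_entities: calls find_related_entities with "@DISEASE_Diabetes_Mellitus", relation_type "treat", and target_entity_type "chemical", and logs "Successfully found entities related to ...".
    test_search: iterates search("diabetes treatment") with a max_pages limit, collects the pages, and logs "Successfully searched: ...".

  Dispatch: the functions are registered in a dictionary under the keys export, entity, relation, and search; the annotate choice does not appear to have a matching entry. With --test all, every function runs in turn and logs "Executing test: <name>" followed by "Test successful: <name>" or "Test failed <name>: <error>". Otherwise only the selected function runs.

  Error handling: requests.exceptions.RequestException is logged as "Network request error: ...", ValueError as "Parameter error: ...", and any other exception as "Unknown error: ...". The session is closed in a finally block.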
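The --test option and the name-to-function dispatch of the test program can be sketched as follows. The argument parser settings are taken from the strings in the dump; the helper names parse_args and run_selected, the returned outcome dictionary, and the "no such test" message are assumptions for illustration (the original logs its results instead of returning them).

```python
import argparse
from typing import Callable, Dict, List, Optional


def parse_args(argv: Optional[List[str]] = None) -> argparse.Namespace:
    """Parse the --test option; `argv=None` reads from sys.argv."""
    parser = argparse.ArgumentParser(description="PubTator3API Test Program")
    parser.add_argument(
        "--test",
        choices=["all", "export", "entity", "relation", "annotate", "search"],
        default="all",
        help="Select test case to run",
    )
    return parser.parse_args(argv)


def run_selected(selected: str,
                 test_funcs: Dict[str, Callable[[], object]]) -> Dict[str, str]:
    """Run one named test, or every registered test when `selected` is "all"."""
    names = list(test_funcs) if selected == "all" else [selected]
    outcomes: Dict[str, str] = {}
    for name in names:
        func = test_funcs.get(name)
        if func is None:
            # A valid choice with no registered function (hypothetical handling).
            outcomes[name] = "no such test"
            continue
        try:
            func()
            outcomes[name] = "ok"
        except Exception as exc:
            # One failing test should not stop the remaining ones.
            outcomes[name] = f"failed: {exc}"
    return outcomes
```

A script would combine the two as run_selected(parse_args().test, test_funcs), where test_funcs maps each name to a zero-argument function.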

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/JackKuo666/PubTator-MCP-Server'
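The same GET request can be made from Python with only the standard library. This is a minimal sketch: the helper names server_url and fetch_server are hypothetical, and it assumes the endpoint returns a JSON object, which the page does not state explicitly.

```python
import json
import urllib.request
from urllib.parse import quote

API_ROOT = "https://glama.ai/api/mcp/v1/servers"


def server_url(owner: str, name: str) -> str:
    """Build the directory URL for one server, escaping each path segment."""
    return f"{API_ROOT}/{quote(owner, safe='')}/{quote(name, safe='')}"


def fetch_server(owner: str, name: str, timeout: float = 10.0) -> dict:
    """Fetch and decode the server record (assumes a JSON object response)."""
    request = urllib.request.Request(
        server_url(owner, name),
        headers={"Accept": "application/json"},
        method="GET",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Calling fetch_server("JackKuo666", "PubTator-MCP-Server") requests the same URL as the curl command.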

If you have feedback or need assistance with the MCP directory API, please join our Discord server.