We provide all of the metadata about MCP servers through our MCP API. For example:
curl -X GET 'https://glama.ai/api/mcp/v1/servers/rainyuniverse/mcp-webscout'
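The same endpoint can be addressed from Python. This is a sketch: `server_url` is a hypothetical helper (not part of any published client), and the shape of the JSON response is not documented here.

```python
BASE = "https://glama.ai/api/mcp/v1"

def server_url(namespace: str, name: str) -> str:
    """Hypothetical helper: builds the per-server endpoint used in the curl example."""
    return f"{BASE}/servers/{namespace}/{name}"

print(server_url("rainyuniverse", "mcp-webscout"))
# → https://glama.ai/api/mcp/v1/servers/rainyuniverse/mcp-webscout
```

The resulting URL can then be fetched with any HTTP client (e.g. `requests.get(url, timeout=10).json()`).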
If you have feedback or need assistance with the MCP directory API, please join our Discord server.
robots.cpython-310.pyc•1.48 KiB
# Source reconstructed (best effort) from the compiled bytecode above.
# Identifiers that do not survive in the bytecode's string table -- the
# function name, the PROXY_URL import path, the user-agent default, and
# the request timeout -- are assumptions and are marked below.
"""Robots.txt checking utility."""

import logging
from urllib.parse import urlparse
from urllib.robotparser import RobotFileParser

import requests

# Assumption: the original module imports PROXY_URL from a local config
# module whose name is not recoverable; defined here as a placeholder.
PROXY_URL = None


def check_robots(url: str, user_agent: str = "*", use_proxy: bool = True) -> bool:  # name and UA default assumed
    """Check if robots.txt allows access to the URL.

    Args:
        url: The URL to check.
        user_agent: User agent string.
        use_proxy: Whether to use proxy for the request.

    Returns:
        True if allowed, False if disallowed.
    """
    try:
        parsed = urlparse(url)
        robots_url = f"{parsed.scheme}://{parsed.netloc}/robots.txt"
        rp = RobotFileParser()
        rp.set_url(robots_url)

        proxies = {"http": PROXY_URL, "https": PROXY_URL} if use_proxy else None
        try:
            session = requests.Session()
            response = session.get(robots_url, timeout=10, proxies=proxies)  # timeout value assumed
            rp.parse(response.text.splitlines())
        except Exception as e:
            logging.warning(f"Cannot read robots.txt: {e}, assuming allowed")
            return True

        allowed = rp.can_fetch(user_agent, url)
        logging.info(f"Robots.txt check - URL: {url}, allowed: {allowed}")
        return allowed
    except Exception as e:
        logging.error(f"Robots.txt check failed: {e}")
        return False  # return value on hard failure assumed
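The file above implements a robots.txt checker built on the standard library's `RobotFileParser`. Its core allow/deny logic can be exercised offline; this is a minimal, self-contained sketch in which the sample rules and the `WebScout/1.0` user-agent string are illustrative, not taken from the file.

```python
from urllib.robotparser import RobotFileParser

# Illustrative rules; in the checker above they are fetched
# from <scheme>://<host>/robots.txt instead.
SAMPLE_RULES = """\
User-agent: *
Disallow: /private/
"""

rp = RobotFileParser()
rp.parse(SAMPLE_RULES.splitlines())

print(rp.can_fetch("WebScout/1.0", "https://example.com/index.html"))   # → True
print(rp.can_fetch("WebScout/1.0", "https://example.com/private/page")) # → False
```

`can_fetch(user_agent, url)` matches the URL's path against the most specific `User-agent` group that applies, which is exactly the call the checker uses to decide whether a fetch is permitted.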