jonathan-politzki

Official Substack MCP Server

medium.cpython-313.pyc (5.24 kB)

This file is compiled CPython 3.13 bytecode for scrapers/medium.py, so the viewer cannot render it as source. The strings that survive in the dump show the following.

Module docstring: "Medium scraper for Writer Context Protocol."

Imports: logging, re, datetime (from datetime), List (from typing), urlparse (from urllib.parse), feedparser, httpx, BeautifulSoup (from bs4), and BaseScraper and Post (from scrapers.base). A module-level logger is created with logging.getLogger(__name__).

Class MediumScraper(BaseScraper): "Scraper for Medium blogs."

Method scrape (async): "Scrape posts from a Medium blog. Returns: A list of Post objects with content and metadata". From the embedded names and string constants, it appears to work as follows:
1. Parse self.url with urlparse and split the path on "/".
2. If the host is medium.com, take the username from the first path segment; otherwise take it from the first label of the host name. A leading "@" is stripped.
3. Build the feed URL as https://medium.com/feed/@<username> and log "Fetching RSS feed from: ...".
4. Fetch the feed with httpx.AsyncClient, call raise_for_status(), log the status code, parse response.text with feedparser, and log the number of entries. On any exception, log "Error fetching RSS feed: ..." and return an empty list.
5. For each of the first self.max_posts entries, parse the entry's first content value with BeautifulSoup("html.parser"), extract the text with get_text(separator=" ", strip=True), and pass it through _clean_content.
6. If the entry has a published field, parse it with the format "%a, %d %b %Y %H:%M:%S %z", falling back to "%a, %d %b %Y %H:%M:%S"; if both fail, log "Could not parse date: ...".
7. Build a Post with the fields title, url, content, date, and subtitle, and append it to the result. Errors on a single entry are logged as "Error processing Medium post <link>: ..." without aborting the loop.
8. Log "Scraped <n> posts from Medium" and return the list.

Static method _clean_content: "Remove extra whitespace and normalize text." It replaces each run of whitespace (\s+) with a single space using re.sub, strips the result, and returns it.

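The pure-Python parts of the scraper (deriving the feed URL, normalizing whitespace, and parsing the publication date) can be illustrated without network access. The following is a minimal sketch reconstructed from the names and string constants visible in the compiled file, not the author's original source. The function names medium_feed_url, clean_content, and parse_pub_date are hypothetical, and the empty-string fallback for a missing username is an assumption.

```python
import re
from datetime import datetime
from typing import Optional
from urllib.parse import urlparse


def medium_feed_url(blog_url: str) -> str:
    """Derive the Medium RSS feed URL from a blog URL (hypothetical helper)."""
    parsed = urlparse(blog_url)
    path_parts = parsed.path.strip("/").split("/")
    if parsed.netloc == "medium.com":
        # Profile-style URL such as https://medium.com/@name
        # The "" fallback is an assumption; the bytecode does not show it clearly.
        username = path_parts[0] if path_parts else ""
    else:
        # Subdomain-style URL such as https://name.medium.com
        username = parsed.netloc.split(".")[0]
    if username.startswith("@"):
        username = username[1:]
    return f"https://medium.com/feed/@{username}"


def clean_content(text: str) -> str:
    """Collapse every run of whitespace into one space and trim the ends."""
    return re.sub(r"\s+", " ", text).strip()


def parse_pub_date(published: str) -> Optional[datetime]:
    """Try the two date formats found in the bytecode; return None if both fail."""
    for fmt in ("%a, %d %b %Y %H:%M:%S %z", "%a, %d %b %Y %H:%M:%S"):
        try:
            return datetime.strptime(published, fmt)
        except ValueError:
            continue
    return None
```

Keeping these steps as small standalone functions makes them easy to check in isolation; the network fetch with httpx and the feed parsing with feedparser would sit around them in the real scraper.
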
MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/jonathan-politzki/mcp-writer-substack'

If you have feedback or need assistance with the MCP directory API, please join our Discord server.