Instagram MCP Investigator

by inoue2002
# Instagram MCP Investigator

Node.js MCP server that automates a Chromium session with Playwright, scrapes a target Instagram profile (using a saved login state), and feeds the results to OpenAI for an annotated report.

## Prerequisites

- Node.js 20+
- Installed Playwright browsers (`npx playwright install chromium`)
- A valid Instagram session captured in `storageState.json`
- OpenAI API key (optional but recommended)

## Setup

1. Install dependencies:

   ```bash
   npm install
   ```

2. Install the Chromium runtime that Playwright drives:

   ```bash
   npx playwright install chromium
   ```

3. Capture a logged-in Instagram session (opens a non-headless browser):

   ```bash
   npm run login
   ```

   - Log in manually in the launched window.
   - Return to the terminal and press Enter to persist `storageState.json`.

4. Copy `.env.example` to `.env` and fill in the values you need. At minimum, set `OPENAI_API_KEY` to enable report generation.

## Running the MCP server

The server communicates over stdio, so it can be registered with any MCP-compatible client.

```bash
npm start
```

Environment variables influence scraping behaviour:

- `PLAYWRIGHT_STORAGE_STATE`: path to the saved session (defaults to `./storageState.json`).
- `PLAYWRIGHT_HEADLESS`: set to `false` to watch the browser.
- `PLAYWRIGHT_BROWSER_CHANNEL`: set to `chrome` to run the system Chrome instead of the bundled Chromium.
- `MAX_POSTS`: default number of recent posts to fetch.
- `OPENAI_MODEL`: overrides the model used for summarisation.

## MCP tool contract

The server exposes a single tool named `instagram_profile_report`.

Input schema:

```json
{
  "username": "tanaka_insta",
  "maxPosts": 12,
  "headless": false,
  "storageStatePath": "./storageState.json",
  "includeRaw": true,
  "model": "gpt-4o-mini"
}
```

- `username` (required): Instagram handle, with or without the leading `@`.
- `maxPosts` (optional): limits the number of recent posts inspected (max 50).
- `headless` (optional): overrides headless mode per invocation.
- `storageStatePath` (optional): uses an alternate saved login state.
- `includeRaw` (optional): appends the raw JSON payload to the textual response.
- `model` (optional): selects a different OpenAI chat model.

## Workflow overview

1. Tool invocation resolves the target profile URL and opens it via Playwright.
2. The script reuses `storageState.json` to stay logged in.
3. Recent post metadata (image URL, alt text, timestamp, caption preview) and profile stats are serialised into JSON.
4. The JSON is sent to OpenAI for a qualitative summary.
5. The MCP tool returns a textual report (and optional raw JSON) to the requesting client.

## Troubleshooting

- **Redirected to login**: refresh the saved session with `npm run login`.
- **Not enough post data**: verify the profile is public or that the logged-in account has access to it.
- **No summary returned**: confirm `OPENAI_API_KEY` is set and the specified model is available.

## Next steps

- Add richer scraping (e.g., fetch captions via GraphQL requests) if the Instagram layout changes.
- Extend the MCP tool to cache results or stream image thumbnails for downstream automation.

## Local debugging

To verify behaviour before connecting a client, call the debug script:

```bash
npm run debug -- tanaka_insta --max-posts 6 --include-raw
```

- Add `--no-headless` to watch the browser as it runs.
- Use `--storage-state` to point at a different session file.
- If you do not need the OpenAI summary, the script runs without `OPENAI_API_KEY` (a notice is shown in place of the summary).
- Progress logs are shown by default. Pass `--verbose` for more detail or `--quiet` to suppress them.
- If page loads are slow and time out, passing `--wait-until domcontentloaded` or `--wait-until load` makes runs more stable (also configurable through the `PLAYWRIGHT_WAIT_UNTIL` environment variable).
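
The input rules in the MCP tool contract (a `username` with or without the leading `@`, and `maxPosts` capped at 50) could be applied roughly as in the sketch below. This is not the project's actual code: `normalizeInput`, `ReportInput`, `NormalizedInput`, and the fallback default of 12 are hypothetical names and values chosen for illustration, and the way the profile URL is built is an assumption about how the server resolves it.

```typescript
// Hypothetical sketch of input normalisation for `instagram_profile_report`.
// Names and the fallback default are illustrative, not taken from the project.
interface ReportInput {
  username: string;
  maxPosts?: number;
}

interface NormalizedInput {
  username: string;
  maxPosts: number;
  profileUrl: string;
}

const MAX_POSTS_LIMIT = 50; // upper bound stated in the tool contract
const ASSUMED_DEFAULT_MAX_POSTS = 12; // assumption: the real default comes from MAX_POSTS

function normalizeInput(
  input: ReportInput,
  defaultMaxPosts: number = ASSUMED_DEFAULT_MAX_POSTS,
): NormalizedInput {
  // Accept the handle with or without a leading "@".
  const username = input.username.trim().replace(/^@/, "");
  if (username.length === 0) {
    throw new Error("username is required");
  }

  // Fall back to the default when maxPosts is missing or not a finite number,
  // then clamp the value to the range 1..50.
  const requested = input.maxPosts ?? defaultMaxPosts;
  const base = Number.isFinite(requested) ? Math.trunc(requested) : defaultMaxPosts;
  const maxPosts = Math.min(Math.max(base, 1), MAX_POSTS_LIMIT);

  return {
    username,
    maxPosts,
    // Assumed URL shape for a public profile page.
    profileUrl: `https://www.instagram.com/${encodeURIComponent(username)}/`,
  };
}
```

For example, `normalizeInput({ username: "@tanaka_insta", maxPosts: 80 })` would strip the `@` and clamp `maxPosts` to 50. Clamping rather than rejecting an oversized value is a design choice of this sketch; the real server may instead return a validation error.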
