Instagram MCP Investigator


# Instagram MCP Investigator

Node.js MCP server that automates a Chromium session with Playwright, scrapes a target Instagram profile (using a saved login state), and feeds the results to OpenAI for an annotated report.

## Prerequisites

- Node.js 20+
- Installed Playwright browsers (`npx playwright install chromium`)
- A valid Instagram session captured in `storageState.json`
- OpenAI API key (optional but recommended)

## Setup

1. Install dependencies:

   ```bash
   npm install
   ```

2. Install the Chromium runtime that Playwright drives:

   ```bash
   npx playwright install chromium
   ```

3. Capture a logged-in Instagram session (opens a non-headless browser):

   ```bash
   npm run login
   ```

   - Log in manually in the launched window.
   - Return to the terminal and press Enter to persist `storageState.json`.

4. Copy `.env.example` to `.env` and fill in the values you need. At minimum, set `OPENAI_API_KEY` to enable report generation.

## Running the MCP server

The server communicates over stdio, so it can be registered with any MCP-compatible client.

```bash
npm start
```

Environment variables influence scraping behaviour:

- `PLAYWRIGHT_STORAGE_STATE`: path to the saved session (defaults to `./storageState.json`).
- `PLAYWRIGHT_HEADLESS`: set to `false` to watch the browser.
- `PLAYWRIGHT_BROWSER_CHANNEL`: set to `chrome` to run the system Chrome instead of the bundled Chromium.
- `MAX_POSTS`: default number of recent posts to fetch.
- `OPENAI_MODEL`: overrides the model used for summarisation.

## MCP tool contract

The server exposes a single tool named `instagram_profile_report`.

Input schema:

```json
{
  "username": "tanaka_insta",
  "maxPosts": 12,
  "headless": false,
  "storageStatePath": "./storageState.json",
  "includeRaw": true,
  "model": "gpt-4o-mini"
}
```

- `username` (required): Instagram handle, with or without the leading `@`.
- `maxPosts` (optional): limits the number of recent posts inspected (max 50).
- `headless` (optional): overrides headless mode per invocation.
- `storageStatePath` (optional): uses an alternate saved login state.
- `includeRaw` (optional): appends the raw JSON payload to the textual response.
- `model` (optional): selects a different OpenAI chat model.

## Workflow overview

1. Tool invocation resolves the target profile URL and opens it via Playwright.
2. The script reuses `storageState.json` to stay logged in.
3. Recent post metadata (image URL, alt text, timestamp, caption preview) and profile stats are serialised into JSON.
4. The JSON is sent to OpenAI for a qualitative summary.
5. The MCP tool returns a textual report (and optional raw JSON) to the requesting client.

## Troubleshooting

- **Redirected to login**: refresh the saved session with `npm run login`.
- **Not enough post data**: verify the profile is public or that the logged-in account has access to it.
- **No summary returned**: confirm `OPENAI_API_KEY` is set and the specified model is available.

## Next steps

- Add richer scraping (e.g., fetch captions via GraphQL requests) if Instagram's layout changes.
- Extend the MCP tool to cache results or stream image thumbnails for downstream automation.

## Local debugging

To verify behaviour before connecting a client, you can invoke the debug script.

```bash
npm run debug -- tanaka_insta --max-posts 6 --include-raw
```

- Add `--no-headless` to watch what the browser is doing.
- Use `--storage-state` to point at a different session file.
- If you do not need the OpenAI summary, the script runs without `OPENAI_API_KEY` set (a notice is shown in place of the summary).
- Progress logs are shown by default. Pass `--verbose` for more detail or `--quiet` to suppress them.
- If slow page loads cause timeouts, passing `--wait-until domcontentloaded` or `--wait-until load` makes runs more stable (this can also be set through the `PLAYWRIGHT_WAIT_UNTIL` environment variable).
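
This README implies an order of precedence for options: per-invocation input first, then environment variables, then defaults. That order can be sketched roughly as below. This is a minimal illustration and is not taken from the project's source. The function name `resolveOptions`, the fallback of 12 posts, the lower bound of 1, and the headless-by-default behaviour are all assumptions made for the example. Only the `@` handling, the cap of 50, the environment variable names, and the `./storageState.json` default come from the text above.

```javascript
// Hypothetical helper illustrating option precedence; not the project's actual code.
const MAX_POSTS_CAP = 50; // README: maxPosts is limited to 50
const ASSUMED_DEFAULT_MAX_POSTS = 12; // assumption: the README does not state the default

function resolveOptions(input, env = process.env) {
  if (typeof input.username !== "string") {
    throw new Error("username is required");
  }
  // Accept the handle with or without a leading "@".
  const username = input.username.trim().replace(/^@/, "");
  if (username === "") {
    throw new Error("username is required");
  }

  // Per-invocation value wins, then MAX_POSTS, then the assumed default.
  const envMaxPosts = Number.parseInt(env.MAX_POSTS ?? "", 10);
  const fallback = Number.isNaN(envMaxPosts) ? ASSUMED_DEFAULT_MAX_POSTS : envMaxPosts;
  const requested = Number.isFinite(input.maxPosts) ? input.maxPosts : fallback;
  // Clamp to [1, 50]; the lower bound is an assumption.
  const maxPosts = Math.min(Math.max(1, Math.trunc(requested)), MAX_POSTS_CAP);

  // Assumed: headless unless the call or PLAYWRIGHT_HEADLESS=false says otherwise.
  const headless = input.headless ?? (env.PLAYWRIGHT_HEADLESS !== "false");

  const storageStatePath =
    input.storageStatePath ?? env.PLAYWRIGHT_STORAGE_STATE ?? "./storageState.json";

  return {
    username,
    profileUrl: `https://www.instagram.com/${encodeURIComponent(username)}/`,
    maxPosts,
    headless,
    storageStatePath,
  };
}

// Example call with an empty environment so the result does not depend on the shell.
console.log(resolveOptions({ username: "@tanaka_insta", maxPosts: 80 }, {}));
```

With an empty environment, the example call strips the `@`, clamps `maxPosts` from 80 to 50, and falls back to headless mode and `./storageState.json`.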
