# Private GitHub MCP Server — Architecture Guide
This document describes the architecture of our self-hosted GitHub MCP Server, deployed on AWS ECS behind an internal ALB with Okta OAuth authentication. It is intended for team members who need to understand how the system works, how to connect to it, and how to operate it.
## Overview
We run the official [GitHub MCP Server](https://github.com/github/github-mcp-server) as a private, shared service accessible only through our AWS Client VPN. The server exposes GitHub tools (repository management, code search, pull requests, issues, etc.) to any MCP-compatible IDE client (Kiro, VS Code, Cursor, etc.) over the Streamable HTTP transport, authenticated via Okta OIDC.
```
┌─────────────────────────────────────────────────────────────────────┐
│ Developer Workstation │
│ │
│ ┌──────────┐ stdio ┌────────────┐ HTTPS ┌───────────┐ │
│ │ IDE/Kiro │◄──────────►│ mcp-remote │──────────►│ AWS Client │ │
│ │ │ │ (local │ │ VPN │ │
│ └──────────┘ │ proxy) │ └─────┬──────┘ │
│ └──────┬──────┘ │ │
│ │ OAuth │ │
│ ┌──────▼──────┐ │ │
│ │ Browser │ │ │
│ │ (Okta login)│ │ │
│ └─────────────┘ │ │
└───────────────────────────────────────────────────────────┼─────────┘
│
VPN Tunnel
│
┌───────────────────────────────────────────────────────────┼─────────┐
│ AWS VPC (172.31.0.0/16) │ │
│ │ │
│ ┌────────────────────────────────────────────────────────▼──────┐ │
│ │ Internal ALB (HTTPS :443) │ │
│ │ - Self-signed cert (EasyRSA, imported to ACM) │ │
│ │ - Sticky sessions (lb_cookie) for stateful sessions │ │
│ │ - Only accessible from VPN CIDR + VPC CIDR │ │
│ └────────────────────────┬──────────────────────────────────────┘ │
│ │ │
│ ┌────────────────────────▼──────────────────────────────────────┐ │
│ │ ECS Fargate Task │ │
│ │ │ │
│ │ ┌─────────────────────────────────────────────────────────┐ │ │
│ │ │ JWT Proxy (Node.js, port 8000) │ │ │
│ │ │ - Serves OAuth metadata (/.well-known/*) │ │ │
│ │ │ - Proxies token exchange to Okta (strips RFC 8707 │ │ │
│ │ │ resource param) │ │ │
│ │ │ - Validates JWT (issuer, audience, JWKS) │ │ │
│ │ │ - Proxies authenticated requests to Supergateway │ │ │
│ │ └──────────────────────┬──────────────────────────────────┘ │ │
│ │ │ localhost:8001 │ │
│ │ ┌──────────────────────▼──────────────────────────────────┐ │ │
│ │ │ Supergateway (Streamable HTTP, port 8001) │ │ │
│ │ │ - --outputTransport streamableHttp --stateful │ │ │
│ │ │ - Spawns one github-mcp-server stdio process per │ │ │
│ │ │ session (Mcp-Session-Id) │ │ │
│ │ │ - Multiple concurrent sessions supported │ │ │
│ │ │ - Sessions timeout after 10 minutes of inactivity │ │ │
│ │ └──────────────────────┬──────────────────────────────────┘ │ │
│ │ │ stdio │ │
│ │ ┌──────────────────────▼──────────────────────────────────┐ │ │
│ │ │ github-mcp-server (one per session) │ │ │
│ │ │ - Official GitHub MCP Server binary │ │ │
│ │ │ - Authenticated via GITHUB_PERSONAL_ACCESS_TOKEN │ │ │
│ │ │ - Provides: repos, issues, PRs, code search, files... │ │ │
│ │ └─────────────────────────────────────────────────────────┘ │ │
│ └───────────────────────────────────────────────────────────────┘ │
│ │
│ ┌──────────────┐ ┌──────────────┐ ┌────────────────────┐ │
│ │ NAT Gateway │ │ ECR │ │ Secrets Manager │ │
│ │ (public │ │ (container │ │ (GitHub PAT) │ │
│ │ subnet) │ │ images) │ │ │ │
│ └──────────────┘ └──────────────┘ └────────────────────┘ │
└─────────────────────────────────────────────────────────────────────┘
```
## Components
### 1. mcp-remote (Client-Side Proxy)
[mcp-remote](https://github.com/geelen/mcp-remote) runs locally on each developer's machine. It bridges the gap between the IDE (which speaks stdio to MCP servers) and our remote server (which speaks Streamable HTTP over HTTPS).
Responsibilities:
- Translates stdio ↔ Streamable HTTP
- Handles the full OAuth 2.0 Authorization Code + PKCE flow
- Opens a browser for Okta login, receives the callback on `localhost:3334`
- Exchanges the authorization code for tokens
- Attaches the access token as a `Bearer` header on every request
- Auto-refreshes tokens when they expire
- Auto-negotiates transport (tries Streamable HTTP first, falls back to SSE)
No code changes are needed on the client side when the server transport changes — `mcp-remote` handles it transparently.
### 2. Internal Application Load Balancer
The ALB is internal (not internet-facing) and only reachable through the AWS Client VPN or from within the VPC.
- Listens on HTTPS :443 with a self-signed certificate (EasyRSA, imported to ACM)
- Sticky sessions enabled (`lb_cookie`, 24h duration) to pin a client's requests to the same ECS task for the duration of a Streamable HTTP session
- Health checks hit `/healthz` on port 8000 (bypasses auth)
- Security group allows inbound only from VPN CIDR (`10.100.0.0/16`) and VPC CIDR
### 3. JWT Proxy (jwt-proxy.mjs)
A lightweight Node.js HTTP server running on port 8000 inside the container. It sits in front of Supergateway and handles all OAuth concerns, so Supergateway itself never has to deal with authentication.
#### Request Flow
| Path | Method | Auth Required | Behavior |
|------|--------|---------------|----------|
| `/healthz` | GET | No | Returns `200 ok` (ALB health check) |
| `/.well-known/oauth-protected-resource` | GET | No | RFC 9728 Protected Resource Metadata — tells the client where to find the authorization server |
| `/.well-known/oauth-authorization-server` | GET | No | Returns Okta's OpenID Connect metadata with `token_endpoint` rewritten to proxy through us |
| `/.well-known/openid-configuration` | GET | No | Same as above (alias) |
| `/oauth/token` | POST | No | Proxies token exchange to Okta's real token endpoint, stripping the RFC 8707 `resource` parameter that Okta doesn't support |
| `/mcp` | POST/GET/DELETE | Yes (Bearer JWT) | Validates the JWT against Okta's JWKS, then proxies to Supergateway on localhost:8001 |
| Any other path | Any | Yes (Bearer JWT) | Same JWT validation + proxy |
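The Bearer check in the last two rows is built on the `jose` library that ships in the image. Here is a minimal sketch of that verification step, using illustrative env var names (`OKTA_ISSUER`, `OKTA_AUDIENCE`) rather than the exact ones in `jwt-proxy.mjs`:
```js
// Sketch of the proxy's JWT check; illustrative, not a copy of jwt-proxy.mjs.
import { createRemoteJWKSet, jwtVerify } from "jose";

const issuer = process.env.OKTA_ISSUER;     // e.g. https://<OKTA_DOMAIN>/oauth2/default
const audience = process.env.OKTA_AUDIENCE; // audience configured on the Okta auth server

// jose fetches and caches the key set behind this function,
// refreshing it when Okta rotates signing keys.
const jwks = createRemoteJWKSet(new URL(`${issuer}/v1/keys`));

export async function verifyBearer(req) {
  const header = req.headers.authorization ?? "";
  const token = header.startsWith("Bearer ") ? header.slice(7) : null;
  if (!token) return null; // caller responds 401

  try {
    // Signature via JWKS, plus issuer and audience claim checks.
    const { payload } = await jwtVerify(token, jwks, { issuer, audience });
    return payload; // payload.sub identifies the developer
  } catch {
    return null; // expired, wrong audience, bad signature, ...
  }
}
```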
#### Why We Proxy the Token Endpoint
`mcp-remote` follows the MCP OAuth spec, which includes RFC 8707 Resource Indicators, so it sends a `resource` parameter in the token exchange request. Okta's custom authorization server doesn't support RFC 8707 and rejects any request that carries it. Our proxy strips the parameter before forwarding to Okta.
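A sketch of that stripping step, assuming the standard form-encoded token request body (RFC 6749); the real `jwt-proxy.mjs` may structure this differently:
```js
// Sketch of /oauth/token handling. OKTA_ISSUER is an assumed env var name.
const OKTA_TOKEN_URL = `${process.env.OKTA_ISSUER}/v1/token`;

export async function proxyTokenExchange(rawBody) {
  // Token requests are application/x-www-form-urlencoded (RFC 6749 §4.1.3).
  const params = new URLSearchParams(rawBody);
  params.delete("resource"); // the RFC 8707 parameter Okta rejects

  const res = await fetch(OKTA_TOKEN_URL, {
    method: "POST",
    headers: { "content-type": "application/x-www-form-urlencoded" },
    body: params.toString(),
  });
  // Relay Okta's JSON response (tokens or error) unchanged to mcp-remote.
  return { status: res.status, body: await res.text() };
}
```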
#### Why We Don't Proxy the Authorization Endpoint
The authorization endpoint is where the browser navigates to for login. Since our ALB is internal and unreachable from the developer's browser, we keep the `authorization_endpoint` in the metadata pointing directly to Okta (`https://<OKTA_DOMAIN>/oauth2/default/v1/authorize`). Only the `token_endpoint` is rewritten to route through the proxy.
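A sketch of the resulting metadata rewrite, with `PUBLIC_BASE_URL` (the ALB URL) and `OKTA_ISSUER` as assumed names:
```js
// Sketch of the /.well-known/oauth-authorization-server response.
export async function authServerMetadata() {
  const okta = await fetch(
    `${process.env.OKTA_ISSUER}/.well-known/oauth-authorization-server`
  ).then((r) => r.json());

  return {
    ...okta,
    // authorization_endpoint stays pointed at Okta: the developer's browser
    // can reach Okta directly but cannot reach our internal ALB.
    token_endpoint: `${process.env.PUBLIC_BASE_URL}/oauth/token`,
  };
}
```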
### 4. Supergateway (Streamable HTTP Mode)
[Supergateway](https://github.com/supercorp-ai/supergateway) bridges stdio-based MCP servers to HTTP transports. We run it in Streamable HTTP stateful mode:
```bash
supergateway \
--stdio "github-mcp-server stdio" \
--outputTransport streamableHttp \
--stateful \
--sessionTimeout 600000 \
--port 8001 \
--healthEndpoint /healthz \
--cors
```
Key behaviors:
- Each new client connection gets a unique `Mcp-Session-Id`
- For each session, Supergateway spawns a dedicated `github-mcp-server` stdio process
- Multiple sessions can coexist on the same task (unlike SSE mode, which supported only one)
- Idle sessions are cleaned up after 10 minutes (`--sessionTimeout 600000` ms)
- The `/mcp` endpoint handles POST (send messages), GET (open an SSE stream for server-initiated messages), and DELETE (close the session); the sketch below exercises this lifecycle by hand
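To watch this lifecycle without an IDE in the loop, you can drive a session manually from a VPN-connected machine. The following is a hypothetical smoke test, not part of the deployment; `ALB_DNS` and `TOKEN` are placeholders you supply (take the token from an existing `mcp-remote` login):
```js
// session-smoke-test.mjs: run with NODE_TLS_REJECT_UNAUTHORIZED=0 because
// the ALB presents a self-signed certificate.
const BASE = `https://${process.env.ALB_DNS}/mcp`;
const auth = { authorization: `Bearer ${process.env.TOKEN}` };

// 1. initialize: the response carries the session id in Mcp-Session-Id.
const init = await fetch(BASE, {
  method: "POST",
  headers: {
    ...auth,
    "content-type": "application/json",
    accept: "application/json, text/event-stream",
  },
  body: JSON.stringify({
    jsonrpc: "2.0",
    id: 1,
    method: "initialize",
    params: {
      protocolVersion: "2025-03-26",
      capabilities: {},
      clientInfo: { name: "smoke-test", version: "0.0.1" },
    },
  }),
});
const session = init.headers.get("mcp-session-id");
console.log("status:", init.status, "session:", session);

// 2. Subsequent POSTs would echo the Mcp-Session-Id header back.
// 3. DELETE closes the session, so the stdio child exits immediately
//    instead of waiting for the 10-minute idle timeout.
await fetch(BASE, {
  method: "DELETE",
  headers: { ...auth, "mcp-session-id": session },
});
```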
### 5. GitHub MCP Server
The official [github-mcp-server](https://github.com/github/github-mcp-server) binary, copied from the `ghcr.io/github/github-mcp-server:latest` Docker image. It runs in stdio mode, one instance per client session.
Authenticated via the `GITHUB_PERSONAL_ACCESS_TOKEN` environment variable, pulled from AWS Secrets Manager at task startup.
Provides tools for: repository management, file operations, issue and PR management, code search, branch operations, and more.
### 6. Network Architecture
The ECS tasks run in private subnets with no public IP. Outbound internet access (needed for GitHub API and Okta JWKS) is provided via a NAT Gateway:
```
Private Subnets (ECS tasks)
│
▼ 0.0.0.0/0
NAT Gateway (in public subnet)
│
▼
Internet Gateway
│
▼
Internet (api.github.com, <OKTA_DOMAIN>)
```
Security groups:
- ALB SG: inbound 443/80 from VPN CIDR + VPC CIDR, all outbound
- ECS SG: inbound 8000 from ALB SG only, all outbound
## Authentication Flow (Detailed)
```
Developer Browser mcp-remote ALB/Proxy Okta
│ │ │ │ │
│ Start MCP server │ │ │ │
│──────────────────────►│ │ │ │
│ │ │ │ │
│ │ GET /.well-known/oauth-protected-resource │
│ │ │───────────────────►│ │
│ │ │◄───────────────────│ │
│ │ │ {authorization_servers: [ALB_URL]} │
│ │ │ │ │
│ │ GET /.well-known/oauth-authorization-server │
│ │ │───────────────────►│ │
│ │ │◄───────────────────│ │
│ │ │ {authorization_endpoint: Okta, │
│ │ │ token_endpoint: ALB/oauth/token} │
│ │ │ │ │
│ │ Open browser to Okta authorize URL │ │
│ │◄────────────────────│ │ │
│ │ │ │ │
│ │ GET /oauth2/default/v1/authorize │ │
│ │─────────────────────────────────────────────────────────────►
│ │ │ │ │
│ Login + consent │ │ │ │
│──────────────────────►│ │ │ │
│ │ │ │ │
│ │ 302 → localhost:3334/oauth/callback?code=... │
│ │◄────────────────────────────────────────────────────────────│
│ │ │ │ │
│ │ Callback received │ │ │
│ │────────────────────►│ │ │
│ │ │ │ │
│ │ │ POST /oauth/token │ │
│ │ │ (code + PKCE) │ │
│ │ │───────────────────►│ │
│ │ │ │ │
│ │ │ │ POST /oauth2/default/v1/token
│ │ │ │ (resource param stripped)
│ │ │ │─────────────────►│
│ │ │ │◄─────────────────│
│ │ │ │ {access_token} │
│ │ │◄───────────────────│ │
│ │ │ {access_token} │ │
│ │ │ │ │
│ │ │ POST /mcp │ │
│ │ │ Authorization: Bearer <token> │
│ │ │───────────────────►│ │
│ │ │ │ JWT verified ✓ │
│ │ │ │ → Supergateway │
│ │ │ │ → github-mcp │
│ │ │◄───────────────────│ │
│ Tools available │ │ │ │
│◄──────────────────────────────────────────── │ │ │
```
## Streamable HTTP vs SSE
We previously used the SSE (Server-Sent Events) transport, which had a critical limitation: Supergateway supported only a single SSE connection per process, so a second developer connecting would crash the server. Streamable HTTP removes that limitation:
| Aspect | SSE (old) | Streamable HTTP (current) |
|--------|-----------|---------------------------|
| Connections per task | 1 | Many (one session per developer) |
| Transport | `GET /sse` (persistent) + `POST /message` | `POST /mcp` (request/response) |
| Session affinity | Required (SSE stream is stateful) | Required in stateful mode (session bound to task) |
| Reconnect behavior | Must re-establish SSE stream | Just retry the POST |
| Multi-developer | Not supported | Fully supported |
| ECS scaling | Must be 1 task | Can scale to N tasks |
## Infrastructure (Terraform)
All infrastructure is managed by Terraform in the `terraform/` directory.
### Module Structure
```
terraform/
├── main.tf # Root module — wires infrastructure + application
├── variables.tf # Input variables with defaults
├── terraform.tfvars # Environment-specific values
├── secrets.auto.tfvars # GitHub PAT (gitignored)
├── outputs.tf # Exported values (ALB URL, ECR URL, etc.)
├── modules/
│ ├── infrastructure/ # Network + ALB
│ │ ├── main.tf # IGW, public subnet, NAT Gateway, routes
│ │ ├── alb.tf # Internal ALB, target group, listeners, SGs
│ │ ├── variables.tf
│ │ └── outputs.tf
│ └── application/ # ECR + ECS + Secrets
│ ├── ecr.tf # ECR repository
│ ├── ecs.tf # Cluster, task definition, service
│ ├── secrets.tf # Secrets Manager (GitHub PAT)
│ ├── main.tf # IAM roles (task execution + task)
│ ├── variables.tf
│ └── outputs.tf
```
### Key Resources
| Resource | Purpose |
|----------|---------|
| `aws_nat_gateway.main` | Outbound internet for ECS tasks (GitHub API, Okta JWKS) |
| `aws_lb.main` | Internal ALB, HTTPS termination |
| `aws_lb_target_group.main` | Routes to ECS tasks, sticky sessions enabled |
| `aws_ecs_cluster.main` | Fargate cluster with Container Insights |
| `aws_ecs_service.main` | 1 task, rolling deployment |
| `aws_ecr_repository.main` | Container image registry |
| `aws_secretsmanager_secret.github_pat` | GitHub PAT, injected as env var at runtime |
## Docker Image
The container image is a multi-stage build:
1. Stage 1: Copy `github-mcp-server` binary from the official GitHub image
2. Stage 2: a Node.js 20 Alpine base with Supergateway (installed globally via npm), `jose` (the JWT library), and our `jwt-proxy.mjs` and `start.sh`
The entrypoint (`start.sh`) runs both Supergateway and the JWT proxy as background processes. If either process exits, the script kills the other and exits with the failure code, causing ECS to restart the task.
## Operations
### Build and Deploy
```bash
# Authenticate
aws sso login --profile <YOUR_SSO_PROFILE>
# Build and push (also triggers ECS force-new-deployment)
./scripts/build-and-push.sh latest
# Apply infrastructure changes
cd terraform && terraform apply
```
### View Logs
```bash
# Recent logs
aws logs tail /ecs/supergateway-prod --since 30m --follow
# Or use CloudWatch Logs Insights
# Log group: /ecs/supergateway-prod
```
### Health Check
```bash
# From VPN — bypasses auth
curl -sk https://<ALB_DNS>/healthz
# Expected: ok
```
### Force Restart
```bash
aws ecs update-service \
--cluster supergateway-prod \
--service supergateway-prod \
--force-new-deployment \
--region us-east-1
```
## Client Configuration
See [mcp-client-config.md](mcp-client-config.md) for IDE-specific setup instructions.
The short version: add this to your MCP config and connect to the VPN; `mcp-remote` handles the rest.
```json
{
"mcpServers": {
"github": {
"command": "npx",
"args": [
"mcp-remote",
"https://<ALB_DNS>",
"--static-oauth-client-info",
"{\"client_id\":\"<OKTA_CLIENT_ID>\"}"
],
"env": {
"NODE_TLS_REJECT_UNAUTHORIZED": "0"
}
}
}
}
```
`NODE_TLS_REJECT_UNAUTHORIZED=0` is needed because the ALB uses a self-signed certificate.
## Security Considerations
- The ALB is internal-only — not reachable from the public internet
- All traffic is encrypted (HTTPS with TLS 1.3)
- Authentication is enforced on every MCP request via Okta JWT validation
- The GitHub PAT is stored in AWS Secrets Manager, never in code or environment files
- The Okta app is configured as a Native/public client (no client secret needed)
- PKCE is required for all authorization code flows
- JWTs are validated against Okta's JWKS endpoint with issuer and audience checks