Question 1

What can you do with this server?

Accepted Answer

The HumaneProxy server is AI safety middleware that intercepts user messages and screens them for self-harm or criminal intent before they reach an LLM. It exposes three core MCP tools:

* check_message_safety: Classify a message through a configurable 3-stage pipeline (heuristics → semantic embeddings → reasoning LLM). Returns safety status, risk score, category, triggered patterns, pipeline stage reached, and whether escalation is warranted. Accepts an optional session ID for trajectory-aware scoring.
* get_session_risk: Retrieve a session's full risk trajectory, including spike detection, trend direction, rolling score window, category counts (self_harm vs. criminal_intent), and total message count.
* list_recent_escalations: Query the audit log for recent flagged events, filterable by category and result limit, for operator review.
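As a rough illustration of how an agent might invoke one of these tools, the sketch below builds an MCP `tools/call` request as a JSON-RPC 2.0 message. The tool name comes from the list above; the argument names `message` and `session_id` are assumptions for illustration and may not match the server's actual input schema.

```python
import json


def build_tool_call(request_id: int, tool: str, arguments: dict) -> str:
    """Serialize an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    request = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }
    return json.dumps(request)


# "message" and "session_id" are hypothetical argument names, not the
# documented schema; check the server's tool listing for the real ones.
payload = build_tool_call(
    1,
    "check_message_safety",
    {"message": "example user message", "session_id": "demo-session"},
)
print(payload)
```

In practice an MCP client library would send this over the server's transport; the snippet only shows the shape of the request.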

Additional capabilities:

* Reverse proxy mode: Sits between users and an upstream LLM, forwarding safe messages and intercepting unsafe ones with empathetic responses or alerts.
* Admin API: Health checks, configuration viewing, paginated escalation lists, session history, aggregate statistics, and session record deletion.
* Alerting: Webhook notifications (Slack, Discord, PagerDuty, Teams) or email when harmful content is detected.
* Privacy-focused: Default SHA-256 message hashing with optional raw text storage for human review.
* Multiple integration options: Python library, MCP server, REST API, or reverse proxy.
* Compliance-ready: Designed with HIPAA, GDPR, and SOC 2 considerations.
* Multilingual: Supports multiple languages via multilingual embedding models.
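The privacy default mentioned above (SHA-256 hashing with optional raw text) can be sketched in a few lines. This is a minimal illustration using Python's standard `hashlib`, not HumaneProxy's actual storage code; the function names and the `store_raw` flag are hypothetical.

```python
import hashlib


def hash_message(text: str) -> str:
    """Return the hex SHA-256 digest of the message's UTF-8 bytes."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def audit_record(text: str, store_raw: bool = False) -> dict:
    """Build an audit entry that keeps only the hash unless raw storage is enabled.

    `store_raw` is a hypothetical flag standing in for the optional
    raw-text setting described above.
    """
    record = {"message_hash": hash_message(text)}
    if store_raw:
        record["raw_text"] = text
    return record


print(audit_record("example user message"))
```

Storing only the digest lets operators correlate repeated messages without retaining their content; enabling raw text trades that privacy for easier human review.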

Question 2

Which integrations are available for this server?

Accepted Answer

* Discord: Sends safety alerts and escalation notifications to Discord channels via webhooks when self-harm or criminal intent is detected.
* Email (SMTP, Gmail): Sends safety escalation alerts by email over SMTP, with specific support for Gmail configurations.
* LangChain: Provides dedicated tools and configuration for LangChain agents to check message safety and assess session risk trajectories.
* LangGraph: Lets LangGraph workflows incorporate safety tools for monitoring user messages and managing high-risk escalations.
* OpenAI: Uses OpenAI's Moderation API and chat models as part of the multi-stage classification pipeline to identify unsafe content.
* PagerDuty: Routes high-risk safety incidents to on-call teams using PagerDuty routing keys for urgent human intervention.
* Slack: Posts safety alerts and operator notifications to Slack channels using incoming webhooks for real-time monitoring of flagged content.
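To make the webhook-based alerts more concrete, here is a hedged sketch of what a Slack notification could look like. Slack incoming webhooks accept a JSON body with a `text` field; everything else here (the function name, the message wording, the field values, and the placeholder URL) is an assumption for illustration, not HumaneProxy's actual alert format.

```python
import json
import urllib.request


def build_slack_alert(
    webhook_url: str, category: str, risk_score: float, session_id: str
) -> urllib.request.Request:
    """Prepare (but do not send) a POST request for a Slack incoming webhook."""
    text = (
        f"Safety escalation: category={category}, "
        f"risk_score={risk_score:.2f}, session={session_id}"
    )
    body = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        webhook_url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Placeholder URL; a real webhook URL is issued by Slack per channel.
request = build_slack_alert(
    "https://hooks.slack.com/services/T000/B000/XXXX",
    "self_harm",
    0.91,
    "demo-session",
)
# urllib.request.urlopen(request) would deliver it; omitted so the sketch
# has no side effects.
print(request.data.decode("utf-8"))
```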

How it works

| Stage | Method | Latency | Requires |
| --- | --- | --- | --- |
| 1. Heuristics | Keywords + intent patterns with span-aware false-positive reducers | < 1 ms | Nothing (always on) |
| 2. Semantic embeddings | Cosine similarity vs. curated anchor sentences, ambiguity dampening | ~5-100 ms | `[onnx]` or `[ml]` extra |
| 3. Reasoning LLM | OpenAI Moderation / LlamaGuard / any chat model | ~1-3 s | An API key |
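The cascade idea can be sketched as a loop that tries cheap stages first and only escalates when a stage is not decisive. The thresholds, stage stubs, and early-exit rule below are assumptions made for illustration; the real pipeline's scoring and escalation logic are not described here and may differ.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass
class Verdict:
    safe: bool
    score: float
    stage: int  # 1-based index of the stage that produced the verdict


# Each stage maps a message to a risk score in [0, 1].
Stage = Callable[[str], float]


def heuristic_stage(message: str) -> float:
    """Stub for stage 1: a tiny keyword check (illustrative only)."""
    keywords = ("hurt myself",)
    return 0.95 if any(k in message.lower() for k in keywords) else 0.3


def embedding_stage(message: str) -> float:
    """Stub for stage 2: stands in for cosine similarity against anchors."""
    return 0.5


def llm_stage(message: str) -> float:
    """Stub for stage 3: stands in for a reasoning-LLM classification."""
    return 0.1


def run_cascade(
    message: str,
    stages: Sequence[Stage],
    flag_at: float = 0.8,   # hypothetical threshold: flag and stop
    clear_at: float = 0.2,  # hypothetical threshold: clear and stop
) -> Verdict:
    score = 0.0
    for number, stage in enumerate(stages, start=1):
        score = stage(message)
        if score >= flag_at:
            return Verdict(False, score, number)
        if score <= clear_at:
            return Verdict(True, score, number)
    # No stage was decisive: fall back to the last score.
    return Verdict(score < flag_at, score, len(stages))


stages = [heuristic_stage, embedding_stage, llm_stage]
print(run_cascade("I want to hurt myself", stages))
print(run_cascade("hello there", stages))
```

Ordering stages from cheapest to most expensive means most traffic never pays the slowest stage's latency, which is the trade-off the latency column suggests.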

Benchmarks

| Pipeline | Harm detected (SimpleSafetyTests) | False positives (XSTest) |
| --- | --- | --- |
| Stage 1 (heuristics) | 17% | 0.4% |
| Stage 1 + 2 (+ embeddings) | 21% | 1.2% |
| Stage 1 + 2 + 3 (full cascade) | 92% | 1.2% |

Available On

| Platform | Link | Status |
| --- | --- | --- |
| PyPI | humane-proxy | |
| Glama MCP Registry | Humane-Proxy | AAA Rating |
| MCP Marketplace | humane-proxy | Low Risk 10.0 |

Installation Extras

| Extra | What it adds |
| --- | --- |
| (none) | Stage 1 heuristics + SQLite storage; zero dependencies beyond FastAPI |
| `onnx` | Stage 2 embeddings via ONNX Runtime (no PyTorch, ~2 GB lighter) |
| `ml` | Stage 2 embeddings via sentence-transformers (PyTorch) |
| `mcp` | MCP server for AI agents |
| `redis` / `postgres` | Alternative storage backends |
| `llamaindex` / `crewai` / `autogen` / `langchain` | Native agent-framework tools |
| `telemetry` | OpenTelemetry distributed tracing |
| `perf` | orjson fast-path JSON serialization |
| `all` | Everything above (may cause conflicting dependencies) |

Documentation

| Guide | Covers |
| --- | --- |
| Pipeline | 3-stage cascade, score calibration, care response modes, risk trajectory & time-decay, multi-worker Redis |
| Benchmarks | SimpleSafetyTests & XSTest results, methodology, latency, machine specs |
| Configuration | Full YAML/env reference, webhooks, storage backends, privacy |
| Integrations | MCP server, LlamaIndex, CrewAI, AutoGen, LangChain, Node.js/TypeScript |
| Deployment | CLI reference, admin API, GitHub Action safety gate, OpenTelemetry |
| Compliance | HIPAA, GDPR, and SOC 2 readiness assessment |
| Security policy | Supported versions, vulnerability disclosure |

