
detect_bias

Analyze AI-generated text to identify demographic bias, stereotyping, and unfair language patterns for compliance auditing.

Instructions

Analyze text for demographic bias patterns, stereotyping, and unfair language.

Args:

  • model_output: The AI-generated text to analyze for bias.
  • protected_attributes: Comma-separated list of attributes to check (e.g. "race,gender,age"). Leave empty for auto-detection.
  • api_key: Optional MEOK API key for pro tier.

Behavior: This tool generates structured output without modifying external systems. Output is deterministic for identical inputs. No side effects. Free tier: 10/day rate limit. Pro tier: unlimited. No authentication required for basic usage.

When to use: Use this tool when you need to assess, audit, or verify compliance requirements. Ideal for gap analysis, readiness checks, and generating compliance documentation.

When NOT to use: Do not use as a substitute for qualified legal counsel. This tool provides technical compliance guidance, not legal advice.
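
For illustration, an MCP client invokes detect_bias through a standard JSON-RPC tools/call request. A minimal sketch follows; the sample text is a placeholder, and omitting api_key falls back to the free tier:

    # Hypothetical tools/call payload; argument names match the input schema below.
    request = {
        "jsonrpc": "2.0",
        "id": 1,
        "method": "tools/call",
        "params": {
            "name": "detect_bias",
            "arguments": {
                "model_output": "Older workers struggle to keep up with new technology.",
                "protected_attributes": "age",  # or "" to auto-detect
                # "api_key" omitted: free tier, limited to 10 calls/day
            },
        },
    }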

Input Schema

Name                  Required  Description                                                      Default
model_output          Yes       The AI-generated text to analyze for bias.                       (none)
protected_attributes  No        Comma-separated attributes to check; empty for auto-detection.   ""
api_key               No        Optional MEOK API key for pro tier.                              ""
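
The table implies roughly the following JSON Schema. This is a reconstruction inferred from the table and the handler signature below, not the server's published schema; the "" defaults come from the Python keyword defaults:

    # Reconstructed input schema (illustrative only).
    DETECT_BIAS_INPUT_SCHEMA = {
        "type": "object",
        "properties": {
            "model_output": {"type": "string"},
            "protected_attributes": {"type": "string", "default": ""},
            "api_key": {"type": "string", "default": ""},
        },
        "required": ["model_output"],
    }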

Implementation Reference

  • The main tool handler for detect_bias. Accepts model_output, protected_attributes, and api_key. Scores the text for bias patterns, detects protected attributes, identifies bias types, performs sentence-level analysis, and returns a structured result with overall risk level, flagged sentences, and recommendations.
    @mcp.tool()
    def detect_bias(
        model_output: str,
        protected_attributes: str = "",
        api_key: str = "",
    ) -> dict:
        """Analyze text for demographic bias patterns, stereotyping, and unfair language.
    
        Args:
            model_output: The AI-generated text to analyze for bias.
            protected_attributes: Comma-separated list of attributes to check (e.g. "race,gender,age"). Leave empty for auto-detection.
            api_key: Optional MEOK API key for pro tier.
    
        Behavior:
            This tool generates structured output without modifying external systems.
            Output is deterministic for identical inputs. No side effects.
            Free tier: 10/day rate limit. Pro tier: unlimited.
            No authentication required for basic usage.
    
        When to use:
            Use this tool when you need to assess, audit, or verify compliance
            requirements. Ideal for gap analysis, readiness checks, and generating
            compliance documentation.
    
        When NOT to use:
            Do not use as a substitute for qualified legal counsel. This tool
            provides technical compliance guidance, not legal advice.
        """
        allowed, msg, tier = check_access(api_key)
        if not allowed:
            return {"error": msg, "upgrade_url": "https://meok.ai/pricing"}
        limit_err = _check_rate_limit("detect_bias", tier)
        if limit_err:
            return {"error": "rate_limited", "message": limit_err}
    
        # Score for bias patterns
        bias_score, pattern_matches = _score_bias_risk(model_output)
    
        # Detect protected attributes mentioned
        auto_detected = _detect_protected_attributes(model_output)
    
        # If the user specified attributes, restrict detection to those attributes
        requested_attrs = []  # type: List[str]
        if protected_attributes:
            requested_attrs = [a.strip().lower() for a in protected_attributes.split(",")]
            auto_detected = [d for d in auto_detected if d["attribute"] in requested_attrs]
    
        # Identify specific bias types present
        detected_bias_types = []  # type: List[Dict[str, str]]
        for btype, binfo in BIAS_TYPES.items():
            matched = _match_keywords(model_output, binfo["indicators"])
            if matched:
                detected_bias_types.append({
                    "type": binfo["name"],
                    "severity": binfo["severity"],
                    "matched_indicators": matched,
                    "eu_article": binfo["eu_article"],
                })
    
        # Classify overall risk
        if bias_score >= 0.7:
            overall_risk = "high"
            recommendation = (
                "CRITICAL: High bias detected. This output should not be used for decisions "
                "affecting individuals without significant human review and debiasing."
            )
        elif bias_score >= 0.4:
            overall_risk = "moderate"
            recommendation = (
                "WARNING: Moderate bias patterns detected. Review flagged patterns and consider "
                "rephrasing or adding qualifying context before deployment."
            )
        elif bias_score >= 0.15:
            overall_risk = "low"
            recommendation = (
                "Minor bias indicators detected. Generally acceptable but review flagged "
                "patterns for context appropriateness."
            )
        else:
            overall_risk = "minimal"
            recommendation = (
                "No significant bias patterns detected in this text. Continue monitoring "
                "outputs for emergent patterns."
            )
    
        # Sentence-level analysis
        sentences = [s.strip() for s in re.split(r'[.!?]+', model_output) if s.strip()]
        flagged_sentences = []  # type: List[Dict[str, object]]
        for sentence in sentences:
            s_score, s_matches = _score_bias_risk(sentence)
            if s_score > 0.15:
                flagged_sentences.append({
                    "sentence": sentence,
                    "bias_score": round(s_score, 2),
                    "patterns": [m["category"] for m in s_matches],
                })
    
        return {
            "overall_bias_risk": overall_risk,
            "bias_score": round(bias_score, 2),
            "pattern_matches": pattern_matches,
            "protected_attributes_mentioned": auto_detected,
            "bias_types_detected": detected_bias_types,
            "flagged_sentences": flagged_sentences[:10],
            "recommendation": recommendation,
            "total_sentences_analyzed": len(sentences),
            "sentences_flagged": len(flagged_sentences),
            "next_step": "Use mitigation_recommendations for remediation or fairness_metrics for quantitative assessment",
            "meok_labs": "https://meok.ai",
        }
  • server.py:435-435 (registration)
    Registration of detect_bias as an MCP tool via the @mcp.tool() decorator on the FastMCP instance 'mcp'.
    @mcp.tool()
  • Helper function that scores text for bias patterns using MANIPULATION_PATTERNS regex patterns and protected attribute keyword detection. Returns a score normalized to 0-1 and the list of matched patterns. (A worked example with hypothetical pattern tables follows this list.)
    def _score_bias_risk(text):
        # type: (str) -> Tuple[float, List[Dict[str, str]]]
        """Score text for bias patterns. Returns (score 0-1, matched_patterns)."""
        text_lower = text.lower()
        total_weight = 0.0
        matches = []  # type: List[Dict[str, str]]
        seen_categories = set()  # type: set
    
        for pat in MANIPULATION_PATTERNS:
            if re.search(pat["pattern"], text_lower):
                total_weight += pat["weight"]
                if pat["category"] not in seen_categories:
                    seen_categories.add(pat["category"])
                    matches.append({
                        "pattern": pat["pattern"],
                        "category": pat["category"],
                        "weight": str(pat["weight"]),
                    })
    
        # Check for protected attribute mentions without fairness context
        fairness_terms = ["fair", "equit", "bias", "discriminat", "parity", "equal"]
        has_fairness_context = any(ft in text_lower for ft in fairness_terms)
    
        for attr, info in PROTECTED_ATTRIBUTES_DB.items():
            if _match_keywords(text, info["keywords"]) and not has_fairness_context:
                total_weight += 0.15
    
        # Normalise to 0-1
        score = min(1.0, total_weight / 2.5)
        return score, matches
  • Helper function that detects protected attributes (race, gender, age, disability, religion, etc.) mentioned in text by matching keywords from PROTECTED_ATTRIBUTES_DB.
    def _detect_protected_attributes(text):
        # type: (str) -> List[Dict[str, object]]
        """Detect mentions of protected attributes in text."""
        found = []  # type: List[Dict[str, object]]
        for attr, info in PROTECTED_ATTRIBUTES_DB.items():
            matched = _match_keywords(text, info["keywords"])
            if matched:
                found.append({
                    "attribute": attr,
                    "matched_terms": matched,
                    "eu_reference": info["eu_ref"],
                })
        return found
  • Helper that performs case-insensitive keyword matching used by both _score_bias_risk and _detect_protected_attributes.
    def _match_keywords(text, keywords):
        # type: (str, List[str]) -> List[str]
        """Return matched keywords found in text (case-insensitive)."""
        text_lower = text.lower()
        return [kw for kw in keywords if kw.lower() in text_lower]
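
The helpers above read module-level tables (MANIPULATION_PATTERNS, PROTECTED_ATTRIBUTES_DB, BIAS_TYPES) that are not excerpted on this page. The sketch below invents two-entry stand-ins for those tables purely for illustration and traces one call through _score_bias_risk; it runs when pasted into the same module as the helper definitions above:

    import re

    # Hypothetical stand-ins for the real (unexcerpted) pattern tables.
    MANIPULATION_PATTERNS = [
        {"pattern": r"\ball (women|men)\b", "category": "overgeneralization", "weight": 0.8},
        {"pattern": r"\bnaturally (bad|good) at\b", "category": "essentialism", "weight": 0.6},
    ]
    PROTECTED_ATTRIBUTES_DB = {
        "gender": {"keywords": ["women", "men"], "eu_ref": "Charter of Fundamental Rights, Art. 21"},
    }

    score, matches = _score_bias_risk("All women are naturally bad at math.")
    # total_weight = 0.8 (overgeneralization) + 0.6 (essentialism)
    #              + 0.15 (protected attribute mentioned, no fairness context)
    # score = min(1.0, 1.55 / 2.5) = 0.62 -> "moderate" per the thresholds in detect_bias
    print(round(score, 2))                   # 0.62
    print([m["category"] for m in matches])  # ['overgeneralization', 'essentialism']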
Behavior: 5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description fully covers behavioral traits: it states the tool generates structured output without side effects, is deterministic, has rate limits (free: 10/day, pro: unlimited), and requires no authentication for basic usage. No contradictions.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is well-structured with clear sections (Args, Behavior, When to use, When NOT to use) and each sentence adds value. It is appropriately sized and front-loaded with the main purpose.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

While the description covers many aspects, it lacks details on the output format (e.g., what 'structured output' contains) and error handling. Given no output schema, the agent may need more context about return values.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
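
To fill that gap for readers of this page: the handler's return statement (excerpted above) fixes the result keys, so a successful call returns a dict of the following shape. The values here are invented for illustration:

    # Illustrative detect_bias result; keys match the handler's return dict, values are made up.
    example_result = {
        "overall_bias_risk": "moderate",
        "bias_score": 0.62,
        "pattern_matches": [{"pattern": "...", "category": "overgeneralization", "weight": "0.8"}],
        "protected_attributes_mentioned": [{"attribute": "gender", "matched_terms": ["women"], "eu_reference": "..."}],
        "bias_types_detected": [],
        "flagged_sentences": [{"sentence": "...", "bias_score": 0.62, "patterns": ["overgeneralization"]}],
        "recommendation": "WARNING: Moderate bias patterns detected. ...",
        "total_sentences_analyzed": 1,
        "sentences_flagged": 1,
        "next_step": "Use mitigation_recommendations for remediation or fairness_metrics for quantitative assessment",
        "meok_labs": "https://meok.ai",
    }

Note that errors are returned in-band as {"error": ...} dicts (see the access and rate-limit checks in the handler) rather than as protocol-level failures.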

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Input schema has 0% description coverage, so the description compensates by explaining each parameter: model_output as 'the AI-generated text,' protected_attributes with an example and an auto-detection note, and api_key as optional for pro tier. This adds significant meaning beyond the bare schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states its purpose: 'Analyze text for demographic bias patterns, stereotyping, and unfair language.' The verb 'Analyze' and the specific resource (AI-generated text) make the purpose explicit, and the description distinguishes the tool from siblings such as 'fairness_metrics' and 'mitigation_recommendations.'

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description includes dedicated 'When to use' and 'When NOT to use' sections, providing context such as compliance use cases and a legal disclaimer. However, it does not explicitly compare the tool to its siblings or state when an alternative tool is the better choice.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

