glin-profanity-mcp


configuration.md•13 KiB

# Configuration Guide

Complete reference for all configuration options in glin-profanity.

## Table of Contents

- [FilterConfig](#filterconfig)
- [Language Options](#language-options)
- [Detection Options](#detection-options)
- [Context-Aware Options](#context-aware-options)
- [Performance Options](#performance-options)
- [Replacement Options](#replacement-options)
- [ML Options](#ml-options)
- [Custom Dictionaries](#custom-dictionaries)
- [Presets](#presets)
- [Environment Variables](#environment-variables)
- [Configuration Examples](#configuration-examples)

---

## FilterConfig

Complete TypeScript interface:

```typescript
interface FilterConfig {
  // Language Configuration
  languages?: Language[];                                 // Default: ['english']

  // Detection Options
  detectLeetspeak?: boolean;                              // Default: true
  leetspeakLevel?: 'basic' | 'moderate' | 'aggressive';   // Default: 'moderate'
  normalizeUnicode?: boolean;                             // Default: true
  caseSensitive?: boolean;                                // Default: false

  // Replacement Options
  replaceWith?: string;                                   // Default: '*'
  preserveLength?: boolean;                               // Default: true
  partialMatching?: boolean;                              // Default: true

  // Context-Aware Options
  enableContextAware?: boolean;                           // Default: false
  contextWindow?: number;                                 // Default: 3
  confidenceThreshold?: number;                           // Default: 0.7
  domainWhitelists?: Record<string, string[]>;            // Default: {}

  // Performance Options
  cacheResults?: boolean;                                 // Default: true
  cacheSize?: number;                                     // Default: 1000
  batchSize?: number;                                     // Default: 100

  // Advanced Options
  severityLevels?: boolean;                               // Default: false
  includeExplanations?: boolean;                          // Default: false
  strictMode?: boolean;                                   // Default: false
  customDictionary?: Map<string, number>;                 // Default: undefined
  excludeWords?: string[];                                // Default: []
}
```

---

## Language Options

### Supported Languages

Configure which languages to check against:

```javascript
const filter = new Filter({
  languages: ['english', 'spanish', 'french']
});
```

**Available Languages** (24 total):

| Code | Language | Dictionary Size |
|------|----------|----------------|
| `arabic` | Arabic (العربية) | ~800 words |
| `chinese` | Chinese (中文) | ~1200 words |
| `czech` | Czech (Čeština) | ~600 words |
| `danish` | Danish (Dansk) | ~500 words |
| `dutch` | Dutch (Nederlands) | ~700 words |
| `english` | English | ~1500 words |
| `esperanto` | Esperanto | ~400 words |
| `finnish` | Finnish (Suomi) | ~550 words |
| `french` | French (Français) | ~900 words |
| `german` | German (Deutsch) | ~800 words |
| `hindi` | Hindi (हिन्दी) | ~700 words |
| `hungarian` | Hungarian (Magyar) | ~600 words |
| `italian` | Italian (Italiano) | ~850 words |
| `japanese` | Japanese (日本語) | ~900 words |
| `korean` | Korean (한국어) | ~750 words |
| `norwegian` | Norwegian (Norsk) | ~500 words |
| `persian` | Persian (فارسی) | ~650 words |
| `polish` | Polish (Polski) | ~700 words |
| `portuguese` | Portuguese (Português) | ~800 words |
| `russian` | Russian (Русский) | ~950 words |
| `spanish` | Spanish (Español) | ~900 words |
| `swedish` | Swedish (Svenska) | ~550 words |
| `thai` | Thai (ไทย) | ~600 words |
| `turkish` | Turkish (Türkçe) | ~650 words |

### Single Language (Fastest)

```javascript
const filter = new Filter({
  languages: ['english']
});
```

### Multiple Languages

```javascript
const filter = new Filter({
  languages: ['english', 'spanish', 'french']
});
```

### All Languages (Slower)

```javascript
import { Filter, SUPPORTED_LANGUAGES } from 'glin-profanity';

const filter = new Filter({
  languages: SUPPORTED_LANGUAGES // All 24 languages
});
```

---

## Detection Options

### Leetspeak Detection

Detect obfuscated profanity like `f4ck`, `5h1t`, `@$$`.

```javascript
const filter = new Filter({
  detectLeetspeak: true,
  leetspeakLevel: 'aggressive' // basic | moderate | aggressive
});

filter.isProfane('f4ck'); // true
filter.isProfane('5h1t'); // true
filter.isProfane('@$$');  // true (aggressive mode)
```

**Leetspeak Levels:**

| Level | Examples | Performance | False Positives |
|-------|----------|-------------|-----------------|
| `basic` | `f4ck`, `5h1t`, `@ss` | Fast | Low |
| `moderate` | + `ph.uck`, `b!tch` | Medium | Medium |
| `aggressive` | + `ƒ.u.c.k`, `sh_it` | Slower | Higher |

### Unicode Normalization

Detect homoglyphs like `fսck` (Armenian), `shіt` (Cyrillic).

```javascript
const filter = new Filter({
  normalizeUnicode: true
});

filter.isProfane('fսck'); // true (Armenian 'ս' → 'u')
filter.isProfane('shіt'); // true (Cyrillic 'і' → 'i')
filter.isProfane('ƒuck'); // true (Latin 'ƒ' → 'f')
```

**Supported Homoglyphs:**

- Latin Extended (ƒ, ſ, ʃ)
- Cyrillic (а, е, і, о, р, с, у, х)
- Greek (α, ε, ο, ρ)
- Armenian (ս)
- And 200+ more

### Case Sensitivity

```javascript
// Case-insensitive (default)
const filter = new Filter({
  caseSensitive: false
});

filter.isProfane('FUCK'); // true
filter.isProfane('Fuck'); // true
filter.isProfane('fuck'); // true

// Case-sensitive
const strictFilter = new Filter({
  caseSensitive: true
});

strictFilter.isProfane('FUCK'); // true
strictFilter.isProfane('Fuck'); // false (not in dictionary)
strictFilter.isProfane('fuck'); // true
```

### Partial Matching

```javascript
// Partial matching enabled (default)
const filter = new Filter({
  partialMatching: true
});

filter.isProfane('unfuckingbelievable'); // true

// Exact matching only
const exactFilter = new Filter({
  partialMatching: false
});

exactFilter.isProfane('unfuckingbelievable'); // false
exactFilter.isProfane('fuck');                // true
```

---

## Context-Aware Options

Enable intelligent detection that considers surrounding words.
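
To make "surrounding words" concrete, the sketch below shows one way a symmetric word window could be taken around a candidate term. It is an illustration only, not glin-profanity's implementation: `contextWindowAround` is a hypothetical helper, and splitting on single spaces is a simplifying assumption.

```javascript
// Hypothetical helper (NOT part of glin-profanity): returns up to
// `windowSize` words on each side of the word at position `index`.
function contextWindowAround(words, index, windowSize) {
  const start = Math.max(0, index - windowSize);
  const end = Math.min(words.length, index + windowSize + 1);
  return {
    before: words.slice(start, index),
    after: words.slice(index + 1, end),
  };
}

// Assumption for this sketch: words are separated by single spaces.
const words = 'The damn door is stuck again'.split(' ');
const ctx = contextWindowAround(words, words.indexOf('damn'), 3);

console.log(ctx.before); // [ 'The' ]
console.log(ctx.after);  // [ 'door', 'is', 'stuck' ]
```

A window of 3 here sees one word before the term and three after it, since the window is clipped at the start of the text. A larger window gives a scorer more words to look at, at the cost of more work per match.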

### Basic Context Awareness

```javascript
const filter = new Filter({
  enableContextAware: true,
  contextWindow: 5,         // Look at 5 words before/after
  confidenceThreshold: 0.7  // 70% confidence required
});

const result = filter.checkProfanity('The damn door is stuck');
// contextScore: 0.4 (low toxicity in context)
// May not flag if threshold is 0.7
```

### Domain Whitelists

Allow certain words in specific contexts:

```javascript
const filter = new Filter({
  enableContextAware: true,
  domainWhitelists: {
    medical: ['rectum', 'penis', 'vagina', 'breast'],
    technical: ['cock' /* valve */, 'screw'],
    gaming: ['kill', 'destroy', 'murder']
  }
});

// With medical context
filter.checkProfanity('The rectum is part of the digestive system', {
  domain: 'medical'
});
// containsProfanity: false (whitelisted in medical context)

// Without context
filter.checkProfanity('The rectum is part of the digestive system');
// containsProfanity: true (not whitelisted)
```

### Context Window Size

```javascript
const filter = new Filter({
  enableContextAware: true,
  contextWindow: 10 // Larger window = more context, slower
});
```

**Recommended Values:**

- `contextWindow: 3` - Fast, basic local context
- `contextWindow: 5` - Balanced (recommended)
- `contextWindow: 10` - Comprehensive, most accurate
- `contextWindow: 20` - Slow, full sentence context

---

## Performance Options

### Result Caching

```javascript
const filter = new Filter({
  cacheResults: true,
  cacheSize: 5000 // LRU cache size
});

// First call: ~0.5ms
filter.isProfane('This is a test message');

// Subsequent calls: ~0.01ms (from cache)
filter.isProfane('This is a test message');
```

**Cache Size Guidelines:**

- `1000` - Small apps (< 10K users)
- `5000` - Medium apps (10K-100K users)
- `10000` - Large apps (100K+ users)
- `50000` - Very large apps (1M+ users)

### Batch Processing

```javascript
const filter = new Filter({
  batchSize: 200 // Process 200 texts at a time
});

const texts = [/* ... */]; // 1000 texts
const results = filter.batchCheck(texts);
// Automatically processes in batches of 200
```

---

## Replacement Options

### Custom Replacement Character

```javascript
const filter = new Filter({
  replaceWith: '***'
});

filter.censorText('shit happens');
// Result: '*** happens'
```

### Preserve Length

```javascript
// Preserve length (default)
const filter1 = new Filter({
  replaceWith: '*',
  preserveLength: true
});
filter1.censorText('fuck'); // '****'

// Don't preserve length
const filter2 = new Filter({
  replaceWith: '***',
  preserveLength: false
});
filter2.censorText('fuck'); // '***'
```

---

## ML Options

### Toxicity Detection

```javascript
import { Filter } from 'glin-profanity';
import { loadToxicityModel } from 'glin-profanity/ml';

// Load ML model
await loadToxicityModel({
  threshold: 0.9,   // 90% confidence required
  model: 'toxicity' // or 'toxicity-fast'
});

// Create filter with ML enabled
const filter = new Filter({
  enableML: true,
  mlThreshold: 0.9
});

const result = await filter.checkProfanityML('You are the worst person ever');
// toxic: true
// categories: { toxicity: 0.95, insult: 0.89, ... }
```

**Available Models:**

- `toxicity` - Full model (23MB, most accurate)
- `toxicity-fast` - Fast model (5MB, good accuracy)

---

## Custom Dictionaries

### Add Custom Words

```javascript
const filter = new Filter({
  customDictionary: new Map([
    ['badword1', 1.0], // severity 1.0 (highest)
    ['badword2', 0.5], // severity 0.5 (medium)
    ['badword3', 0.2], // severity 0.2 (low)
  ])
});

filter.isProfane('badword1'); // true
```

### Exclude Words (Whitelist)

```javascript
const filter = new Filter({
  excludeWords: ['damn', 'hell', 'crap'] // Allow these words
});

filter.isProfane('damn it'); // false (excluded)
filter.isProfane('fuck');    // true (not excluded)
```

---

## Presets

Pre-configured settings for common use cases:

### Strict (Family-Friendly)

```javascript
import { Filter, PRESETS } from 'glin-profanity';

const filter = new Filter(PRESETS.STRICT);

// Equivalent to:
// {
//   languages: ['english'],
//   detectLeetspeak: true,
//   leetspeakLevel: 'aggressive',
//   normalizeUnicode: true,
//   partialMatching: true,
//   strictMode: true,
//   severityLevels: true,
//   contextWindow: 10
// }
```

### Moderate (Recommended)

```javascript
const filter = new Filter(PRESETS.MODERATE);
// Balanced settings for most applications
```

### Lenient (Casual)

```javascript
const filter = new Filter(PRESETS.LENIENT);
// Only catches obvious profanity
```

### Custom Preset

```javascript
import { Filter, PRESETS, createPreset } from 'glin-profanity';

const myPreset = createPreset({
  base: PRESETS.MODERATE,
  overrides: {
    languages: ['english', 'spanish'],
    excludeWords: ['damn', 'hell']
  }
});

const filter = new Filter(myPreset);
```

---

## Environment Variables

Configure via environment variables:

```bash
# .env
GLIN_PROFANITY_LANGUAGES=english,spanish,french
GLIN_PROFANITY_LEETSPEAK=true
GLIN_PROFANITY_LEETSPEAK_LEVEL=aggressive
GLIN_PROFANITY_UNICODE=true
GLIN_PROFANITY_CACHE=true
GLIN_PROFANITY_CACHE_SIZE=5000
```

**Usage:**

```javascript
import { Filter } from 'glin-profanity';

// Auto-loads from environment variables
const filter = new Filter();
```

---

## Configuration Examples

### Real-Time Chat Application

```javascript
const filter = new Filter({
  languages: ['english'],
  detectLeetspeak: true,
  leetspeakLevel: 'aggressive',
  normalizeUnicode: true,
  cacheResults: true,
  cacheSize: 10000,
  severityLevels: true
});
```

### Content Moderation Platform

```javascript
const filter = new Filter({
  languages: SUPPORTED_LANGUAGES, // All languages
  detectLeetspeak: true,
  enableContextAware: true,
  contextWindow: 10,
  severityLevels: true,
  includeExplanations: true,
  cacheResults: true,
  cacheSize: 50000
});
```

### Educational Platform (Strict)

```javascript
const filter = new Filter({
  languages: ['english'],
  detectLeetspeak: true,
  leetspeakLevel: 'aggressive',
  normalizeUnicode: true,
  partialMatching: true,
  strictMode: true,
  excludeWords: [], // No exceptions
  severityLevels: true
});
```

### Gaming Platform (Moderate)

```javascript
const filter = new Filter({
  languages: ['english', 'spanish', 'portuguese'],
  detectLeetspeak: true,
  leetspeakLevel: 'moderate',
  excludeWords: ['kill', 'destroy', 'murder'], // Game terms
  domainWhitelists: {
    gaming: ['kill', 'destroy', 'murder', 'dead', 'die']
  },
  enableContextAware: true,
  contextWindow: 5
});
```

### Medical/Technical Content

```javascript
const filter = new Filter({
  languages: ['english'],
  detectLeetspeak: false, // Technical terms might trigger false positives
  enableContextAware: true,
  contextWindow: 10,
  domainWhitelists: {
    medical: ['rectum', 'penis', 'vagina', 'breast', 'anal'],
    technical: ['cock', 'screw', 'ballcock']
  },
  confidenceThreshold: 0.9 // High confidence required
});
```

---

## Next Steps

- [Getting Started](./getting-started.md) - Basic usage
- [API Reference](./api-reference.md) - Full API documentation
- [Advanced Features](./advanced-features.md) - ML, context-aware detection
- [Performance Guide](./performance.md) - Optimization tips

---

**Questions?** Check our [FAQ](./faq.md) or open an issue on [GitHub](https://github.com/GLINCKER/glin-profanity/issues).
