# ML-Based Toxicity Detection Guide
This guide covers the optional Machine Learning integration in glin-profanity v3.0+.
## Overview
The ML module provides TensorFlow.js-powered toxicity detection for context-aware content filtering beyond simple keyword matching.
**Key Benefits:**
- Detects subtle toxicity, insults, and threats that keywords miss
- Context-aware analysis (understands meaning, not just words)
- Configurable confidence thresholds
- Works alongside rule-based filtering for comprehensive coverage
## Installation
The ML module requires optional peer dependencies:
```bash
npm install @tensorflow/tfjs @tensorflow-models/toxicity
```
## Usage
### Standalone ToxicityDetector
For ML-only toxicity analysis:
```typescript
import { ToxicityDetector } from 'glin-profanity/ml';
const detector = new ToxicityDetector({
  threshold: 0.9 // Confidence threshold (0-1)
});
// Load the model (downloads ~5MB on first use)
await detector.loadModel();
// Analyze text
const result = await detector.analyze('you are terrible');
console.log(result.isToxic); // true/false
console.log(result.predictions); // Array of category predictions
console.log(result.matchedCategories); // ['insult', 'toxicity']
```
### HybridFilter (Rules + ML)
Combines rule-based profanity detection with ML analysis:
```typescript
import { HybridFilter } from 'glin-profanity/ml';
const filter = new HybridFilter({
  // Rule-based options
  languages: ['english'],
  detectLeetspeak: true,
  normalizeUnicode: true,

  // ML options
  enableML: true,
  mlThreshold: 0.85,
  combinationMode: 'or', // 'or' | 'and' | 'ml-override' | 'rules-first'
});
// Initialize (loads ML model)
await filter.initialize();
// Async hybrid check (rules + ML)
const result = await filter.checkProfanityAsync('text to analyze');
console.log(result.containsProfanity); // Rule-based result
console.log(result.isToxic); // ML result
console.log(result.mlResult); // Full ML analysis
// Sync rule-based check (fast, no ML)
filter.isProfane('badword'); // true
```
## Combination Modes
The `combinationMode` option controls how rule-based and ML results combine:
| Mode | Description |
|------|-------------|
| `'or'` | Flag if EITHER rules OR ML detect issues (default) |
| `'and'` | Flag only if BOTH rules AND ML agree |
| `'ml-override'` | ML result takes precedence over rules |
| `'rules-first'` | Use ML only if rules find nothing |
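The four modes can be sketched as a small decision function. This is a hypothetical illustration of the table above, not the library's internals; `ml` is passed as a thunk so the sketch can also show *when* the expensive ML pass actually runs:

```typescript
type CombinationMode = 'or' | 'and' | 'ml-override' | 'rules-first';

// Hypothetical sketch of the combination logic described in the table.
function combine(mode: CombinationMode, rulesFlag: boolean, ml: () => boolean): boolean {
  switch (mode) {
    case 'or': {
      const mlFlag = ml();      // both detectors always run
      return rulesFlag || mlFlag;
    }
    case 'and':
      return rulesFlag && ml(); // ML consulted only if rules flagged
    case 'ml-override':
      return ml();              // rules result is ignored
    case 'rules-first':
      return rulesFlag || ml(); // ML runs only when rules found nothing
  }
}
```

Note that `'or'` and `'rules-first'` produce the same final boolean; the difference is cost, since `'rules-first'` skips the ML pass entirely once the rules have already flagged the text.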
## ML Categories
The toxicity model detects these categories:
| Category | Description |
|----------|-------------|
| `toxicity` | General toxic content |
| `severe_toxicity` | Highly toxic content |
| `insult` | Personal insults and attacks |
| `threat` | Threatening language |
| `identity_attack` | Identity-based hate speech |
| `obscene` | Obscene/vulgar content |
| `sexual_explicit` | Sexually explicit content |
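Conceptually, `matchedCategories` in a result is just the set of categories whose confidence clears the threshold. A minimal sketch of that interpretation, using a simplified prediction shape (the real `ToxicityPrediction` carries richer per-category data):

```typescript
// Simplified, hypothetical prediction shape for illustration only.
interface SimplePrediction {
  label: string;       // e.g. 'insult', 'threat'
  probability: number; // model confidence in [0, 1]
}

// Keep only the categories whose confidence meets the threshold.
function matchedCategories(preds: SimplePrediction[], threshold: number): string[] {
  return preds.filter(p => p.probability >= threshold).map(p => p.label);
}
```

Raising the threshold shrinks this set, which is exactly the trade-off the `threshold`/`mlThreshold` options control.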
## Performance Considerations
### First Load
- Model downloads ~5MB from TensorFlow Hub
- Takes 2-5 seconds depending on connection
- Browser caches model files for subsequent loads
### Analysis Speed
- ML analysis: ~500ms-2s per text
- Rule-based: ~0.04ms per text
- Use rule-based for real-time (typing) validation
- Use ML for submit/post validation
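Given that speed gap, a common pattern is the sync rule check per keystroke and the full hybrid check only on submit. A minimal sketch, stubbing the filter with structural types (the names and wiring are illustrative, not prescribed by the library):

```typescript
// Fast path: safe to run on every keystroke (~0.04ms).
function onType(filter: { isProfane(t: string): boolean }, text: string): boolean {
  return filter.isProfane(text);
}

// Slow path: run once, when the user submits (~0.5-2s with ML).
async function onSubmit(
  filter: {
    checkProfanityAsync(t: string): Promise<{ containsProfanity: boolean; isToxic: boolean }>;
  },
  text: string
): Promise<boolean> {
  const r = await filter.checkProfanityAsync(text);
  return r.containsProfanity || r.isToxic;
}
```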
### Offline Usage
The model requires an internet connection for first download. For offline apps:
**Option 1: Browser Cache**
```javascript
// Model cached after first load
// Works offline on subsequent page loads
```
**Option 2: Service Worker**
```javascript
// Serve TensorFlow model files from cache, fetching and caching on a miss
self.addEventListener('fetch', event => {
  if (!event.request.url.includes('tensorflow')) return;
  event.respondWith(caches.open('tf-model').then(async cache => {
    const cached = await cache.match(event.request);
    if (cached) return cached;
    const response = await fetch(event.request);
    cache.put(event.request, response.clone());
    return response;
  }));
});
```
**Option 3: IndexedDB (TensorFlow.js native)**
```typescript
// Save the underlying graph model after the first load
await model.save('indexeddb://toxicity-model');

// Later: load it back from IndexedDB instead of re-downloading
const cachedModel = await tf.loadGraphModel('indexeddb://toxicity-model');
```
## Best Practices
### 1. Use Appropriate Thresholds
```typescript
// Stricter (fewer false positives, may miss subtle toxicity)
mlThreshold: 0.95
// Balanced (recommended)
mlThreshold: 0.85
// Lenient (catches more, more false positives)
mlThreshold: 0.7
```
### 2. Combine with Rules
```typescript
// Best coverage: use both
const filter = new HybridFilter({
  languages: ['english'],
  detectLeetspeak: true,
  enableML: true,
  combinationMode: 'or',
});
```
### 3. Handle Loading States
```typescript
const [isReady, setIsReady] = useState(false);
useEffect(() => {
  filter.initialize().then(() => setIsReady(true));
}, []);
// Show loading state while model loads
if (!isReady) return <LoadingSpinner />;
```
### 4. Graceful Fallback
```typescript
try {
  await filter.initialize();
} catch (err) {
  console.warn('ML unavailable, using rules only');
  // Filter still works with rule-based detection
}
```
## API Reference
### ToxicityDetector
```typescript
interface MLDetectorConfig {
  threshold?: number;       // Default: 0.9
  labels?: ToxicityLabel[]; // Which categories to detect
}

class ToxicityDetector {
  constructor(config?: MLDetectorConfig);
  loadModel(): Promise<void>;
  analyze(text: string): Promise<MLAnalysisResult>;
  isModelLoaded(): boolean;
}
```
### HybridFilter
```typescript
interface HybridFilterConfig extends FilterConfig {
  enableML?: boolean;
  mlThreshold?: number;
  combinationMode?: 'or' | 'and' | 'ml-override' | 'rules-first';
}

class HybridFilter extends Filter {
  constructor(config?: HybridFilterConfig);
  initialize(): Promise<void>;
  checkProfanityAsync(text: string): Promise<HybridAnalysisResult>;
}
```
### Result Types
```typescript
interface MLAnalysisResult {
  isToxic: boolean;
  predictions: ToxicityPrediction[];
  matchedCategories: ToxicityLabel[];
  processingTime: number;
}

interface HybridAnalysisResult extends CheckProfanityResult {
  isToxic: boolean;
  mlResult?: MLAnalysisResult;
  confidence: number;
}
```
## Troubleshooting
### CORS Errors on Localhost
TensorFlow Hub may block requests from localhost. Solutions:
1. Deploy to a real domain for testing
2. Use a proxy server
3. Pre-download and host model files locally
### "No backend found" Error
Ensure TensorFlow.js is imported before the toxicity model:
```typescript
// Correct order
import '@tensorflow/tfjs';
import * as toxicity from '@tensorflow-models/toxicity';
```
### Model Too Large
The toxicity model is ~5MB. Alternatives:
- Use rule-based only for size-sensitive apps
- Load model on-demand (not at app start)
- Consider server-side ML for web apps
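Loading on-demand can be done with a memoized async loader, so the dynamic import and the ~5MB model download happen at most once, and only when first needed. This is a hypothetical helper, not part of glin-profanity:

```typescript
// Memoize an async loader: the first call starts the load, every later
// call (even a concurrent one) reuses the same pending promise.
function once<T>(load: () => Promise<T>): () => Promise<T> {
  let pending: Promise<T> | undefined;
  return () => (pending ??= load());
}

// Usage sketch (hypothetical wiring): defer import + model download
// until the first submit instead of paying the cost at app start.
// const getDetector = once(async () => {
//   const { ToxicityDetector } = await import('glin-profanity/ml');
//   const detector = new ToxicityDetector({ threshold: 0.9 });
//   await detector.loadModel();
//   return detector;
// });
```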