REMnux MCP Server

Official
by REMnux

extract_iocs

Extract Indicators of Compromise (IOCs) such as IPs, domains, URLs, and hashes from text output to identify security threats. Processes output from malware analysis tools and returns deduplicated results with confidence scores.

Instructions

Extract IOCs (IPs, domains, URLs, hashes, registry keys, etc.) from text. Pass output from run_tool or analyze_file to identify indicators. Works well with Volatility 3 plugin output (netscan, cmdline, filescan). Returns deduplicated IOCs with confidence scores.
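
As a rough illustration of the instructions above, a client could invoke the tool with an MCP "tools/call" request shaped like the sketch below. The sample text and the request id are made up for illustration; real input would be the output of run_tool or analyze_file, and the exact transport depends on the client.

```typescript
// Hypothetical sample text standing in for output from run_tool or analyze_file.
const sampleText =
  "Beacon to 203.0.113.10 via http://malware.example.com/payload";

// Sketch of an MCP "tools/call" request (JSON-RPC 2.0). The id is arbitrary.
const request = {
  jsonrpc: "2.0",
  id: 1,
  method: "tools/call",
  params: {
    name: "extract_iocs",
    arguments: {
      text: sampleText,           // required
      include_noise: false,       // optional
      include_private_ips: false, // optional
    },
  },
};

console.log(JSON.stringify(request, null, 2));
```

Only text is required; the two flags can be omitted when the defaults are acceptable.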

Input Schema

Name | Required | Description | Default
text | Yes | Text to extract IOCs from (e.g., output from run_tool or analyze_file) | (none)
include_noise | No | Include low-confidence known-good IOCs | false
include_private_ips | No | Include private/internal IP addresses (10.x, 172.16-31.x, 192.168.x) | false
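
The private ranges named for include_private_ips can be sketched as a small check. This is a minimal illustration, not the server's actual filter: isPrivateIPv4 is a hypothetical helper, and it covers only the three ranges listed in the parameter description (not loopback or link-local addresses).

```typescript
// Hypothetical helper, not taken from the server's source.
// Returns true for the IPv4 ranges listed in the parameter description.
function isPrivateIPv4(ip: string): boolean {
  const parts = ip.split(".");
  if (parts.length !== 4) return false;
  // Each octet must be 1-3 decimal digits; rejects empty or non-numeric parts.
  if (!parts.every((p) => /^\d{1,3}$/.test(p))) return false;
  const octets = parts.map((p) => Number(p));
  if (octets.some((o) => o > 255)) return false;

  const [a = -1, b = -1] = octets;
  if (a === 10) return true;                        // 10.0.0.0/8
  if (a === 172 && b >= 16 && b <= 31) return true; // 172.16.0.0/12
  if (a === 192 && b === 168) return true;          // 192.168.0.0/16
  return false;
}

console.log(isPrivateIPv4("172.20.1.5")); // true
console.log(isPrivateIPv4("172.32.1.5")); // false
```

Under this reading, an address such as 172.32.1.5 falls outside 172.16-31.x and would be treated as public.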

Implementation Reference

  • The main MCP tool handler that processes extract_iocs requests. Takes ExtractIOCsArgs, calls extractIOCs() helper, and formats the response with iocs, summary, and optional noise fields.
    export async function handleExtractIOCs(
      _deps: HandlerDeps,
      args: ExtractIOCsArgs
    ) {
      const startTime = Date.now();
    
      try {
        const result = extractIOCs(args.text, {
          includePrivateIPs: args.include_private_ips,
        });
    
        const data: Record<string, unknown> = {
          iocs: result.iocs,
          summary: result.summary,
        };
    
        if (args.include_noise) {
          data.noise = result.noise;
        }
    
        return formatResponse("extract_iocs", data, startTime);
      } catch (error) {
        return formatError("extract_iocs", toREMnuxError(error), startTime);
      }
    }
  • Defines the Zod input validation schema for the extract_iocs tool: text (required), include_noise (optional, default false), and include_private_ips (optional, default false).
    export const extractIOCsSchema = z.object({
      text: z.string().describe("Text to extract IOCs from (e.g., output from run_tool or analyze_file)"),
      include_noise: z.boolean().optional().default(false).describe("Include low-confidence known-good IOCs"),
      include_private_ips: z.boolean().optional().default(false).describe("Include private/internal IP addresses (10.x, 172.16-31.x, 192.168.x)"),
    });
    export type ExtractIOCsArgs = z.infer<typeof extractIOCsSchema>;
  • src/index.ts:197-206 (registration)
    Registers the extract_iocs tool with the MCP server, including tool name, description, schema shape, and handler function binding.
    // Tool: extract_iocs - Extract IOCs from text
    server.tool(
      "extract_iocs",
      "Extract IOCs (IPs, domains, URLs, hashes, registry keys, etc.) from text. " +
      "Pass output from run_tool or analyze_file to identify indicators. " +
      "Works well with Volatility 3 plugin output (netscan, cmdline, filescan). " +
      "Returns deduplicated IOCs with confidence scores.",
      extractIOCsSchema.shape,
      (args) => handleExtractIOCs(deps, args)
    );
  • Core IOC extraction logic. It combines library-based extraction of standard IOC types (extractIOC) with custom pattern matching, then applies deduplication, noise filtering, confidence scoring, a per-type cap (max 25 entries per type), and summary generation.
    export function extractIOCs(text: string, options?: ExtractOptions): IOCResult {
      // 1. Standard extraction
      const libResult = extractIOC(text);
    
      // 2. Collect all entries (value, type) avoiding duplicates
      const seen = new Set<string>();
      const allEntries: IOCEntry[] = [];
    
      // Track values already classified as hashes to prevent dual-classification as crypto
      const hashValues = new Set<string>();
      const HASH_TYPES = new Set(["md5", "sha1", "sha256", "sha512"]);
      const CRYPTO_TYPES = new Set(["btc", "eth", "xmr"]);
    
      function add(value: string, type: string) {
        const key = `${type}::${value}`;
        if (seen.has(key)) return;
    
        // If already classified as a hash, skip crypto classification for same value
        if (CRYPTO_TYPES.has(type) && hashValues.has(value)) return;
    
        seen.add(key);
        if (HASH_TYPES.has(type)) hashValues.add(value);
        allEntries.push({ value, type, confidence: scoreIOC(value, type) });
      }
    
      // Standard types from library
      for (const [key, typeName] of Object.entries(TYPE_MAP)) {
        const values = (libResult as unknown as Record<string, string[]>)[key];
        if (values) {
          for (const v of values) {
            add(v, typeName);
          }
        }
      }
    
      // 3. Custom patterns
      for (const m of extractCustomPatterns(text)) {
        add(m.value, m.type);
      }
    
      // 4. Split into iocs and noise
      const iocs: IOCEntry[] = [];
      const noise: IOCEntry[] = [];
    
      for (const entry of allEntries) {
        if (isNoise(entry.value, entry.type, options) || entry.confidence <= NOISE_THRESHOLD) {
          noise.push(entry);
        } else {
          iocs.push(entry);
        }
      }
    
      // 4b. Cap per-type to prevent hash floods (e.g., 397 MD5s from hex output)
      const MAX_PER_TYPE = 25;
      const byTypeCount: Record<string, number> = {};
      const truncatedTypes: string[] = [];
      const cappedIocs: IOCEntry[] = [];
    
      for (const entry of iocs) {
        const count = byTypeCount[entry.type] || 0;
        if (count < MAX_PER_TYPE) {
          cappedIocs.push(entry);
        }
        byTypeCount[entry.type] = count + 1;
      }
    
      for (const [type, count] of Object.entries(byTypeCount)) {
        if (count > MAX_PER_TYPE) {
          truncatedTypes.push(`${type}: showing ${MAX_PER_TYPE} of ${count}`);
        }
      }
    
      // 5. Build summary
      const byType: Record<string, number> = {};
      for (const entry of cappedIocs) {
        byType[entry.type] = (byType[entry.type] || 0) + 1;
      }
    
      return {
        iocs: cappedIocs,
        noise,
        summary: {
          total: cappedIocs.length,
          noise_filtered: noise.length,
          by_type: byType,
          ...(truncatedTypes.length > 0 && { truncated: truncatedTypes }),
        },
      };
    }
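
To make the deduplication and per-type cap in the listing above easier to follow, here is a stripped-down sketch of just those two steps. It is a simplified stand-in, not the server's code: the Entry type, the dedupe and capPerType function names, and the "domain" type label are assumptions for illustration, and confidence scoring and noise filtering are left out.

```typescript
interface Entry { value: string; type: string }

const HASH_TYPES = new Set(["md5", "sha1", "sha256", "sha512"]);
const CRYPTO_TYPES = new Set(["btc", "eth", "xmr"]);
const MAX_PER_TYPE = 25; // same limit as in the listing above

// Drop exact (type, value) repeats, and skip a crypto classification
// for any value that was already recorded as a hash.
function dedupe(entries: Entry[]): Entry[] {
  const seen = new Set<string>();
  const hashValues = new Set<string>();
  const out: Entry[] = [];
  for (const { value, type } of entries) {
    const key = `${type}::${value}`;
    if (seen.has(key)) continue;
    if (CRYPTO_TYPES.has(type) && hashValues.has(value)) continue;
    seen.add(key);
    if (HASH_TYPES.has(type)) hashValues.add(value);
    out.push({ value, type });
  }
  return out;
}

// Keep at most MAX_PER_TYPE entries per type and report what was truncated.
function capPerType(entries: Entry[]): { kept: Entry[]; truncated: string[] } {
  const counts: Record<string, number> = {};
  const kept: Entry[] = [];
  for (const e of entries) {
    const count = counts[e.type] ?? 0;
    if (count < MAX_PER_TYPE) kept.push(e);
    counts[e.type] = count + 1;
  }
  const truncated = Object.entries(counts)
    .filter(([, n]) => n > MAX_PER_TYPE)
    .map(([type, n]) => `${type}: showing ${MAX_PER_TYPE} of ${n}`);
  return { kept, truncated };
}

// Same value seen as md5 (twice), xmr, and domain: one md5 and one domain survive.
const deduped = dedupe([
  { value: "abc", type: "md5" },
  { value: "abc", type: "md5" },
  { value: "abc", type: "xmr" },
  { value: "abc", type: "domain" },
]);
console.log(deduped.length); // 2

// 30 synthetic md5-typed entries plus 2 domains.
const entries: Entry[] = [
  ...Array.from({ length: 30 }, (_, i) => ({ value: `hash-${i}`, type: "md5" })),
  { value: "a.example.com", type: "domain" },
  { value: "b.example.com", type: "domain" },
];
const { kept, truncated } = capPerType(entries);
console.log(kept.length); // 27
console.log(truncated);   // ["md5: showing 25 of 30"]
```

Two points follow from this logic. The hash-over-crypto rule only applies when the hash classification is added before the crypto one, so it depends on the order in which types are processed. Also, the cap keeps the first 25 entries of each type in encounter order rather than the 25 with the highest confidence.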
Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries the full burden of behavioral disclosure. It effectively describes key behavioral traits: the tool returns deduplicated IOCs with confidence scores, mentions handling of low-confidence items via parameters, and implies processing of forensic/memory analysis data. It doesn't cover rate limits, authentication needs, or error conditions, but provides substantial operational context.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Three tightly constructed sentences with zero waste. The first states the core purpose, the second provides usage context and integration points, the third describes output characteristics. Every sentence earns its place by adding distinct value.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a 3-parameter tool with no annotations and no output schema, the description provides strong context about what the tool does, when to use it, and what to expect in results. It could be more complete by explicitly describing the output format (beyond 'deduplicated IOCs with confidence scores') or error conditions, but covers the essential operational context well.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so the schema already fully documents all three parameters. The phrase 'low-confidence known-good IOCs' comes from the schema's include_noise entry rather than from the description itself, which adds no semantic context about the parameters beyond what the schema provides. Baseline 3 is appropriate when the schema does the heavy lifting.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the specific action ('extract') and resource ('IOCs from text'), listing concrete indicator types (IPs, domains, URLs, hashes, registry keys). It distinguishes from siblings by specifying its role in the workflow (processing output from run_tool or analyze_file) rather than performing analysis or file operations directly.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides clear context for when to use this tool ('Pass output from run_tool or analyze_file to identify indicators') and mentions specific compatible data sources ('Works well with Volatility 3 plugin output'). However, it doesn't explicitly state when NOT to use it or name alternatives among siblings for similar extraction tasks.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/REMnux/remnux-mcp-server'

If you have feedback or need assistance with the MCP directory API, please join our Discord server.