brand_extract_pdf
Extracts brand colors, typography, spacing, and rules from PDF brand guidelines via text extraction and pattern matching. Writes authoritative values to core-identity.yaml, outranking web sources.
Instructions
Extract brand colors, typography, spacing, and guideline rules from a PDF brand guidelines document. Accepts a local file path to a PDF. Uses text extraction and pattern matching to identify hex color values, font names, size specifications, and spacing rules. Writes extracted values to core-identity.yaml with source='guidelines' and updates source-catalog.json. Guidelines source outranks web extraction by default based on brand.config.yaml source_priority. Use when the user has brand guidelines as a PDF file — this is the most accurate extraction source. Use after brand_extract_web to merge authoritative guideline values with web-extracted data. Run brand_resolve_conflicts afterward to review any disagreements between sources. NOT for website extraction — use brand_extract_web. NOT for Figma — use brand_extract_figma.
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| file_path | Yes | Path to a PDF brand guidelines document. | |
| pages | No | Page range to parse: "all", "3", or "1-5". | all |
Implementation Reference
- src/tools/brand-extract-pdf.ts:18-104 (handler)Main handler function for brand_extract_pdf tool. Reads the .brand/ directory config and existing identity, calls extractPdfBrandData from pdf-extractor, merges extracted colors/typography/spacing with priority over existing identity, writes updated identity to core-identity.yaml, updates source catalog, and returns extraction results.
async function handler(input: Params) { const brandDir = new BrandDir(process.cwd()); if (!(await brandDir.exists())) { return buildResponse({ what_happened: "No .brand/ directory found", next_steps: ["Run brand_start or brand_init first to create the brand system"], data: { error: ERROR_CODES.NOT_INITIALIZED }, }); } const config = await brandDir.readConfig(); const sourcePriority = getConfiguredSourcePriority(config); const identity = await brandDir.readCoreIdentity(); let extracted; try { extracted = await extractPdfBrandData(input.file_path, input.pages); } catch (error) { return buildResponse({ what_happened: `PDF extraction failed for ${input.file_path}`, next_steps: [ "Check that the file exists and is a readable PDF", "If the PDF is image-only, try a selectable-text export or narrower page range", ], data: { success: false, error: ERROR_CODES.FETCH_FAILED, details: error instanceof Error ? error.message : String(error), }, }); } let colors = [...identity.colors]; for (const color of extracted.colors) { colors = mergeColorWithPriority(colors, color, sourcePriority); } let typography = [...identity.typography]; for (const entry of extracted.typography) { typography = mergeTypographyWithPriority(typography, entry, sourcePriority); } const spacing = extracted.spacing && ( !identity.spacing || sourcePriority.indexOf("guidelines") <= sourcePriority.indexOf(identity.spacing.source) ) ? extracted.spacing : identity.spacing; await brandDir.writeCoreIdentity({ ...identity, colors, typography, spacing, }); await upsertSourceCatalog( brandDir, buildSourceCatalogRecords({ colors: extracted.colors, typography: extracted.typography, spacing: extracted.spacing, }), ); return buildResponse({ what_happened: `PDF guideline extraction complete for ${extracted.filePath}`, next_steps: [ "Review the extracted colors, typography, spacing, and rule snippets", "Run brand_resolve_conflicts with mode \"show\" to inspect differences against existing sources", "Run brand_compile to refresh tokens, DESIGN.md, and runtime artifacts", ], data: { success: true, file_path: extracted.filePath, pages: extracted.pages, page_count: extracted.pageCount, extracted: { colors: extracted.colors, typography: extracted.typography, spacing: extracted.spacing, logos: extracted.logos, brand_rules: extracted.rules, }, }, }); } - src/tools/brand-extract-pdf.ts:106-117 (registration)Registers the 'brand_extract_pdf' tool on the MCP server with schema params (file_path, pages), a description string explaining usage, and a wrapper that validates inputs via Zod schema before delegating to the handler.
export function register(server: McpServer) { server.tool( "brand_extract_pdf", "Extract brand colors, typography, spacing, and guideline rules from a PDF brand guidelines document. Accepts a local file path to a PDF. Uses text extraction and pattern matching to identify hex color values, font names, size specifications, and spacing rules. Writes extracted values to core-identity.yaml with source='guidelines' and updates source-catalog.json. Guidelines source outranks web extraction by default based on brand.config.yaml source_priority. Use when the user has brand guidelines as a PDF file — this is the most accurate extraction source. Use after brand_extract_web to merge authoritative guideline values with web-extracted data. Run brand_resolve_conflicts afterward to review any disagreements between sources. NOT for website extraction — use brand_extract_web. NOT for Figma — use brand_extract_figma.", paramsShape, async (args) => { const result = safeParseParams(ParamsSchema, args); if (!result.success) return result.response; return handler(result.data); }, ); } - src/tools/brand-extract-pdf.ts:10-16 (schema)Zod schema defining input parameters: file_path (required string, path to PDF) and pages (optional string, default 'all', supports '3' or '1-5' range).
const paramsShape = { file_path: z.string().min(1).describe("Path to a PDF brand guidelines document."), pages: z.string().default("all").describe('Page range to parse: "all", "3", or "1-5".'), }; const ParamsSchema = z.object(paramsShape); type Params = z.infer<typeof ParamsSchema>; - src/lib/pdf-extractor.ts:190-241 (helper)Core PDF extraction logic using pdfjs-dist. Reads the PDF file, parses pages, extracts text content, then runs extractors for colors (hex values), typography (font families/sizes), spacing (base unit/scale), logo hints, and rules (dos/donts/guidance). Returns a PdfBrandExtraction object.
export async function extractPdfBrandData(filePath: string, pages = "all"): Promise<PdfBrandExtraction> { const resolvedPath = resolve(filePath); const dataBuffer = await readFile(resolvedPath); // Dynamic import — pdfjs-dist is ESM-only in v5+ const pdfjsLib = await import("pdfjs-dist/legacy/build/pdf.mjs"); const loadingTask = pdfjsLib.getDocument({ data: new Uint8Array(dataBuffer), useSystemFonts: true, }); const doc = await loadingTask.promise; const range = parsePageRange(pages); const maxPage = range.end ?? doc.numPages; const pageTexts: string[] = []; for (let i = range.start; i <= Math.min(maxPage, doc.numPages); i++) { const page = await doc.getPage(i); const textContent = await page.getTextContent({ includeMarkedContent: false, }); const parts: string[] = []; let lastY = 0; for (const item of textContent.items) { if ("str" in item) { const y = (item as { str: string; transform: number[] }).transform[5]; if (lastY && y !== lastY) { parts.push(`\n${item.str}`); } else { parts.push(item.str); } lastY = y; } } pageTexts.push(parts.join("")); } const text = pageTexts.join("\n\n").replace(/\n{3,}/g, "\n\n").trim(); return { filePath: resolvedPath, pages, pageCount: doc.numPages, text, colors: extractColors(text), typography: extractTypography(text), spacing: extractSpacing(text), logos: extractLogoHints(text), rules: extractRules(text), }; } - src/server.ts:39-64 (registration)Import and registration call for brand_extract_pdf in the main server setup, placed in the 'Session 1: Core Identity' section.
import { register as registerExtractPdf } from "./tools/brand-extract-pdf.js"; import { register as registerResolveConflicts } from "./tools/brand-resolve-conflicts.js"; import { register as registerBrandcodeAuth } from "./tools/brand-brandcode-auth.js"; import { register as registerBrandcodeConnect } from "./tools/brand-brandcode-connect.js"; import { register as registerBrandcodeSync } from "./tools/brand-brandcode-sync.js"; import { register as registerBrandcodeStatus } from "./tools/brand-brandcode-status.js"; import { register as registerBrandcodeLive } from "./tools/brand-brandcode-live.js"; import { register as registerRepoConnect } from "./tools/brand-repo-connect.js"; import { register as registerRepoStatus } from "./tools/brand-repo-status.js"; import { register as registerEnrichSkill } from "./tools/brand-enrich-skill.js"; export function createServer(): McpServer { const server = new McpServer({ name: "brandsystem", version: getVersion(), }); // ── Entry points (register first — agents see these first) ── registerStart(server); // #1: Entry point for new brands registerStatus(server); // #2: "What can I do?" / resume point // ── Session 1: Core Identity ── registerExtractWeb(server); // Extract from website (CSS parsing) registerExtractVisual(server); // Extract from website (headless Chrome + vision) registerExtractSite(server); // Extract from website (multi-page rendered crawl) registerExtractPdf(server); // Extract from PDF guidelines