# @prodisco/search-libs

A generic library indexing + search solution using [Orama](https://orama.com/). Extract types, methods, and functions from TypeScript libraries (via `.d.ts`) and **ESM JavaScript libraries** (best-effort), index TypeScript scripts, and provide unified structured search for AI agents.

## Table of Contents

- [Features](#features)
- [Installation](#installation)
- [Quick Start](#quick-start)
- [API Reference](#api-reference)
- [Document Types](#document-types)
- [Architecture](#architecture)
- [How Library Indexing Works](#how-library-indexing-works-typescript--javascript)
- [Extending the Schema](#extending-the-schema)
- [License](#license)

## Features

- **Generic Library Extraction**: Extract types (classes, interfaces, enums, type-aliases) and methods/functions from npm packages using TypeScript AST parsing (TypeScript `.d.ts` + ESM JavaScript fallback)
- **Script Indexing**: Index TypeScript scripts with automatic metadata extraction (description, keywords, API references)
- **Unified Search**: Search across types, methods, functions, and scripts with structured queries and structured output
- **Extensible Schema**: Base Orama schema with support for custom extensions
- **AI-Optimized**: Structured output designed for AI code generation agents

## Installation

```bash
npm install @prodisco/search-libs
```

## Quick Start

```typescript
import { LibraryIndexer } from '@prodisco/search-libs';

// Create indexer with packages to extract
const indexer = new LibraryIndexer({
  packages: [
    { name: '@kubernetes/client-node' },
    { name: '@prodisco/prometheus-client' },
    { name: 'simple-statistics' },
  ],
});

// Initialize - extracts and indexes all packages
await indexer.initialize();

// Search across all indexed content
const results = await indexer.search({
  query: 'Pod',
  documentType: 'type',
  limit: 10,
});

console.log(results.results[0]);
// {
//   id: 'type:@kubernetes/client-node:V1Pod',
//   documentType: 'type',
//   name: 'V1Pod',
//   library: '@kubernetes/client-node',
//   category: 'interface',
//   description: 'Pod is a collection of containers...',
//   properties: [...],
//   typeKind: 'interface',
// }
```

## API Reference

### LibraryIndexer

The main entry point for indexing and searching.

```typescript
interface LibraryIndexerOptions {
  packages: PackageConfig[];
  basePath?: string; // Defaults to process.cwd()
}

interface PackageConfig {
  name: string; // npm package name
  typeFilter?: RegExp | ((name: string) => boolean);
  methodFilter?: RegExp | ((name: string) => boolean);
}
```

#### Methods

##### `initialize(): Promise<{ indexed: number; errors: ExtractionError[] }>`

Extracts and indexes all configured packages.

##### `search(options: SearchOptions): Promise<SearchResult>`

Search the index with structured queries.

```typescript
interface SearchOptions {
  query?: string;        // Full-text search term
  documentType?: string; // 'type' | 'method' | 'function' | 'script' | 'all'
  category?: string;     // Filter by category
  library?: string;      // Filter by library
  limit?: number;        // Max results (default: 10)
  offset?: number;       // Pagination offset
}

interface SearchResult {
  results: IndexedDocument[];
  totalMatches: number;
  facets: {
    documentType: Record<string, number>;
    library: Record<string, number>;
    category: Record<string, number>;
  };
  searchTime: number;
}
```

##### `addScript(filePath: string): Promise<void>`

Add a TypeScript script to the index. Automatically parses for:

- Description (from first comment block)
- Keywords (from description)
- Resource types (from filename and content AST)
- API references (from content AST)

##### `addScriptsFromDirectory(dirPath: string): Promise<void>`

Add all TypeScript scripts from a directory.

##### `removeScript(filePath: string): Promise<void>`

Remove a script from the index.

##### `addDocuments(docs: IndexedDocument[]): Promise<void>`

Add custom documents to the index (e.g., from external sources).

##### `shutdown(): Promise<void>`

Clean up resources.
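The `limit` and `offset` fields of `SearchOptions` suggest offset-based pagination. The following is a minimal sketch of paging through every match, assuming `totalMatches` reports the total across all pages (the README does not confirm this). `collectAll`, `PageOptions`, `Page`, and the in-memory `fakeSearch` are hypothetical names used for illustration and are not part of the library.

```typescript
// Minimal shapes mirroring SearchOptions / SearchResult; only the fields used here.
interface PageOptions {
  query?: string;
  limit?: number;
  offset?: number;
}

interface Page<T> {
  results: T[];
  totalMatches: number;
}

type SearchFn<T> = (options: PageOptions) => Promise<Page<T>>;

// Hypothetical helper: requests pages until totalMatches items have been read.
async function collectAll<T>(
  search: SearchFn<T>,
  options: PageOptions,
  pageSize = 10,
): Promise<T[]> {
  const all: T[] = [];
  let offset = 0;
  while (true) {
    const page = await search({ ...options, limit: pageSize, offset });
    all.push(...page.results);
    offset += page.results.length;
    // Also stop on an empty page, so an inaccurate totalMatches cannot loop forever.
    if (page.results.length === 0 || offset >= page.totalMatches) break;
  }
  return all;
}

// Stand-in for indexer.search, backed by an in-memory array (assumption).
const names = Array.from({ length: 25 }, (_, i) => `V1Type${i}`);
const fakeSearch: SearchFn<string> = async ({ limit = 10, offset = 0 }) => ({
  results: names.slice(offset, offset + limit),
  totalMatches: names.length,
});

collectAll(fakeSearch, { query: 'V1' }, 10).then((all) => console.log(all.length)); // logs 25
```

With a real indexer, the first argument would presumably be something like `(o) => indexer.search(o)`, with the remaining `SearchOptions` filters passed through unchanged.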
## Document Types

### Type Documents

Extracted from `.d.ts` files (preferred). If no `.d.ts` is found, types/classes can be extracted from ESM JavaScript source (`.js/.mjs`) as a best-effort fallback (parameter/return types default to `any`).

```typescript
{
  id: 'type:@kubernetes/client-node:V1Pod',
  documentType: 'type',
  name: 'V1Pod',
  library: '@kubernetes/client-node',
  category: 'interface',
  description: 'Pod is a collection of containers...',
  properties: [
    { name: 'metadata', type: 'V1ObjectMeta', optional: true },
    { name: 'spec', type: 'V1PodSpec', optional: true },
  ],
  typeKind: 'interface',
  nestedTypes: ['V1ObjectMeta', 'V1PodSpec'],
}
```

### Method Documents

Extracted from class methods:

```typescript
{
  id: 'method:@kubernetes/client-node:CoreV1Api:listNamespacedPod',
  documentType: 'method',
  name: 'listNamespacedPod',
  library: '@kubernetes/client-node',
  category: 'list',
  description: 'List pods in a namespace',
  parameters: [
    { name: 'namespace', type: 'string', optional: false },
  ],
  returnType: 'Promise<V1PodList>',
  signature: 'listNamespacedPod(namespace: string): Promise<V1PodList>',
}
```

### Script Documents

Indexed from TypeScript files:

```typescript
{
  id: 'script:get-pod-logs.ts',
  documentType: 'script',
  name: 'get-pod-logs',
  library: 'CachedScript',
  category: 'script',
  description: 'Retrieves logs from a Kubernetes pod',
  filePath: '/path/to/scripts/get-pod-logs.ts',
  keywords: 'logs pod kubernetes',
}
```

## Architecture

```
search-libs/
├── extractor/              # TypeScript AST extraction
│   ├── type-extractor      # Extract classes, interfaces, enums
│   ├── method-extractor    # Extract methods from classes
│   ├── function-extractor  # Extract standalone functions
│   └── package-resolver    # Find .d.ts files or ESM JS entrypoints in node_modules
├── script/                 # Script parsing
│   └── script-parser       # Parse scripts for metadata
├── schema/                 # Orama schema
│   ├── base-schema         # Core schema fields
│   └── schema-builder      # Extensibility
└── search/                 # Search engine
    ├── search-engine       # Orama wrapper
    ├── query-builder       # Fluent query API
    └── result-formatter    # Format for AI consumption
```

## How library indexing works (TypeScript + JavaScript)

This section explains what happens when you call `LibraryIndexer.initialize()` for a package in `node_modules`, and how `search-libs` decides whether to index from **TypeScript declarations** or **JavaScript source**.

### High-level flow

At a high level, indexing a package looks like:

- **Resolve package folder**: `basePath/node_modules/<packageName>/`
- **Decide extraction strategy**:
  - Prefer **TypeScript declarations** (`.d.ts`) when discoverable
  - Otherwise, attempt **ESM JavaScript source fallback** (`.js/.mjs`)
- **Extract documents**:
  - Types (classes/interfaces/enums/type-aliases)
  - Methods (class methods)
  - Functions (standalone functions)
- **Insert into Orama** and expose them via `search()`

### TypeScript packages (declaration-first)

For TypeScript libraries (or JS libraries that ship `.d.ts`), extraction is **declaration-first**:

#### 1) Finding `.d.ts` files

`search-libs` attempts to locate a main `.d.ts` and then scans for additional `.d.ts` files:

- **Main declaration** candidates:
  - `package.json` `"types"` / `"typings"`
  - `package.json` `"exports"["."]["types"]`
  - common fallbacks like `dist/index.d.ts`, `lib/index.d.ts`, `index.d.ts`
- **Additional declarations**:
  - Walks the package’s `types/`, `typings/`, `dist/`, `lib/`, and `src/` trees (bounded depth)
  - Skips common test/internal files (e.g. `*.test.*`, `*.spec.*`, names containing `__`)

#### 2) Understanding what’s “public”

Some packages have internal class names that are re-exported or aliased at the entrypoint. To reduce noise for method indexing, `search-libs` parses the package’s **main `.d.ts`** and builds:

- **Public export set**: the names users can import
- **Alias map**: internal names → public names (e.g. `ObjectCoreV1Api` → `CoreV1Api`)

It follows `export * from './x'` chains (relative only) to build a more complete public view.

#### 3) Extracting types / methods / functions

Once `.d.ts` files are discovered, each file is parsed with the TypeScript compiler AST and we extract:

- **Types**: `class`, `interface`, `enum`, and simple `type` aliases
  - Properties are captured as text (type strings from `.d.ts`)
  - Nested type references are detected for better searchability
- **Methods**:
  - Extracted from class declarations
  - If a public export set exists, methods are indexed only for **publicly exported classes**
  - Aliases are applied so class names match what users import
- **Functions**:
  - Extracts function declarations and exported function-valued variables

Notes:

- Types are extracted from all discovered `.d.ts` files (often includes internal-but-useful helper types).
- `LibraryIndexer` can expand complex parameter/return types by looking up extracted types and embedding a compact definition in method docs.

### JavaScript packages (ESM source fallback)

If **no `.d.ts` files are discoverable**, `search-libs` attempts to index the package’s **ESM JavaScript source**.

#### 1) Finding an ESM entry file

Entry resolution is based on `package.json` and common build layouts:

- Prefer `exports['.']` with `"import"` (then `"default"`)
- Then `"module"`
- Then `"main"` **only when** the package has `"type": "module"` (for `.js`)
- Plus common fallbacks (`dist/index.js`, `lib/index.js`, `index.js`, and `.mjs` variants)

Only `.js` (ESM via `"type":"module"`) and `.mjs` are considered. CommonJS (`.cjs`) is intentionally ignored.

#### 2) Computing the public surface (exports)

JavaScript libraries often re-export from multiple files. To avoid indexing internal helpers, `search-libs` first computes the **public export surface** by traversing the entry’s static export graph (relative only).
Supported patterns include:

- `export { a, b as c } from './x.js'`
- `export * from './x.js'` (does not re-export `default`)
- `export { a, b as c }` (local exports)
- import + re-export:

  ```js
  import { foo as localFoo } from './x.js';
  export { localFoo as foo };
  ```

- direct exports:
  - `export function foo() {}`
  - `export class Foo {}`
  - `export const foo = () => {}`
- `export default <identifier>` (best-effort; indexed under the name `default` when resolvable)

From this traversal, `search-libs` builds:

- a **per-file allowlist** of declaration names that are actually part of the public API
- a **per-file alias map** for renamed exports (`internalFn` → `publicFn`)

Only relative (`./...`) re-exports are followed. Non-relative re-exports (from dependencies) are ignored.

#### 3) Extracting from JavaScript source

For each JS module that contributes exports, `search-libs` runs the same AST extractors as TypeScript, but applies the allowlist/aliases so only public symbols are indexed:

- **Exported functions**: indexed with parameter/return types defaulting to `any`
- **Exported classes**: indexed as type documents, and their methods are indexed as method documents
- **Descriptions**: pulled from JSDoc comment blocks when present (e.g. `/** ... */`)

### Filters and tuning

You can control noise and focus via `PackageConfig`:

- `typeFilter`: include only matching type names
- `methodFilter`: include only matching method/function names
- `classFilter`: include methods only from matching class names (applies to the public/aliased class name)

### Limitations (by design)

- **CommonJS** (`module.exports`, `exports.*`) is not supported by the JS fallback.
- **Dynamic exports** are not supported (computed exports, runtime mutation, etc.).
- **Re-exports from dependencies** (non-relative specifiers like `'lodash'`) are ignored by the JS fallback.
- JS fallback is **best-effort**: it parses syntax but does not run a type checker; parameter/return types default to `any`.
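As an illustration of the filter shape listed under "Filters and tuning", `typeFilter` and `methodFilter` are typed in `PackageConfig` as `RegExp | ((name: string) => boolean)`. The sketch below shows one plausible way such a filter could be applied to a list of names. `NameFilter` and `matchesFilter` are hypothetical, and treating a missing filter as "include everything" is an assumption, not documented behavior.

```typescript
// Same union type that PackageConfig uses for typeFilter / methodFilter.
type NameFilter = RegExp | ((name: string) => boolean);

// Hypothetical helper: decides whether a name passes an optional filter.
function matchesFilter(name: string, filter?: NameFilter): boolean {
  // Assumption: an absent filter means nothing is excluded.
  if (filter === undefined) return true;
  if (filter instanceof RegExp) {
    // Reset lastIndex so global/sticky regexes give the same answer on every call.
    filter.lastIndex = 0;
    return filter.test(name);
  }
  return filter(name);
}

const typeNames = ['V1Pod', 'V1PodList', 'V1Service', 'KubeConfig'];

// RegExp form: keep only names starting with "V1Pod".
const podTypes = typeNames.filter((n) => matchesFilter(n, /^V1Pod/));

// Predicate form: keep only names that do not start with "V1".
const nonV1 = typeNames.filter((n) => matchesFilter(n, (name) => !name.startsWith('V1')));

console.log(podTypes); // [ 'V1Pod', 'V1PodList' ]
console.log(nonV1);    // [ 'KubeConfig' ]
```

In a real configuration the same values would be passed directly, for example `{ name: '@kubernetes/client-node', typeFilter: /^V1Pod/ }`.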
### Tips for best results

- If you can, ship `.d.ts` (or add `@types/<pkg>`): declaration-first indexing produces richer type signatures.
- For JS-only ESM libraries:
  - Prefer **static named exports** over dynamic export patterns
  - Add **JSDoc descriptions** on exported functions/classes/methods to improve search quality
  - Keep exports **shallow and explicit** at the entrypoint for a clearer public surface

## Extending the Schema

For domain-specific fields, use the schema builder:

```typescript
import { buildSchema, SearchEngine } from '@prodisco/search-libs';

const customSchema = buildSchema({
  extensions: {
    customField: 'string',
    customEnum: 'enum',
  },
});

const engine = new SearchEngine({ schema: customSchema });
```

## License

MIT
