list_files

View all documents stored in your local vector database, showing file paths and chunk counts for each ingested file.

Instructions

List all ingested files in the vector database. Returns file paths and chunk counts for each document.
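
As a rough illustration of what a client receives, the sketch below decodes the tool's text payload. The `ListedFile` and `ToolTextResult` types and the `parseListFilesResult` helper are hypothetical names, and the sample payload is made up. The field names mirror the handler and `listFiles()` code in the Implementation Reference, where `source` is added only for raw-data files.

```typescript
// Shape inferred from the handler's return value; `source` appears only on raw-data files.
interface ListedFile {
  filePath: string
  chunkCount: number
  timestamp: string
  source?: string
}

interface ToolTextResult {
  content: { type: 'text'; text: string }[]
}

// Hypothetical client-side helper: decodes the JSON text in the first content item.
function parseListFilesResult(result: ToolTextResult): ListedFile[] {
  const first = result.content[0]
  if (!first) {
    return []
  }
  return JSON.parse(first.text) as ListedFile[]
}

// Made-up sample payload, for illustration only.
const sample: ToolTextResult = {
  content: [
    {
      type: 'text',
      text: JSON.stringify([
        { filePath: '/docs/guide.md', chunkCount: 3, timestamp: '2024-01-01T00:00:00.000Z' },
      ]),
    },
  ],
}

const parsed = parseListFilesResult(sample)
for (const file of parsed) {
  console.log(`${file.filePath}: ${file.chunkCount} chunks`)
}
```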

Input Schema

Name | Required | Description | Default

No arguments
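
Because the tool takes no arguments, a call only needs an empty `arguments` object. The sketch below builds the JSON-RPC 2.0 `tools/call` request that MCP uses for tool invocation; `ToolCallRequest` and `buildListFilesRequest` are hypothetical names, and how the request is sent depends on the client and transport in use.

```typescript
// JSON-RPC 2.0 request shape for an MCP `tools/call` invocation.
interface ToolCallRequest {
  jsonrpc: '2.0'
  id: number
  method: 'tools/call'
  params: { name: string; arguments: Record<string, unknown> }
}

// Hypothetical helper (not part of the server): builds a list_files call.
function buildListFilesRequest(id: number): ToolCallRequest {
  return {
    jsonrpc: '2.0',
    id,
    method: 'tools/call',
    // The input schema declares no properties, so the arguments object is empty.
    params: { name: 'list_files', arguments: {} },
  }
}

const request = buildListFilesRequest(1)
console.log(JSON.stringify(request, null, 2))
```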

Implementation Reference

  • The MCP tool handler for 'list_files'. Calls vectorStore.listFiles(), enriches raw-data files with source information, and returns a JSON-formatted list.
    /**
     * list_files tool handler
     * Enriches raw-data files with original source information
     */
    async handleListFiles(): Promise<{ content: [{ type: 'text'; text: string }] }> {
      try {
        const files = await this.vectorStore.listFiles()
    
        // Enrich raw-data files with source information
        const enrichedFiles = files.map((file) => {
          if (isRawDataPath(file.filePath)) {
            const source = extractSourceFromPath(file.filePath)
            if (source) {
              return { ...file, source }
            }
          }
          return file
        })
    
        return {
          content: [
            {
              type: 'text',
              text: JSON.stringify(enrichedFiles, null, 2),
            },
          ],
        }
      } catch (error) {
        console.error('Failed to list files:', error)
        throw error
      }
    }
  • Registers the 'list_files' tool in the MCP ListToolsRequestSchema handler, including name, description, and input schema.
    {
      name: 'list_files',
      description:
        'List all ingested files in the vector database. Returns file paths and chunk counts for each document.',
      inputSchema: { type: 'object', properties: {} },
    },
  • Input schema for 'list_files' tool: empty object (no parameters required).
      inputSchema: { type: 'object', properties: {} },
  • Core implementation of listFiles() in VectorStore: queries all chunks from LanceDB, groups them by filePath, counts chunks per file, and keeps the latest timestamp per file.
    /**
     * Get list of ingested files
     *
     * @returns Array of file information
     */
    async listFiles(): Promise<{ filePath: string; chunkCount: number; timestamp: string }[]> {
      if (!this.table) {
        return [] // Return empty array if table doesn't exist
      }
    
      try {
        // Retrieve all records
        const allRecords = await this.table.query().toArray()
    
        // Group by file path
        const fileMap = new Map<string, { chunkCount: number; timestamp: string }>()
    
        for (const record of allRecords) {
          const filePath = record.filePath as string
          const timestamp = record.timestamp as string
    
          const fileInfo = fileMap.get(filePath)
          if (fileInfo) {
            fileInfo.chunkCount += 1
            // Keep most recent timestamp
            if (timestamp > fileInfo.timestamp) {
              fileInfo.timestamp = timestamp
            }
          } else {
            fileMap.set(filePath, { chunkCount: 1, timestamp })
          }
        }
    
        // Convert Map to array of objects
        return Array.from(fileMap.entries()).map(([filePath, info]) => ({
          filePath,
          chunkCount: info.chunkCount,
          timestamp: info.timestamp,
        }))
      } catch (error) {
        throw new DatabaseError('Failed to list files', error as Error)
      }
    }
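
The grouping step in listFiles() can be tried in isolation with a small pure function. The sketch below is not the project's code: `ChunkRecord`, `FileSummary`, and `summarizeChunks` are hypothetical names, the records are made up, and it assumes timestamps are stored in a lexicographically sortable format such as ISO 8601 UTC, since the original compares them as strings.

```typescript
interface ChunkRecord {
  filePath: string
  timestamp: string
}

interface FileSummary {
  filePath: string
  chunkCount: number
  timestamp: string
}

// Hypothetical pure function mirroring the grouping in listFiles():
// a single pass over chunk records, keyed by file path.
function summarizeChunks(records: ChunkRecord[]): FileSummary[] {
  const byPath = new Map<string, FileSummary>()
  for (const { filePath, timestamp } of records) {
    const existing = byPath.get(filePath)
    if (existing) {
      existing.chunkCount += 1
      // String comparison is only correct for sortable timestamp formats (assumed here).
      if (timestamp > existing.timestamp) {
        existing.timestamp = timestamp
      }
    } else {
      byPath.set(filePath, { filePath, chunkCount: 1, timestamp })
    }
  }
  // Map iteration preserves insertion order, so files appear in first-seen order.
  return Array.from(byPath.values())
}

// Made-up records, for illustration only.
const summaries = summarizeChunks([
  { filePath: 'a.md', timestamp: '2024-01-01T00:00:00.000Z' },
  { filePath: 'b.md', timestamp: '2024-01-02T00:00:00.000Z' },
  { filePath: 'a.md', timestamp: '2024-01-03T00:00:00.000Z' },
])
console.log(JSON.stringify(summaries, null, 2))
```

Keeping the aggregation separate from the database query makes it straightforward to unit test without a LanceDB table.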

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/shinpr/mcp-local-rag'
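
The same request can be made from TypeScript. This is a minimal sketch that assumes a runtime with a global `fetch` (Node 18+ or a browser) and assumes the endpoint returns JSON; `buildServerUrl` and `fetchServerInfo` are hypothetical helpers, and treating the two path segments as owner and repository names is an inference from the URL above. The response schema is not described here, so the result is typed as `unknown`.

```typescript
const API_BASE = 'https://glama.ai/api/mcp/v1/servers'

// Hypothetical helper: the two path segments are assumed to be owner and repository names.
function buildServerUrl(owner: string, repo: string): string {
  return `${API_BASE}/${encodeURIComponent(owner)}/${encodeURIComponent(repo)}`
}

// Fetches the server record and fails loudly on a non-2xx status.
async function fetchServerInfo(owner: string, repo: string): Promise<unknown> {
  const response = await fetch(buildServerUrl(owner, repo))
  if (!response.ok) {
    throw new Error(`Request failed with status ${response.status}`)
  }
  return response.json()
}

const url = buildServerUrl('shinpr', 'mcp-local-rag')
console.log(url)
```

Calling `fetchServerInfo('shinpr', 'mcp-local-rag')` would issue the GET request shown in the curl command.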

If you have feedback or need assistance with the MCP directory API, please join our Discord server.