# RETRIEVER Spans
## Purpose
RETRIEVER spans represent document/context retrieval operations (vector DB queries, semantic search, keyword search).
## Required Attributes
| Attribute | Type | Description | Required |
|-----------|------|-------------|----------|
| `openinference.span.kind` | String | Must be "RETRIEVER" | Yes |
## Attribute Reference
### Query
| Attribute | Type | Description |
|-----------|------|-------------|
| `input.value` | String | Search query text |
### Document Schema
| Attribute Pattern | Type | Description |
|-------------------|------|-------------|
| `retrieval.documents.{i}.document.id` | String | Unique document identifier |
| `retrieval.documents.{i}.document.content` | String | Document text content |
| `retrieval.documents.{i}.document.score` | Float | Relevance score (0-1 or distance) |
| `retrieval.documents.{i}.document.metadata` | String (JSON) | Document metadata |
### Flattening Pattern for Documents
Documents are flattened using zero-indexed notation:
```
retrieval.documents.0.document.id
retrieval.documents.0.document.content
retrieval.documents.0.document.score
retrieval.documents.1.document.id
retrieval.documents.1.document.content
retrieval.documents.1.document.score
...
```
### Document Metadata
Common metadata fields (stored as JSON string):
```json
{
"source": "knowledge_base.pdf",
"page": 42,
"section": "Introduction",
"author": "Jane Doe",
"created_at": "2024-01-15",
"url": "https://example.com/doc",
"chunk_id": "chunk_123"
}
```
**Example with metadata:**
```json
{
"retrieval.documents.0.document.id": "doc_123",
"retrieval.documents.0.document.content": "Machine learning is a method of data analysis...",
"retrieval.documents.0.document.score": 0.92,
"retrieval.documents.0.document.metadata": "{\"source\": \"ml_textbook.pdf\", \"page\": 15, \"chapter\": \"Introduction\"}"
}
```
### Ordering
Documents are ordered by index (0, 1, 2, ...). Typically:
- Index 0 = highest scoring document
- Index 1 = second highest
- etc.
Preserve retrieval order in your flattened attributes.
### Large Document Handling
For very long documents:
- Consider truncating `document.content` to first N characters
- Store full content in separate document store
- Use `document.id` to reference full content
## Examples
### Basic Vector Search
```json
{
"openinference.span.kind": "RETRIEVER",
"input.value": "What is machine learning?",
"retrieval.documents.0.document.id": "doc_123",
"retrieval.documents.0.document.content": "Machine learning is a subset of artificial intelligence...",
"retrieval.documents.0.document.score": 0.92,
"retrieval.documents.0.document.metadata": "{\"source\": \"textbook.pdf\", \"page\": 42}",
"retrieval.documents.1.document.id": "doc_456",
"retrieval.documents.1.document.content": "Machine learning algorithms learn patterns from data...",
"retrieval.documents.1.document.score": 0.87,
"retrieval.documents.1.document.metadata": "{\"source\": \"article.html\", \"author\": \"Jane Doe\"}",
"retrieval.documents.2.document.id": "doc_789",
"retrieval.documents.2.document.content": "Supervised learning is a type of machine learning...",
"retrieval.documents.2.document.score": 0.81,
"retrieval.documents.2.document.metadata": "{\"source\": \"wiki.org\"}",
"metadata.retriever_type": "vector_search",
"metadata.vector_db": "pinecone",
"metadata.top_k": 3
}
```