# Search & Indexing Guide
Search allows you to find specific moments inside videos using natural language queries, exact keywords, or visual scene descriptions.
## Prerequisites
Videos **must be indexed** before they can be searched. Indexing is a one-time operation per video per index type.
## Indexing
### Spoken Word Index
Index the transcribed speech content of a video for semantic and keyword search:
```python
video = coll.get_video(video_id)
video.index_spoken_words()
```
This transcribes the audio track and builds a searchable index over the spoken content. Required for semantic search and keyword search.
### Scene Index
Index visual content by generating AI descriptions of scenes:
```python
from videodb import SceneExtractionType
# Extract scenes and index them
video.index_scenes(
extraction_type=SceneExtractionType.shot_based,
prompt="Describe the visual content, objects, actions, and setting in this scene.",
)
```
**Extraction types:**
| Type | Description | Best For |
|------|-------------|----------|
| `SceneExtractionType.shot_based` | Splits on visual shot boundaries | General purpose, action content |
| `SceneExtractionType.time_based` | Splits at fixed intervals | Uniform sampling, long static content |
**Parameters for `time_based`:**
```python
video.index_scenes(
extraction_type=SceneExtractionType.time_based,
extraction_config={"time": 5, "select_frames": ["first", "last"]},
prompt="Describe what is happening in this scene.",
)
```
## Search Types
### Semantic Search
Natural language queries matched against spoken content:
```python
from videodb import SearchType
results = video.search(
query="explaining the benefits of machine learning",
search_type=SearchType.semantic,
)
```
Returns ranked segments where the spoken content semantically matches the query.
### Keyword Search
Exact term matching in transcribed speech:
```python
results = video.search(
query="artificial intelligence",
search_type=SearchType.keyword,
)
```
Returns segments containing the exact keyword or phrase.
### Scene Search
Visual content queries matched against indexed scene descriptions. Requires a prior `index_scenes()` call.
`index_scenes()` returns a `scene_index_id`. Pass it to `video.search()` to target a specific scene index (especially important when a video has multiple scene indexes):
```python
from videodb import SearchType, IndexType
# Index scenes first (returns an index ID)
scene_index_id = video.index_scenes(
extraction_type=SceneExtractionType.shot_based,
prompt="Describe the visual content in this scene.",
)
# Search using semantic search against the scene index
results = video.search(
query="person writing on a whiteboard",
search_type=SearchType.semantic,
index_type=IndexType.scene,
scene_index_id=scene_index_id,
)
```
**Important notes:**
- Use `SearchType.semantic` with `index_type=IndexType.scene` — this is the most reliable combination and works on all plans.
- `SearchType.scene` exists but may not be available on all plans (e.g. Free tier). Prefer `SearchType.semantic` with `IndexType.scene`.
- The `scene_index_id` parameter is optional. If omitted, the search runs against all scene indexes on the video. Pass it to target a specific index.
- You can create multiple scene indexes per video (with different prompts or extraction types) and search them independently using `scene_index_id`.
### Scene Search with Metadata Filtering
When indexing scenes with custom metadata, you can combine semantic search with metadata filters:
```python
from videodb import SearchType, IndexType
results = video.search(
query="a skillful chasing scene",
search_type=SearchType.semantic,
index_type=IndexType.scene,
scene_index_id=scene_index_id,
filter=[{"camera_view": "road_ahead"}, {"action_type": "chasing"}],
)
```
See the [scene_level_metadata_indexing cookbook](https://github.com/video-db/videodb-cookbook/blob/main/quickstart/scene_level_metadata_indexing.ipynb) for a full example of custom metadata indexing and filtered search.
## Working with Results
### Get Shots
Access individual result segments:
```python
results = video.search("your query")
for shot in results.get_shots():
print(f"Video: {shot.video_id}")
print(f"Start: {shot.start:.2f}s")
print(f"End: {shot.end:.2f}s")
print(f"Text: {shot.text}")
print("---")
```
### Play Compiled Results
Stream all matching segments as a single compiled video:
```python
results = video.search("your query")
stream_url = results.compile()
results.play() # opens compiled stream in browser
```
### Extract Clips
Download or stream specific result segments:
```python
from videodb import play_stream
for shot in results.get_shots():
stream_url = shot.generate_stream()
print(f"Clip: {stream_url}")
```
## Cross-Collection Search
Search across all videos in a collection:
```python
coll = conn.get_collection()
# Search across all videos in the collection
results = coll.search(
query="product demo",
search_type=SearchType.semantic,
)
for shot in results.get_shots():
print(f"Video: {shot.video_id} [{shot.start:.1f}s - {shot.end:.1f}s]")
```
> **Note:** Collection-level search only supports `SearchType.semantic`. Using `SearchType.keyword` or `SearchType.scene` with `coll.search()` will raise `NotImplementedError`. For keyword or scene search, use `video.search()` on individual videos instead.
## Search + Compilation Workflow
For a complete automated workflow, use the helper script:
```bash
python scripts/search_and_compile.py --video-id <id> --query "your query"
```
This indexes (if needed), searches, compiles results, and returns a stream URL.
## Tips
- **Index once, search many times**: Indexing is the expensive operation. Once indexed, searches are fast.
- **Combine index types**: Index both spoken words and scenes to enable all search types on the same video.
- **Refine queries**: Semantic search works best with descriptive, natural language phrases rather than single keywords.
- **Use keyword search for precision**: When you need exact term matches, keyword search avoids semantic drift.