# YTPipe Complete Output Schema
**All information generated from processing a YouTube video**
---
## 📁 Output Directory Structure
```
{OUTPUT_DIR}/{VIDEO_ID}/
├── audio.mp3                             # Downloaded audio
└── exports/
    ├── 1. metadata.json                  # Video metadata (14 fields)
    ├── 2. chunks.jsonl                   # Semantic chunks (9 fields each)
    ├── 3. transcript.txt                 # Raw transcript text
    ├── 4. comprehensive_dashboard.html   # Interactive HTML dashboard
    ├── 5. transcript_structured.md       # Structured markdown
    ├── 6. granite_docling_enhanced.json  # Docling analysis
    ├── 7. granite_docling_enhanced.jsonl # Docling chunks (line-delimited)
    └── 8. docling_output/
        └── transcript_structured.json    # Docling processing output
```
**Plus**: Vector database in `./vector_store/{backend}/`
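A quick way to confirm a video was fully processed is to check the `exports/` directory against this layout. The helper below is a hypothetical sketch (not part of YTPipe itself), assuming the file names shown above:

```python
from pathlib import Path

# The seven export files listed in the directory tree above
# (8. docling_output/ is a directory, so it is checked separately if needed).
EXPECTED_EXPORTS = [
    "1. metadata.json",
    "2. chunks.jsonl",
    "3. transcript.txt",
    "4. comprehensive_dashboard.html",
    "5. transcript_structured.md",
    "6. granite_docling_enhanced.json",
    "7. granite_docling_enhanced.jsonl",
]

def missing_exports(video_dir: str) -> list[str]:
    """Return the names of expected export files that are absent."""
    exports = Path(video_dir) / "exports"
    return [name for name in EXPECTED_EXPORTS if not (exports / name).exists()]
```

An empty return value means every expected export file is present.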
---
## 📄 File 1: metadata.json (14 Fields)
**Purpose**: Complete video metadata and processing information
```json
{
  "video_id": "dQw4w9WgXcQ",                    // YouTube video ID (11 chars)
  "url": "https://youtube.com/...",             // Full YouTube URL
  "title": "Rick Astley - Never Gonna...",      // Video title
  "duration": 213,                              // Duration in seconds
  "upload_date": "20091025",                    // Upload date (YYYYMMDD)
  "view_count": 1738771804,                     // View count
  "like_count": 18772324,                       // Like count
  "comment_count": 2400000,                     // Comment count
  "channel": "Rick Astley",                     // Channel name
  "description": "The official video for...",   // Description (500 chars max)
  "processed_at": "2026-02-04 21:33:31.721602", // Processing timestamp
  "chunks_count": 1,                            // Total chunks generated
  "total_words": 386,                           // Total word count
  "transcription_method": "whisper-ai-base"     // Whisper model used
}
```
**Total**: 14 fields
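A minimal sketch of consuming this file, using the field names from the schema above (the `load_metadata` helper is illustrative, not part of YTPipe):

```python
import json

def load_metadata(path: str) -> dict:
    """Load metadata.json and add a human-readable duration."""
    with open(path, encoding="utf-8") as f:
        meta = json.load(f)
    # "duration" is stored in seconds; format as M:SS for display
    minutes, seconds = divmod(meta["duration"], 60)
    meta["duration_display"] = f"{minutes}:{seconds:02d}"
    return meta
```

For the example above, `duration: 213` yields a display value of `3:33`.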
---
## 📄 File 2: chunks.jsonl (9 Fields Per Chunk)
**Purpose**: Semantic text chunks with embeddings and timestamps
**Format**: JSON Lines (one chunk per line)
```json
{
  "id": 0,                                       // Chunk ID (0-indexed)
  "text": "There are no strangers to love...",   // Chunk text content
  "word_count": 386,                             // Words in this chunk
  "start_char": 0,                               // Start character position
  "end_char": 1885,                              // End character position
  "quality_score": 8.0,                          // Quality score (0-10)
  "timestamp_start": "0:00",                     // Start timestamp (MM:SS)
  "timestamp_end": "3:33",                       // End timestamp (MM:SS)
  "embedding": [0.123, 0.456, ...]               // 384-dim vector
}
```
**Total**: 9 fields per chunk
**Embedding**: 384-dimensional float array
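Because the file is JSON Lines, it can be read one record at a time, so even large files carrying 384-dim embeddings never need to fit in memory at once. A sketch (helper names are illustrative):

```python
import json
from typing import Iterator

def iter_chunks(path: str) -> Iterator[dict]:
    """Yield one chunk dict per line without loading the whole file."""
    with open(path, encoding="utf-8") as f:
        for line in f:
            if line.strip():  # tolerate blank lines
                yield json.loads(line)

def high_quality_texts(path: str, threshold: float = 7.0) -> list[str]:
    """Collect the text of chunks at or above a quality threshold."""
    return [c["text"] for c in iter_chunks(path)
            if c["quality_score"] >= threshold]
```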
---
## 📄 File 3: transcript.txt
**Purpose**: Raw transcript text (plain text)
**Format**: Single block of text
**Content**: Complete transcription from Whisper AI
**Encoding**: UTF-8
**Use**: Human-readable, simple parsing
---
## 📄 File 4: comprehensive_dashboard.html
**Purpose**: Interactive HTML dashboard for visualization
### Sections Included:
#### 1. **Header Section**
- Video title
- Channel name
- Duration (MM:SS format)
- View count (formatted)
- Like count (formatted)
- Upload date (formatted)
#### 2. **Content Overview** (4 metrics)
- Total chunks
- Total words
- Average words per chunk
- High quality chunk count
#### 3. **Content Chunks** (Visualization)
For each chunk:
- Chunk ID
- Quality badge (High/Medium/Low)
- Text preview (200 chars)
- Word count
- Timestamp range (NEW!)
#### 4. **Top Keywords** (15-20 keywords)
- Keyword text
- Frequency count
- Sorted by frequency
#### 5. **Quality Distribution** (3 categories)
- High quality (≥7.0): Count + percentage + progress bar
- Medium quality (5.0-7.0): Count + percentage + progress bar
- Low quality (<5.0): Count + percentage + progress bar
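The three-way split above can be sketched with the same thresholds (≥7.0 high, 5.0-7.0 medium, <5.0 low); function names here are illustrative:

```python
def quality_bucket(score: float) -> str:
    """Map a 0-10 quality score to its dashboard category."""
    if score >= 7.0:
        return "high"
    if score >= 5.0:
        return "medium"
    return "low"

def quality_distribution(scores: list[float]) -> dict[str, float]:
    """Return the percentage of chunks in each quality category."""
    counts = {"high": 0, "medium": 0, "low": 0}
    for s in scores:
        counts[quality_bucket(s)] += 1
    total = len(scores) or 1  # avoid division by zero on empty input
    return {k: 100.0 * v / total for k, v in counts.items()}
```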
### Design Features:
- Modern OKLCH color scheme
- Dark mode optimized
- Responsive layout
- APCA contrast optimization
- Hover effects
- Clean typography
**File Size**: ~12 KB (optimized)
---
## 📄 File 5: transcript_structured.md
**Purpose**: Structured markdown for Docling processing
### Sections:
```markdown
# {Video Title}
## Video Information
- Video ID
- Channel
- Duration
- Views
- Upload Date
---
## Transcript
{Full transcript text}
---
## Metadata
- Transcription Method
- Total Words
- Processing Date
```
**Format**: Markdown with headers and lists
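Assembling this file from the metadata fields can be sketched as follows; this is an assumed template based on the section outline above, and the exact formatting YTPipe emits may differ:

```python
def build_structured_md(meta: dict, transcript: str) -> str:
    """Render the structured markdown from metadata fields and transcript."""
    return "\n".join([
        f"# {meta['title']}",
        "",
        "## Video Information",
        f"- Video ID: {meta['video_id']}",
        f"- Channel: {meta['channel']}",
        f"- Duration: {meta['duration']} seconds",
        f"- Views: {meta['view_count']:,}",
        f"- Upload Date: {meta['upload_date']}",
        "",
        "---",
        "",
        "## Transcript",
        "",
        transcript,
        "",
        "---",
        "",
        "## Metadata",
        f"- Transcription Method: {meta['transcription_method']}",
        f"- Total Words: {meta['total_words']}",
        f"- Processing Date: {meta['processed_at']}",
    ])
```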
---
## 📄 File 6: granite_docling_enhanced.json
**Purpose**: Docling-enhanced chunk analysis
### Structure:
```json
{
  "chunks": [
    {
      "chunk_id": 0,
      "text": "...",
      "source_path": "...",
      "word_count": 426,
      "char_start": 0,
      "char_end": 2194,
      "markdown_section": "",
      "quality_score": 8.0
    }
  ],
  "metadata": {
    "total_chunks": 1,
    "total_words": 426,
    "processor": "granite-docling-enhanced",
    "version": "1.0",
    "timestamp": "2026-02-04T21:33:49.042420"
  }
}
```
**Chunk Fields**: 8 per chunk
**Metadata Fields**: 5
---
## 📄 File 7: granite_docling_enhanced.jsonl
**Purpose**: Same as File 6 but in JSONL format (one chunk per line)
**Format**: JSON Lines
**Use**: Streaming processing, easier parsing
**Content**: Same chunk structure as File 6
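Since Files 6 and 7 carry the same chunk records, one can be derived from the other. A sketch of the JSON-to-JSONL direction (helper name is illustrative):

```python
import json

def json_to_jsonl(json_path: str, jsonl_path: str) -> int:
    """Write each chunk from the JSON file as one JSONL line; return count."""
    with open(json_path, encoding="utf-8") as f:
        chunks = json.load(f)["chunks"]
    with open(jsonl_path, "w", encoding="utf-8") as f:
        for chunk in chunks:
            f.write(json.dumps(chunk) + "\n")
    return len(chunks)
```

Note that the top-level `metadata` object of File 6 has no counterpart in the line-delimited form.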
---
## 📄 File 8: docling_output/transcript_structured.json
**Purpose**: Docling processing output (structured JSON)
**Content**: Docling's internal representation of the structured markdown
**Size**: ~14 KB
**Format**: Docling-specific JSON schema
---
## 🗄️ Vector Database (Not a File)
**Purpose**: Semantic search capability
### Stored Information Per Chunk:
- Chunk ID
- Text content
- Embedding vector (384 dimensions)
- Metadata:
  - video_id
  - timestamp_start
  - timestamp_end
  - quality_score
  - word_count
**Backend Options**: ChromaDB, FAISS, Qdrant
**Query Capabilities**: Semantic similarity search
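Regardless of backend, the underlying query is nearest-neighbor search over the stored embeddings with optional metadata filtering. A backend-agnostic, brute-force sketch (function names are illustrative; a real backend indexes the vectors instead of scanning them):

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def search(query_vec: list[float], chunks: list[dict],
           top_k: int = 3, min_quality: float = 0.0) -> list[dict]:
    """Rank chunks by similarity, optionally filtering on quality_score."""
    candidates = [c for c in chunks if c["quality_score"] >= min_quality]
    candidates.sort(key=lambda c: cosine(query_vec, c["embedding"]),
                    reverse=True)
    return candidates[:top_k]
```

The `min_quality` argument mirrors the metadata filtering the backends expose (e.g. filter by `video_id` or `quality_score`).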
---
## 📊 **Complete Information Summary**
### Video-Level Information (14 fields)
1. video_id (YouTube ID)
2. url (Full YouTube URL)
3. title
4. duration (seconds)
5. upload_date (YYYYMMDD)
6. view_count
7. like_count
8. comment_count
9. channel name
10. description (truncated to 500 chars)
11. processed_at (timestamp)
12. chunks_count
13. total_words
14. transcription_method
### Per-Chunk Information (9 fields)
1. id (chunk number)
2. text (chunk content)
3. word_count
4. start_char (character position)
5. end_char (character position)
6. quality_score (0-10 scale)
7. timestamp_start (MM:SS format) **NEW!**
8. timestamp_end (MM:SS format) **NEW!**
9. embedding (384-dim vector)
### Docling Enhanced (8 fields per chunk)
1. chunk_id
2. text
3. source_path
4. word_count
5. char_start
6. char_end
7. markdown_section
8. quality_score
### Dashboard Visualizations
1. Video metadata display
2. Content overview stats (4 metrics)
3. Chunk cards (first 20)
4. Top keywords (15-20)
5. Quality distribution (3 categories)
---
## 🎯 Total Information Captured
### Quantitative:
- **Video metadata**: 14 fields
- **Per chunk (standard)**: 9 fields
- **Per chunk (Docling)**: 8 fields
- **Embedding dimensions**: 384 per chunk
- **Keywords extracted**: 15-20
- **Quality categories**: 3 (high/medium/low)
### Qualitative:
- Full transcript text
- Structured markdown
- Interactive HTML dashboard
- Searchable vector database
- Temporal information (timestamps)
---
## 📦 Data Volume Example
**For 10-minute video** (~2,000 words):
- metadata.json: ~1 KB
- transcript.txt: ~12 KB
- chunks.jsonl: ~150 KB (with embeddings)
- dashboard.html: ~15 KB
- Docling JSON: ~30 KB
- Vector DB: ~50 KB
**Total**: ~260 KB of structured data per video
---
## 🔍 What Can Be Queried
### From Files:
- ✅ Video metadata (all 14 fields)
- ✅ Full transcript (searchable text)
- ✅ Individual chunks (by ID or content)
- ✅ Keywords and frequencies
- ✅ Quality distribution
- ✅ Timestamps for any chunk
### From Vector Database:
- ✅ Semantic similarity search
- ✅ Find similar chunks across videos
- ✅ Query by embedding vector
- ✅ Filter by metadata (video_id, quality, etc.)
### Via MCP Tools:
- ✅ Full-text search with context
- ✅ Semantic search
- ✅ SEO optimization insights
- ✅ Topic timeline analysis
- ✅ Quality metrics
- ✅ Performance benchmarks
---
**Every YouTube video becomes a rich, queryable knowledge base with 50+ data points!** 🚀