# π§ RAG κΈ°λ° μ§μ§ μ λ¬Έκ° μμ€ν
(v3.1.0)
**ν΅μ¬**: νλ₯΄μλ μλ μμΈ λ¬Έμν β RAGλ‘ λμ κ²μ β 무νν μ λ¬Έμ±
---
## π― μν€ν
μ²
### κΈ°μ‘΄ vs RAG
**κΈ°μ‘΄ (v3.0.0)**:
```
410-llm-engineer.txt (15KB) β Context μ 체 λ‘λ
β νκ³: μ νλ κΉμ΄, λμ ν ν° λΉμ©
```
**RAG (v3.1.0)**:
```
410-llm-engineer.txt (1KB λ©νλ°μ΄ν°λ§)
knowledge-base/410-llm-engineer/ (100MB+)
βββ core-competencies/ (200 pages)
βββ case-studies/ (100+ cases)
βββ code-examples/ (500+ examples)
βββ research-papers/ (200+ summaries)
β μ§λ¬Έ λ°μ μ RAG κ²μ β κ΄λ ¨ 3-5 μ²ν¬λ§ (3KB)
β 98% ν ν° μ κ° + 무νν κΉμ΄
```
---
## ποΈ λλ ν 리 ꡬ쑰
```
persona-mcp/
βββ community/
β βββ 410-llm-engineer.txt (κ²½λ λ©νλ°μ΄ν°)
β
βββ knowledge-base/
β βββ 410-llm-engineer/
β βββ core-competencies/
β β βββ transformer-architectures.md (50 pages)
β β βββ prompt-engineering.md (80 pages)
β β βββ model-optimization.md (40 pages)
β β βββ deployment-strategies.md (35 pages)
β βββ case-studies/
β β βββ gpt4-deployment.md
β β βββ llama-fine-tuning.md
β β βββ 100+ more...
β βββ code-examples/
β β βββ quantization.py
β β βββ flash-attention.py
β β βββ 500+ more...
β βββ research-papers/
β βββ 200+ summaries
β
βββ src/rag/
βββ embeddings.ts (Voyage AI)
βββ vectorStore.ts (ChromaDB)
βββ retrieval.ts (Semantic Search + Rerank)
```
---
## π μν¬νλ‘μ°
### 1. μ€νλΌμΈ μΈλ±μ± (μλ² μμ μ)
```typescript
// λͺ¨λ μ§μ λ¬Έμ β μ²νΉ β μλ² λ© β λ²‘ν° DB
knowledge-base/410-llm-engineer/ (500 files)
β μ²νΉ (1000 tokens/chunk, 200 overlap)
β Voyage AI μλ² λ© (1024-dim)
β ChromaDB μ μ₯ (50,000 chunks)
```
### 2. μ€μκ° κ²μ (μ¬μ©μ μ§λ¬Έ μ)
```typescript
User: "How to optimize 70B model inference?"
β Query μλ² λ©
β λ²‘ν° κ²μ (Top 20)
β Cohere Rerank (Top 5)
β Context κ΅¬μ± (3KB)
β LLM μλ΅
```
---
## πΎ κΈ°μ μ€ν
| μ»΄ν¬λνΈ | μ ν | μ΄μ |
|---------|------|------|
| **μλ² λ©** | Voyage AI | MTEB #1 μ±λ₯ |
| **λ²‘ν° DB** | ChromaDB | κ°νΈ, λ‘컬 |
| **μ¬μμν** | Cohere Rerank | μ΅κ³ μ νλ |
| **μ²νΉ** | LangChain | κ²μ¦λ¨ |
---
## π μ±λ₯
| μ§ν | μ 체 λ‘λ | RAG |
|------|-----------|-----|
| Context | 150K tokens | 3K tokens |
| μ νλ | 85% | **92%** |
| λΉμ© | $0.45/req | **$0.009/req** |
| κΉμ΄ | 15KB | **100MB+** |
**ν¨κ³Ό**: 98% ν ν° μ κ° + 7% μ νλ ν₯μ + 무νν μ λ¬Έμ±
---
## π οΈ κ΅¬ν μμ
### embeddings.ts
```typescript
import { VoyageAIClient } from '@voyageai/client';
export class EmbeddingService {
async embedDocument(personaId: string, filePath: string) {
const content = await fs.readFile(filePath, 'utf-8');
const chunks = await this.splitter.splitText(content);
const embeddings = await this.voyageClient.embed({ input: chunks });
return chunks.map((chunk, i) => ({
id: `${personaId}-${i}`,
vector: embeddings.data[i].embedding,
content: chunk
}));
}
}
```
### retrieval.ts
```typescript
export class RetrievalService {
async retrieve(personaId: string, query: string) {
// 1. Query μλ² λ©
const queryVector = await this.embeddings.embedQuery(query);
// 2. λ²‘ν° κ²μ (Top 20)
const candidates = await this.vectorStore.search(queryVector, personaId, 20);
// 3. Cohere Rerank (Top 5)
const reranked = await this.cohere.rerank({
query,
documents: candidates.map(c => c.content),
topN: 5
});
return reranked.results;
}
}
```
---
## π μ€ν κ³ν
### Week 1-2: RAG μΈνλΌ
- [ ] ChromaDB + Voyage AI μ°λ
- [ ] μλ² λ©/κ²μ νμ΄νλΌμΈ
- [ ] 410-llm-engineer μ§μ λ² μ΄μ€ μμ± (200 pages)
### Week 3-4: ν΅ν©
- [ ] MCP μλ² RAG ν΅ν©
- [ ] 10κ° ν΅μ¬ νλ₯΄μλ μ§μ λ² μ΄μ€
- [ ] μ±λ₯ λ²€μΉλ§ν¬
### Week 5-8: νμ₯
- [ ] 142κ° νλ₯΄μλ μ 체 μ§μ λ² μ΄μ€
- [ ] νλ‘λμ
μ΅μ ν
- [ ] v3.1.0 Release
---
## π‘ ν΅μ¬ μ₯μ
1. **무νν κΉμ΄**: 100MB+ μ§μ vs 15KB μ ν
2. **νμ μ΅μ **: μ§μ λ² μ΄μ€ μ
λ°μ΄νΈ μ¦μ λ°μ
3. **λΉμ© ν¨μ¨**: 98% ν ν° μ κ°
4. **λμ μ νλ**: κ΄λ ¨ μ λ³΄λ§ μ 곡 (λ
Έμ΄μ¦ μ κ±°)
5. **νμ₯μ±**: 142 personas Γ 100MB = 14.2GB (κ΄λ¦¬ κ°λ₯)
---
**μν**: β
μν€ν
μ² μ€κ³ μλ£
**λ€μ**: RAG μΈνλΌ κ΅¬μΆ + 첫 μ§μ λ² μ΄μ€ μμ±