# X Archive Daemon

Short description: a daemon-based archiving and smart-search project that collects posts from X, writes them to a local SQLite archive, and makes them available to agents over MCP.

X Archive Daemon archives posts from X into a local SQLite database, exposes them through a daemon-first architecture, and optionally adds a local semantic analysis layer for smarter retrieval.
## Install First
If you work with coding agents, the easiest setup flow is:

1. Give the repository link to the agent.
2. Ask the agent to install dependencies.
3. Ask the agent to create the local secret file.
4. Ask the agent to start the daemon and the MCP bridge.
Manual setup:

```sh
npm install
```

Create `.secrets/x.json`:

```json
{
  "authMode": "bearer_token",
  "bearerToken": "YOUR_X_BEARER_TOKEN"
}
```

Start the daemon:

```sh
npm run daemon:start
```

Start the MCP bridge:

```sh
npm run mcp:start
```

## What This Project Does
This project has three layers:

- **ingest**: fetches posts from X and stores them in SQLite
- **analysis** (optional): adds local embeddings, topic labels, and educational scoring
- **semantic search**: searches analyzed posts locally and gives the agent a small, relevant candidate set

As an analogy: ingest = put boxes into storage, analysis = attach labels to the boxes, semantic search = find the right boxes quickly.
## Architecture

- **daemon**: the real execution engine; exposes `GET /health`, `GET /tools`, `POST /invoke`
- **MCP**: thin stdio bridge for agents; exposes the same tools to the model
- **SQLite**: stores posts, scopes, sync runs, billing estimates, and optional analysis results
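The daemon's HTTP surface can be exercised with any client. A minimal sketch, assuming Node 18+ (global `fetch`) and the `127.0.0.1:3200` address used in the Quick Start section:

```javascript
// Sketch of a client for the daemon's POST /invoke endpoint.
// Assumes the daemon listens on 127.0.0.1:3200, as in the Quick Start
// examples; tool names and input shapes follow this README.
function buildInvokeRequest(tool, input) {
  return {
    method: 'POST',
    headers: { 'content-type': 'application/json' },
    body: JSON.stringify({ tool, input }),
  };
}

async function invoke(tool, input) {
  const res = await fetch('http://127.0.0.1:3200/invoke', buildInvokeRequest(tool, input));
  if (!res.ok) throw new Error(`invoke failed: ${res.status}`);
  return res.json();
}

// Example (requires a running daemon):
// invoke('archive.posts.list', { username: 'sampleauthor', limit: 10 }).then(console.log);
```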
## Core Features

### 1. Smart, low-cost ingest
The system avoids paying twice for the same coverage.
Example:

- first you fetch the latest 50
- later you ask for the latest 100
- it does not re-fetch the first 50; it fetches only the missing 50

This works for:

- latest N timeline
- latest N original posts
- exact repeated date windows
- exact repeated search queries
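The delta-planning idea behind this can be sketched as follows (function and field names are illustrative, not the daemon's actual internals):

```javascript
// Illustrative sketch of "never pay twice" ingest planning: given how many
// of the latest posts are already archived, only the missing remainder
// needs to be fetched from the X API.
function planLatestFetch(alreadyArchived, requestedCount) {
  const toFetch = Math.max(0, requestedCount - alreadyArchived);
  return {
    toFetch,                               // posts to request from the API
    fromArchive: requestedCount - toFetch, // posts served from local SQLite
  };
}
```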
### 2. Original posts by default

When a user generically says "fetch posts" or "archive tweets", the default tool is `ingest.accounts.original_backfill`.

This excludes:

- replies
- retweets
- quote tweets
That keeps the archive cleaner and cheaper.
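The exclusion rule amounts to a simple predicate; a sketch assuming simplified post fields (`inReplyToId`, `isRetweet`, `isQuote` are illustrative names, not the actual schema):

```javascript
// Illustrative filter for "original posts only": drop replies, retweets,
// and quote tweets before anything is written to the archive.
function isOriginalPost(post) {
  return !post.inReplyToId && !post.isRetweet && !post.isQuote;
}
```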
### 3. Media URLs are stored, files are not downloaded

If a post contains images:

- files are not downloaded
- only `mediaUrls` are stored
This keeps disk and network usage low.
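The storage rule can be sketched as keeping only URL strings, never bytes (field names here are illustrative):

```javascript
// Illustrative: reduce a post's media to URL strings only. No file I/O,
// no network download, so disk and bandwidth stay untouched.
function extractMediaUrls(post) {
  return (post.media ?? []).map((m) => m.url).filter(Boolean);
}
```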
### 4. Optional analysis layer

Analysis is off by default.

That means:

- you can archive posts without analyzing them
- you can analyze the same archive later
- old and newly ingested posts are both supported
### 5. Local semantic search

Once analysis exists, you can search with natural-language prompts like:

- "teaching posts about coding"
- "monolith vs microservices"
- "backend and architecture advice"
The system then:

- searches the analyzed local archive
- scores candidates locally
- lets the agent work on a narrow, relevant subset
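Local candidate scoring typically reduces to embedding similarity. A generic sketch of the idea, not the project's actual code: embed the query, compare it to each post's stored vector, keep the top hits.

```javascript
// Cosine similarity between two equal-length embedding vectors.
function cosine(a, b) {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

// Rank posts (each carrying a precomputed `embedding`) against a query vector.
function rankPosts(queryVec, posts, limit = 10) {
  return posts
    .map((p) => ({ ...p, score: cosine(queryVec, p.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, limit);
}
```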
## Tool Groups

- `sources.accounts.resolve`
- `ingest.accounts.backfill`
- `ingest.accounts.original_backfill`
- `ingest.accounts.sync`
- `ingest.search.backfill`
- `archive.posts.list`
- `archive.posts.search`
- `archive.posts.semantic_search`
- `archive.accounts.list`
- `archive.accounts.get`
- `archive.billing.summary`
- `analysis.posts.run`
- `analysis.labels.list`
- `archive.insights.summary`
MCP also exposes:

- `system.daemon.start`
## Risk Model

**safe-read**

- read-only
- does not change X or the local archive

**operator-write**

- writes into the local SQLite archive
- may consume X API credits
- does not create, delete, like, reply, or DM on X

Note: `analysis.posts.run` is also `operator-write`, but it does not consume X API credits; it only writes local analysis data and uses CPU.
## Performance Reference

Measured system:

- Apple Silicon M4 mini
- 16 GB RAM
- local embedding model on CPU

Measured analysis speed:

- 900 posts: about 30.69 s
- 100 posts: about 3.41 s
- 980 posts: about 33.42 s expected total

Important:

- this is not a large generative LLM benchmark
- this is local tagging + embedding + semantic retrieval preparation
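For context, the figures above work out to roughly 34 ms per post, or about 29 posts per second; a quick back-of-envelope check:

```javascript
// Back-of-envelope throughput implied by the measured numbers above.
const perPostMs = (30.69 / 900) * 1000; // ~34.1 ms per post
const postsPerSec = 900 / 30.69;        // ~29.3 posts per second
```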
## Local Model

The current analysis layer uses one small local model:

`sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2`

Its role is:

- post embeddings
- label-description embeddings
- query embeddings
- similarity-based tagging
- semantic retrieval
This layer does not generate final answers. It builds a local meaning layer on top of the archive.
## How Analysis Works

When analysis runs, each post gets signals such as:

- educational score
- matched labels
- topic scores
- reply/noise/technical flags

Tagging is based on:

- rule-based signals
- embedding similarity
- a fixed label catalog with descriptions
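Similarity-based tagging against a label catalog can be sketched generically: each label description has an embedding, and a post receives every label whose description vector is close enough to the post vector. The threshold value and field names below are illustrative, not the project's actual settings.

```javascript
// Illustrative similarity-based tagging: a post gets every catalog label
// whose description embedding is sufficiently close to the post's embedding.
// The 0.4 threshold is a made-up example value.
function matchLabels(postVec, labels, threshold = 0.4) {
  const cos = (a, b) => {
    let dot = 0, na = 0, nb = 0;
    for (let i = 0; i < a.length; i++) {
      dot += a[i] * b[i];
      na += a[i] * a[i];
      nb += b[i] * b[i];
    }
    return dot / (Math.sqrt(na) * Math.sqrt(nb));
  };
  return labels
    .map((l) => ({ id: l.id, score: cos(postVec, l.embedding) }))
    .filter((l) => l.score >= threshold)
    .sort((a, b) => b.score - a.score);
}
```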
## Label Catalog

The repository includes a versioned label catalog with Turkish descriptions.

Examples:

`software_architecture`, `monolith_vs_microservices`, `backend_api`, `database_sql`, `database_indexing`, `caching`, `distributed_systems`, `testing_qa`, `clean_code`, `code_review`, `security_appsec`, `authentication_authorization`, `ci_cd_release`, `ai_assisted_coding`, `vibe_coding`, `prompting_for_engineering`, `technical_decision_making`
## Quick Start

Check daemon health:

```sh
curl http://127.0.0.1:3200/health
```

List available tools:

```sh
curl http://127.0.0.1:3200/tools
```

Estimate original-post ingest:

```sh
curl -X POST http://127.0.0.1:3200/invoke \
  -H 'content-type: application/json' \
  -d '{
    "tool": "ingest.accounts.original_backfill",
    "input": {
      "username": "sampleauthor",
      "searchMode": "recent",
      "targetCount": 100,
      "estimateOnly": true
    }
  }'
```

Run local analysis:

```sh
curl -X POST http://127.0.0.1:3200/invoke \
  -H 'content-type: application/json' \
  -d '{
    "tool": "analysis.posts.run",
    "input": {
      "username": "sampleauthor",
      "limit": 200,
      "onlyUnanalyzed": true
    }
  }'
```

Run semantic search:

```sh
curl -X POST http://127.0.0.1:3200/invoke \
  -H 'content-type: application/json' \
  -d '{
    "tool": "archive.posts.semantic_search",
    "input": {
      "username": "sampleauthor",
      "query": "teaching posts about coding",
      "educationalOnly": true,
      "limit": 10
    }
  }'
```

## Model Packaging Note
The repository is public and safe to clone, but the local model files are not stored in git history because of GitHub file-size limits; model packaging is handled separately from the normal git push flow.