An intelligent MCP server for crawling technical documentation, with interactive configuration, live progress tracking, and background task management.
TechCyclopedia combines comprehensive web crawling with intelligent content processing.
Smart user preference system with persistent storage and "don't ask again" functionality
Real-time updates on crawling progress with detailed status and background task management
Clean, structured file organization by domain with intelligent content filtering
Automatic boilerplate removal and content optimization for clean, LLM-ready markdown
SQLite-based user preference storage with customizable crawl strategies
BFS strategy for comprehensive documentation extraction with configurable depth limits
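As a rough illustration of the preference storage described above, here is a minimal sketch of a SQLite-backed store with a "don't ask again" flag. The class name `PreferenceStore`, the table layout, and the keys `dont_ask_again` and `max_depth` are assumptions made for this example, not TechCyclopedia's actual schema.

```python
import json
import sqlite3


class PreferenceStore:
    """Hypothetical key-value preference store backed by SQLite."""

    def __init__(self, path=":memory:"):
        # ":memory:" keeps the sketch self-contained; a file path would persist.
        self.conn = sqlite3.connect(path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS preferences ("
            "key TEXT PRIMARY KEY, value TEXT NOT NULL)"
        )
        self.conn.commit()

    def set(self, key, value):
        # Values are JSON-encoded so booleans, numbers and strings round-trip.
        self.conn.execute(
            "INSERT OR REPLACE INTO preferences (key, value) VALUES (?, ?)",
            (key, json.dumps(value)),
        )
        self.conn.commit()

    def get(self, key, default=None):
        row = self.conn.execute(
            "SELECT value FROM preferences WHERE key = ?", (key,)
        ).fetchone()
        return default if row is None else json.loads(row[0])

    def should_prompt(self):
        # Prompt for configuration unless the user opted out earlier.
        return not self.get("dont_ask_again", False)
```

With a store like this, a crawl request could call `should_prompt()` first and fall back to saved values such as `get("max_depth")` when the flag is set.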
An intelligent workflow that transforms web documentation into clean, organized content
User provides URLs through MCP client
Smart configuration with user preferences
BFS strategy with intelligent link discovery
Boilerplate removal and content optimization
Clean markdown files organized by domain
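The crawl and output steps above can be sketched roughly as follows: a breadth-first traversal with a depth limit that stays on the starting domain, plus a helper that maps each URL to a markdown path grouped by domain. The function names, the `output/<domain>/<slug>.md` layout, and the in-memory `LINKS` graph (used in place of real HTTP fetching) are all assumptions for illustration, not the project's actual implementation.

```python
from collections import deque
from urllib.parse import urlparse


def bfs_crawl(start_url, get_links, max_depth=2):
    """Return URLs in breadth-first order, staying on the start domain."""
    domain = urlparse(start_url).netloc
    seen = {start_url}
    queue = deque([(start_url, 0)])
    order = []
    while queue:
        url, depth = queue.popleft()
        order.append(url)
        if depth >= max_depth:
            continue  # depth limit reached: do not expand this page's links
        for link in get_links(url):
            if link not in seen and urlparse(link).netloc == domain:
                seen.add(link)
                queue.append((link, depth + 1))
    return order


def output_path(url, root="output"):
    """Map a URL to a markdown file path grouped by domain (hypothetical layout)."""
    parts = urlparse(url)
    slug = parts.path.strip("/").replace("/", "_") or "index"
    return f"{root}/{parts.netloc}/{slug}.md"


# Stand-in link graph so the sketch runs without network access.
LINKS = {
    "https://docs.example.com/": [
        "https://docs.example.com/guide",
        "https://docs.example.com/api",
        "https://other.example.org/",
    ],
    "https://docs.example.com/guide": ["https://docs.example.com/guide/install"],
    "https://docs.example.com/api": [],
    "https://docs.example.com/guide/install": [
        "https://docs.example.com/guide/install/linux"
    ],
}

pages = bfs_crawl(
    "https://docs.example.com/", lambda u: LINKS.get(u, []), max_depth=2
)
```

Breadth-first order visits pages closest to the entry point first, so a depth limit trims the deepest pages rather than an arbitrary branch.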
Get up and running with TechCyclopedia in minutes
pip install -r requirements.txt
python server/server.py
{
  "mcpServers": {
    "techcyclopedia": {
      "command": "python",
      "args": ["server/server.py"]
    }
  }
}
TechCyclopedia provides several powerful MCP tools for technical documentation crawling
Crawl technical documentation with interactive configuration and progress tracking
Start a background crawling task that runs while you continue to chat
Check the status of a background crawling task with detailed progress information
Get all crawling tasks and their status information
Set the 'don't ask again' flag for user preferences
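To show how the background-task tools above might fit together, here is a minimal sketch using `asyncio`: one function registers and schedules a crawl, another reports its status. The names `start_background_crawl`, `get_status`, and the in-memory `TASKS` registry are hypothetical and are not the server's real tool names or storage; the page fetch is replaced by a placeholder.

```python
import asyncio
import uuid

TASKS = {}  # task_id -> status dict (hypothetical in-memory registry)


async def _crawl(task_id, urls):
    status = TASKS[task_id]
    status["state"] = "running"
    for _url in urls:
        await asyncio.sleep(0)  # placeholder for fetching and processing a page
        status["done"] += 1
    status["state"] = "completed"


def start_background_crawl(urls):
    """Register a task, schedule it on the running event loop, return its id."""
    task_id = uuid.uuid4().hex
    TASKS[task_id] = {"state": "pending", "done": 0, "total": len(urls)}
    # Keeping the handle also prevents the task from being garbage-collected.
    TASKS[task_id]["handle"] = asyncio.create_task(_crawl(task_id, urls))
    return task_id


def get_status(task_id):
    """Return a snapshot of progress without exposing the task handle."""
    status = TASKS[task_id]
    return {key: status[key] for key in ("state", "done", "total")}


async def demo():
    task_id = start_background_crawl(
        ["https://docs.example.com/a", "https://docs.example.com/b"]
    )
    before = get_status(task_id)  # scheduled but not yet started
    await TASKS[task_id]["handle"]
    return before, get_status(task_id)
```

Running `asyncio.run(demo())` returns the status before and after the crawl finishes. Because `start_background_crawl` returns immediately, the caller stays free to handle other requests while the crawl proceeds.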
TechCyclopedia is open source and welcomes contributions from the community. Join us in building the future of intelligent documentation crawling.