Thunderbit, which claims more than 100,000 users for its AI web data platform, has unveiled three developer-focused tools: an API, an MCP (Model Context Protocol) server, and a CLI (command-line interface). The tools are meant to help developers turn unstructured web page content into Markdown or structured data. The stated goal is more reliable access to web data for AI agents, retrieval-augmented generation (RAG) pipelines, and automation workflows.
The centerpiece of the launch is Thunderbit Distill, an adaptive engine that converts HTML to Markdown. The engine reportedly achieved a ROUGE-L score of 0.87 in internal tests and is said to produce cleaner, more complete Markdown across a range of complex page types. It is framed as an answer to web data variability: changes in a site's structure often break extraction, which forces teams to maintain a custom scraper for each site.
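For readers unfamiliar with the metric, ROUGE-L scores a candidate text against a reference by the length of their longest common subsequence (LCS) of tokens, combined into an F-measure of precision and recall. The sketch below is a minimal, generic illustration of that calculation in Python. It is not Thunderbit's evaluation code: the whitespace tokenization, the `beta` default, and the function names `lcs_length` and `rouge_l` are all assumptions made for the example, and the company has not published how its 0.87 figure was computed.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                # Tokens match: extend the LCS found for the shorter prefixes.
                curr.append(prev[j - 1] + 1)
            else:
                # No match: carry over the better of the two neighbors.
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l(reference, candidate, beta=1.0):
    """ROUGE-L F-measure over whitespace tokens (tokenization is an assumption)."""
    ref = reference.split()
    cand = candidate.split()
    if not ref or not cand:
        return 0.0
    lcs = lcs_length(ref, cand)
    if lcs == 0:
        return 0.0
    recall = lcs / len(ref)       # share of the reference that was recovered
    precision = lcs / len(cand)   # share of the candidate that is on target
    return (1 + beta ** 2) * precision * recall / (recall + beta ** 2 * precision)


# Hypothetical example: a hand-written reference Markdown vs. a converter's output.
reference_md = "# Pricing Basic plan costs 10 dollars per month"
candidate_md = "# Pricing Basic plan costs 10 dollars"
score = rouge_l(reference_md, candidate_md)
```

A score of 1.0 means the candidate reproduces the reference token for token in order, so a reported 0.87 would indicate output that closely tracks the reference Markdown while still missing or adding some content.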

Beyond the API, MCP server, and CLI, Thunderbit also highlighted existing features. These include adaptive HTML-to-Markdown conversion and data extraction that works across product pages, pricing tables, and directories. The platform can also explore subpages and aggregate their information into a single structured view, and it reportedly requires minimal user input to acquire data, a feature described as "2-Click Scraping."
Thunderbit's stated mission is to turn volatile web pages into data that software can use consistently. The company is positioning its tools for sales, marketing, e-commerce, and research teams, with an emphasis on precise web data extraction. Future plans reportedly include "Auto Data Agents" for continuously monitoring and updating datasets, along with "PDF Data Parsing Agents" for extracting information from documents. These upcoming features are presented as steps toward more autonomous data pipelines. The platform is currently available as a Chrome extension.