Thunderbit Launches API, MCP Server, and CLI to Help Developers Extract Web Data

Thunderbit's new tools aim to make web data easier for AI systems to use, with a new engine reportedly scoring 0.87 on ROUGE-L in internal tests of HTML-to-Markdown conversion.

Thunderbit, a company that claims more than 100,000 users for its AI web data platform, has unveiled a suite of developer-focused tools: a new API, an MCP (Model Context Protocol) server, and a CLI (Command Line Interface). The tools are meant to help developers turn unstructured web page content into usable formats such as Markdown or structured data. The stated goal is to give AI agents, RAG pipelines, and automation workflows more reliable access to web data.

The centerpiece of the launch is Thunderbit Distill, an adaptive engine designed to convert HTML to Markdown. According to the company, the engine achieved a ROUGE-L score of 0.87 in internal tests and produced cleaner, more complete Markdown across a range of complex page types. ROUGE-L measures how closely an output matches a reference text, based on the longest common subsequence the two share. Thunderbit frames the engine as an answer to a common problem in web data extraction: changes in a website's structure often break extraction processes, forcing teams to build and maintain a custom scraper for each site.
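
For readers unfamiliar with the metric, the following is a minimal sketch of how a ROUGE-L F-score can be computed between a reference Markdown text and a converter's output. It is an illustration only: the whitespace tokenization, the `beta` weighting, and the sample strings are assumptions, and the announcement does not say which ROUGE-L variant, tokenizer, or reference data Thunderbit used.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l(reference, candidate, beta=1.0):
    """ROUGE-L F-score using whitespace tokens (an assumed tokenization)."""
    ref = reference.split()
    cand = candidate.split()
    if not ref or not cand:
        return 0.0
    lcs = lcs_length(ref, cand)
    if lcs == 0:
        return 0.0
    precision = lcs / len(cand)  # share of the output found in the reference
    recall = lcs / len(ref)      # share of the reference recovered by the output
    return (1 + beta**2) * precision * recall / (recall + beta**2 * precision)


# Hypothetical sample: the candidate drops one word from the reference.
reference_md = "# Pricing Basic plan costs 10 dollars"
candidate_md = "# Pricing Basic plan 10 dollars"
print(round(rouge_l(reference_md, candidate_md), 2))  # prints 0.92
```

A score of 1.0 means the output reproduces the reference token sequence exactly, so a reported 0.87 would indicate a high but imperfect overlap with whatever reference Markdown was used.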

Beyond the API, MCP server, and CLI, Thunderbit also highlighted existing features. These include adaptive HTML-to-Markdown conversion and data extraction that works across product pages, pricing tables, and directories. The platform is also designed to explore subpages and aggregate what it finds into a single structured view. It reportedly requires minimal user input to acquire data, a capability the company calls "2-Click Scraping."
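
To make the underlying task concrete, here is a deliberately simple HTML-to-Markdown sketch built on Python's standard `html.parser` module. It is not Thunderbit's engine or API. The class name `MarkdownSketch`, the function `html_to_markdown`, and the fixed tag-to-prefix table are all hypothetical, chosen only to show the kind of hard-coded rules that tend to break when a site's markup changes.

```python
from html.parser import HTMLParser


class MarkdownSketch(HTMLParser):
    """Hypothetical rule-based converter: headings, paragraphs, list items."""

    # Fixed tag-to-prefix rules (an assumption made for illustration).
    PREFIXES = {"h1": "# ", "h2": "## ", "h3": "### ", "li": "- ", "p": ""}

    def __init__(self):
        super().__init__()
        self.blocks = []         # finished Markdown blocks
        self.current_tag = None  # block-level tag currently being read
        self.parts = []          # text fragments inside that tag

    def handle_starttag(self, tag, attrs):
        if tag in self.PREFIXES:
            self.current_tag = tag
            self.parts = []

    def handle_data(self, data):
        # Text outside a known block (navigation, footers) is dropped.
        if self.current_tag is not None:
            text = " ".join(data.split())
            if text:
                self.parts.append(text)

    def handle_endtag(self, tag):
        if tag == self.current_tag:
            self.blocks.append(self.PREFIXES[tag] + " ".join(self.parts))
            self.current_tag = None


def html_to_markdown(html):
    parser = MarkdownSketch()
    parser.feed(html)
    parser.close()
    return "\n\n".join(parser.blocks)


# Hypothetical page fragment with navigation noise around the content.
sample = (
    "<nav>Home | Login</nav><h1>Pricing</h1>"
    "<p>Basic plan costs <b>10</b> dollars.</p>"
    "<ul><li>Email support</li><li>API access</li></ul>"
)
print(html_to_markdown(sample))
```

The sketch ignores links, tables, nested blocks, and content that sits in unlisted tags such as `div`, and it can misplace spaces around inline tags. Gaps of this kind are the sort of per-site variability an adaptive converter is meant to absorb.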

Thunderbit's stated mission is to turn volatile web pages into data that software can use with consistency. The company is positioning its tools for sales, marketing, e-commerce, and research teams, emphasizing precision in web data extraction. Future plans reportedly include the development of "Auto Data Agents" for continuous dataset monitoring and updating, alongside "PDF Data Parsing Agents" for extracting information from documents. These upcoming features are presented as steps towards more autonomous data pipelines. The platform is currently accessible as a Chrome Extension.

Frequently Asked Questions

Q: What new tools did Thunderbit release for developers?
Thunderbit has launched a new API, an MCP server, and a CLI. These tools help developers convert web page content into formats such as Markdown or structured data.

Q: How do Thunderbit's new tools help with web data?
The tools, especially the Thunderbit Distill engine, convert HTML pages into cleaner Markdown. This is intended to give AI agents and automation workflows more reliable data, even when websites change their structure.

Q: Who can benefit from Thunderbit's new tools?
Thunderbit is positioning the tools for sales, marketing, e-commerce, and research teams that need precise web data extraction. The platform aims to turn web pages into data that software can use consistently.

Q: What are Thunderbit's future plans?
Thunderbit reportedly plans to build "Auto Data Agents" that continuously monitor and update datasets, and "PDF Data Parsing Agents" that extract information from documents. The company presents both as steps toward more autonomous data pipelines.