The Web Scraping API Market in 2026
Web scraping has matured from a niche developer skill into a recognized infrastructure category. Grand View Research valued the web scraping services market at $901 million in 2024 and projects it to cross $1.6 billion by 2028, growing at a 13.1% CAGR. The growth is driven by three converging forces: enterprises need structured data to feed AI models, e-commerce companies rely on competitive price intelligence, and autonomous AI agents increasingly need real-time web data to function.
The companies that captured this market early are now generating substantial revenue. Zyte (formerly Scrapinghub), the company behind Scrapy and the Zyte API, raised $14 million in Series B funding and serves over 3,000 enterprise customers. ScraperAPI, founded by a single developer in 2018, raised $2.5 million after bootstrapping to profitability, and now handles billions of API requests per month across 10,000+ customers. SerpApi, which focuses exclusively on search engine result page scraping, generates an estimated $10M+ in annual recurring revenue from a single, focused product. Apify, the platform that lets anyone build and sell scraping actors, hosts over 3,000 public actors on its store, with top actor developers earning $10,000-$50,000 per month.
Why Individuals Can Compete in This Market
- SerpApi built $10M+ ARR from one data type. Enterprise players cannot cover every niche.
- Cloud hosting plus pay-per-GB proxies means $50-200/month to run a production scraping API.
- Autonomous agents need scraping APIs they can pay for instantly. x402 makes this permissionless.
- Apify Store, RapidAPI, and x402 marketplaces provide built-in distribution to buyers.
The barrier to entry has never been lower. A developer with Playwright or Puppeteer skills, a proxy subscription, and a deployment platform can go from zero to a revenue-generating scraping API in a single weekend. The question is not whether scraping APIs make money -- the market has already proven that decisively. The question is which business model, selling platform, and technology stack will maximize your revenue per hour of development time.
The Three Business Models That Work
Every successful scraping API business uses one of three pricing models. Each optimizes for a different customer segment and scales differently. Understanding which model fits your product is the single most important business decision you will make.
Model 1: Per-Request Pricing
Typical pricing: $0.001-$0.05 per request.
The most straightforward model. Charge a flat fee per API call. ScraperAPI pioneered this approach at scale, charging $0.001-$0.01 per request depending on the plan tier. The beauty is simplicity: customers understand exactly what they are paying, and your revenue scales linearly with usage. This model works exceptionally well for x402 payment gating because each request is a discrete, payable transaction.
Revenue Example
A SERP scraping API at $0.005/request with 10,000 requests/day generates $50/day = $1,500/month. At 50,000 requests/day: $7,500/month.
Model 2: Subscription Tiers
Typical pricing: $49-$499/month.
Predictable recurring revenue. Apify actors typically use this model, with free tiers (limited requests), pro tiers ($49-$99/month), and business tiers ($199-$499/month). The advantage is revenue predictability and higher lifetime customer value. The disadvantage is that you need to handle account management, usage tracking, and billing infrastructure. SerpApi uses this model with plans ranging from free (100 searches/month) to Enterprise ($10,000+/month for millions of searches).
Revenue Example
100 customers across three tiers: 60 at $49 + 30 at $149 + 10 at $499 = $12,400/month MRR. Customer churn at 5%/month means you need ~5 new customers/month to maintain.
Model 3: Enterprise Custom
Typical pricing: $1,000-$50,000+/month.
High-value contracts with dedicated support, SLAs, and custom data delivery. Bright Data (formerly Luminati) built a $200M+ revenue business primarily on enterprise contracts. Zyte serves 3,000+ enterprise customers with custom scraping solutions. This model requires sales effort and relationship management, but a single enterprise customer can be worth more than hundreds of self-serve users. Many developers start with per-request pricing, prove their API's reliability, then upsell enterprise customers on custom contracts.
Revenue Example
5 enterprise customers at $5,000/month average = $25,000/month. Enterprise contracts typically have 12-month terms with 90%+ retention rates.
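The arithmetic behind the three revenue examples is simple enough to sanity-check in a few lines. The figures below are copied directly from the examples above:

```typescript
// Reproduces the revenue examples above; all figures come from the text.

// Model 1: per-request -- revenue scales linearly with volume.
const perRequest = 10_000 * 0.005 * 30;       // 10k req/day at $0.005

// Model 2: subscription tiers -- monthly recurring revenue (MRR).
const tiers = 60 * 49 + 30 * 149 + 10 * 499;  // 100 customers, 3 tiers

// Model 3: enterprise -- few customers, high contract value.
const enterprise = 5 * 5_000;                 // 5 contracts at $5k/month

console.log({ perRequest, tiers, enterprise });
// { perRequest: 1500, tiers: 12400, enterprise: 25000 }
```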
Which Model Should You Start With?
Start with per-request pricing via x402. It requires zero billing infrastructure, works from day one, and lets AI agents pay autonomously. Use the revenue data to understand your customer base, then layer on subscription tiers for repeat customers and enterprise contracts for high-volume users. This is exactly how ScraperAPI scaled: simple per-request pricing first, then tiered plans once they understood usage patterns.
Where to Sell: Platform Comparison
Three selling platforms dominate the scraping API market in 2026. Each has distinct advantages, fee structures, and buyer demographics. The right choice depends on whether you prioritize distribution, revenue retention, or AI agent compatibility.
| Feature | Apify Store | RapidAPI Hub | Self-hosted + x402 |
|---|---|---|---|
| Commission / Fee | 20% of revenue | 20% of revenue | 0% (direct USDC) |
| Built-in Audience | 90,000+ users | 4M+ developers | Marketplace listing |
| Payment Currency | USD (Stripe) | USD (Stripe) | USDC (on-chain) |
| AI Agent Compatible | Via API only | Via API only | Native (HTTP 402) |
| Infrastructure | Managed (their cloud) | Self-hosted | Self-hosted |
| Pricing Flexibility | Per-run + subscription | Per-request + plans | Per-request (micro) |
| Payout Speed | 30-day net terms | 30-day net terms | Instant (on-chain) |
| Vendor Lock-in | Apify actor framework | API spec only | None (standard HTTP) |
| Setup Complexity | Low (Apify SDK) | Medium | Medium (Hono + middleware) |
| Best For | Discovery & beginners | API marketplace reach | Max revenue & AI agents |
Apify Store
Best for beginners and discovery. The Apify Store has over 3,000 actors and 90,000+ users actively searching for scraping solutions. You build an “actor” using the Apify SDK, publish it, and the platform handles billing, infrastructure, and customer support. Top actors like the Amazon Product Scraper and Instagram Scraper earn their developers $10,000-$50,000/month. The 20% commission is the cost of built-in distribution.
You earn: $0.80 for every $1.00 in revenue. On $5,000/month gross: $4,000 net.
RapidAPI Hub
Largest API marketplace with 4 million+ developers. You host your own API and list it on RapidAPI's hub. They handle billing and provide a unified API gateway. The 20% fee matches Apify, but the audience is broader (not just scraping). Good for APIs with general developer appeal. The downside: intense competition and less discovery for niche scraping tools.
You earn: $0.80 for every $1.00 in revenue. On $5,000/month gross: $4,000 net.
Self-hosted + x402
Zero commission. USDC payments go directly to your wallet with no intermediary. The x402 protocol (HTTP 402 Payment Required) enables per-request micropayments that work for both human developers and AI agents. List your service on the Proxies.sx Service Marketplace for discovery, or let AI agents find you via x402 protocol discovery.
You earn: $1.00 for every $1.00 in revenue. On $5,000/month gross: $5,000 net.
The Hybrid Strategy That Maximizes Revenue
The smartest developers sell on multiple platforms simultaneously. List a basic version on the Apify Store for discovery and credibility. Offer the same API on RapidAPI for broader reach. Run the premium version (more features, higher rate limits, fresher data) self-hosted with x402 for maximum margins and AI agent compatibility. When Apify or RapidAPI customers ask for more capacity, upsell them to the self-hosted version where you keep 100%. This is the same strategy that many SaaS companies use: freemium for acquisition, premium for revenue.
Choosing a Profitable Scraping Niche
Not all scraping niches are equally profitable. The most lucrative ones sit at the intersection of high data value, strong anti-bot protection (which creates a technical moat), and recurring demand (customers need the data continuously, not once). Here are the niches ranked by revenue potential based on real marketplace data.
SERP / Search Engine Data
Proven: $10M+ ARR.
SerpApi generates $10M+ ARR scraping Google, Bing, and YouTube search results. Google serves 8.5 billion searches/day. Every SEO company, marketing agency, and competitive intelligence firm needs SERP data. The moat is Google's aggressive anti-bot protection -- mobile proxies are nearly mandatory.
E-commerce Price Monitoring
Revenue potential: $5,000-$50,000/month per customer.
Amazon, Walmart, Target, and other retailers are the most scraped websites in the world. Enterprises pay $5,000-$50,000/month for continuous price monitoring across millions of products. The data directly drives pricing decisions that can be worth millions in revenue. Bright Data built most of its $200M+ revenue here.
Social Media Data
Revenue potential: $3,000-$20,000/month.
Instagram profiles, TikTok videos, LinkedIn job postings, Twitter/X mentions. Marketing agencies, brand monitoring companies, and influencer platforms all need this data. Anti-bot protections are severe (especially Instagram and LinkedIn), creating a strong technical moat for developers who solve it.
Real Estate Listings
Revenue potential: $2,000-$15,000/month.
Zillow, Realtor.com, Redfin, Rightmove (UK), Idealista (EU). Proptech companies, real estate investors, and market research firms pay consistently for listing data. The niche is less competitive than SERP or e-commerce because it requires domain expertise to structure the data meaningfully.
Job Listings
Revenue potential: $1,500-$10,000/month.
Indeed, LinkedIn Jobs, Glassdoor, ZipRecruiter. HR tech companies, workforce analytics firms, and recruitment agencies need job listing data at scale. LinkedIn is notoriously difficult to scrape, making mobile proxies essential and creating a strong competitive moat.
Niche Selection Criteria
Choose a niche where: (1) the data is valuable enough that customers will pay $0.005+ per request, (2) the target site has enough anti-bot protection that casual scrapers fail, creating a moat for you, (3) demand is recurring (weekly or daily data refreshes, not one-time), and (4) you have domain knowledge to structure the output meaningfully. A real estate scraping API built by someone who understands property data will always beat a generic HTML-to-JSON converter.
Building the Scraper: From Script to API
The technical journey has three phases: build a reliable scraper, wrap it in an API, and gate it with payments. Here is a complete, production-ready example of a SERP scraping API using Playwright, mobile proxies from Proxies.sx, and x402 payment middleware.
Step 1: The Scraping Logic
A production scraper needs three things: proxy rotation for anti-bot bypass, retry logic for resilience, and structured output parsing. Here is the core scraping module using Playwright with mobile proxy integration.
// src/scraper.ts
import { chromium, Browser, Page } from 'playwright';
interface SERPResult {
position: number;
title: string;
url: string;
snippet: string;
domain: string;
}
interface ScrapeResult {
query: string;
country: string;
results: SERPResult[];
scrapedAt: string;
}
// Proxies.sx mobile proxy configuration
const PROXY_CONFIG = {
host: 'gate.proxies.sx',
port: 5432,
username: process.env.PROXIES_USER!,
password: process.env.PROXIES_PASS!,
};
export async function scrapeGoogle(
query: string,
country: string = 'US'
): Promise<ScrapeResult> {
let browser: Browser | null = null;
try {
// Launch with mobile proxy (real 4G/5G IP)
browser = await chromium.launch({
proxy: {
server: `http://${PROXY_CONFIG.host}:${PROXY_CONFIG.port}`,
username: `${PROXY_CONFIG.username}-country-${country}`,
password: PROXY_CONFIG.password,
},
});
const page: Page = await browser.newPage({
userAgent: 'Mozilla/5.0 (iPhone; CPU iPhone OS 17_5 like Mac OS X) '
+ 'AppleWebKit/605.1.15 (KHTML, like Gecko) Version/17.5 '
+ 'Mobile/15E148 Safari/604.1',
viewport: { width: 390, height: 844 },
});
// Navigate to Google with country-specific TLD
// (simplified mapping -- some markets use compound TLDs like co.uk or co.jp)
const tld = country === 'US' ? 'com' : country.toLowerCase();
await page.goto(
`https://www.google.${tld}/search?q=${encodeURIComponent(query)}&hl=en`,
{ waitUntil: 'domcontentloaded', timeout: 30000 }
);
// Parse organic results
const results: SERPResult[] = await page.evaluate(() => {
const items = document.querySelectorAll('#search .g');
return Array.from(items).slice(0, 10).map((el, i) => {
const titleEl = el.querySelector('h3');
const linkEl = el.querySelector('a');
const snippetEl = el.querySelector('[data-sncf]') || el.querySelector('.VwiC3b');
const url = linkEl?.getAttribute('href') || '';
return {
position: i + 1,
title: titleEl?.textContent || '',
url,
snippet: snippetEl?.textContent || '',
domain: url ? new URL(url).hostname : '',
};
});
});
return {
query,
country,
results,
scrapedAt: new Date().toISOString(),
};
} finally {
if (browser) await browser.close();
}
}
Step 2: Wrap It in a Hono API
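Before wrapping the scraper in an API, note that Step 1's checklist mentioned retry logic, which the module above omits for brevity. A minimal, generic wrapper could look like the sketch below. The helper name and backoff constants are illustrative, not part of any SDK mentioned in this article:

```typescript
// Sketch: retry a flaky async operation with exponential backoff.
// Illustrative helper -- not part of any SDK referenced in this article.
export async function withRetry<T>(
  fn: () => Promise<T>,
  attempts = 3,
  baseDelayMs = 1_000,
): Promise<T> {
  let lastError: unknown;
  for (let i = 0; i < attempts; i++) {
    try {
      return await fn();
    } catch (err) {
      lastError = err;
      if (i < attempts - 1) {
        // Back off 1s, 2s, 4s... before the next attempt.
        await new Promise((r) => setTimeout(r, baseDelayMs * 2 ** i));
      }
    }
  }
  throw lastError;
}

// Usage: wrap each scrape so a transient block gets a fresh attempt
// (and, with a rotating proxy gateway, typically a fresh IP):
// const result = await withRetry(() => scrapeGoogle('web scraping tools', 'US'));
```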
Hono is a lightweight, edge-ready web framework that runs on Cloudflare Workers, Deno Deploy, Bun, and Node.js. It is the ideal framework for scraping APIs because it is fast, has minimal dependencies, and integrates cleanly with the x402 middleware.
// src/index.ts
import { Hono } from 'hono';
import { cors } from 'hono/cors';
import { scrapeGoogle } from './scraper';
const app = new Hono();
// CORS for browser clients
app.use('/*', cors());
// Health check (free, no payment needed)
app.get('/', (c) => {
return c.json({
service: 'SERP Scraping API',
version: '1.0.0',
pricing: '$0.005 per search query (USDC via x402)',
docs: 'https://your-api.com/docs',
endpoints: {
'/api/serp': 'Google SERP scraping (GET)',
},
});
});
// SERP scraping endpoint
app.get('/api/serp', async (c) => {
const query = c.req.query('q');
const country = c.req.query('country') || 'US';
if (!query) {
return c.json({ error: 'Missing required parameter: q' }, 400);
}
try {
const result = await scrapeGoogle(query, country);
return c.json(result);
} catch (error) {
return c.json(
{ error: 'Scraping failed', message: (error as Error).message },
500
);
}
});
export default app;
Step 3: Gate with x402 Payments
This is where self-hosted scraping APIs diverge from marketplace listings. Instead of Stripe billing, API keys, and account management, you add a single middleware that handles payment verification. The @proxies-sx/x402-hono package does this in one line. When a client calls your endpoint without a payment proof, they get back an HTTP 402 response with payment terms. Once they pay and retry with the transaction hash, the middleware verifies the on-chain payment and lets the request through.
// src/index.ts (with x402 payment gating)
import { Hono } from 'hono';
import { cors } from 'hono/cors';
import { x402Middleware } from '@proxies-sx/x402-hono';
import { verifySolanaPayment } from '@proxies-sx/x402-solana';
import { scrapeGoogle } from './scraper';
const app = new Hono();
app.use('/*', cors());
// Free health/discovery endpoint (no payment)
app.get('/', (c) => {
return c.json({
service: 'SERP Scraping API',
version: '1.0.0',
pricing: '$0.005 per request (USDC via x402)',
networks: ['solana', 'base'],
endpoints: { '/api/serp': 'GET - Google SERP scraping' },
});
});
// x402 payment gate: $0.005 per request
app.use('/api/*', x402Middleware({
price: 5000, // $0.005 in micro-units (1 unit = $0.000001)
recipient: 'YOUR_SOLANA_WALLET_ADDRESS',
verify: verifySolanaPayment,
networks: ['solana', 'base'], // Accept both networks
description: 'Google SERP scraping - 10 organic results per query',
}));
// Paid endpoint: scrape Google SERPs
app.get('/api/serp', async (c) => {
const query = c.req.query('q');
const country = c.req.query('country') || 'US';
if (!query) {
return c.json({ error: 'Missing required parameter: q' }, 400);
}
const result = await scrapeGoogle(query, country);
return c.json(result);
});
export default app;
What Happens When a Client Calls Your API
1. Client sends GET /api/serp?q=web+scraping+tools
2. x402 middleware returns HTTP 402 with: price ($0.005), token (USDC), wallet address, supported networks
3. Client (human or AI agent) pays $0.005 USDC on Solana (~400ms) or Base (~2s)
4. Client retries the request with a Payment-Signature header containing the tx hash
5. Middleware verifies the on-chain payment, and the request reaches your scraping handler
6. Structured SERP data is returned as JSON. You keep 100% of the $0.005 USDC.
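From the client's perspective, this flow is a two-request state machine. The sketch below shows its shape; the fetch implementation and the wallet call are injected parameters (real agents would use the global `fetch` plus an actual USDC transfer that returns a transaction hash), and `Payment-Signature` is the header named in step 4:

```typescript
// Hedged sketch of the client side of the 402 flow described above.
type FetchLike = (
  url: string,
  init?: { headers?: Record<string, string> },
) => Promise<{ status: number; json: () => Promise<any> }>;

export async function callPaidEndpoint(
  fetchFn: FetchLike,
  url: string,
  payUSDC: (terms: any) => Promise<string>, // wallet call; returns a tx hash
): Promise<any> {
  let res = await fetchFn(url);           // step 1: call without payment
  if (res.status === 402) {
    const terms = await res.json();       // step 2: read price, token, wallet
    const txHash = await payUSDC(terms);  // step 3: pay on-chain
    res = await fetchFn(url, {            // step 4: retry with payment proof
      headers: { 'Payment-Signature': txHash },
    });
  }
  return res.json();                      // steps 5-6: verified, data returned
}
```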
Revenue Calculations: Real Numbers
The economics of a scraping API depend on three variables: price per request, cost per request (primarily proxy bandwidth), and request volume. Here are realistic projections for a SERP scraping API using Proxies.sx mobile proxies at $4-6/GB.
Note: All revenue figures below are estimates based on typical request volumes and proxy consumption. Actual results depend on your niche, data quality, marketing effort, and target site complexity. These numbers are directional, not guarantees.
Side Project
1,000 requests/day, $0.005/request
Revenue: 1,000 req/day x $0.005 x 30 days = $150/month. Proxy cost: ~1,000 requests x 0.5 MB avg x 30 = ~15 GB at $4/GB = $60. Server: ~$10/month (Railway/Fly.io). Net on x402 (0% fee): ~$80/month. Net on Apify (20% fee): ~$50/month.
Growing API
10,000 requests/day, $0.005/request
Revenue: 10,000 req/day x $0.005 x 30 = $1,500/month. Proxy cost: ~150 GB at $4/GB (101-500GB tier) = $600. Server: ~$50/month. Net on x402 (0% fee): ~$850/month. Net on Apify (20% fee): ~$550/month ($300 commission).
Full-Time API Business
100,000 requests/day, $0.005/request
Revenue: 100,000 req/day x $0.005 x 30 = $15,000/month. Proxy cost: ~1,500 GB at $4/GB (501-1000GB tier) = $6,000. Server: ~$200/month (dedicated). Net on x402 (0% fee): ~$8,800/month. Net on Apify (20% fee): ~$5,800/month ($3,000 commission).
Enterprise Scale
500,000 requests/day, blended $0.004/request
Revenue: 500,000 req/day x $0.004 x 30 = $60,000/month. Proxy cost: ~7,500 GB at custom rate ~$3.50/GB = ~$26,250. Server + ops: ~$1,000/month. Net on x402: ~$32,750/month. At this scale, enterprise contracts with SLAs command higher per-request pricing.
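All four projections follow the same formula. The sketch below reproduces the numbers above under the text's own assumptions (~0.5 MB of proxy bandwidth per request, $4/GB, a 30-day month, 1 GB = 1,000 MB); as the note says, these are directional, not guarantees:

```typescript
// Reproduces the monthly projections above under the text's assumptions.
function monthlyNet(opts: {
  requestsPerDay: number;
  pricePerRequest: number; // USD per request
  gbRate: number;          // USD per GB of proxy bandwidth
  serverCost: number;      // USD per month
  platformFee?: number;    // 0 for x402, 0.2 for Apify/RapidAPI
  mbPerRequest?: number;   // defaults to the 0.5 MB the text assumes
}): number {
  const fee = opts.platformFee ?? 0;
  const mb = opts.mbPerRequest ?? 0.5;
  const revenue = opts.requestsPerDay * opts.pricePerRequest * 30;
  const proxyCost = ((opts.requestsPerDay * mb * 30) / 1_000) * opts.gbRate;
  return revenue * (1 - fee) - proxyCost - opts.serverCost;
}

// Growing API tier: $1,500 revenue, $600 proxies, $50 server.
monthlyNet({ requestsPerDay: 10_000, pricePerRequest: 0.005, gbRate: 4, serverCost: 50 });
// → 850 (self-hosted x402)
monthlyNet({ requestsPerDay: 10_000, pricePerRequest: 0.005, gbRate: 4, serverCost: 50, platformFee: 0.2 });
// → 550 (after a 20% marketplace commission)
```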
The 20% Commission Adds Up Fast
At $1,500/month revenue
Apify/RapidAPI takes $300. x402 takes $0. That is $3,600/year you keep.
At $15,000/month revenue
Platform takes $3,000. x402 takes $0. That is $36,000/year in your wallet.
At $60,000/month revenue
Platform takes $12,000. x402 takes $0. The savings alone are a salary.
Compounded over 3 years
A growing API loses $100K-$500K in commissions to platforms. Self-hosting with x402 retains all of it.
Listing Your Service for Discovery
Self-hosting with x402 gives you 100% revenue retention, but you still need buyers to find your API. The Proxies.sx Service Marketplace provides discovery without taking a commission. Your service is listed alongside other x402-enabled APIs, and AI agents can find it through x402 protocol discovery. The live marketplace at agents.proxies.sx/marketplace/ already hosts three live services: Mobile Proxy, Antidetect Browser, and Google Maps Lead Generator.
How to Get Listed
1. Build your service using Proxies.sx mobile proxies
2. Gate with x402 payments using @proxies-sx/x402-hono
3. Submit via Telegram (@proxyforai), GitHub issue, or email (agents@proxies.sx)
4. Maya reviews and lists your service -- no commission on revenue
Developer Bounties ($1,150 Total)
In addition to API revenue, the marketplace offers 14 developer bounties totaling $1,150 paid in $SX token. Bounties range from $50 to $200 for building specific services:
- SERP tracking service ($100)
- E-commerce price monitor ($200)
- Social media scraper ($150)
- Job listing aggregator ($100)
- Ad verification service ($100)
AI Agent Discovery
The fastest-growing buyer segment for scraping APIs is autonomous AI agents. When you list on the Proxies.sx marketplace, your service becomes discoverable via the MCP Server (55 tools for AI agents) and x402 protocol discovery. Claude, GPT-based agents, LangChain chains, and CrewAI crews can all find your API, read the 402 payment terms, pay with USDC, and consume your data -- all without human intervention.
Over 35 million x402 transactions have already settled. As the autonomous agent economy grows, every x402-gated API becomes accessible to every agent with a funded wallet. This is distribution that no traditional API marketplace can match.
The Production Infrastructure Stack
A production scraping API needs five components: the scraping engine, the API framework, the payment layer, the proxy infrastructure, and the deployment platform. Here is the recommended stack with real cost estimates.
| Component | Recommended | Alternative | Cost/Month |
|---|---|---|---|
| Web Framework | Hono | Express.js / Fastify | Free |
| Scraping Engine | Playwright | Puppeteer / Cheerio | Free |
| Payment Middleware | @proxies-sx/x402-hono | Custom x402 handler | Free |
| Payment Verification | @proxies-sx/x402-solana | ethers.js (Base) | Free |
| Mobile Proxies | Proxies.sx | Bright Data / Oxylabs | $4-6/GB |
| Hosting | Railway / Fly.io | Cloudflare Workers / VPS | $5-50 |
| Cache | Redis (Upstash) | SQLite / in-memory | $0-10 |
| Monitoring | Axiom (free tier) | Grafana Cloud | $0 |
Install the SDK
# Install the Proxies.sx x402 SDK packages
npm install @proxies-sx/x402-core @proxies-sx/x402-hono @proxies-sx/x402-solana
# Install the scraping stack
npm install hono playwright
# Optional: cache layer
npm install @upstash/redis
Adding a Cache Layer (Reduce Proxy Costs by 60-80%)
The single most impactful optimization for a scraping API is caching. If multiple clients query the same search term within a short window, serve the cached result instead of re-scraping. This can reduce proxy bandwidth consumption by 60-80%, dramatically improving margins.
// src/cache.ts
import { Redis } from '@upstash/redis';
const redis = Redis.fromEnv();
const CACHE_TTL = 3600; // 1 hour in seconds
// Note: @upstash/redis serializes values to JSON on write and deserializes
// them on read, so no manual JSON.parse/JSON.stringify is needed here.
export async function getCachedResult(key: string) {
  return await redis.get<Record<string, unknown>>(key); // null on cache miss
}
export async function setCachedResult(key: string, data: unknown) {
  await redis.set(key, data, { ex: CACHE_TTL });
}
// In your API handler:
app.get('/api/serp', async (c) => {
const query = c.req.query('q')!;
const country = c.req.query('country') || 'US';
const cacheKey = `serp:${country}:${query.toLowerCase()}`;
// Check cache first (no proxy cost)
const cached = await getCachedResult(cacheKey);
if (cached) {
return c.json({ ...cached, cached: true });
}
// Scrape fresh data (costs proxy bandwidth)
const result = await scrapeGoogle(query, country);
await setCachedResult(cacheKey, result);
return c.json(result);
});
Cache Impact on Margins
Without caching: 10,000 requests/day x 0.5 MB = 150 GB/month at $4/GB = $600/month in proxy costs. With 70% cache hit rate: only 3,000 unique scrapes/day x 0.5 MB = 45 GB/month at $4/GB = $180/month. That is a $420/month savings that goes directly to your bottom line. At the same $1,500/month revenue, your net profit jumps from $850 to $1,270.
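The cache math above generalizes to any hit rate. A small sketch using the same figures (10,000 requests/day, 0.5 MB per fresh scrape, $4/GB, 30-day month):

```typescript
// Monthly proxy cost as a function of cache hit rate, using the figures above.
function proxyCostPerMonth(requestsPerDay: number, hitRate: number): number {
  const freshScrapes = requestsPerDay * (1 - hitRate); // only misses hit the proxy
  const gb = (freshScrapes * 0.5 * 30) / 1_000;        // 0.5 MB per scrape, 30 days
  return gb * 4;                                       // $4/GB
}

proxyCostPerMonth(10_000, 0);   // ≈ $600/month with no cache
proxyCostPerMonth(10_000, 0.7); // ≈ $180/month at a 70% hit rate (~$420 saved)
```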
Marketing Your Scraping API
Building the API is half the battle. Getting paying customers requires deliberate marketing across the channels where scraping API buyers actually look. Here are the highest-ROI marketing channels based on what worked for ScraperAPI, SerpApi, and successful Apify actors.
Technical Documentation
Priority: Critical.
The #1 conversion factor for API products. Write clear docs with code examples in Python, JavaScript, and curl. Include a playground or try-it endpoint. SerpApi's documentation is their primary marketing channel -- developers find it via Google and convert directly.
SEO Content
Priority: High.
Write blog posts targeting "how to scrape [website]" keywords. These searches have high commercial intent -- people searching for scraping solutions are ready to pay. ScraperAPI generates significant traffic from "how to scrape Amazon" type content.
Developer Communities
Priority: High.
Share your API on r/webscraping, Hacker News, IndieHackers, and relevant Discord servers. Provide genuine value (free tier or generous trial) and avoid pure self-promotion. The scraping community is tight-knit and word-of-mouth is powerful.
AI Agent Ecosystem
Priority: Growing.
List on the Proxies.sx marketplace for AI agent discovery. Ensure your x402 endpoint has a clear, machine-readable description at the root endpoint. As the agent economy grows, this becomes an increasingly important distribution channel.
The Launch Playbook
- Week 1: Ship MVP with 1 endpoint, x402 gating, and docs. List on Proxies.sx marketplace.
- Week 2: Post on r/webscraping and relevant Discord servers. Offer free trial (100 requests).
- Week 3: Write 2-3 blog posts targeting “how to scrape [your niche]” keywords.
- Week 4: List a basic version on Apify Store for discovery. Analyze first customers.
- Month 2: Add subscription plans based on usage data. Reach out to 5-10 potential enterprise customers.
- Month 3: Optimize based on metrics. Double down on what converts. Add endpoints based on customer requests.
Start Selling Your Scraping API Today
Build a scraper. Wrap it in Hono. Gate it with x402. List it on the marketplace. Keep 100% of your USDC revenue. The entire path from zero to a revenue-generating API takes a weekend.