Native Ad Data API: Programmatic Access to Live Native Creatives
Most native ad intelligence is locked behind a dashboard, so here is how analysts and data teams pull live creatives, the real advertiser, supply-chain hops, and landing pages as structured records over an API.

If you have ever tried to do real analysis on native advertising, you know the wall. The intelligence exists, but it is trapped behind a dashboard. You can browse one creative at a time, eyeball a handful of advertisers, maybe export a CSV that is already stale by the time it finishes downloading. What you cannot do is treat native ad data the way you treat every other dataset in your stack: query it, join it, schedule it, pipe it into a model.
This guide is for the people who want the data, not the dashboard. Analysts building competitive trackers. Data teams enriching a CRM. Quants backtesting creative angles. And, increasingly, the AI agents doing that work without a human in the loop. The question is plain. How do you get live native creatives, the supply chain behind them, and the landing pages they point to as clean, structured records over an API?
For scale, here is what one neutral index looks like right now: 589,036 creatives, 5.4 million ad observations, 25,933 distinct advertisers, and 926,259 captured landing pages across 42 networks (OpenAdLibrary, June 2026). That is the size of the haystack. The job of an API is to hand you exactly the needles you asked for.
What a native ad data API actually returns
A native ad data API is an endpoint that returns native advertising intelligence as machine-readable records instead of a web page. Each record bundles the creative (image and headline), the publisher and network that served it, the supply-chain hops behind the placement, the landing page the click resolves to, and timing signals like first-seen and last-seen dates. You filter, join, and aggregate it like any database table.
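To make that record shape concrete, here is a minimal sketch of how one creative could be modeled on the consuming side. The field names (`creative_id`, `supply_chain`, `landing_page_url`, and so on) are assumptions chosen for illustration, not the documented schema of any particular API, and the sample values in the usage note are made up.

```python
from dataclasses import dataclass
from datetime import date
from typing import Tuple


@dataclass(frozen=True)
class NativeAdRecord:
    """One native creative as a structured record (hypothetical field names)."""
    creative_id: str
    headline: str
    image_url: str
    network: str
    publisher: str
    supply_chain: Tuple[str, ...]  # ad-tech hops between publisher and advertiser
    landing_page_url: str
    first_seen: date
    last_seen: date

    @property
    def days_live(self) -> int:
        """Whole days between the first and last observation."""
        return (self.last_seen - self.first_seen).days


def parse_record(raw: dict) -> NativeAdRecord:
    """Turn a JSON-style dict into a typed record, parsing ISO dates."""
    return NativeAdRecord(
        creative_id=raw["creative_id"],
        headline=raw["headline"],
        image_url=raw["image_url"],
        network=raw["network"],
        publisher=raw["publisher"],
        supply_chain=tuple(raw.get("supply_chain", [])),
        landing_page_url=raw["landing_page_url"],
        first_seen=date.fromisoformat(raw["first_seen"]),
        last_seen=date.fromisoformat(raw["last_seen"]),
    )
```

Calling `parse_record` on a dict with `first_seen` of `2026-06-01` and `last_seen` of `2026-06-14` gives a record whose `days_live` is 13. Freezing the dataclass keeps records hashable, which helps when you deduplicate or use them as keys.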
That tidy definition hides a lot of engineering. Unlike a search ad or a Meta ad sitting in an official library, a native placement is rendered by a native ad widget, the "Recommended for you" or "Around the web" strip at the bottom of articles. Those widgets are dynamic, geo-targeted, and personalized. That is exactly why a single static scrape misses most of the inventory. A real dataset has to capture across geos and over time, then deduplicate down to a stable creative.

Finance is the single biggest native vertical in the index, with 17,232 creatives, just ahead of insurance (15,629) and health (14,895). The ad above is a textbook example of the genre: a deadline, a vague authority, and a dollar number. It had been running 13 days when we last saw it, which tells you more about its profitability than any focus group could.
The four data layers, and why most tools only give you one
Think of native ad data as four stacked layers. Most legacy tools surface the top layer and stop. The value compounds as you go down.
| Layer | What it captures | Why it matters |
|---|---|---|
| Creative | Full-resolution image, headline, call to action | The actual ad, not a thumbnail. Feeds Copy DNA and visual clustering. |
| Network and publisher | Which native network served it, on which site | Lets you answer "who is buying ads on this site?" and segment by source |
| Supply chain | The ad-tech hops between publisher and advertiser | Reveals resold inventory, intermediaries, and routing |
| Landing page | The pre-lander and final destination of the click | Surfaces the real advertiser and the offer being sold |
The jump from layer two to layer four is the whole game. A placement is served by a network (Taboola, Outbrain, MGID, Revcontent, Teads, Yahoo, MSN) but the network is not the advertiser. The advertiser is whoever sits at the end of the click. Taboola alone accounts for 157,727 creatives in the index, Outbrain another 84,252, MGID 49,689. None of those numbers tell you who is actually buying. Following the redirect chain, server-side and without firing a real click on a live campaign, is what turns a pile of anonymous widget creatives into competitive intelligence you can act on.
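As a rough illustration of that last step, the sketch below takes a redirect chain that has already been recorded (a plain list of URLs, an assumed input shape) and reads the advertiser's host off the final hop. It makes no network requests. The function name `landing_domain` is hypothetical, and stripping a leading `www.` is a simplification: reducing a host to its true registrable domain would need the Public Suffix List.

```python
from typing import List, Optional
from urllib.parse import urlsplit


def landing_domain(redirect_chain: List[str]) -> Optional[str]:
    """Return the host of the final hop in a recorded redirect chain.

    The serving network's click URL is the first hop; the advertiser is
    whoever owns the last one.
    """
    if not redirect_chain:
        return None
    host = urlsplit(redirect_chain[-1]).hostname  # lowercased by urlsplit
    if host is None:
        return None
    # Simplification: only a leading "www." is removed.
    return host[4:] if host.startswith("www.") else host
```

Given a chain that starts at a network click tracker, passes through a pre-lander, and ends at `https://www.advertiser.test/offer`, the function returns `advertiser.test`, which is the key you would join against a competitor table.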
For the mechanics of the routing in between, see the native ad supply chain explained with real traces. For pulling the real buyer out from behind the serving network, read how to identify the ad network behind any ad.
Duration is the column everyone underrates
The most valuable field in a native ad dataset is not the creative. It is the timestamp.
You cannot see anyone's spend in native. You do not need to. A creative that survives across dozens of sites for weeks is being renewed because it converts. Spend leaves a fingerprint, and that fingerprint is duration. In our current index, the longest continuously observed creatives have run for about 28 days straight, the ceiling of our observation window so far. The headlines that hit that ceiling are revealing.

SmartAsset's "How Can I Avoid Paying Taxes on IRA Withdrawals?" sat at the top of the 28-day list. So did a clutch of "What's Your IQ Level?" quiz ads from My IQ on the Microsoft Audience Network, and a hearing-aid ad from Hidden Hearing that ran two different creatives to the same ceiling. When the same advertiser pins multiple variants at maximum duration, that is not luck. That is a profitable offer being fed.
One caution worth stating plainly. The "90-day winner" you hear quoted in media-buying circles is industry lore, not our measurement. What we can actually verify today is continuous runs up to roughly 28 days per creative. Treat the 90-day figure as folklore until your own dataset timestamps it.
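If you want to verify run lengths against your own timestamps, one way to do it is to count the longest streak of consecutive calendar days on which a creative was observed. This is a sketch under one assumption: "continuous" here means at least one observation on every day of the streak, which may not match how any given index defines it.

```python
from datetime import date, timedelta
from typing import Iterable


def longest_run_days(observed: Iterable[date]) -> int:
    """Longest streak of consecutive calendar days with at least one observation."""
    days = sorted(set(observed))  # collapse multiple sightings per day
    if not days:
        return 0
    best = current = 1
    for prev, cur in zip(days, days[1:]):
        # Extend the streak on adjacent days, otherwise start a new one.
        current = current + 1 if cur - prev == timedelta(days=1) else 1
        best = max(best, current)
    return best
```

A creative seen on June 1, 2, and 3, missed on June 4, and seen again on June 5 and 6 scores 3, not 6. That distinction matters when you decide whether a gap means the campaign paused or your capture simply missed a day.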
API vs dashboard: match the tool to the job
There is nothing wrong with a dashboard for ad-hoc research. The two interfaces are built for different jobs, and conflating them is why teams overpay for tools they barely use.
- Use a dashboard when a human is doing exploratory work: eyeballing a competitor's angle, pulling a few creatives for a deck, scanning what is hot this week. This is the classic ad spy tool workflow.
- Use an API when the consumer is a system: a scheduled job, a dbt model, a feature store, a Slack alert, or an AI agent. You want records, not pixels, and you want them on a cron.
The practical tell is repetition. If you find yourself manually exporting the same view every Monday, you do not have a research problem. You have an integration problem, and an API solves it. This is also where the broader discipline of ad intelligence is heading: less manual browsing, more programmatic feeds wired straight into competitive and creative workflows.
A concrete pipeline
Here is how a data team actually wires native ad data into something useful. The pattern holds whether you are tracking a whole vertical or one competitor.
- Define the watchlist. A set of advertisers, landing-page domains, or publisher sites you care about. The API filters on these so you are not paying to ingest the whole internet.
- Pull on a schedule. A nightly job hits the endpoint with `last_seen >= yesterday`, returning only new and still-live creatives. Each record carries a stable ID, so you can upsert without duplicates.
- Join to your own data. Match landing-page domains against your CRM or a competitor table. Now every native creative is attributed to a company you already track.
- Compute longevity and spread. Group by creative ID. Measure days-live (`last_seen` minus `first_seen`) and the count of distinct publishers. Long-running, widely-spread creatives are your winners, the ones worth modeling.
- Act. Push a daily digest of new winners to Slack, feed the creatives into a clustering model, or hand the structured plan to an agent.
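The pull, upsert, and join steps can be sketched with plain dictionaries standing in for the API response and your warehouse table. Everything named here is an assumption for illustration: the record keys (`creative_id`, `landing_domain`, `first_seen`, `last_seen`), the in-memory `store`, and the three helper functions. A real job would swap the dict for a database upsert.

```python
from datetime import date, timedelta


def still_live(records, today):
    """Nightly filter: keep records whose last_seen is yesterday or later."""
    cutoff = (today - timedelta(days=1)).isoformat()
    # ISO 8601 date strings sort in the same order as the dates themselves.
    return [r for r in records if r["last_seen"] >= cutoff]


def upsert(store, records):
    """Insert new creatives and widen the seen-window of known ones.

    Keying on the stable creative ID is what prevents duplicates.
    """
    for r in records:
        row = store.setdefault(r["creative_id"], dict(r))
        row["first_seen"] = min(row["first_seen"], r["first_seen"])
        row["last_seen"] = max(row["last_seen"], r["last_seen"])
    return store


def attribute(store, companies_by_domain):
    """Join each creative to a tracked company via its landing-page domain."""
    return {
        cid: companies_by_domain[row["landing_domain"]]
        for cid, row in store.items()
        if row["landing_domain"] in companies_by_domain
    }
```

Running `upsert` twice with the same creative leaves one row with an extended `last_seen`, so the job is safe to re-run after a failure. Creatives whose landing domain is not on your watchlist simply drop out of `attribute` rather than raising.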

Step four is where native genuinely differs from other channels. A creative like the Nebroo hearing-device ad above, observed for 26 days, is the exact signal your pipeline should flag automatically. Duration is legible only if your dataset timestamps every observation, which is the whole reason 5.4 million observations sit behind those 589,036 creatives. The observations are the time series. The creative is just the key.
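Here is one way the flagging could work when observations arrive as `(creative_id, publisher, date)` rows. The row shape and the thresholds (`min_days=21`, `min_publishers=5`) are arbitrary assumptions for the sketch, not recommended cutoffs. Tune them against your own vertical.

```python
from collections import defaultdict
from datetime import date


def longevity_and_spread(observations):
    """Group observations by creative ID and compute days-live and publisher count.

    observations: iterable of (creative_id, publisher, iso_date) tuples.
    """
    seen = defaultdict(list)
    publishers = defaultdict(set)
    for creative_id, publisher, day in observations:
        seen[creative_id].append(date.fromisoformat(day))
        publishers[creative_id].add(publisher)
    return {
        cid: {
            "days_live": (max(days) - min(days)).days,  # last_seen minus first_seen
            "publishers": len(publishers[cid]),
        }
        for cid, days in seen.items()
    }


def flag_winners(stats, min_days=21, min_publishers=5):
    """Return creative IDs that are both long-running and widely spread."""
    return sorted(
        cid for cid, s in stats.items()
        if s["days_live"] >= min_days and s["publishers"] >= min_publishers
    )
```

Requiring both conditions filters out two kinds of noise: a creative parked on a single site for a month, and a burst that touched many sites for two days and vanished.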
What separates a usable dataset from a noisy one
Volume is the wrong headline metric. Ten million stale, poorly-deduplicated, network-only records are worse than a smaller set that is fresh, attributed, and clean. When you evaluate a native ad dataset or its API, pressure-test it on five things:
- Freshness. Are placements captured live, with first-seen and last-seen dates? Or is it a periodic dump that is already weeks old?
- Creative fidelity. Is the actual image stored at full quality, or a downscaled thumbnail that is useless for visual analysis?
- Attribution depth. Does it stop at the network, or trace the click to the final landing page and the real advertiser?
- Capture method. Does it read public ads without firing real clicks on live campaigns? Synthetic clicks spend other advertisers' money and corrupt their data, and yours.
- Coverage breadth. Does it span the major native networks and capture across geos, or is it one publisher in one country?
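Three of those tests (freshness, attribution depth, and coverage breadth) can be checked numerically on a sample pull before you commit to an integration. The sketch below assumes each sample record is a dict with `last_seen`, `network`, and an optional `landing_page_url`. Those key names and the `audit_sample` helper are hypothetical. Creative fidelity and capture method still need a manual look.

```python
from datetime import date
from statistics import median


def audit_sample(records, today):
    """Summarize freshness, attribution depth, and network coverage of a sample."""
    if not records:
        return {"median_lag_days": None, "attributed_share": 0.0, "networks": 0}
    # Freshness: days since each record was last observed.
    lags = [(today - date.fromisoformat(r["last_seen"])).days for r in records]
    # Attribution depth: share of records that trace to a landing page.
    attributed = sum(1 for r in records if r.get("landing_page_url"))
    return {
        "median_lag_days": median(lags),
        "attributed_share": attributed / len(records),
        "networks": len({r["network"] for r in records}),
    }
```

A high median lag points to a periodic dump rather than live capture, and a low attributed share means the dataset stops at the network layer.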

Notice the detail in that solar-battery ad. At full resolution you can read the offer, the framing, and the visual treatment, which is what feeds visual clustering and copy analysis. A thumbnail throws all of that away. These five tests are also what distinguish genuine ad transparency from a screenshot library with a search box bolted on. Native has historically lagged search and social here precisely because no neutral library existed, which is the backstory behind why a native ad library didn't exist until now.
Why native is harder, and more rewarding, than it looks
Search and social have official, regulated transparency surfaces. Native does not. There is no government-mandated library for programmatic native advertising the way there is for political ads on the big platforms. That gap is exactly why a purpose-built dataset has outsized value. The data is public. It is just nobody's job to organize it.
The reward for solving the capture problem is a feed of competitive truth the advertisers cannot hide, because the ads are running in public for anyone to see. Pair that with programmatic access and you get something the legacy spy tools never offered: native ad data as infrastructure, not as a destination site you log into. For the wider context on how ad libraries, laws, and tools fit together, start with the pillar on what ad transparency is and how to use it, and open ad libraries explained covers the tooling landscape.
How OpenAdLibrary fits
OpenAdLibrary was built data-first. It captures live public native ads across Taboola, Outbrain, MGID, Revcontent, Teads, Yahoo, and MSN, stores the real creative at full quality, classifies the supply chain, and follows each click to the landing page without clicking live ads. The same records that power the dashboard are available over an API and an MCP server, so an analyst, a pipeline, or an AI agent can pull creatives, the real advertiser, supply-chain data, and landing pages as structured records, then route them into Copy DNA, Creative Studio, or Optimize.
The pricing is open and deliberately cheap: $29.99/month, with a free tier to browse 200 ads and no card required, against rivals that run from roughly $80 to $400 a month. For data teams, that means you can validate the dataset against your own watchlist before you write a single line of integration code.
Start free and query the native ad dataset directly. No dashboard required.






