
Native Ad Data API: Programmatic Access to Live Native Creatives

Most native ad intelligence is locked behind a dashboard, so here is how analysts and data teams pull live creatives, the real advertiser, supply-chain hops, and landing pages as structured records over an API.

Diagram of a native ad data API returning creative, advertiser, supply-chain and landing-page records as JSON

If you have ever tried to do real analysis on native advertising, you know the wall. The intelligence exists, but it is trapped behind a dashboard. You can browse one creative at a time, eyeball a handful of advertisers, maybe export a CSV that is already stale by the time it finishes downloading. What you cannot do is treat native ad data the way you treat every other dataset in your stack: query it, join it, schedule it, pipe it into a model.

This guide is for the people who want the data, not the dashboard. Analysts building competitive trackers. Data teams enriching a CRM. Quants backtesting creative angles. And, increasingly, the AI agents doing that work without a human in the loop. The question is plain. How do you get live native creatives, the supply chain behind them, and the landing pages they point to as clean, structured records over an API?

For scale, here is what one neutral index looks like right now: 589,036 creatives, 5.4 million ad observations, 25,933 distinct advertisers, and 926,259 captured landing pages across 42 networks (OpenAdLibrary, June 2026). That is the size of the haystack. The job of an API is to hand you exactly the needles you asked for.

What a native ad data API actually returns

A native ad data API is an endpoint that returns native advertising intelligence as machine-readable records instead of a web page. Each record bundles the creative (image and headline), the publisher and network that served it, the supply-chain hops behind the placement, the landing page the click resolves to, and timing signals like first-seen and last-seen dates. You filter, join, and aggregate it like any database table.
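
To make that definition concrete, here is a minimal sketch of what one such record could look like once it lands in your code. The class name, field names, and sample values are all illustrative assumptions (placeholder `.example` domains, made-up dates), not a published OpenAdLibrary schema.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class NativeAdRecord:
    """Hypothetical record shape; every field name here is an assumption."""
    creative_id: str                 # stable ID for the deduplicated creative
    headline: str
    image_url: str
    network: str                     # serving network, e.g. "taboola"
    publisher: str                   # site where the widget rendered
    supply_chain: tuple[str, ...]    # ad-tech hops behind the placement
    landing_url: str                 # where the click resolves
    first_seen: date
    last_seen: date

    @property
    def days_live(self) -> int:
        # Timing signal derived from the two timestamps.
        return (self.last_seen - self.first_seen).days


# Placeholder data for illustration only.
sample = [
    NativeAdRecord("c1", "Placeholder headline A", "https://cdn.example/a.jpg",
                   "taboola", "news.example", ("ssp.example", "dsp.example"),
                   "https://advertiser-a.example/offer",
                   date(2026, 6, 1), date(2026, 6, 14)),
    NativeAdRecord("c2", "Placeholder headline B", "https://cdn.example/b.jpg",
                   "outbrain", "blog.example", ("ssp.example",),
                   "https://advertiser-b.example/quiz",
                   date(2026, 6, 10), date(2026, 6, 12)),
]

# Filter it like any other table: creatives live for at least a week.
long_running = [r.creative_id for r in sample if r.days_live >= 7]
```

Once the record is a typed object rather than a web page, the filter on the last line is the entire "query".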

That tidy definition hides a lot of engineering. Unlike a search ad or a Meta ad sitting in an official library, a native placement is rendered by a native ad widget, the "Recommended for you" or "Around the web" strip at the bottom of articles. Those widgets are dynamic, geo-targeted, and personalized. That is exactly why a single static scrape misses most of the inventory. A real dataset has to capture across geos and over time, then deduplicate down to a stable creative.
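
The deduplication step can be sketched roughly as follows. This is one possible approach, not a description of how any particular index does it: it assumes each raw observation already carries a hash of the image bytes (the hypothetical `image_sha256` field) and treats headlines that differ only in case or whitespace as the same creative.

```python
import hashlib
import re


def creative_fingerprint(headline: str, image_sha256: str) -> str:
    """Build a stable key from a normalized headline plus an image hash."""
    normalized = re.sub(r"\s+", " ", headline).strip().casefold()
    digest = hashlib.sha256(f"{normalized}|{image_sha256}".encode("utf-8"))
    return digest.hexdigest()[:16]


def deduplicate(observations):
    """Collapse raw observations (many geos, many days) into stable creatives."""
    creatives = {}
    for obs in observations:
        key = creative_fingerprint(obs["headline"], obs["image_sha256"])
        entry = creatives.setdefault(
            key, {"headline": obs["headline"], "geos": set(), "observations": 0}
        )
        entry["geos"].add(obs["geo"])
        entry["observations"] += 1
    return creatives


# Placeholder observations: the first two are the same ad seen in two geos.
observations = [
    {"headline": "Placeholder  Headline One", "image_sha256": "aaa", "geo": "US"},
    {"headline": "placeholder headline one ", "image_sha256": "aaa", "geo": "DE"},
    {"headline": "Placeholder Headline One", "image_sha256": "bbb", "geo": "US"},
]
creatives = deduplicate(observations)
```

An exact image hash is the simplest choice and will miss re-encoded or resized copies of the same image; a perceptual hash would be the natural next step if that matters for your use case.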

Taboola native finance ad: IRS forgives millions by June 30 tax deadline
Caption: A live Taboola finance creative from Fresh Start Information, captured by OpenAdLibrary, running 13 days as of June 2026.

Finance is the single biggest native vertical in the index, with 17,232 creatives, just ahead of insurance (15,629) and health (14,895). The ad above is a textbook example of the genre: a deadline, a vague authority, and a dollar number. It had been running 13 days when we last saw it, which tells you more about its profitability than any focus group could.

The four data layers, and why most tools only give you one

Think of native ad data as four stacked layers. Most legacy tools surface the top layer and stop. The value compounds as you go down.

Layer | What it captures | Why it matters
Creative | Full-resolution image, headline, call to action | The actual ad, not a thumbnail. Feeds Copy DNA and visual clustering.
Network and publisher | Which native network served it, on which site | Lets you answer "who is buying ads on this site?" and segment by source.
Supply chain | The ad-tech hops between publisher and advertiser | Reveals resold inventory, intermediaries, and routing.
Landing page | The pre-lander and final destination of the click | Surfaces the real advertiser and the offer being sold.

The jump from layer two to layer four is the whole game. A placement is served by a network (Taboola, Outbrain, MGID, Revcontent, Teads, Yahoo, MSN) but the network is not the advertiser. The advertiser is whoever sits at the end of the click. Taboola alone accounts for 157,727 creatives in the index, Outbrain another 84,252, MGID 49,689. None of those numbers tell you who is actually buying. Following the redirect chain, server-side and without firing a real click on a live campaign, is what turns a pile of anonymous widget creatives into competitive intelligence you can act on.
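
As a rough illustration of that last step, suppose each record already carries the ordered list of URLs the click passed through, captured upstream. The `redirect_chain` field and the record layout are assumptions for this sketch, and the code makes no network requests. Attribution then reduces to reading the last hop:

```python
from urllib.parse import urlsplit


def host_of(url: str) -> str:
    """Lowercased hostname with a leading 'www.' stripped."""
    host = urlsplit(url).hostname or ""
    return host[4:] if host.startswith("www.") else host


def attribute(record: dict) -> dict:
    """Separate the serving network from whoever sits at the end of the click."""
    chain = record["redirect_chain"]
    if not chain:
        raise ValueError("record has no captured redirect chain")
    hops = [host_of(u) for u in chain]
    advertiser = hops[-1]
    intermediaries = []
    for hop in hops[:-1]:
        # Skip the advertiser's own domain and consecutive repeats.
        if hop != advertiser and (not intermediaries or intermediaries[-1] != hop):
            intermediaries.append(hop)
    return {
        "creative_id": record["creative_id"],
        "network": record["network"],
        "advertiser_domain": advertiser,
        "intermediaries": intermediaries,
    }


# Placeholder record; all domains are made up.
result = attribute({
    "creative_id": "c1",
    "network": "taboola",
    "redirect_chain": [
        "https://click.network.example/r?id=1",
        "https://track.prelander.example/go",
        "https://www.advertiser.example/offer?x=1",
    ],
})
```

Grouping by `advertiser_domain` instead of `network` is what turns per-network creative counts into per-buyer ones. A production version would likely reduce hostnames to registrable domains using the Public Suffix List, which this sketch skips.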

For the mechanics of the routing in between, see the native ad supply chain explained with real traces. For pulling the real buyer out from behind the serving network, read how to identify the ad network behind any ad.

Duration is the column everyone underrates

The most valuable field in a native ad dataset is not the creative. It is the timestamp.

You cannot see anyone's spend in native. You do not need to. A creative that survives across dozens of sites for weeks is being renewed because it converts. Spend leaves a fingerprint, and that fingerprint is duration. In our current index, the longest continuously observed creatives have run for about 28 days straight, the ceiling of our observation window so far. The headlines that hit that ceiling are revealing.

Outbrain finance native ad from SmartAsset about avoiding taxes on IRA withdrawals
Caption: SmartAsset's IRA-withdrawal angle on Outbrain, observed running 28 consecutive days, OpenAdLibrary, June 2026.

SmartAsset's "How Can I Avoid Paying Taxes on IRA Withdrawals?" sat at the top of the 28-day list. So did a clutch of "What's Your IQ Level?" quiz ads from My IQ on the Microsoft Audience Network, and a hearing-aid ad from Hidden Hearing that ran two different creatives to the same ceiling. When the same advertiser pins multiple variants at maximum duration, that is not luck. That is a profitable offer being fed.

One caution worth stating plainly. The "90-day winner" you hear quoted in media-buying circles is industry lore, not our measurement. What we can actually verify today is continuous runs up to roughly 28 days per creative. Treat the 90-day figure as folklore until your own dataset timestamps it.
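
If you want to verify run lengths against your own data, a minimal sketch looks like this. It assumes you have, for each creative, the list of calendar days on which it was observed; the function name and the sample dates are illustrative.

```python
from datetime import date, timedelta


def longest_continuous_run(observed_days) -> int:
    """Length, in days, of the longest streak of consecutive observed days."""
    days = sorted(set(observed_days))   # several sightings per day count once
    if not days:
        return 0
    best = current = 1
    for prev, cur in zip(days, days[1:]):
        current = current + 1 if cur - prev == timedelta(days=1) else 1
        best = max(best, current)
    return best


# Placeholder dates: seen June 1-3, missed June 4, seen June 5-6.
run = longest_continuous_run([
    date(2026, 6, 1), date(2026, 6, 2), date(2026, 6, 3),
    date(2026, 6, 5), date(2026, 6, 6), date(2026, 6, 2),
])
```

Note that a streak can never exceed the length of your observation window, which is why a ceiling in the data should be read as "at least this long", not "exactly this long".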

API vs dashboard: match the tool to the job

There is nothing wrong with a dashboard for ad-hoc research. The two interfaces are built for different jobs, and conflating them is why teams overpay for tools they barely use.

  • Use a dashboard when a human is doing exploratory work: eyeballing a competitor's angle, pulling a few creatives for a deck, scanning what is hot this week. This is the classic ad spy tool workflow.
  • Use an API when the consumer is a system: a scheduled job, a dbt model, a feature store, a Slack alert, or an AI agent. You want records, not pixels, and you want them on a cron.

The practical tell is repetition. If you find yourself manually exporting the same view every Monday, you do not have a research problem. You have an integration problem, and an API solves it. This is also where the broader discipline of ad intelligence is heading: less manual browsing, more programmatic feeds wired straight into competitive and creative workflows.

A concrete pipeline

Here is how a data team actually wires native ad data into something useful. The pattern holds whether you are tracking a whole vertical or one competitor.

  1. Define the watchlist. A set of advertisers, landing-page domains, or publisher sites you care about. The API filters on these so you are not paying to ingest the whole internet.
  2. Pull on a schedule. A nightly job hits the endpoint with last_seen >= yesterday, returning only new and still-live creatives. Each record carries a stable ID, so you can upsert without duplicates.
  3. Join to your own data. Match landing-page domains against your CRM or a competitor table. Now every native creative is attributed to a company you already track.
  4. Compute longevity and spread. Group by creative ID. Measure days-live (last_seen minus first_seen) and the count of distinct publishers. Long-running, widely spread creatives are your winners, the ones worth modeling.
  5. Act. Push a daily digest of new winners to Slack, feed the creatives into a clustering model, or hand the structured plan to an agent.
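
Steps one through four can be sketched in plain Python as below. The record fields, the in-memory `store`, the watchlist domains, and the winner thresholds (14 days, 3 publishers) are all assumptions chosen for illustration. A real job would fetch batches from the API and persist to a database instead.

```python
from datetime import date

# Step 1: hypothetical watchlist of landing-page domains.
WATCHLIST = {"advertiser-a.example", "advertiser-b.example"}


def upsert(store: dict, batch: list) -> None:
    """Step 2: merge a nightly batch, keyed on the stable creative ID."""
    for rec in batch:
        if rec["landing_domain"] not in WATCHLIST:
            continue
        row = store.setdefault(rec["creative_id"], {
            "landing_domain": rec["landing_domain"],
            "first_seen": rec["first_seen"],
            "last_seen": rec["last_seen"],
            "publishers": set(),
        })
        row["first_seen"] = min(row["first_seen"], rec["first_seen"])
        row["last_seen"] = max(row["last_seen"], rec["last_seen"])
        row["publishers"].add(rec["publisher"])


def winners(store: dict, crm: dict, min_days: int = 14, min_publishers: int = 3):
    """Steps 3 and 4: join to your own table, then rank by longevity and spread."""
    out = []
    for creative_id, row in store.items():
        days_live = (row["last_seen"] - row["first_seen"]).days
        spread = len(row["publishers"])
        if days_live >= min_days and spread >= min_publishers:
            out.append({
                "creative_id": creative_id,
                "company": crm.get(row["landing_domain"], "unknown"),
                "days_live": days_live,
                "publishers": spread,
            })
    return sorted(out, key=lambda w: (-w["days_live"], -w["publishers"]))


def rec(cid, pub, dom, first, last):
    return {"creative_id": cid, "publisher": pub, "landing_domain": dom,
            "first_seen": first, "last_seen": last}


# Placeholder batches from two consecutive nights.
store = {}
upsert(store, [
    rec("c1", "p1.example", "advertiser-a.example", date(2026, 6, 1), date(2026, 6, 10)),
    rec("c1", "p2.example", "advertiser-a.example", date(2026, 6, 3), date(2026, 6, 12)),
    rec("c2", "p1.example", "advertiser-b.example", date(2026, 6, 10), date(2026, 6, 12)),
    rec("c3", "p1.example", "not-watched.example", date(2026, 6, 1), date(2026, 6, 30)),
])
upsert(store, [
    rec("c1", "p3.example", "advertiser-a.example", date(2026, 6, 5), date(2026, 6, 20)),
    rec("c1", "p1.example", "advertiser-a.example", date(2026, 6, 1), date(2026, 6, 20)),
])
digest = winners(store, crm={"advertiser-a.example": "Company A"})
```

Step five is then whatever you do with `digest`: post it to Slack, write it to a table, or pass it to a downstream model.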

Taboola health native ad about Americans ditching hearing aids for a new device
Caption: A 26-day Taboola health creative from Nebroo, the kind of long-runner step four is designed to surface, OpenAdLibrary, June 2026.

Step four is where native genuinely differs from other channels. A creative like the Nebroo hearing-device ad above, observed for 26 days, is the exact signal your pipeline should flag automatically. Duration is legible only if your dataset timestamps every observation, which is the whole reason 5.4 million observations sit behind those 589,036 creatives. The observations are the time series. The creative is just the key.

What separates a usable dataset from a noisy one

Volume is the wrong headline metric. Ten million stale, poorly deduplicated, network-only records are worse than a smaller set that is fresh, attributed, and clean. When you evaluate a native ad dataset or its API, pressure-test it on five things:

  • Freshness. Are placements captured live, with first-seen and last-seen dates? Or is it a periodic dump that is already weeks old?
  • Creative fidelity. Is the actual image stored at full quality, or a downscaled thumbnail that is useless for visual analysis?
  • Attribution depth. Does it stop at the network, or trace the click to the final landing page and the real advertiser?
  • Capture method. Does it read public ads without firing real clicks on live campaigns? Synthetic clicks spend other advertisers' money and corrupt their data, and yours.
  • Coverage breadth. Does it span the major native networks and capture across geos, or is it one publisher in one country?
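
Three of those tests (freshness, attribution depth, and coverage breadth) can be roughly quantified on a sample of records before you commit to a provider. The sketch below assumes hypothetical field names (`first_seen`, `last_seen`, `landing_url`, `network`, `geo`). Creative fidelity and capture method cannot be read off the records themselves and still need a manual check.

```python
from datetime import date
from statistics import median


def audit_sample(records: list, today: date) -> dict:
    """Summarize freshness, attribution depth, and coverage for a sample."""
    n = len(records)
    if n == 0:
        raise ValueError("empty sample")
    timestamped = [r for r in records if r.get("first_seen") and r.get("last_seen")]
    staleness = [(today - r["last_seen"]).days for r in timestamped]
    return {
        "timestamped_share": len(timestamped) / n,
        "median_staleness_days": median(staleness) if staleness else None,
        "attributed_share": sum(1 for r in records if r.get("landing_url")) / n,
        "networks": len({r["network"] for r in records if r.get("network")}),
        "geos": len({r["geo"] for r in records if r.get("geo")}),
    }


# Placeholder sample of four records.
report = audit_sample([
    {"network": "a", "geo": "US", "landing_url": "https://x.example/",
     "first_seen": date(2026, 6, 1), "last_seen": date(2026, 6, 29)},
    {"network": "a", "geo": "DE", "landing_url": None,
     "first_seen": date(2026, 6, 1), "last_seen": date(2026, 6, 27)},
    {"network": "b", "geo": "US", "landing_url": "https://y.example/",
     "first_seen": None, "last_seen": None},
    {"network": "b", "geo": "US", "landing_url": "https://z.example/",
     "first_seen": date(2026, 6, 10), "last_seen": date(2026, 6, 20)},
], today=date(2026, 6, 30))
```

What counts as a passing score is a judgment call for your own use case; the point is to compare providers on the same sample and the same numbers.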

Taboola home and garden native ad about solar home batteries
Caption: A 27-day Taboola home-and-garden creative from Solar Battery Subsidy, captured at full resolution by OpenAdLibrary, June 2026.

Notice the detail in that solar-battery ad. At full resolution you can read the offer, the framing, and the visual treatment, which is what feeds visual clustering and copy analysis. A thumbnail throws all of that away. These five tests are also what distinguish genuine ad transparency from a screenshot library with a search box bolted on. Native has historically lagged search and social here precisely because no neutral library existed, which is the backstory behind why a native ad library didn't exist until now.

Why native is harder, and more rewarding, than it looks

Search and social have official, regulated transparency surfaces. Native does not. There is no government-mandated library for programmatic native advertising the way there is for political ads on the big platforms. That gap is exactly why a purpose-built dataset has outsized value. The data is public. It is just nobody's job to organize it.

The reward for solving the capture problem is a feed of competitive truth the advertisers cannot hide, because the ads are running in public for anyone to see. Pair that with programmatic access and you get something the legacy spy tools never offered: native ad data as infrastructure, not as a destination site you log into. For the wider context on how ad libraries, laws, and tools fit together, start with the pillar on what ad transparency is and how to use it, and open ad libraries explained covers the tooling landscape.

How OpenAdLibrary fits

OpenAdLibrary was built data-first. It captures live public native ads across Taboola, Outbrain, MGID, Revcontent, Teads, Yahoo, and MSN, stores the real creative at full quality, classifies the supply chain, and follows each click to the landing page without clicking live ads. The same records that power the dashboard are available over an API and an MCP server, so an analyst, a pipeline, or an AI agent can pull creatives, the real advertiser, supply-chain data, and landing pages as structured records, then route them into Copy DNA, Creative Studio, or Optimize.

The pricing is open and deliberately cheap: $29.99/month, with a free tier to browse 200 ads and no card required, against rivals that run from roughly $80 to $400 a month. For data teams, that means you can validate the dataset against your own watchlist before you write a single line of integration code.

Start free and query the native ad dataset directly. No dashboard required.

Frequently asked questions

What is a native ad dataset?
A native ad dataset is a structured collection of native ad records in which every field is queryable. Each record captures the creative image and headline, the publisher and network that served it, the supply-chain hops behind the placement, and where the click resolves. Unlike a screenshot gallery, you can filter, join, and aggregate it like any other table, which is what lets you measure things like how long a creative has run.

How is a native ad data API different from a dashboard ad spy tool?
A native ad data API returns the same intelligence as machine-readable records, while a dashboard is built for a human browsing one ad at a time. With an API you can pull thousands of creatives, join them to your own data, schedule refreshes, and feed downstream models or AI agents without anyone clicking through a UI. The practical tell that you need one: you keep exporting the same view by hand every week.

Does pulling native ad data require clicking on live ads?
No. A well-built native ad dataset captures public ads from publisher pages and follows each click server-side to record the landing page, without firing real impressions or clicks on live campaigns. That keeps the data clean and avoids spending other advertisers' budgets or polluting their analytics, which is also why capture method is one of the first things to vet in any provider.

Can I get the real advertiser behind each native ad, not just the network?
Yes, as long as the dataset traces the full click path to the final landing page. The placement is served by a network like Taboola or Outbrain, but the click resolves through redirects to a destination tied to the actual advertiser. Following that chain is what separates useful native ad data from a list of anonymous widget creatives labeled only by serving network.

How fresh does native ad data need to be?
For competitive work, freshness beats raw volume, because native creatives rotate constantly and a stale dump misses what is live right now. A dataset that captures live placements and timestamps first-seen and last-seen dates lets you measure how long an ad has run, the strongest public proxy for whether it is profitable. In our index the longest continuous runs observed so far reach about 28 days per creative.

Written by The OpenAdLibrary Team
Ad intelligence & native advertising research

We build OpenAdLibrary, the open ad-transparency platform. Every day our systems capture live native ads across Taboola, Outbrain, MGID, Revcontent, Teads, Yahoo and MSN, identify the real advertiser behind each one, and follow the click to its landing page. These guides distill what we see in that data so you can research the market faster.