The Enrichment Waterfall Explained: How RevOps Layers Clay, Apollo and a Finder Without Paying Twice

By Marcus Webb, Tools & Automation. Last updated: 2026-05-30

You are paying Apollo, a finder, and a second database to look up the same contact, then keeping whichever answer arrived first anyway.
Match rates look great in a vendor demo and crater on your actual ICP list, so you cannot tell which source earns its price.
Enriched data piles up in a sheet and never reaches the tool that sends, so the spend produces no meetings.
Two vendors return different emails or disagree on "verified," and nobody owns the tiebreak.

What is an enrichment waterfall, in plain terms?

An enrichment waterfall is a routing rule that sends each lead through your data vendors one at a time, in a priority order you set, and stops the moment one returns a verified field. Instead of querying Clay, Apollo, and a phone finder in parallel and paying all three for the same person, the waterfall asks the cheapest or highest-coverage source first, accepts its answer, and only spends a credit on the next vendor when the previous one comes back empty. The name describes the flow: a record falls from tier to tier until it lands a match.
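
As a minimal sketch of that routing rule, assume each vendor can be wrapped in a function that returns a verified value or None. The source names and in-memory lookups below are hypothetical stand-ins, not real vendor APIs.

```python
from typing import Callable, Optional

# A provider is a (name, lookup) pair. lookup returns a verified value or None.
Provider = tuple[str, Callable[[dict], Optional[str]]]

def run_waterfall(lead: dict, providers: list[Provider]):
    """Return (value, winning_tier, lookups_made) for one lead."""
    lookups = 0
    for name, lookup in providers:
        lookups += 1
        value = lookup(lead)
        if value is not None:
            # Stop on the first verified hit; later tiers never run.
            return value, name, lookups
    return None, None, lookups

# Hypothetical toy data standing in for three vendors.
SOURCE_A = {"ana": "ana@acme.example"}
SOURCE_B = {"ana": "ana@acme.example", "bo": "bo@beta.example"}
SOURCE_C = {"cy": "cy@gamma.example"}

providers = [
    ("source_a", lambda lead: SOURCE_A.get(lead["id"])),
    ("source_b", lambda lead: SOURCE_B.get(lead["id"])),
    ("source_c", lambda lead: SOURCE_C.get(lead["id"])),
]

results = {
    lead_id: run_waterfall({"id": lead_id}, providers)
    for lead_id in ["ana", "bo", "cy", "dee"]
}
```

Here "ana" is resolved after one lookup even though a second source also knows her, while "dee" falls through all three tiers and comes back empty.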

The unit of the whole system is one verified field per record. A waterfall for work email might run Source A, then Source B, then Source C, and the average cost per enriched record becomes the blended price of "whoever matched first," not the sum of three lookups. One provider is rarely enough because B2B databases overlap on the easy contacts and diverge on the long tail, so you need the chain to reach the rows a single source would miss. On a 10,000-row list where the top source already covers 60% of your ICP, about 6,000 rows never touch the more expensive tiers at all.
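
The blended-cost arithmetic can be sketched as follows. This assumes every lookup attempt is billed whether or not it matches, and the prices and match rates are invented for illustration. Each match rate is measured on the rows that actually reach that tier, not on the full list.

```python
def blended_cost(tiers):
    """tiers: ordered list of (price_per_lookup, match_rate_on_rows_reaching_it)."""
    reach = 1.0   # share of input rows that reach the current tier
    spend = 0.0   # expected spend per input row
    for price, match_rate in tiers:
        spend += reach * price
        reach *= 1.0 - match_rate   # only the misses fall to the next tier
    filled = 1.0 - reach            # share of rows that ended with a match
    per_enriched = spend / filled if filled > 0 else float("inf")
    return spend, filled, per_enriched

# Hypothetical tiers: cheap broad source, deeper database, premium specialist.
TIERS = [(0.02, 0.60), (0.05, 0.50), (0.20, 0.25)]

spend_per_row, fill_rate, cost_per_enriched = blended_cost(TIERS)

# Parallel enrichment pays every vendor for every row.
parallel_per_row = sum(price for price, _ in TIERS)
```

With these made-up numbers the waterfall spends about 0.08 per input row against 0.27 for the parallel version, because only 40% of rows reach the second tier and 20% reach the third.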

How do you choose the fallback order across providers?

You rank providers by coverage and cost, putting the cheapest reliable source first and the premium specialist last. Everything the first tier catches never reaches a pricier vendor, so the source that wins the most rows cheaply belongs at the top, and the most expensive finder should only ever see the leftovers. The fallback order is the single decision that determines your blended cost per record.

The catch is that coverage is list-specific. A database that is excellent for North American enterprise can be thin for European founders or a niche vertical, so a vendor's recommended sequence is the wrong default. Run a 200-300 row sample of your real ICP through each candidate vendor in isolation, record the verified-match rate per source, then order the live waterfall by what the sample showed. This is also where email and phone split: phone numbers carry a much higher cost per match, so phone finders almost always sit last and run on the shortest possible survivor list. For the full pricing math, our companion piece on enrichment waterfall cost control breaks down per-field blended cost and credit budgeting.
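
One way to turn the sample into an order is to rank vendors by price per verified match, cheapest first. The sketch below is a heuristic rather than a guaranteed optimum, since it ignores how much the vendors overlap. The vendor names, prices, and sample counts are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class SampleResult:
    vendor: str
    rows_tested: int
    verified_matches: int
    price_per_lookup: float

    @property
    def match_rate(self) -> float:
        return self.verified_matches / self.rows_tested

    @property
    def cost_per_match(self) -> float:
        # A vendor with zero matches on the sample is infinitely expensive.
        if self.verified_matches == 0:
            return float("inf")
        return self.price_per_lookup / self.match_rate

# Hypothetical results from running the same 250-row ICP sample through
# each vendor in isolation.
sample = [
    SampleResult("vendor_z", 250, 50, 0.25),
    SampleResult("vendor_x", 250, 150, 0.03),
    SampleResult("vendor_y", 250, 100, 0.08),
]

# Cheapest cost per verified match goes first, the priciest specialist last.
ranked = sorted(sample, key=lambda r: r.cost_per_match)
order = [r.vendor for r in ranked]
```

Re-running the same ranking on a fresh sample each quarter is a cheap way to check whether the live order still matches what the vendors actually return on your list.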

How do you conserve credits as the waterfall runs?

You conserve credits with three rules: stop on the first verified hit, skip records you already have, and cap retries on the dead ones. Stop-on-hit is the core mechanism, because every later tier is gated on the previous tier returning empty, so a row that matches at Tier 1 never executes the Tier 2 or Tier 3 columns and never bills for them. A flat "enrich with all sources" workflow is the most expensive way to fill a field: you buy three answers and use one.

The second saving is hygiene. Before the waterfall fires, suppress rows that already carry a verified value in your CRM, since re-enriching a known contact is pure waste. The third is a retry cap: a record that fails every tier is a genuine miss, and re-running the chain on it next week rarely changes the answer, so flag it as exhausted rather than paying again. Together these three turn a parallel spend into a sequential one and reserve premium credits for the rows where coverage actually thins out.
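
The suppression and retry-cap rules can be sketched as a pre-filter that runs before any credit is spent. The field names (`verified_email`, `failed_runs`) and the cap of one full pass are assumptions for illustration, not a CRM schema.

```python
def plan_run(rows, max_failed_runs=1):
    """Split rows into those worth enriching, those already known, and dead ones."""
    to_enrich, suppressed, exhausted = [], [], []
    for row in rows:
        if row.get("verified_email"):
            # Already verified in the CRM: re-enriching is pure waste.
            suppressed.append(row["id"])
        elif row.get("failed_runs", 0) >= max_failed_runs:
            # Failed every tier before: flag as exhausted instead of paying again.
            exhausted.append(row["id"])
        else:
            to_enrich.append(row["id"])
    return to_enrich, suppressed, exhausted

# Hypothetical CRM export.
crm_rows = [
    {"id": 1, "verified_email": "ana@acme.example"},
    {"id": 2, "verified_email": None, "failed_runs": 0},
    {"id": 3, "verified_email": None, "failed_runs": 1},
    {"id": 4},
]

to_enrich, suppressed, exhausted = plan_run(crm_rows)
```

Only rows 2 and 4 go into the waterfall here. A team that wants to retry misses after a long interval could raise `max_failed_runs` or reset the counter on a schedule.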

How do you dedup between Clay, Apollo, and a finder?

You dedup by choosing a canonical key (usually LinkedIn URL or company-domain-plus-name), normalizing every provider's output to it, and resolving conflicts with a documented tiebreak rule. Different vendors format the same person differently, so without a single match key you end up with three near-duplicate rows instead of one enriched record. Normalize first, then merge.

The conflict rule is what makes the merge deterministic. When two sources return different emails, or one says "valid" and another says "risky," set one source as the authority for that field and document it, so the same input always produces the same record. Treat any catch-all or low-confidence return as a miss and pass it to the next tier rather than accepting it, because a guessed address that bounces costs you sender reputation downstream, which is far more expensive than the next credit. A final formula column should pick the first verified value and flag which tier supplied it, so you can audit cost per source later.
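
A minimal sketch of normalize-then-merge, assuming the LinkedIn URL is the canonical key and that each vendor returns a status string. The source names, the authority order, and the status labels are hypothetical.

```python
from urllib.parse import urlsplit

EMAIL_AUTHORITY = ["source_a", "source_b", "source_c"]  # documented tiebreak order
ACCEPTED_STATUSES = {"valid"}  # catch-all and risky returns count as misses

def canonical_key(linkedin_url: str) -> str:
    """Normalize a LinkedIn URL so every vendor's format maps to one key."""
    url = linkedin_url.strip().lower()
    if "://" not in url:
        url = "https://" + url
    parts = urlsplit(url)
    host = parts.netloc.removeprefix("www.")
    return host + parts.path.rstrip("/")   # drops query strings and trailing slash

def merge(records):
    """Keep the first accepted email per person, in authority order."""
    rank = {source: i for i, source in enumerate(EMAIL_AUTHORITY)}
    merged = {}
    for rec in sorted(records, key=lambda r: rank[r["source"]]):
        entry = merged.setdefault(
            canonical_key(rec["linkedin_url"]), {"email": None, "tier": None}
        )
        if entry["email"] is None and rec["status"] in ACCEPTED_STATUSES:
            entry["email"] = rec["email"]
            entry["tier"] = rec["source"]   # audit flag: which tier supplied it
    return merged

records = [
    {"source": "source_b", "linkedin_url": "linkedin.com/in/ana-diaz",
     "email": "ana.diaz@acme.example", "status": "valid"},
    {"source": "source_a", "linkedin_url": "HTTPS://www.LinkedIn.com/in/Ana-Diaz/?utm=x",
     "email": "adiaz@acme.example", "status": "valid"},
    {"source": "source_a", "linkedin_url": "https://linkedin.com/in/bo-lee/",
     "email": "bo@beta.example", "status": "catch_all"},
    {"source": "source_c", "linkedin_url": "https://www.linkedin.com/in/bo-lee",
     "email": "b.lee@beta.example", "status": "valid"},
]

merged = merge(records)
```

Because the records are sorted by authority before merging, the same input always yields the same output regardless of which vendor responded first.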

What does clean enriched data unlock downstream?

Clean enriched data unlocks accurate segmentation and decision-maker targeting, which is the entire point of paying for enrichment in the first place. A deduped, verified list lets you filter to the seniority, function, and account fit that matter, instead of spraying a noisy sheet. Reachium's lead universe shows the scale of that signal: of 1,889,156 B2B leads, 20.5% are flagged decision-makers (542k C-suite, 98k founders), which is exactly the segment a clean waterfall is built to surface.

The payoff lands when that list is activated, not when it sits in storage. On LinkedIn specifically, you are rate-limited by the platform, not your data budget. Across 316,703 LinkedIn outreach sequences run on the verified API, Reachium's data shows a 28% average connection acceptance rate. It also shows a volume tax: the acceptance rate peaked at 34% for accounts sending 10-19 invites a day and fell to 30.6% at 20-29 a day, so each additional invite was less likely to be accepted. When every send counts, every send should target a correctly enriched, well-fit person. The same logic carries into content, where a tightly enriched audience makes a comment-to-DM lead-magnet funnel far more efficient, and into account selection, where the TAM, SAM, SOM framework keeps the waterfall pointed at a market you can win.

How do you measure waterfall quality over time?

You measure a waterfall on three numbers: fill rate (the share of rows that ended with a verified value), cost per enriched record (your blended spend divided by matches), and accuracy (bounce rate on emails, connect rate on phones). Fill rate tells you whether the chain is deep enough, cost per record tells you whether the order is optimal, and accuracy tells you whether a tier is quietly selling you junk. Track all three per tier, not just in aggregate.

The per-tier flag column from your dedup step is what makes this measurable. It answers the only question that justifies a third vendor: how many records did this tier uniquely rescue, and at what price each? If your most expensive specialist rescues 30 rows on a 10,000-row run, its real cost per match is brutal and the tier should probably go. Review these numbers quarterly and re-sample your ICP, because vendor coverage drifts and the order that was optimal last quarter rarely stays optimal.
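
The per-tier review can be sketched from the flag column alone. This assumes every lookup attempt is billed and uses invented prices. The 30-row rescue on a 10,000-row run mirrors the example above, and the other counts are hypothetical.

```python
from collections import Counter

def tier_report(winning_tiers, prices):
    """winning_tiers: one flag per row (tier name or None).
    prices: dict of tier -> price per lookup, in waterfall order."""
    wins = Counter(winning_tiers)
    remaining = len(winning_tiers)   # rows that reach the current tier
    total_spend = 0.0
    report = {}
    for tier, price in prices.items():
        rescued = wins.get(tier, 0)
        spend = remaining * price
        report[tier] = {
            "lookups": remaining,
            "rescued": rescued,
            "spend": spend,
            "cost_per_match": spend / rescued if rescued else float("inf"),
        }
        total_spend += spend
        remaining -= rescued
    filled = len(winning_tiers) - remaining
    summary = {
        "fill_rate": filled / len(winning_tiers),
        "cost_per_enriched": total_spend / filled if filled else float("inf"),
    }
    return report, summary

# Hypothetical flag column from a 10,000-row run.
flags = ["tier_1"] * 6000 + ["tier_2"] * 2000 + ["tier_3"] * 30 + [None] * 1970
PRICES = {"tier_1": 0.02, "tier_2": 0.05, "tier_3": 0.50}

report, summary = tier_report(flags, PRICES)
```

Under these assumed prices the third tier runs 2,000 lookups to rescue 30 rows, which works out to roughly 33 per match against well under 0.20 for the run as a whole. Accuracy (bounce and connect rates) has to come from the sending tools and is not covered by this sketch.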

FAQ

What is the difference between a waterfall and parallel enrichment?

Parallel enrichment queries every vendor at once and keeps the first answer, so you pay all of them for one usable result. A waterfall queries vendors in sequence and only runs the next one when the previous one returns nothing, so you pay for one match per record.

Does Clay charge me for every tier in the waterfall?

Clay charges per enrichment run, so the savings come from gating later columns with a conditional that runs only on empty rows. If a row matches at Tier 1, the Tier 2 and Tier 3 columns never execute and never bill.

How many enrichment vendors should a waterfall have?

Most teams get the bulk of the benefit from two to three tiers: a broad cheap source, a deeper database, and a specialist or verification pass for the survivors. Add a tier only when its flag column shows it uniquely rescues enough records to justify its cost per match.

Should phone numbers be in the same waterfall as emails?

Keep them as separate waterfalls because phone match rates and costs differ sharply from email. Phone finders carry a high cost per match, so they should sit last and run on the shortest possible list of already-qualified records.

Does better enrichment increase LinkedIn outreach results?

It increases efficiency rather than raw output, because LinkedIn caps daily volume regardless of your data budget. Reachium's data shows acceptance peaks at 10-19 invites a day, so spending those capped sends on accurately enriched, well-fit contacts is what moves results.
