Surfacing and using CEX onchain data
Unwrapping the talk @ETHCC by @hildobby from @dragonfly_xyz, aka the main source of datasets and dashboards available on @Dune
Critical insights *AND* a bonus at the end of this thread 🧶

Why this talk? CEX data is undersurfaced onchain. It's hard to know or tell what's going on inside a CEX from the onchain perspective, where everything is supposedly accessible and open. Hildobby is trying to connect the dots (addresses and tokens) and maintain the dataset.

Apart from security, proof of reserves and so on, CEX onchain data is actually core to many other datasets. Notably, staking and ETF flows are very valuable especially in major market swings.

Instead of using reported addresses, Hildobby is tracking the data onchain directly by tagging and automated detection of addresses. A few heuristics like "@BlackRock will use @CoinbaseInsto as custodian" allow us to triangulate who's doing what.

The tagging approach is multifold. It starts with understanding how CEXs work onchain. Patterns such as gas funding or back-and-forth transfers between addresses allow behaviour predictions on top of onchain address graphs.

For example, let's zoom in on tokens: moving a token out of a deposit address costs gas, so the deposit address has to be gas funded first whenever the CEX is not using dedicated smart accounts to manage consolidation.
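A toy sketch of that gas-funding pattern (not hildobby's actual pipeline): flag an address as a candidate deposit address if a known gas supplier sent it ETH and it forwarded tokens to a known hot wallet. The `Transfer` record, the function name and all addresses are made up for illustration, and the ordering of the transfers is ignored for simplicity.

```python
from typing import NamedTuple

class Transfer(NamedTuple):
    sender: str
    receiver: str
    asset: str  # "ETH" for the native asset, otherwise a token symbol

def candidate_deposit_addresses(transfers, gas_suppliers, hot_wallets):
    """Return addresses that were gas funded AND swept tokens to a hot wallet."""
    gas_funded = set()  # received ETH from a known gas supplier
    swept = set()       # sent tokens to a known hot wallet
    for t in transfers:
        if t.asset == "ETH" and t.sender in gas_suppliers:
            gas_funded.add(t.receiver)
        if t.asset != "ETH" and t.receiver in hot_wallets:
            swept.add(t.sender)
    # Both signals must match; already-labeled addresses are excluded.
    return (gas_funded & swept) - gas_suppliers - hot_wallets

# Hypothetical data: 0xdep1 follows the full pattern, 0xuser2 does not.
transfers = [
    Transfer("0xuser", "0xdep1", "USDC"),   # user deposits tokens
    Transfer("0xgas", "0xdep1", "ETH"),     # CEX tops up gas
    Transfer("0xdep1", "0xhot", "USDC"),    # consolidation to hot wallet
    Transfer("0xuser2", "0xhot", "USDC"),   # direct send, never gas funded
]
candidates = candidate_deposit_addresses(transfers, {"0xgas"}, {"0xhot"})
```

Requiring both signals is what keeps random one-off senders like `0xuser2` out of the list; a real detector would also need to look at timing and repetition.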

CEX addresses can be broken down into 4 main categories (quite self-explanatory): Hot Wallet, Cold Wallet, Gas Supplier, Excess Gas Recipient.
A few other categories appear for more sophisticated behaviours. This is not perfect, of course, because of random interactions or voluntary/involuntary deception attempts, like some (too small?) deposits never being consolidated.


The address list thus emerges from pattern detections, multiple matches and extra validation from various sources like @herd_eco, @nansen_ai and @arkham

Maintenance is community-led, even if most of the contributions (99%?) come from @hildobby_

Ok so, now that we have CEX addresses, what can we do with them?
Let's extend this data into several other datasets: flows, cross-layer activity, etc.


Of course 🪡 the needles in the blockchain haystacks are deposit addresses. The approach to find them involves 8 steps:

All in all, as of June 2025, over 350 CEXs and 92M deposit addresses have been identified.
How many are missing is of course an "unknown unknown" but at least that gives us a lower bound.

Putting them in perspective shows several cool stories, like the @Poloniex era, the downfall of @FTX_Official, the rise of @coinbase, @HTX_Global and of course @binance swallowing a huge part of the market.

CEX activity represents a considerable portion of onchain activity.
~25% of all transactions on #Ethereum L1 involve a CEX address.

This translates into fees paid to the network.
CEXs account for ~10% of the total fees paid on Ethereum L1...

...while their users account for ~5% of the total fees on Ethereum L1.
CEX activity in and out is therefore ~15% of the fees on the mainnet.

Funding addresses: 85.5% of Ethereum addresses are funded by an initial transfer coming from a CEX.
For the rest, funding comes from already funded addresses, or possibly from bridging, staking or pre-PoS mining. This is crucial in my opinion when it comes to adoption: the entry point of the network is a CEX. Gone are the days when anyone with a GPU could get a portion of ETH by mining. CEXs have a key role to play in adoption.
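To make the "funded by" idea concrete, here is a minimal sketch under my own assumptions: the funder of an address is the sender of the first transfer it ever receives, and the share is the fraction of addresses whose funder is a labeled CEX address. The function names and sample data are hypothetical, and this is not how the 85.5% figure was necessarily computed.

```python
def first_funders(transfers):
    """transfers: iterable of (block_number, sender, receiver), in any order.

    Returns {receiver: sender of the earliest transfer it received}.
    Ties within the same block are broken by tuple ordering, which is arbitrary.
    """
    funders = {}
    for _, sender, receiver in sorted(transfers):
        funders.setdefault(receiver, sender)  # keep only the first sender seen
    return funders

def cex_funded_share(funders, cex_addresses):
    """Fraction of funded addresses whose first funder is a CEX address."""
    if not funders:
        return 0.0
    hits = sum(1 for funder in funders.values() if funder in cex_addresses)
    return hits / len(funders)

# Hypothetical data: 0xa and 0xb are first funded by the CEX, 0xc is not.
transfers = [
    (10, "0xcex", "0xa"),
    (12, "0xa", "0xb"),      # later than the CEX transfer below, so ignored
    (11, "0xcex", "0xb"),
    (13, "0xminer", "0xc"),
]
funders = first_funders(transfers)
share = cex_funded_share(funders, {"0xcex"})
```

Only the first incoming transfer counts here, so an address that later receives CEX withdrawals but was first funded by a friend would land in "the rest".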

Re adoption, there's a lot dApps can get out of those datasets.
Categorizing users, e.g. "do the majority of my users come from Kraken?", can help better target acquisition campaigns and CEX partnerships.
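A rough sketch of how a dApp could answer that kind of question, assuming it already has a first-funder mapping per user and a label mapping from CEX address to exchange name. Both mappings, the function name and the addresses are hypothetical placeholders, not the actual dataset schema.

```python
from collections import Counter

def users_by_origin(user_addresses, first_funder, cex_labels):
    """Count users per originating CEX, based on who first funded them."""
    counts = Counter()
    for user in user_addresses:
        funder = first_funder.get(user)  # None if the user is unknown
        counts[cex_labels.get(funder, "other/unknown")] += 1
    return counts

# Hypothetical data: two users funded from Kraken-labeled addresses.
users = ["0xa", "0xb", "0xc"]
first_funder = {"0xa": "0xk1", "0xb": "0xk2", "0xc": "0xzz"}
cex_labels = {"0xk1": "Kraken", "0xk2": "Kraken"}

origins = users_by_origin(users, first_funder, cex_labels)
top_origin = origins.most_common(1)[0]
```

Anything not matched to a label falls into a single "other/unknown" bucket, which matters because the address list is a lower bound.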

The other side of the EXchange spectrum - the DEX.
@BNBCHAIN has crushed the DEX volume market share, with Binance recently routing volume to the DEX on BNB. The operation is called DEX on CEX and we can expect other actors to do the same soon (hey @base ? @inkonchain ?).

This trend created a lot of arbitrage activity, which in turn further boosts the market share (in volume) of BNB Chain.


Want to play around? Find the dataset below:

To peruse all the CEX charts, head to:
You can notably have a look at the recent CEX activity after the post-@ethcc rally in $ETH price

For feedback and suggestions, reach out to @hildobby_ via DM. The focus is mainly on EVM chains, as this is the de facto standard for onchain activity.
In the near future, dataset expansion is a great AI use case. Shoutout to @Dune, @arkham, and @KaikoData for giving hildobby a taste of CEX data back in 2019

What's next? Bridges and interoperability data.
It's the next hot topic for @hildobby_ and he's looking for help. DM him! He's cooking right now
Onto the bonus now
It would be cool to have the net flow of $ETH in CEX and AUM tracking on the #Ethereum native asset.
This is exactly what this new @Dune query is doing (still running):
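The query itself is not reproduced here. As a rough illustration of what "net flow" means, assuming a set of labeled CEX addresses: deposits into a CEX count as positive, withdrawals as negative, and CEX-to-CEX transfers are skipped. The function name, dates, addresses and amounts are all made up.

```python
from collections import defaultdict

def daily_net_flow(transfers, cex_addresses):
    """transfers: iterable of (day, sender, receiver, amount_eth).

    Returns {day: inflow to CEXs minus outflow from CEXs}.
    """
    net = defaultdict(float)
    for day, sender, receiver, amount in transfers:
        from_cex = sender in cex_addresses
        to_cex = receiver in cex_addresses
        if to_cex and not from_cex:
            net[day] += amount   # deposit into a CEX
        elif from_cex and not to_cex:
            net[day] -= amount   # withdrawal from a CEX
        # CEX-to-CEX and non-CEX transfers do not change the net flow
    return dict(net)

# Hypothetical data across two days.
transfers = [
    ("2025-07-01", "0xuser", "0xcex", 5.0),
    ("2025-07-01", "0xcex", "0xuser2", 2.0),
    ("2025-07-02", "0xcex", "0xcex2", 100.0),  # internal shuffle, ignored
    ("2025-07-02", "0xcex2", "0xuser", 1.5),
]
flows = daily_net_flow(transfers, {"0xcex", "0xcex2"})
```

Summing these daily values over time gives a running estimate of how much ETH sits on the labeled addresses, which is the AUM-tracking side of the idea.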
Watch the whole presentation here: