https://github.com/coinbase/chainstorage/
ChainStorage is the foundational blockchain storage system that is widely adopted within Coinbase. It serves as the blockchain data availability and agility layer. It is the live data source that powers Wallet, NFT, Exchange, and other Web3 applications, and a scalable batch and streaming data source for data science and machine learning workloads. It also feeds the workflows that provide deep insight for compliance auditing, and it integrates well with existing big data systems such as Snowflake and Spark.
ChainStorage is inspired by the Change Data Capture paradigm, commonly used in the big data world. It continuously replicates the changes (i.e., new blocks) on the blockchain and acts like a distributed file system for the blockchain. The design doc is here.
It aims to provide an efficient and flexible way to access the on-chain data:
There are three ways to interact with the ChainStorage service.
The gRPC API is the native interface. It offers high performance but lacks authentication and rate control.
# local
# Fetch the latest block
grpcurl --plaintext localhost:9090 \
coinbase.chainstorage.ChainStorage/GetLatestBlock
# Fetch block files for a range of heights
grpcurl --plaintext -d '{"start_height": 0, "end_height": 10}' \
localhost:9090 coinbase.chainstorage.ChainStorage/GetBlockFilesByRange
# Stream chain events from a sequence number or from an initial stream position
grpcurl --plaintext -d '{"sequence_num": 2223387}' \
localhost:9090 coinbase.chainstorage.ChainStorage/StreamChainEvents
grpcurl --plaintext -d '{"initial_position_in_stream": "EARLIEST"}' \
localhost:9090 coinbase.chainstorage.ChainStorage/StreamChainEvents
grpcurl --plaintext -d '{"initial_position_in_stream": "LATEST"}' \
localhost:9090 coinbase.chainstorage.ChainStorage/StreamChainEvents
grpcurl --plaintext -d '{"initial_position_in_stream": "13222054"}' \
localhost:9090 coinbase.chainstorage.ChainStorage/StreamChainEvents
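A consumer that restarts usually wants to resume from where it stopped rather than replay the whole stream. The sketch below builds the StreamChainEvents request body from an optional saved sequence number and falls back to a named position otherwise. The field names come from the grpcurl calls above; the function names and the caller-side checkpoint are illustrative assumptions, not part of ChainStorage. Check the proto file for the exact resume semantics of sequence_num.

```python
import json
from typing import List, Optional

STREAM_METHOD = "coinbase.chainstorage.ChainStorage/StreamChainEvents"


def stream_request(checkpoint: Optional[int], start_from: str = "EARLIEST") -> str:
    """Build the JSON body for a StreamChainEvents call.

    checkpoint is the last sequence number the caller persisted (hypothetical
    bookkeeping on the consumer side). Without one, start from a named
    position such as "EARLIEST" or "LATEST".
    """
    if checkpoint is not None:
        payload = {"sequence_num": checkpoint}
    else:
        payload = {"initial_position_in_stream": start_from}
    return json.dumps(payload)


def grpcurl_command(body: str, host: str = "localhost:9090") -> List[str]:
    """Return the argument list for the equivalent grpcurl invocation."""
    return ["grpcurl", "--plaintext", "-d", body, host, STREAM_METHOD]


if __name__ == "__main__":
    # First run: no checkpoint yet, so start from the earliest event.
    print(" ".join(grpcurl_command(stream_request(None))))
    # Later runs: resume using the saved sequence number.
    print(" ".join(grpcurl_command(stream_request(2223387))))
```

The argument list can be passed to subprocess.run as-is, which avoids shell quoting problems with the JSON body.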
ChainStorage also provides an SDK. You can find the supported methods here.
Note: GetBlocksByRangeWithTag is not equivalent to a batch version of GetBlockWithTag, because it gives you no way to specify the block hash. If a reorg causes the requested range to go beyond the current tip of the chain, GetBlocksByRangeWithTag returns a FailedPrecondition error because the range exceeds the latest watermark.
In conclusion, GetBlocksByRangeWithTag is safe for backfilling, since reorgs will not happen for past blocks. For recent blocks (e.g., the streaming case), use GetBlockWithTag instead.
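The guidance above amounts to splitting a requested height range into two parts: older blocks that can be fetched in batches, and recent blocks that are fetched one at a time, where the caller can pin the block hash. A minimal sketch of that split follows. The names plan_fetches and FetchPlan, the reorg_depth and batch_size defaults, and the half-open [start, end) range convention are all assumptions made for illustration; none of them come from the ChainStorage SDK.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class FetchPlan:
    range_calls: List[Tuple[int, int]]  # half-open [start, end) pairs for the range API
    single_heights: List[int]           # heights to fetch one block at a time


def plan_fetches(start: int, end: int, tip: int,
                 reorg_depth: int = 64, batch_size: int = 50) -> FetchPlan:
    """Split [start, end) into batched range calls and single-block calls.

    Heights below tip - reorg_depth are treated as settled (hypothetical
    threshold) and are batched. Heights from there up to the tip are listed
    individually. Heights above the tip are dropped, since requesting them
    would exceed the latest watermark.
    """
    settled_end = max(start, min(end, tip - reorg_depth))
    range_calls = [
        (lo, min(lo + batch_size, settled_end))
        for lo in range(start, settled_end, batch_size)
    ]
    single_heights = list(range(settled_end, min(end, tip + 1)))
    return FetchPlan(range_calls, single_heights)


if __name__ == "__main__":
    plan = plan_fetches(start=0, end=120, tip=100, reorg_depth=10)
    print(plan.range_calls)     # batched calls for the settled part
    print(plan.single_heights)  # per-block calls near the tip
```

Each range pair would map to one GetBlocksByRangeWithTag call and each single height to one GetBlockWithTag call. A suitable reorg_depth depends on the chain.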
REST APIs are the counterpart to the gRPC APIs. The ChainStorage APIs are in beta preview. Note that the APIs are currently exposed as RESTful APIs through gRPC transcoding. Please refer to the proto file for the data schema.
The ready-to-go public endpoint hosted by Coinbase is https://launchpad.coinbase.com, and you need an API key to use the endpoint:
# Get latest block
curl 'https://launchpad.coinbase.com/api/exp/chainstorage/ethereum/mainnet/v1/coinbase.chainstorage.ChainStorage/GetLatestBlock' \
--request POST \
--header 'Content-Type: application/json' \
--header 'x-apikey: YOUR-KEY'
# Get block by height
curl 'https://launchpad.coinbase.com/api/exp/chainstorage/ethereum/mainnet/v1/coinbase.chainstorage.ChainStorage/GetNativeBlock' \
--request POST \
--header 'Content-Type: application/json' \
--header 'x-apikey: YOUR-KEY' \
--data '{"height": 16000000}'
# Get blocks in range
curl 'https://launchpad.coinbase.com/api/exp/chainstorage/ethereum/mainnet/v1/coinbase.chainstorage.ChainStorage/GetNativeBlocksByRange' \
--request POST \
--header 'Content-Type: application/json' \
--header 'x-apikey: YOUR-KEY' \
--data '{"start_height": 16000000, "end_height": 16000005}'
More examples are available via this Postman Collection.
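The curl calls above can also be issued from a script. The following is a minimal sketch using only the Python standard library. It mirrors the URL layout, the POST method, and the x-apikey header shown in the curl examples. The helper names build_request and call are illustrative, and treating the ethereum/mainnet path segments as parameters is an assumption, since only that combination appears here.

```python
import json
import urllib.request
from typing import Optional

BASE_URL = "https://launchpad.coinbase.com/api/exp/chainstorage"
SERVICE = "coinbase.chainstorage.ChainStorage"


def build_request(method: str, api_key: str, body: Optional[dict] = None,
                  blockchain: str = "ethereum",
                  network: str = "mainnet") -> urllib.request.Request:
    """Build (but do not send) a POST request for one ChainStorage method."""
    url = f"{BASE_URL}/{blockchain}/{network}/v1/{SERVICE}/{method}"
    # GetLatestBlock is called without a body in the curl example, so data may be None.
    data = json.dumps(body).encode("utf-8") if body is not None else None
    return urllib.request.Request(
        url,
        data=data,
        method="POST",
        headers={"Content-Type": "application/json", "x-apikey": api_key},
    )


def call(request: urllib.request.Request, timeout: float = 10.0) -> dict:
    """Send the request and decode the JSON response."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


if __name__ == "__main__":
    req = build_request("GetNativeBlock", "YOUR-KEY", {"height": 16000000})
    print(req.get_method(), req.full_url)
    # Uncomment after replacing YOUR-KEY with a real API key:
    # print(call(req))
```

Separating request construction from sending keeps the URL and header logic easy to inspect without network access or a valid key.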