query_blockchain_sql
Query Ethereum blockchain data by combining SQL queries and direct dataset downloads in one step. Supports table references or explicit read_parquet() calls for precise data analysis.
Instructions
Download blockchain data and run SQL query in a single step
CONVENIENCE FUNCTION: This combines query_dataset and query_sql into one call.
You can write SQL queries using either approach:
1. Simple table references: "SELECT * FROM blocks LIMIT 10"
2. Explicit read_parquet: "SELECT * FROM read_parquet('/path/to/file.parquet') LIMIT 10"
DATASET-SPECIFIC PARAMETERS:
For datasets that require specific address parameters (like 'balances', 'erc20_transfers', etc.),
ALWAYS use the 'contract' parameter to pass ANY Ethereum address. For example:
- For 'balances' dataset: Use contract parameter for the address you want balances for
query_blockchain_sql(
sql_query="SELECT * FROM balances",
dataset="balances",
blocks='1000:1010',
contract='0x123...' # Address you want balances for
)
Examples:
```
# Using simple table name
query_blockchain_sql(
sql_query="SELECT * FROM blocks LIMIT 10",
dataset="blocks",
blocks_from_latest=100
)
# Using read_parquet() (the path will be automatically replaced)
query_blockchain_sql(
sql_query="SELECT * FROM read_parquet('/any/path.parquet') LIMIT 10",
dataset="blocks",
blocks_from_latest=100
)
```
ALTERNATIVE WORKFLOW (more control):
If you need more control, you can separate the steps:
1. Download data: result = query_dataset('blocks', blocks_from_latest=100, output_format='parquet')
2. Inspect schema: schema = get_sql_table_schema(result['files'][0])
3. Run SQL query: query_sql("SELECT * FROM blocks", files=result['files'])
Args:
sql_query: SQL query to execute - using table names or read_parquet()
dataset: The specific dataset to query (e.g., 'transactions', 'logs', 'balances')
If None, will be extracted from the SQL query
blocks: Block range specification as a string (e.g., '1000:1010')
start_block: Start block number (alternative to blocks)
end_block: End block number (alternative to blocks)
use_latest: If True, query the latest block
blocks_from_latest: Number of blocks before the latest to include
contract: Contract address to filter by - IMPORTANT: Use this parameter for ALL address-based filtering
regardless of the parameter name in the native cryo command (address, contract, etc.)
force_refresh: Force download of new data even if it exists
include_schema: Include schema information in the result
Returns:
SQL query results and metadata
Input Schema
Name | Required | Description | Default |
---|---|---|---|
blocks | No | ||
blocks_from_latest | No | ||
contract | No | ||
dataset | No | ||
end_block | No | ||
force_refresh | No | ||
include_schema | No | ||
sql_query | Yes | ||
start_block | No | ||
use_latest | No |
Input Schema (JSON Schema)
{
"properties": {
"blocks": {
"anyOf": [
{
"type": "string"
},
{
"type": "null"
}
],
"default": null,
"title": "Blocks"
},
"blocks_from_latest": {
"anyOf": [
{
"type": "integer"
},
{
"type": "null"
}
],
"default": null,
"title": "Blocks From Latest"
},
"contract": {
"anyOf": [
{
"type": "string"
},
{
"type": "null"
}
],
"default": null,
"title": "Contract"
},
"dataset": {
"anyOf": [
{
"type": "string"
},
{
"type": "null"
}
],
"default": null,
"title": "Dataset"
},
"end_block": {
"anyOf": [
{
"type": "integer"
},
{
"type": "null"
}
],
"default": null,
"title": "End Block"
},
"force_refresh": {
"default": false,
"title": "Force Refresh",
"type": "boolean"
},
"include_schema": {
"default": true,
"title": "Include Schema",
"type": "boolean"
},
"sql_query": {
"title": "Sql Query",
"type": "string"
},
"start_block": {
"anyOf": [
{
"type": "integer"
},
{
"type": "null"
}
],
"default": null,
"title": "Start Block"
},
"use_latest": {
"default": false,
"title": "Use Latest",
"type": "boolean"
}
},
"required": [
"sql_query"
],
"title": "query_blockchain_sqlArguments",
"type": "object"
}