Numia Docs

Migrating BigQuery to ObsessionDB

Migration in progress

BQ
LegacyBigQuery
OD
New homeObsessionDB

Numia’s Data Warehouse historically ran on Google BigQuery. Both stacks are live today, but new datasets, chains, and product integrations land on ObsessionDB first. This page maps the BigQuery tables you’re used to onto their ObsessionDB equivalents so you can plan the cutover.

Why move?

ObsessionDB (our ClickHouse-based warehouse) gives us properties that BigQuery couldn’t:

  • Real-time freshness. Materialized views derive src_events and src_messages directly from raw block-results — no batch lag.
  • Predictable cost. You pay for the cluster, not for every byte scanned. Exploration stops being a billing event.
  • Re-derivable schema. Raw block payloads are retained, so new derived columns can be built without re-indexing the chain.
  • Native JSON. Decode message bodies and event attributes inline with JSON_VALUE / JSONExtractArrayRaw — no pre-materialization step.

Conceptual change

BigQuery had 7 raw tables per chain, each pre-parsing a slice of the block payload. ObsessionDB consolidates this into 3 tables per chain, with the parsing pushed into ClickHouse JSON functions you call at query time. The unified shape works across every Cosmos SDK chain.

BQ
BigQuery (legacy)

Per chain, 7 raw tables:

  • <chain>_blocks
  • <chain>_transactions
  • <chain>_tx_messages
  • <chain>_block_events
  • <chain>_message_events
  • <chain>_event_attributes
  • <chain>_message_event_attributes
OD
ObsessionDB

Per chain, 3 tables:

  • <chain_db>.raw_block_results
  • <chain_db>.src_events
  • <chain_db>.src_messages

Table-by-table mapping

The table below walks you through every BigQuery raw table and its ObsessionDB equivalent. Where a BigQuery table maps to a column inside an ObsessionDB table rather than a top-level table, the How to query column shows you the relevant ClickHouse JSON function.

BigQueryObsessionDBHow to query
<chain>_blocksraw_block_results

Block header lives in the header column.

Use block_height, block_timestamp, hash directly; decode further header fields with JSON_VALUE(header, ‘$.proposer_address’).
<chain>_transactionsraw_block_results

Each tx is an entry in the transactions JSON array; execution result is in tx_results.

ARRAY JOIN JSONExtractArrayRaw(transactions) AS tx, then JSON_VALUE(tx, ‘$.tx_hash’) etc.
<chain>_tx_messagessrc_messages

Direct 1:1 mapping. One row per message.

Use message_type to filter; pull fields out with JSON_VALUE(message, ‘$.from_address’).
<chain>_block_eventssrc_events

Filter to non-transaction events.

Add WHERE block_section_index IN (0, 2, 3) (begin / end / finalize block).
<chain>_message_eventssrc_events

Filter to per-transaction events.

Add WHERE block_section_index = 1 (transactions_results).
<chain>_event_attributessrc_events.event_attributes

Attributes are kept as a JSON array on each event row — no separate table.

ARRAY JOIN JSONExtractArrayRaw(event_attributes) AS attr, then JSON_VALUE(attr, ‘$.key’) / JSON_VALUE(attr, ‘$.value’).
<chain>_message_event_attributessrc_events.event_attributes

Same as above — attributes for message-level events live inline on the event row.

Same pattern, combined with WHERE block_section_index = 1.

Query translation

A few side-by-side examples to make the shift concrete. Assume we’re working with Cosmos Hub — BigQuery dataset numia-data.cosmoshub, ObsessionDB database cosmoshub_4.

Count blocks per day

BQ
BigQuery
SELECT
    DATE(block_timestamp) AS day,
    COUNT(*) AS blocks
FROM `numia-data.cosmoshub.cosmoshub_blocks`
WHERE block_timestamp >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 7 DAY)
GROUP BY day
ORDER BY day DESC
OD
ObsessionDB
SELECT
    toDate(block_timestamp) AS day,
    count() AS blocks
FROM cosmoshub_4.raw_block_results
WHERE block_timestamp >= now() - INTERVAL 7 DAY
GROUP BY day
ORDER BY day DESC

Top message types over the last day

BQ
BigQuery
SELECT
    message_type,
    COUNT(*) AS msgs
FROM `numia-data.cosmoshub.cosmoshub_tx_messages`
WHERE block_timestamp >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 1 DAY)
GROUP BY message_type
ORDER BY msgs DESC
LIMIT 20
OD
ObsessionDB
SELECT
    message_type,
    count() AS msgs
FROM cosmoshub_4.src_messages
WHERE block_timestamp >= now() - INTERVAL 1 DAY
GROUP BY message_type
ORDER BY msgs DESC
LIMIT 20

transfer events with sender / recipient

BQ
BigQuery
SELECT
    e.block_height,
    MAX(IF(a.key = 'sender',    a.value, NULL)) AS sender,
    MAX(IF(a.key = 'recipient', a.value, NULL)) AS recipient,
    MAX(IF(a.key = 'amount',    a.value, NULL)) AS amount
FROM `numia-data.cosmoshub.cosmoshub_message_events`      AS e
JOIN `numia-data.cosmoshub.cosmoshub_message_event_attributes` AS a
  USING (block_height, tx_id, msg_index, event_index)
WHERE e.event_type = 'transfer'
  AND e.block_timestamp >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 1 DAY)
GROUP BY e.block_height, e.tx_id, e.msg_index, e.event_index
OD
ObsessionDB
SELECT
    block_height,
    JSONExtractString(arrayFirst(x -> JSON_VALUE(x, '$.key') = 'sender',    JSONExtractArrayRaw(event_attributes)), 'value') AS sender,
    JSONExtractString(arrayFirst(x -> JSON_VALUE(x, '$.key') = 'recipient', JSONExtractArrayRaw(event_attributes)), 'value') AS recipient,
    JSONExtractString(arrayFirst(x -> JSON_VALUE(x, '$.key') = 'amount',    JSONExtractArrayRaw(event_attributes)), 'value') AS amount
FROM cosmoshub_4.src_events
WHERE event_type = 'transfer'
  AND block_section_index = 1
  AND block_timestamp >= now() - INTERVAL 1 DAY

Migration checklist

  1. Request ObsessionDB access — see Getting Started.
  2. Identify the BigQuery tables your workload reads. The mapping above tells you the ObsessionDB equivalent for each one.
  3. Translate your queries. SQL is mostly portable; the main rewrites are JSON access (JSON_VALUE / JSONExtractArrayRaw), date arithmetic (INTERVAL), and the block_section_index filter for event tables.
  4. Backfill your dashboards / pipelines against ObsessionDB and compare row counts vs. BigQuery as a sanity check.
  5. Cut over reads when you’re satisfied. We keep BigQuery live during the transition — no hard deadline.

Need help with the cutover?

We can help

If you have a complex BigQuery pipeline you want translated, or you want us to deliver decoded / parsed tables matching your old BigQuery views — reach out at support@numia.xyz or through our website.