We run nodes on every chain we index and stream the data in real time into a public database through our data pipeline.
As an intermediary data hosting solution, we have chosen BigQuery, which comes with a straightforward SQL interface and integrates with Google Sheets and Looker Studio (formerly Google Data Studio). These tools work well out of the box for querying and visualizing data.
In the future, we plan to host all data on our side, which will allow us to build custom interfaces, more advanced features, and APIs. However, we have prioritized usability from day 1, which is why we chose the setup described above.
Currently, Numia offers SQL-based access to on-chain data via BigQuery. Google BigQuery is a cloud-based big data analytics web service for processing very large read-only data sets. BigQuery was designed for analyzing data on the order of billions of rows, using a SQL-like syntax. For now, accessing Numia data sets requires a Google Cloud account, which you can create for free and which comes with $300 in credits.
Some of the benefits of having it in BigQuery are:
- Native integration to Looker Studio (formerly Google Data Studio)
- Native integration to Google Sheets
- BigQuery API access to retrieve data programmatically
- Ability to join and merge data with other public (or private) data sets
- Python client library to connect to BigQuery from a Jupyter Notebook
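
As an illustration of the programmatic access mentioned above, here is a minimal sketch that uses the `google-cloud-bigquery` Python client. The table name, column names, and project ID are placeholders, not actual Numia identifiers; substitute the ones from the dataset you are querying. The helper names `build_preview_query` and `run_query` are hypothetical and exist only for this example.

```python
from typing import Sequence


def build_preview_query(table: str, columns: Sequence[str], limit: int = 10) -> str:
    """Build a small SELECT over named columns of a fully qualified table.

    Naming columns explicitly (instead of SELECT *) keeps the amount of
    data BigQuery has to read, and therefore the cost, lower.
    """
    if not columns:
        raise ValueError("at least one column is required")
    if limit <= 0:
        raise ValueError("limit must be positive")
    column_list = ", ".join(columns)
    return f"SELECT {column_list} FROM `{table}` LIMIT {limit}"


def run_query(sql: str, project: str) -> list:
    """Run a query and return the rows as a list of dicts.

    Requires `pip install google-cloud-bigquery` and Google Cloud
    credentials; `project` is the ID of your own (billing) project.
    """
    from google.cloud import bigquery  # imported lazily so the helper above works without it

    client = bigquery.Client(project=project)
    rows = client.query(sql).result()  # waits for the query job to finish
    return [dict(row.items()) for row in rows]


if __name__ == "__main__":
    # Placeholder identifiers: replace them with a real table and real columns.
    sql = build_preview_query(
        "some-project.some_dataset.some_table",
        ["column_a", "column_b"],
        limit=5,
    )
    print(sql)
    # rows = run_query(sql, project="your-gcp-project-id")
```

The same code runs unchanged in a Jupyter Notebook cell once the client library is installed and you are authenticated with Google Cloud.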
Google will charge you for running queries. Generally, their pricing is based on the amount of data each query processes, regardless of how much computing power is used. Raw data tables tend to be in the terabytes, so we are continually building smaller tables derived from the raw data so that you spend less when querying Numia's datasets.
The "Optimizing Your Queries" section provides more recommendations for reducing your costs.
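
Because cost depends on the data processed, it can help to check how much a query will read before running it. The sketch below uses the BigQuery client's dry-run mode, which reports the bytes a query would process without executing it. The helper names and the price per TiB are assumptions for illustration; take the actual rate from Google's current pricing page for your region and pricing model.

```python
BYTES_PER_TIB = 2 ** 40


def estimate_cost_usd(bytes_processed: int, usd_per_tib: float) -> float:
    """Convert a byte count into an approximate cost.

    `usd_per_tib` is an input you supply from Google's pricing page;
    this function does not know the real rate.
    """
    if bytes_processed < 0:
        raise ValueError("bytes_processed must be non-negative")
    return bytes_processed / BYTES_PER_TIB * usd_per_tib


def dry_run_bytes(sql: str, project: str) -> int:
    """Return the bytes a query would process, without running it.

    Requires `pip install google-cloud-bigquery` and Google Cloud credentials.
    """
    from google.cloud import bigquery  # imported lazily; only needed for the dry run

    client = bigquery.Client(project=project)
    config = bigquery.QueryJobConfig(dry_run=True, use_query_cache=False)
    job = client.query(sql, job_config=config)  # a dry-run job returns immediately
    return job.total_bytes_processed


if __name__ == "__main__":
    # Example with a made-up byte count and a made-up rate of 5 USD per TiB.
    half_tib = BYTES_PER_TIB // 2
    print(f"{estimate_cost_usd(half_tib, usd_per_tib=5.0):.2f} USD")
```

Running the dry run against both a raw table and one of the smaller derived tables is a quick way to see how much the smaller table saves for a given query.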