Archive - April 2020

Complete Guide To Mastering and Optimizing Google Bigquery

If you are getting started with BigQuery, these are the concepts you need to be familiar with to get the best results from it.

Table Partitioning

It is useful to partition BigQuery tables on a date-time column. This improves performance, because queries that filter on that column only scan the matching partitions. If a date-time column is not available in the data, you can partition the table by ingestion time instead.

Clustered Tables

You can further optimize your queries in BigQuery by clustering a table on selected columns. Cluster on the columns your queries use most often. The query below will be optimized based on wiki and table.

Nested Data

BigQuery works best with denormalized data, so nested data and repeated fields are recommended over a star schema or snowflake schema. A good example of this is a library, usually in a[…]
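As a rough illustration of the partitioning and clustering ideas above, here is a minimal Python sketch that builds a BigQuery `CREATE TABLE` statement. The helper name `build_create_table_ddl`, the table name, and the column names are hypothetical and are not taken from the original post. It assumes the partition column is a TIMESTAMP or DATETIME, and it falls back to ingestion-time partitioning (`_PARTITIONDATE`) when no such column is given.

```python
from typing import Optional, Sequence, Tuple


def build_create_table_ddl(
    table: str,
    columns: Sequence[Tuple[str, str]],
    partition_column: Optional[str] = None,
    cluster_columns: Sequence[str] = (),
) -> str:
    """Build a BigQuery DDL statement for a partitioned, optionally clustered table.

    Hypothetical helper for illustration only; it returns SQL text and
    does not talk to BigQuery.
    """
    # BigQuery accepts at most four clustering columns.
    if len(cluster_columns) > 4:
        raise ValueError("BigQuery allows at most 4 clustering columns")

    column_sql = ",\n  ".join(f"{name} {type_}" for name, type_ in columns)

    # Partition on the date of a TIMESTAMP/DATETIME column when one exists,
    # otherwise on ingestion time.
    if partition_column:
        partition_expr = f"DATE({partition_column})"
    else:
        partition_expr = "_PARTITIONDATE"

    lines = [
        f"CREATE TABLE `{table}` (",
        f"  {column_sql}",
        ")",
        f"PARTITION BY {partition_expr}",
    ]
    if cluster_columns:
        lines.append("CLUSTER BY " + ", ".join(cluster_columns))
    return "\n".join(lines)


# Example with made-up names: partition on a timestamp, cluster on two columns.
ddl = build_create_table_ddl(
    "my_dataset.pageviews",
    [("event_time", "TIMESTAMP"), ("wiki", "STRING"), ("title", "STRING")],
    partition_column="event_time",
    cluster_columns=["wiki", "title"],
)
print(ddl)
```

The generated statement can be pasted into the BigQuery console. Queries that filter on the partition column and on the leading clustering columns are the ones that benefit most.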


Scheduling a Singer pipeline on Google Cloud – Part 3 Adwords to Bigquery

If you have followed the tutorial, you will have a Docker image with your Stitch pipeline in Google Container Registry. To run it, we will implement the following setup in Google Cloud. Here are the tasks needed to complete the setup:

- Create a VM instance to get the historical data
- Create a Pub/Sub topic to trigger Cloud Functions
- Create Google Cloud Functions to start and stop the instance
- Set up Cloud Scheduler jobs

Create a VM instance to get the historical data

The strategy I am following in this case is to download all the historical data up to the current date, store the current date in the state.json file, and then run a cron job every day to load the previous day's data into BigQuery. We will set up a containerized Compute Engine instance using the Docker image we created in the previous article, the reason[…]
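The state.json strategy described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the `last_synced` key and the helper names `next_sync_window` and `save_state` are hypothetical, and a real Singer tap defines its own state layout. The sketch only works out which dates still need to be fetched and records progress afterwards.

```python
import json
from datetime import date, timedelta
from pathlib import Path
from typing import Optional, Tuple


def next_sync_window(
    state_path: Path, today: date, default_start: date
) -> Optional[Tuple[date, date]]:
    """Return the inclusive (start, end) dates to fetch, or None if up to date.

    Assumes a hypothetical state file of the form {"last_synced": "YYYY-MM-DD"}.
    """
    if state_path.exists():
        state = json.loads(state_path.read_text())
        # Resume from the day after the last date that was fully synced.
        start = date.fromisoformat(state["last_synced"]) + timedelta(days=1)
    else:
        # First run: load the full history from the chosen start date.
        start = default_start

    # Only complete days are loaded, so the window ends at yesterday.
    end = today - timedelta(days=1)
    return (start, end) if start <= end else None


def save_state(state_path: Path, last_synced: date) -> None:
    """Record the last fully synced date so the next run can resume from it."""
    state_path.write_text(json.dumps({"last_synced": last_synced.isoformat()}))
```

On the first run, with no state file, the window covers the whole history. On each later daily run it shrinks to the single previous day, which matches the cron job described above.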

