Starting a DBT project from nothing
In this tutorial, we’ll walk through setting up a minimal project using DBT (Data Build Tool) for data transformation. We’ll cover setting up the environment, creating queries, running transformations, and deploying to production. I put this project together while figuring out how to spin up my own DBT project on my own dataset. Existing tutorials rely on the jaffle_shop data and well-established DBT projects; I wanted to find the minimum needed to run recurring data transforms using DBT.
This example uses BigQuery (BQ), but I will try to keep the tutorial as platform-agnostic as possible. The code for the project is in this GitHub repo.
Project Goal
The goal of this project is to take data from a public dataset, perform transformations using DBT, and schedule these transformations to run on a regular basis. The transformed data will be added to one or more target tables for analysis. We will divide the steps into two parts:
- Part 1: DBT setup
- Part 2: GCP deploy
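To make "minimum needed" concrete before diving in, here is a rough sketch of the smallest file layout a DBT project typically has: a dbt_project.yml and one model file containing a SELECT statement. The project name, profile name, model name, and the table and column names in the SQL are all placeholders I made up for illustration; they are not taken from the repo. The connection profile (profiles.yml) lives separately and is not shown here.

```python
from pathlib import Path
import tempfile

# Hypothetical names: "minimal_dbt" and "my_bigquery_profile" are
# placeholders for illustration, not taken from the tutorial's repo.
PROJECT_YML = """\
name: minimal_dbt
version: "1.0.0"
config-version: 2
profile: my_bigquery_profile
model-paths: ["models"]
"""

# A DBT model is a SELECT statement saved as a .sql file. The table and
# column names below are placeholders, not a real dataset.
MODEL_SQL = """\
select
    some_column,
    count(*) as row_count
from `some-project.some_dataset.some_table`
group by some_column
"""


def scaffold(root: Path) -> list[str]:
    """Write the two files under root and return their relative paths."""
    (root / "models").mkdir(parents=True, exist_ok=True)
    (root / "dbt_project.yml").write_text(PROJECT_YML)
    (root / "models" / "daily_summary.sql").write_text(MODEL_SQL)
    return sorted(
        p.relative_to(root).as_posix() for p in root.rglob("*") if p.is_file()
    )


if __name__ == "__main__":
    # Write into a throwaway directory so nothing on disk is touched.
    with tempfile.TemporaryDirectory() as tmp:
        print(scaffold(Path(tmp)))
```

This only writes files; it does not install or invoke DBT. The point is that a project is, at its core, one config file plus a folder of SQL files.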
Part 1: DBT setup
Prerequisites
- Basic knowledge of SQL
- Access to a cloud provider (e.g., Google Cloud Platform for BigQuery)
- Working terminal environment. I am…