# Module 4: Analytics Engineering
Goal: Transform the data loaded into the data warehouse (DWH) into analytical views by developing a [dbt project](taxi_rides_ny/README.md).
### Prerequisites
The prerequisites depend on which setup path you choose:
**For Cloud Setup (BigQuery):**
- Completed [Module 3: Data Warehouse](../03-data-warehouse/) with:
  - A GCP project with BigQuery enabled
  - A service account with BigQuery permissions
  - NYC taxi data loaded into BigQuery (yellow and green taxi data for 2019-2020)
**For Local Setup (DuckDB):**
- No prerequisites! The local setup guide will walk you through downloading and loading the data.
> [!NOTE]
> This module focuses on **yellow and green taxi data** (2019-2020). While Module 3 may have included FHV data, it is not used in this dbt project.
## Setting up your environment
Choose your setup path:
### [Local Setup](setup/local_setup.md)
- **Stack**: DuckDB + dbt Core
- **Cost**: Free
- [→ Get Started](setup/local_setup.md)
### [Cloud Setup](setup/cloud_setup.md)
- **Stack**: BigQuery + dbt Cloud
- **Cost**: Free tier available (dbt Cloud Developer), BigQuery costs vary
- **Requires**: Completed Module 3 with BigQuery data
- [→ Get Started](setup/cloud_setup.md)
## Content
### Introduction to Analytics Engineering
[Watch the video](https://youtu.be/HxMIsPrIyGQ)
### Introduction to data modeling
[Watch the video](https://youtu.be/uF76d5EmdtU&list=PL3MmuxUbc_hJed7dXYoJw8DoCuVHhGEQb&index=40)
### What is dbt?
[Watch the video](https://www.youtube.com/watch?v=gsKuETFJr54&list=PLaNLNpjZpzwgneiI-Gl8df8GCsPYp_6Bs&index=5)
### Differences between dbt Core and dbt Cloud
[Watch the video](https://www.youtube.com/auzcdLRyEIk)
### Project Setup
| Alternative A | Alternative B |
|---------------|---------------|
| BigQuery + dbt Platform | DuckDB + dbt Core |
| [Watch the video](https://youtu.be/GFbwlrt6f54) | [Watch the video](https://youtu.be/GoFAbJYfvlw) |
### dbt Course
| dbt Project Structure | dbt Sources | dbt Models | Seeds and Macros |
|-----------------------|-------------|------------|------------------|
| [Watch the video](https://youtu.be/2dYDS4OQbT0) | [Watch the video](https://youtu.be/7CrrXazV_8k) | [Watch the video](https://youtu.be/JQYz-8sl1aQ) | [Watch the video](https://youtu.be/lT4fmTDEqVk) |

| dbt Tests | Documentation | dbt Packages | dbt Commands |
|-----------|---------------|--------------|--------------|
| [Watch the video](https://youtu.be/bvZ-rJm7uMU) | [Watch the video](https://www.youtube.com/UqoWyMjcqrA) | [Watch the video](https://www.youtube.com/KfhUA9Kfp8Y) | [Watch the video](https://www.youtube.com/t4OeWHW3SsA) |
## Extra resources
> [!NOTE]
> If you find the videos above overwhelming, we recommend completing the [dbt Fundamentals](https://learn.getdbt.com/courses/dbt-fundamentals) course and then rewatching the module. It provides a solid foundation for all the key concepts you need in this module.
## SQL refresher
The homework for this module focuses heavily on window functions and CTEs. If you need a refresher on these topics, you can refer to these notes.
* [SQL refresher](refreshers/SQL.md)
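
As a quick warm-up for the two topics mentioned above, here is a minimal sketch that combines a CTE with two window functions. It is only an illustration: the `trips` table, its columns, and the sample rows are hypothetical, and it runs on Python's built-in `sqlite3` module (assuming a bundled SQLite of version 3.25 or newer, which added window functions) rather than on BigQuery or DuckDB. The SQL itself uses standard syntax, but the homework queries may look quite different.

```python
import sqlite3

# Hypothetical toy data: (pickup_zone, pickup_date, fare_amount).
TRIPS = [
    ("Astoria", "2019-01-01", 10.0),
    ("Astoria", "2019-01-02", 30.0),
    ("Astoria", "2019-01-03", 20.0),
    ("Harlem", "2019-01-01", 15.0),
    ("Harlem", "2019-01-02", 5.0),
]

QUERY = """
WITH daily AS (  -- CTE: a named intermediate result
    SELECT pickup_zone, pickup_date, SUM(fare_amount) AS revenue
    FROM trips
    GROUP BY pickup_zone, pickup_date
)
SELECT
    pickup_zone,
    pickup_date,
    revenue,
    -- Window function: running total per zone, ordered by date
    SUM(revenue) OVER (
        PARTITION BY pickup_zone ORDER BY pickup_date
    ) AS running_revenue,
    -- Window function: rank each day within its zone by revenue
    RANK() OVER (
        PARTITION BY pickup_zone ORDER BY revenue DESC
    ) AS revenue_rank
FROM daily
ORDER BY pickup_zone, pickup_date
"""


def run_example():
    """Load the toy rows into an in-memory database and run the query."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE trips (pickup_zone TEXT, pickup_date TEXT, fare_amount REAL)"
    )
    conn.executemany("INSERT INTO trips VALUES (?, ?, ?)", TRIPS)
    rows = conn.execute(QUERY).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for row in run_example():
        print(row)
```

Unlike `GROUP BY`, the window functions keep one output row per day while still computing values across the whole zone, which is the pattern to keep in mind when reading the refresher notes.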
## Homework
* [2026 Homework](../cohorts/2026/04-analytics-engineering/homework.md)
## Community notes
<details>
<summary>Did you take notes? You can share them here</summary>
* [Notes by Alvaro Navas](https://github.com/ziritrion/dataeng-zoomcamp/blob/main/notes/4_analytics.md)
* [Sandy's DE learning blog](https://learningdataengineering540969211.wordpress.com/2022/02/17/week-4-setting-up-dbt-cloud-with-bigquery/)
* [Notes by Victor Padilha](https://github.com/padilha/de-zoomcamp/tree/master/week4)
* [Marcos Torregrosa's blog (Spanish)](https://www.n4gash.com/2023/data-engineering-zoomcamp-semana-4/)
* [Notes by froukje](https://github.com/froukje/de-zoomcamp/blob/main/week_4_analytics_engineering/notes/notes_week_04.md)
* [Notes by Alain Boisvert](https://github.com/boisalai/de-zoomcamp-2023/blob/main/week4.md)
* [Setting up Prefect with dbt by Vera](https://medium.com/@verazabeida/zoomcamp-week-5-5b6a9d53a3a0)
* [Blog by Xia He-Bleinagel](https://xiahe-bleinagel.com/2023/02/week-4-data-engineering-zoomcamp-notes-analytics-engineering-and-dbt/)
* [Setting up DBT with BigQuery by Tofag](https://medium.com/@fagbuyit/setting-up-your-dbt-cloud-dej-9-d18e5b7c96ba)
* [Blog post by Dewi Oktaviani](https://medium.com/@oktavianidewi/de-zoomcamp-2023-learning-week-4-analytics-engineering-with-dbt-53f781803d3e)
* [Notes from Vincenzo Galante](https://binchentso.notion.site/Data-Talks-Club-Data-Engineering-Zoomcamp-8699af8e7ff94ec49e6f9bdec8eb69fd)
* [Notes from Balaji](https://github.com/Balajirvp/DE-Zoomcamp/blob/main/Week%204/Data%20Engineering%20Zoomcamp%20Week%204.ipynb)
* [Notes by Linda](https://github.com/inner-outer-space/de-zoomcamp-2024/blob/main/4-analytics-engineering/readme.md)
* [2024 - Videos transcript week4](https://drive.google.com/drive/folders/1V2sHWOotPEMQTdMT4IMki1fbMPTn3jOP?usp=drive)
* [Blog Post](https://www.jonahboliver.com/blog/de-zc-w4) by Jonah Oliver
* [2025 Notes by Manuel Guerra](https://github.com/ManuelGuerra1987/data-engineering-zoomcamp-notes/blob/main/4_Analytics-Engineering/README.md)
* [2025 Notes by Horeb SEIDOU](https://spotted-hardhat-eea.notion.site/Week-4-Analytics-Engineering-18929780dc4a808692e4e0ee488bf49c?pvs=74)
* [2025 Notes by Daniel Lachner](https://github.com/mossdet/dlp_data_eng/blob/main/Notes/04_01_Analytics_Engineering.pdf)
* Add your notes here (above this line)
</details>
## Useful links
- [Slides used in previous years](https://docs.google.com/presentation/d/1xSll_jv0T8JF4rYZvLHfkJXYqUjPtThA/edit?usp=sharing&ouid=114544032874539580154&rtpof=true&sd=true)
- [dbt free courses](https://courses.getdbt.com/collections)