https://github.com/sglkc/data-warehouse-advanced.git
This is a mock data warehouse project built around a hardware company's sales data. The final output will be a data warehouse with a layered, medallion-like architecture, presented in a human-friendly way.
The complete architecture will be documented inside the docs directory. Stay tuned!
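To make the layering idea concrete, here is a minimal sketch of a medallion-style flow (raw "bronze", cleaned "silver", aggregated "gold"). It is not this project's actual pipeline: it uses an in-memory SQLite database as a stand-in for the warehouse, and the table names, columns, and sample rows are all hypothetical.

```python
import sqlite3

# Hypothetical raw sales rows, as they might arrive from a source system.
RAW_SALES = [
    ("1001", " Keyboard ", "2", "25.00"),
    ("1002", "Monitor", "1", "150.00"),
    ("1003", "keyboard", "3", "25.00"),
    ("1004", "Monitor", "", "150.00"),  # missing quantity, filtered out in silver
]


def build_layers(conn):
    """Create illustrative bronze, silver, and gold tables (names are assumptions)."""
    cur = conn.cursor()

    # Bronze: raw data stored as-is, every column kept as text.
    cur.execute(
        "CREATE TABLE bronze_sales "
        "(order_id TEXT, product TEXT, quantity TEXT, unit_price TEXT)"
    )
    cur.executemany("INSERT INTO bronze_sales VALUES (?, ?, ?, ?)", RAW_SALES)

    # Silver: trimmed, normalized, typed, and filtered for missing quantities.
    cur.execute(
        """
        CREATE TABLE silver_sales AS
        SELECT CAST(order_id AS INTEGER)   AS order_id,
               LOWER(TRIM(product))        AS product,
               CAST(quantity AS INTEGER)   AS quantity,
               CAST(unit_price AS REAL)    AS unit_price
        FROM bronze_sales
        WHERE TRIM(quantity) <> ''
        """
    )

    # Gold: business-level aggregate that a dashboard could query directly.
    cur.execute(
        """
        CREATE TABLE gold_sales_by_product AS
        SELECT product,
               SUM(quantity)              AS units_sold,
               SUM(quantity * unit_price) AS revenue
        FROM silver_sales
        GROUP BY product
        """
    )
    conn.commit()


def revenue_by_product(conn):
    """Return (product, units_sold, revenue) rows from the gold layer."""
    return conn.execute(
        "SELECT product, units_sold, revenue "
        "FROM gold_sales_by_product ORDER BY product"
    ).fetchall()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    build_layers(conn)
    for row in revenue_by_product(conn):
        print(row)
```

With the sample rows above, the gold table reports five keyboards sold for 125.0 in revenue and one monitor for 150.0. The row with the empty quantity stays in bronze but never reaches silver or gold.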
The project is bootstrapped from Zero One Group's monorepo, which serves as a standardized boilerplate for company projects.
Most of the work is done inside the data-pipeline directory, where the data warehouse's ETL process happens.
The project uses Docker for development; the services include the PostgreSQL data warehouse and Metabase. To start the services, run:
pnpm compose:up
To stop the services, run:
pnpm compose:down
To clean up everything, including container data, run:
pnpm compose:cleanup
Once everything is running, you can use each of the services below.
To log in to the PostgreSQL data warehouse inside Docker, use the following credentials:
The project uses Metabase for data presentation. Metabase is an open-source, web-based business intelligence and analytics tool for querying and visualizing data in a straightforward manner.
Metabase uses its own database, separate from the data warehouse. This way, the front end can be synced via export and import.
After running pnpm compose:up, you must complete a first-time setup for Metabase. Please read
SETUP-METABASE.md for a walkthrough.
The data pipeline uses dlt and sqlmesh for the ETL process.
For technical details, please head to the data-pipeline directory and open README.md.