japerry911 / crypto-data-pipeline

Data Pipeline that utilizes GCP, Python 3.10, Prefect, and more.

.github/
dbt/
src/
.gitignore
.prefectignore
Dockerfile_Agent
Dockerfile_Flows
poetry.lock
pyproject.toml
README.md
requirements.txt

Sky-Pipe

Summary

Sky-Pipe is a Prefect data pipeline that integrates Google Cloud Platform and dbt. It fetches daily exchange data from CoinMarketCap.com, loads that data into Google Cloud Storage and BigQuery, and transforms it with dbt. GitHub Actions handles the DevOps side: building the Docker images and deploying the Prefect agent and flows.

Technologies Used

  • Google Cloud Platform
      • Google Cloud Run Jobs
      • Google Cloud Storage
      • Google Cloud Secret Manager
      • Google Cloud BigQuery
      • Google Cloud Artifact Registry
  • Python 3.10
  • Prefect
      • Flow
      • Task
      • Logging
  • Prefect-GCP
      • Cloud Run Job
      • GCP Secrets
      • Google Cloud Storage
      • GCP Credentials
      • GCP BigQuery


Prefect Flow Steps

  • Fetch data from CoinMarketCap.com
  • Write data to Pandas DataFrame
  • Load DataFrame to Parquet file
  • Upload Parquet file to Google Cloud Storage bucket
  • Load Parquet file from Google Cloud Storage to BigQuery via Load Job
  • Trigger dbt job to run transformations within BigQuery
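
The steps above could be wired together roughly as follows. This is a minimal sketch, not the code in `src`: the bucket name, table ID, object naming scheme, and the CoinMarketCap endpoint are all assumptions chosen for illustration, and the dbt step assumes the dbt CLI is invoked locally rather than whatever trigger mechanism the repository actually uses. Third-party libraries are imported inside each task so the module can be read and imported without them.

```python
from datetime import date
from pathlib import Path
from typing import Optional

try:
    from prefect import flow, task
except ImportError:
    # Fallback so the sketch still imports without Prefect installed.
    def task(fn):
        return fn

    def flow(fn):
        return fn

# Hypothetical names, not taken from the repository.
BUCKET = "example-crypto-bucket"
TABLE_ID = "example-project.raw.exchange_listings"
# Assumed endpoint; the README only says "daily exchange data".
CMC_URL = "https://pro-api.coinmarketcap.com/v1/cryptocurrency/listings/latest"


def blob_name_for(run_date: date) -> str:
    """Build a date-partitioned object name for the daily Parquet file."""
    return f"coinmarketcap/{run_date:%Y/%m/%d}/listings.parquet"


@task
def fetch_listings(api_key: str) -> list:
    import requests

    resp = requests.get(
        CMC_URL, headers={"X-CMC_PRO_API_KEY": api_key}, timeout=30
    )
    resp.raise_for_status()
    return resp.json()["data"]


@task
def write_parquet(records: list, path: Path) -> Path:
    import pandas as pd

    # Flatten nested JSON; "_" keeps column names free of dots.
    df = pd.json_normalize(records, sep="_")
    df.to_parquet(path, index=False)
    return path


@task
def upload_to_gcs(path: Path, bucket_name: str, blob_name: str) -> str:
    from google.cloud import storage

    blob = storage.Client().bucket(bucket_name).blob(blob_name)
    blob.upload_from_filename(str(path))
    return f"gs://{bucket_name}/{blob_name}"


@task
def load_to_bigquery(uri: str, table_id: str) -> None:
    from google.cloud import bigquery

    client = bigquery.Client()
    job_config = bigquery.LoadJobConfig(
        source_format=bigquery.SourceFormat.PARQUET,
        write_disposition=bigquery.WriteDisposition.WRITE_APPEND,
    )
    # .result() blocks until the load job finishes and raises on failure.
    client.load_table_from_uri(uri, table_id, job_config=job_config).result()


@task
def run_dbt(project_dir: str) -> None:
    import subprocess

    # Assumption: dbt is run through its CLI against the given project.
    subprocess.run(["dbt", "run", "--project-dir", project_dir], check=True)


@flow
def daily_pipeline(api_key: str, run_date: Optional[date] = None) -> str:
    run_date = run_date or date.today()
    records = fetch_listings(api_key)
    path = write_parquet(records, Path("listings.parquet"))
    uri = upload_to_gcs(path, BUCKET, blob_name_for(run_date))
    load_to_bigquery(uri, TABLE_ID)
    run_dbt("dbt")
    return uri
```

Calling `daily_pipeline(api_key="...")` would run the six steps in order. In a real deployment the API key and GCP credentials would more likely come from Prefect-GCP blocks than from a function argument.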


GitHub Actions

  • Workflows
      • Deploy Agent
          • Deploys the Prefect Agent Docker image and deploys the Prefect Agent to Google Compute Engine.
      • Deploy Flows and Blocks
          • Deploys the Prefect Flows Docker image and the Prefect Flow Deployment for CoinMarketCap.com.
  • Actions
      • Container-Image
          • Deploys a Docker image to Artifact Registry.
      • Deploy Container to Compute Engine
          • Deploys a Docker image from Artifact Registry to Compute Engine.
      • Deploy-Flows
          • Deploys Prefect Deployments to Prefect Cloud.