airscholar
💭
Do hard things!

Highlights

  • Pro

Pinned

  1. e2e-data-engineering Public

    An end-to-end data engineering pipeline that orchestrates data ingestion, processing, and storage using Apache Airflow, Python, Apache Kafka, Apache Zookeeper, Apache Spark, and Cassandra. All comp…

    Python · 110 stars · 52 forks

  2. RedditDataEngineering Public

    This project provides a comprehensive data pipeline solution to extract, transform, and load (ETL) Reddit data into a Redshift data warehouse. The pipeline leverages a combination of tools and serv…

    Python · 28 stars · 22 forks

  3. changecapture-e2e Public

    This project shows how to capture changes from a PostgreSQL database and stream them into Kafka.

    Python · 18 stars · 11 forks

  4. RealtimeStreamingEngineering Public

    This project serves as a comprehensive guide to building an end-to-end data engineering pipeline using TCP/IP Socket, Apache Spark, OpenAI LLM, Kafka and Elasticsearch. It covers each stage from da…

    Python · 14 stars · 10 forks

  5. FootballDataEngineering Public

    An end-to-end data engineering pipeline that fetches data from Wikipedia, cleans and transforms it with Apache Airflow and saves it on Azure Data Lake. Other processing takes place on Azure Data Fa…

    Python · 10 stars · 8 forks

  6. ApacheFlink-SalesAnalytics Public

    This repository contains an end-to-end data engineering project using Apache Flink, focused on performing sales analytics. The project demonstrates how to ingest, process, and analyze sales data, s…

    Java · 7 stars · 5 forks