Apache Paimon is a lake format that enables building a Realtime Lakehouse Architecture with Flink and Spark for both streaming and batch operations.
Updated May 17, 2024 - Java
SeaTunnel is a next-generation super high-performance, distributed, massive data integration tool.
Data ingestion lab with NiFi, plus data analysis in PySpark and Pandas on data from the UK Police Crime API
Infer SQL DDL statements from tabular data.
Data Ingestion pipeline orchestration with Prefect
A Python library that enables ML teams to share, load, and transform data in a collaborative, flexible, and efficient way 🌰
Data Ingestion Project on Cricket World Cups 2011-19.
A simple search, extraction, and ingestion system for retrieving the best-selling tech products on Amazon
kg-import automates the ingestion of heterogeneous datasets into a Knowledge Graph.
ingestr is a CLI tool to copy data between any databases with a single command seamlessly.
Various data importing routines with a unified interface (data-import, slurp).
Data Engineering Zoomcamp 2024
Pravega - Streaming as a new software defined storage primitive
The Data Integration Library project provides a library of generic components based on a multi-stage architecture for data ingress and egress.