YTsaurus is a scalable and fault-tolerant open-source big data platform.
StarRocks, a Linux Foundation project, is a next-generation sub-second MPP OLAP database for full analytics scenarios, including multi-dimensional analytics, real-time analytics, and ad-hoc queries.
FLaNK AI Weekly covering Apache NiFi, Apache Flink, Apache Kafka, Apache Spark, Apache Iceberg, Apache Ozone, Apache Pulsar, and more...
World's most powerful open data catalog for building a high-performance, geo-distributed and federated metadata lake.
Supercharge Your Compute for Analytics & AI
A modern data marketplace that makes collaboration among diverse users (such as business users, analysts, and engineers) easier, increasing efficiency and agility in data projects on AWS.
LakeSoul is an end-to-end, real-time, cloud-native lakehouse framework with fast data ingestion, concurrent updates, and incremental data analytics on cloud storage for both BI and AI applications.
Examples of using Terraform to deploy Databricks resources
Run an open-source data LakeHouse locally using Docker Compose
Helm chart for deploying ParadeDB on Kubernetes
The goal of this project is to provide documentation for the Lakehouse Engine framework.
The Lakehouse Engine is a configuration-driven Spark framework, written in Python, serving as a scalable and distributed engine for several lakehouse algorithms, data flows, and utilities for Data Products.
Build Your First End-to-End Lakehouse Solution (aka.ms/fabconlake)