datalake
Here are 224 public repositories matching this topic...
Official repository of Trino, the distributed SQL query engine for big data, formerly known as PrestoSQL (https://trino.io)
Updated May 19, 2024 - Java
StarRocks, a Linux Foundation project, is a next-generation sub-second MPP OLAP database for full analytics scenarios, including multi-dimensional analytics, real-time analytics, and ad-hoc queries. Winner of InfoWorld’s 2023 BOSSIE Award for best open source software.
Updated May 19, 2024 - Java
Database for AI. Store Vectors, Images, Texts, Videos, etc. Use with LLMs/LangChain. Store, query, version, & visualize any AI data. Stream data in real-time to PyTorch/TensorFlow. https://activeloop.ai
Updated May 17, 2024 - Python
Upserts, Deletes And Incremental Processing on Big Data.
Updated May 19, 2024 - Java
Postgres for Search and Analytics
Updated May 19, 2024 - Rust
lakeFS - Data version control for your data lake | Git for data
Updated May 19, 2024 - Go
Dinky is a real-time data development platform based on Apache Flink, enabling agile data development, deployment and operation.
Updated May 19, 2024 - Java
LakeSoul is an end-to-end, realtime and cloud native Lakehouse framework with fast data ingestion, concurrent update and incremental data analytics on cloud storages for both BI and AI applications.
Updated May 15, 2024 - Java
The LeoFS Storage System
Updated Jun 2, 2020 - Erlang
Scalable identity resolution, entity resolution, data mastering and deduplication using ML
Updated May 15, 2024 - Java
A collection of Apache Hudi-related resources
Updated May 19, 2024
A free to use dbt package for creating and loading Data Vault 2.0 compliant Data Warehouses (powered by dbt, an open source data engineering tool, registered trademark of dbt Labs)
Updated May 15, 2024
World's most powerful data catalog service, providing a high-performance, geo-distributed, and federated metadata lake.
Updated May 19, 2024 - Java
Use SQL to build ELT pipelines on a data lakehouse.
Updated May 25, 2022 - JavaScript
Open Control Plane for Tables in Data Lakehouse
Updated May 18, 2024 - Java
A Data Platform built for AWS, powered by Kubernetes.
Updated Jul 24, 2023 - Python
An IDE and translation engine for detection engineers and threat hunters. Be faster, write smarter, keep 100% privacy.
Updated May 17, 2024 - Python