The standard data-centric AI package for data quality and machine learning with messy, real-world data and labels.
Updated Jun 11, 2024 · Python
The open-source tool for building high-quality datasets and computer vision models
pyDVL is a library of stable implementations of algorithms for data valuation and influence function computation
OpenDataVal: a Unified Benchmark for Data Valuation in Python (NeurIPS 2023)
Client interface for all things Cleanlab Studio
Interactively explore unstructured datasets from your dataframe.
Papers about training data quality management for ML models.
Applying various data engineering techniques to an image classification task for the KAIST DS801 term project
🧼🔎 A holistic self-supervised data cleaning strategy to detect irrelevant samples, near duplicates and label errors.
A multi-view panorama of Data-Centric AI: Techniques, Tools, and Applications (ECAI Tutorial 2024)
Introduction to Data-Centric AI, MIT IAP 2023 🤖
This data-centric AI repository implements a robust deep learning method (LFBNet) for fully automated tumor segmentation in whole-body [18]F-FDG PET/CT images.
Frontiers in Neuroinformatics 2022: Local Label Point Correction for Edge Detection of Overlapping Cervical Cells
Official Python SDK for Kern AI refinery.
Hugging Face Plugins for FiftyOne
A curated, but incomplete, list of data-centric AI resources.
Automatically find issues in image datasets and practice data-centric computer vision.
The data scientist's open-source choice to scale, assess and maintain natural language data. Treat training data like a software artifact.
Unsupervised classification to improve the quality of a bird song recording dataset. https://doi.org/10.1016/j.ecoinf.2022.101952
Enhancing Efficiency in Multidevice Federated Learning through Data Selection