The open-source tool for building high-quality datasets and computer vision models
Updated Jun 11, 2024 - Python
We leverage machine learning and data analysis to address real-world challenges in the copper industry. Our documentation encompasses data preprocessing, feature engineering, classification, regression, and model selection. Explore how we've enhanced predictive capabilities to optimize manufacturing solutions.
💻☕ This repository is a resource for learning data science, including learning materials and projects. It covers topics such as data analysis, machine learning, and programming.
OpenDataVal: a Unified Benchmark for Data Valuation in Python (NeurIPS 2023)
A guide to all my projects
R package to clean and standardize epidemiological data
The standard data-centric AI package for data quality and machine learning with messy, real-world data and labels.
Test data management tool for any data source, batch or real-time
pyDVL is a library of stable implementations of algorithms for data valuation and influence function computation
Prepping tables for machine learning
A toolbox of simple solutions for common data cleaning problems.
Market Basket Analysis
Desbordante is a high-performance data profiler that can discover many different patterns in data using various algorithms. It can also run data-cleaning scenarios using these algorithms. Desbordante has a console version and an easy-to-use web application.
This repository was created as part of the Data Science coursework at Birzeit University.
Easy to use Python library of customized functions for cleaning and analyzing data.
Wikidata and Wikipedia language data extraction
Find and replace erroneous fields in data using validation rules
Python package to make URL extraction, generalization, validation, and filtration easy.