Data filter routines using NumPy
Updated Feb 14, 2021 - Python
Analysis of Tweets Dataset using concepts like Data Curation and Data Processing.
TokenEditor is a web application for manual annotation of text, or for manual review of automatic annotations. Although primarily aimed at reviewing PoS tags and lemmas, it is fully customizable and can support any annotation level.
Web application for text-based data labeling 🏷️
Python package to make URL extraction, generalization, validation, and filtration easy.
Rebalancing chemical reaction
Materials from a guest lecture entitled, "Beyond Data Standards," prepared for University of Washington's LIS 546 (Data Curation II) in Spring 2021.
Web scraping, text data collection, and curation for the Maithili language, plus language modeling on the collected data.
Exercises from the "Diploma in Data Sciences, Machine Learning and its applications", for which I was a mentor.
Canonicalizing data and implementing strategies for ensuring equivalence
Data Curation, Winter 2021
Code I wrote for the paper "Global determinants of freshwater and marine fish genetic diversity" (Nature Communications, 2020).
This program discovers equivalence links (owl:sameAs) for a given set of URIs dynamically and online using SPARQL queries.
Code and data for "Target-oriented Proactive Dialogue Systems with Personalization: Problem Formulation and Dataset Curation" (EMNLP 2023)
R script for renaming GenBank sequences, filling in missing molecular marker data, and concatenating sequences.
TranSMART Arborist: Graphical tool for reshaping your data for the tranSMART data warehouse.
Some analysis on public datasets [WIP]
COVID19 Case Report Form Analysis - data and collection forms.