text-mining

harmonydata / pdf-questionnaire-extraction

Data and scripts for training the open-source PDF questionnaire extraction component for the Harmony Kaggle competition, using natural language processing (NLP)

nlp competition open-source pdf data-science natural-language-processing information-retrieval text-mining text-classification kaggle information-extraction psychology research-project pdf-files psychology-experiments sentence-embeddings pdf-document-processor psychology-questionnaire

Updated May 27, 2024
Python

george-gca / ai_papers_cleaner

Extract text from paper PDFs and abstracts, and remove uninformative words.

python nlp pdf text-mining nltk text-processing

Updated May 27, 2024
Python

notesjor / CorpusExplorer.Terminal.Console

Allows other programs and programming languages to access the analyses and data of CorpusExplorer v2.0

nlp api linguistic text-mining corpus-linguistics corpusexplorer

Updated May 27, 2024
C#

jakeberggren / TDDE16-Text-Mining-Project

Project in the course TDDE16 - Text Mining at Linköping University

text-mining text-classification statistical-machine-learning vector-database openai-api langchain

Updated May 27, 2024
TeX

Lambda-3 / DiscourseSimplification

Extension of the SentenceSimplification project

natural-language-processing text-mining text-classification simplification discourse-analysis discourse-parsing

Updated May 27, 2024
Java

frances-ai / frances-api

frances is an advanced cloud-based text mining platform that combines information extraction, knowledge graphs, natural language processing (NLP), deep learning, and parallel processing. It is designed specifically for exploring historical digital text collections.

natural-language-processing text-mining apache-spark information-extraction knowledge-graph parallel-processing digitised-historical-collections cloud-based-platform

Updated May 26, 2024
Jupyter Notebook

caufieldjh / awesome-bioie

🧫 A curated list of resources relevant to doing Biomedical Information Extraction (including BioNLP)

nlp natural-language-processing text-mining awesome bioinformatics information-extraction awesome-list biomedical medical-informatics biomedical-data biomedical-language

Updated May 26, 2024

gengoai / gengoai

Mono Repository for GengoAI projects

java machine-learning natural-language-processing text-mining text-classification text-analysis

Updated May 26, 2024
Java

fitria-dwi / Hoax-Detection

This project aims to build a model that predicts whether an article is a hoax or not. It also aims to identify the percentage of hoax and non-hoax articles.

text-mining neural-network machine-learning-algorithms logistic-regression unsupervised-learning support-vector-machines decision-tree-classifier random-forest-classifier gaussian-naive-bayes k-nearest-neighbor-classifier hoax-detection

Updated May 26, 2024
Jupyter Notebook

Saeidhoseinipour / ELBMcoclust

We unified some latent block models by proposing a flexible ELBM, which is extended to SELBM to address the sparsity problem by revealing a diagonal structure in sparse datasets. This yields more homogeneous co-clusters and therefore produces useful, ready-to-use, and easy-to-interpret results.

text-mining word-cloud exponential text-summarization sparse-matrix co-clustering latent-block-model coclust

Updated May 25, 2024
Python

text-mining

Here are 2,168 public repositories matching this topic...

zhenyuanlu / pyKCN

sap218 / jabberwocky

JesusSalinas / master_upb

biomedicalinformaticsgroup / cadmus

antoniooliveira03 / Projects

palladian / palladian

adbar / trafilatura

RonaldVisser / Mining_Archaeological_Reports

brandonrobertz / SparseLSH

mesolitica / malaysian-dataset

harmonydata / pdf-questionnaire-extraction

george-gca / ai_papers_cleaner

notesjor / CorpusExplorer.Terminal.Console

jakeberggren / TDDE16-Text-Mining-Project

Lambda-3 / DiscourseSimplification

frances-ai / frances-api

caufieldjh / awesome-bioie

gengoai / gengoai

fitria-dwi / Hoax-Detection

Saeidhoseinipour / ELBMcoclust
