ocr
Here are 4,654 public repositories matching this topic...
Fast and accurate OCR on images and PDFs using Apple Vision framework directly from command line.
Updated May 21, 2024 - Python
Open source libraries and APIs to build custom preprocessing pipelines for labeling, training, or production machine learning pipelines.
Updated May 21, 2024 - HTML
Open-source infrastructure and data orchestration platform for risk decisioning
Updated May 21, 2024 - TypeScript
A community-supported supercharged version of paperless: scan, index and archive all your physical documents
Updated May 21, 2024 - Python
PyMuPDF is a high performance Python library for data extraction, analysis, conversion & manipulation of PDF (and other) documents.
Updated May 21, 2024 - Python
A privacy-first, self-hosted, fully open source personal knowledge management software, written in TypeScript and Golang.
Updated May 21, 2024 - TypeScript
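OCR software, free and offline. Open source; supports screenshot capture and batch image import, PDF document recognition, exclusion of watermarks and headers/footers, and scanning/generating QR codes. Built-in multi-language libraries.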
Updated May 21, 2024 - QML
docTR (Document Text Recognition) - a seamless, high-performing & accessible library for OCR-related tasks powered by Deep Learning.
Updated May 21, 2024 - Python
Dedoc is a library (service) for automated document parsing and conversion to a uniform format. It automatically extracts content, logical structure, tables, and meta information from textual electronic documents. (Parse document; Document content extraction; Document logical extraction; PDF parser; Scanned document parser; DOCX parser; HTML parser)
Updated May 21, 2024 - Python
Awesome multilingual OCR toolkits based on PaddlePaddle (a practical, ultra-lightweight OCR system that supports recognition of 80+ languages, provides data annotation and synthesis tools, and supports training and deployment on server, mobile, embedded, and IoT devices)
Updated May 21, 2024 - Python
Online Handwritten Text Recognition (HTR) system implemented with PyTorch. Based on https://doi.org/10.1007/s10032-020-00350-4.
Updated May 21, 2024 - Jupyter Notebook
Awesome OCR toolkits for multiple programming languages, based on ONNXRuntime, OpenVINO and PaddlePaddle.
Updated May 21, 2024 - Python
Official Implementation of OCR-free Document Understanding Transformer (Donut) and Synthetic Document Generator (SynthDoG), ECCV 2022
Updated May 21, 2024 - Python
RAGFlow is an open-source RAG (Retrieval-Augmented Generation) engine based on deep document understanding.
Updated May 21, 2024 - Python