Extracts data from PDF files of resumes written in English. Libraries used: spaCy, pdf2image, EasyOCR, poppler-utils.

anmsajedulalam/Data-Extraction-from-PDF-files-of-Resume

Data-Extraction-from-PDF-files-of-Resume

Author:

A. N. M. Sajedul Alam

Goal:

Extract data from PDF files of resumes

Important Details:

a) Tested on 6 different resumes collected at random from the web
b) Used poppler-utils to manipulate PDF files and convert them to other formats
c) Used pdf2image to convert PDF pages to images and EasyOCR for optical character recognition
d) Used spaCy for advanced natural language processing tasks and Pillow's ImageDraw module for drawing simple 2D graphics on image objects
e) Works only with resumes written in English
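
The steps listed above can be sketched roughly as follows. This is a minimal illustration, not the repository's actual code. The function names (`pdf_to_pages`, `ocr_page`, `detections_to_text`, `draw_boxes`, `extract_resume`), the `dpi` and `min_conf` values, the spaCy model `en_core_web_sm`, and the crude top-to-bottom ordering of OCR results are all assumptions made for the example.

```python
from typing import List, Sequence, Tuple

# One EasyOCR detection: (four corner points, recognized text, confidence).
Detection = Tuple[Sequence[Sequence[float]], str, float]


def pdf_to_pages(pdf_path: str, dpi: int = 200):
    """Render each PDF page to a PIL image (pdf2image relies on poppler-utils)."""
    from pdf2image import convert_from_path
    return convert_from_path(pdf_path, dpi=dpi)


def ocr_page(reader, page) -> List[Detection]:
    """Run an easyocr.Reader on one PIL page image."""
    import numpy as np
    return reader.readtext(np.array(page))


def detections_to_text(detections: List[Detection], min_conf: float = 0.3) -> str:
    """Drop low-confidence detections and join the rest, one per line.

    Ordering is a crude assumption: sort by the top-left corner,
    y first, then x. Real layouts may need proper line grouping.
    """
    kept = [d for d in detections if d[2] >= min_conf]
    kept.sort(key=lambda d: (d[0][0][1], d[0][0][0]))
    return "\n".join(d[1] for d in kept)


def draw_boxes(page, detections: List[Detection]):
    """Outline each detection on a copy of the page using Pillow's ImageDraw."""
    from PIL import ImageDraw
    annotated = page.copy()
    draw = ImageDraw.Draw(annotated)
    for box, _text, _conf in detections:
        xs = [float(p[0]) for p in box]
        ys = [float(p[1]) for p in box]
        draw.rectangle([min(xs), min(ys), max(xs), max(ys)], outline="red", width=2)
    return annotated


def extract_resume(pdf_path: str, model: str = "en_core_web_sm"):
    """PDF -> page images -> OCR text -> spaCy named entities, per page."""
    import easyocr
    import spacy
    reader = easyocr.Reader(["en"])  # English only, as in the README
    nlp = spacy.load(model)          # assumed model; must be installed separately
    results = []
    for page in pdf_to_pages(pdf_path):
        detections = ocr_page(reader, page)
        text = detections_to_text(detections)
        entities = [(ent.text, ent.label_) for ent in nlp(text).ents]
        results.append({"text": text, "entities": entities})
    return results
```

Third-party imports are kept inside the functions so that the text-joining helper can be used and tested without poppler-utils, EasyOCR, or spaCy installed. Calling `extract_resume("resume.pdf")` (a hypothetical file name) would return one dictionary per page.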

Attention:

All the resumes were scraped at random from the web. The author will not be liable for any future misuse.
