Optical Character Recognition for Image Files Containing Bisaya Texts

CCS Research Symposium
CCS 2021 PROCEEDINGS

Authors : Eirol Jan Coronado, Tristan Montaner

Pages : 1-5

Publication Date : 2021-06-02

Abstract : Optical character recognition (OCR) is the mechanical or electronic translation of images of handwritten or printed text into machine-editable text [4]. It is performed by optical character readers, which are automated electronic systems. OCR may also be defined as the process of converting images of machine-printed or handwritten numerals, letters, and symbols into a computer-processable format. This study assesses the accuracy of OCR, using PyTesseract as the program under test. We extracted text from images taken from different books. We also conducted alpha and beta testing to determine whether the results differ when the program is used by us or by other people. Inconsistent results were observed while testing PyTesseract. Although the program is very easy to use and efficient, this study provides evidence that OCR is not always 100% accurate.
Keywords : Optical character recognition, text extraction, artificial intelligence, information extraction
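
The abstract does not state how accuracy was measured, so the following is only a minimal sketch of one common approach: comparing the OCR output against a hand-typed reference transcription using character-level edit distance. The helper names (`levenshtein`, `character_accuracy`, `ocr_image`) are hypothetical, and the `"ceb"` (Cebuano) Tesseract language code is an assumption, not something the paper confirms.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance: minimum insertions, deletions, and substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def character_accuracy(reference: str, recognized: str) -> float:
    """Return 1 - (edit distance / reference length), floored at 0."""
    if not reference:
        return 1.0 if not recognized else 0.0
    errors = levenshtein(reference, recognized)
    return max(0.0, 1.0 - errors / len(reference))


def ocr_image(path: str, lang: str = "ceb") -> str:
    """Run Tesseract on one image file.

    Assumption: the "ceb" traineddata file is installed; the paper does not
    say which language model was used.
    """
    # Imported lazily so the metric above works without Tesseract installed.
    from PIL import Image
    import pytesseract
    return pytesseract.image_to_string(Image.open(path), lang=lang)
```

With a scanned page and its manual transcription, `character_accuracy(transcription, ocr_image("page.png"))` would give a score between 0 and 1. Repeating this across testers and images is one way to quantify the kind of inconsistency the abstract reports.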
