OCR transforms text images into machine-readable formats. With applications ranging from receipts to license plates, we explore the process, syntax, and examples, demonstrating its versatility. In this tutorial, we will learn to perform Optical Character Recognition in the R programming language using the Tesseract and Magick libraries.

Optical Character Recognition

OCR stands for Optical Character Recognition. It is the procedure that transforms a text image into a text format that computers can read. OCR scans the image and extracts the text from it, which we can then store in a string variable. OCR is used in receipt readers, cheque readers, code scanners, license plate scanners, and numerous other applications. The libraries used will be:
The Tesseract package provides R bindings to Tesseract: a powerful optical character recognition (OCR) engine that supports over 100 languages. The engine is highly configurable to tune the detection algorithms and obtain the best possible results. The Magick package provides bindings to the ImageMagick library and is used here to read and manipulate the images.

Syntax

To perform Optical Character Recognition, we simply use the ocr() method and pass the file.

text <- ocr(pngfile)
cat(text)
The ocr() method takes the png file and extracts the text using its pre-trained model.

Example 1: Reading text from an Image

Step 1: Install and load the libraries.
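A minimal sketch of this step (the package names are as published on CRAN; installation only needs to be run once):

# Install the packages from CRAN (one-time step)
install.packages("tesseract")
install.packages("magick")

# Load them for the current session
library(tesseract)
library(magick)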
Step 2: Load an image from a URL or file storage.
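A sketch of this step using magick's image_read(); the URL below is a hypothetical placeholder for the banner image used in the original example:

# Read the image from a URL (or a local file path) into a magick image object
img <- image_read("https://example.com/geeksforgeeks_banner.png")  # hypothetical URL

# Display the image
print(img)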
Output: the loaded image is displayed (the GeeksforGeeks banner).

Step 3: Apply the OCR method on it.
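A sketch of this step, assuming img is the magick image object loaded above:

# Run Tesseract OCR on the image and print the extracted text
text <- ocr(img)
cat(text)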
Output: [1] "GeeksforGeeks\nA computer science portal for geeks\n"
Example 2: Converting text from a PDF

Here we need to convert the PDF into png images and then perform OCR on them. The syntax is as follows:

pngfile <- pdftools::pdf_convert('https://www.africau.edu/images/default/sample.pdf', dpi = 600)
Here is the full code:
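A sketch of the full code, assuming the pdftools package is installed alongside tesseract; pdf_convert() returns the file names of the generated PNG pages, which ocr() accepts directly:

library(tesseract)
library(pdftools)

# Convert every page of the PDF to a high-resolution PNG file
pngfile <- pdf_convert("https://www.africau.edu/images/default/sample.pdf", dpi = 600)

# Run OCR on the generated PNG files and print the extracted text
text <- ocr(pngfile)
cat(text)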
Output: Converting page 1 to sample_1.png... done!
Converting page 2 to sample_2.png... done!
This is a small demonstration .pdf file -
just for use in the Virtual Mechanics tutorials. More text. And more
text. And more text. And more text. And more text.
And more text. And more text. And more text. And more text. And more
text. And more text. Boring, zzzzz. And more text. And more text. And
more text. And more text. And more text. And more text. And more text.
And more text. And more text.
And more text. And more text. And more text. And more text. And more
text. And more text. And more text. Even more. Continued on page 2 ...
Simple PDF File 2
...continued from page 1. Yet more text. And more text. And more text.
And more text. And more text. And more text. And more text. And more
text. Oh, how boring typing this stuff. But not as boring as watching
paint dry. And more text. And more text. And more text. And more text.
Boring. More, a little more text. The end, and just as well.
Text Localization in OCR

Now we will learn to get the position of the text and draw a bounding box around it. To get the bounding box, we can run the ocr_data() method on the image.

bound_box = ocr_data(img)
Step 1: Load the libraries.
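A sketch of this step (the same packages as before, assumed to be installed already):

library(tesseract)
library(magick)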
Step 2: Load the image and generate the bounding box data. The ocr_data() method takes an image and returns the coordinates of each detected word's rectangle as a comma-separated string of (x1, y1, x2, y2) values, which we extract in a later step. The coordinate data is stored in the bound_box variable.
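A sketch of this step; the file name below is a hypothetical placeholder for the image used in the original example:

# Load the image and extract word-level OCR data
img <- image_read("image.png")  # hypothetical local file
bound_box <- ocr_data(img)

# Inspect the word, confidence, and bbox columns
print(bound_box)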
Step 3: Convert the coordinates from character to double by splitting the bound_box data at the commas and saving the pieces as xmin, ymin, xmax, and ymax respectively.
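A sketch of this step using base R string splitting (the original may have used a different helper, such as tidyr::separate()):

# Split each "x1,y1,x2,y2" bbox string into four numeric values
coords <- do.call(rbind, strsplit(bound_box$bbox, ","))

bound_box$xmin <- as.numeric(coords[, 1])
bound_box$ymin <- as.numeric(coords[, 2])
bound_box$xmax <- as.numeric(coords[, 3])
bound_box$ymax <- as.numeric(coords[, 4])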
Output: a data frame with the columns word, confidence, and bbox, now extended with the numeric xmin, ymin, xmax, and ymax columns.

Step 4: Plot the image with the bounding boxes.
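A sketch of this step using magick's image_draw(), which opens a graphics device on the image so that base graphics calls such as rect() draw directly onto it:

# Open a drawing device on the image
img_boxes <- image_draw(img)

# Draw one rectangle per detected word
rect(bound_box$xmin, bound_box$ymin,
     bound_box$xmax, bound_box$ymax,
     border = "red", lwd = 2)

# Close the device and display the annotated image
dev.off()
print(img_boxes)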
Output: the image is displayed with bounding boxes drawn around the detected text.

Advantages of OCR
Disadvantages of OCR
Conclusion

In conclusion, Optical Character Recognition in R opens avenues for text extraction from diverse sources. The Tesseract and Magick libraries facilitate seamless integration, enabling tasks such as reading images and converting PDFs. While powerful, OCR’s effectiveness depends on image quality, with potential challenges in handwritten text recognition.
Reference: https://www.geeksforgeeks.org