Tesseract Ocr .net 4.0 to Read License Tag
Overview
- Optical Grapheme Recognition (OCR) is a widely used system in the computer vision space
- Learn how to build your ain OCR for a multifariousness of tasks
- We will leverage the OpenCV library and Tesseract for edifice the OCR arrangement
Introduction
Do you think the days when y'all had to fill in the dots of the correct reply during an exam? Or how nearly the aptitude test you gave before your offset task? I can vividly retrieve the olympiads and multiple-choice tests where universities and organizations used an Optical Graphic symbol Recognition (OCR) system to grade the answer sheets in droves.
Honestly, OCR has applications in a wide range of industries and functions. Then, everything from scanning documents – bank statements, receipts, handwritten documents, coupons, etc., to reading street signs in autonomous vehicles – this all falls under the OCR umbrella.
OCR systems used to be quite expensive and cumbersome to build a couple of decades ago. But advances in the computer vision and deep learning field mean we can build our own OCR system right now!
But building an OCR system isn't a straightforward task. For starters, information technology is filled with problems like different fonts in images, poor contrast, multiple objects in an paradigm, etc.
So, in this article, we volition explore some very famous and effective approaches for the OCR chore and how you tin implement i yourself.
If you are new to object detection and computer vision, I suggest going through the following resources:
- Step-by-Step Introduction to Basic Object Detection Algorithms
- Computer Vision Grade
Table of Contents
- What is Optical Graphic symbol Recognition (OCR)?
- Popular OCR Applications in the Real World
- Text Recognition with Tesseract OCR
- The Unlike Means for Text Detection
What is Optical Character Recognition (OCR)?
Let's first sympathise what OCR is, in case you haven't come beyond this concept before.
OCR, or Optical Character Recognition, is a process of recognizing text within images and converting information technology into an electronic form. These images could be of handwritten text, printed text like documents, receipts, name cards, etc., or even a natural scene photograph.
OCR has two parts to it. The start part is text detection where the textual part inside the image is determined. This localization of text within the image is important for the second function of OCR, text recognition, where the text is extracted from the epitome. Using these techniques together is how you can extract text from whatever paradigm.
But null is perfect and OCR is no exception. However, with the advent of deep learning, it has become possible to become meliorate and more generalized solutions to this problem.
Before we dive into how to build your own OCR, let's take a look at some of the popular applications of OCR.
Popular OCR Applications in the Real World
OCR has widespread applications across industries (primarily with the aim of reducing manual man try). Information technology has been incorporated in our everyday life to an extent that we hardly ever discover it! Just they surely strive to bring a improve user experience.
OCR is used for handwriting recognition tasks to extract information. A lot of piece of work is going on in this field and we take made some actually meaning advancements. Microsoft has come up with an awesome mathematical application that takes as input a handwritten mathematical equation and generates the solution along with a stride-by-step explanation of the working.
OCR is increasingly being used for digitization by various industries to cut down manual workload. This makes information technology very easy and efficient to extract and store data from business documents, receipts, invoices, passports, etc. Also, when you upload your documents for KYC (Know Your Customer), OCR is used to extract data from these documents and store them for hereafter reference.
OCR is as well used for book scanning where it turns raw images into a digital text format. Many large calibration projects like the Gutenberg project, Meg Book Project, and Google Books employ OCR to scan and digitize books and store the works as an annal.
The banking industry is also increasingly using OCR to annal client-related paperwork, like onboarding material, to easily create a client repository. This significantly reduces the onboarding time and thereby improves the user experience. Also, banks use OCR to extract information like account number, corporeality, cheque number from cheques for faster processing.
The applications of OCR are incomplete without mentioning their use in cocky-driving cars. Democratic cars rely extensively on OCR to read signposts and traffic signs. An effective understanding of these signs makes democratic cars prophylactic for pedestrians and other vehicles that ply on the roads.
There are definitely many more applications of OCR similar vehicle number plate recognition, converting scanned documents into editable word documents, and many more than. I would dearest to hear your experience of using OCR – let me know in the comments section below.
The digitization using OCR obviously has widespread advantages similar easy storage and manipulation of the text, not to mention the unfathomable corporeality of analytics that you can utilise to this data! OCR is definitely i of the most important fields of Figurer Vision.
At present, let'southward look at 1 of the nigh famous and widely used text recognition techniques – Tesseract.
Text Recognition with Tesseract OCR
Tesseract is an open-source OCR engine originally adult as proprietary software by HP (Hewlett-Packard) only was later made open source in 2005. Google has since and then adopted the projection and sponsored its evolution.
Every bit of today, Tesseract tin detect over 100 languages and can process fifty-fifty right-to-left text such as Arabic or Hebrew! No wonder it is used by Google for text detection on mobile devices, in videos, and in Gmail's paradigm spam detection algorithm.
From version 4 onwards, Google has given a significant boost to this OCR engine. Tesseract 4.0 has added a new OCR engine that uses a neural network system based on LSTM (Long Short-term Retentivity), one of the about effective solutions for sequence prediction problems. Although its previous OCR engine using blueprint matching is still bachelor equally legacy code.
One time you have downloaded Tesseract onto your organisation, you easily run information technology from the command line using the post-obit command:
tesseract <test_image> <output_file_name> -l <language(s)> --oem <mode> --psm <way>
You can modify the Tesseract configuration for results all-time suited for your image:
- Langue (-50) – Y'all can notice a single language or multiple languages with Tesseract
- OCR engine mode (–oem) – As you already know, Tesseract four has both LSTM and Legacy OCR engines. However, there are 4 modes of valid operation modes based on their combination
3. Page Division (–psm) – Can exist adjusted according to the text in the image for better results
Pyteseract
However, instead of the control-line method, you lot could also use Pytesseract – a Python wrapper for Tesseract. Using this you can hands implement your own text recognizer using Tesseract OCR by writing a simple Python script.
Yous tin can download Pytesseract using the pip install pytesseractcontrol.
The main function in Pytesseract is image_to_text() which takes the prototype and the control line options as its arguments:
What are the Challenges with Tesseract?
It'due south no surreptitious that Tesseract is not perfect. Information technology performs poorly when the epitome has a lot of noise or when the font of the linguistic communication is i on which Tesseract OCR is non trained. Other conditions like brightness or skewness of text will besides bear on the operation of Tesseract. Nevertheless, it is a good starting betoken for text recognition with low efforts and loftier outputs.
The Dissimilar Ways for Text Detection
Tesseract assumes that the input text image is adequately clean. Unfortunately, many input images volition contain a plethora of objects and not just a clean preprocessed text. Therefore, it becomes imperative to accept a good text detection arrangement that tin can detect text which can then be easily extracted.
There are a fair few means for text detection:
- Traditional way of using OpenCV
- Gimmicky manner of using Deep Learning models, and
- Building your very own custom model
Text Detection using OpenCV
Text detection using OpenCV is the classic style of doing things. Y'all tin can use diverse manipulations similar image resizing, blurring, thresholding, morphological operations, etc. to clean the epitome.
Here we take Grayscale, blurred and thresholded images, in that order.
Once you have done that, y'all can apply OpenCV contours detection to observe contours to extract chunks of data:
Finally, you can apply text recognition on the contours that you got to predict the text:
The results in the image above were achieved with minimum preprocessing and contour detection followed by text recognition using Pytesseract. Obviously, the contours did non notice the text every time.
But, still, doing text detection with OpenCV is a boring task requiring a lot of playing around with the parameters. Besides, it does non do well in terms of generalization. A improve way of doing this is by using the E text detection model.
Contemporary Deep Learning Model – EAST
EAST, or Efficient and Accurate Scene Text Detector, is a deep learning model for detecting text from natural scene images. It is pretty fast and authentic as information technology is able to detect 720p images at 13.2fps with an F-score of 0.7820.
The model consists of a Fully Convolutional Network and a Not-maximum suppression stage to predict a word or text lines. The model, however, does not include some intermediary steps like candidate proposal, text region formation, and word partition that were involved in other previous models, which allows for an optimized model.
You tin have a look at the paradigm below provided past the authors in their paper comparing the Due east model with other previous models:
East has a U-shape network. The first part of the network consists of convolutional layers trained on the ImageNet dataset. The next office is the characteristic merging branch which concatenates the current feature map with the unpooled feature map from the previous stage.
This is followed past convolutional layers to reduce ciphering and produce output feature maps. Finally, using a convolutional layer, the output is a score map showing the presence of text and a geometry map which is either a rotated box or a quadrangle that covers the text. This can be visually understood from the image of the architecture that was included in the research paper:
I highly suggest you become through the paper yourself to get a adept understanding of the Due east model.
OpenCV has included the Due east text detector model in version three.4 onwards. This makes it super convenient to implement your own text detector. The resulting localized text boxes can be passed through Tesseract OCR to extract the text and you will accept a complete end-to-end model for OCR.
Custom Model using TensorFlow Object API for Text Detection
The last method to build your text detector is using a custom-built text detector model using the TensorFlow Object API. It is an open-source framework used to build deep learning models for object detection tasks. To sympathise it in item, I suggest going through this detailed article offset.
To build your custom text detector, y'all would obviously require a dataset of quite a few images, at least more than 100. Then you need to annotate these images and so that the model can know where the target object is and larn everything about it. Finally, you lot can choose from ane of the pre-trained models, depending on the trade-off between functioning and speed, from TensorFlow's detection model zoo. You tin can refer to this comprehensive blog to build your custom model.
Now. training tin require some computation, but if you don't really take enough of it, don't worry! Yous tin use Google Colaboratory for all your requirements! This article will teach y'all how to use it effectively.
Finally, if y'all want to go a footstep ahead and build a YOLO state-of-the-fine art text detector model, this article will be a stepping stone to understanding all the nitty-gritty of it and yous will be off to a great showtime!
Stop Notes
In this article, we covered the problems in OCR and the various approaches that can be used to solve the chore. We likewise discussed the diverse shortcomings in the approaches and why OCR is non every bit easy every bit it seems!
Have you worked with any OCR application before? What kind of OCR use cases exercise you plan on building after this? Let me know your ideas and feedback below.
Source: https://www.analyticsvidhya.com/blog/2020/05/build-your-own-ocr-google-tesseract-opencv/
0 Response to "Tesseract Ocr .net 4.0 to Read License Tag"
Post a Comment