Although digitization is advancing, many companies still hold a wide variety of documents that exist only in handwritten or typewritten form. Optical Character Recognition (OCR) and Intelligent Character Recognition (ICR) offer a remedy and open up many new possibilities for simplifying processes.
Digitization is advancing, but what do you do with conventional documents that currently exist only on paper? Even if they have already been scanned, they often remain unsearchable because each page is stored only as a flat raster image.
Even the best resolution is useless if the computer cannot read and edit the text. And then there are the handwritten notes, such as meeting minutes or transcripts, which ultimately have to be laboriously typed into the digital system. So what do we do?
Typing was yesterday, today is OCR.
For originals that exist only in physical form, such as old files and documents, scanning software with OCR can help you create editable text from an image. Photographed documents and other pixel-based files can also be processed with OCR. For this you need text recognition software and an OCR engine that is as thorough as possible, such as the one from the Swiss software developer KADMOS.
What is OCR?
OCR stands for “Optical Character Recognition”, a process that generates text from an image by means of scanning, pattern matching and computation. The document is first captured on a standard scanner, and the resulting image is then loaded into the text recognition program. The OCR engine analyzes the image's components to determine which parts of the scanned document are images and which are text. This step is very important for reassembling the document later.
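The analysis step can be illustrated with a deliberately simple sketch. It assumes the scan has already been binarized (1 = ink, 0 = background), splits the page into horizontal blocks separated by blank rows, and labels each block with a crude ink-density heuristic. The function names and the 0.6 threshold are illustrative assumptions, not how KADMOS or any particular engine works; real layout analysis is considerably more elaborate.

```python
from typing import List, Tuple

Page = List[List[int]]  # binarized scan: 1 = ink, 0 = background


def find_row_blocks(page: Page) -> List[Tuple[int, int]]:
    """Return inclusive (start_row, end_row) pairs for runs of rows that contain ink."""
    blocks = []
    start = None
    for y, row in enumerate(page):
        has_ink = any(row)
        if has_ink and start is None:
            start = y                      # a new block begins
        elif not has_ink and start is not None:
            blocks.append((start, y - 1))  # a blank row closes the block
            start = None
    if start is not None:                  # block runs to the bottom edge
        blocks.append((start, len(page) - 1))
    return blocks


def ink_density(page: Page, block: Tuple[int, int]) -> float:
    """Fraction of ink pixels inside a block."""
    start, end = block
    rows = page[start:end + 1]
    return sum(sum(r) for r in rows) / sum(len(r) for r in rows)


def label_block(page: Page, block: Tuple[int, int], threshold: float = 0.6) -> str:
    """Crude assumption: densely inked blocks are pictures, sparse ones are text."""
    return "image" if ink_density(page, block) > threshold else "text"
```

Calling `find_row_blocks` on a page and then `label_block` on each result yields the rough text/image split that the later recognition and reassembly steps build on.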
Now the parts that have been recognized as text are compared with patterns and properties available in the program. Does this collection of pixels correspond more to a symbol or a letter? The evaluation of the analysis by different algorithms makes such a decision possible for the computer. Thus, the program recognizes the text line by line and finally reassembles the document according to the initial analysis. The image has become an editable, searchable document. The document is then saved as desired in a PDF, DOC or other file format.
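The pattern comparison described above can be sketched as a toy template matcher. The example assumes each character has already been cut out and normalized to a 3x5 bitmap; the three stored templates and the pixel-agreement score are illustrative assumptions, and production engines combine many more features and algorithms.

```python
from typing import Dict, Tuple

Glyph = Tuple[str, ...]  # rows of "0"/"1" characters, all the same size

# Hypothetical stored patterns; a real engine holds far more, in finer resolution.
TEMPLATES: Dict[str, Glyph] = {
    "I": ("010", "010", "010", "010", "010"),
    "L": ("100", "100", "100", "100", "111"),
    "T": ("111", "010", "010", "010", "010"),
}


def similarity(a: Glyph, b: Glyph) -> float:
    """Fraction of pixels on which two equally sized glyphs agree."""
    matches = sum(pa == pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))
    total = sum(len(row) for row in a)
    return matches / total


def recognize(glyph: Glyph) -> Tuple[str, float]:
    """Return the best-matching template character and its similarity score."""
    best = max(TEMPLATES, key=lambda ch: similarity(glyph, TEMPLATES[ch]))
    return best, similarity(glyph, TEMPLATES[best])
```

A slightly damaged "T", for example `("111", "010", "010", "011", "010")`, still agrees with the "T" template on 14 of 15 pixels and is therefore recognized as "T". The score can also serve as a confidence value for flagging doubtful characters.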
Handwriting recognition thanks to ICR
ICR (Intelligent Character Recognition) is the logical further development of OCR. It analyzes and evaluates the scan result in more detail and also takes the semantic context into account. After the image content has been captured, it is not only separated into text and image; within the text, the software also checks whether a given character makes sense in its context. Especially for similar-looking characters such as “8” and “B”, this technique has brought a marked improvement in the accuracy of digital text recognition. Even when the originals show letters that have faded with age, ICR can often still recognize and digitize them without problems.
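The idea of checking whether a character makes sense in its context can be sketched with a very small rule: inside a token that is mostly digits, a letter that looks like a digit is probably a misread digit, and vice versa. The confusion table and the majority rule below are illustrative assumptions; actual ICR products rely on dictionaries, language models and field definitions rather than this heuristic.

```python
# Hypothetical table of look-alike characters (letter -> digit).
CONFUSABLE_TO_DIGIT = {"B": "8", "O": "0", "I": "1", "S": "5"}
CONFUSABLE_TO_LETTER = {digit: letter for letter, digit in CONFUSABLE_TO_DIGIT.items()}


def correct_token(token: str) -> str:
    """Resolve look-alike characters according to the majority type in the token."""
    digits = sum(c.isdigit() for c in token)
    letters = sum(c.isalpha() for c in token)
    if digits > letters:
        table = CONFUSABLE_TO_DIGIT    # mostly numeric: letters are suspicious
    elif letters > digits:
        table = CONFUSABLE_TO_LETTER   # mostly alphabetic: digits are suspicious
    else:
        return token                   # no clear context, leave unchanged
    return "".join(table.get(c, c) for c in token)


def correct_text(text: str) -> str:
    """Apply the token rule to every whitespace-separated token."""
    return " ".join(correct_token(t) for t in text.split())
```

With this rule, `correct_token("2B47")` yields `"2847"` and `correct_token("8ERLIN")` yields `"BERLIN"`, while an ambiguous token such as `"A1"` is left untouched.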
A major area of application for ICR is the recognition of handwritten texts, where text recognition software often failed in the past. With integrated ICR, however, these texts can be digitized as well.
Applications for OCR and ICR
In a networked world, being able to find information digitally is just as important for archived documents as it is within your own company network. Text recognition software with integrated OCR and ICR makes it possible to protect existing paper archives and their important contents from decay and to keep them accessible in a legally sound way over the long term. This also applies to historical documents set in Fraktur type!
OCR/ICR can also be used to simplify sorting processes, e.g. for incoming mail or at administrative intake points. Here, features on envelopes and/or packages can be recognized and passed on to existing sorting systems. Full-text recognition and search make it possible to process and capture entire documents. This applies to structured documents such as forms, prescriptions and bank transfer forms as well as to semi-structured texts such as invoices and delivery notes, and even to unstructured continuous text such as letters of complaint and other incoming mail. All of these document types can also be captured from the cloud, so OCR supports the networking of your processes. The time saved is considerable, and the reduction in errors compared to manual capture is also impressive.
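Once the full text of an incoming document is available, routing it can be as simple as counting keywords. The following is a minimal sketch under that assumption; the category names, keyword lists and the "manual_review" fallback are hypothetical and would have to be adapted to the sorting system actually in use.

```python
import re
from typing import Dict, Tuple

# Hypothetical routing rules: category -> keywords that hint at it.
ROUTING_RULES: Dict[str, Tuple[str, ...]] = {
    "invoice": ("invoice", "amount due", "vat"),
    "delivery_note": ("delivery note", "delivered", "shipment"),
    "complaint": ("complaint", "dissatisfied", "refund"),
}


def classify(recognized_text: str) -> str:
    """Pick the category whose keywords occur most often in the OCR output."""
    lowered = recognized_text.lower()
    scores = {
        label: sum(
            len(re.findall(r"\b" + re.escape(keyword) + r"\b", lowered))
            for keyword in keywords
        )
        for label, keywords in ROUTING_RULES.items()
    }
    best = max(scores, key=scores.get)
    # Without any keyword hit, hand the document to a human instead of guessing.
    return best if scores[best] > 0 else "manual_review"
```

For example, `classify("Invoice no. 17, amount due: 120 EUR incl. VAT")` returns `"invoice"`, whereas a text without any of the keywords falls back to `"manual_review"`.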
In accounting or elsewhere in the organization, text capture software can scan documents automatically and pass the captured data to the appropriate programs for further processing. This minimizes typing errors and files important documents in the same step.
Particularly in the digital age, you can also benefit from the technology on mobile devices. Travel expense reports and forms can be captured on the go and processed later. Meter readings, for example from heating or water meters, can also be photographed and then passed on for further processing in the company’s own workflow.
In the increasingly established field of Industry 4.0, OCR allows information to be captured from screens and machines without interrupting operation. Costly start-up and idle times are thus avoided. Production continues, and you still get the information and facts you need without any difficulty.
The use of OCR/ICR software kits from an IT provider in the pharmacy sector has been especially successful. Several million prescriptions and medical orders throughout Germany were scanned automatically, and the error rate remained below 5%. Because the prescriptions were read in at the counter, hundreds of working hours were saved and the digitization process was improved considerably.
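The text does not say how the error rate in this project was measured. One common way to quantify recognition quality is the character error rate: the edit distance between the recognized text and a manually verified reference, divided by the length of the reference. The sketch below shows that metric as an assumption, not as the method used in the pharmacy project.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum insertions, deletions and substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca
                curr[j - 1] + 1,     # insert cb
                prev[j - 1] + cost,  # substitute or keep
            ))
        prev = curr
    return prev[-1]


def character_error_rate(reference: str, recognized: str) -> float:
    """Edit distance relative to the length of the verified reference text."""
    if not reference:
        raise ValueError("reference text must not be empty")
    return edit_distance(reference, recognized) / len(reference)
```

If the reference reads "Ibuprofen 400 mg" and the software returns "Ibuprofen 4O0 mg", one of 16 characters is wrong, so `character_error_rate` gives 0.0625, i.e. 6.25%.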
It is therefore becoming apparent that OCR/ICR software will increasingly drive forward the digitization of the analog.