Create Searchable PDF Images Using OCR Software

Posted by Widget on June 17th, 2007

You know you want to get rid of the paper and the file cabinets and the endless hours of searching for lost records, so you’ve decided to go paperless. Why stop there? Scanning papers just guarantees that you’re going to have a bunch of image files on your network, and you can’t search a picture. To run searches, you need text. That’s where OCR software comes in.

OCR stands for Optical Character Recognition. It’s the function that interprets the graphical marks in a scanned image as letters forming words. Of course, that recognition is seldom 100% accurate, but it’s getting better. What you want is an application that runs OCR in the background and then attaches the recognized text to the PDF image, creating a single searchable file.
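Here’s a rough sketch of that workflow in Python: it runs OCR on one scanned page and writes a PDF that keeps the page image along with the recognized text. It assumes the open-source Tesseract engine and its pytesseract wrapper are installed (neither is one of the products named in this post), and the function names and file names are made up for illustration.

```python
from pathlib import Path


def searchable_pdf_path(image_path):
    """Return the output path: same folder and name as the scan, with a .pdf suffix."""
    return Path(image_path).with_suffix(".pdf")


def make_searchable_pdf(image_path):
    """OCR one scanned image and save it as a PDF that includes the recognized text."""
    # Assumption: Tesseract and the pytesseract package are installed.
    # Imported here so the path helper above works without them.
    import pytesseract

    out_path = searchable_pdf_path(image_path)
    # image_to_pdf_or_hocr returns the finished PDF as bytes.
    pdf_bytes = pytesseract.image_to_pdf_or_hocr(str(image_path), extension="pdf")
    out_path.write_bytes(pdf_bytes)
    return out_path
```

Calling `make_searchable_pdf("scans/invoice_001.tif")` (a hypothetical file) would write `scans/invoice_001.pdf` next to the original scan. A real paperless setup would run something like this over a whole folder of scanner output rather than one file at a time.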

Several OCR packages are available, including OmniPage, Readiris, Presto, and Microsoft Office Document Imaging. Look for a solution that you can use in conjunction with your high-volume copier/scanners.

I’ll be writing more on this topic soon. Really, the point of this entry is to alert you to the fact that not all PDF files are the same: some are searchable and some aren’t. If you want the searchable kind (recommended for a paperless office), you’ll need OCR software to get it.
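If you’re wondering which kind of PDF you already have, here’s a quick-and-dirty check in Python. It’s only a heuristic and rests on an assumption: image-only scans usually declare no fonts, while PDFs that carry text normally do, so the sketch just looks for a `/Font` entry in the raw file. Newer PDFs can store that information in compressed form, so a searchable file may be reported as not searchable. The function names are hypothetical.

```python
def looks_searchable(pdf_bytes):
    """Guess whether a PDF carries text by looking for a font entry."""
    # Heuristic only: pages that draw text need a font resource,
    # while image-only scans typically reference just an image.
    return b"/Font" in pdf_bytes


def file_looks_searchable(path):
    """Apply the same guess to a PDF file on disk."""
    with open(path, "rb") as f:
        return looks_searchable(f.read())
```

The surest test is still the simplest one: open the PDF and try to search for a word you can see on the page.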

One Response to “Create Searchable PDF Images Using OCR Software”

  1. Mervin Grims Says:

    I love your writing style really enjoying this site.
