How to OCR PDF Documents

When you scan paper documents to PDF, you effectively create photographs of those documents and their content. In practice, this means that you can neither search nor select any of the text in the resulting PDF, should you wish to copy or annotate it. The reason is that your computer does not recognize the text in a scanned, image-only PDF file.
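One rough way to tell whether a PDF is image-only is to check whether the file mentions any font resources. The sketch below is a heuristic for illustration only, not something Able2Extract does or exposes, and the function name `looks_image_only` is hypothetical. It assumes the PDF's object dictionaries are stored uncompressed; files that pack their objects into compressed object streams can defeat it, and a scan that has already been OCRed will contain fonts for its hidden text layer.

```python
import re


def looks_image_only(pdf_bytes: bytes) -> bool:
    """Heuristic: True if the raw bytes mention image XObjects but no fonts.

    Assumption: object dictionaries are not hidden inside compressed
    object streams, so the names /Font and /Image are visible as plain bytes.
    """
    has_font = re.search(rb"/Font\b", pdf_bytes) is not None
    has_image = re.search(rb"/Subtype\s*/Image\b", pdf_bytes) is not None
    return has_image and not has_font


# Tiny hand-written fragments standing in for real files (illustrative only).
scanned_fragment = (
    b"%PDF-1.4\n1 0 obj\n"
    b"<< /Type /XObject /Subtype /Image /Width 10 /Height 10 >>\nendobj\n"
)
native_fragment = (
    b"%PDF-1.4\n1 0 obj\n"
    b"<< /Type /Font /Subtype /Type1 /BaseFont /Helvetica >>\nendobj\n"
)

print(looks_image_only(scanned_fragment))
print(looks_image_only(native_fragment))
```

To try it on a real document, read the file in binary mode and pass its contents to the function. Treat the answer as a hint rather than a verdict.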

To make your PDF searchable, or to convert it to another file format such as Word or Excel for editing and processing, you first need to run OCR on it. OCR stands for Optical Character Recognition. This technology allows your computer to recognize and read the text locked inside the image.
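To give a feel for what "recognizing text in an image" means, here is a toy sketch of template matching, one of the simplest OCR techniques. It is not how Able2Extract or any production OCR engine works; the 3x5 glyph templates, the fixed-width layout, and the function names are all assumptions made for illustration.

```python
# Toy 3x5 bitmap templates for a few characters ('#' = ink, '.' = blank).
TEMPLATES = {
    "P": ["##.", "#.#", "##.", "#..", "#.."],
    "D": ["##.", "#.#", "#.#", "#.#", "##."],
    "F": ["###", "#..", "##.", "#..", "#.."],
}


def pixel_distance(a, b):
    """Count the pixels that differ between two equally sized glyphs."""
    return sum(pa != pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))


def recognize_glyph(glyph):
    """Return the template character with the fewest mismatching pixels."""
    return min(TEMPLATES, key=lambda ch: pixel_distance(TEMPLATES[ch], glyph))


def recognize_line(rows, glyph_width=3, gap=1):
    """Cut a line of pixels into fixed-width glyphs and recognize each one."""
    step = glyph_width + gap
    chars = []
    for x in range(0, len(rows[0]), step):
        glyph = [row[x:x + glyph_width] for row in rows]
        chars.append(recognize_glyph(glyph))
    return "".join(chars)


# A tiny "scanned" line: three glyphs separated by one blank column.
SCAN = [
    "##..##..###",
    "#.#.#.#.#..",
    "##..#.#.##.",
    "#...#.#.#..",
    "#...##..#..",
]

print(recognize_line(SCAN))
```

Because the match is by nearest template rather than exact equality, a glyph with a stray pixel or two can still be recognized. Real OCR engines also have to deal with varying fonts, sizes, skew, and noise, which is why a dedicated tool is used in the steps that follow.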

Let us show you how you can quickly OCR your scanned PDFs.

The first step is to get a reliable OCR PDF converter. For this how-to guide, we are using Able2Extract Professional, a desktop OCR PDF application. Install it as you would most other desktop tools, start it, and open your image PDF file in it.

[Screenshot: OCR PDF converter]

The next step is to specify whether you want to run OCR on the entire document (the Select All option) or on just one area (Select Area). We chose to OCR the entire PDF.


Having made your selection, go to the File menu and choose Convert to Searchable PDF.

[Screenshot: Convert to Searchable PDF]

Able2Extract will perform OCR on your PDF, and you will then be able to search and select its text, as you can see in the screen capture below:

[Screenshot: Able2Extract OCR on your PDF]

To unlock the text from the image PDF and export it to an editable format such as MS Word, select the content as explained above, then click the Word icon on the main toolbar of Able2Extract, under Convert to File Type. The software performs OCR by default and extracts the text from the image file.

[Screenshot: OCR PDF Documents]

The result of the OCR conversion is a formatted Word document with fully editable text. In exactly the same way, you can convert scanned PDF tables into Excel, image PDF presentations into PowerPoint, and more.

Key Features of Able2Extract Professional OCR PDF

Able2Extract is an all-in-one solution for working with PDFs. It is cross-platform software, available for Mac, Windows, and Linux distributions. Apart from its ability to OCR PDFs, it allows users to:

 

  • Create regular and password-protected PDFs from all printable file formats.
  • Custom convert PDF to Excel: tweak tables to your liking before conversion, with the option to preview the conversion output.
  • Edit PDFs instantly: add or delete content, redact private and sensitive information, split and merge PDFs, etc.
  • Batch convert PDFs.
  • Annotate PDFs with a dozen different types of annotations.
  • Fill in and edit PDF forms for easy data gathering and distribution.
  • Add custom Bates numbering for easy identification and retrieval of PDF information.
  • Convert native and image PDFs to over 10 file formats, including MS Office, AutoCAD, HTML, etc.

 

Although it is not free software, it offers a seven-day trial, which you can download from Investintech.com.

 

Author: Investintech