How to create a searchable PDF from a scan
Everybody knows how to use the printer at work or at school to scan a few pages from a book. But wouldn’t it be nice to be able to search those pages, by turning the scanned book into a searchable PDF? That’s what I will show you in this hack-blog post.
What you need to create a searchable PDF: a scanner, Adobe Acrobat, and Sejda (www.sejda.com).
Scan the book
First you need to scan the pages of the book that you would like to make searchable. Create a folder on your computer and put all the scanned files there. I created my scans in .pdf format, but the format does not matter much: you can use .jpg, .pdf, .tiff and more.
Open Adobe Acrobat and click the “Combine files” tool. (NB: For this step you can also use Sejda’s PDF tools, but I get much better results with Adobe Acrobat. The user experience is just far better.)
After this, drag all the non-searchable scan files from your folder into the window.
Here you can choose which images / PDFs should be used in your searchable PDF. Drag the scanned pages around in the graphical interface to reorder them, and delete duplicate pages.
When you are done, click “Combine” in the top right corner. Adobe Acrobat will ask you to choose a place to save the file. Remember the path.
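If you would rather script the combine step than drag files around, it can be sketched in Python. This is only a minimal sketch and not what Adobe Acrobat does internally. It assumes that the third-party pypdf package is installed and that all your scans are already PDFs (images would have to be converted first). The function names are made up for illustration. The natural-sort helper is there so that scan_10.pdf ends up after scan_2.pdf instead of before it.

```python
import re
from pathlib import Path


def natural_key(name):
    """Sort key that compares embedded numbers numerically, not as text."""
    # re.split with a capturing group alternates text and digit runs,
    # so every odd index holds a run of digits.
    parts = re.split(r"(\d+)", name.lower())
    return [int(p) if i % 2 else p for i, p in enumerate(parts)]


def order_scans(names):
    """Return file names in the order a human would expect."""
    return sorted(names, key=natural_key)


def combine_pdfs(folder, output_path):
    """Append every PDF in `folder` to one output file, in natural order."""
    from pypdf import PdfWriter  # third-party, imported lazily (assumed installed)

    paths = {p.name: p for p in Path(folder).glob("*.pdf")}
    writer = PdfWriter()
    for name in order_scans(paths):
        writer.append(str(paths[name]))
    writer.write(str(output_path))


print(order_scans(["scan_10.pdf", "scan_2.pdf", "scan_1.pdf"]))
```

Save the combined file outside the scan folder, otherwise a second run would pick up the combined file as if it were a scan.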
I hate reading PDF files that are scanned from a book, because reading them on an iPad, iPhone or similar is just a pain in the ass. You always have to adjust the position of the PDF. So I recommend that you always split each scanned double page into two single pages in such cases.
Unfortunately, Adobe Acrobat does not really support splitting scanned pages down the middle. Luckily, Sejda does. Therefore, go to www.sejda.com.
Sejda offers a palette of very cool PDF tools, and you can do all sorts of great stuff with PDFs here. Take a look around. I promise you’ll be inspired.
Now click the “Split PDF in Half” option. It’ll ask you to upload a PDF. Choose the one we just combined from your scans.
When the upload is done, all pages are shown on top of each other, and you can drag the middle line so that it sits in the gap between the two book pages, where there is no text.
When that is done, just click done, and Sejda will work for a while on splitting all the pages. In the end, you can download a new PDF, ready to be made searchable in the next step.
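To make clear what “splitting in half” means geometrically, here is a small Python sketch of the arithmetic. It is not Sejda’s actual implementation. The function name and the `split` parameter are made up for illustration, and coordinates are assumed to be in PDF points with the origin at the bottom left. The dividing line you drag in the browser corresponds to `split`, given as a fraction of the page width.

```python
def half_boxes(left, bottom, right, top, split=0.5):
    """Return (left_box, right_box), each as an (x0, y0, x1, y1) tuple.

    `split` is the position of the dividing line as a fraction of the
    page width: 0.5 is the exact middle, 0.48 is slightly left of it.
    """
    if not 0 < split < 1:
        raise ValueError("split must be strictly between 0 and 1")
    mid = left + (right - left) * split
    return (left, bottom, mid, top), (mid, bottom, right, top)


# A landscape A4 scan is roughly 842 x 595 points.
left_page, right_page = half_boxes(0, 0, 842, 595)
print(left_page)
print(right_page)
```

A PDF library could then apply each of the two boxes as a crop rectangle on a copy of the scanned page, which gives two single pages per scan.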
Create searchable PDF
Now comes the magic part: how to actually create the searchable PDF. For now, the PDF is just images packed in a PDF shell.
To make it searchable, open the PDF you have just created with Sejda in Adobe Acrobat, and click the “Refine scanning” tool.
After that, a menu will open at the top of the window, where you select “Recognize text”. Click that, choose the language of the text in your PDF, and click “Recognize text”.
This process will take a while, but afterwards you are done and you have a searchable PDF. Best of all, you now know how to create a searchable PDF from a scan.
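If you do not have Adobe Acrobat, the text recognition step can also be scripted. The sketch below is an alternative to the Acrobat workflow above, not a description of it. It assumes the open-source OCRmyPDF command line tool is installed and on your PATH. The file names and function names are hypothetical. The language is given as a Tesseract language code, for example "eng" for English or "dan" for Danish.

```python
import subprocess


def build_ocr_command(input_pdf, output_pdf, language="eng"):
    """Build the ocrmypdf command that adds a text layer to a scanned PDF."""
    return ["ocrmypdf", "--language", language, str(input_pdf), str(output_pdf)]


def make_searchable(input_pdf, output_pdf, language="eng"):
    """Run OCR; raises CalledProcessError if ocrmypdf reports a failure."""
    subprocess.run(build_ocr_command(input_pdf, output_pdf, language), check=True)


# Only prints the command; call make_searchable(...) to actually run it.
print(" ".join(build_ocr_command("split.pdf", "searchable.pdf", "dan")))
```

Just like in Acrobat, choosing the right language matters, because the OCR engine uses it to decide which letters and words to expect.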
Thanks for reading!