What is a "Sandwich PDF"?

A simple scan of a document gives you an image of the text, or a PDF containing such an image. The PDF does not contain the text itself, so you cannot search it or copy and paste from it. To overcome this problem, an invisible text layer can be placed over the image so that the position of the invisible text matches the corresponding part of the image. You only see this layer when you search or select text. Because such a PDF has more than one layer, it is sometimes called a sandwich PDF.
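To make the idea of an invisible layer concrete, here is a minimal sketch of how such text could be written into a PDF page's content stream. It relies on the PDF text rendering mode 3 (`3 Tr`), which draws text with neither fill nor stroke. Whether this service builds its text layer this way is an assumption. The function names, the font resource name `F1`, and the `(text, x, y, font_size)` input format are hypothetical and chosen only for illustration.

```python
def escape_pdf_string(text: str) -> str:
    """Escape the characters that are special inside a PDF literal string."""
    return text.replace("\\", "\\\\").replace("(", "\\(").replace(")", "\\)")


def invisible_text_ops(words, font_name="F1"):
    """Build content-stream operators that place each word invisibly.

    `words` is an iterable of (text, x, y, font_size) tuples in PDF points,
    roughly what an OCR tool might report (hypothetical input format).
    Only plain ASCII text is handled in this sketch.
    """
    lines = ["BT", "3 Tr"]  # begin text object; rendering mode 3 = invisible
    for text, x, y, size in words:
        lines.append(f"/{font_name} {size} Tf")      # select font and size
        lines.append(f"1 0 0 1 {x} {y} Tm")          # move to the word's position
        lines.append(f"({escape_pdf_string(text)}) Tj")  # show the text
    lines.append("ET")  # end text object
    return "\n".join(lines)


ops = invisible_text_ops([("Hello", 72, 700, 12), ("(world)", 110, 700, 12)])
print(ops)
```

The resulting operators would be appended to the page that already draws the scanned image, so the viewer can search and select the text without displaying it.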

There are a lot of errors and not all words are found?

To create the text layer, the document is processed with an OCR (Optical Character Recognition) tool, and the OCR process is usually not error-free. With poor scan quality or a font that is not well suited to OCR, the results can become unusable.

Show me an example.

Take this Original and the sandwiched Result.

What file formats are supported for uploading?

We support the following input formats: pdf, tiff, jpg, png, bmp, jpeg.
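If you want to check a file before uploading it, a small sketch like the following compares the file extension against the list above. The function name `is_supported` is hypothetical, and the check looks only at the file name. How the service itself validates uploads is not documented here.

```python
from pathlib import Path

# Extensions taken from the list of supported input formats above.
SUPPORTED_EXTENSIONS = {"pdf", "tiff", "jpg", "png", "bmp", "jpeg"}


def is_supported(filename: str) -> bool:
    """Return True if the file name ends in one of the listed extensions."""
    extension = Path(filename).suffix.lower().lstrip(".")
    return extension in SUPPORTED_EXTENSIONS


for name in ["scan.PDF", "page-01.tiff", "notes.docx", "README"]:
    print(name, is_supported(name))
```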

Not all pages of my PDF are processed!

As this is a free service and creating a searchable PDF is computationally intensive, we have to limit the resources available per submission. We are currently working on a solution to lift this limit. Send us an email if you would like to be notified when the limitation is removed.