Tuesday, January 25, 2011

Getting Text Out of a PDF file: Copy vs Export

This just in from Adobe's Joel Geraci: "Basically, “Copy with Formatting” and “Export Selection” won’t give you the same results; they were not designed to. “Copy with Formatting” formats the text as a continuous stream; text in multiple columns will not be preserved as columns, for example. This was by design, and the intention is to help paste content into an existing file that may be formatted somewhat differently. “Export Selection” will attempt to preserve the content as it appears in the PDF file, including content position.

Here are a few tips to help you decide which method to use when reusing content from a PDF file.

Use “Copy with Formatting” when copying small amounts of text or simple content (text and a few images). This allows you to paste content inline with existing content, and to “match destination formatting” when pasting into Word, for example.

Use “Export Selection” for complex content containing inline images and vector art or when you explicitly want to preserve the relative positioning of all content.

Finally, “Copy with Formatting” may be slightly slower since it needs to put multiple formats onto the clipboard."


