HOWTO Convert PDF To RTF, Doc

The document provides instructions for extracting text and graphics from PDF files using Adobe Acrobat. It explains that text can be selected and copied using the text selection tool for a word, line, paragraph, or entire page/document. Graphics can be extracted by saving the entire PDF as an EPS file or selecting and copying individual graphics using the touchup tools. Other third party tools may provide additional options for extracting text and graphics in different formats.

Uploaded by

Zé Fernando
Copyright
© Attribution Non-Commercial (BY-NC)

Convert PDF to RTF
By R Simonss (TechnicalUser), Jun 27, 2001

Acrobat 5.0 has a feature to convert PDF to RTF, but it works only for "tagged" PDF documents. I got the following answer from Adobe when I asked about it. (My PDF files were not tagged and I needed to convert them to text, so I developed an application myself that did the job.)

Adobe: RTF (Rich Text Format) and Acrobat 5.0

For RTF files from Acrobat 5.0 to retain their formatting, there are a couple of things to consider about your PDF file and about the creation of RTF files in general. When you create an RTF file from Acrobat 5.0, your PDF must contain structure and/or tags. Without these elements embedded in the PDF, the RTF converter in Acrobat has nothing to go by, and formatting will be ignored.

Structured PDF files

Structured PDF files contain formatting and styles that were added to the PDF during the conversion from the original application. Structured PDFs denote elements such as headings, footers, indexing and table of contents, and paragraph styles. Applications that create structured PDFs include FrameMaker, InDesign, and MS Word (via PDFMaker, which adds structure based on Word's styles and structure).

Tagged PDF files

Tagged PDF files can be created in several ways: using Web Capture 5.0, using PDFMaker 5.0, using the Tag Adobe PDF agent in Capture 3.0x, or using the Make Accessible plug-in with Acrobat 5.0. Tagging a PDF adds XML metadata that denotes structural elements and styles, as in a structured PDF file. The tags, however, describe these elements further so that they can be understood by applications that read XML metadata (such as screen readers and certain third-party applications that can access the XML information of a PDF).
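Whether a given file carries structure or tags can be roughly checked outside Acrobat. The following is a minimal sketch and not part of Adobe's answer. It assumes that structure shows up in the file as a /StructTreeRoot key and that tagging shows up as a "/Marked true" entry (the flag in the catalog's MarkInfo dictionary), and it simply scans the raw bytes for both. The function name pdf_structure_hints is hypothetical. The scan cannot see keys stored inside compressed object streams, so a negative result is inconclusive.

```python
import re


def pdf_structure_hints(pdf_bytes: bytes) -> dict:
    """Heuristic byte scan for structure and tagging markers in a PDF.

    This does not parse the PDF; it only looks for two well-known keys.
    """
    # A structure tree root suggests a "structured" PDF.
    has_structure = b"/StructTreeRoot" in pdf_bytes
    # "/Marked true" inside MarkInfo is the flag that declares a tagged PDF.
    marked = re.search(rb"/Marked\s+true\b", pdf_bytes) is not None
    return {"has_structure": has_structure, "marked_as_tagged": marked}


if __name__ == "__main__":
    import sys

    # Usage: python check_tags.py some_file.pdf  (script and file names are placeholders)
    with open(sys.argv[1], "rb") as handle:
        print(pdf_structure_hints(handle.read()))
```

If both values come back False for a file produced by Acrobat 5.0 or earlier, the file is most likely an unstructured (As-Is) PDF.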

Therefore, for the best format output in RTF, it's recommended that your PDF be both structured AND tagged.

Note: Tagging a PDF file increases the amount of data stored in the PDF; your PDF will be considerably larger than an untagged PDF file.

Recommended Workflow for Creating RTF Files

For unstructured (As-Is) PDF files

1. In Acrobat 5.0, select the Make Accessible plug-in under the Documents menu. (Note: if the Make Accessible option is grayed out, your file may be secured and may not allow alteration or modification.) The Make Accessible plug-in will then analyze your file and place the appropriate tags based on its analysis.
2. Save your PDF file. This allows Acrobat to restructure the data within your PDF.
3. Choose File > Save As and under File Type, choose RTF (Rich Text Format).

Limitations of this method:

1. Tagging an unstructured file may not tag the file elements 100% correctly, particularly with complex formats. Remember that Acrobat is trying to tag the file as logically as possible.
2. Although the file will maintain some of its basic structure (such as headings) when opened in another application, that application may not recognize the headings for what they are. For instance, if you tag an unstructured PDF, save it as an RTF file, and open it in MS Word, you'll notice that the "headings" in your file do not correspond to the correct Heading styles in the Style tool; Word shows them as "Normal". (This is a bug.)
3. Some elements in the file (such as paragraph alignment and bullet spacing) may be ignored, simply because this type of structure isn't recognized in the tags.

For creating PDF files from MS Word

1. In MS Word, use the PDFMaker macro to create your PDF files. Within PDFMaker's conversion settings, you can set up your headings to be recognized within the PDF file (often for the purpose of bookmarking references in the PDF). When you convert your Word document using PDFMaker, you are creating a structured PDF file.

2. In PDFMaker 5.0, you also have the option to add tags to your PDF file. It is recommended to tag your PDF files at this point, rather than tag the PDF at a later date.
3. After you create your PDF using PDFMaker, open the file in Acrobat and Save As RTF. You'll notice that headings and styles are correctly listed when the RTF is brought into an application such as MS Word; however, the RTF may not correctly display the style used, particularly if it's a custom style.

Final Note About RTF and applications that read RTF

Keep in mind that some applications have far more complex (and therefore more accurate) RTF translators. For instance, you'll notice that an RTF file opened in WordPad does not retain complex formatting the way it would in MS Word or WordPerfect, because of WordPad's own limitations as an application.

****************************************************

Status: Active
Number: 323695
Fax: 323695
Category: How To Export
Last Revision: 1999-08-04

How to Extract Text and Graphics from a PDF File

What's Covered

Extracting Text
Extracting Graphics
Other Options for Extracting Text or Graphics

The Portable Document Format (PDF) is designed for end-use files -- those that will be viewed and printed, but not substantially modified. You may nevertheless want to extract text and graphics from PDF files. This document describes how to extract text and graphics in Adobe Acrobat 4.0 and later.

Extracting Text

You can extract text of varying lengths using Acrobat 4.0 or later (for other text extraction tools, check out PDF Store at www.pdfstore.com):

- To extract one or more words on the same line, or extract an entire line, use the text select tool to select the text, and then copy and paste the selection. (Do not use the touchup text tool, which is designed for editing text within a PDF file. If you do use this tool, the pasted text will be in plain text [ASCII] format.)
- To extract a paragraph or a single column, either use the column select tool or the text select tool while holding down the Control key to select the text, and then copy and paste.
- To extract an entire page, choose View > Fit In Window, select any tool, choose Edit > Select All, and then copy and paste.

- To extract an entire document in Windows, choose Edit > Copy File To Clipboard. Or, in Mac OS or Windows, choose View > Continuous or View > Continuous - Facing, choose Edit > Select All, and then copy and paste.

Note: When you paste the text, it will probably not be formatted the same as it was in the PDF file, and you'll see different results in different applications. The text's color, point size, and style are usually retained, but its font is usually not retained. This is because of the way various applications collect information from the system's clipboard, which is where Acrobat copies text. Also, the application into which you're pasting must support Rich Text Format (RTF) text from the clipboard; if it doesn't, you'll receive an error message or nothing will happen. Adobe Systems has no control over the way different applications collect information from the system's clipboard, and cannot assist you with achieving more consistent results when copying and pasting.

Extracting Graphics

You extract graphics by using different features in Acrobat 4.0 or later, depending on whether you want to extract the entire PDF file or an individual graphic.

Extracting an Entire PDF File

After you extract an entire PDF file, you can open the resulting EPS file in an image-editing application, such as Adobe Photoshop, or a drawing application, such as Adobe Illustrator.

To extract an entire PDF file in Acrobat 5.0:

1. Choose File > Save As.
2. Choose Encapsulated PostScript (*.eps) from the Save As Type pop-up menu.
3. Select Settings.
4. In the Encapsulated PostScript dialog box, do either of the following:
-- Choose PostScript Language Level 2 from the File Format Options pop-up menu.
-- Select Include RGB or Lab images.
5. Click OK.

To extract an entire PDF file in Acrobat 4.x:

1. Choose File > Export > PostScript or EPS.
2. In the Export PostScript or EPS Options dialog box, choose EPS with Preview in the File Format pop-up menu.
3. Choose Language Level 2 from the PostScript Option pop-up menu or select Include RGB or Lab images.
4. Click OK.

Extracting an Individual Graphic

To extract an individual graphic, do one of the following:

- Use the graphics select tool or the touchup object tool to select the graphic, and then copy and paste.
- Use the touchup object tool to right-click (Windows) or Control-right-click (Mac OS) the graphic, and then copy and paste it. To select the touchup object tool, click and hold the touchup text tool (i.e., the hollow T tool at the middle left of the Acrobat window), and then select the touchup object tool (i.e., the solid black pointer). See the Acrobat user guide for more detailed instructions.
- Use the touchup object tool to open the graphic in an image-editing application (e.g., Photoshop) or a drawing application (e.g., Illustrator), and then copy and paste or save the graphic. To open a graphic in an image-editing or drawing application using the touchup object tool, Ctrl-double-click (Windows) or Option-double-click (Mac OS) the graphic.

Note: You may need to select the area of the graphic that you want to copy before you can copy it.

Other Options for Extracting Text or Graphics

Third-party, or non-Adobe, products may enable you to extract text or graphics in the format you want. To locate such third-party products, you should check Web sites that focus on PDF, such as www.planetpdf.com and www.pdfzone.com. These Web sites provide lists of vendors who distribute software, such as plug-ins, for Acrobat products.
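After exporting a whole PDF to EPS as described under "Extracting an Entire PDF File", it can be useful to confirm that the result really is an EPS file before opening it in Photoshop or Illustrator. The following is a minimal sketch, not something taken from the Adobe note. It assumes the export writes the usual "%!PS-Adobe-... EPSF-..." first line and a %%BoundingBox comment; a %%LanguageLevel comment is reported only if present, and whether Acrobat writes one is not confirmed here. The function name eps_info is hypothetical. Windows EPS files with a preview image may begin with a 30-byte binary header, which the sketch skips.

```python
import re
import struct


def eps_info(data: bytes) -> dict:
    """Read a few header comments from the bytes of an EPS file."""
    # EPS with a binary preview starts with the magic C5 D0 D3 C6, followed by
    # the little-endian offset and length of the PostScript section.
    if data[:4] == b"\xc5\xd0\xd3\xc6":
        offset, length = struct.unpack("<II", data[4:12])
        data = data[offset:offset + length]

    # Normalize CR and CRLF line endings so line-anchored searches work.
    data = data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")
    first_line = data.split(b"\n", 1)[0].strip()

    info = {
        "is_eps": first_line.startswith(b"%!PS-Adobe-") and b"EPSF-" in first_line,
        "bounding_box": None,
        "language_level": None,
    }

    box = re.search(
        rb"^%%BoundingBox:\s*(-?\d+)\s+(-?\d+)\s+(-?\d+)\s+(-?\d+)", data, re.M
    )
    if box:
        info["bounding_box"] = tuple(int(value) for value in box.groups())

    level = re.search(rb"^%%LanguageLevel:\s*(\d+)", data, re.M)
    if level:
        info["language_level"] = int(level.group(1))

    return info
```

A call such as eps_info(open("exported.eps", "rb").read()) returns a small dictionary, where "exported.eps" is a placeholder file name. A bounding box of None usually means the comment is missing or deferred to the end of the file.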
