Antiword has been ported to FreeBSD, BeOS, OS/2, Mac OS X, Amiga, VMS, NetWare, Plan9, EPOC, Zaurus PDA, MorphOS, Tru64/OSF, Minix, Solaris and DOS. For this article, I’ll focus on using it on Linux.
Main Features
Antiword lets you view and convert MS Word documents from the command line. You can convert to the following formats:
Plain text, formatted text, PDF, PostScript, and XML (only DocBook is currently supported)
Limitations
Before you get too excited, I have to mention that Antiword was last updated in 2005 and is not compatible with newer DOCX documents. You also cannot use it to edit your documents.
Getting Antiword
If your Linux distribution has a package manager, you can most likely find Antiword in one of your repositories. Otherwise, grab the .tar.gz archive from the Antiword page on Freecode. Extract the archive and enter the antiword-0.37 directory. Then run:
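A rough sketch of the full sequence, from archive to installed program, might look like the following. The archive name is an assumption inferred from the antiword-0.37 directory name, and the "make all" and "make install" targets are assumptions too, so check the ReadMe that ships with the source before running them.

```shell
# Assumed archive name, inferred from the antiword-0.37 directory.
archive="antiword-0.37.tar.gz"

# Extract the source, build it, and install it.
# The make targets are assumptions; check the bundled ReadMe first.
build_antiword() {
    tar -xzf "$1" &&
    cd antiword-0.37 &&
    make all &&
    make install
}

# Run in a subshell so the cd does not affect the calling shell.
# Nothing happens unless the archive is in the current directory.
if [ -f "$archive" ]; then
    ( build_antiword "$archive" )
fi
```

Depending on where the install target places files, the last step may need root privileges.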
Usage
For the following usage tips, I’m going to use my résumé as an example document. Here’s what it looks like in LibreOffice:
The most basic way to use antiword is to simply display the document:
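A minimal sketch, assuming the résumé is saved as resume.doc (a hypothetical file name) in the current directory. The view_doc helper is only there for illustration; it wraps the plain "antiword FILE" invocation.

```shell
# Hypothetical file name; substitute the path to your own .doc file.
doc="resume.doc"

# Print a Word document as plain text on standard output.
view_doc() {
    antiword "$1"
}

# Only run when antiword is installed and the file exists.
if command -v antiword >/dev/null 2>&1 && [ -f "$doc" ]; then
    view_doc "$doc"
fi
```

For a long document, piping the output through a pager such as less keeps it from scrolling past.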
As you can see, the default command doesn’t preserve certain aspects of formatting like font size, italics, and underlining, but it does a nice job of presenting the text in a readable form. To display formatting information, use the “-f” flag in your command:
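A minimal sketch of that invocation, again assuming a hypothetical resume.doc; view_formatted is an illustrative helper name, not part of Antiword.

```shell
# Hypothetical file name; substitute the path to your own .doc file.
doc="resume.doc"

# Print the document as text with formatting markers (the -f flag).
view_formatted() {
    antiword -f "$1"
}

# Only run when antiword is installed and the file exists.
if command -v antiword >/dev/null 2>&1 && [ -f "$doc" ]; then
    view_formatted "$doc"
fi
```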
No, this doesn’t actually show you the formatting in a WYSIWYG style; rather, it tells you about it with a Markdown-like syntax. For example, it shows underlined text with underscores and bold text with asterisks.
To convert your Word document to a PDF file, you must specify a paper size using the “-a” flag. Antiword supports the following paper sizes:
10x14, a3, a4, a5, b4, b5, executive, folio, legal, letter, note, quarto, statement, tabloid
You can use the same paper sizes when converting a document to PostScript, but in that case you must use the “-p” flag instead of “-a”.
This example converts the document to a tabloid-sized PDF file:
This is the resulting PDF file displayed in Okular:
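For reference, the PDF conversion and its PostScript counterpart might be invoked as sketched here. The file names resume.doc, resume.pdf and resume.ps are hypothetical, and the two helper functions are illustrative wrappers; Antiword writes its result to standard output, so the sketch redirects it into a file.

```shell
# Hypothetical file name; substitute the path to your own .doc file.
doc="resume.doc"

# Convert to PDF. Usage: doc_to_pdf PAPER INPUT OUTPUT
doc_to_pdf() {
    antiword -a "$1" "$2" > "$3"
}

# Convert to PostScript. Usage: doc_to_ps PAPER INPUT OUTPUT
doc_to_ps() {
    antiword -p "$1" "$2" > "$3"
}

# Only run when antiword is installed and the file exists.
if command -v antiword >/dev/null 2>&1 && [ -f "$doc" ]; then
    doc_to_pdf tabloid "$doc" resume.pdf
    doc_to_ps tabloid "$doc" resume.ps
fi
```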
Not bad! The dotted underlining and e-mail address hyperlink disappeared, but overall, the conversion was successful. If you’re converting to PostScript, you can also use the “-L” flag to print in landscape mode.
This example will convert the document to DocBook format:
The conversion will also preserve metadata, including the author name and creation date of the document. Here’s what the raw XML looks like:
And here’s what the DocBook file looks like in LibreOffice:
You can see that it looks different from the original Word document, but the structure has mostly been preserved. Converting to DocBook with Antiword would probably work better with Word documents that were created with conversion to XML in mind. To see what else you can do with Antiword – including restoring text that has been changed in MS Word – check out the man page (it’s also online).
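To round off the conversions covered above, the landscape PostScript and DocBook commands might look like the sketch below. The file names are hypothetical and the helper functions are illustrative; the flags used are “-L” with “-p” for landscape PostScript and “-x db” for DocBook XML.

```shell
# Hypothetical file name; substitute the path to your own .doc file.
doc="resume.doc"

# Convert to landscape PostScript. Usage: doc_to_ps_landscape PAPER INPUT OUTPUT
doc_to_ps_landscape() {
    antiword -L -p "$1" "$2" > "$3"
}

# Convert to DocBook XML. Usage: doc_to_docbook INPUT OUTPUT
doc_to_docbook() {
    antiword -x db "$1" > "$2"
}

# Only run when antiword is installed and the file exists.
if command -v antiword >/dev/null 2>&1 && [ -f "$doc" ]; then
    doc_to_ps_landscape a4 "$doc" resume-landscape.ps
    doc_to_docbook "$doc" resume.xml
fi
```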