Thursday, May 29, 2008

pdftotext

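recently i needed to rip the text out of a pdf file so i could read it on my pda. (it doesn't have a pdf reader right now, and even if i got one it would take forever to render all the images.) here's a command that works well and keeps the layout reasonably accurate:

    pdftotext -layout -enc ASCII7 -nopgbrk file.pdf file.txt

-layout keeps the original physical layout of the text as best it can, -enc ASCII7 writes the output as plain 7-bit ascii, and -nopgbrk leaves out the page break (form feed) characters between pages.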
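for a whole folder of pdfs, the same command could be wrapped in a small loop. this is only a sketch: the function names txt_name and convert_all are made up for illustration, and it assumes pdftotext is installed and on the PATH.

```shell
#!/bin/sh
# hypothetical helper: turn "name.pdf" into "name.txt"
txt_name() {
    printf '%s\n' "${1%.pdf}.txt"
}

# hypothetical helper: convert every pdf in a directory
# (defaults to the current directory) with the flags used above
convert_all() {
    dir="${1:-.}"
    for pdf in "$dir"/*.pdf; do
        # skip the unexpanded pattern when the directory has no pdfs
        [ -e "$pdf" ] || continue
        pdftotext -layout -enc ASCII7 -nopgbrk "$pdf" "$(txt_name "$pdf")"
    done
}
```

calling convert_all with a directory (or with no argument for the current one) writes each .txt file next to its .pdf.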
