Hacker News

The claim that text is selectable in PDFs is often dubious.


The problem is that text selection depends on the PDF having been generated in a sensible fashion. There are many ways to generate PDFs, and in some of them the text itself, or its order, is mangled before it even reaches the PDF generation step.

But in general, if you generate the PDF with an authoring tool like LaTeX or InDesign, or if you print to PDF from a webpage or document, it's going to be selectable in a sensible way.
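To make the ordering problem concrete, here is a rough sketch in Python. It is a toy model, not how any particular viewer works: the `TextRun` type, the coordinates, and both extraction functions are made up for illustration. It assumes a page is just a list of positioned text runs, and compares joining them in the order they were written with joining them in visual reading order.

```python
from typing import NamedTuple


class TextRun(NamedTuple):
    """Hypothetical stand-in for one positioned piece of text on a page."""
    x: float   # left edge of the run, in points
    y: float   # baseline height, in points (origin assumed at bottom-left)
    text: str


def extract_in_stream_order(runs):
    """Join runs in the order the generator happened to write them."""
    return " ".join(run.text for run in runs)


def extract_in_reading_order(runs, line_tolerance=2.0):
    """Group runs into lines by baseline, then read top to bottom, left to right."""
    lines = []  # each entry is (baseline_y, [runs on that line])
    for run in sorted(runs, key=lambda r: -r.y):  # highest baseline first
        if lines and abs(lines[-1][0] - run.y) <= line_tolerance:
            lines[-1][1].append(run)
        else:
            lines.append((run.y, [run]))
    return " ".join(
        run.text
        for _, line in lines
        for run in sorted(line, key=lambda r: r.x)
    )


# Illustrative page whose generator emitted the second line too early.
runs = [
    TextRun(72, 700, "Text"),
    TextRun(72, 686, "in some PDFs"),
    TextRun(100, 700, "selection"),
    TextRun(160, 700, "is unreliable"),
]

print(extract_in_stream_order(runs))
print(extract_in_reading_order(runs))
```

A tool that emits runs in reading order makes both functions agree. One that does not leaves the reader guessing, and a simple heuristic like this one breaks down on multi-column layouts.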


Not sure about other apps, but the paid reader I use (1) includes an OCR function that adds a text layer to the document. Seems to work pretty well.

(1) https://www.tracker-software.com/product/pdf-xchange-editor


Well, I meant that the format supports it. I merely mentioned it so nobody would reply with something like PNG.



