Description of the bug
Extracting text from the Amazon Sustainability Report with PyMuPDF or PyMuPDF4LLM produces largely unreadable output: the beginnings of words are garbled and the spacing is wrong. Something about this PDF is not being parsed properly.
Other extractors I have tried extract this report correctly.
How to reproduce the bug
import fitz  # PyMuPDF

d = fitz.open("file.pdf")
for p in d.pages():
    # Plain-text extraction; the garbled output appears here
    print(p.get_text())
07ef2453.pdf
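A slightly more detailed reproduction might help narrow this down. The sketch below is only a suggestion, not a confirmed diagnosis: it counts U+FFFD replacement characters per page (which PyMuPDF typically emits for glyphs it cannot map to Unicode) and lists the fonts each page uses. The names `count_replacement_chars` and `report` are hypothetical helpers introduced here, and whether this PDF's problem is font-related at all is an assumption.

```python
REPLACEMENT_CHAR = "\ufffd"  # Unicode replacement character


def count_replacement_chars(text: str) -> int:
    """Return how many U+FFFD characters appear in the extracted text."""
    return text.count(REPLACEMENT_CHAR)


def report(path: str) -> None:
    """Print a per-page replacement-character count and the fonts in use."""
    # Imported here so the helper above stays usable without PyMuPDF installed.
    import fitz  # PyMuPDF

    with fitz.open(path) as doc:
        for page in doc:
            text = page.get_text()
            bad = count_replacement_chars(text)
            print(f"page {page.number + 1}: {bad} replacement characters")
            # Each entry is (xref, ext, type, basefont, name, encoding).
            for font in page.get_fonts():
                print(f"    font={font[3]} type={font[2]} encoding={font[5]}")
```

Calling `report("file.pdf")` on the attached document should show whether the affected pages rely on fonts with unusual or missing encodings. A count of zero would suggest the garbling comes from somewhere else, such as character ordering or spacing heuristics.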
PyMuPDF version
1.24.2
Operating system
macOS
Python version
3.11