-
Notifications
You must be signed in to change notification settings - Fork 661
Closed
Labels
Fixed in next releaseenhancementfix developedrelease schedule to be determinedrelease schedule to be determined
Description
Description of the bug
When benchmarking one of my scripts, I noticed that the method Rect.intersects() (which is called thousands of times) is a significant bottleneck.
I was able to replace the PyMuPDF method with a simple custom implementation which behaves the same but is more than 10x faster.
Is there a good reason for the current, slower PyMuPDF implementation, or could this method be made much faster?
Example output of the reproduction script below:
$ time python test.py
PyMuPDF time: 7.175560788000526
Fast method time: 0.551379568999675
How to reproduce the bug
import pymupdf
import random
import time
def random_rect() -> pymupdf.Rect:
x0 = random.randint(0, 100)
y0 = random.randint(0, 100)
width = random.randint(0, 100)
height = random.randint(0, 100)
return pymupdf.Rect(x0, y0, x0 + width, y0 + height)
def fast_intersects(rect1: pymupdf.Rect, rect2: pymupdf.Rect) -> bool:
return not rect1.is_empty and not rect2.is_empty and (
(rect1.x0 < rect2.x1) and (rect2.x0 < rect1.x1) and (rect1.y0 < rect2.y1) and (rect2.y0 < rect1.y1)
)
pymupdf_time = 0
fast_time = 0
random_rects = [random_rect() for _ in range(1000)]
for rect1 in random_rects:
for rect2 in random_rects:
a = time.process_time()
pymupdf_result = rect1.intersects(rect2)
b = time.process_time()
fast_result = fast_intersects(rect1, rect2)
c = time.process_time()
pymupdf_time += b - a
fast_time += c - b
assert(fast_result == pymupdf_result)
print(f"PyMuPDF time: {pymupdf_time}")
print(f"Fast method time: {fast_time}")
PyMuPDF version
1.26.0
Operating system
Linux
Python version
3.12
Metadata
Metadata
Assignees
Labels
Fixed in next releaseenhancementfix developedrelease schedule to be determinedrelease schedule to be determined