Skip to content

Rect.intersects() is much slower than necessary #4527

@stijnvermeeren-swisstopo

Description

Description of the bug

When benchmarking one of my scripts, I noticed that the method Rect.intersects() (which is called thousands of times) is a significant bottleneck.
I was able to replace the PyMuPDF method with a simple custom implementation which behaves the same but is more than 10x faster.

Is there a good reason for the current, slower PyMuPDF implementation, or could this method be made much faster?

Example output of the reproduction script below:

$ time python test.py 
PyMuPDF time: 7.175560788000526
Fast method time: 0.551379568999675

How to reproduce the bug

import pymupdf
import random
import time

def random_rect() -> pymupdf.Rect:
    x0 = random.randint(0, 100)
    y0 = random.randint(0, 100)
    width = random.randint(0, 100)
    height = random.randint(0, 100)
    return pymupdf.Rect(x0, y0, x0 + width, y0 + height)

def fast_intersects(rect1: pymupdf.Rect, rect2: pymupdf.Rect) -> bool:
    return not rect1.is_empty and not rect2.is_empty and (
        (rect1.x0 < rect2.x1) and (rect2.x0 < rect1.x1) and (rect1.y0 < rect2.y1) and (rect2.y0 < rect1.y1)
    )

pymupdf_time = 0
fast_time = 0

random_rects = [random_rect() for _ in range(1000)]
for rect1 in random_rects:
    for rect2 in random_rects:
        a = time.process_time()
        pymupdf_result = rect1.intersects(rect2)
        b = time.process_time()
        fast_result = fast_intersects(rect1, rect2)
        c = time.process_time()

        pymupdf_time += b - a
        fast_time += c - b
        assert(fast_result == pymupdf_result)

print(f"PyMuPDF time: {pymupdf_time}")
print(f"Fast method time: {fast_time}")

PyMuPDF version

1.26.0

Operating system

Linux

Python version

3.12

Metadata

Metadata

Assignees

No one assigned

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions