Closed
Labels
not a bug / user error / unable to reproduce
Description of the bug
I'm trying to redact a particular piece of text. The text is redacted, but nearby details are deleted as well.
```python
import fitz
import re


class Redactor:
    # static methods work independently of any class instance
    @staticmethod
    def get_sensitive_data(lines):
        """Yield the text that follows 'Lot:' on each matching line."""
        for line in lines:
            # match the regex against each line
            search = re.search(r"Lot:(.+)", line, re.IGNORECASE)
            if search:
                yield search.group(1)

    # constructor
    def __init__(self, path):
        self.path = path

    def redaction(self):
        """Main redactor code."""
        # open the PDF
        doc = fitz.open(self.path)
        # iterate through the pages
        for page in doc:
            # _wrapContents is needed for fixing alignment issues
            # with rect boxes in some cases
            # page._wrapContents()
            # collect the text captured by the 'Lot:' regex
            sensitive = self.get_sensitive_data(page.get_text("text").split('\n'))
            for data in sensitive:
                areas = page.search_for(data)
                # mark each hit for redaction with a black fill
                for area in areas:
                    page.add_redact_annot(area, fill=(0, 0, 0))
            # apply the redactions
            page.apply_redactions()
        # save to a new PDF
        doc.save('redacted.pdf')
        print("Successfully redacted")


path = '../sample files/Property_Information_Report.v7.pdf'
redactor = Redactor(path)
redactor.redaction()
```
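To make the extraction step easier to inspect, here is a minimal standalone sketch of the same 'Lot:' regex run on a few made-up lines (the sample text is hypothetical, not taken from the report PDF). It shows that the captured value keeps its leading whitespace and can be very short. Since `page.search_for` returns a rectangle for every occurrence it finds on the page, a short value may be found in more places than intended.

```python
import re

# Same pattern as in the script above: capture everything after "Lot:"
LOT_PATTERN = re.compile(r"Lot:(.+)", re.IGNORECASE)


def get_sensitive_data(lines):
    """Yield the text following 'Lot:' on each matching line."""
    for line in lines:
        match = LOT_PATTERN.search(line)
        if match:
            yield match.group(1)


# Hypothetical page text, invented purely for illustration
sample_lines = ["Owner: J. Smith", "Lot: 12", "lot:7A Block 3", "Area: 12 acres"]
extracted = list(get_sensitive_data(sample_lines))
print(extracted)  # [' 12', '7A Block 3']
```

Printing the extracted strings before calling `page.search_for` is a cheap way to check exactly what will be searched for on each page.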

How to reproduce the bug
Any help would be appreciated.
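One likely cause, offered as an assumption rather than a confirmed diagnosis: the rectangles returned by `page.search_for` span the full line height, so they can slightly overlap text on neighbouring lines, and `page.apply_redactions` removes text whose bounding boxes overlap a redaction rectangle. A possible workaround is to shrink each rectangle vertically before adding the annotation. The helper below is a hypothetical sketch in plain Python (the name `shrink_vertically` and the 2-point default margin are illustrative choices, not part of PyMuPDF).

```python
def shrink_vertically(rect, margin=2.0):
    """Return (x0, y0, x1, y1) with top and bottom moved inward.

    `rect` is any 4-item sequence (x0, y0, x1, y1). The margin is
    capped at a quarter of the height so the result never collapses.
    """
    x0, y0, x1, y1 = rect
    height = y1 - y0
    margin = min(margin, height / 4)
    return (x0, y0 + margin, x1, y1 - margin)


# A made-up 12-point-high hit rectangle
print(shrink_vertically((10, 100, 60, 112)))  # (10, 102.0, 60, 110.0)
```

In the script above, this could be applied as `page.add_redact_annot(fitz.Rect(shrink_vertically(area)), fill=(0, 0, 0))`. The right margin depends on the font size and line spacing of the PDF, so it would need tuning per document.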
PyMuPDF version
1.24.0
Operating system
Windows
Python version
3.12