feat: easyOCR added, tesseract - removed, marker - removed, license changed to MIT #91

pkarw · 2025-01-17T12:37:36Z

Added:

easyOCR support
language flag - required by easyOCR; it accepts a single language such as en or a comma-separated list such as en,de,it, and selects which OCR language weights are loaded
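
As a rough illustration of how a comma-separated language value could be turned into the list that easyOCR expects, here is a minimal sketch. It is not the code from this PR: `parse_languages`, `build_reader` and `use_gpu` are hypothetical names chosen for the example. The only easyOCR call used is `easyocr.Reader(lang_list, gpu=...)`.

```python
from typing import List


def parse_languages(flag: str) -> List[str]:
    """Split a comma-separated language flag such as "en,de,it" into codes.

    Hypothetical helper for illustration; not taken from the PR.
    """
    codes = [code.strip().lower() for code in flag.split(",")]
    # Drop empty entries (e.g. from a trailing comma) and duplicates, keeping order.
    unique: List[str] = []
    for code in codes:
        if code and code not in unique:
            unique.append(code)
    if not unique:
        raise ValueError("at least one language code is required")
    return unique


def build_reader(flag: str, use_gpu: bool = False):
    """Create an easyocr.Reader that loads weights for the requested languages."""
    # Imported lazily so parse_languages stays usable without easyocr installed.
    import easyocr

    return easyocr.Reader(parse_languages(flag), gpu=use_gpu)
```

With easyOCR installed, `build_reader("en,de,it")` would load the English, German and Italian weights once, and the returned reader could then be reused across pages.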

Removed:

tesseract - its quality was very low and it was not frequently used, so it was removed to save maintenance resources
marker - it implies the GPL3 license; looking forward to an example that adds marker as a 3rd party strategy!

Changed:

LICENSE - license changed to MIT

choinek · 2025-01-17T19:22:34Z

.vscode/settings.json

Good idea, I'll commit similar settings for PyCharm :)

client/cli.py

choinek · 2025-01-18T06:46:25Z

Works for me, thx

 ✔ Read text from images using ocr request method with data set "read example-invoice.pdf with extract strategy: easyocr and model: llama3.2-vision saved in default"
 ✔ Read text from images using ocr request method with data set "read example-invoice.pdf with extract strategy: easyocr and model: llama3.2-vision saved in s3"
 ✔ Read text from images using ocr request method with data set "read example-invoice.pdf with extract strategy: easyocr and model: llama3.1 saved in default"
 ✔ Read text from images using ocr request method with data set "read example-invoice.pdf with extract strategy: easyocr and model: llama3.1 saved in s3"
 ✔ Read text from images using ocr request method with data set "read example-mri.pdf with extract strategy: easyocr and model: llama3.2-vision saved in default"
 ✔ Read text from images using ocr request method with data set "read example-mri.pdf with extract strategy: easyocr and model: llama3.2-vision saved in s3"
 ✔ Read text from images using ocr request method with data set "read example-mri.pdf with extract strategy: easyocr and model: llama3.1 saved in default"
 ✔ Read text from images using ocr request method with data set "read example-mri.pdf with extract strategy: easyocr and model: llama3.1 saved in s3"

pkarw added 3 commits January 17, 2025 13:37

fix: exclude for __pycache__

5101f4a

feat: easyOCR implementation

fe37006

feat: LICENSE change, marker removed

9562593

pkarw changed the title ~~feat: easyOCR~~ feat: easyOCR added, tesseract - removed, marker - removed, license changed to MIT Jan 17, 2025

pkarw mentioned this pull request Jan 17, 2025

[feat] add marker strategy as a separate repo + example how to load it #92

Open

pkarw requested a review from choinek January 17, 2025 13:34

This was linked to issues Jan 17, 2025

[cleanup] Remove Tesseract support after adding easyOCR #87

Closed

[feat] Add EasyOCR strategy #56

Closed

[feat] Change license to MIT by removing dependency to marker #62

Closed

choinek approved these changes Jan 18, 2025


Update client/cli.py

ac6baf0

choinek merged commit 316884b into main Jan 18, 2025
