Optical Character Recognition with PyTesseract

What's this blog post about?

Week five of "Ten Weeks of Plugins", a series dedicated to building FiftyOne plugins, covers two plugins: Optical Character Recognition (OCR) and Keyword Search. The PyTesseract OCR plugin uses the Tesseract OCR engine to run optical character recognition on the samples in a dataset, and the Keyword Search plugin lets users search within the labels that the OCR plugin generates. Together, the two plugins make it possible to search documents such as pages of old books, handwritten notes, or resumes by their textual content.
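The two-step idea described above (OCR first, then keyword search over the resulting text labels) can be sketched roughly as follows. This is an illustrative sketch only, not the plugins' actual code: the `TextBlock` type, the helper names, and the sample data are all hypothetical. The input dict mimics the parallel-list shape returned by `pytesseract.image_to_data(image, output_type=pytesseract.Output.DICT)`, so the example runs without Tesseract installed.

```python
from dataclasses import dataclass


@dataclass
class TextBlock:
    """One OCR-detected word with its pixel bounding box (hypothetical type)."""
    text: str
    left: int
    top: int
    width: int
    height: int


def blocks_from_ocr_dict(data):
    """Turn a dict of parallel lists into TextBlocks, skipping blank entries.

    The dict shape is assumed to match pytesseract.image_to_data(...,
    output_type=Output.DICT), which includes 'text', 'left', 'top',
    'width', and 'height' keys.
    """
    blocks = []
    for i, raw in enumerate(data["text"]):
        word = raw.strip()
        if not word:
            continue  # the OCR output contains empty strings for layout rows
        blocks.append(
            TextBlock(word, data["left"][i], data["top"][i],
                      data["width"][i], data["height"][i])
        )
    return blocks


def keyword_search(samples, keyword):
    """Return ids of samples where any block contains the keyword (case-insensitive)."""
    needle = keyword.lower()
    return [
        sample_id
        for sample_id, blocks in samples.items()
        if any(needle in block.text.lower() for block in blocks)
    ]


# Made-up OCR output for two document images (illustration only).
page_1 = {
    "text": ["", "Senior", "Python", "Developer"],
    "left": [0, 10, 80, 160],
    "top": [0, 12, 12, 12],
    "width": [0, 60, 70, 90],
    "height": [0, 18, 18, 18],
}
page_2 = {
    "text": ["Chapter", "One", " "],
    "left": [20, 110, 0],
    "top": [30, 30, 0],
    "width": [80, 40, 0],
    "height": [22, 22, 0],
}

samples = {
    "resume.png": blocks_from_ocr_dict(page_1),
    "book_page.png": blocks_from_ocr_dict(page_2),
}

matches = keyword_search(samples, "python")
```

In a real pipeline the dicts would come from running Tesseract on each image, and the blocks would be stored as labels on the dataset's samples so that a search can highlight where the matching text appears.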

Company
Voxel51

Date published
Sept. 21, 2023

Author(s)
Jacob Marks

Word count
2148

Hacker News points
None found.

Language
English


By Matt Makai. 2021-2024.