
Extract Text From An Image Using Java | tesseract OCR | JavaTalent | Java  

Java Talent
Subscribers: 902
Views: 2.7K

Tesseract OCR Download:
tesseract-ocr-w64-setup-5.3.3.20231005.exe
github.com/tes... or
github.com/UB-...
tess4j library download:
jar-download.c...

Published: Sep 8, 2024

Comments: 6
@kushprakash123 6 months ago
Can we fetch bold words with Tesseract or any other open-source API?
@javatalent 6 months ago
You can look into:
1. In Tesseract 3 there is a metadata result which contains the recognized font. It is probably not super reliable, but it might work for basic fonts and distinguish bold from non-bold fonts.
2. In Tesseract 4 you can export hOCR output and configure it to get boxes around each character (not sure about Tesseract 3). I am not sure how reliable these boxes are either, but if they are good enough, you could use them with a second algorithm which just classifies whether a single character is bold or not, and then remove the non-bold text from the Tesseract output.
3. In case you have precise line boxes before running Tesseract, you could also look into training an algorithm which segments the part of the line that is bold, then crop the image and run Tesseract only on the bold parts. This would probably be the most technical solution, but I think it could work as well.
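The "second algorithm" from option 2 could start out as a plain stroke-thickness heuristic rather than a trained classifier. The sketch below is only an illustration under stated assumptions: it uses nothing but the Java standard library, it expects the box coordinates to come from somewhere else (for example an hOCR `bbox`), and the names `BoldHeuristic`, `meanStrokeWidth`, `looksBold`, the darkness threshold of 128 and the ratio of 1.5 are all made up for this example, not part of Tesseract or Tess4J. It measures the average length of horizontal runs of dark pixels inside a box; bold glyphs tend to have longer runs than regular glyphs of the same font and size.

```java
import java.awt.image.BufferedImage;

public class BoldHeuristic {

    /**
     * Mean length of horizontal runs of dark pixels inside the box
     * [x0, x1) x [y0, y1). A pixel counts as dark when its average
     * RGB value is below darkBelow (hypothetical threshold).
     */
    public static double meanStrokeWidth(BufferedImage img, int x0, int y0,
                                         int x1, int y1, int darkBelow) {
        long darkPixels = 0;
        long runs = 0;
        for (int y = y0; y < y1; y++) {
            boolean inRun = false;
            for (int x = x0; x < x1; x++) {
                int rgb = img.getRGB(x, y);
                int gray = (((rgb >> 16) & 0xFF) + ((rgb >> 8) & 0xFF) + (rgb & 0xFF)) / 3;
                boolean dark = gray < darkBelow;
                if (dark) {
                    darkPixels++;
                    if (!inRun) {
                        runs++; // a new run of dark pixels starts here
                    }
                }
                inRun = dark;
            }
        }
        return runs == 0 ? 0.0 : (double) darkPixels / runs;
    }

    /** Flags a box as bold when its strokes are clearly thicker than a reference. */
    public static boolean looksBold(double strokeWidth, double referenceWidth, double ratio) {
        return strokeWidth > referenceWidth * ratio;
    }

    /** Helper for the synthetic demo: fills a rectangle with one colour. */
    public static void fillRect(BufferedImage img, int x, int y, int w, int h, int rgb) {
        for (int yy = y; yy < y + h; yy++) {
            for (int xx = x; xx < x + w; xx++) {
                img.setRGB(xx, yy, rgb);
            }
        }
    }

    public static void main(String[] args) {
        // Synthetic image: a thin vertical stroke on the left, a thick one on the right.
        BufferedImage img = new BufferedImage(100, 40, BufferedImage.TYPE_INT_RGB);
        fillRect(img, 0, 0, 100, 40, 0xFFFFFF); // white background
        fillRect(img, 10, 5, 2, 30, 0x000000);  // 2 px wide stroke
        fillRect(img, 60, 5, 5, 30, 0x000000);  // 5 px wide stroke

        double regular = meanStrokeWidth(img, 0, 0, 50, 40, 128);
        double bold = meanStrokeWidth(img, 50, 0, 100, 40, 128);
        System.out.println("regular=" + regular + " bold=" + bold
                + " looksBold=" + looksBold(bold, regular, 1.5));
    }
}
```

In practice the reference width would probably be something like the median stroke width over all boxes on the page, and the image would need to be binarized and deskewed first. Mixed font sizes would likely break a single global threshold, so treat this as a starting point for experiments, not a reliable detector.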
@runrunning4359 4 months ago
Thank you for the video! The 3rd link doesn't work!
@javatalent 4 months ago
Yes, maybe. I have not tested that one.
@sukeshpandey9904 4 months ago
Hey, I want the data for all the fonts in Tesseract. Where can I get it?
@javatalent 4 months ago
Go through this link; it might have what you are looking for: tesseract-ocr.github.io/tessdoc/Fonts.html