Step-by-Step Handwriting Recognition Tutorial Using TensorFlow

Handwriting recognition is widely used in applications ranging from document scanning to reading notes and forms. In this tutorial, we will build a custom TensorFlow model to extract text from images of handwritten words using the IAM dataset. We will begin by collecting and preprocessing the dataset, then define our model architecture using a CNN with LSTM layers and a CTC loss function. We will then train and evaluate the model and finally test it on a small sample of the test dataset. Along the way, we will discuss ways to improve the model's performance, such as fine-tuning the hyperparameters, using a different dataset or augmenting the data, testing a different model architecture, or incorporating additional features. This tutorial provides a good starting point for building an OCR system using TensorFlow.
Text Version Tutorial: pylessons.com/handwriting-rec...
GitHub: github.com/pythonlessons/mltu...
pypi: pypi.org/project/mltu/
#machinelearning #python #tensorflow #opencv #ocr

Published: Aug 5, 2024

Comments: 165

@arielm2466 1 year ago

You are a life saver. I had a project on this whose deadline was moved up by 3 weeks, and this tutorial really helped.

@PyLessons 1 year ago

Nice, great that I could save you time!

@venkadaramananp9937 1 year ago

Bro, can you help me? I'm getting errors.

@user-wt6zq8wl8f 1 year ago

Hi, can you help me design a custom OCR?

@yogeshmodi392 1 year ago

What format should the labels be in for a custom image dataset for HCR, where each image has only a single character in it? By label format I mean the content of the label file with respect to the image.

@user-pu6sl8fg1d 5 months ago

Could you help me create an OCR with my own image dataset?

@AyushGuptaAyushgupta 1 year ago

Thank you sir, I'll use it for a project.

@PyLessons 1 year ago

You're welcome, for sure use it!

@hemantchauhan6437 4 months ago

NEED HELP! I am making a website where a user can upload a PDF, but I want the PDF to upload only if it contains images of only HANDWRITTEN text. Thank you for reading.

@atharvamahalle4825 1 year ago

How much time did it take you to train on the dataset?

@MemesNFacts 5 months ago

Please help: when I try to open the database website, it never opens. I really need the database.

@SHUAIZHANG-rs7vn 2 months ago

My teacher has asked me to do word and text line projects, and just reading tutorials and code is a bit overwhelming for me. Are there any relevant references so that I can understand the structure of the model and how the algorithms are implemented? I would appreciate it if you could give me an answer.

@SHUAIZHANG-rs7vn 2 months ago

Hello, is there any relevant paper for word recognition and text line recognition? I would like to know more about the principles behind their implementation.

@lilfeccibraemusic 1 year ago

Hi, sorry, how do I test this model on my own images?

@andrewpang7343 1 year ago

Hey, thanks so much

@PyLessons 1 year ago

You're welcome!

@frog_ictu 6 months ago

very good

@PyLessons 6 months ago

Thank you! Cheers!

@vkrts9176 1 year ago

Can we send a sentence in the form of an image to get the predictions?

@PyLessons 1 year ago

Yes we can, that's what my next tutorial will be about.

@sruthigayathrisrinivasan9141 6 months ago

Can I use this code for Tamil text recognition?

@yashkewlani2878 1 year ago

Can you please tell me how we can feed in our own input after training the model on the dataset?

@PyLessons 1 year ago

There is an example where I do so with test data after training; simply replace it with your files.

@businessmanagement4848 5 months ago

Where can I get the dataset?

@emalalekozai 1 year ago

Which Python library contains "ModelConfigs()"? When executing the command "configs = ModelConfigs()" I get the error message "NameError: name 'ModelConfigs' is not defined". Thanks.

@PyLessons 1 year ago

"ModelConfigs" is a class that you must import.

@shashidhardevraj 11 months ago

Thanks a lot for this video! I really appreciate the effort and time you put into making this video and documentation. Will your code help in converting a handwritten paragraph from an image into text output?

@PyLessons 11 months ago

It can't work with a full paragraph; that's a challenging task. You need to use OpenCV to separate each line of the paragraph. That would be easier to train and would work better.

@boitumelorethabile4102 8 months ago

@@PyLessons Can it work for handwritten text on a form with multiple lines? Or a table?

@user-yr2cb6ms3r 5 months ago

@@boitumelorethabile4102 Did you finish it? I am trying to make it work on a full paragraph, but it doesn't work.
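
Editor's note: the reply above only says to separate the paragraph into lines with OpenCV and does not show how. One common approach is a horizontal projection profile: count the ink pixels in every row and cut wherever a run of inked rows ends. The sketch below illustrates that idea in plain Python on a small 0/1 grid; it is not the author's code, and `split_text_lines` and `min_ink` are hypothetical names. With a real scan you would first binarize the image (for example with OpenCV thresholding) so that ink is 1 and background is 0.

```python
def split_text_lines(binary_rows, min_ink=1):
    """Return (start, end) row ranges of text lines; end is exclusive.

    binary_rows: list of rows, each a list of 0/1 values where 1 means ink.
    min_ink: a row counts as part of a text line if it has at least
             this many ink pixels (hypothetical tuning knob).
    """
    spans = []
    start = None
    for y, row in enumerate(binary_rows):
        has_ink = sum(row) >= min_ink
        if has_ink and start is None:
            start = y                      # a new text line begins
        elif not has_ink and start is not None:
            spans.append((start, y))       # the current text line ends
            start = None
    if start is not None:                  # line runs to the bottom edge
        spans.append((start, len(binary_rows)))
    return spans


page = [
    [0, 0, 0],
    [1, 1, 0],
    [0, 1, 0],
    [0, 0, 0],
    [0, 0, 0],
    [1, 0, 1],
    [0, 0, 0],
]
line_spans = split_text_lines(page)
```

Each returned range can then be cropped from the original image and passed to a line-level or word-level recognizer. Skewed or overlapping handwriting usually needs something more robust than a plain row profile.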

@pritamdas2232 1 year ago

How do I download the dataset?

@behzadbaghery2090 6 months ago

Nice❤❤

@PyLessons 6 months ago

Thanks 🤗

@cloydquisora4266 1 year ago

Is it possible to implement or create an Android application that can recognize handwriting and give a percentage feedback on how accurate the handwritten letter is?

@PyLessons 1 year ago

Yes, it is, but it's not as straightforward as you may think :)

@pancakekiemtienonline6562 1 year ago

Hi, can I use my custom dataset? My text is like "hình ảnh"; it's Latin script.

@PyLessons 1 year ago

Yes, you can

@eznex5249 3 months ago

Thanks for the video. Is it possible to use this model without doing the training process? Thanks

@PyLessons 3 months ago

The main purpose of the tutorial is to show how to train such a model, but yes, you can use it if that's enough for you.

@user-du7js7jx3d 1 year ago

Thank you, Sir! Your work is truly amazing! Can I use it in my project?

@PyLessons 1 year ago

Thanks, yes you can :)

@tanvir_ovi010 3 months ago

Hi, I have watched the full series, thanks for the good work. Can the model.h5 for handwritten words be converted to tflite? I want to try using it as an OCR for mobile devices.

@PyLessons 2 months ago

Yes, absolutely. I haven't tried, but you should be good with it.

@baludatascience3094 1 year ago

Hi Sir, thanks for the video. However, will this work on table-like handwritten docs? Let me know if you are planning any tutorial for that...

@PyLessons 1 year ago

You need to separate each sentence; otherwise it may not work, or it would be really hard to train.

@andrewperry6187 6 months ago

Great Video! I am trying to download the IAM dataset, but I have not gotten a verification email from them. Does anyone have any suggestions or help?

@Andrew2stronggaming 3 months ago

same

@acronym6589 7 months ago

I installed everything correctly but I get errors on importing anything from mltu.tensorflow or mltu.annotations. Is there a fix?

@PyLessons 7 months ago

Hey, thanks. I created mltu version 1.1.8, which should solve your issues, try it out :)

@user-vn5gs2pb3j 3 months ago

Hello sir! Will this model work on the current versions of Python and TensorFlow? I am using version 3.11.

@PyLessons 3 months ago

I haven't tested it with Python 3.11 and the latest TensorFlow, but as far as I know it may not be compatible with the latest version of TensorFlow.

@franskilyrics4139 1 year ago

Can I convert the model.h5 to tflite?

@PyLessons 1 year ago

Yes, you can

@vantanle4720 1 year ago

Can you give a link to download the IAM database? Your link doesn't work :(

@PyLessons 1 year ago

It works, sign up before downloading: fki.tic.heia-fr.ch/databases/download-the-iam-handwriting-database

@coelhucas 1 year ago

Hey sir, thanks for the video, it is very helpful. I'm having some trouble:
words = open(stow.join(dataset_path, "words.txt"), "r").readlines()
FileNotFoundError: [Errno 2] No such file or directory: 'Datasets/IAM_Words/words.txt'
Apparently the dataset was supposed to include a single txt file with the words, but after extracting I have a lot of folders with the txt files inside them.

@SonamSharma-ot9uo 1 year ago

Have you resolved this error? I am getting the same error.

@umandadikwatta178 11 months ago

Can you describe how to OCR a memes dataset that includes complex backgrounds?

@PyLessons 11 months ago

Are you asking how to extract text from memes? Segment the text in these images, then crop the text regions and run OCR on them :)

@cloudartwork 1 year ago

I registered for the IAM db so I can download data. How long does it take for them to email me back? Thanks for the video! Subbed!

@PyLessons 1 year ago

Usually you don't need to wait for an email back, as I remember

@vaibhavsinghrathore5897 10 months ago

Hey, can we run this program on macOS?

@PyLessons 10 months ago

I don't have macOS so I can't test, but it should work

@olenkanamaka4636 1 year ago

Hey, can you please help me? I have a problem at this step:
model.fit(
    train_dataset,
    validation_data=val_dataset,
    epochs=configs.train_epochs,
    callbacks=[earlystopper, checkpoint, trainLogger, reduceLROnPlat, tb_callback, model2onnx],
    workers=configs.train_workers
)
It seems that the fit() method is not able to handle the train_data_provider and val_data_provider inputs correctly.

@PyLessons 1 year ago

I can't tell you; there is no error message, nothing. Next time open an issue on GitHub. How can I be sure that you feed a data provider to the fit function and not a simple list of data?

@olenkanamaka4636 1 year ago

@@PyLessons Hey, can you help with this import?
Help on package mltu:
NAME
    mltu
PACKAGE CONTENTS
    augmentors
    callbacks
    configs
    dataProvider
    inferenceModel
    losses
    metrics
    model_utils
    preprocessors
    transformers
VERSION
    0.1.5
FILE
    /usr/local/lib/python3.9/dist-packages/mltu/__init__.py
from mltu.utils.text_utils import ctc_decoder, get_cer
No module named 'mltu.utils'

@PyLessons 1 year ago

@@olenkanamaka4636 It's a bug in version 0.1.5, update to version 0.1.7

@olenkanamaka4636 1 year ago

@@PyLessons thank you 💛

@user-dv5kn1gg7l 1 year ago

Thank you for the great tutorial. What Python version is used? I have 3.11.4, and it is not very happy with this.

@PyLessons 1 year ago

Hey, right now I recommend using 3.10. On the next release I'll check why it's not happy with 3.11 :D

@user-dv5kn1gg7l 1 year ago

@@PyLessons Thanks.

@user-yr2cb6ms3r 5 months ago

@@user-dv5kn1gg7l I agree with you. When I use Python 3.11 it doesn't work

@nareshmalviya3100 4 months ago

When I pass a whole image, can it detect all the text in one shot?

@PyLessons 4 months ago

Yes, that's what it does when we use CTC loss

@jayathakur6928 1 year ago

Can this model predict external images that aren't derived from datasets?

@PyLessons 1 year ago

Yes, if the images are at least similar to the dataset

@SHUAIZHANG-rs7vn 3 months ago

Did you do it? Please teach me how to recognize my own pictures.

@saadmasood4956 4 months ago

I have to build this project but in real time, i.e. using a camera that reads the text and recognizes it in real time. Can you guide me, or is there anywhere you know I can get a tutorial for this project? Thank you

@PyLessons 3 months ago

Usually you won't find a tutorial for your exact need. You have this tutorial of mine, which should help you a lot!

@bhanusri3732 7 months ago

I couldn't find documentation for the data provider in TensorFlow. Could you help me?

@PyLessons 7 months ago

Hey, you need to check the tf.keras.utils.Sequence object

@bhanusri3732 7 months ago

@@PyLessons thank you
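
Editor's note: `tf.keras.utils.Sequence` is built around two methods: `__len__` returns the number of batches per epoch, and `__getitem__(index)` returns one batch. The sketch below mirrors only that contract in plain Python so it runs without TensorFlow. It is not mltu's data provider, and the class name `SimpleBatchProvider` is hypothetical. A real provider would also load and preprocess the images and return arrays rather than lists.

```python
import math


class SimpleBatchProvider:
    """Plain-Python illustration of the __len__/__getitem__ batch contract."""

    def __init__(self, samples, batch_size):
        if batch_size < 1:
            raise ValueError("batch_size must be at least 1")
        self.samples = list(samples)       # list of (image_path, label) pairs
        self.batch_size = batch_size

    def __len__(self):
        # Number of batches per epoch; the last batch may be smaller.
        return math.ceil(len(self.samples) / self.batch_size)

    def __getitem__(self, index):
        if not 0 <= index < len(self):
            raise IndexError(index)
        start = index * self.batch_size
        batch = self.samples[start:start + self.batch_size]
        paths = [path for path, _ in batch]
        labels = [label for _, label in batch]
        return paths, labels


provider = SimpleBatchProvider(
    [("a.png", "a"), ("b.png", "b"), ("c.png", "c"), ("d.png", "d"), ("e.png", "e")],
    batch_size=2,
)
```

Here `len(provider)` is the number of steps a training loop would run per epoch, which is also why the step count shown during training changes when the batch size or dataset size changes.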

@user-yr2cb6ms3r 5 months ago

Is it normal for early stopping to trigger at epoch 107?

@PyLessons 5 months ago

Hey, it depends on your early stopping parameters etc. You need to check how your model was trained

@QzBoy 1 year ago

Hi Bro, is your solution applicable to Chinese HanZi? Thanks a lot!

@PyLessons 1 year ago

Hey, haven't tried, but I think it should work

@QzBoy 1 year ago

@@PyLessons Cool, thanks Bro!

@SHUAIZHANG-rs7vn 3 months ago

Hi, I get an AttributeError: "ImageToWordModel" object has no attribute "input_shapes". What can I do to fix it?

@PyLessons 3 months ago

It seems like you are using the latest tutorial code from GitHub with an older mltu version; use the latest mltu in this case

@SHUAIZHANG-rs7vn 3 months ago

@@PyLessons Problem solved. I would like to know how to do text recognition with my own images without labels. Do I need to preprocess the images?

@ruckydelmoro2500 1 year ago

Sir, I would like to ask: I'm learning AI. Can I test it with my own data? And how?

@ruckydelmoro2500 1 year ago

I would like to test with my own image data after training. How can I do that?

@peterj1298 1 year ago

Can it predict words that are not from the dataset?

@PyLessons 1 year ago

YES! That's the whole purpose of this tutorial; that's why we use validation data

@illiahimself 1 year ago

@@PyLessons How can we implement that type of task here?

@ProgrammingForStudents 3 months ago

Hi, I am facing the following error:
line 6, in <module>
from mltu.utils.text_utils import ctc_decoder, get_cer
ModuleNotFoundError: No module named 'mltu.utils'
Can you please check why this error occurs, although I have installed mltu using pip install mltu==0.1.5

@PyLessons 3 months ago

You can install the newest version and use the tutorial code from GitHub

@ProgrammingForStudents 3 months ago

@@PyLessons Thanks

@rodrigoillas6085 1 year ago

Ouch, internal server error. 502

@ProgrammingForStudents 4 months ago

Does this work with the Arabic language as well?

@PyLessons 4 months ago

I don't know, haven't tried. But it should

@ayushnauriyal8527 1 year ago

Sir, when I try to load this model after saving it as an .h5 file using model.save(), it shows an error about an unknown CTC loss function. Can anyone help me with that?

@PyLessons 1 year ago

load_model(path, compile=False) try this

@ayushnauriyal8527 1 year ago

@@PyLessons thank you, it loads just fine now

@ayushnauriyal8527 1 year ago

After loading the model I ran a prediction on an image:
img_path = '/content/a01-000u-s01-02.png'
import numpy as np
import cv2
img = cv2.imread(img_path)
img2 = cv2.resize(img, (1408, 96))
img2 = np.expand_dims(img2, axis = 0)
img2.shape
predic = model.predict(img2)
predic
After doing this I get a prediction like this:
array([[[1.2415222e-09, 7.0665460e-11, 1.3446735e-09, ..., 4.2244028e-11, 4.6788357e-10, 9.9999982e-01],
        [9.2563290e-11, 3.5853189e-12, 1.3980632e-09, ..., 1.8043905e-12, 4.3966573e-11, 9.9999994e-01],
        [9.2093624e-11, 8.1774344e-13, 6.3274475e-10, ..., 2.2918776e-13, 1.6017824e-11, 9.9999994e-01],
        ...,
        [1.4345783e-10, 3.5449276e-12, 7.9118561e-09, ..., 7.7375151e-13, 1.8447799e-11, 9.9999994e-01],
        [1.9966975e-10, 2.5930644e-12, 6.7138979e-09, ..., 9.9339905e-13, 1.3786991e-11, 9.9999994e-01],
        [2.6358316e-09, 7.7780477e-11, 3.4101092e-07, ..., 1.6719615e-11, 3.1711211e-10, 9.9999934e-01]]], dtype=float32)
How do I convert it to text?
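
Editor's note: this question went unanswered in the thread. The array above has shape (batch, timesteps, classes), and a model trained with CTC loss needs a decoding step to turn it into text. The usual greedy version takes the most likely class at each timestep, merges consecutive repeats, and drops the blank class. The sketch below shows that in plain Python. It assumes the blank is the last class index and that `vocab` is the same character list used during training; both are assumptions here, and `ctc_greedy_decode` is a hypothetical name, not the `ctc_decoder` helper from `mltu.utils.text_utils` imported elsewhere in this thread.

```python
def ctc_greedy_decode(probs, vocab):
    """Greedy CTC decoding for one sample.

    probs: list of timesteps, each a list of len(vocab) + 1 scores.
           Assumption: the last index is the CTC blank.
    vocab: string or list of characters, in training order.
    """
    blank = len(vocab)
    decoded = []
    previous = None
    for step in probs:
        best = max(range(len(step)), key=step.__getitem__)  # argmax
        # Emit a character only when the class changes and is not blank.
        if best != previous and best != blank:
            decoded.append(vocab[best])
        previous = best
    return "".join(decoded)


# Toy example with vocab "ab": columns are scores for a, b, blank.
toy_probs = [
    [0.9, 0.05, 0.05],   # a
    [0.8, 0.1, 0.1],     # a (repeat, merged)
    [0.1, 0.1, 0.8],     # blank (separates the two a's)
    [0.7, 0.2, 0.1],     # a
    [0.1, 0.8, 0.1],     # b
    [0.2, 0.7, 0.1],     # b (repeat, merged)
    [0.1, 0.1, 0.8],     # blank
]
toy_text = ctc_greedy_decode(toy_probs, "ab")
```

For a batched NumPy prediction like the one in the comment, one sample could be passed as `predic[0].tolist()`. Rows where the last column is close to 1.0 are timesteps where the model predicts the blank.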

@user-wt6zq8wl8f 1 year ago

Can you help me create a custom OCR?

@PyLessons 1 year ago

I already helped by creating this tutorial

@jay-uw9rx 1 year ago

How do we find the accuracy, sir?

@PyLessons 1 year ago

For word accuracy we use CER (Character Error Rate). I introduced it in this tutorial; read the text version of the tutorial or watch the full video :)
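
Editor's note: CER is commonly defined as the Levenshtein edit distance between the predicted text and the reference text, divided by the length of the reference. The sketch below is a minimal plain-Python version of that definition, not the `get_cer` helper from mltu; the function names are hypothetical, and the handling of an empty reference is a convention chosen for this example.

```python
def edit_distance(a, b):
    """Levenshtein distance between strings a and b (row-by-row DP)."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            substitution = previous[j - 1] + (0 if char_a == char_b else 1)
            deletion = previous[j] + 1
            insertion = current[j - 1] + 1
            current.append(min(substitution, deletion, insertion))
        previous = current
    return previous[-1]


def character_error_rate(prediction, reference):
    """Edit distance divided by reference length; lower is better."""
    if not reference:
        # Convention for this sketch: empty vs. empty is a perfect match.
        return 0.0 if not prediction else 1.0
    return edit_distance(prediction, reference) / len(reference)


example_cer = character_error_rate("helo", "hello")  # one missing character
```

A CER of 0.0 means the prediction matches the reference exactly. Values can exceed 1.0 when the prediction is much longer than the reference.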

@riswangp 1 year ago

Bro, downloading the dataset gives an error

@riswangp 1 year ago

I used the code that you've given

@Harregarre 1 month ago

19:24 is too relatable

@user-hs2od8dy7q 1 year ago

Sir, I am facing problems installing the mltu package in Python 3.7 using conda. Please help

@PyLessons 1 year ago

Have you tried the latest version? What error do you get? Try updating your Python version; many libraries don't support 3.7 anymore

@abhideep2004 1 year ago

@@PyLessons Sir, the link to the IAM dataset is not working. Please help, sir

@vigneshvicky6720 1 year ago

Excuse me sir, I am doing a project to recognize just letters, not words, and also a pen or pencil line drawn on paper... Please help me sir, I need a dataset first

@PyLessons 1 year ago

I can't help you get a dataset; you can use the mnist dataset for letters

@vigneshvicky6720 1 year ago

@@PyLessons How is mnist used for letters?

@PyLessons 1 year ago

@@vigneshvicky6720 Sorry, it's EMNIST, not mnist

@vigneshvicky6720 1 year ago

@@PyLessons Sir, did you make the dataset that you used to recognize handwritten digits with YOLO v3 yourself?

@PyLessons 1 year ago

@@vigneshvicky6720 No, it's the mnist dataset

@adamofucci4558 1 year ago

🤗😘

@parikshitbarua8520 1 year ago

Training is taking a lot of time. One epoch takes around 14 minutes. For you it shows 1221 per epoch, but for me 5461 per epoch

@parikshitbarua8520 1 year ago

At this rate it will take 10+ days. Can you help me with this?

@PyLessons 1 year ago

What GPU do you have? Are you sure you are training on the GPU?

@calioutmyname 7 months ago

@@PyLessons I am using AMD Radeon Graphics, and I can't run tf with the GPU. Any suggestions for how to run the training efficiently?

@RamPrasad-vg5ii 1 year ago

Hello sir, I'm getting the following error:
Traceback (most recent call last):
  File "d:\Ram ew project fy\mltu-main\Tutorials\03_handwriting_recognition\train.py", line 74, in <module>
    configs.save()
  File "C:\Python310\lib\site-packages\mltu\configs.py", line 16, in save
    stow.mkdir(self.model_path)
  File "C:\Python310\lib\site-packages\stow\stateless.py", line 199, in mkdir
    return manager.mkdir(relpath,*args, **kwargs)
  File "C:\Python310\lib\site-packages\stow\manager\manager.py", line 866, in mkdir
    return self.put(directory, path, overwrite=overwrite)
  File "C:\Python310\lib\site-packages\stow\manager\manager.py", line 638, in put
    source = self._findArtefact(self.abspath(source))
  File "C:\Python310\lib\site-packages\stow\manager\manager.py", line 215, in _findArtefact
    return manager[self.abspath(source)]
  File "C:\Python310\lib\site-packages\stow\manager\manager.py", line 71, in __getitem__
    return self._loadArtefact(path)
  File "C:\Python310\lib\site-packages\stow\manager\manager.py", line 188, in _loadArtefact
    raise exceptions.ArtefactNotFound("Couldn't locate artefact {}".format(managerPath))
stow.exceptions.ArtefactNotFound: Couldn't locate artefact /Users/USER/AppData/Local/Temp/tmptvef7ore .
Can you please help me?

@PyLessons 1 year ago

Try pip uninstall stow and then pip install stow

@RamPrasad-vg5ii 1 year ago

@@PyLessons What Python version do I have to use?

@PyLessons 1 year ago

@@RamPrasad-vg5ii I am using 3.10, but it shouldn't be a problem with Python. What OS do you use?

@RamPrasad-vg5ii 1 year ago

@@PyLessons Windows 11

@RamPrasad-vg5ii 1 year ago

I uninstalled and reinstalled the latest Python version and got an error when downloading mltu

@user-ig9eu8ew9w 1 year ago

Does that work with Arabic please??

@PyLessons 1 year ago

It should work; I haven't tried

@pritamdas2232 1 year ago

The zip file download link is not working, can anyone help please?

@PyLessons 1 year ago

I don't know, for me it works...

@pritamdas2232 1 year ago

@@PyLessons How many days ago did you try?

@PyLessons 1 year ago

@@pritamdas2232 OK, it seems it doesn't work anymore. Download from the official link: fki.tic.heia-fr.ch/databases/download-the-iam-handwriting-database I need to find another working link...

@comendantcristian3413 1 year ago

@@PyLessons This link does not have the .txt file, only the words. Can you suggest what I should do?

@PyLessons 1 year ago

@@comendantcristian3413 Download fki.tic.heia-fr.ch/DBs/iamDB/data/ascii.tgz, it has words.txt
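
Editor's note: once `words.txt` is in place, each data line needs to be mapped to an image path and a label. The sketch below shows one way to do that in plain Python. It assumes the layout described in the header of the IAM `words.txt` file (word id first, segmentation status `ok`/`err` second, transcription last) and assumes the images sit under `words/<form prefix>/<form id>/<word id>.png`. The function name and the choice to skip `err` lines are illustrative assumptions, not a copy of the tutorial's code.

```python
import os


def parse_words_line(line, dataset_path):
    """Return (image_path, label) for one words.txt line, or None to skip it."""
    if line.startswith("#"):
        return None                        # header/comment line
    parts = line.rstrip("\n").split(" ")
    if len(parts) < 3 or parts[1] == "err":
        return None                        # malformed or badly segmented word
    word_id = parts[0]                     # e.g. a01-000u-00-00
    id_parts = word_id.split("-")
    folder1 = id_parts[0]                  # e.g. a01
    folder2 = "-".join(id_parts[:2])       # e.g. a01-000u
    label = parts[-1]                      # transcription is the last field
    image_path = os.path.join(
        dataset_path, "words", folder1, folder2, word_id + ".png"
    )
    return image_path, label


sample_line = "a01-000u-00-00 ok 154 1 408 768 27 51 AT A\n"
sample = parse_words_line(sample_line, os.path.join("Datasets", "IAM_Words"))
```

If the resulting paths do not exist on disk, the usual cause is that the archive with the word images was extracted into a different folder than the one assumed here.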

@jasonjunio388 8 months ago

no audio

@PyLessons 7 months ago

You need to turn it on then :)

@kendrickcasanova9938 1 year ago

💕 promo sm

@modestebolina3054 1 year ago

Hey, can I please get your contact? I need your help to make the project run, because I've tried so hard but am still running into errors... It's really an emergency and I hope to get a reply, thanks

@atharvamahalle4825 1 year ago

Sir, how can I solve this error? Please reply, it's very urgent:
image = cv2.resize(image, self.input_shape[:2][::-1])
cv2.error: OpenCV(4.6.0) D:\a\opencv-python\opencv-python\opencv\modules\imgproc\src\resize.cpp:4052: error: (-215:Assertion failed) !ssize.empty() in function 'cv::resize'

@PyLessons 1 year ago

Read the error: either the size is None or the image is None
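
Editor's note: `cv2.imread` returns `None` instead of raising when it cannot read a file, so a wrong or missing path only shows up later as the `!ssize.empty()` assertion inside `cv2.resize`. One way to catch this early is to check the paths before any image is loaded. The sketch below does that with the standard library only; `split_existing` is a hypothetical helper name, not part of mltu or OpenCV.

```python
import os


def split_existing(paths):
    """Split paths into (found, missing) based on whether the file exists."""
    found, missing = [], []
    for path in paths:
        if os.path.isfile(path):
            found.append(path)
        else:
            missing.append(path)
    return found, missing
```

Running this over the dataset's image paths before training and printing the `missing` list usually points straight at the wrong folder or file name. A file that exists but is corrupt can still make `cv2.imread` return `None`, so checking the result of `imread` for `None` before resizing is still worthwhile.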