How to Fine-tune LayoutLMv3: Fine-tune LayoutLMv3 with Your Custom Data | Part -3 Fine tuning

In this tutorial, we will learn how to fine-tune LayoutLMv3 on annotated documents, using PaddleOCR for text extraction. LayoutLMv3 is a multimodal document-understanding model that combines text, layout, and image features; it relies on an OCR engine to supply the words and their bounding boxes. PaddleOCR is an open-source OCR system that supports a variety of languages and document types.
To fine-tune LayoutLMv3 with annotated documents, we will need the following tools:
1. PaddleOCR
2. Label Studio
3. Transformers (Hugging Face)
Code link : github.com/manikanthp/LayoutL...
LayoutLMv3, Fine-tune, Annotated Documents, PaddleOCR, Text Recognition, Document Layout Analysis, Computer Vision, Natural Language Processing, Deep Learning

Published: 10 Jul 2023

Comments: 95

@truehighs7845 5 months ago

Hi Mani, I did launch the webserver with auth and I can access the images, uploaded the json, but in Label Studio, if I swap the ocr field to 'img' from 'string' it won't show the image, (brokenData)? Any idea?

@user-es3rp4lz6m 3 months ago

What version of transformers do you use? because I'm getting this error when I run main.py : ImportError: cannot import name 'PreTokenizedEncodeInput' from 'transformers' (C:\Users\khaou\AppData\Local\Programs\Python\Python312\Lib\site-packages\transformers\__init__.py)

@richierosewall3035 10 months ago

The best video I've ever seen for layoutLM. Where is the part 4? Audio is not clear at the end of the video. If possible go over python inference so that viewers can understand clearly. Keep rocking 🎉🎉

@AIOdysseyhub 10 months ago

Hi, thank you very much for the support; I'm happy to see this comment. I have uploaded the inference video (Part 4) as well. Please check it out, and if you hit any errors or have questions, comment on the respective video and I will try my best to resolve them. ru-vid.com/video/%D0%B2%D0%B8%D0%B4%D0%B5%D0%BE-MnpJKKSYJDw.html Thank you again.

@hopebeats5482 5 months ago

@@AIOdysseyhub This is the error I'm getting:
Traceback (most recent call last):
  File "d:\Python\visual-ocr-label\src\main.py", line 35, in <module>
    train_loss = train_fn(dataload, model, optimizer)
  File "d:\Python\visual-ocr-label\src\engine.py", line 11, in train_fn
    _, loss = model(**data)
  File "C:\Users\Saujanya Basnet\AppData\Local\Programs\Python\Python311\Lib\site-packages\torch\nn\modules\module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "C:\Users\Saujanya Basnet\AppData\Local\Programs\Python\Python311\Lib\site-packages\torch\nn\modules\module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "d:\Python\visual-ocr-label\src\trainer.py", line 33, in forward
    loss = loss_fn(output,lables)
  File "d:\Python\visual-ocr-label\src\trainer.py", line 12, in loss_fn
    return nn.CrossEntropyLoss()(pred.view(-1,4),target.view(-1))
  File "C:\Users\Saujanya Basnet\AppData\Local\Programs\Python\Python311\Lib\site-packages\torch\nn\modules\module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "C:\Users\Saujanya Basnet\AppData\Local\Programs\Python\Python311\Lib\site-packages\torch\nn\modules\module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "C:\Users\Saujanya Basnet\AppData\Local\Programs\Python\Python311\Lib\site-packages\torch\nn\modules\loss.py", line 1179, in forward
    return F.cross_entropy(input, target, weight=self.weight,
  File "C:\Users\Saujanya Basnet\AppData\Local\Programs\Python\Python311\Lib\site-packages\torch\nn\functional.py", line 3053, in cross_entropy
    return torch._C._nn.cross_entropy_loss(input, target, weight, _Reduction.get_enum(reduction), ignore_index, label_smoothing)
ValueError: Expected input batch_size (1664) to match target batch_size (512).
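
One possible reading of this `batch_size` mismatch: the traceback shows `pred.view(-1,4)` in `trainer.py`, which hard-codes four classes. If the classification head was created with a different number of labels, the reshaped predictions end up with more rows than the flattened targets. The sketch below only does that arithmetic. `implied_num_labels` is a hypothetical helper, not part of the repository, and it assumes the logits have shape (batch, sequence, num_labels).

```python
def implied_num_labels(input_rows: int, target_rows: int, hardcoded_classes: int = 4) -> int:
    """Recover the class count implied by the numbers in the ValueError.

    Assumption: pred.view(-1, hardcoded_classes) produced
        input_rows = target_rows * num_labels / hardcoded_classes
    so num_labels = input_rows * hardcoded_classes / target_rows.
    """
    total = input_rows * hardcoded_classes
    if total % target_rows != 0:
        raise ValueError("numbers are not consistent with a label-count mismatch")
    return total // target_rows


# (input batch_size, target batch_size) pairs quoted in error reports in this thread.
for input_rows, target_rows in [(1664, 512), (1536, 1024), (2048, 1024)]:
    print(input_rows, target_rows, implied_num_labels(input_rows, target_rows))
```

If this reading is right, replacing the hard-coded `4` with the model's own label count (for example `pred.view(-1, pred.size(-1))` in PyTorch) should make the two shapes agree.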

@_sunitgamer_2629 2 months ago

If your code works can you please help??

@Lucifer18100 18 days ago

Hi, thanks for the video. When I run main, I get a "RuntimeError: grad can be implicitly created only for scalar outputs". Can anyone help me solve this?

@gibsosmart 11 months ago

Thanks Mani for the detailed video. It helps a lot. Please share the solution for inference as well along with Docker.

@AIOdysseyhub 11 months ago

Thanks for the support. Sure, I will share the inference code. Please subscribe to the channel for more related videos. Thanks again 😃😃

@gibsosmart 11 months ago

@@AIOdysseyhub Yes, I did already. In the meantime, can you please point us to some code that we can explore?

@chandanha9532 4 months ago

After running the main.py file i am getting the below error, how can I resolve this?? ValueError: Expected input batch_size (1536) to match target batch_size (1024).

@user-ko1ih2oi6j 5 months ago

I am getting following error "ValueError: Expected input batch_size (2048) to match target batch_size (1024)." please help me to resolve this issue

@koyelimajumder8224 10 months ago

Can we annotate multiple values from single key ? For example if there are multiple entries in a pdf or there is a table and I want to extract all the rows for a single table header?

@AIOdysseyhub 10 months ago

You can do it. The OCR may produce a single bounding box covering multiple words, but you can give each bounding box the specific label you want, so several boxes can carry the same label. I hope that makes sense; if not, please let me know. Thank you for the support and for reaching out. Please subscribe to the channel 😄🙏

@williamliu168 10 months ago

Thank you for the tutorial! The bboxes coming from Label Studio seem to range from 0 to 100, but LayoutLMv3 still requires 0 to 1000. Should I multiply by 10?

@AIOdysseyhub 10 months ago

Thanks for the kind words 😌. There is no need to multiply; trace the error back and recheck, because I have already done the scale-up and scale-down for the boxes. Also check that you are using the correct variable names. Thanks for reaching out in the comments. If you are still stuck with the same error, please let me know and I will try my best to resolve it. Thanks again, and please subscribe to the channel for more such videos 😄👍
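
For readers who want to check the scaling themselves, here is a minimal sketch of the conversion discussed above. It assumes a box given as top-left `x`, `y` plus `width`, `height`, all as percentages of the image size (0 to 100), and maps it to `[x0, y0, x1, y1]` on the 0 to 1000 scale. `percent_box_to_1000` is a hypothetical helper for illustration, not the function used in the video's code.

```python
from typing import List


def percent_box_to_1000(x: float, y: float, width: float, height: float) -> List[int]:
    """Convert a percent-based box (x, y, width, height in 0-100)
    into [x0, y0, x1, y1] on the 0-1000 scale."""

    def scale(value: float) -> int:
        # Percent -> per-mille, clamped so rounding never leaves the valid range.
        return max(0, min(1000, round(value * 10)))

    return [scale(x), scale(y), scale(x + width), scale(y + height)]


print(percent_box_to_1000(12.5, 40.0, 25.0, 5.0))
```

Multiplying by 10 is only correct if the scaling has not already been applied elsewhere in the pipeline, which is the point of the reply above.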

@mzkhan4023 11 months ago

I am also waiting for the next part

@AIOdysseyhub 11 months ago

Sure, will release ASAP.

@user-cq5wl1jk5r 10 months ago

@@AIOdysseyhub Please do it ASAP... really helpful 🙏

@parthmodi8792 8 months ago

Hey, I am getting the error below when trying to run the "Main.py" file: "The tokenizer class you load from this checkpoint is not the same type as the class this function is called from. It may result in unexpected tokenization. The tokenizer class you load from this checkpoint is 'LayoutLMTokenizer'. The class this function is called from is 'LayoutLMv3Tokenizer'." Can you please help?

@AIOdysseyhub 8 months ago

Please choose the correct tokenizer, and check whether the variable name is correct.

@user-dy2me1fk2z 1 year ago

Thank you for the tutorial and it's quite helpful! I want to know when the inference part will be available?

@AIOdysseyhub 1 year ago

I really appreciate your awesome support and the kind comment! It means a lot to me. 😊 I'm excited to let you know that I'm hard at work creating the video you've been looking forward to. Your patience is truly valued, and I can't wait to share it with you. Stay tuned for some great content coming your way! Thanks again for being such an awesome part of our community.

@adrienlefevre5352 1 year ago

Great tutorial! Looking forward to learning how to load the model and how to predict classes on new document, in your next video!

@gibsosmart 11 months ago

Yes. Please share the inference part as well, soon.

@narendrabhole2534 8 months ago

I'm getting following error while training model "ValueError: Unable to create tensor, you should probably activate truncation and/or padding with 'padding=True' 'truncation=True' to have batched tensors with the same length. Perhaps your features (`labels` in this case) have excessive nesting (inputs type `list` where type `int` is expected)."

@AIOdysseyhub 8 months ago

Please convert your labels from strings to integers (one-hot encoding), as mentioned in the video. Please let me know if that doesn't help.

@narendrabhole2534 8 months ago

Thanks... yes. Yesterday I noticed that the integers were in double quotes. After removing the double quotes it worked.
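
The fix described in this exchange (string labels, including quoted integers such as "2", turned into plain integers) can be sketched as below. Strictly speaking this is integer label encoding rather than one-hot encoding. The label names and both helper functions are hypothetical; the actual JSON layout used in the video's code may differ.

```python
from typing import Dict, List


def build_label_map(label_names: List[str]) -> Dict[str, int]:
    """Assign a stable integer id to every distinct label name."""
    return {name: index for index, name in enumerate(sorted(set(label_names)))}


def encode_labels(labels: List[object], label2id: Dict[str, int]) -> List[int]:
    """Turn a mixed list of labels into plain integers."""
    encoded = []
    for label in labels:
        if isinstance(label, int) and not isinstance(label, bool):
            encoded.append(label)            # already an integer
        elif isinstance(label, str) and label.strip().isdigit():
            encoded.append(int(label))       # "2" -> 2, the quoted-integer case
        else:
            encoded.append(label2id[label])  # "Date" -> its integer id
    return encoded


# Hypothetical label names, for illustration only.
label2id = build_label_map(["Date", "Invoice_No", "Total", "Other"])
print(label2id)
print(encode_labels(["Date", "2", 3, "Total"], label2id))
```

Running a check like this over the exported JSON before training makes it easier to spot a single record that still carries string labels.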

@HanmantDeshmane-ym5mu 3 months ago

@@AIOdysseyhub Hi Mani, I was also facing the same error: "ValueError: Unable to create tensor, you should probably activate truncation and/or padding with 'padding=True' 'truncation=True' to have batched tensors with the same length. Perhaps your features (`labels` in this case) have excessive nesting (inputs type `list` where type `int` is expected)." I have also tried this solution, but it's not working for me. I already did the one-hot encoding as mentioned in the video. Please help with this.

@user-kz3tq4oh7c 10 months ago

Can we train the model using a GPU? If yes, how do we edit the code to do so? I am training it on Colab, but it isn't using the GPU to train.

@AIOdysseyhub 10 months ago

While loading the model, you need to use torch.device('cuda') to train on the GPU.

@darwin2k 1 year ago

when do you publish part 4 - how to use the trained model? Great job!

@AIOdysseyhub 1 year ago

Sure, I will explain how to do inference with the trained model. I am also working on Drag GANs, so it's taking time, but I will upload the video and the code as well. Thank you very much for the support 🙏. Please feel free to provide feedback, and like the videos if you enjoy them. 😄

@narendrabhole2534 6 months ago

I'm trying to execute this on Colab. However, I'm getting the following error while executing the Main.py code block: "RuntimeError: grad can be implicitly created only for scalar outputs". How can we set up this entire script for Google Colab?

@truehighs7845 5 months ago

You can't put the whole main in a cell, because there is no main; the scripts need to run in succession. You have to break it down into functions, isolate the input and output files in the temporary folder or in your Google Drive (which you need to mount), and run everything sequentially. You are better off using his code; it needs a couple of tweaks, but altogether it works. In the file that produces the Label Studio JSON for labelling, the function for creating a URL appears twice. You need to get that URL right, because Label Studio uses it to render your image.

@user-es3rp4lz6m 3 months ago

I'm getting this error when I run main.py : PreTokenizedEncodeInput must be Union[PreTokenizedInputSequence, Tuple[PreTokenizedInputSequence, PreTokenizedInputSequence]]

@_sunitgamer_2629 2 months ago

Is this resolved??

@yeojinkim5100 1 year ago

Thanks for the nice video! :) I have some questions. 1. How was your f1 score?, 2. Is it possible to extract the data of a specific key pair like date or invoice number in json format after model training? If so, how?

@AIOdysseyhub 1 year ago

Hi @yeojinkim5100, I have not added evaluation metrics to the source code. The evaluation metric used in the LayoutLMv3 paper was mAP at IoU 0.50:0.95; you can use the F1 score as well, and you can customize the code as you wish. In the video I only showed how to start the training process. I actually annotated 500+ images in my organization; we cannot share that data, but the results were amazing. You can implement the same thing using the code mentioned in the video. Thank you for the support, and if you have any more questions, please feel free to ask. Thanks again 😊😊
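
On the second question (getting a specific field such as a date or invoice number out as JSON), one simple approach is to group the words by their predicted label after inference. The sketch below assumes you already have a list of words and one predicted label name per word; the sample words, the label names, and `group_words_by_label` are all hypothetical and only illustrate the post-processing step.

```python
import json
from typing import Dict, List, Sequence


def group_words_by_label(words: Sequence[str], labels: Sequence[str],
                         ignore: Sequence[str] = ("O",)) -> Dict[str, str]:
    """Join all words that share a predicted label, in reading order."""
    grouped: Dict[str, List[str]] = {}
    for word, label in zip(words, labels):
        if label in ignore:
            continue  # skip words the model marked as background
        grouped.setdefault(label, []).append(word)
    return {label: " ".join(parts) for label, parts in grouped.items()}


# Hypothetical OCR words and per-word predictions, for illustration only.
words = ["Invoice", "No:", "INV-001", "Date:", "10", "Jul", "2023"]
labels = ["O", "O", "invoice_number", "O", "date", "date", "date"]
print(json.dumps(group_words_by_label(words, labels), indent=2))
```

This flattens repeated fields into one string per label; tables or multi-row documents would need an extra grouping step, for example by row position.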

@tejareddy4416 1 year ago

@@AIOdysseyhub Will this work for key-value relationship extraction? In the FUNSD data, the annotation file for every form has something called "linking", which establishes a relation between two fields, but I don't see anything like that in your dataset. You should also have shown inference after training the model, because I don't think this will work correctly for key-value extraction.

@AIOdysseyhub 1 year ago

@@tejareddy4416 Hi Teja, linking of key-value pairs is done in the LayoutXLM model, which is an advanced version of the LayoutLMv3 model, not in this LMv3 model. To show inference I would need to label more images for training; in the video I labelled just 10 images to show how we can start training LayoutLMv3 on our custom data. I have trained the LMv3 model with 500+ images of my organization's data, which is confidential. It worked: we were able to extract exactly the information we wanted, but linking does not happen, so for that we applied custom logic based on coordinate values. Thanks for watching the video and for the comment. Please like and subscribe to the channel; in future videos I will improve the quality of the content and upload more AI-related videos. Thank you again.
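
The reply does not say what the coordinate-based logic looked like, so the following is only one possible heuristic, not the author's method: pair each key box with the cheapest value box that starts at or to the right of the key, penalising vertical distance so that values on the same row win. The boxes, the texts, the `row_weight` value, and `link_keys_to_values` itself are all hypothetical.

```python
from typing import Dict, Optional, Tuple

Box = Tuple[int, int, int, int]  # (x0, y0, x1, y1)


def center(box: Box) -> Tuple[float, float]:
    """Return the centre point of a box."""
    return ((box[0] + box[2]) / 2, (box[1] + box[3]) / 2)


def link_keys_to_values(keys: Dict[str, Box], values: Dict[str, Box],
                        row_weight: float = 5.0) -> Dict[str, Optional[str]]:
    """Link each key to the value with the lowest positional cost."""
    links: Dict[str, Optional[str]] = {}
    for key_text, key_box in keys.items():
        key_x, key_y = center(key_box)
        best_value, best_cost = None, float("inf")
        for value_text, value_box in values.items():
            if value_box[0] < key_box[0]:
                continue  # ignore values that start to the left of the key
            value_x, value_y = center(value_box)
            # Vertical distance is weighted more heavily than horizontal distance.
            cost = abs(value_x - key_x) + row_weight * abs(value_y - key_y)
            if cost < best_cost:
                best_value, best_cost = value_text, cost
        links[key_text] = best_value
    return links


# Hypothetical boxes on the 0-1000 scale.
keys = {"Invoice No:": (50, 100, 200, 120), "Date:": (50, 140, 120, 160)}
values = {"INV-001": (220, 100, 320, 120), "10 Jul 2023": (140, 140, 280, 160)}
print(link_keys_to_values(keys, values))
```

A heuristic like this breaks down on dense tables or multi-column layouts, which is where a model with explicit relation extraction becomes more attractive.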

@moetezsoltani2182 1 year ago

Does it work only for English language??

@AIOdysseyhub 1 year ago

@@moetezsoltani2182 No, it will work irrespective of the language, but for both training and inference we need to use an OCR model that supports the language you want to extract. Thanks for the comment, and please like and subscribe to the channel for more such videos; in upcoming videos I will improve the quality and content. Thanks again for the support.

@mohammedmuzammilkhan3043 11 months ago

At 4:10, Training_layoutLMV3: can we use it for training the LayoutLM model?

@AIOdysseyhub 11 months ago

yes

@user-cq5wl1jk5r 10 months ago

Hi @AIOdysseyhub. Thank you for sharing this content. I am facing an issue while running the 'main.py' file. While training on my dataset (the Training_layoutLMV3.json file) I get this error: "TypeError: PreTokenizedEncodeInput must be Union[PreTokenizedInputSequence, Tuple[PreTokenizedInputSequence, PreTokenizedInputSequence]]". Training is fine up to 85%, but after that it shows this error. Can you please help me solve this issue? 🙏

@AIOdysseyhub 10 months ago

Hi, thanks for the kind words ☺️ Have you converted the labels in the JSON file to integers (one-hot encoding)? I think that is the reason for the errors. If you have, check whether they are correctly mapped. If you still get the same error, let me know. Thanks again for reaching out in the comments 😊

@user-cq5wl1jk5r 9 months ago

Hi @AIOdysseyhub . Yes I did the one hot encoding part. I used 6 labels instead of 4 here and I am working on invoice dataset. Here i will provide the link of my 'Training_json' file : drive.google.com/file/d/1zbRoKxabZ8Uc3kr56c-a9IBSeyKIaPdF/view?usp=sharing

@someetsingh1917 9 months ago

@@user-cq5wl1jk5r Got any result?

@musaibahmed3145 7 months ago

Did you manage to find a solution for this? I'm stuck here even after one-hot encoding.

@musaibahmed3145 7 months ago

@@user-cq5wl1jk5r were you able to solve it? I'm getting the same error
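
The thread does not record a resolution for the `PreTokenizedEncodeInput` error. One common cause of that `TypeError` is a word list that contains something other than strings (for example a number or `null` in the JSON), which would also explain training running for a while before failing on one record. The sketch below scans a dataset for such entries. The `"tokens"` key, the sample records, and `find_non_string_words` are assumptions about the data layout, not taken from the repository.

```python
from typing import Any, Dict, List, Tuple


def find_non_string_words(records: List[Dict[str, Any]],
                          words_key: str = "tokens") -> List[Tuple[int, int, Any]]:
    """Return (record index, word index, value) for every word that is not a str."""
    problems = []
    for record_index, record in enumerate(records):
        for word_index, word in enumerate(record.get(words_key, [])):
            if not isinstance(word, str):
                problems.append((record_index, word_index, word))
    return problems


# Hypothetical records, for illustration only.
records = [
    {"tokens": ["Total", 1250, "USD"]},
    {"tokens": ["Date", None]},
    {"tokens": ["Invoice", "No"]},
]
print(find_non_string_words(records))
```

If entries like these turn up, converting numbers with `str(...)` and dropping empty or missing words (together with their boxes and labels) is a reasonable first thing to try.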

@avi_9243 1 year ago

Can you please share some resources on how to create a dataset for the Donut model?

@AIOdysseyhub 1 year ago

Hi Abhi, sure, I will search for you, and if I find any I will share them. Thanks for the comment and the support. Let me know if you want me to make a video on Donut as well.

@gibsosmart 11 months ago

why should we convert the text labels to integers?

@AIOdysseyhub 11 months ago

Because the model cannot work with text labels directly, we are manually doing one-hot encoding.

@gibsosmart 11 months ago

Thank you for quick response. I realised it when running it with string.

@rajasekark7131 7 months ago

Hi bro, why did you not split the dataset into train and validation sets?

@AIOdysseyhub 7 months ago

Hi, we split into three sets (train, test, valid) if we want to test the model's performance while training; by splitting into only two, we test the model's performance after training is done for that iteration. Thank you for the support. Please subscribe to the channel for more such videos.
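
For readers who want a held-out set, a reproducible train/validation split of the annotated examples can be sketched with the standard library alone. `split_dataset`, the 80/20 ratio, and the placeholder file names are assumptions for illustration; the video's code does not include this step.

```python
import random
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def split_dataset(items: Sequence[T], valid_fraction: float = 0.2,
                  seed: int = 42) -> Tuple[List[T], List[T]]:
    """Shuffle deterministically and split into (train, validation) lists."""
    if not 0.0 < valid_fraction < 1.0:
        raise ValueError("valid_fraction must be between 0 and 1")
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # same seed -> same split
    n_valid = max(1, round(len(shuffled) * valid_fraction))
    return shuffled[n_valid:], shuffled[:n_valid]


# Placeholder names standing in for annotated documents.
documents = [f"doc_{i}.png" for i in range(10)]
train_docs, valid_docs = split_dataset(documents)
print(len(train_docs), len(valid_docs))
```

Splitting at the document level, rather than at the word level, keeps all words of one page on the same side of the split.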

@shreyanshsahu8038 11 months ago

Hello @AIOdysseyhub, thanks for helping me understand the full flow of LayoutLMv3 across all 3 parts. I am waiting for the 4th part, so can you please give me an update?

@AIOdysseyhub 11 months ago

Hi Shreyan, thank you for reaching out through the comments. Sure, I will make the inference video as well. Please subscribe to the channel for more AI/ML-related videos. Thanks again.

@shreyanshsahu8038 11 months ago

@@AIOdysseyhub Yes, sure. I have trained the model, but I don't know what my output is going to be. For the last few days I have been waiting for the upcoming video.

@RishabhGupta93 7 months ago

Thanks for the tutorial. I am getting the following error:
Traceback (most recent call last):
  File "F:\PyCharmProjects\LayoutLMTrial\Inference.py", line 51, in <module>
    op = model(input_ids = inputs_ids.unsqueeze(0),
  File "F:\PyCharmProjects\LayoutLMTrial\venv\lib\site-packages\torch\nn\modules\module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "F:\PyCharmProjects\LayoutLMTrial\venv\lib\site-packages\torch\nn\modules\module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "F:\PyCharmProjects\LayoutLMTrial\trainer.py", line 33, in forward
    loss = loss_fn(output,lables)
  File "F:\PyCharmProjects\LayoutLMTrial\trainer.py", line 12, in loss_fn
    return nn.CrossEntropyLoss()(pred.view(-1,5),target.view(-1))
AttributeError: 'NoneType' object has no attribute 'view'
Can someone please help?

@AIOdysseyhub 7 months ago

Please check the JSON file you got from the PaddleOCR output. In that JSON file you need to do the one-hot encoding manually (string to integer), as explained in the video. Let me know whether it works. Thanks for your support. Please subscribe to the channel for more such videos.

@RishabhGupta93 7 months ago

@@AIOdysseyhub Thanks for the response. I exported to the json-min format and changed the labels to integers manually, but I am still getting the error. The only difference is that I have 5 classes and you demonstrated only 4 classes.

@AIOdysseyhub 6 months ago

@@RishabhGupta93 The number of classes doesn't matter; it must be something in the code. Please post the complete error you are getting.
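
A possible reading of the `'NoneType' object has no attribute 'view'` error: the traceback starts in `Inference.py` and ends in `loss_fn(output,lables)`, so the model's `forward` appears to compute a loss even when no labels are passed at inference time. A common pattern is to compute the loss only when labels are present. The sketch below shows that guard with plain-Python stand-ins; `toy_loss` and `forward` are hypothetical and replace the real PyTorch loss and model code.

```python
from typing import List, Optional, Sequence, Tuple


def toy_loss(predictions: Sequence[int], targets: Sequence[int]) -> float:
    """Stand-in for the real loss: fraction of positions that disagree."""
    return sum(p != t for p, t in zip(predictions, targets)) / len(targets)


def forward(predictions: List[int],
            labels: Optional[Sequence[int]] = None) -> Tuple[List[int], Optional[float]]:
    """Return predictions plus a loss, computing the loss only during training."""
    # At inference time no labels are supplied, so the loss step is skipped.
    loss = toy_loss(predictions, labels) if labels is not None else None
    return predictions, loss


print(forward([1, 0, 2], labels=[1, 1, 2]))  # training: loss is a float
print(forward([1, 0, 2]))                    # inference: loss is None
```

In the real `trainer.py` the same idea would mean wrapping the `loss_fn` call in an `if labels is not None` check and returning the logits either way.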

@user-rm6jw8lw8u 7 months ago

Hello sir, can we fine-tune with a JSONL dataset?

@AIOdysseyhub 7 months ago

Sure, you can convert the JSONL into JSON and then train on it. I am not sure whether we can use JSONL directly; you can give it a try.
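
The conversion suggested here is short: JSONL holds one JSON object per line, so each non-empty line is parsed and the results are collected into a single list. A minimal sketch, with a hypothetical `jsonl_to_records` helper and made-up sample records:

```python
import json
from typing import Any, Dict, List


def jsonl_to_records(text: str) -> List[Dict[str, Any]]:
    """Parse JSONL text (one JSON object per line) into a list of records."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


# Made-up sample content; blank lines are skipped.
jsonl_text = '{"id": 1, "label": 0}\n\n{"id": 2, "label": 3}\n'
records = jsonl_to_records(jsonl_text)
print(json.dumps(records, indent=2))  # a single JSON array, ready to save
```

For a real file, read it with `open(path, encoding="utf-8").read()` and write the result with `json.dump`; whether the resulting keys match what the training script expects still has to be checked separately.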

@user-rm6jw8lw8u 7 months ago

@@AIOdysseyhub yeah sure I'm trying with jsonl I'll tell you after it works

@Mindmap-gv5jg 2 months ago

Can you please provide Training_json file

@AjitKumarMCS 11 months ago

Please upload the next video

@AIOdysseyhub 11 months ago

Sure, I am on it

@AIOdysseyhub 10 months ago

Hi, I have uploaded the fourth part (inference). Please check it out and let me know if you get stuck somewhere; I will try my best to resolve it. Thank you.

@tranphu2768 8 months ago

ValueError: Unable to create tensor, you should probably activate truncation and/or padding with 'padding=True' 'truncation=True' to have batched tensors with the same length. Perhaps your features (`labels` in this case) have excessive nesting (inputs type `list` where type `int` is expected).

@AIOdysseyhub 8 months ago

Please trace back to the line that raises the error; based on that line we can check where the mistake is. If that doesn't help, please let me know. Thank you.

@tranphu2768 8 months ago

@@AIOdysseyhub I have resolved this issue. It was in the input JSON file: the Label key was still of type str because I had not changed it to type int. Thank you very much! Can I contact you via social media?

@user-rx7td3lr4i 8 months ago

@@AIOdysseyhub Facing the same issue. Please help.
Some weights of LayoutLMv3ForTokenClassification were not initialized from the model checkpoint at C:/Users/AshwariyaSah/ASH/LayoutLMV3_Fine_Tuning/inputs/layoutlmv3Microsoft and are newly initialized: ['classifier.bias', 'classifier.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
Some weights of LayoutLMv3ForTokenClassification were not initialized from the model checkpoint at ../inputs/layoutlmv3Microsoft and are newly initialized: ['classifier.bias', 'classifier.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
0%| | 0/4 [00:00

@user-rx7td3lr4i 8 months ago

@@AIOdysseyhub
  File "C:\Users\AshwariyaSah\ASH\LayoutLMV3_Fine_Tuning\src\main.py", line 35, in <module>
    train_loss = train_fn(dataload, model, optimizer)
  File "C:\Users\AshwariyaSah\ASH\LayoutLMV3_Fine_Tuning\src\engine.py", line 9, in train_fn
    for data in tqdm(data_loader, total=len(data_loader)):
  File "C:\Users\AshwariyaSah\.pyenv\pyenv-win\versions\3.10.0\lib\site-packages\tqdm\std.py", line 1182, in __iter__
    for obj in iterable:
  File "C:\Users\AshwariyaSah\.pyenv\pyenv-win\versions\3.10.0\lib\site-packages\torch\utils\data\dataloader.py", line 630, in __next__
    data = self._next_data()
  File "C:\Users\AshwariyaSah\.pyenv\pyenv-win\versions\3.10.0\lib\site-packages\torch\utils\data\dataloader.py", line 674, in _next_data
    data = self._dataset_fetcher.fetch(index) # may raise StopIteration
  File "C:\Users\AshwariyaSah\.pyenv\pyenv-win\versions\3.10.0\lib\site-packages\torch\utils\data\_utils\fetch.py", line 51, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "C:\Users\AshwariyaSah\.pyenv\pyenv-win\versions\3.10.0\lib\site-packages\torch\utils\data\_utils\fetch.py", line 51, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "C:\Users\AshwariyaSah\ASH\LayoutLMV3_Fine_Tuning\src\loader.py", line 32, in __getitem__
    encoding = self.processor(
  File "C:\Users\AshwariyaSah\.pyenv\pyenv-win\versions\3.10.0\lib\site-packages\transformers\models\layoutlmv3\processing_layoutlmv3.py", line 122, in __call__
    encoded_inputs = self.tokenizer(
  File "C:\Users\AshwariyaSah\.pyenv\pyenv-win\versions\3.10.0\lib\site-packages\transformers\models\layoutlmv3\tokenization_layoutlmv3_fast.py", line 330, in __call__
    return self.batch_encode_plus(
  File "C:\Users\AshwariyaSah\.pyenv\pyenv-win\versions\3.10.0\lib\site-packages\transformers\models\layoutlmv3\tokenization_layoutlmv3_fast.py", line 412, in batch_encode_plus
    return self._batch_encode_plus(
  File "C:\Users\AshwariyaSah\.pyenv\pyenv-win\versions\3.10.0\lib\site-packages\transformers\models\layoutlmv3\tokenization_layoutlmv3_fast.py", line 670, in _batch_encode_plus
    return BatchEncoding(sanitized_tokens, sanitized_encodings, tensor_type=return_tensors)
  File "C:\Users\AshwariyaSah\.pyenv\pyenv-win\versions\3.10.0\lib\site-packages\transformers\tokenization_utils_base.py", line 223, in __init__
    self.convert_to_tensors(tensor_type=tensor_type, prepend_batch_axis=prepend_batch_axis)
  File "C:\Users\AshwariyaSah\.pyenv\pyenv-win\versions\3.10.0\lib\site-packages\transformers\tokenization_utils_base.py", line 764, in convert_to_tensors
    raise ValueError(
ValueError: Unable to create tensor, you should probably activate truncation and/or padding with 'padding=True' 'truncation=True' to have batched tensors with the same length. Perhaps your features (`labels` in this case) have excessive nesting (inputs type `list` where type `int` is expected).

@AIOdysseyhub 8 months ago

Hi @@user-rx7td3lr4i, Are you getting this error while inferencing, or while training the model ?

@tranphu2768 8 months ago

Hi brother, I have exported the json file of label studio, now I want to use it for training on Paddle, I hope you can support me, thank you very much.

@AIOdysseyhub 8 months ago

Sure, I will

@jagdishmudaliyar1645 10 months ago

please sir put the part4 fast

@AIOdysseyhub 10 months ago

Hi, I have uploaded the inference part. Please check it out and let me know if you get stuck somewhere. ru-vid.com/video/%D0%B2%D0%B8%D0%B4%D0%B5%D0%BE-MnpJKKSYJDw.html Thank you for the support.

@user-px1mx5cn9u 1 year ago

Second half of the video is unclear

@AIOdysseyhub 1 year ago

Hi, yes, I understand. I am making another video that goes in depth from the LayoutLM paper explanation to the coding, and I will try to raise the quality as much as possible. Thank you for the feedback, and please support the channel 🙏