LayoutLMv3 Training with CORD (receipts) dataset

Rajistics - data science, AI, and machine learning


14K views


This notebook shows how to fine-tune a LayoutLMv3 model for token classification on the CORD receipt dataset. The notebook can be found at bit.ly/raj_layout or on my GitHub: github.com/rajshah4/huggingfa...
Notebook: colab.research.google.com/dri...
I messed up when recording, so you can't see me. It's not as fun, but hopefully still valuable for some of you.
Outline:
0:00 - Introduction
1:27 - Setting up the environment
2:08 - Load the dataset
4:09 - Preprocessing the dataset
7:22 - Defining metrics
8:12 - Model training
11:56 - Inference - single
14:20 - Batch inference
━━━━━━━━━━━━━━━━━━━━━━━━━
★ Rajistics Social Media »
● Link Tree: linktr.ee/rajistics
● TikTok: @rajistics
● Medium: @rajistics
● Hugging Face: huggingface.co/rajistics or @rajistics
● Twitter: @rajistics
● Website: rajivshah.com
● LinkedIn: rajistics
━━━━━━━━━━━━━━━━━━━━━━━━━

Science

Published: Aug 4, 2024


Comments: 17

@karlschliep5787 1 year ago

That was great! Thanks for demonstrating that. I've been meaning to try this model out for a while and you've really helped lower that barrier for entry. Love your work Raj!

@Rajistics 1 year ago

Glad I could help!

@kumarshivam202 1 year ago

There was a lot of duplication in the bbox coordinates for words. Do we pass bbox coordinates per sequence/line or per word?
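On the bbox question: LayoutLM-family models take one box per word, with coordinates scaled to a 0-1000 range. One likely explanation for the duplicates (an assumption, not something confirmed in the video) is that the dataset assigns every word the box of the line it belongs to, which matches the segment-level layout LayoutLMv3 was pretrained with. A minimal sketch of that idea, using hypothetical helper names:

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (x0, y0, x1, y1) in pixels


def normalize_box(box: Box, width: int, height: int) -> List[int]:
    """Scale a pixel box to the 0-1000 range used by LayoutLM-style models."""
    x0, y0, x1, y1 = box
    return [
        int(1000 * x0 / width),
        int(1000 * y0 / height),
        int(1000 * x1 / width),
        int(1000 * y1 / height),
    ]


def line_box(word_boxes: List[Box]) -> Box:
    """Smallest box that encloses every word box of one line."""
    return (
        min(b[0] for b in word_boxes),
        min(b[1] for b in word_boxes),
        max(b[2] for b in word_boxes),
        max(b[3] for b in word_boxes),
    )


def boxes_for_words(lines: List[List[Box]], width: int, height: int) -> List[List[int]]:
    """Return one normalized box per word; words on a line share the line box.

    `lines` is a hypothetical OCR result: a list of lines, each a list of
    word-level pixel boxes.
    """
    out: List[List[int]] = []
    for word_boxes in lines:
        shared = normalize_box(line_box(word_boxes), width, height)
        # Copy the shared box once per word, which produces the duplicates.
        out.extend(list(shared) for _ in word_boxes)
    return out
```

With this layout, a line of two words yields two identical boxes, so repeated coordinates in the dataset are not necessarily an error.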

@user-ew5zb1xx3s 1 year ago

Would you be able to make a new video on creating and labeling custom data for token classification? I found Niels's code hard to follow.

@user-ge5wr5ue1b 11 months ago

How can I extract data from this model in key-value format as JSON?
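One way to approach the key-value question, sketched under assumptions: the model's word-level predictions are BIO-style tags ("B-...", "I-...", "O"), and consecutive words of the same entity should be joined into one value. The function name and the label names in the usage note are illustrative, not taken from the notebook.

```python
import json
from typing import Dict, List, Tuple


def group_entities(words: List[str], labels: List[str]) -> Dict[str, List[str]]:
    """Group words into {entity_type: [value, ...]} from BIO-style labels."""
    spans: List[Tuple[str, List[str]]] = []
    prev_key = None
    for word, label in zip(words, labels):
        if label == "O":
            prev_key = None  # an "O" tag closes any open entity
            continue
        prefix, _, key = label.partition("-")
        if prefix == "I" and key == prev_key and spans:
            spans[-1][1].append(word)  # continue the current entity
        else:
            spans.append((key, [word]))  # start a new entity
        prev_key = key

    result: Dict[str, List[str]] = {}
    for key, span_words in spans:
        result.setdefault(key, []).append(" ".join(span_words))
    return result


def to_json(words: List[str], labels: List[str]) -> str:
    """Serialize the grouped entities as a JSON string."""
    return json.dumps(group_entities(words, labels), ensure_ascii=False, indent=2)
```

For example, the words ["Nasi", "Goreng", "25,000"] with labels ["B-MENU.NM", "I-MENU.NM", "B-MENU.PRICE"] would be grouped under the keys "MENU.NM" and "MENU.PRICE". Each key maps to a list because a receipt can contain several items of the same type.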

@saurabhghosh7800 3 months ago

Can we do multi-page, multi-document-type classification with this?

@Kaan474 1 year ago

How do I prepare a custom dataset?

@ThomasLPacker 11 months ago

What hardware are you using on Colab? Free or not?

@user-ew5zb1xx3s 1 year ago

During inference, I'm not sure I understand why we pass the labels to the model again. I get that we send the words and bboxes (assuming they come from the OCR) to the newly trained model, but what is the significance of the labels? In token classification, the individual words from OCR don't have labels, do they?

@user-ew5zb1xx3s 1 year ago

I believe that in a regular scenario, where you don't have the NER labels, offset mapping is used (I just saw it at the very bottom of your notebook!). Is there any difference between inference with and without offset mapping? Is there any recommendation on how to create the NER labels? Is this something that comes with off-the-shelf OCR libraries, e.g. pytesseract?
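For inference without ground-truth labels, a common pattern (sketched here as an assumption, not as the notebook's exact code) is to keep only the prediction of the first sub-token of each word. Fast tokenizers in Hugging Face Transformers expose a word index per token via `encoding.word_ids()`, where special tokens map to `None`; the helper below takes such a list as plain Python input, so it runs without a model.

```python
from typing import List, Optional


def first_subtoken_predictions(
    word_ids: List[Optional[int]], token_preds: List[int]
) -> List[int]:
    """Collapse token-level predictions to one prediction per word.

    `word_ids` has one entry per token: the index of the word the token
    came from, or None for special tokens such as the start/end markers.
    Only the first sub-token of each word contributes a prediction.
    """
    word_preds: List[int] = []
    seen = set()
    for word_id, pred in zip(word_ids, token_preds):
        if word_id is None or word_id in seen:
            continue  # skip special tokens and non-initial sub-tokens
        seen.add(word_id)
        word_preds.append(pred)
    return word_preds
```

When labels are available, the same selection is usually done by checking which label positions are set to the ignore index; without labels, the word indices carry the same information.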

@sebabrataghosh8466 1 year ago

How can I print the label name and predicted value as text (not shown on the image)?

@malishakapugamage7052 1 year ago

Hey..! Did you figure out how to do that?
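A possible way to print label names next to the predicted words, sketched under assumptions: `pred_ids` are the argmax class indices per token, `label_ids` are the encoded labels where positions to skip (special tokens and non-initial sub-tokens) hold -100, and `id2label` is the model's index-to-name mapping. The function names are hypothetical.

```python
from typing import Dict, List

IGNORE_INDEX = -100  # value conventionally used to mask positions out of the loss


def word_level_labels(
    pred_ids: List[int], label_ids: List[int], id2label: Dict[int, str]
) -> List[str]:
    """Keep predictions only where the label position is not masked."""
    return [
        id2label[pred]
        for pred, label in zip(pred_ids, label_ids)
        if label != IGNORE_INDEX
    ]


def format_predictions(words: List[str], labels: List[str]) -> List[str]:
    """Pair each word with its predicted label name, one line per word."""
    return [f"{label}\t{word}" for word, label in zip(words, labels)]


def print_predictions(words: List[str], labels: List[str]) -> None:
    """Print the label/word pairs instead of drawing them on the image."""
    for line in format_predictions(words, labels):
        print(line)
```

This also shows why labels are useful at inference time in this style of code: they are not fed to the prediction itself, but their -100 positions mark which tokens to drop when mapping predictions back to words.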

@nikitakhamgal9937 5 months ago

I'm getting an error at this snippet and am not able to solve it:

from datasets import load_dataset
dataset = load_dataset("nielsr/cord-layoutlmv3")

@saivarmauddaraju3977 1 month ago

I am also getting the same error. Did you sort it out?

@sketchychillandchill 5 months ago

The model is apache-2 tho

@Rajistics 5 months ago

No, it isn't. It is CC BY-NC-SA 4.0: github.com/microsoft/unilm/blob/master/layoutlmv3/README

@TheGIVVO97 4 months ago

@Rajistics Apparently only the pretrained models are non-commercial; the code itself can be used commercially, so if someone wanted to do the pretraining from scratch, it would be possible to use the result commercially.