
Training a spaCy SpanCat Model to Annotate in Texts more quickly in Prodigy | SpanCat 03 

Python Tutorials for Digital Humanities
Subscribers: 26K
Views: 1.9K

GitHub repo: github.com/wjbmattingly/youtu...
Join this channel to get access to perks:
/ @python-programming
If you enjoy this video, please subscribe.
✅Be my Patron: / wjbmattingly
✅PayPal: www.paypal.com/cgi-bin/webscr...
If there's a specific video you would like to see or a tutorial series, let me know in the comments and I will try and make it.
If you liked this video, check out www.PythonHumanities.com, where I have Coding Exercises, Lessons, on-site Python shells where you can experiment with code, and a text version of the material discussed here.
You can follow me at:
/ wjb_mattingly

Science

Published: 23 Jul 2023

Comments: 20
@BSP77 10 months ago
This is a wonderful series! I look forward to the next video, thank you!
@pcxxy 6 months ago
Super helpful video, looking forward to video 04. Keep up the great work!
@GrahamAndersonis 7 months ago
Thanks! What video do you recommend after this spancat 03? Feels like there is more to know here
@python-programming 7 months ago
Thank you so much for this!! 🎉
@python-programming 7 months ago
There should be an 04. It looks like it never uploaded to YouTube. I will be sure to fix that soon!
@paulmiller591 11 months ago
Very helpful Cheers!
@python-programming 11 months ago
I'm so happy to hear it helped!
@dariaglushkina2036 7 months ago
Hello! Thanks a lot for your tutorials! Could you please make a new video on how to correctly create and modify config files? I've tried to train a spancat model upon en_core_web_lg and en_core_web_trf models (I want to have both ner and spancat), but it did not work because of some errors in config files. I think this topic will be very useful also for others. Thank you again.
@python-programming 7 months ago
Absolutely! I will try to do that soon! Thanks for the idea!
@jsr7599 5 months ago
Does part 4 not exist because you ran into issues with it not predicting anything? Confused on why there’s no docs online about finishing this spancat process, but a lot of posts online about it not predicting correctly
@python-programming 5 months ago
Thanks for the comment! No, it works fine. I lost the video footage and I need to re-record it. I'm trying to get it done ASAP. I use it on a lot of projects and spancat does work well. You need more training data for it, typically.
@davidrussell9662 4 months ago
@python-programming Please do. I was looking forward to it.
@GrahamAndersonis 7 months ago
For spancat, is it better to treat a section of sentences as a single doc for tagging, or is it better to tag sentence by sentence? In my case, there are sentences, external doc references, tables, figures, code, and other stuff that describe a section.
@python-programming 7 months ago
It depends on how much context is needed to accurately predict a span. If it relies on larger context, go larger (up to 250 tokens or so).
@GrahamAndersonis 7 months ago
@python-programming If the token count is larger than 250, do you simply make section 1a and section 1b? In my case I have some control over where I divide the section.
@python-programming 7 months ago
@GrahamAndersonis spaCy will automatically handle the chunking of the text for you when you run the model. The 250-token guideline applies only when training the model. If you have some control, then yes, just find a natural breaking point (such as a paragraph) and separate there.
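A minimal sketch of the splitting advice above, assuming paragraphs are separated by blank lines and using whitespace-separated words as a rough stand-in for spaCy tokens (spaCy's tokenizer will count somewhat differently). The function name `chunk_paragraphs` and the `max_tokens` parameter are hypothetical and are not part of spaCy or Prodigy.

```python
def chunk_paragraphs(text, max_tokens=250):
    """Group paragraphs into training chunks of roughly max_tokens or fewer.

    Assumption: tokens are approximated by whitespace-separated words.
    A single paragraph longer than max_tokens is kept whole, not split.
    """
    # Treat blank lines as the natural breaking points between paragraphs.
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]

    chunks = []
    current = []      # paragraphs collected for the chunk being built
    current_len = 0   # approximate token count of `current`

    for para in paragraphs:
        n_tokens = len(para.split())
        # Start a new chunk if adding this paragraph would exceed the limit.
        if current and current_len + n_tokens > max_tokens:
            chunks.append("\n\n".join(current))
            current, current_len = [], 0
        current.append(para)
        current_len += n_tokens

    # Flush whatever is left over as the final chunk.
    if current:
        chunks.append("\n\n".join(current))
    return chunks
```

Each returned chunk could then be written out as one training example (for instance, one line of a JSONL file with a `text` field) before annotating in Prodigy.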
@GrahamAndersonis 7 months ago
@python-programming For future reference, do you consult and/or have a Discord?
@python-programming 7 months ago
@GrahamAndersonis I do! You can reach me via the form on my site: wjbmattingly.com/
@shawnmarcy4413 11 months ago
🎉🎉🎉
@judithnathanail3742 2 months ago
Enjoyed the video. Would love to see a video using the Prodigy PDF plugin (Prodigy_pdf) to annotate some PDFs in Prodigy and then train a model in spaCy (or something else), followed by applying the created model to some unseen PDFs. Lots of humanities materials are PDFs. There is a nice video on annotating papers (ru-vid.com/video/%D0%B2%D0%B8%D0%B4%D0%B5%D0%BE-rwyze49ne8I.html), but to be useful, we need to use the annotated output to train a model.