Tokenization in Spacy: NLP Tutorial For Beginners - S1 E8

codebasics

1.1M subscribers

73K views

Published: 22 Oct 2024

Comments: 65

@codebasics 2 years ago

Check out our premium machine learning course with 2 Industry projects: codebasics.io/courses/machine-learning-for-data-science-beginners-to-advanced

@aakuthotaharibabu8244 1 year ago

spaCy makes NLP implementation easy, just the way codebasics makes NLP learning easy.

@dikshyakasaju7541 2 years ago

Really enjoying this playlist, and I've reached the 8th tutorial already just in 1 day. Thank you for making it interesting!

@codebasics 2 years ago

Glad you liked it 👍☺️

@somendrew 1 year ago

In one day? Holy moly, it's my 3rd day... How do you prepare mentally for that?

@ajaythapar6169 1 year ago

I want to write this every time I go through your YouTube videos (earlier Deep Learning and now NLP)... You are an outstanding educator. Your practice of illustrating complex concepts with pertinent use cases adds an engaging dimension to the learning experience. Your proficiency in simplifying intricate ideas with clarity is truly remarkable. Your sense of timing in presenting crucial details is impeccable, and your suggested reading resources are exceptionally valuable. Thank you for putting your effort into creating such useful learning material.

@codebasics 1 year ago

Ajay, thanks for the detailed feedback and I am glad these videos are helpful to you 👍🙏

@RijalShishir 8 days ago

Everything was great, but the curse spell you cast at the end is pure magic! 😂

@PrathmeshBodas 9 months ago

Thanks! Your videos are really helpful. You are doing a great job of explaining complex topics. Thanks once again.

@nimishshirodkar 2 years ago

You are the best, Dhaval. I have seen many tutorials on different ML/DL/NLP topics, but the way you teach is something different. It is very hands-on and easy to understand. I really look forward to your videos. I recently did a postgraduate program in Data Science at Great Lakes but, frankly, the teaching you provide is much better than that of some of the professors I had there. Keep it up!

@saarthaksangamnerkar 2 years ago

Good intro to NLP concepts, Dhaval. Btw, as someone who has worked on large-scale NLP projects here in Toronto, I can vouch that FirstLanguage NLP APIs are right up there with the speech SDK of one of the biggest cloud service providers, and at a fraction of the cost! And the co-founder is a PhD specializing in NLP herself.

@codebasics 2 years ago

Yup, indeed the co-founder is quite knowledgeable and the platform is also very well built. I suggest people try it out; it saves you a lot of money 💵

@parttimelarry 1 year ago

Thanks!

@gautamnayak8847 1 year ago

Pretty much loved it all. On a watching spree: 8th lesson in 24 hours :)

@Breaking_Bold 1 year ago

Very nice video ...explaining NLP !!!

@pphantom5037 2 months ago

Thank you so much for this great explanation and the exercises. Great work!

@rajiv7 3 months ago

Simply Superb !!! Thanks a ton !!!

@AhmedMostafa-eu3up 18 days ago

dude loves Dr. Strange! he plays with him all over the playlist😂, but indeed this playlist is great!

@celalrehmanov7052 9 months ago

Thank you for your compliment, I am one of your sensor student :D

@payalGupta-jc4ow 11 months ago

Indeed it's a very good playlist on NLP, but can you please do some hands-on work with audio files as well? I mean, could you help with using audio files instead of text as the dataset?

@ajaythapar6169 1 year ago

You are exceptional in the way you explain things.

@balamuralisrinivasan7297 1 year ago

Excellent and insightful

@nadianizam6101 1 year ago

Excellent Explanation👍

@datayogi_ 2 years ago

Hi sir, can you please share your views on data analyst jobs in government bodies in India, and their pros and cons?

@vigneshpadmanabhan 2 years ago

That's because "Dr. Strange" has a space in between. When it's removed, "Dr.Strange" stays together in one sentence. Thanks for the videos!

@SurajIntelligentBrains 2 months ago

00:02 Tokenization is the process of splitting text into meaningful segments.
02:21 Tokenization in spaCy
07:22 spaCy's tokenization splits currency and punctuation into separate tokens.
09:39 Tokenization in spaCy splits text into separate tokens based on prefixes, suffixes, and exceptions.
14:48 Tokenization in spaCy allows for the identification and classification of different attributes of tokens.
17:19 Tokenization in spaCy
21:52 Exploring the attributes of spaCy tokens.
24:17 Tokenization in spaCy allows you to type in different languages even with an English keyboard.
29:02 Tokenization in spaCy allows splitting the text into segments.
31:13 Tokenization is an essential part of the spaCy pipeline.
35:20 Tokenization in spaCy
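
The points in this outline about prefixes, suffixes, exceptions, and token attributes can be tried out with a short sketch. It assumes spaCy v3 is installed and uses a blank English pipeline, which contains only the tokenizer. The sample sentence and the helper name `describe_tokens` are illustrative assumptions, not taken from the video.

```python
import spacy

# A blank English pipeline holds only the tokenizer, so no model download is needed.
nlp = spacy.blank("en")


def describe_tokens(text):
    """Return (text, like_num, is_currency, is_punct) for every token in text."""
    return [(t.text, t.like_num, t.is_currency, t.is_punct) for t in nlp(text)]


# Illustrative sentence: "Dr." is handled as a tokenizer exception,
# "$" is split off as a prefix, and the final "." as a suffix.
rows = describe_tokens("Dr. Strange paid $20 for lunch.")
for row in rows:
    print(row)
```

Each printed row shows one token together with three of its boolean attributes, which is a quick way to see where the tokenizer decided to split.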

@santoshsaklani5019 2 years ago

Kindly make some videos on how to vectorize source code for training DL model

@CendrillonSympson 1 month ago

You make wonderful videos! 👏 I have a quick question: 🤷‍♂️ I have a set of words 🤷‍♂️. (behave today finger ski upon boy assault summer exhaust beauty stereo over). How do I use this? 🤨

@shashankk5953 2 years ago

Sir, is it possible to do voice recreation? Please make a video on it ☺☺☺

@umeshtiwari800 1 year ago

Always very good👍

@anilgupta4801 1 year ago

Great videos

@leensmits 4 months ago

The referred book at page vi: "If you have never studied statistics, I think this book is a good place to start. And if you have taken a traditional statistics class, I hope this book will help repair the damage." 😄

@anirudhsom6590 5 months ago

Sir, how are you getting syntax suggestions while typing the function?

@harshalbhoir8986 1 year ago

Thank you so much!!

@codebasics 2 years ago

Do you want to learn technology from me? codebasics.io is my website for video courses. First course going live in the last week of May, 2022

@jesuyanmifeegbewale3883 1 year ago

I made it here. Let's see how far I can go.

@jaswanth220 2 years ago

Hello Dhaval, do you have any tutorial on spiking neural networks, or a guide that could help? By the way, I have been following your awesome tutorials on neural networks. Thanks a million.

@codebasics 2 years ago

I don't have a video on that, but I can make a note to add one in the future. Thanks for your appreciation.

@PrabinKumarDas001 1 year ago

My spaCy is tokenizing words like #hello into # and hello. I want to prevent that. Is there something I can do?
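
One possible approach to the hashtag question, sketched under the assumption of spaCy v3: the default prefix rules treat "#" as a prefix to split off, so rebuilding the prefix regex without it should keep `#hello` together. The helper name `build_hashtag_friendly_nlp` is hypothetical, and this is not something shown in the video.

```python
import spacy
from spacy.util import compile_prefix_regex


def build_hashtag_friendly_nlp():
    """Return a blank English pipeline whose tokenizer no longer splits a leading '#'."""
    nlp = spacy.blank("en")
    # Keep every default prefix rule except the literal "#".
    prefixes = [p for p in nlp.Defaults.prefixes if p != "#"]
    nlp.tokenizer.prefix_search = compile_prefix_regex(prefixes).search
    return nlp


nlp = build_hashtag_friendly_nlp()
tokens = [t.text for t in nlp("I said #hello to everyone")]
print(tokens)
```

The change applies only to this `nlp` object. Other punctuation prefixes are still split as before, so the effect is limited to the "#" character.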

@nimishshirodkar 2 years ago

I tried the first problem on the entire PDF using the PyPDF2 library, but some non-URLs also get picked up.
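
`like_url` is a heuristic, so on noisy text such as PDF extractions it can flag strings that merely look like domain names. A minimal sketch of one way to tighten it, assuming only links with an explicit `http` or `https` scheme are wanted: post-filter the flagged tokens with the standard library's `urllib.parse.urlparse`. The function name `extract_http_urls` and the sample text are illustrative assumptions.

```python
from urllib.parse import urlparse

import spacy

nlp = spacy.blank("en")


def extract_http_urls(text):
    """Return tokens that spaCy flags as URLs and that also carry an http(s) scheme."""
    urls = []
    for token in nlp(text):
        if not token.like_url:
            continue
        parsed = urlparse(token.text)
        # Require an explicit scheme and a host part to drop look-alike strings.
        if parsed.scheme in ("http", "https") and parsed.netloc:
            urls.append(token.text)
    return urls


sample = "Data lives at http://www.data.gov/ while example.com is only a name"
print(extract_http_urls(sample))
```

This trades recall for precision: bare domains written without a scheme are dropped too, which may or may not suit the exercise.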

@enggm.alimirzashortclipswh6010 2 years ago

Love from Pakistan 🇵🇰

@ankitverma1790 6 months ago

Why is spaCy tokenizing "ice" and "cream" separately in "I love ice cream"?

@pranavkanumuri1441 4 months ago

Because they are separate words.
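
If a multi-word expression such as "ice cream" should be treated as one unit after tokenization, one option is to merge the tokens afterwards with spaCy's retokenizer. The sketch below assumes spaCy v3 and hard-codes the span position purely for illustration. It is not part of the video.

```python
import spacy

nlp = spacy.blank("en")
doc = nlp("I love ice cream")

# Whitespace-separated words come out as separate tokens by default.
before = [t.text for t in doc]

# Merge the span covering "ice cream" (tokens 2 and 3) into a single token.
with doc.retokenize() as retokenizer:
    retokenizer.merge(doc[2:4])

after = [t.text for t in doc]
print(before)
print(after)
```

In real text the span would normally come from a matcher or a phrase list, not from fixed indices.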

@saurabhupadhyay1015 1 year ago

Sir, I tried this command again and again: python -m spacy download en_core_web_sm, but I keep getting errors. Help!

@priyasahu7595 11 months ago

I am facing the same problem. Did you find out how to fix it?
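
The error message is not given in this thread, so the cause of the failed download cannot be diagnosed here. For tokenization alone, though, a trained model is not strictly required. The sketch below (assuming spaCy v3; the helper name `load_english` is hypothetical) tries the trained pipeline first and falls back to a blank English pipeline, which still provides the tokenizer and lexical attributes such as `like_url` and `like_num`.

```python
import spacy


def load_english(model_name="en_core_web_sm"):
    """Load a trained pipeline if it is installed, otherwise a tokenizer-only one."""
    try:
        return spacy.load(model_name)
    except OSError:
        # spacy.load raises OSError when the named package cannot be found.
        return spacy.blank("en")


nlp = load_english()
print([t.text for t in nlp("Hello world")])
```

The fallback has no tagger, parser, or named-entity recognizer, so it only helps with exercises that rely on tokenization.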

@kirtipant949 3 months ago

In my code, like_email is giving an empty list.
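
The commenter's code is not shown, so the following is only a baseline to compare against. It assumes spaCy v3 and made-up addresses, and reads `like_email` on each token of the document. The function name `extract_emails` is an illustrative choice.

```python
import spacy

nlp = spacy.blank("en")


def extract_emails(text):
    """Collect the text of every token that spaCy flags as email-like."""
    return [token.text for token in nlp(text) if token.like_email]


emails = extract_emails(
    "Reach Maria at maria@example.com or Sam at sam@example.org for details"
)
print(emails)
```

If this baseline works but the original code returns nothing, printing the tokens of the input text is a reasonable next check, since the attribute is evaluated on whatever pieces the tokenizer produces.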

@Pooria.Khorrami 1 year ago

Perfectttttttttttttttttttttttttttttttttt

@Prim0rdiaL7 2 years ago

Data Analytics by Abhay Deol

@codebasics 2 years ago

Ha ha... you are probably the 5th person comparing me with Abhay Deol. Others have called me Arvind SA and also Satya Nadella with hair 😂😂🤗🧐

@geetharajamanickam3744 2 months ago

You always say "this technique will be covered later", but you don't explain those techniques.

@anidea8012 2 years ago

"Hindi is the language of my country": please don't use this sentence next time. This information is misleading.

@ChildhoodSaver 6 months ago

It is the national language 👍

@ayushbhosale2004 4 months ago

@ChildhoodSaver India does not have a national language. It has 22 official languages and 2 administrative languages, Hindi and English. Hindi is not our national language.

@ramandeepbains862 2 years ago

Exercise 1 solution:

    # doc is the spaCy Doc built from the exercise text
    for token in doc:
        if token.like_url:
            print(token)

@nitinverma_121 1 year ago

The answer to exercise question 2 is slightly wrong for cases like "I have 500 $ and the quantity of good people in the company is 10". This version is correct:

    # Extract money
    transactions = "Tony gave two $ to Peter, Bruce gave 500 € to Steve 10"
    doc = nlp(transactions)
    ans = []
    count = len(doc)
    for token in doc:
        # Skip the last token: it has no following token to inspect
        if token.i != count - 1:
            if token.like_num and doc[token.i + 1].is_currency:
                ans.append(token.text + ' ' + doc[token.i + 1].text)
    ans