Live Day 1-Live Session On EDA And Feature Engineering- Zomato Dataset

Krish Naik

Views: 212K

Published: 6 Sep 2024

Comments: 192

@anandmohite 2 years ago

So, just to clarify what UTF-8 and Latin-1 encoding mean: an encoding is the reference table used to convert characters into machine language. At the lowest level a machine works only in 0s and 1s, so the alphanumeric data we process has to be mapped to bytes, and UTF-8 is the encoding most commonly used for this in electronic communication. The problem comes when the incoming file was saved with a different encoding: some of its bytes do not match the reference table being used (UTF-8 in this case), so the machine cannot decode them. We then need to provide the appropriate reference: if the incoming data has Japanese characters saved in a JIS encoding, use that encoding; if it was saved as Latin-1, use latin-1, and so on.

@abhimanyukspillai6572 2 years ago

That was very informative, brother. Thanks a lot for the explanation 🤗

@purvalohani1725 2 years ago

Thanks a lot, it really helped!

@Mars7822 2 years ago

Thank you, man.

@ramachandrakulkarni5083 2 years ago

That's informative, thank you.

@srikanthsuroju9384 1 year ago

But how do we find out whether our data has Japanese, Latin, or any other foreign characters in it?
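Editor's sketch for the question above: one common way to find out is simply to try decoding. UTF-8 itself can represent Japanese and accented Latin characters; the error usually appears when the file's bytes were written in a different encoding than the one used to read them. The sample bytes, the helper name `read_csv_with_fallback`, and the list of candidate encodings below are all assumptions for illustration, not something taken from the session.

```python
import io

import pandas as pd

# Hypothetical sample: the text was saved as Latin-1, so "é" is the single byte 0xE9,
# which is not valid UTF-8 in this position.
raw = "Restaurant,City\nCafé Rio,São Paulo\n".encode("latin-1")


def read_csv_with_fallback(data, encodings=("utf-8", "latin-1")):
    """Try each candidate encoding in order and return (DataFrame, encoding used)."""
    for enc in encodings:
        try:
            return pd.read_csv(io.BytesIO(data), encoding=enc), enc
        except UnicodeDecodeError:
            # The bytes do not fit this encoding; try the next candidate.
            continue
    raise ValueError("none of the candidate encodings could decode the data")


df, used_encoding = read_csv_with_fallback(raw)
print(used_encoding)
print(df)
```

Latin-1 is kept last on purpose: it maps every possible byte to some character, so it never raises, which also means a successful Latin-1 read does not prove the file really was Latin-1.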

@prashindu 7 months ago

Enjoyed going through this EDA session. Sharing my new learnings from my first EDA session as my thanksgiving.
1) Learnt about encoding errors.
2) Learnt about matplotlib figure sizing using rcParams.
3) Learnt why hue did not work in Seaborn as intended.
4) Learnt the difference between value_counts and groupby().size.
5) Loved seeing how reset_index was used time and again.

58:08 We use hue when we wish to differentiate the categories. But remember, the bars of the bar graph were already in different colours. Why was that? In the Seaborn library, the default palette shows a different colour for each bar, which we subsequently changed to the colours we wanted. So the hue parameter did not do anything additional in this context, although it did give us a legend.

1:18:45 We used groupby().size plenty of times, and I was wondering when to use value_counts and when to use groupby().size.
df[column] returns a Series ==> here value_counts() is the direct choice.
df[df[condition]] returns a DataFrame ==> here we can use groupby().size.
Also, size gives the total count of all elements. Used with a groupby, it gives the count of all rows in each group, including rows with null values. With value_counts we get the count of each element, excluding null values.
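Editor's sketch of the value_counts versus groupby().size point, on a tiny made-up frame (the column names and values are assumptions, not the Zomato data). It also shows that a Series can be grouped as well, and that `count()` is the variant that skips nulls inside a group.

```python
import numpy as np
import pandas as pd

# Hypothetical data: one missing country and one missing rating.
df = pd.DataFrame({
    "Country": ["India", "India", "USA", np.nan],
    "Rating": [4.1, np.nan, 3.5, 2.0],
})

# value_counts: one entry per distinct value, NaN excluded by default, sorted by count.
vc = df["Country"].value_counts()

# dropna=False keeps the NaN entry as its own row.
vc_with_nan = df["Country"].value_counts(dropna=False)

# groupby().size(): number of rows per group, counted even if other columns hold NaN.
# Rows whose group key itself is NaN are dropped by default.
size_per_country = df.groupby("Country").size()

# groupby()[col].count(): non-null values of that column per group.
rated_per_country = df.groupby("Country")["Rating"].count()

# A Series can be grouped too, here by its own values.
series_size = df["Country"].groupby(df["Country"]).size()

print(vc, vc_with_nan, size_per_country, rated_per_country, sep="\n\n")
```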

@antonioarana8002 2 years ago

It's really helpful that someone with real knowledge and experience teaches this kind of hands-on, real-example material! Many thanks, Krish.

@niharikathakur1672 2 years ago

Just completed this session. Now everything seems so relatable and understandable. THANK YOU

@PMKB4 2 years ago

@Uda_dunga 2 years ago

Please help me to understand 😭🙏🏽

@jaitiwari241 1 year ago

Did you get placed anywhere?

@bhooshan25 2 years ago

Thanks!

@ajaykushwaha-je6mw 2 years ago

In real projects we do not import CSV files; we pull data from MongoDB or a SQL database. Could you please create a video on loading a DataFrame from a database?

@iakhileshgupta3553 2 years ago

I cannot open this CSV to read it in pandas. Can you please help me read it? I'm getting a permission error.

@sidindian1982 2 years ago

@@iakhileshgupta3553 Save the CSV file in a local folder, then pass its full path, for example: data = pd.read_csv('C:\\program file\\user\\folder\\file_name.csv') Then print data.head() and the first 5 rows come into the picture.

@iakhileshgupta3553 2 years ago

@@sidindian1982 This is the error message: PermissionError: [Errno 13] Permission denied: 'c:\\users\\hp\\desktop'

@sidindian1982 2 years ago

@@iakhileshgupta3553 Recheck in the CMD prompt whether pandas is installed. If not, run pip install numpy and pip install pandas, both in the CMD prompt. Then, in the Jupyter notebook, run import pandas as pd and import numpy as np (click the Run button; the imports have to be re-run every time you open the Jupyter notebook). Then type df = pd.read_csv('dats.csv') followed by df.head()

@iakhileshgupta3553 2 years ago

Sir @@sidindian1982, I have pandas and numpy installed. I checked in CMD with pip list and both are there, and I also did what you suggested (it says already installed), but it still won't read the file and gives the above permission error in the terminal.
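Editor's sketch for this thread: the path in the error message, 'c:\\users\\hp\\desktop', ends at a folder rather than at a .csv file. On Windows, trying to open a folder as if it were a file typically raises PermissionError [Errno 13], so that may be the cause here (this is an assumption; the actual call is not shown in the thread). The helper `describe_path` is a hypothetical name, and the temporary folder only stands in for the real Desktop.

```python
import tempfile
from pathlib import Path


def describe_path(path_str):
    """Report whether a path is a file, a directory, or missing."""
    path = Path(path_str)
    if not path.exists():
        return "missing"
    if path.is_dir():
        return "directory"
    return "file"


with tempfile.TemporaryDirectory() as folder:
    # Stand-in for a CSV saved inside the folder.
    csv_path = Path(folder) / "file_name.csv"
    csv_path.write_text("a,b\n1,2\n")

    folder_kind = describe_path(folder)        # the folder itself: not readable as a CSV
    file_kind = describe_path(str(csv_path))   # folder plus file name: what read_csv needs

# The temporary folder is deleted on exit, so the same path is now missing.
missing_kind = describe_path(str(csv_path))

print(folder_kind, file_kind, missing_kind)
```

If the check says "directory", appending the file name to the path before calling pd.read_csv is the first thing to try.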

@aryamanbansal1 2 years ago

final_df['Cuisines'].value_counts(sort=True, ascending=False)[:10]

@ankitraj3180 2 years ago

On what basis do you define the top 10 cuisines? One more thing: I tried final_df[final_df['Rating text']=='Excellent']['Cuisines'].head(10) and found this code more efficient on the basis of the info. Which code is more appropriate?

@dr.madhurinaik 7 months ago

final_df['Cuisines'].value_counts().head(10)

@Arjun-hc7ow 6 months ago

@@ankitraj3180 yepp, correct way

@niteshkuwarbi04 7 months ago

for assignment 1:26:16 final_df['Cuisines'].value_counts().reset_index().head(10)

@Siddhant_Banerjee 4 months ago

I tried this `df.Cuisines.value_counts()[:10].index` But both are basically the same thing I guess.
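Editor's sketch comparing the snippets in this thread on a small made-up frame (the rows are assumptions; only the column names come from the comments). It suggests the snippets answer different questions: value_counts ranks cuisines by how often they occur, while filtering on 'Excellent' and calling head returns the first matching rows in file order, without any ranking.

```python
import pandas as pd

# Hypothetical stand-in for final_df.
final_df = pd.DataFrame({
    "Cuisines": ["Cafe", "North Indian", "North Indian", "Chinese", "North Indian", "Chinese"],
    "Rating text": ["Excellent", "Good", "Average", "Excellent", "Good", "Good"],
})

# Most frequent cuisines, with their counts (top 2 here instead of top 10).
top = final_df["Cuisines"].value_counts().head(2)

# Same ranking, but only the cuisine names.
top_names = final_df["Cuisines"].value_counts()[:2].index

# First two rows rated 'Excellent': row order, not frequency.
first_excellent = final_df[final_df["Rating text"] == "Excellent"]["Cuisines"].head(2)

print(top)
print(list(top_names))
print(list(first_excellent))
```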

@ashulohar8948 1 year ago

Best teaching. I am from a non-tech background and even I understand your teaching 😊 God of data science, Krish Naik.

@riteshmukhopadhyay6922 2 years ago

In the missing values part the heat map plot is not visible; if we resize the figure instead, we will be able to actually see the plotted points. plt.figure(figsize = (15,8)) sns.heatmap(df.isnull(), yticklabels = False, cbar = True, cmap = 'viridis') The figure method will help us resize the heat map accordingly.

@Uda_dunga 2 years ago

I see.

@rutu.dances.to.express 2 years ago

Thank you sir for this! I think you should conduct more such sessions where you assign us questions related to Data Analytics and then discuss the answers.

@kangkankalita5221 2 years ago

Awesome session, much better than a paid session. Please keep posting, sir. Thanks a lot.

@soumya7427 2 years ago

Thank you sir. This is a great session. It is very helpful and everything seems very easy to understand.

@garimaattri4760 4 months ago

You make it very easy, sir. The way you teach is fabulous.

@anuragthakur5787 2 years ago

Wonderful session sir, thank you very much. We are catching up; please don't get disheartened by viewer counts.

@gh504 2 years ago

Thank you sir for this amazing session. Sir, please do live sessions on deep learning and NLP.

@ankitayadav2690 2 years ago

Superb sir, we are lucky to have a mentor like you

@equiwave80 2 years ago

Thanks for this video. I spent my Sunday morning in a very useful way, brushing up my Python skills.

@AnkJyotishAaman 2 years ago

For the last assignment, which he gave as homework (replace final_df with whatever you've named it in your notebook): final_df[["Cuisines"]].groupby(["Cuisines"]).size().reset_index().sort_values(by=0,ascending=False).head(10)

@nimisha9095 2 years ago

Can you please explain?

@MuhammadAhmed-jm1bs 2 years ago

Damn man, that's long code. Or you could simply write: final_df['Cuisines'].value_counts().head(10)

@AbdulHannan-dg6dl 2 years ago

cuisine_names=final_df.Cuisines.value_counts() cuisine_names[:10]

@sumijasukumaran1394 2 years ago

Good live class, understandable. Thank you for the session.

@pawankatwe8985 2 years ago

Great session... Thank you

@shanmuganathan6230 5 months ago

final_df[['Cuisines']].groupby('Cuisines').size().reset_index().rename(columns={0:'Count'}).sort_values(by='Count',ascending=False).head(10)
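Editor's sketch on the groupby route to the assignment: groupby returns the groups in alphabetical order of the key, so the sort has to come before head, otherwise you get the first cuisines alphabetically rather than the most frequent ones. The frame below is made up for illustration, and `reset_index(name="Count")` is used here as a shorthand for the reset_index plus rename steps in the comments.

```python
import pandas as pd

# Hypothetical stand-in for final_df.
final_df = pd.DataFrame({
    "Cuisines": ["Chinese"] * 3 + ["North Indian"] * 5 + ["Cafe"] * 2 + ["Bakery"],
})

counts = final_df.groupby("Cuisines").size().reset_index(name="Count")

# Sort first, then take the head: the true top 3 by frequency.
top_gb = counts.sort_values(by="Count", ascending=False).head(3)

# Head first, then sort: only the first 3 groups alphabetically are ever considered.
head_then_sort = counts.head(3).sort_values(by="Count", ascending=False)

# value_counts does the count and the descending sort in one step.
top_vc = final_df["Cuisines"].value_counts().head(3)

print(top_gb)
print(head_then_sort)
print(top_vc)
```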

@muhammadowaiskhan6831 2 years ago

I am from Pakistan and I have seen a lot of videos about EDA, but this one is just amazing. It is really easy for beginners to understand. Respect to you, Sir!!!

@cyberpro151 1 year ago

Do you work as a data analyst?

@muhammadowaiskhan6831 1 year ago

@@cyberpro151 yes

@shafatnawaz6102 4 months ago

Bruh, watching this legend, and honestly man, today I miss the Super Chat feature; it must be added to YouTube in Pakistan.

@karthikrajendran3394 1 year ago

This is a convenient dataset: no strings in numerical columns and no extra characters. Those are the real challenge.

@kartik_exe_ 1 year ago

Hello, the explanation is great and I have done the assignment; it was super easy: # Finding top 10 cuisines cuisine_counts = df.groupby('Cuisines').size().reset_index().sort_values(by=0, ascending=False) top_10_Cuisines = cuisine_counts.head(10)

@thilak8595 2 years ago

final_df[final_df['Aggregate rating'] == 0][['Aggregate rating','Country']].value_counts() this also works at 1:11:19
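Editor's sketch of the zero-rating count by country, on made-up rows (the values are assumptions; the column names come from the comment). It puts the DataFrame value_counts form from the comment next to the groupby().size() form, to suggest that both give the same counts.

```python
import pandas as pd

# Hypothetical stand-in for final_df.
final_df = pd.DataFrame({
    "Country": ["India", "India", "Brazil", "USA", "India"],
    "Aggregate rating": [0.0, 0.0, 0.0, 4.2, 3.1],
})

zero_rated = final_df[final_df["Aggregate rating"] == 0]

# Form from the comment: value_counts on a two-column DataFrame (MultiIndex result).
by_value_counts = zero_rated[["Aggregate rating", "Country"]].value_counts()

# groupby form, flattened back into ordinary columns.
by_groupby = (
    zero_rated.groupby(["Aggregate rating", "Country"])
    .size()
    .reset_index(name="Count")
)

# Simplest form when only the country matters.
by_country = zero_rated["Country"].value_counts()

print(by_value_counts)
print(by_groupby)
print(by_country)
```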

@user-rg6og5en2k 1 year ago

This was sooo helpful, Krish sir! You made it like butter for us. I liked everything. Thank you, you keep inspiring us.

@antonioarana8002 2 years ago

Perfect explanation! The visualization, the sub-setting, everything, the queries and the observations! GREAT, thanks so much... (I just struggle a little bit with when to use groupby)

@evelyncusilopez6776 11 months ago

Awesome, thank you Krish!

@user-wi7mt5st2s 4 months ago

Thank you so much Sir for the great video

@dikshagupta3276 2 years ago

In cell no. 10 you added so many features. Is it important? Please reply. Nice explanation 👍

@usamashaikh1046 4 months ago

Really appreciate your work

@maneeshmm8105 2 years ago

Thank you for sharing such amazing things on your channel.

@talibdaryabi9434 1 year ago

assignment: df_combined['Cuisines'].value_counts().reset_index().rename(columns = {'index' : 'Food'}).iloc[:10,:]

@sonaganeshg2301 1 year ago

Thank you so much sir. You have given me the confidence to start data science.

@kibetwalter8528 2 years ago

Hi Krish. Please do an example of the difference between using LSTM for classification and LSTM for regression. Explain the difference between the two, especially for multivariate data. You have always been my teacher. I learned machine learning and deep learning from you. No other bootcamp, and I didn't do any computer science course at university. Just your YouTube videos. Thank you so much.

@rithikahuja8203 1 year ago

The most helpful video, sir. Thank you so much for your valuable efforts ❤❤

@SACHINGUPTA-in2gj 2 months ago

Great session sir, love you sir 🥰😍

@chaiyanutjirayupat4724 1 year ago

You are the best!

@firasathali8044 2 years ago

Hello sir, your contributions are very helpful to many aspirants. One question: why have you stopped the linear algebra tutorial?

@palvinderbhatia3941 2 years ago

Hi Krish, amazing video. Thanks a lot for all the videos, keep it up 👍 I have a doubt @1.02: why is the max number of ratings between 2.5 and 3.4, and not 2.5 to 3.9?

@MuhammedShaheb 5 months ago

Great session

@datasciencegyan5145 2 years ago

For figure size in a visualization we can use import matplotlib.pyplot as plt plt.figure(figsize=(10,4))

@Uda_dunga 2 years ago

Yeah, it's just for the background of your chart.
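Editor's sketch of the two figure-sizing routes mentioned in the comments: plt.figure(figsize=...) sizes one figure, while rcParams changes the default for every figure created afterwards. The sizes are arbitrary examples, and the Agg backend is only selected so the snippet also runs without a display.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend; an assumption for headless runs
import matplotlib.pyplot as plt

# Route 1: size a single figure explicitly (width, height in inches).
fig_one = plt.figure(figsize=(10, 4))
size_one = tuple(float(v) for v in fig_one.get_size_inches())

# Route 2: change the default, which applies to figures created from now on.
plt.rcParams["figure.figsize"] = (12, 6)
fig_default = plt.figure()
size_default = tuple(float(v) for v in fig_default.get_size_inches())

plt.close("all")

print(size_one, size_default)
```

In a notebook, a plotting call such as a seaborn plot draws into the current figure, so calling plt.figure(figsize=...) immediately before the plot is what makes the size take effect.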

@user-fb9tw6yh1f 11 months ago

Thank you Krish.

@samuelmorales4871 1 year ago

Thank you amazing video

@sparshruhela8584 7 months ago

#find the top 10 cuisines final_df['Cuisines'].value_counts().head(10).reset_index()

@skuna1217 2 years ago

Wonderful Content

@sethusaim1250 2 years ago

Thank you sir

@atharv_preeti 2 years ago

Wonderful Krish. Just love it.

@sidnoga 2 years ago

Thank you for the amazing session

@syeedafatima8634 2 years ago

you are just amazing!

@shaelanderchauhan1963 2 years ago

The pronunciation of "Cuisines" was hilarious 1:26:20 HHAHAHAHAHHHAHAHAHAHAHAHAHAHAHAHAHA. It was an amazing video, kudos.

@discoverychannel6799 2 years ago

Great session, thank you. I learnt so much.

@kareoss 2 years ago

Nice sir

@amolghongade19_07 1 year ago

Good learning

@catchursam 2 years ago

Great session. Wish I could have joined online

@javeedtech 1 year ago

Thanks sir. One issue: Krish sir moves his screen rapidly, so it is difficult to code along with him. On YouTube we can pause the video and look at that section, but in a live class it is difficult.

@amoldusane9851 2 years ago

Nice session, very informative.

@harish00784 2 years ago

💖💖💖AMAZING💖💖💖

@swapnilloharkar9668 2 years ago

Really Helpful.

@HarishKumar-qt3mr 2 years ago

Very helpful content brother

@akashrathore1388 1 year ago

Really enjoyed it.

@exclusiveglobaleducation2658 2 years ago

Really a great session.

@indranisen5877 2 years ago

Very helpful..

@abelsontenny7537 2 years ago

Using concat instead of merge will perhaps result in NaN values.
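Editor's sketch of the concat versus merge point, using two tiny made-up tables (the column name "Country Code" and all values are assumptions for illustration). concat stacks the frames and fills the columns one side lacks with NaN, while merge matches rows on a shared key, so the two frames do not need the same shape.

```python
import pandas as pd

# Hypothetical restaurant table and country lookup table.
restaurants = pd.DataFrame({
    "Restaurant": ["A", "B", "C"],
    "Country Code": [1, 1, 216],
})
countries = pd.DataFrame({
    "Country Code": [1, 216],
    "Country": ["India", "United States"],
})

# concat: rows are stacked, so each side gets NaN in the columns it does not have.
stacked = pd.concat([restaurants, countries], ignore_index=True)

# merge: every restaurant row picks up its country via the shared key.
merged = pd.merge(restaurants, countries, on="Country Code", how="left")

print(stacked)
print(merged)
```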

@riteshmukhopadhyay6922 2 years ago

Awesome work.

@deepakrc8956 1 year ago

Thank you sir..

@abhishekrao1097 2 years ago

final_df['Cuisines'].value_counts().head(10)

@sakshirathoree2908 2 years ago

How to change the background color of a seaborn plot to black?

@Akash_158 2 years ago

sns.set(rc={'axes.facecolor':'black', 'figure.facecolor':'black'})

@dikshagupta3276 2 years ago

Nice session, thank you sir. In Jupyter, how do we execute all cells with one command?

@surajsuryawanshi5182 2 years ago

Awesome session ❤👍

@onlymusic2005 2 years ago

You are a source of motivation... Keep up the good work, Krish! May Allah bless you and all your beloved!

@aashishraj685 2 years ago

In case of skewed data, do we need to perform a Yeo-Johnson power transformation and then standard scaling for the SVM model?

@MechTech17 2 years ago

top 10 cuisines ----"zomato.Cuisines.value_counts().sort_values(ascending=False).head(10)"

@isi6402 1 year ago

On what basis are you finding the top 10? In real life we prefer the food of restaurants that have good ratings, don't we?

@ashulohar8948 1 year ago

Please make more videos on different use cases.

@siddnrx3943 4 months ago

final_df['Country'][final_df['Rating text'] == 'Not rated'].value_counts()

@codecheckAbhi 2 months ago

# Find the top 10 cuisines final_df['Cuisines'].value_counts().sort_values(ascending = False).head(10)

@tanish7124 2 years ago

Thank you sir. Where do we get the notepad you worked in? Is it saved somewhere? Please let us know.

@sagaragalawe1536 7 months ago

About errors, we can ask ChatGPT directly.

@geekyprogrammer4831 2 years ago

Amazing session Krish😁

@drprince8766 7 months ago

Please make a video on how to create a website. Thank you.

@chetak-thegermanshepherdsm141 2 years ago

Sir, the Django playlist has been left incomplete, I believe. Please upload more videos on Django.

@er_ritesh_meshram 1 year ago

Sir, please do an EDA and ML project.

@kartiksopran1359 2 years ago

Hi sir, can you explain how to aggregate multiple columns in a group by?

@pythonenthusiast9292 2 years ago

What are the prerequisites for this series?

@silverstone9952 11 months ago

Did you find out?

@pythonenthusiast9292 11 months ago

@@silverstone9952 Watch the first video of the playlist, friend.

@pdivyanshupandey104 6 months ago

I'm getting an error in the barplot where we plot aggregate rating vs rating count (a ValueError saying it is not able to interpret the rating count) at 55:20.

@Guy_who_lifts 1 year ago

ty sir

@colabwork1910 2 years ago

Love you

@sandipansarkar9211 2 years ago

finished coding

@shubhamsingh3122 2 years ago

Sir, I am unable to open this dataset in my Jupyter notebook.

@riffatabdulrauf2132 2 years ago

Sir, I want to extract features from a text .csv file using TF-IDF and an n-gram model, and I want the output as a sparse matrix. Do you have any tutorial on that? Please guide.

@gopikrishna4552 2 years ago

Hi sir, while merging two different datasets, should the data shapes be the same or can they differ?

@praveentanikella4078 1 year ago

Krish, I have, let's say, a confusion rather than a doubt. I am looking for EDA knowledge: in real time, if I am given a dataset for EDA as a data analyst, is my work limited to getting insights from the data and presenting them to the client, or do I also have to do machine learning for predictions? Can you please clarify?