Your videos are interesting and informative. I looked through them but could not find any playlist on "Comprehensive Data Analysis" covering all aspects of reading files from different sources, databases, and PDFs, and doing a holistic analysis. This playlist should also cover regex.
Hi Jonathan, Great idea! I'll add that to the list. In the meantime, you can use this script:
    import datetime
    (datetime.date.today() - datetime.date(1991, 4, 24)).days / 365.25
It calculates the difference (in days) between today and the date, and divides it by 365.25. Hope this helps!
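For anyone who wants to reuse the calculation, here is a minimal sketch that wraps it in two functions. The names `age_in_years` and `whole_years` and the optional `today` parameter are my own assumptions for illustration, not something from the video. The first function is the same days / 365.25 approximation; the second counts completed years by checking whether this year's birthday has passed yet.

```python
import datetime


def age_in_years(birth_date, today=None):
    """Approximate age in years: elapsed days divided by 365.25."""
    today = today or datetime.date.today()
    return (today - birth_date).days / 365.25


def whole_years(birth_date, today=None):
    """Completed years, without relying on the 365.25 approximation."""
    today = today or datetime.date.today()
    # True counts as 1, so subtract a year if the birthday has not happened yet
    not_yet = (today.month, today.day) < (birth_date.month, birth_date.day)
    return today.year - birth_date.year - not_yet


birth = datetime.date(1991, 4, 24)
print(age_in_years(birth))  # fractional years as of today
print(whole_years(birth))   # completed years as of today
```

The fractional version is usually fine for analysis; the whole-year version is likely the safer choice if you need an exact age right around a birthday.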
@datagy Thank you!! That works. (The terminal shows me the results in days, but the actual .xlsx that I made has the correct results in years.) Thank you again, big time!
Sir, I work on football match prediction. When I remove the full-time home team goals (FTHG) and full-time away team goals (FTAG) features, my algorithm's accuracy is 66%, but when I add the FTHG and FTAG features, my accuracy is above 90%. What can I do?
Thank you for the tutorial. I found this video very useful and interesting. I have a question though: what if we don't know the resulting number of columns? Without this information, I can't proceed with adding column names to the DataFrame.
@datagy Thanks for the reply. In the text-to-columns section of the tutorial video, you created 2 columns knowing that there would be only 2. There could be instances where we don't know the number of delimiters, or where it varies. For example, a Name column with one cell containing 3 names separated by ";" --> "Tom Cruise; Sylvester Stallone; Tom Hanks" and a 2nd cell containing "Catherina Zeta-Jones; Angelina Jolie".
@prashantha14r Hey there! Great question. There are probably much more elegant ways of doing this, but here's one solution! Let me know if you have any troubles.
    import pandas as pd
    df = pd.DataFrame.from_dict({'Names': ["Tom Cruise; Sylvester Stallone; Tom Hanks", "Catherina Zeta-Jones; Angelina Jolie"]})
    # Count how many columns need to be created (find the maximum number of delimiters that exist, then add 1)
    number_of_columns = df['Names'].str.count(';').max() + 1
    # Create a list of column names using a list comprehension (check out: ru-vid.com/video/%D0%B2%D0%B8%D0%B4%D0%B5%D0%BE-ojarZZ5MxBk.html for help)
    columns = ['name_' + str(i) for i in range(number_of_columns)]
    # Split on "; " (semicolon plus space) so the names don't keep a leading space
    df[columns] = df['Names'].str.split('; ', expand=True)
Nik
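As an alternative, here is a rough sketch that skips the counting step. It relies on the fact that `str.split(..., expand=True)` already creates as many columns as the longest row needs, so the column names can be added afterwards with `add_prefix`. The variable names `parts` and `result` are just placeholders I picked, and the `str.strip()` step assumes you want to drop any stray spaces around each name.

```python
import pandas as pd

df = pd.DataFrame({'Names': ["Tom Cruise; Sylvester Stallone; Tom Hanks",
                             "Catherina Zeta-Jones; Angelina Jolie"]})

# Split first: expand=True returns one column per piece, numbered 0, 1, 2, ...
# Rows with fewer names are padded with missing values.
parts = df['Names'].str.split(';', expand=True)

# Remove surrounding whitespace from every piece
parts = parts.apply(lambda col: col.str.strip())

# Turn the numeric column labels into name_0, name_1, ...
parts = parts.add_prefix('name_')

# Attach the new columns to the original DataFrame
result = df.join(parts)
print(result)
```

This should behave the same no matter how many names the longest cell holds, since the number of columns is never written down anywhere.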