Learn how to accurately do Natural Language Processing (NLP) on Twitter data, and use the roBERTa model with Python for tweet sentiment analysis. Code on GitHub: github.com/meh... roBERTa on Huggingface: huggingface.co...
Great tutorials! All your code runs successfully and is the most up-to-date! I like all your videos and look forward to your future ones!
Wow, man!!! Your tutorials are super easy to follow. I came across your channel while searching for something randomly. Wishing you the best as you grow this channel and bring us more videos like these.
Very, very good tutorials! Both the quality of the video and the quality of your code explanations are excellent, plus a very good choice of what kind of video to make! 🙌 Keep going. Would it be possible to have a video on how to get the number of tweets a specific account has posted in the past x months/years? Thank you!
Awesome! As an accountant, I am interested in getting into Python for various kinds of analysis. Please, which book should I learn from, from scratch and comprehensively?
Thank you for your instruction. I followed along, made some modifications, and managed to do sentiment analysis on 5,000 tweets. It took about 12 hours, though, at 7-8 seconds per tweet. And for some reason, all of the emojis were corrupted.
How do we know the sequence of the scores is Negative, Neutral, and Positive? And is the output always going to be 3 scores? Are the raw scores given by roBERTa cosine similarities? I was unable to follow that part. Great insights, though.
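(For anyone with the same question: the raw scores are the model's logits, not cosine similarities. For this roBERTa sentiment model the three logits come out in the fixed order negative, neutral, positive, and softmax converts them into probabilities. A minimal sketch with made-up logit values standing in for the model output:)

```python
import numpy as np

# Hypothetical raw logits as the model would return for one tweet.
# This sentiment model always emits three logits, in the fixed order:
# negative, neutral, positive.
logits = np.array([-1.2, 0.3, 2.1])

# Softmax turns the logits into probabilities that sum to 1
scores = np.exp(logits) / np.exp(logits).sum()

labels = ["Negative", "Neutral", "Positive"]
print(labels[int(np.argmax(scores))])  # highest-probability label
```

The ordering itself comes from how the model was trained, which is why the video hard-codes the labels list in that order.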
@aispectrum, can I use the same roBERTa model for Instagram comments? I have the comments but I'm not sure which model I have to use for sentiment analysis. Thanks for your content.
Hey, your playlist helped me a lot. I'm working on Twitter data related to some social issues in our society (India), but the majority of the tweets are in regional languages. I tried using the Google Translate API, but the results are not satisfactory. Can this model analyze all languages?
I have a question: does this sentiment analysis only work on a comment that we paste in here, or can we run sentiment analysis on the comments/tweets from someone's post? Hope you understand.
Hey! Thanks a lot for your videos, they help so much! I'm new to Python and I wanted to pass in a list of tweets (as you did in your video "How to get TWEETS by Python | Twitter API 2022") instead of a single tweet. I succeeded in getting every score for every tweet, but it would be great to keep only the best score (and its corresponding label) for each one. How would you do that? Thanks a lot for your content!
@@aispectrum Thanks for your help. It works great, but if I try to add it to the dataframe (to have the label and prediction), the values are the same for all rows. For example, "Neutral" and "0.75" for the first row and the same for the following ones :/ Sorry for my lack of skills.
It is possible to do the same analysis for multiple tweets. After doing the preprocessing on the text (@username to user, etc.), just pass the list of tweets to the tokenizer (make sure to specify max_length, plus padding=True and truncation=True) to get the encoded tweets. After getting the output, you can compute the scores with softmax(output[0].detach().numpy(), axis=1)
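(To then keep only the best label per tweet, as asked above, take an argmax along axis 1. A small self-contained sketch, with hypothetical logits standing in for output[0].detach().numpy():)

```python
import numpy as np

labels = ["Negative", "Neutral", "Positive"]

# Hypothetical logits for two tweets, shape (2, 3) — in practice this
# array would come from output[0].detach().numpy()
logits = np.array([[2.0, 0.1, -1.5],
                   [-0.5, 0.2, 1.8]])

# Row-wise softmax, equivalent to softmax(..., axis=1) from scipy.special
scores = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)

# One best (label, probability) pair per tweet
best_idx = np.argmax(scores, axis=1)
best = [labels[i] for i in best_idx]
print(best)
```

Each row of `scores` sums to 1, so `np.max(scores, axis=1)` gives the matching probability if you also want to store it in the dataframe.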
@@jesusbaug Hi, can you send the code for how we can give multiple tweets as input? I have written tweet = ['Great content subscribed', 'Hiii sir'], but when I pass this into "for word in tweet.split()" it throws an error. Can you help, please?
What are the consequences of using "max_length" with several tweets? Because it would be analyzing different tweets of different lengths that I joined into one text. If I use max_length "100", that means 100 tokens, correct? Which could then include tokens from across different tweets... so I am really struggling to run it over a large number of tweets.
There is no more free access to the Twitter API; it now costs $100 per month. I started this project and got all the way to the end, only to hit this when pulling the tweet request.
Hello, I found your labs very interesting and well explained. I have a query, sir. Sometimes while fetching real-time tweets I get a 403 response. I have Essential access on my Twitter dev account. What could be the reason?
I believe that because you have Essential access, you cannot use API v1.1 (tweepy.API). You could either request Elevated access or try to use API v2 instead (tweepy.Client).
Great video. I am getting an error but I don't know why. The error is 'RuntimeError: Numpy is not available'. I already have NumPy installed; maybe you can help me. Thanks, buddy.
"from transformers import AutoTokenizer [and/or] AutoModelForSequenceClassification" kills the kernel in jupyter notebook. I've tried uninstalling and re-installing stuff, pip and conda, etc. Would love a fix!
@@aispectrum Your solution worked. Thank you! I know this is a very simple question, but how do I modify your code to apply it to a df of tweets instead of a single tweet? Presently, I can only do it by creating new variables ('encoded_tweets', 'scores', etc.) as empty lists of length equal to the length of my df and then iterating through those lists. In R, I would just use "mutate" and it would be incredibly simple and fast.
When I try to reproduce your results for output = model(**encoded_tweet) on a different dataset, I get this error: RuntimeError: The expanded size of the tensor (1114) must match the existing size (514) at non-singleton dimension 1. Target sizes: [146609, 1114]. Tensor sizes: [1, 514]. What can I do to fix it?
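(The 514 in that error is roBERTa's position-embedding limit: 512 content tokens plus two special tokens, so any sequence tokenized to more than that fails. Passing truncation=True and max_length=512 to the tokenizer should avoid it. A stand-in sketch of what that truncation option does, using hypothetical token ids instead of a real tokenizer:)

```python
# roberta-base can only attend to 512 tokens (514 positions including
# the <s> and </s> special tokens), so longer inputs must be cut.
MAX_LEN = 512

def truncate_ids(token_ids, max_len=MAX_LEN):
    """Stand-in for what tokenizer(..., truncation=True, max_length=512) does."""
    return token_ids[:max_len]

long_input = list(range(1114))        # hypothetical over-long token id sequence
print(len(truncate_ids(long_input)))  # now fits the model's position table
```

With the real tokenizer the call would look like tokenizer(tweet, return_tensors='pt', truncation=True, max_length=512); anything past the limit is simply dropped, so very long texts lose their tails.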
If anyone is interested: the above query can be achieved using the following code (the rest of the code stays the same):

# sentiment analysis
def SA(tweet_proc):
    encoded_tweet = tokenizer(tweet_proc, return_tensors='pt')
    output = model(**encoded_tweet)
    scores = output[0][0].detach().numpy()
    scores = softmax(scores)
    ind = np.argmax(scores, axis=0)
    return labels[ind]

df['SA'] = df['Text'].apply(SA)
Do both PyTorch and TensorFlow use my GPU for the sentiment analysis? I've managed to implement the code in my project, but the sentiment analysis runs very slowly. Will it be faster if I switch to TensorFlow, and how do I do that?
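(A reply sketch, assuming the PyTorch setup from the video: neither framework uses the GPU unless you put the model and inputs there, and moving to a CUDA device usually helps far more than switching frameworks. The model/encoded_tweet names below are assumed from the video, and the actual model calls are left commented since they need the loaded model:)

```python
import torch

# Pick the GPU when one is visible to PyTorch, else fall back to CPU
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# With the model and inputs from the video (names assumed), you would then do:
# model = model.to(device)                                  # move weights once
# encoded_tweet = {k: v.to(device) for k, v in encoded_tweet.items()}
# output = model(**encoded_tweet)                           # now runs on `device`
print(device.type)
```

Batching many tweets into one tokenizer/model call (padding=True, truncation=True) also tends to speed things up a lot compared to one forward pass per tweet.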