Sign Language Detection using ACTION RECOGNITION with Python | LSTM Deep Learning Model

398K views

Want to take your sign language model a little further?
In this video, you'll learn how to leverage action detection to do so!
You'll be able to leverage a keypoint detection model to build a sequence of keypoints, which can then be passed to an action detection model to decode sign language! As part of the model building process you'll use TensorFlow and Keras to build a deep neural network with LSTM layers that handle the sequence of keypoints.
In this video you'll learn how to:
1. Extract MediaPipe Holistic Keypoints
2. Build a sign language model using action detection powered by LSTM layers
3. Predict sign language in real time using video sequences
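As a rough sketch of step 1, the keypoints for one frame can be flattened into a single feature vector. This assumes the usual MediaPipe Holistic landmark counts (33 pose landmarks with x, y, z and visibility; 468 face landmarks and 21 landmarks per hand with x, y, z). The helper names and the `SimpleNamespace` stand-in for a MediaPipe result are illustrative only, not taken from the video's code.

```python
from types import SimpleNamespace

import numpy as np

# Landmark counts assumed from MediaPipe Holistic's standard outputs.
POSE, FACE, HAND = 33, 468, 21


def flatten(landmarks, count, fields):
    """Flatten one landmark group, or return zeros if it was not detected."""
    if landmarks is None:
        return np.zeros(count * len(fields))
    return np.array(
        [[getattr(p, f) for f in fields] for p in landmarks.landmark]
    ).flatten()


def extract_keypoints(results):
    """Concatenate pose, face, left-hand and right-hand keypoints in that order."""
    pose = flatten(results.pose_landmarks, POSE, ("x", "y", "z", "visibility"))
    face = flatten(results.face_landmarks, FACE, ("x", "y", "z"))
    lh = flatten(results.left_hand_landmarks, HAND, ("x", "y", "z"))
    rh = flatten(results.right_hand_landmarks, HAND, ("x", "y", "z"))
    return np.concatenate([pose, face, lh, rh])


# Stand-in for a frame in which only the left hand was detected.
point = SimpleNamespace(x=0.5, y=0.5, z=0.0)
fake_results = SimpleNamespace(
    pose_landmarks=None,
    face_landmarks=None,
    left_hand_landmarks=SimpleNamespace(landmark=[point] * HAND),
    right_hand_landmarks=None,
)
keypoints = extract_keypoints(fake_results)
print(keypoints.shape)  # (1662,)
```

Filling undetected groups with zeros keeps every frame the same length, which is what lets the frames be stacked into fixed-shape sequences for the LSTM.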
Get the code:
github.com/nicknochnack/Actio...
Chapters
0:00 - Start
0:38 - Gameplan
1:38 - How it Works
2:13 - Tutorial Start
3:53 - 1. Install and Import Dependencies
8:17 - 2. Detect Face, Hand and Pose Landmarks
40:29 - 3. Extract Keypoints
57:35 - 4. Setup Folders for Data Collection
1:06:00 - 5. Collect Keypoint Sequences
1:25:17 - 6. Preprocess Data and Create Labels
1:34:38 - 7. Build and Train an LSTM Deep Learning Model
1:50:11 - 8. Make Sign Language Predictions
1:52:40 - 9. Save Model Weights
1:53:45 - 10. Evaluation using a Confusion Matrix
1:57:40 - 11. Test in Real Time
2:20:46 - BONUS: Improving Performance
2:26:52 - Wrap Up
Oh, and don't forget to connect with me!
LinkedIn: bit.ly/324Epgo
Facebook: bit.ly/3mB1sZD
GitHub: bit.ly/3mDJllD
Patreon: bit.ly/2OCn3UW
Join the Discussion on Discord: bit.ly/3dQiZsV
Happy coding!
Nick
P.S. Let me know how you go and drop a comment if you need a hand!

Science

Published:

16 Jun 2024

Comments: 1K

@girishkemba3865 3 years ago

I remember some time ago requesting this type of video, but to see that it's finally here brings me joy. Can't wait to do this and show it to my sign language friends.

@NicholasRenotte 3 years ago

I know right, it's taken a while but finally it's here! Thanks for sharing!

@user-no2xg7mv7h 4 months ago

00:01 This video demonstrates sign language detection using action recognition with Python.
01:40 The video discusses the process of sign language detection using action recognition and LSTM deep learning model.
05:16 MediaPipe Holistic allows us to get key points from face, body, and hands
07:17 Setting up webcam access and rendering frames using OpenCV
11:06 The code captures frames from a webcam and displays them on the screen.
12:46 Setting up MediaPipe Holistics and creating variables for MediaPipe Holistic and MediaPipe Drawing Utilities
16:46 The video explains the process of color conversion in sign language detection.
18:32 The process involves detecting sign language using media pipe and a deep learning model.
21:59 The video discusses the different types of landmarks in sign language detection using action recognition.
23:23 The video explains how to detect and visualize different types of landmarks using MediaPipe.
27:05 The video discusses how landmarks in facial and body pose can be connected to each other.
28:37 Implementing sign language detection using LSTM deep learning model in Python
32:18 Landmarks are drawn and rendered in real time using image pass and cv2
33:55 You can customize the formatting of the dots and connections in Sign Language Detection using a landmark drawing spec and a connection drawing spec.
37:32 Updating pose and hand landmarks with different colors and parameters
39:31 Different models in action: left hand, right hand, face, and pose.
43:09 The code demonstrates how to extract landmark values using pose estimation.
45:04 The video explains how to reshape and convert landmarks into a single array.
48:27 Building a neural network and extracting key points using action recognition with Python
50:10 Setting up error handling and placeholder arrays for pose and face landmarks.
53:52 The video explains how to extract key points for sign language detection using LSTM deep learning model in Python.
55:31 Concatenating pose, face, left hand, and right hand keypoints for sign language detection.
59:11 Using LSTM Deep Learning Model to detect sign language actions
1:00:57 Creating folders to store data for different actions and sequences.
1:04:16 Creates a folder structure for sign language detection using action recognition with Python.
1:05:48 Collecting data using MediaPipe loop and capturing snapshots at each point in time.
1:10:14 The code is outputting text to the screen and taking a break at frame 0.
1:11:44 The first block of code prints starting collection in the middle of the screen and pauses.
1:15:10 The code collects key points by looping through actions, sequences, and frames.
1:16:39 Implementing sign language detection using action recognition with a LSTM deep learning model.
1:20:25 Sign language detection using action recognition with Python
1:23:20 Using MediaPipe to collect key points for sign language detection
1:26:55 Creating a dictionary to map labels to numeric ids
1:29:08 Sequences represent feature data and labels represent y data
1:32:26 Data preprocessing and training and testing partitioning are important steps in sign language detection using LSTM deep learning model.
1:34:12 Training LSTM neural network using TensorFlow and Keras.
1:38:14 The model uses LSTM layers for sign language detection.
1:39:55 The next three layers are dense layers using fully connected neurons.
1:43:16 The video discusses the process of formulating a neural network for sign language detection using action recognition and LSTM deep learning model.
1:44:58 Training the model with 2000 epochs
1:48:11 The training accuracy is high at 93.75% after 173 epochs.
1:49:37 The model has three LSTM layers and dense layers, with a small number of parameters to train.
1:53:13 Reloading a deleted model and evaluating its performance using scikit-learn.
1:55:03 Converting y test and y hat values to matrices and then evaluating the model performance using a confusion matrix and accuracy score.
1:58:18 Implementing prediction logic by concatenating data onto sequence and making detections when 30 frames of data are available.
2:00:30 Implement logic to grab the last 30 sets of key points for generating predictions.
2:04:18 Implementing visualization logic and checking result threshold and sentence length
2:06:45 The code checks if the current action matches the last sentence in the string.
2:10:05 Sign language detection using LSTM deep learning model
2:12:39 The video discusses sign language detection using action recognition with Python using an LSTM deep learning model.
2:17:13 The video discusses sign language detection using action recognition with Python
2:19:14 Sign Language Detection using Action Recognition with Python
2:22:29 To ensure accurate action detection, the last frame needs to be included in the sequence.
2:24:04 The code implementation adds stability by checking if the last 10 frames have the same prediction.
Crafted by Merlin AI.
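The real-time steps in the summary above (keep the last 30 frames, apply a threshold, require the last 10 predictions to agree, and avoid repeating the last word) can be sketched roughly as follows. This is a minimal illustration and not the video's code: `run_stream`, the `predict` callable, the threshold of 0.5, the five-word sentence cap and the action names are all assumptions made for the example.

```python
from collections import deque

SEQUENCE_LENGTH = 30  # frames per prediction window (from the summary above)
STABLE_FRAMES = 10    # consecutive agreeing predictions required
THRESHOLD = 0.5       # assumed confidence cutoff
MAX_WORDS = 5         # assumed cap on the displayed sentence length


def run_stream(frames, predict, actions):
    """Return the words detected while consuming a stream of per-frame keypoints."""
    window = deque(maxlen=SEQUENCE_LENGTH)  # keeps only the newest 30 frames
    recent = deque(maxlen=STABLE_FRAMES)    # newest 10 predicted class ids
    sentence = []
    for keypoints in frames:
        window.append(keypoints)
        if len(window) < SEQUENCE_LENGTH:
            continue  # not enough frames collected yet
        probs = predict(list(window))
        best = max(range(len(probs)), key=lambda i: probs[i])
        recent.append(best)
        stable = len(recent) == STABLE_FRAMES and len(set(recent)) == 1
        if stable and probs[best] > THRESHOLD:
            # Only add the word if it differs from the last one shown.
            if not sentence or sentence[-1] != actions[best]:
                sentence.append(actions[best])
        sentence = sentence[-MAX_WORDS:]
    return sentence


# Toy stand-ins: each "frame" is a single number and the fake model
# simply reports the class of the newest frame with high confidence.
actions = ["hello", "thanks", "iloveyou"]


def fake_predict(window):
    return [0.9, 0.05, 0.05] if window[-1] == 0 else [0.05, 0.9, 0.05]


detected = run_stream([0] * 45 + [1] * 45, fake_predict, actions)
print(detected)  # ['hello', 'thanks']
```

In a real pipeline `predict` would wrap the trained model and each frame would be the flattened keypoint vector for that frame.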

@savi-2084 4 months ago

I cannot thank you enough for all the videos you create. I was a noob in tech, but it's been a year now since I started watching your videos, and I am so proud of you and myself for coming this far. And this project works for me ❤

@aminberjaouitahmaz4121 3 years ago

Thank you for these clear, practical, straight to the point tutorials! Looking forward to your future videos!

@NicholasRenotte 3 years ago

Cheers @Amin, so pumped you're enjoying them!

@aqsaqamar1634 1 year ago

@Nicholas Renotte can you tell me why I'm getting this error: 'mediapipe.python.solutions.holistic' has no attribute 'FACE_CONNECTIONS'?

@kanchanpatil9642 1 year ago

As someone who is following this in 2023, here are some changes. I'll be editing them in as they pop up while I go through the tutorial. 25:42 FACE_CONNECTIONS seems to have been renamed/replaced by FACEMESH_TESSELATION. And since we want just the outlines of the face, it's FACEMESH_CONTOURS that we need in this project.
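One way to write code that tolerates both the older and the newer constant names mentioned in this comment is to look them up in order of preference. This is only a sketch: `face_connections` is a hypothetical helper, and the `SimpleNamespace` objects stand in for the `mediapipe.solutions.holistic` module so the example runs without MediaPipe installed.

```python
from types import SimpleNamespace

# Preference order: face outlines first, then the full mesh, then the legacy name.
CANDIDATES = ("FACEMESH_CONTOURS", "FACEMESH_TESSELATION", "FACE_CONNECTIONS")


def face_connections(holistic_module):
    """Return (name, value) of the first face-connection constant the module has."""
    for name in CANDIDATES:
        if hasattr(holistic_module, name):
            return name, getattr(holistic_module, name)
    raise AttributeError("no known face connection constant found")


# Stand-ins for an older and a newer version of the holistic module.
old_module = SimpleNamespace(FACE_CONNECTIONS=frozenset({(0, 1)}))
new_module = SimpleNamespace(
    FACEMESH_CONTOURS=frozenset({(0, 1)}),
    FACEMESH_TESSELATION=frozenset({(0, 1), (1, 2)}),
)
print(face_connections(old_module)[0])  # FACE_CONNECTIONS
print(face_connections(new_module)[0])  # FACEMESH_CONTOURS
```

With the real library, the module passed in would be `mp.solutions.holistic`, and the returned connections would go where the tutorial passes `FACE_CONNECTIONS` to `draw_landmarks`.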

@taredje4664 1 year ago

thanks, you saved me

@VanderlanAlves7 1 year ago

wow! Thank you very much!!!

@interstellarstar3742 4 months ago

Hey, I can't collect data. How do I save it?

@stinger9231 24 days ago

Thank You so much, got stuck there for a minute

@yohanessatria2220 3 years ago

Man, you are so underrated and deserve a lot more! thanks a lot for these awesome learning materials! I have learned a lot from you. Keep inspiring, man :)

@NicholasRenotte 3 years ago

Thanks so much @Yohanes! So glad you're enjoying them 🙏

@rainymatch 1 month ago

It's so cool to see how happy Nicholas is when everything works in the end. That's the spirit! Amazing video, thanks a lot for your work man!

@mohammadmehdiNazemi 6 months ago

Thanks for the amazing tutorials! absolutely life-saving. Just a reminder that the z value from mediapipe is with respect to the wrist landmark not the distance from the camera! I found out pretty late!

@leafiadias96 2 years ago

thanks for this amazing tutorial sir , we are working on a project that needed this section and your videos and explanation are being extremely helpful to me and my team ! thanks a lot

@fawwazhameed1104 1 year ago

Heyy leafia, could you tell me about your project?

@torstenknodt6866 2 years ago

Thanks, great videos. It would be great if you could elaborate on the differences between the MediaPipe implementation used here and the others you mentioned. I mean a real comparison of the underlying models/networks and their training.

@engeerdanisme 1 year ago

Thank you @Nicholas Renotte I just passed my capstone project defense utilizing this deep learning model

@malice112 1 year ago

Nicholas is the best machine learning youtuber, his tutorials are interesting and fun.

@gaddesaishailesh2772 3 years ago

I was really waiting for this video!

@NicholasRenotte 3 years ago

IKR, it's taken a little while hey @Gadde Sai Shailesh!

@ibrahimalizada381 2 years ago

Hi, Nicholas! This is a great video series to watch and learn from! Thank you very much! Can you please prepare a video applying CV to real-time sign language detection based on a ready-made dataset available on the Internet? It would be even more interesting to see ViT in action recognition as well.

@VarunAditTheGreat 1 year ago

Hey, I am trying to build a project with a bigger dataset for ASL. Did you find any dataset?

@asutoshpatro2865 1 year ago

@@VarunAditTheGreat I have found one, it's the WLASL dataset. Did you build it? Please share the code link.

@ruthogadina757 6 months ago

i'm learning about this, would you like to work on a project together?

@ibrahimhameem1334 3 years ago

Super stuff Nicholas! Super grateful for your tutorials 🙌🏻. Keep up the great work!

@NicholasRenotte 3 years ago

Thanks so much @Ibrahim, soooo much more to come!

@stevecoxiscool 2 years ago

Great explanation on how to use LSTM with pose coordinates.

2 years ago

That's amazing! I watched this video more than a month ago but it seemed difficult for me as a beginner. Then I tried my best to finish Machine Learning / Deep Learning / Python / TensorFlow and some Data Science courses within a month. Now watching this video again is like watching a movie! It's easy to follow! Love it

@NicholasRenotte 2 years ago

YESSS! That's amazing that you stuck with it, great work man!!

@ruqaiyaali1645 2 years ago

you finished ML/DL/Python and Data science course within a month!! how is this possible man? I am having a hard time with these courses 🥲

2 years ago

@@ruqaiyaali1645 I think you must be familiar with Python code. Make sure you practice more than what you learn.

@nguyenvietthai5868 8 months ago

@ are you Vietnamese. I see your name. Can you give me some experience please? If so, please respond to me. Thanks a lot.

8 months ago

@@nguyenvietthai5868 Hi there, please let me know your concerns, I hope that I could help you too.

@theethatanuraksoontorn1369 2 years ago

Hey Nicholas, I am working on a similar project. Just wondering: when I test the model using your metric, it does not reflect the same accuracy as the real-time test. I trained the model to 80-90% accuracy, but the real-time test barely captures any sign language. Do you have any thoughts?

@Nikos_prinio 1 year ago

Hi ! I'm impressed by the amazing clarity of your explanations. For one second I thought you must be a trained teacher robot....

@Stacio6 2 years ago

Hi Nicholas thanks so much !!!! I am creating a model to help deaf people here in my country. Greetings from Guatemala !!!

@NicholasRenotte 2 years ago

Awesome stuff!!

@MuhammadKamran-ow5vp 1 year ago

I have a question. Is it possible to feed video of arbitrary lengths (frames) instead of feeding an action of fixed length video? Because in real time, we perform sign language pretty fast and each action is of arbitrary length.

@mahmudanajnin9367 2 years ago

hey nick! this project is amazing! thank you for these awesome tutorials. You did sign language detection with tensorflow object detection which detects sign using single frame but here we're using multiple frames to detect it. So i was wondering how is this one better than tensorflow object detection?

@NicholasRenotte 2 years ago

Just depends on the use case, the OD model does it on a single frame, this does it for multiple frames (this one is better for signs with multiple phases)

@ishaanverma1969 1 year ago

This content is so underrated! Thank you so much!

@T-She-Go 2 years ago

Thank you so much Nicholas 😌 This will help me with my project 🙌🏾

@theethatanuraksoontorn1369 2 years ago

Hi Nicholas, I've been working on a similar project. I believe this tutorial was kept simple on purpose, so I would like to add my two cents. When adding more actions, the real-time predictions get mixed up a lot due to frame overlap and wrong slicing of the frames. I would suggest showing some visual cue for the start and end of the prediction, so the user can follow from the start frame to the end frame. This way it is similar to how the data was collected and gives higher prediction accuracy.

@rowlandgoddy-worlu3382 2 years ago

This is an amazing video! I have learned a lot following your tutorials. One question - What if you are trying to capture actions that are not of equal time duration. E.g if a sign language like "Good Morning" lasts for 5 seconds and another sign like "Welcome" lasts for 9 seconds. How can this be treated?

@032lovishkumar8 2 months ago

hey, i am getting error IndexError: list index out of range while running 2:00:10 , how can i resolve it ?

@rusticagenerica 1 year ago

Exceptional tutorial. Thank you from the bottom of my heart.

@phoque6 3 years ago

Thank you for a detailed and wonderful mediapipe tutorial :)

@NicholasRenotte 3 years ago

So glad you liked it!

@angelortiz3564 2 years ago

This is so awesome! You can theoretically do the same for the static letters in the ASL alphabet, right? Just make a dataset that contains each hand sign. The model would be trained on the keypoints of each hand sign. Although I am not sure whether the keypoints would be accurate for some hand sign letters. What do you think?

@anshumanchoudhary4732 9 months ago

That model would be far easier to achieve

@eswar7781 4 months ago

@@anshumanchoudhary4732 which model

@yashas_hm 2 years ago

Hi Nicholas, such an amazing video. It helped me a lot in building a project. I am working on a different project in which I trained the model with around 20 signs from ASL, but I am getting a categorical accuracy of only 0.05 on average in each epoch. Can you tell me where I went wrong or how to improve it?

@martinposso2098 1 year ago

hey, how did you manage to fix that problem?

@depallyyadaiahgoud750 1 year ago

That's way cooler one and your explanation was a ton easier 😉 Thanks Nick

@akshatraj5952 9 months ago

The videos that you make are wonderful. Thank you for the practical and clear points in the tutorials.

@usamaejaz5264 8 months ago

The MP_Data folder is missing, so where do we get it from?

@akshith.vbharadwaj2269 2 years ago

Greetings Hey man this is an awesome tutorial and I completely love the way u have explained the process step by step. It was an awesome tutorial and I completely loved it. I tried it on my own and I have encountered some problems it would be a great help if u could help me out with it. I have followed the same method that u have prescribed on the video these are the problems which came up. Even after getting overall categorical accuracy 95% and above accuracy on training datasets when I do the gesture recognition it is not recognising one gesture. And sometimes it shows the same gesture even though I am showing a different gesture. Sometimes even it is detecting 2 gestures even though I am not giving any gestures. I am always retraining the same data to get a higher accuracy before going to the gesture recognition part. I have also increased a layer in the LSTM model but the results are the same. Would greatly appreciate the help.

@NicholasRenotte 2 years ago

Start with the data, I would add more data of the underperforming classes then retrain. Remember bad data in will lead to bad outputs and vice versa, try adding 20-30 more samples for each underperforming class and give it a go!

@mervesisci4983 2 years ago

Hi Nicholas, Thank you for this amazing tutorial. If we use padding in this case (videos containing movements with different number of frames) how can we make predictions in real-time? In the tutorial you set a fixed length (30 frames) (sequence=sequence[:30] if len(sequence)==30), but in my case there are different frame sizes for each activity in real-time prediction.

@abhisekpanigrahi1033 2 years ago

Hello Nicholas I also have this question. Can you please answer this what if the dimensions are different each time we run

@dikennamaka7408 2 years ago

Hi Nicholas, thanks for this project, it is incredible. How would you handle video files with varying number of frames? How can I possibly approach the situation?

@matteosacco00 22 days ago

Same question, anyone with suggestions?

@mehmety5012 2 years ago

Great Tutorial Nicholas. Thank you so much !

@user-eg5zp7wi4v 1 year ago

Thank you so much! You are my best teacher in my college life!!!!

@SABEDIT2914 3 months ago

Did you make this project?

@tigre1217 2 years ago

Hi nick! Nice tutorial on this sign language recognition program. I had faced some problems of the categorical accuracy staying the same when im trying to add more signs to the model rather than 3 like the ones you used in the video, is there any way to solve this issue? Thanks!

@raiyan22 1 year ago

Hi, are you still working on it?

@labhjoshi3182 1 year ago

@@raiyan22 same question

@danieladama8105 3 years ago

Can’t lie.. I have learnt a lot from Nicholas

@NicholasRenotte 3 years ago

My man! Thanks for checking in!

@PIKACHU-zn8fx 3 months ago

agreed still learning from him

@fatiha2413 2 years ago

Hi, Nicholas! I learned a lot from this video! Thank you very much!

@amessit10 2 years ago

hiii FATIMA, can we implement this project for 26 letters? I am getting the error "list index out of range" when trying to do more than 3 actions

@gustavojuantorena 3 years ago

Wow! This is great Nick! 👏👏👏

@NicholasRenotte 3 years ago

Thanks a bunch Gustavo!!

@y.yuvraj 2 years ago

Hii Nicholas This is really an amazing tutorial I really appreciate it. But I am having an error at fitting the model and it is of 'ValueError' which is "Failed to convert a NumPy array to Tensor". I tried many things but it is not going away so please give me a hand on this.

@another.nikhil 1 year ago

Check the datatype of the inputs to your model. The Keras API only accepts NumPy arrays.
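A quick way to act on this advice is to validate the sequences before training. The sketch below assumes one common cause of this error, sequences of unequal length that cannot form a regular float array, which the thread does not confirm. `to_model_input` is a hypothetical helper.

```python
import numpy as np


def to_model_input(sequences):
    """Stack equally shaped sequences into one float32 array, or fail loudly."""
    shapes = {np.shape(seq) for seq in sequences}
    if len(shapes) != 1:
        raise ValueError(f"sequences have inconsistent shapes: {sorted(shapes)}")
    return np.asarray(sequences, dtype=np.float32)


# Two sequences of 30 frames with 1662 features each (the feature size is an assumption).
good = [np.zeros((30, 1662)), np.ones((30, 1662))]
X = to_model_input(good)
print(X.shape, X.dtype)  # (2, 30, 1662) float32
```

If the check raises, the offending sequences usually need to be re-collected, truncated or padded so that every sample has the same number of frames.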

@amitdutta3875 3 years ago

you are great.

@NicholasRenotte 3 years ago

Thanks so much @Amit!

@soumendas2336 3 years ago

thank you Nicholas i have learned a lot of things from this video ....that I was looking for the past few months..

@NicholasRenotte 3 years ago

Yesss! So glad!

@rabiraj1387 3 years ago

Awaited Video Nicholas hope to complete it and implement on my side.

@NicholasRenotte 3 years ago

I know, can't believe it's finally out! Let me know how you go @Rabi Raj!

@girisathvikavpragatiengine309 1 year ago

Hey Nicholas, the tensorflow version of 2.4.1 is showing an error. It says " Could not find a version that satisfies the requirement tensorflow==2.4.1" please help me out

@alissiazaidi2631 4 months ago

hey, did you find the solution ? Actually, I have the same error...

@pareshgupta3288 2 months ago

@@alissiazaidi2631 just change the version. If it's Windows, use: pip install tensorflow==2.10.0. If Linux: pip install tensorflow==2.16.0

@estebanpozo8702 2 years ago

Hi Nicholas, thanks again for this great tutorial! I am writing this because I would like to learn more about how you chose your architecture. As you mention, almost all the state-of-the-art papers use a combination of CNN and LSTM. So, I have two questions: 1. Would it be possible to get a more detailed explanation of how you built this model? 2. So far, I have reviewed “LSTM: A Search Space Odyssey” by Greff (+ other papers) and the “Neural Network design” handbook by Hagan. Could you recommend any documents regarding LSTM architectures?

@NicholasRenotte 2 years ago

This is how I normally build stuff: 1. Find a research paper that has implemented a similar model 2. Try building the code for that model 3. Fine tune and iterate (a lot) to get solid performance I wish I had a standard process but it is hyper iterative.

@estebanpozo8702 2 years ago

@@NicholasRenotte thanks! :)

@udoysaha3086 2 months ago

Helped a lot.. Everything explained really well.. Thank you so much!

@mansikhamkar1479 2 years ago

Your tutorials are really very productive. Why don't you start your courses for tensorflow, python, etc . on some online learning platforms ?

@dinukii3332 2 years ago

Hi Nicholas! Thank you for your tutorial once again. Quick question, How can I change the code to access a folder that contains a dataset of videos without live capturing them? Really appreciate if you could give an answer :)

@NicholasRenotte 2 years ago

You could loop through each one of the videos by using os.listdir or the tensorflow dataset class then run it through the mp holistic pipeline!
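A minimal sketch of the `os.listdir` half of this suggestion, assuming the videos are stored in one folder per action. The folder layout, the file extensions and the `list_videos` name are assumptions for illustration and do not come from the video.

```python
import os
import tempfile

VIDEO_EXTENSIONS = (".mp4", ".avi", ".mov")  # assumed set of video file types


def list_videos(root):
    """Map each action folder under `root` to a sorted list of its video paths."""
    videos = {}
    for action in sorted(os.listdir(root)):
        folder = os.path.join(root, action)
        if not os.path.isdir(folder):
            continue  # skip stray files next to the action folders
        videos[action] = [
            os.path.join(folder, name)
            for name in sorted(os.listdir(folder))
            if name.lower().endswith(VIDEO_EXTENSIONS)
        ]
    return videos


# Demo on a throwaway folder tree with empty placeholder files.
with tempfile.TemporaryDirectory() as root:
    for action, name in [("hello", "a.mp4"), ("hello", "notes.txt"), ("thanks", "b.avi")]:
        os.makedirs(os.path.join(root, action), exist_ok=True)
        open(os.path.join(root, action, name), "w").close()
    found = list_videos(root)
    summary = {a: [os.path.basename(p) for p in paths] for a, paths in found.items()}
    print(summary)  # {'hello': ['a.mp4'], 'thanks': ['b.avi']}
```

Each returned path could then be opened with `cv2.VideoCapture(path)` instead of the webcam index, with the frames passed through the same holistic pipeline used for live capture.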

@dinukii3332 2 years ago

@@NicholasRenotte Thank u:)

@HannahCynthia-mu4ct 2 years ago

Heyy. Do you know the exact code to loop through video dataset?

@riadhaoufi9452 2 years ago

@@HannahCynthia-mu4ct i'm looking for it too, i hope he gets to answer up thank you so much for the video brother @Nicholas Renotte

@riadhaoufi9452 2 years ago

@@NicholasRenotte i'm so lost brother :(

@WJ-zq3xo 2 years ago

Great tutorial as usual, Nick! Learning a lot from you :D Did anyone try to use a set of videos instead of recording their own videos? If yes, what did you change in the code base? Kudos

@shrirampareek 2 years ago

Hey! I used some set of videos(26) and was able to get 92% on test dataset however when I tried doing the same gestures using webcam, I get same 4 classes all the time

@amessit10 2 years ago

@@shrirampareek can we implement this project for 26 letters? I am getting the error "list index out of range" when trying to do more than 3 actions

@neerajpatil7850 2 years ago

@@amessit10 Same for me ! Have you figured out the why the error ?

@amessit10 2 years ago

@@neerajpatil7850 No man, i closed this project coz i only need hand gestures not full body keypoints

@amessit10 2 years ago

hands occlude, so recognition fails

@MuhammadKamran-ow5vp 1 year ago

It was really a great tutorial on real time sign language detection.

@ayoobaboosalih3059 2 years ago

Amazing tutorial! Would love to see action recognition done with video transformers!!

@pritishmair9577 2 years ago

Is there a dataset available for this, which has more signs than these 3. If so it will be really great if someone could share it

@vaibhav607 2 years ago

Please, can you reply on the status of this?

@ahmedkalair9862 3 months ago

@pritishmair9577 did u find it

@felipepires3453 2 years ago

Hi Nicholas, thanks for the awesome tutorial! I've got 3 questions about the project, hope you don't mind helping me: 1. When training my model, I got 90%+ accuracy very quickly (150 epochs, more or less), but all of a sudden it dropped to 30% and more or less stabilized there for the rest of the run. How can I fix it? 2. If I want to add more signs after first training my model, will I have to re-train it? Or can I train just those specific signs separately? How do I do it? 3. Once the model is working fine, is it possible to attach the real-time script to an Android app?

@howcircle5530 2 years ago

I also wanna know about your 3rd question. 🤓

@NicholasRenotte 2 years ago

1. So accuracy never went back up? Try adding more data for each class depending on what's performing well/not well. 2. You can apply transfer learning, drop the final layer, add a new layer which has the same number of classes as your new signs then retrain 3. Yes, I haven't shown it here as it's probably a whole other video though!

@tigre1217 2 years ago

@@NicholasRenotte Hi Nick, can you elaborate more on the 2nd point? I was quite confused since it is my first deep learning project. Thanks!

@adriamasitoribio 10 months ago

@@tigre1217 hey! did you figure it out?

@arpanroy2892 2 years ago

Your every video slightly edited , directly goes in my cv 🤣🤣🤣🤣 , thanks for taking care of my future ❤❤❤

@NicholasRenotte 2 years ago

Hahaha, build that experience man and go getem!

@abhishripatil791 2 years ago

Thank you for this this helped me so much with my project esp making the dataset

@Gabbosauro 3 years ago

Hi Nicholas, I've been working on my thesis project about the quality of body movements and I encountered a problem with keras. I see that you feed in the first layer a sequence of constant 30 frames (1 second of video/list of mediapipe landmark object). In my case I have a variable number of frames (i.e. a video containing movements that lasts some 2 seconds (60 frames), some 2.5 seconds (75 frames), some 3 seconds (90 frames), etc., hence with different number of frames), how can I solve this? I looked around and people say that I can apply the so called "padding and masking" which takes the largest number of frames (longest video) and then add a special value to the others (padding) and after that somehow ignore/filter the special number later (masking). But this can't be applied to my case because I would like to have the freedom of variable number of frames during prediction. I hope you understand what I want to ask, otherwise let me know and I will try to clarify it as much as I can. Thank you!

@NicholasRenotte 2 years ago

AFAIK it's the only way to do it, unless you look at something like a sequence to sequence model (I think, don't quote me on that though lol). Padding would be the easiest approach. Set a fixed max length and fill out the frames without detections with a numpy array with zeroes!
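The padding approach described in this reply could look roughly like the sketch below. `pad_sequence` is a hypothetical helper; padding at the end (rather than the start) and truncating anything longer than the maximum are assumptions, since the reply only says to use a fixed max length and fill with zeros.

```python
import numpy as np


def pad_sequence(frames, max_len):
    """Zero-pad (or truncate) a (n_frames, n_features) array to max_len frames."""
    frames = np.asarray(frames, dtype=np.float32)
    n_frames, n_features = frames.shape
    if n_frames >= max_len:
        return frames[:max_len]  # assumption: drop frames beyond the limit
    padding = np.zeros((max_len - n_frames, n_features), dtype=np.float32)
    return np.concatenate([frames, padding])


# A 60-frame and a 90-frame clip with 4 features each (toy feature size).
short_clip = np.ones((60, 4))
long_clip = np.ones((90, 4))
batch = np.stack([pad_sequence(clip, 90) for clip in (short_clip, long_clip)])
print(batch.shape)  # (2, 90, 4)
```

After padding, every clip has the same shape, so the clips can be stacked into a single array for training, and the same function can be applied to incoming clips at prediction time.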

@Gabbosauro 2 years ago

@@NicholasRenotte Thank you for the reply! Will the padding influence much the classification? I mean if video1 with movementA lasts 3 seconds and video2 with movementA lasts 1 second + 2 seconds of zeroes, would that cause problems during prediction or do you think it will work well?

@NicholasRenotte 2 years ago

@@Gabbosauro I would prototype and see the impact first. Kinda hard to say without seeing benchmark results.

@Gabbosauro 2 years ago

@@NicholasRenotte Alright, I'll test it out. Thanks!

@Gabbosauro 2 years ago

@@NicholasRenotte What I did and it seems starting to do the training is setting input_shape=(None, number_of_features) so time_steps set as "None" instead of 30, and during model.fit() I give it a batch_generator. ( based on this reference: datascience.stackexchange.com/questions/48796/how-to-feed-lstm-with-different-input-array-sizes ) But sadly the accuracy chart doesn't look good, sometimes it is around 40-50%, sometimes it drops to 20%.
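One way to build the kind of batch generator mentioned here is to group sequences by length so that each batch is a regular array. This is a sketch of that idea only, not the commenter's actual code; `batches_by_length` and the toy data are hypothetical.

```python
from collections import defaultdict

import numpy as np


def batches_by_length(sequences, labels, batch_size):
    """Yield (x, y) batches in which every sequence has the same number of frames."""
    buckets = defaultdict(list)
    for seq, label in zip(sequences, labels):
        buckets[len(seq)].append((seq, label))
    for length in sorted(buckets):
        items = buckets[length]
        for start in range(0, len(items), batch_size):
            chunk = items[start:start + batch_size]
            x = np.stack([seq for seq, _ in chunk]).astype(np.float32)
            y = np.array([label for _, label in chunk])
            yield x, y


# Five toy clips of 60, 75 or 90 frames with 4 features each.
lengths = [60, 75, 60, 90, 75]
sequences = [np.zeros((n, 4)) for n in lengths]
labels = [0, 1, 0, 2, 1]
shapes = [(x.shape, y.shape) for x, y in batches_by_length(sequences, labels, 2)]
print(shapes)  # [((2, 60, 4), (2,)), ((2, 75, 4), (2,)), ((1, 90, 4), (1,))]
```

With the time dimension left as `None` in the input shape, batches may differ in length from one another as long as the sequences inside each batch match. A training API that expects an endless generator would need this wrapped in a loop that restarts it each epoch.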

@yousseffarhan8901 3 years ago

You can't imagine how much you helped me. Thank you so much 🙏🏼

@NicholasRenotte 3 years ago

🙏

@matts2581 10 months ago

Excellent instruction! TY very much for sharing! :)

@T-She-Go 2 years ago

Update: I managed to get an accuracy of 98% by changing the activation functions of the LSTM and Dense layers. 😌 Hope that this helps y'all who might be stuck on this. Hi Nic 😌 me again 😅 So I'm trying to use a new dataset of gestures and I can't seem to get an accuracy >20%. I have tried changing the learning rate, the optimiser, etc., but none of these work 🙈 Is there something that I am missing? Thank you in advance 🌸

@NicholasRenotte 2 years ago

How many gestures and how many classes? For really similar classes I'd suggest adding way more data in order to produce a more accurate model. Also, what activations did you change, curious?

@T-She-Go 2 years ago

@@NicholasRenotte I used 5 gestures, 2 were based on hand movements and 3 were based on head movements. I think I should've added more data because the prediction model could not tell the difference between all the head gestures x_x Also, I changed the ReLU activations to Sigmoid

@mahmudanajnin9367 2 years ago

thank you so much..using sigmoid function really worked for me!

@T-She-Go 2 years ago

@@mahmudanajnin9367 Yaaay :) I'm glad

@mahmudanajnin9367 2 years ago

@@T-She-Go can you tell me how to find out how many labels the confusion matrix is for? I have 5 classes in my project and yhat = [1, 0, 1, 1, 2, 0, 1, 0, 4, 3]. My confusion matrix gives 5 sets of arrays. I'm really confused. Is it related to the yhat value?
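On the question above: assuming the output comes from scikit-learn's `multilabel_confusion_matrix`, that function returns one 2x2 matrix per class, laid out as [[TN, FP], [FN, TP]], so five classes give five matrices. The sketch below reproduces that layout with plain NumPy to show where the numbers come from. The `ytrue` values are made up for illustration, since the comment only gives `yhat`.

```python
import numpy as np


def one_vs_rest_confusion(y_true, y_pred, n_classes):
    """Return an (n_classes, 2, 2) array of [[TN, FP], [FN, TP]] matrices."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    matrices = np.zeros((n_classes, 2, 2), dtype=int)
    for c in range(n_classes):
        is_c = y_true == c      # samples that really belong to class c
        said_c = y_pred == c    # samples the model assigned to class c
        matrices[c] = [
            [np.sum(~is_c & ~said_c), np.sum(~is_c & said_c)],
            [np.sum(is_c & ~said_c), np.sum(is_c & said_c)],
        ]
    return matrices


yhat = [1, 0, 1, 1, 2, 0, 1, 0, 4, 3]   # predictions from the comment
ytrue = [1, 0, 1, 2, 2, 0, 1, 0, 4, 3]  # hypothetical ground truth
per_class = one_vs_rest_confusion(ytrue, yhat, n_classes=5)
print(per_class.shape)        # (5, 2, 2)
print(per_class[1].tolist())  # [[6, 1], [0, 3]]
```

So the number of matrices reflects the number of classes, and each matrix treats one class as "positive" and all the others as "negative".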

@knd3846 2 years ago

Hi, first of all thanks for sharing the code for this brilliant work for free. Second, I am a beginner in Python, yet I have come quite far in running your code. At step 11 I am facing an error that keeps appearing, and I am exhausted right now because I have spent my whole day trying to find a solution for it. It keeps showing "TypeError: only size-1 arrays can be converted to Python scalars" after running the plt.imshow line. Please, I need help...

@xboxgaming4307 2 years ago

Facing same issue .. even i follow all of the same steps ... srsly i need help too

@safamunir1510 2 years ago

I'm having same issue in the coding ... please help us removing this error

@harryfeng4199 2 years ago

did u manage to figure it out?

@knd3846 2 years ago

@@harryfeng4199 nope.. I have tried so many different things but it's all in vain.. I am at my last step though..

@sowmyacheguri21 1 year ago

Hey! Did u figure it out?

@predoca46 1 month ago

31:06 I'm making a project for my school that looks like your project and works like yours. But I don't have enough knowledge to make it alone, so I'm watching your video to learn and complete it. Thanks for the video and sorry for my English haha. Sending a hello from Brazil 🇧🇷 😂

@DarkOceanShark 2 years ago

Thank you so much, Nick! Your video is fantastic and I have to say your method of teaching is top-notch mate. I am using your video for my project to interpret all the 26 letter signs in ASL. Could you please do me the favor of telling me how to train the model using an already available dataset instead of creating it ourselves, as is done in the video? Your help will be much appreciated. Even the suggestion of one of your videos where you use a dataset will suffice.

@amessit10 2 years ago

hiii were you able to do it for 26 letters? can you help me?

@idkidk1774 2 years ago

finally it worked

@idkidk1774 2 years ago

Sir how to increase accuracy

@mrmoody915 1 year ago

@@idkidk1774 create a for loop that trains the model each time; it then checks the accuracy, and if it is higher than the previous highest accuracy, the model is saved and the new highest accuracy is set

@mrmoody915 1 year ago

@@idkidk1774 Also, just increase the size of the dataset.

@aqsaqamar1634 1 year ago

@@mrmoody915 Can you please solve my error?

@mrmoody915 1 year ago

@@aqsaqamar1634 Which is?

@jeanpierrebravomendoza6470 3 months ago

I'm deaf, help.

@satyaranjansahoo8431 2 months ago

Use captions.

@redabenlekehal7271 2 years ago

Brilliant as expected

@LucasEloi 3 years ago

Nice work, thank you for the wonderful video!

@NicholasRenotte 3 years ago

Cheers @Lucas!

@ashurroganathan8632 3 years ago

Always great videos :). I have learned many things from you. Thx

@NicholasRenotte 3 years ago

Thanks so much @Ashur! So glad you're enjoying it!

@barithiachudhan3034 2 years ago

Hi Nicholas, it was such a wonderful implementation, and thanks for sharing it with us.

@NicholasRenotte 2 years ago

So glad you enjoyed it!!

@adityashinde6265 1 year ago

Wow!! Such a helpful video. Thank you very much.

@raneggg 2 years ago

Hi Sir. Could you please explain the basis of the model? Especially the part where you used multiple LSTM and Dense layers, as well as the number of units.

@sahanahiremath8945 1 year ago

This helped me sooo muchhhh! Thanks.

@danielwang5366 1 year ago

Awesome video, thanks a ton for making it, it really helps beginners learn hands-on! A question I had, is it possible to extract the keypoint values from just the hands? I found that the ML model was detecting words even when my hands weren't in frame (i.e. just moving my head or face).
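
One way to approach the hands-only question above, offered as a sketch rather than the video's code: build the feature vector from the two hand landmark sets only, and skip frames where no hand is detected. It assumes `results` is the object returned by MediaPipe Holistic's `process` call, which exposes `left_hand_landmarks` and `right_hand_landmarks` (each `None` when that hand is not detected). The helper names are hypothetical, and plain lists are used so the sketch runs without extra packages.

```python
from types import SimpleNamespace

HAND_LANDMARKS = 21  # MediaPipe reports 21 landmarks per hand


def hand_vector(hand):
    """Flatten one hand into [x0, y0, z0, x1, ...]; zeros if the hand was not detected."""
    if hand is None:
        return [0.0] * (HAND_LANDMARKS * 3)
    return [coord for lm in hand.landmark for coord in (lm.x, lm.y, lm.z)]


def extract_hand_keypoints(results):
    """Return 126 values: left hand followed by right hand, ignoring pose and face."""
    return hand_vector(results.left_hand_landmarks) + hand_vector(results.right_hand_landmarks)


def hands_visible(results):
    """True if at least one hand was detected in the frame."""
    return (results.left_hand_landmarks is not None
            or results.right_hand_landmarks is not None)


if __name__ == "__main__":
    # Fake results object standing in for real MediaPipe output.
    lm = SimpleNamespace(x=0.1, y=0.2, z=0.3)
    fake = SimpleNamespace(
        left_hand_landmarks=SimpleNamespace(landmark=[lm] * HAND_LANDMARKS),
        right_hand_landmarks=None,
    )
    print(len(extract_hand_keypoints(fake)), hands_visible(fake))  # 126 True
```

With this change the model's input shape has to match the new feature count (126 per frame here), and the training data has to be re-extracted the same way. Checking `hands_visible` before predicting is one possible guard against words being detected from head or face movement alone.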

@harrylee97625 3 years ago

Nicholas certainly deserves more views.

@NicholasRenotte 2 years ago

Awww, thanks @Harry. Much appreciated man!

@dantealonso7174 3 years ago

Thanks a lot for this content, I've been learning a lot, you are a god :)

@NicholasRenotte 3 years ago

Keep on learning my guy! Love that you're smashing them!

@alexandregagne4151 2 years ago

Very good video. Thank you for your hard work :) New subs

@Rohan_is_discovering 19 days ago

Someone just completed his internship with the help of your code and also got a certificate from an IT company

@chamangupta4624 2 years ago

Very good project, very well implemented.

@FLANCKE 1 year ago

Thanks for this concise and insightful tutorial! One question: isn't some normalization or standardization of the features/landmarks necessary before training the model? If not, could you explain why? Thanks!!

@memsofgamers9479 2 years ago

Best lecture 😍 Sir, will you please make a full video for beginners?

@aayushsingh6306 2 years ago

Hey Nicholas! Best wishes for the new year! Really like the content you make, keep it going!! I wanted to ask two questions. 1. Is it possible to get landmarks according to OpenPose keypoints instead of BlazePose (the one that MediaPipe uses by default) in MediaPipe? 2. If, say, I have 100 different sentences, so 100 different action sequences, would it be wise to implement it the way it's implemented in the video, or would you take a different approach? Just wanted to get some clarity on the concepts. Thanks!

@viswanthtorati8533 1 year ago

Yes, I too have a doubt about the second question you asked. We are actually doing a project on it with more words, but it is not working. So if you know the answer, please let me know so that it can help our project.

@leetabulo5172 1 year ago

@@viswanthtorati8533 Hi, which approach did you take in your project?

@kiddicode6897 2 years ago

Wow, thank you. I like all your videos. You are very intelligent.

@sazidshaik4577 3 years ago

Thanks for considering my comments and doing it with LSTM. Love you, and it's really good!

@NicholasRenotte 3 years ago

Anytime, it was a long time coming but it's here!!

@aashwinsharma1859 1 year ago

Wonderful video. Keep posting such great content.

@magma3683 2 years ago

Hey Nicholas, great tutorial, I really enjoy this content. Have you considered doing a video for a preview with Gradio? It would be a great help if you did an in-depth one much like this.

@ridwanelektro992 2 years ago

Amazing tutorial sir...

@ifrasaleem2041 2 years ago

Hi, Nicholas! Thank you for making this amazing video. Can you please make a video on how we can deploy this model in Flutter?

@meetvardoriya2550 3 years ago

Another biggeeeeeee on the heap! Amazing, sir ❤️🙏

@NicholasRenotte 3 years ago

YESSS! The big videos are quickly becoming my fav to make, lmk what you think @Meet Vardoriya!

@scottsobel5546 1 year ago

Great video Nicholas! I downloaded the notebook but am having issues with the different versions of the different packages. Which versions of the packages did you use? Thanks!

@hamednasr3078 2 years ago

I wish you recorded all your videos with the zoom and font size you use at 22:30, it is really great 🙂

@NicholasRenotte 2 years ago

Yeah I've gotta work out how to do it, I just can't code with that amount of zoom though @Hamed. Will see what I can do!

@eliashailu2857 3 months ago

Great work, thanks.

@Allooustad3awani 2 years ago

Thanks for your presentation.

@user-mh6ek3hv3k 6 months ago

Hello Nicholas, I really enjoyed this tutorial. I wanted to ask if there is a way to normalize the x, y and z coordinates so they are not dependent on their position in the frame.
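
One common way to make landmark coordinates independent of where the hand sits in the frame, sketched here as an assumption rather than anything shown in the video: subtract a reference landmark (the wrist, index 0 in MediaPipe's hand landmark ordering) from every point, then divide by the largest distance from that reference. The function name is hypothetical.

```python
import math


def normalize_hand(points):
    """Make (x, y, z) points independent of position and size in the frame.

    points: list of (x, y, z) tuples for one hand; index 0 is assumed to be
    the wrist, as in MediaPipe's hand landmark ordering.
    """
    if not points:
        return []
    ox, oy, oz = points[0]
    # Translate so the wrist sits at the origin.
    shifted = [(x - ox, y - oy, z - oz) for x, y, z in points]
    # Scale by the largest distance from the wrist so the hand spans a unit radius.
    scale = max(math.sqrt(x * x + y * y + z * z) for x, y, z in shifted)
    if scale == 0:
        return shifted  # all points coincide; nothing to scale
    return [(x / scale, y / scale, z / scale) for x, y, z in shifted]


if __name__ == "__main__":
    hand = [(0.2, 0.2, 0.0), (0.4, 0.2, 0.0), (0.2, 0.5, 0.0)]
    moved = [(x + 0.3, y + 0.1, z) for x, y, z in hand]
    # Both print (approximately) the same normalized coordinates.
    print(normalize_hand(hand))
    print(normalize_hand(moved))
```

This removes position and overall size but not rotation, and the same transform would have to be applied both when collecting training sequences and at prediction time.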

@joshgibson539 2 years ago

I really hope you continue this project.

@NicholasRenotte 2 years ago

I don't think I'm ever going to give this one up until I truly nail it. I feel like we're maybe 50ish percent of the way there. Still a TON of work to do.

@joshgibson539 2 years ago

@@NicholasRenotte I know it requires a lot of data and work. A project like this that helps people is always a great thing to be working on, and I'm glad to see you sticking to it. I really wish SignAll would just release their product instead of making it about money. Their database has, I have heard, 300,000+ labelled sign language hand symbol videos. I guess businesses and schools can request the software, but I just know they won't let just anyone touch it otherwise. That really depresses me. I have a cousin that I can never understand when he comes over, yet he understands me thanks to his hearing aid implant. It just sucks... and I think the world needs a solution that's not locked away.

@joshgibson539 2 years ago

@@NicholasRenotte Try requesting data from How2Sign's GitHub, 16,000 vocabulary words (srvk /how2-dataset). Just be sure to read their licensing terms before requesting it, if you do. Sorry, I don't know many good resources, I just want to see the project flourish.

@werehappy121 1 year ago

Hi Nicholas, thanks for this stellar tutorial. Is it possible to train multiple models at the same time?

@iamabdirazak 2 years ago

Hi Nicholas, first I have to thank you so much for your awesome tutorials. I'd like to ask how I can reduce the fluctuation in the predicted action. In your case it was close to zero, but in my case I get huge fluctuation, and sometimes even if I don't pose or do any trained action, it will still predict a wrong pose.

@angelgabrielortiz-rodrigue2937 2 years ago

This is such a cool video! Thank you so much. I do have two questions: 1. Are the keypoints normalized? I mean, do they have values relative to the image size, or are they raw values? 2. If I were to use static frames instead of sequences of frames, what type of layers could be used? I just want to detect static hand signs, which wouldn't need a sequence of frames.

@ruszjea 6 months ago

Hello, I'm working on a similar project. Did you find a solution for this?

@vanmuonha 1 year ago

Hi Nicholas! Your video is amazing, it helped me a lot. I have the following question: Can we use this method to identify human emotions through facial points? If yes, what points should be paid attention to? Thanks very much.

@manimass5267 1 year ago

Hello

@AyushGupta-kp9xf 2 years ago

Awesome tutorial as always, Nicholas! I am a huge fan! Please tell me what changes I should make if I am doing this with just one class (binary classification)?

@NicholasRenotte 2 years ago

Just train for one class, one label and one set of annotated videos.