Fantastic tutorial! Some of the libs are a bit old now. I got it working on Lambda Stack with the following changes: 1. Use the latest tensorflow-gpu and tensorflow 2. Change "stable-baselines" to "stable-baselines3" 3. Change "MlpLstmPolicy" to "MlpPolicy". Cheers
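For reference, the changed lines looked roughly like this for me (a sketch, assuming the A2C setup from the video; no guarantees for other versions):
```
# pip install stable-baselines3 gym-anytrading
from stable_baselines3 import A2C

# stable-baselines3 has no MlpLstmPolicy for A2C, hence MlpPolicy
model = A2C('MlpPolicy', env, verbose=1)
model.learn(total_timesteps=100000)
```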
Please keep making videos! You're a real treasure explaining RL so well! I'm just learning it at school and you really just helped me understand a lot of it. Thank you!
Just so everyone knows, you need to add df.sort_index() so the data isn't reversed. The model is training and predicting on reversed data. Gym-anytrading does not automatically sort by the date index.
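For anyone copy-pasting, it's one line before the env is built (matching the env setup from the video):
```
# gym-anytrading reads rows in index order, so sort by date first
df = df.sort_index()
env = gym.make('stocks-v0', df=df, frame_bound=(5, 100), window_size=5)
```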
Done, will add it to the list. I enjoyed getting back into the finance side of things. What do you think about extending out the trading RL series a little more?
Extremely interesting and easy to understand! I'd like to learn more about the pros and cons of other gym trading environments. Thanks for all the time you spend producing these tutorials. You're helping a lot of people like me.
Definitely @Vincent. Glad you enjoyed the video. Will find some other trading envs that we can take a look at; I already had another one on my list that I've tested.
Nicholas, your work is impressive and the community is growing. Perfect. The community can always expect a useful set of instructions and information about AI, and now about how to model and teach RL agents. RL is really awesome but rather abstract, so it requires a lot of studying. Your effort in promoting this branch of AI is noticeable. Thanks also for the stable-baselines tips. Have a nice day.
@@blockchainbacker.4740 Hello Nicholas, I do not have any problem with investments since I invest only in education. You provide some of the best content on YT. Thanks
Heya @Markus, that was someone impersonating me. They were blocked from the channel, click the user name in case it looks weird next time! But as always, thank you soooo much for checking out the videos. The new RL trading one is out as well: ru-vid.com/video/%D0%B2%D0%B8%D0%B4%D0%B5%D0%BE-q-Uw9gC3D4o.html
@@NicholasRenotte I can't really understand how someone can be "impolite" like that. Your YT channel is awesome and the mini RL series I'm following is very impressive. You show some very innovative ways RL can be deployed. Good luck with your innovative way of thinking.
Can you do an update on this? tensorflow 1.15.0 is not available anymore, and it seems so much has changed that I just cannot get this to work with tensorflow 2.
How do I force it to not make short trades, though? I only want to train it to make long trades. Also, I would want it to sell before it tries to buy again (buy, then sell). Is this configurable with gym-anytrading?
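Edit: as far as I can tell, gym-anytrading only models Short/Long positions (there's no flat state), so there is no built-in long-only switch. The closest workaround I've tried is zeroing out reward earned while short so the agent learns to avoid shorting; an untested sketch relying on StocksEnv internals, so no guarantees:
```
from gym_anytrading.envs import StocksEnv, Positions

class LongRewardOnlyEnv(StocksEnv):
    # keep the short mechanics, but let only long positions earn reward
    def _calculate_reward(self, action):
        step_reward = super()._calculate_reward(action)
        if self._position == Positions.Short:
            return 0.0  # nothing accrues while short
        return step_reward
```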
Very easy to understand; you're very talented at teaching and explaining. Thank you for your effort. I don't know whether this was already answered by someone; if so, please ignore this or use it for your own implementations (no warranty). But here is my implementation of a callback that stops training when the explained variance is within a given threshold:
```
import numpy as np
from typing import Any, Dict

from stable_baselines.common import explained_variance
from stable_baselines.common.callbacks import BaseCallback


class ExplainedVarianceCallback(BaseCallback):
    def __init__(self, minThreshold=0.95, maxThreshold=1.15, verbose=0):
        super(ExplainedVarianceCallback, self).__init__(verbose)
        self.minThreshold = minThreshold
        self.maxThreshold = maxThreshold
        self.values = []
        self.rewards = []
        self.window_size = 5

    def on_training_start(self, locals_: Dict[str, Any], globals_: Dict[str, Any]) -> None:
        super(ExplainedVarianceCallback, self).on_training_start(locals_, globals_)
        # pick up the env's window size so the rolling buffers match it
        self.window_size = locals_['self'].env.get_attr("window_size")[0]

    def on_step(self) -> bool:
        # keep rolling windows of the latest value estimates and rewards
        self.values = self.values[1 if len(self.values) > self.window_size else 0:] + list(self.locals['values'])
        self.rewards = self.rewards[1 if len(self.rewards) > self.window_size else 0:] + list(self.locals['rewards'])
        ev = explained_variance(np.array(self.values), np.array(self.rewards))
        if self.minThreshold < ev < self.maxThreshold:
            return False  # returning False stops training
        return True  # otherwise keep going
```
To use this, pass an instance to model.learn(). You can set custom thresholds via the minThreshold and maxThreshold arguments of __init__. It's not very beautiful, I'm still a newbie to Python :) HTH
Would be cool if you could throw some basic technical indicators like RSI and OBV into the df to see if it helps the agent. Also, I'd appreciate it if you could add callbacks to stop training at the optimum level. Keep up the good work, and looking forward to the next vid of this series :)
Thank you for such clear, hands-on tutorials on reinforcement learning. I have a couple of questions, though. I've learned elsewhere that an RL agent requires a trade-off between exploration and exploitation. I didn't see this specifically mentioned in this video. Is there a reason for that? Perhaps it's not advisable to use any exploration/exploitation trade-offs in trading algorithms, or maybe this specific RL model doesn't support it. I would appreciate it if you could help me understand these considerations. Additionally, I would love to see an example of an RL agent being trained with new data while in operation. I believe the official terminology is "online learning" or "continual learning." Please consider making a video that covers that topic as well.
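(Partially answering my own question after some digging, so treat this as an assumption rather than an authoritative answer: with stable-baselines' on-policy algorithms like the A2C used here, exploration happens implicitly because the policy is stochastic, and an entropy bonus keeps it from committing too early. The ent_coef knob controls that trade-off:)
```
from stable_baselines import A2C

# Higher ent_coef = more exploration, lower = more exploitation.
# 0.01 is the library default; the value below is just an experiment.
model = A2C('MlpLstmPolicy', env, ent_coef=0.05, verbose=1)
```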
If someone has a strange "gym.logger" error, try installing gym==0.12.5 👍 Please, Mr. Renotte, think about future-proofing dependencies. In any case, great video 👍
Hey man, your videos are great. Please continue with reinforcement learning so that we can learn to apply RL algorithms. I have been searching for applications of RL algorithms, but they are hard to come by, and I finally found your videos. Really great videos you make. Keep going, please.
Would really love to see a more in-depth video; heck, would love to see more videos from you. I am learning a lot, so thank you! New subscriber here, and I've been binge-watching your stuff. Good work!
Could you please explain in depth the best way to get the explained_variance near 1? And maybe also talk about overfitting of the model and how it performs using a CV time split. Thank you Nicholas, really good content!
Is there any method by which we can plot the rewards vs. episodes graph? Your videos are really helpful for learning reinforcement learning. Thank you.
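Edit: in case anyone else wants this, here's the rough loop I ended up with (assumes the trained model and env from the video, no guarantees):
```
import matplotlib.pyplot as plt

episode_rewards = []
for episode in range(50):
    obs = env.reset()
    done, total = False, 0.0
    while not done:
        action, _states = model.predict(obs)
        obs, reward, done, info = env.step(action)
        total += reward
    episode_rewards.append(total)

plt.plot(episode_rewards)
plt.xlabel('Episode')
plt.ylabel('Total reward')
plt.show()
```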
Excellent video! I'm wondering if you could add a video on portfolio optimization using RL in the future? This would be more related to a real-world trading environment, where one needs to determine the allocation ratio among multiple assets...
Thanks for the video, nicely explained and very interesting! Do you have any video or information about adding sentiment to this reinforcement learning model?
RL is hopeless in trading, guys. At least for me. I've been working on it for weeks with stable baselines and my custom trading environment. The step system, rewarding strategy, observation approach... these are very challenging topics. The results are so unstable and far away from classic algo-trading profits. Just find some good old indicator and optimize its parameters with Bayesian optimization; the scikit API is easy and effective. At least you will get some meaningful results... RL is not ready for the market. At least for me :) And thanks for the video. I will check that anytrading environment, without hope :)
I think it's still a while away. TBH I'm going to do a ton more research into it over the year as I've read some promising papers. Always room for improvement though!
I haven't used an agent like this, but I wonder how you know if the model has overfit the data when you go forward? I also wonder how a bot like this is supposed to handle anomalous events.
I think it might have been, tbh @Varuna. We're going to be handling it properly when we build up our bot for the ML trader series, with better backtesting!
@@NicholasRenotte Love to hear that! BTW, just saw I was at the same company as you ;). Pinged you on social media, if you are interested in connecting.
Hi Nicholas, excellent explanations, really commendable. Could you please make a video on using callbacks? Looking forward to more knowledge sessions from you!
Hello Nicholas! First of all, thank you for your videos; they are really interesting! Can you explain in a little more detail the meaning of the explained variance and the value_loss? It would help to understand why they need to be close to 1 and 0 respectively. Do you also have an automatic way to select the best moment to stop the fitting? Thank you in advance.
Great video, highly educational. My guess is that the granularity was not sufficient, as things can change wildly during the day and between market opens. Minute granularity in 5-minute trading blocks may have worked better.
Great lecture, it really helped me a lot. Can you please explain what the different states in the environment are, and what the Y axis in the final performance plot is? Thank you so much again for the wonderful tutorial.
I am totally won over by the simple and elegant style with which you explain the algorithm. Superb. Hats off to your videos. I'm using your code to train myself in trading. Could you please explain how to feed in any data and either buy, sell, or hold, with a few examples using a few stocks like Apple, Google, MSN, Tesla, etc., and share the code in Google Colab? Is that possible? Anyway, congrats on your great endeavor of introducing the stock market to the public. No one shares such things, because money matters. I am impressed. Thanks a lot, Nicholas. May God bless you.
Awesome work. This would be a good series for you to build out with other features and all the things you're more than happy to expand on :) Also looked at other videos you have made; very clear and great videos!
Thanks so much @stockinvestor1, I'm definitely going to do more in this space. So far three more tutorials planned for RL trading but open to any additional ideas!
Ikr @FuZZbaLLbee, I think I’d just be creating RL agents 24/7 chasing alpha if there was the chance of predicting them! Was genuinely curious how a quick RL agent was going to perform in this case. Definitely will do a little work on exploring integration with external data sources e.g. forecasting reddit sentiment and bringing it as a leading signal.
Hi Nicholas, love your videos. I probably learn more watching your YouTube videos than from my data science classes back in college. I just wanted to know if we can add plain-vanilla indicators such as the 200 EMA, MACD, and RSI as input vectors to train our reinforcement learning algorithm. These indicators could improve the performance of the algorithm.
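Edit: in case it helps anyone, this is how I added them with plain pandas before building the env. These are the simple textbook formulas (the RSI below uses a plain rolling mean rather than Wilder smoothing), so double-check the periods and variants for your use case:
```
# 200 EMA, 14-period RSI and MACD, all from the Close column
df['EMA200'] = df['Close'].ewm(span=200, adjust=False).mean()

delta = df['Close'].diff()
gain = delta.clip(lower=0).rolling(14).mean()
loss = -delta.clip(upper=0).rolling(14).mean()
df['RSI'] = 100 - 100 / (1 + gain / loss)

ema12 = df['Close'].ewm(span=12, adjust=False).mean()
ema26 = df['Close'].ewm(span=26, adjust=False).mean()
df['MACD'] = ema12 - ema26

df = df.dropna()  # drop the indicator warm-up rows
```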
Has anyone else noticed that when he runs evaluations, he gets a different total profit? See 34:53 and 35:55, where he reruns the evaluation. The frame_bounds are the same (90, 110) for both evals. When evaluating the same trained model on the same range of data, shouldn't one expect the same total profit?
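(My guess: model.predict() samples from the policy distribution by default, so two runs can legitimately differ on identical data. If stable-baselines is in play, forcing greedy actions should make evaluations repeatable:)
```
# deterministic=True picks the highest-probability action every time
action, _states = model.predict(obs, deterministic=True)
```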
Great content, Nicholas! I have a small doubt. While the dataframe has different prices (Open, High, Low and Close), which price does the model use? How does it decide which one to use?
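(From a quick look at the gym-anytrading source, so double-check against your installed version: the default StocksEnv appears to use only the Close column, both for prices and for the two signal features it derives from them.)
```
# Simplified from gym_anytrading/envs/stocks_env.py (may vary by version)
def _process_data(self):
    prices = self.df.loc[:, 'Close'].to_numpy()
    prices = prices[self.frame_bound[0] - self.window_size : self.frame_bound[1]]
    diff = np.insert(np.diff(prices), 0, 0)
    signal_features = np.column_stack((prices, diff))
    return prices, signal_features
```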
@@NicholasRenotte I'm just learning. I was on an expensive algotrading course, but the guy didn't explain as well as you do. Since you're an IBM guy, I was wondering if you could give us a course on Qiskit and Quantum Machine Learning please. Thank you.
@@ButchCassidyAndSundanceKid oooooh now you're talking my language! Been keeping that one a secret but I've definitely got something coming in that space 😉
@@NicholasRenotte My understanding of Qiskit and Quantum Machine Learning is still very limited; all these Hadamard gate and Shor's algorithm concepts are very abstract and not easy to grapple with. And without a full understanding, one cannot proceed to Quantum Machine Learning. I have just finished watching your Reinforcement Learning series; it taught me a different way of writing RL code (I used to use tf.agent, which is very clunky and difficult to understand). Thank you Nicholas. Keep up the momentum. Looking forward to more of your tutorials.
I just wonder if the "Estimation Part" is correct. You use ``` env = gym.make(...) obs = env.reset() ``` but did you load the previously trained model? When that code runs, `env` is replaced with a new one, right? Also, you interrupted the training part with a KeyboardInterrupt; does the model keep the progress it had made up to that point?
Awesome! You explain very well!! Do you have a tutorial where you add TA (RSI, MACD, BB, etc.) to the chart to see whether the agent performs better, compared with trading on TA alone?
This is a great video. Good work, @Nicholas! Is it possible to save the developed model as a pickle file? E.g., if we need to deploy the model in production.
Normally you just save the trained model weights as an h5 and reload them into Keras! I show how to do it with stable baselines in the RL Course on the channel.
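A minimal sketch with stable-baselines (it writes a zip rather than a pickle, but serves the same deployment purpose):
```
from stable_baselines import A2C

model.save('a2c_trading_model')        # writes a2c_trading_model.zip
model = A2C.load('a2c_trading_model')  # reload later, e.g. in production
```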
Found a MISTAKE! Just take a calculator and check the last chart with buy and sell signals. Assume that each time we buy or short 1 share. Then, by the last trade, our approximate balance is -15, which means there was a loss, while the model wrongly reports it as a 5% gain...
@Nicholas Renotte, thank you, the video is very good, but may I ask a question? I'm still confused about part 1: when determining window_size=5 and frame_bound=(10, 100), will the frame bound display the best 90 days of data, or what? Thanks in advance for the explanation.
Great video mate, would love to see the in-depth video, but this does look like a method that is far too easy to overfit with. What do you think? Thanks again for the vid!
Yep, agreed! It probably didn't help that I picked a stock that had wild variance. Having early stopping will help, the model trained so fast it was hard to stop when it reached an appropriate level of EV. I'm going to pump out some more detailed videos on how to get something that is actually a little more stable working plus custom indicators, integration, etc. Thoughts?
@@NicholasRenotte That would be awesome! I agree with the custom indicators and integration :) Those videos would be really helpful; something like a moving average would be a great intro, I think. Do you know if you can do multi-asset trading using the RL environments that you're using?
@@btkb1427 got it, I'll add an SMA or something along those lines! I think you could have a multi-asset model, but from my (very brief) research it sounds like individual-asset models are more the norm. My best guess is that you would run individual-asset models under an overarching funds-management model!
I want to pass a few more features to the model along with the price, such as the market cap, the change in market cap, the difference between the 'high' price and the 'low' price, and maybe more to experiment with. How do I do this? Thanks in advance.
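Edit: found one way since posting. gym-anytrading lets you swap out _process_data on a subclass, so any columns you add to the df can become signal features. A sketch (the extra column names are just my examples; compute them yourself first):
```
from gym_anytrading.envs import StocksEnv

def add_signals(env):
    start = env.frame_bound[0] - env.window_size
    end = env.frame_bound[1]
    prices = env.df.loc[:, 'Close'].to_numpy()[start:end]
    # any extra columns you've computed on df can be listed here
    signal_features = env.df.loc[:, ['Close', 'MarketCap', 'HighLowSpread']].to_numpy()[start:end]
    return prices, signal_features

class MyCustomEnv(StocksEnv):
    _process_data = add_signals

env = MyCustomEnv(df=df, window_size=5, frame_bound=(5, 100))
```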
Note: Stable Baselines only works with TensorFlow 1. I'm going through tutorials on reinforcement learning, and it's hard to find something relatively current and easy to follow.
Definitely, I talked a little about it inside of this ru-vid.com/video/%D0%B2%D0%B8%D0%B4%D0%B5%D0%BE-Mut_u40Sqz4.html but agreed, I think we still need a bit more on it!
Hi Nicholas, should we perform a train/test split on the df, and if so, do stationary variables and feature standardization help with improving the model?
How would I go about saving/loading models? Let's say I want to implement a callback that saves the model with the highest explained variance, then load it and serve it new data as it comes in through the terminal. Is there a way to do this without using stable-baselines, using Keras instead?
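For the stable-baselines half of the question, EvalCallback can checkpoint the best model automatically (best by mean evaluation reward, not explained variance, so only a partial answer). A sketch:
```
from stable_baselines import A2C
from stable_baselines.common.callbacks import EvalCallback

# evaluates periodically and keeps the best-scoring checkpoint on disk
eval_callback = EvalCallback(env, best_model_save_path='./best_model/',
                             eval_freq=1000, deterministic=True)
model.learn(total_timesteps=100000, callback=eval_callback)

# later: reload the checkpoint and serve fresh observations
model = A2C.load('./best_model/best_model')
action, _states = model.predict(obs, deterministic=True)
```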
As I am new to all this and still trying to get my head around deep learning, I have been looking at maybe going with reinforcement learning. The only question I have: it seems we only use RL to train a given model and then use that in 'production'. What I'm looking for is something that will allow it to always be in learning mode, so it can adapt as the environment around it changes. Is RL the best way to go for this?
Heya @Jeff, you're going down the right path. From a deployment standpoint I've only shown inference. In production-like projects what you'll actually do is have continuous retraining and deployment. The process is the same, it just becomes a cycle. So for example, what you would do is:
1. Train your model
2. Deploy it to production
3. Overnight, or on a set schedule, bring in new data
4. Load your existing model and retrain it on the new data
5. Redeploy the model
@@NicholasRenotte What I really want to try is to give the robot reward areas and, from there, have it train itself on what to do next. This way the robot would evolve, more or less. The only thing I still need to look into is how to recharge it. In the end I'd give it a camera, 4 sensors (cliff sensors), a charger, and a pickup device like the Vector robot has. From there it learns to move and explore and whatnot, all the while knowing only that it's doing the right thing when it gets a reward. BUT yes, before I get to that level I need to first go back to your very first video and start at the very beginning. :) The only issue I'm having now is that I can't get my Jetson Nano (B01) with JetPack 4.5 fully installed and working. :(
I am trying to do RL for energy engineering; this tutorial is the closest I have found because it uses recorded data. Any recommendations or examples of code that tackle optimal solutions for energy storage, microgrids, refrigeration, etc.?
Ooooh, I'm actually looking at energy and resource opt at work rn. I don't have anything quite yet but this might help in terms of setting up the custom environment that will likely be needed: ru-vid.com/video/%D0%B2%D0%B8%D0%B4%D0%B5%D0%BE-bD6V3rcr_54.html
@@NicholasRenotte It would be interesting to do a principal component analysis on a huge database of indicators to see which are least correlated and bring the most variance to the data. That way you know you are using the right indicators (there are so many out there). Otherwise, use forecasting models as the indicators on which to base your trading strategy: for example, a logistic regression saying whether the price will go up or down, or an ARIMA checking the direction it forecasts the price will move. Once you have several confirmation indicators, enter the trade and then add proper risk management to the system. No idea, I am still trying to figure all of this out. I'm coming from TradingView's Pine Script, which is nowhere near what you can do in Python. I do have a bit of data science knowledge, but between studying it and actually applying it there is a lot of practice to be done.
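A starting point for the PCA idea with scikit-learn (indicator_df below is a hypothetical DataFrame holding one column per indicator):
```
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA

# standardize first so no single indicator dominates the components
scaled = StandardScaler().fit_transform(indicator_df)
pca = PCA(n_components=0.95)  # keep components covering 95% of the variance
components = pca.fit_transform(scaled)
print(pca.explained_variance_ratio_)
```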
@@NicholasRenotte For sure, though I think the whole crypto space is much more interesting, as there is plenty of data to choose from, especially from the blockchain itself, not just price and volume data. Just some food for thought; maybe that gives you some ideas for your next videos.
Hi, I am getting "ValueError: too many values to unpack (expected 4)" at the line n_state, reward, done, info = env.step(action) under "Build environment". Can anyone help me with this?
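(Likely cause, worth checking your gym version: newer gym/gymnasium releases return five values from step(), so the four-variable unpack fails. Something like this works there:)
```
n_state, reward, terminated, truncated, info = env.step(action)
done = terminated or truncated
```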
Unfortunately, by the end of part three I get the following error: AttributeError: 'gym_logger' has no attribute 'MIN_LEVEL'. I tried the solutions on Stack Overflow without success. Do you have any idea?
Can you add some of your Python trading rules to the code? Like: buy if the moving average is going up, sell short if the moving average is going down. At least control when it buys or shorts, under the most optimized conditions. Still, it's a good video.
Hi Nicholas, thanks for the great video. I want your advice: I want to make a crypto-prediction model and use it for real crypto trading. Should I go with traditional LSTM/RNN approaches, or should I go with RL? Thanks for any help.
Thanks so much @farhat. Yah, it's pretty easy to swap it out for alternate securities but I don't think you'd actually try to HFT something that's undergoing as wild price action as GME right now. Got some more videos planned for the RL trading series, anything you'd like to see?
@@NicholasRenotte Thank you Nicholas, I really like your content. I have some feedback: my first thought about ML, and specifically RL, was that we just need to provide the data and the goal and it will find the best model without huge knowledge on our side, but now I see that we need to tweak it to find the best results. Continuing with RL applied to trading: is there any way we can ingest additional data like some indicators (RSI, MA...)? Q: Do you use a NN in your environment here?
@@farhatsam8529 right on time, check this out: ru-vid.com/video/%D0%B2%D0%B8%D0%B4%D0%B5%D0%BE-q-Uw9gC3D4o.html yep we do use an NN here, it's natively integrated into the MlpLstmPolicy.
Hi, great video 😉! Do you run that in the cloud? Which computer do you have? Do you need an NVIDIA GPU for ML? I have a Lenovo laptop with an AMD Ryzen 5; can I run most models? Thanks for your help.
Thanks so much @Daniel. This is running on my local machine. It's using a 2070 Super and a Ryzen 7 3700X. The Ryzen 5 should be fine to run most models, just make sure if you're getting a GPU it's an NVIDIA one as it natively works with Tensorflow!
@@danielsilva3383 anytime! TBH, I'm not too up to speed on Laptops but I've found I'm able to handle most CV and RL tasks with a 2070 Super. If you plan on getting into hardcore NLP you might need something beefier.
@@joaobentes8391 niceeee, I saw an example of that a couple of nights ago using Gym Universe. Sounds awesome!! Would love to see a snippet once you've got it trained.
In addition, gym-anytrading is not directly compatible with stable-baselines3; I recommend readers follow the custom-environment steps in the stable-baselines3 docs to rebuild the environment.
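For anyone attempting that rebuild, the skeleton from the stable-baselines3 custom-environment docs looks roughly like this (older SB3 versions use the gym API shown below; newer ones expect gymnasium's five-value step, so adjust accordingly). The trading logic is a placeholder you'd fill in yourself:
```
import gym
import numpy as np
from gym import spaces

class CustomTradingEnv(gym.Env):
    """Bare skeleton; replace the placeholder reward with real trading logic."""
    def __init__(self, df, window_size=5):
        super().__init__()
        self.df = df
        self.window_size = window_size
        self.action_space = spaces.Discrete(2)  # e.g. 0 = sell, 1 = buy
        self.observation_space = spaces.Box(
            low=-np.inf, high=np.inf,
            shape=(window_size, df.shape[1]), dtype=np.float32)

    def reset(self):
        self._current_step = self.window_size
        return self._get_observation()

    def step(self, action):
        self._current_step += 1
        reward = 0.0  # placeholder: compute profit/loss here
        done = self._current_step >= len(self.df) - 1
        return self._get_observation(), reward, done, {}

    def _get_observation(self):
        window = self.df.iloc[self._current_step - self.window_size:self._current_step]
        return window.to_numpy(dtype=np.float32)
```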
Thanks for such great content! I am very interested in RL and robotics, and I have been working to build a robo-dog that uses RL to learn to walk and can recognize faces and understand gestures and voice commands. Could you please do a video on this kind of thing? :)
Kindly make a video on custom indicators, and create a trading bot video specifically for the technical and financial analysis used in daily trading of any cryptocurrency.
Hello, I am training an agent on a custom environment that I made, and I was wondering if there is any way to increase how often performance metrics like explained_variance and value_loss pop up during training.
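(If it's a stable-baselines model, learn() takes a log_interval argument; lowering it makes the metrics table print more often:)
```
# print the training table (explained_variance, value_loss, ...) every update
model.learn(total_timesteps=100000, log_interval=1)
```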