How to Manage a Database with Python, Pandas & SQL - UPDATE a DB professionally [DO NOT SKIP THIS!] 

Algovibes · 121K subscribers
12K views

Published: 21 Oct 2024

Comments: 95
@thespacebantu7439 · 1 year ago
Thank you for being a great mentor in my life, you are the reason for my success!
@Algovibes · 1 year ago
Thank you so much. Pinned your comment as I really appreciate it!
@gunbotimap3726 · 1 year ago
Thank you very much! It is incredible! Millions of hours explained in ONE video! Thank you thank you thank you!
@SolidBuildersInc · 1 year ago
This was really a powerful solution instead of using an API for streaming live data. You get the best of both worlds: history data and real-time updates without hammering the API. The last thing is to have a rolling history of some length for certain timeframes, for example a rolling history of 20 years for monthly data, or 6 months for weekly data, etc. Thanks for sharing. I have it all working except the last idea in this post. It's really neat to see this working so nicely.
@Algovibes · 1 year ago
Cool man! Thanks for sharing your thoughts here. I am happy my videos could provide value for you.
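A minimal sketch of the rolling-history idea from the comment above, assuming a date-indexed table in a local SQLite file; the file, table, and column names are illustrative:

```python
import pandas as pd
from sqlalchemy import create_engine, text

engine = create_engine('sqlite:///stocks.db')  # hypothetical local DB file

def trim_history(table, years=20):
    # Keep a rolling window: drop every row older than the cutoff date.
    cutoff = (pd.Timestamp.now(tz='UTC') - pd.DateOffset(years=years)).strftime('%Y-%m-%d')
    with engine.begin() as conn:
        # Identifiers can't be bound parameters, hence the f-string for the table name.
        conn.execute(text(f'DELETE FROM "{table}" WHERE date < :cutoff'), {'cutoff': cutoff})

trim_history('AAPL_1mo', years=20)  # e.g. keep 20 years of monthly candles
```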
@fauziyahaya2124 · 1 year ago
Just gonna leave a comment here. Back when I was learning Python, I was thinking this would be the most efficient way for me to do backtesting. Now I got the tutorial. Thanks man
@Algovibes · 1 year ago
Thanks a lot for your comment my friend!
@diaboloshots · 1 year ago
Very very good video! I'm new to SQL and have been wondering how to approach this for quite some time! Thank you for the clear logical explanation 😄
@Algovibes · 1 year ago
Thanks a ton for your feedback Alex. Happy that it's useful for you.
@nguyenduyta7136 · 1 year ago
Best quality channel I've ever seen. Thank you 🙏
@Algovibes · 1 year ago
Thank you so much!
@kia4now · 1 year ago
My favourite youtuber strikes again!
@Algovibes · 1 year ago
Thanks for your support buddy
@grizthegriffdog9939 · 1 year ago
YES!!!! Can't wait to watch this, will do it later. Probably gonna answer all our questions! Thanks so much
@Algovibes · 1 year ago
Thanks for watching and leaving a comment mate. Appreciate it!
@Kr3m3rsVids · 1 year ago
Great tip excluding the last row in the crypto updating function, wouldn’t have thought of that myself!
@Algovibes · 1 year ago
Happy that you find it useful! Thanks a ton for watching and leaving a comment.
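For readers wondering what that tip refers to: the most recent candle is usually still forming, so the video appends everything except the last row to avoid storing an unfinished bar. A self-contained toy version of the idea:

```python
import pandas as pd

# Toy frame standing in for freshly pulled minute klines, oldest to newest.
idx = pd.to_datetime(['2024-01-01 00:01', '2024-01-01 00:02', '2024-01-01 00:03'])
new_data = pd.DataFrame({'close': [1.0, 2.0, 3.0]}, index=idx)

# The final row is the still-open candle, so it is dropped before appending.
finished = new_data[:-1]
print(finished)  # in the video, this is what goes to .to_sql(..., if_exists='append')
```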
@wei7360 · 1 year ago
Thank you so much. I've been going through your videos since day 1 and I have learned so much from you, more than from other schools like codeacademy and other "boot camp schools". I learn best when I am applying the skills to something applicable, and that is how I remember. Thank you for all that you do and I wish you keep on providing value! Much love my friend.
@Algovibes · 1 year ago
Thank you so much my mate. Means a lot to me reading that :-) Love the Pepe profile pic 🐸
@techtrader8434 · 1 year ago
I was just looking for a video like this, now I believe THE SECRET book concept. ❤️❤️
@rubenghidanac · 1 year ago
😅
@techtrader8434 · 1 year ago
@@rubenghidanac Maybe the YouTube algo is part of it. Everything's connected. Or maybe I'm just 😂 talking out of this world
@Algovibes · 1 year ago
:D thanks for watching mate
@mahmoodhajipour3680 · 1 year ago
I personally have a deep need for database-oriented videos. I really appreciate your kind efforts, please keep on making these videos. These topics are so useful to us and the way you teach is very enjoyable. If you added websockets to these DB videos it would be PERFECT. Thank you...
@Algovibes · 1 year ago
Thanks for your comment mate :-) I actually have something planned with DBs and websockets in the near future.
@mrrolandlawrence · 1 year ago
Great work. Been watching all your content since I subscribed. I'm working on something similar: websockets + a Postgres DB as a background service, then just using the pandas dataframe with the real-time updated table. I could have got Postgres to listen with a socket, but this way I can include the EMA / AMA and RSI calculations that I want in Python, which is a lot easier to do.
@Algovibes · 1 year ago
Awesome. Thanks a lot for your comment Roland. Appreciate it!
@riccardoronco627 · 1 year ago
For splits and dividends we could add a check on the LAST day in the database: compare the close saved on disk with the one we download (adjusted close). IF they differ, then there have been dividends, splits, etc., hence we should download the whole time series again and save it from scratch with the fresh data.
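A sketch of that check, assuming yfinance's classic column layout (Date, Close, Adj Close) and one table per ticker; the DB file name and the comparison threshold are illustrative:

```python
import pandas as pd
import yfinance as yf
from sqlalchemy import create_engine

engine = create_engine('sqlite:///stocks.db')  # hypothetical local DB

def needs_full_refresh(symbol):
    # Close stored on the last saved day in the DB.
    stored = pd.read_sql(f'SELECT Date, Close FROM {symbol} ORDER BY Date DESC LIMIT 1', engine)
    last_date, stored_close = stored['Date'].iloc[0], stored['Close'].iloc[0]
    # Freshly downloaded data (including Adj Close) for that same day.
    fresh = yf.download(symbol, start=last_date, auto_adjust=False, progress=False)
    fresh_close = float(fresh['Adj Close'].to_numpy().ravel()[0])
    # A mismatch hints at a split or dividend adjustment: rebuild the series from scratch.
    return abs(fresh_close - stored_close) > 1e-6
```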
@enzanto · 1 year ago
Oh lol, creating a local SQLite DB has shaved A LOT of time off my scripts xD Big thumbs up from me
@Algovibes · 1 year ago
Thanks mate. I personally prefer MySQL for bigger projects. I also have some videos on that if you are interested!
@enzanto · 1 year ago
@@Algovibes When I am done testing, I will spin up a MySQL instance in my Kubernetes cluster, have my scripts run on a daily schedule, and report to a Discord channel :)
@pietraderdetective8953 · 1 year ago
This is perfect, thanks for the video! May I ask a noob question? For financial data (stock prices, financial statements), what kind of DB should we use, SQL or NoSQL? I tested SQLite before but I got worried because it doesn't support async writes. Also, NoSQL stores like MongoDB can actually do SQL as well, but on the other hand SQL is difficult to scale horizontally. Due to the factors above, in my mind the NoSQL choice is a no-brainer... but listening to a lot of opinions, that's not the case, which confuses me. Also please please keep on making DB-oriented videos.. these contents are super awesome, especially in conjunction with financial and trading topics! Cheers!
@Algovibes · 1 year ago
Thanks a lot for your comment man. Appreciate it! Well, it's really hard to answer questions like that as it depends. I personally wouldn't use a NoSQL DB here, but my view is biased as I primarily work with SQL DBs both privately and professionally. But could you elaborate on the async topic? Does it not work because the database is locked, or what exactly are you referring to?
@pietraderdetective8953 · 1 year ago
@@Algovibes Sorry for my lack of understanding... I meant to say "it does not support concurrent writes", and yes, it's due to the locking contention. I've come across NewSQL (~ distributed SQL) recently and I'm trying the free tier of CockroachDB. Looks interesting as it has the best of both worlds: the structure of SQL and the horizontal scaling capability of NoSQL.
@grizthegriffdog9939 · 1 year ago
@@pietraderdetective8953 There is an async version of sqlite3 called aiosqlite3
@bestfriendbat · 4 months ago
Thanks for the video! Btw, how do you handle dividends and splits if you just append new rows into the DB?
@arturKrzyzak · 1 year ago
Is there any way to offset the time in the df/database to the timezone I am currently in? For example, in Amsterdam it will be UTC + 1 hour. I found a workaround, adding a line to the getdata function: frame.index = frame.index + pd.DateOffset(hours=1). This works for the pandas df, but I get this error when the crypto_SQL_updater function is executed: ValueError: Length mismatch: Expected axis has 0 elements, new values have 5 elements. So the dataframe is being created correctly, but for some reason it does not want to write to the SQL DB. Full code is:

```python
def getdata(symbol, start):
    frame = pd.DataFrame(client.get_historical_klines(symbol, '1m', start))
    frame = frame.iloc[:, :5]
    frame.columns = ['date', 'open', 'high', 'low', 'close']
    frame.set_index('date', inplace=True)
    frame.index = pd.to_datetime(frame.index, unit='ms')
    # frame.index = frame.index + pd.DateOffset(hours=1)
    frame = frame.astype(float)
    return frame

# df = getminutedata('DOGEUSDT', '2022-11-12').to_sql('DOGEUSDT', engine)
print(pd.read_sql('DOGEUSDT', engine))

def crypto_SQL_updater(symbol):
    max_date = pd.read_sql(f'SELECT MAX(DATE) FROM {symbol}', engine).values[0][0]
    print('MAX Date from DB before update is: ' + max_date)
    new_data = getdata(symbol, max_date)
    print(new_data)
    print('-----------------------------')
    new_rows = new_data[new_data.index > max_date]
    new_rows[:-1].to_sql(symbol, engine, if_exists='append')
    print(str(len(new_rows[:-1])) + ' new rows imported to DB')

crypto_SQL_updater('DOGEUSDT')
```
@Algovibes · 1 year ago
stackoverflow.com/questions/37614303/in-pandas-how-to-get-a-utc-timestamp-at-a-given-hour-from-a-datetime Should answer it, right? :p
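The gist of the linked answer, as a minimal sketch (the timestamps are arbitrary examples): treat the stored index as UTC and convert with tz_convert rather than adding a fixed offset, so DST is handled and comparisons against the DB's MAX(date) keep working:

```python
import pandas as pd

# A naive index as returned by pd.to_datetime(..., unit='ms'); Binance times are UTC.
idx = pd.to_datetime([1700000000000, 1700000060000], unit='ms')

# Mark it as UTC, then convert to the local zone (handles DST, unlike a fixed +1h).
local = idx.tz_localize('UTC').tz_convert('Europe/Amsterdam')
print(local)

# If a naive index is needed again, this keeps the Amsterdam wall time.
print(local.tz_localize(None))
```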
@stonyandroid5316 · 1 year ago
First of all, amazing content.... thanks as always! My question is regarding stocks which have stock splits and dividends that change the OHLC data historically (i.e., data already in the DB). If it's not too much trouble, could you talk a bit about that also? Thanks 🙏
@Algovibes · 1 year ago
Thank you very much mate! Very good question. I actually thought about covering that in this video. In some cases you might want to find rows which have changed and update them with new data. Also interesting, but I am not sure if it's worth making a video on that.
@borism4849 · 1 year ago
Must watch indeed. But what are the advantages of having a database? If you can pull all the data you need with the Yahoo APIs, why do you need to store it?
@Algovibes · 1 year ago
I touched on that topic at the beginning of the video: you don't want to exceed API limits. Furthermore, let's say you want to backtest 500 stocks. You would need to run 500 API requests, which is time-consuming.
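To illustrate the speed argument, a sketch assuming per-ticker tables already sit in a local SQLite file (the file name is illustrative): loading 500 tickers becomes a loop of local reads instead of 500 HTTP requests.

```python
import pandas as pd
from sqlalchemy import create_engine, inspect

engine = create_engine('sqlite:///stocks.db')  # hypothetical local DB file

# One local read per ticker instead of one API request per ticker.
frames = {table: pd.read_sql(table, engine)
          for table in inspect(engine).get_table_names()}
print(f'Loaded {len(frames)} tables from disk')
```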
@heakjosh8805 · 1 year ago
Need a TimescaleDB version! Thank you!
@Algovibes · 1 year ago
Thanks for the suggestion!
@Anonymus-el7dk · 1 year ago
Cool video! Do you know which performs better, a database or CSV? I currently have a Docker container running that reads a CSV and then appends the new data. I also delete the last entry first, since I'm pulling minute data as well. Thanks
@Algovibes · 1 year ago
Thanks :) As a blanket answer I'd say database, but it always depends a bit on the amount of data and on what exactly you're planning to do.
@aryanzaveri7764 · 1 year ago
Great video. How can you integrate an interval into the sqlimporter function?
@Algovibes · 1 year ago
You can just pull the interval instead. Or, if you want to apply multiple intervals, you have to add an argument for the interval and change the table names to something like BTC_1m. Hope that helps!
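A sketch of that suggestion, assuming python-binance's client.get_historical_klines and mirroring the video's getdata helper; the extra interval argument and the SYMBOL_interval table naming are the additions:

```python
import pandas as pd

def getdata(client, symbol, interval, start):
    # Same shape as the video's helper, with the interval passed through.
    frame = pd.DataFrame(client.get_historical_klines(symbol, interval, start)).iloc[:, :5]
    frame.columns = ['date', 'open', 'high', 'low', 'close']
    frame = frame.set_index('date')
    frame.index = pd.to_datetime(frame.index, unit='ms')
    return frame.astype(float)

def sql_importer(client, engine, symbol, interval, start):
    # One table per symbol/interval pair, e.g. BTCUSDT_1m vs BTCUSDT_1h.
    getdata(client, symbol, interval, start).to_sql(f'{symbol}_{interval}', engine,
                                                    if_exists='append')
```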
@ernest987987 · 1 year ago
Hey Algovibes, I think you should consider switching from SQLite to DuckDB, the speed difference is just astonishing
@Algovibes · 1 year ago
Hi buddy, I am working with MySQL and MSSQL mostly. Appreciate your suggestion tho!
@ujjwalkumar3905 · 1 year ago
Sir, very nice video, my previous problem is resolved. Sir, one problem is that with TvDatafeed (the TradingView API), you give a bar count instead of a start date. Please take this case and make a video on it. Thanks
@Algovibes · 1 year ago
Awesome. Happy to read! Do you have a reference for the tvdatafeed? Happy to cover that as well!
@AHA10005 · 1 year ago
Isn't there some better method for checking if a table already exists? I mean, try/except works well, but it may also suppress any other error that may occur when updating a table, and create a new one…
@Algovibes · 1 year ago
Sure. You can also do that with an exists query. Good thought!
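One way to make that explicit, sketched with SQLAlchemy's inspector instead of try/except (the DB file and table name are illustrative):

```python
from sqlalchemy import create_engine, inspect

engine = create_engine('sqlite:///stocks.db')  # hypothetical local DB

def table_exists(name):
    # Ask the database for its catalog instead of catching exceptions.
    return inspect(engine).has_table(name)

if table_exists('BTCUSDT'):
    print('append new rows to the existing table')
else:
    print('create the table from the full history')
```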
@arturKrzyzak · 1 year ago
Hi, I much appreciate that you took the effort to explore this topic. Actually, some time ago I tried to achieve this, but I took another approach (and failed, haha): I was trying to save minute data from a finance websocket live stream. The main issue was that the CPU was hammered all the time, with usage between 70-80% (on an i7 CPU), and my Raspberry Pi was under 99% load. So I dropped the idea. I can share the code (it's essentially your code from the unicorn_binance tutorial ;-)) if you would like to take a look and maybe improve on it.
@Algovibes · 1 year ago
Welcome man! Well it's time for another try then :p
@jonasprechtel8428 · 1 year ago
Great intro tutorial. I have always wondered why you stick to yfinance. I have moved to yahooquery as it's way more performant imho. What do you think?
@Algovibes · 1 year ago
Thanks a lot Jonas! Never heard of it. Got to check that out.
@rubenghidanac · 1 year ago
Thank you for the video! What IDE do you use?
@BOSprodz · 1 year ago
Jupyter notebook
@Algovibes · 1 year ago
Using Jupyter Notebook here. Privately and professionally I use a combination of VSC / Spyder, if you're interested!
@grizthegriffdog9939 · 1 year ago
Do you know of a function call in the Binance API that can retrieve the date/time of the very first (oldest) data point, in order to request all available 1-min klines from the beginning of any pair's price history?
@Algovibes · 1 year ago
Not exactly, but you can construct that. Just pull data back to before crypto was a thing and check the first row for each pulled crypto; with that you get the earliest tradable data for a bunch of cryptocurrencies.
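A sketch of that construction, assuming python-binance (whether get_historical_klines accepts a limit argument depends on the library version; the 2015 start date simply predates Binance itself):

```python
import pandas as pd

def first_available_date(client, symbol):
    # Ask for daily klines from before the exchange existed; Binance answers
    # with data starting at the first candle that actually exists.
    klines = client.get_historical_klines(symbol, '1d', '1 Jan, 2015', limit=1)
    return pd.to_datetime(klines[0][0], unit='ms')  # open time of the oldest candle
```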
@prakashtiwari7597 · 1 year ago
Sir, please make a video on making an ROC chart of any of the OHLC parameters
@Algovibes · 1 year ago
Thanks Prakash for the suggestion!
@atangbingana283 · 1 year ago
I'm looking for a video on how to create a finance API with real-time data. Do you have a video on this?
@Algovibes · 1 year ago
Yea, got a whole playlist on that. My cryptobot playlist. Be invited to check that out :-)
@philippecolin151 · 1 year ago
Brilliant!
@Algovibes · 1 year ago
Thanks buddy
@amitshewale1594 · 1 year ago
How can we add a new data column to this existing SQL database? Please guide, thanks
@NormanCrypt · 1 year ago
Nice work, thank you! Instead of using try and except, can't I use if_exists=['append' if symbol in engine.table_names() else 'create'][0]?
@NormanCrypt · 1 year ago
The above was just an example; the question was whether there's something wrong with using a list comprehension to get a parameter
@Algovibes · 1 year ago
Actually a pretty nice idea! Thanks for sharing :-)
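For reference, a runnable version of that pattern, with two caveats: pandas' to_sql only accepts 'fail', 'replace', or 'append' for if_exists (there is no 'create', and 'append' already creates a missing table), and engine.table_names() was removed in SQLAlchemy 2.0 in favour of the inspector. A plain conditional expression also works without the list wrapper:

```python
import pandas as pd
from sqlalchemy import create_engine, inspect

engine = create_engine('sqlite:///stocks.db')  # hypothetical local DB
df = pd.DataFrame({'close': [1.0, 2.0]})
symbol = 'BTCUSDT'

# Inline conditional instead of try/except.
df.to_sql(symbol, engine,
          if_exists='append' if inspect(engine).has_table(symbol) else 'replace')
```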
@amitshewale1594 · 1 year ago
If we do this in Google Colab, can we access the DB from a different Colab notebook?
@Algovibes · 1 year ago
Yes ofc!
@janstrw7505 · 1 year ago
Thank you for making this video. Do you also have a video where you describe how I can get a Python program on "localhost"? I see your Google Cloud environment but not how to set up a localhost. I want to start trying to create crypto bots :) Thank you!
@Algovibes · 1 year ago
Welcome mate. Can you elaborate on your question?
@janstrw7505 · 1 year ago
@@Algovibes In your browser it says localhost at the top. How can I create such an environment in which I can program on my network? How can I run a Python bot 24/7 without renting a server?
@jitkadyan · 1 year ago
Sir, great video again, very well explained. Sir, requesting you to please create a video on trailing stop losses with live market data and how to apply them. I will be very thankful, as I'm not finding details on the internet.
@Algovibes · 1 year ago
Thanks mate. Interesting suggestion!
@mahmoodhajipour3680 · 1 year ago
That was such a good and useful video, thank you, but it would be perfect if you had used a websocket to update the database. I would appreciate it if you added this content to your videos...
@Algovibes · 1 year ago
Thanks Mahmood. Will see what I can do!
@mahmoodhajipour3680 · 1 year ago
@@Algovibes love you ❤️
@hakimdz3358 · 1 year ago
How can Python's cx_Oracle dump a database to a file??? plz
@Algovibes · 1 year ago
Could you elaborate pls? Thx!
@hakimdz3358 · 1 year ago
@@Algovibes ????
@nkipa · 1 year ago
What about Binance?
@chigstardan7285 · 1 year ago
Use the Binance API or a wrapper for it.
@Algovibes · 1 year ago
Covered at the end of the video!
@chigstardan7285 · 1 year ago
@@Algovibes Hey, should I still use python-binance? Cause it hasn't been updated since April.
@Algovibes · 1 year ago
@@chigstardan7285 In the end it's your decision. I find it quite convenient and it does the job with a ton less code for me.
@chigstardan7285 · 1 year ago
@@Algovibes ok thanks.
@wei7360 · 1 year ago
Hey Algovibes, I have been doing trial and error but I keep getting a TypeError: Input type must be a str. (Maybe because I am using MySQL as my DB?) The problem seems to stem from here: max_date returns a numpy.datetime64('2023-08-28 ...') object, therefore I am unable to compare the times between max_date and new_data.index. Can anyone please kindly figure out a solution? Any help would be much appreciated!

```python
max_date = pd.read_sql(f'SELECT MAX(TIME) FROM BTCUSDT', engine).values[0][0]
print(max_date)
```
@wei7360 · 1 year ago
I solved the error! Thanks for challenging a beginner like me! If you're using MySQL, you must insert:

```python
max_date = pd.read_sql(f'SELECT MAX(TIME) FROM {symbol}', engine).values[0][0]
max_date_str = np.datetime_as_string(max_date, unit='ms', timezone='UTC')
print(max_date)
new_data = getdata(symbol, max_date_str)
new_rows = new_data[new_data.index > max_date]
```
@Algovibes · 1 year ago
BOSS!