This is a really powerful solution compared to using the API for streaming live data. You get the best of both worlds: historical data and real-time updates without hammering the API. The last thing would be to keep a rolling history per timeframe, for example a rolling 20 years for monthly data, or 6 months for weekly data, etc. Thanks for sharing. I have it all working except that last idea. It's really neat to see this working so nicely.
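A minimal sketch of that rolling-history idea, assuming interval-suffixed tables like AAPL_1mo with a "Date" column as in the video; the RETENTION mapping and prune() helper are hypothetical:

```
import pandas as pd
from sqlalchemy import create_engine, text

engine = create_engine('sqlite:///stocks.db')

# Hypothetical retention windows per interval, matching the comment's example
RETENTION = {'1mo': pd.DateOffset(years=20), '1wk': pd.DateOffset(months=6)}

def prune(table, interval):
    # Delete everything older than the retention window for this timeframe
    cutoff = pd.Timestamp.now('UTC').tz_localize(None) - RETENTION[interval]
    with engine.begin() as conn:
        conn.execute(text(f'DELETE FROM "{table}" WHERE "Date" < :cutoff'),
                     {'cutoff': str(cutoff)})

prune('AAPL_1mo', '1mo')
```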
Just gonna leave a comment here. Back when I was learning Python, I thought this would be the most efficient way for me to do backtesting. Now I've got the tutorial. Thanks man
Thank you so much. I've been going through your videos since day 1 and I have learned so much from you, more than from other schools like Codecademy and other "boot camp" schools. I learn best when I am applying the skills to something practical, and that is how I remember. Thank you for all that you do, and I hope you keep on providing value! Much love, my friend.
I personally have a deep need for database-oriented videos. I really appreciate your kind efforts, please keep on making these videos. These topics are so useful to us and the way you teach is very enjoyable. If you added websockets to these DB videos it would be PERFECT. Thank you...
Great work. I've been watching all your content since I subscribed. I'm working on something similar: websockets + a Postgres DB as a background service, then just using a pandas dataframe on the real-time updated table. I could have had Postgres listen on a socket, but this way I can include the EMA/AMA and RSI calculations that I want in Python, which is a lot easier to do.
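A sketch of that indicator step, assuming a 'close' column and a BTCUSDT table as in the video; the Postgres DSN is a placeholder:

```
import pandas as pd
from sqlalchemy import create_engine

# Placeholder DSN: replace with your own Postgres connection string
engine = create_engine('postgresql://user:pass@localhost:5432/markets')
df = pd.read_sql('BTCUSDT', engine)

# 20-period exponential moving average
df['ema20'] = df['close'].ewm(span=20, adjust=False).mean()

# Simple 14-period RSI (moving-average variant)
delta = df['close'].diff()
gain = delta.clip(lower=0).rolling(14).mean()
loss = (-delta.clip(upper=0)).rolling(14).mean()
df['rsi14'] = 100 - 100 / (1 + gain / loss)
```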
For splits and dividends we could add a check on the LAST day in the database: compare the close saved on disk with the one we download (adjusted close). If they differ, there have been dividends, splits, etc., so we should download the whole time series again and save it from scratch with the fresh data.
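A sketch of that check, assuming yfinance and one table per symbol with a "Date" column as in the video; needs_full_refresh and full_refresh are hypothetical helper names:

```
import numpy as np
import pandas as pd
import yfinance as yf
from sqlalchemy import create_engine

engine = create_engine('sqlite:///stocks.db')

def needs_full_refresh(symbol):
    # The last close we saved on disk
    stored = pd.read_sql(f'SELECT "Date", "Close" FROM "{symbol}" '
                         'ORDER BY "Date" DESC LIMIT 1', engine)
    last_date = pd.to_datetime(stored['Date'].iloc[0])
    # Freshly downloaded, split/dividend-adjusted close for that same day
    fresh = yf.Ticker(symbol).history(start=last_date, auto_adjust=True)
    return not np.isclose(stored['Close'].iloc[0], fresh['Close'].iloc[0])

def full_refresh(symbol):
    # History changed retroactively: rebuild the table from scratch
    hist = yf.Ticker(symbol).history(period='max', auto_adjust=True)
    hist.to_sql(symbol, engine, if_exists='replace')

if needs_full_refresh('AAPL'):
    full_refresh('AAPL')
```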
@@Algovibes When I am done testing, I will spin up a MySQL instance in my Kubernetes cluster, have my scripts run on a daily schedule, and report to a Discord channel :)
This is perfect, thanks for the video! May I ask a noob question? For financial data (stock prices, financial statements), what kind of DB should we use, SQL or NoSQL? I tested SQLite before but I got worried because it doesn't support async writes. Also, NoSQL DBs like MongoDB can actually do SQL as well, but on the other hand SQL is difficult to scale horizontally. Given the factors above, the NoSQL choice is a no-brainer in my mind... but listening to a lot of opinions, that's not the case, which confuses me. Also, please keep on making DB-oriented videos... this content is super awesome, especially in conjunction with finance and trading topics! Cheers!
Thanks a lot for your comment, man. Appreciate it! Well, it's really hard to answer questions like that, as it depends. I personally wouldn't use a NoSQL DB here, but my view is biased, as I primarily work with SQL DBs, both privately and professionally. But could you elaborate on the async topic? Does it not work because the database is locked, or what exactly are you referring to?
@@Algovibes Sorry for my lack of clarity... I meant to say "it does not support concurrent writes", and yes, it's due to lock contention. I've come across NewSQL (~ distributed SQL) recently and I'm trying the free tier of CockroachDB. It looks interesting, as it has the best of both worlds: the structure of SQL and the horizontal scaling capability of NoSQL.
Is there any way to offset the time in the df/database to the timezone I am currently in? For example, in Amsterdam it is UTC + 1 hour. I found a workaround by adding a line to the 'getdata' function: frame.index = frame.index + pd.DateOffset(hours=1). This works for the pandas df, but I get this error when the 'crypto_SQL_updater' function is executed: ValueError: Length mismatch: Expected axis has 0 elements, new values have 5 elements. So the dataframe is being created correctly, but for some reason it does not want to write to the SQL DB. Full code:

def getdata(symbol, start):
    frame = pd.DataFrame(client.get_historical_klines(symbol, '1m', start))
    frame = frame.iloc[:, :5]
    frame.columns = ['date', 'open', 'high', 'low', 'close']
    frame.set_index('date', inplace=True)
    frame.index = pd.to_datetime(frame.index, unit='ms')
    # frame.index = frame.index + pd.DateOffset(hours=1)
    frame = frame.astype(float)
    return frame

# df = getminutedata('DOGEUSDT', '2022-11-12').to_sql('DOGEUSDT', engine)
print(pd.read_sql('DOGEUSDT', engine))

def crypto_SQL_updater(symbol):
    max_date = pd.read_sql(f'SELECT MAX(DATE) FROM {symbol}', engine).values[0][0]
    print('MAX Date from DB before update is: ' + max_date)
    new_data = getdata(symbol, max_date)
    print(new_data)
    print('-----------------------------')
    new_rows = new_data[new_data.index > max_date]
    new_rows[:-1].to_sql(symbol, engine, if_exists='append')
    print(str(len(new_rows[:-1])) + ' new rows imported to DB')

crypto_SQL_updater('DOGEUSDT')
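The likely cause of that ValueError: with the +1h offset, MAX(date) stored in the DB runs ahead of Binance's UTC clock, so get_historical_klines returns an empty list and assigning 5 column names to a 0-column frame fails. A sketch of a fix, guarding the empty case and undoing the offset before calling the API (reusing the client/engine names from the comment's code):

```
import pandas as pd
from binance.client import Client
from sqlalchemy import create_engine

client = Client()
engine = create_engine('sqlite:///crypto.db')   # as in the video

def getdata(symbol, start):
    frame = pd.DataFrame(client.get_historical_klines(symbol, '1m', start))
    if frame.empty:  # no new candles yet -> return instead of crashing
        return frame
    frame = frame.iloc[:, :5]
    frame.columns = ['date', 'open', 'high', 'low', 'close']
    frame.set_index('date', inplace=True)
    frame.index = pd.to_datetime(frame.index, unit='ms') + pd.DateOffset(hours=1)
    return frame.astype(float)

def crypto_SQL_updater(symbol):
    max_date = pd.read_sql(f'SELECT MAX(date) FROM {symbol}', engine).values[0][0]
    # The DB stores local (offset) times; shift back to UTC for the API call
    start = str(pd.to_datetime(max_date) - pd.DateOffset(hours=1))
    new_data = getdata(symbol, start)
    if new_data.empty:
        return
    new_rows = new_data[new_data.index > pd.to_datetime(max_date)]
    new_rows[:-1].to_sql(symbol, engine, if_exists='append')
```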
First of all, amazing content... thanks as always! My question is regarding stocks that have splits and dividends, which change the OHLC data historically (i.e., data already in the DB). If it's not too much trouble, could you talk a bit about that as well? Thanks 🙏
Thank you very much, mate! Very good question. I actually thought about covering that in this video. In some cases you might be interested in finding rows that have changed and updating them with new data. Also interesting, but I am not sure if it's worth making a video on that.
A must-watch indeed. But what are the advantages of having a database? If you can pull all the data you need with the Yahoo APIs, why do you need to store it?
I touched on that topic at the beginning of the video: you don't want to exceed API limits. Furthermore, let's say you want to backtest 500 stocks. You would need to run 500 API requests, which is time-consuming.
Cool video! Do you know which performs better, a database or a CSV? I currently have a Docker container running that reads a CSV and then appends the new data. I also delete the last entry beforehand, since I pull minute data as well. Thanks
You can just pull the interval instead. Or, if you want to work with multiple intervals, you have to add an argument for the interval and change the table names to something like BTC_1m. Hope that helps!
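A sketch of that suggestion: parameterize the interval and key the table name on it. The store() helper name is hypothetical; the rest mirrors the video's getdata pattern:

```
import pandas as pd
from binance.client import Client
from sqlalchemy import create_engine

client = Client()                      # public klines need no API keys
engine = create_engine('sqlite:///crypto.db')

def getdata(symbol, interval, start):
    frame = pd.DataFrame(client.get_historical_klines(symbol, interval, start))
    frame = frame.iloc[:, :5]
    frame.columns = ['date', 'open', 'high', 'low', 'close']
    frame.set_index('date', inplace=True)
    frame.index = pd.to_datetime(frame.index, unit='ms')
    return frame.astype(float)

def store(symbol, interval, start):
    table = f'{symbol}_{interval}'     # e.g. BTCUSDT_1m, BTCUSDT_1h
    getdata(symbol, interval, start).to_sql(table, engine, if_exists='append')

store('BTCUSDT', '1m', '2023-01-01')
store('BTCUSDT', '1h', '2023-01-01')
```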
Sir, very nice video, my previous problem is resolved. Sir, one problem is that with TvDatafeed (the TradingView API), you request bars by bar count instead of a start date. Please take up this case and make a video on it. Thanks
Isn't there some better method for checking whether the table already exists? I mean, try/except works well, but it may also suppress any other error that occurs when updating the table and then wrongly create a new one...
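One way to avoid the bare try/except is to ask SQLAlchemy's inspector whether the table exists (available in SQLAlchemy 1.4+); a minimal sketch:

```
from sqlalchemy import create_engine, inspect

engine = create_engine('sqlite:///crypto.db')

def table_exists(name):
    # Queries the database schema instead of catching arbitrary errors
    return inspect(engine).has_table(name)

print(table_exists('BTCUSDT'))
```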
Hi, much appreciated that you took the effort to explore this topic. Actually, some time ago I tried to achieve this, but I took another approach (and failed, haha): I was trying to save minute data from a finance websocket live stream. The main issue was that the CPU was hammered the whole time, with usage between 70-80% (on an i7 CPU), and my Raspberry Pi was under 99% load. So I dropped the idea. I can share the code (it's essentially your code from the unicorn_binance tutorial ;-) if you would like to take a look and maybe improve on it.
Great intro tutorial. I have always wondered why you stick to yfinance. I have moved to yahooquery, as it's way more performant IMHO. What do you think?
Do you know of a function call in the Binance API that can retrieve the date/time of the very first (oldest) data point available, in order to request all available 1m klines from the beginning of any pair's price history?
Not exactly, but you can construct that: just pull data back to before cryptos were a thing and check the first row of each pulled crypto. With that you get the first tradable date for a bunch of cryptocurrencies.
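A sketch of that approach with python-binance: request a single 1m kline with a startTime far in the past, and Binance returns the earliest candle it has, whose open time marks the start of the pair's history. The first_kline_date helper name is hypothetical:

```
import pandas as pd
from binance.client import Client

client = Client()  # public endpoints, no API keys needed

def first_kline_date(symbol):
    # startTime=0 asks for data "since the epoch"; Binance answers with
    # the oldest kline it actually has for this pair
    kline = client.get_klines(symbol=symbol, interval='1m',
                              startTime=0, limit=1)
    return pd.to_datetime(kline[0][0], unit='ms')

print(first_kline_date('BTCUSDT'))  # e.g. 2017-08-17 for BTCUSDT
```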
Nice work, thank you! Instead of using try/except, can't I just use ['append' if symbol in engine.table_names() else 'create'][0] for the if_exists argument?
Thank you for making this video. Do you also have a video where you describe how I can get Python running on "localhost"? I see your Google Cloud environment, but not how to set up a localhost. I want to start trying to create crypto bots :) Thank you!
@@Algovibes In your browser it says localhost at the top. How can I create such an environment to program in on my own network? And how can I run a Python bot 24/7 without renting a server?
Sir, great video again, very well explained. Sir, I request you to please create a video on trailing stop losses with live market data and how to apply them while the market is live. I would be very thankful, as I'm not finding details on the internet.
That was such a good and useful video, thank you! But it would be perfect if you had used a websocket to update the database. I would appreciate it if you added this content to your video...
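A hedged sketch of that websocket idea using python-binance's ThreadedWebsocketManager (not covered in the video): append each *closed* 1m candle to the symbol's table as it arrives.

```
import pandas as pd
from binance import ThreadedWebsocketManager
from sqlalchemy import create_engine

engine = create_engine('sqlite:///crypto.db')

def handle_kline(msg):
    k = msg['k']
    if not k['x']:            # only store the candle once it has closed
        return
    row = pd.DataFrame([{'open': float(k['o']), 'high': float(k['h']),
                         'low': float(k['l']), 'close': float(k['c'])}],
                       index=[pd.to_datetime(k['t'], unit='ms')])
    row.index.name = 'date'
    row.to_sql(msg['s'], engine, if_exists='append')

twm = ThreadedWebsocketManager()
twm.start()
twm.start_kline_socket(callback=handle_kline, symbol='BTCUSDT', interval='1m')
twm.join()                    # block so the stream keeps running
```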
Hey algovibes, I have tried trial and error but I keep getting a TypeError: Input type must be a str. (Maybe because I am using MySQL as my DB?) The problem seems to stem from here: max_date returns a numpy.datetime64('2023-08-28 ...') object, so I am unable to compare the times between max_date and new_data.index. Can anyone please kindly figure out a solution? Any help would be much appreciated!

max_date = pd.read_sql(f'SELECT MAX(TIME) FROM BTCUSDT', engine).values[0][0]
print(max_date)
I solved the error! Thanks for challenging a beginner like me! If you're using MySQL, you must insert:
```
max_date = pd.read_sql(f'SELECT MAX(TIME) FROM {symbol}', engine).values[0][0]
max_date_str = np.datetime_as_string(max_date, unit='ms', timezone='UTC')
print(max_date)
new_data = getdata(symbol, max_date_str)
new_rows = new_data[new_data.index > max_date]
```