Query on the following case:
1) Every couple of minutes (e.g. every 15 or 30 mins), multiple files (20 to 30 files, sometimes more) are loaded into a folder.
2) Each file loaded into the folder triggers Cloud Run (Eventarc is set on that specific folder).
3) A Cloud Run instance is invoked for each file created in the folder.
4) Cloud Run reads that file, does the needed transformation on the data, and stores it in one specific BigQuery table.
My questions are:
1) If multiple Cloud Run instances are created (concurrency is not 1), how is it ensured that two instances do not work on the same file, since (as per my understanding) both will get an event on each file creation?
2) Is this handled internally by the cloud platform, or do we need to write custom code to handle this case?
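For what it's worth: Eventarc publishes one event per object-finalize, and each event is routed to a single Cloud Run request, so two instances don't normally receive the same file. Delivery is at-least-once, though, so the same event can be redelivered on retries; a common safeguard (not from the video; all names below are illustrative) is an idempotency check keyed on the object's name and generation. A minimal Python sketch, with the ledger simulated as an in-memory set:

```python
# Minimal sketch of an idempotent event handler.
# The "ledger" here is an in-memory set purely for illustration;
# a real deployment would use Firestore, Cloud SQL, or a BigQuery
# table so the check is shared across Cloud Run instances.

processed = set()  # illustrative stand-in for a shared, durable store

def handle_event(event: dict) -> str:
    """Process a GCS finalize event at most once per object generation."""
    # bucket + name + generation uniquely identifies one upload of one
    # object; re-uploading the same file produces a new generation.
    key = (event["bucket"], event["name"], event["generation"])
    if key in processed:
        return "skipped (already processed)"
    processed.add(key)
    # ... read the file, transform, load into BigQuery here ...
    return "processed"

# Duplicate deliveries of the same event are ignored:
e = {"bucket": "my-bucket", "name": "data/file1.csv", "generation": "171"}
print(handle_event(e))  # processed
print(handle_event(e))  # skipped (already processed)
```

The check-then-add here is not atomic; with a real shared store you would use a transactional write (e.g. a conditional insert) so two concurrent requests cannot both claim the same key.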
Thank you for creating this practical use case. How does it know not to load the same file again? Where is that information captured — what in GCS keeps track of the loaded files? And if I want to reload a file, what should I do?
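One relevant detail: GCS itself does not keep a "loaded files" list; the pipeline has to track that. However, re-uploading an object assigns it a new generation number and fires a fresh finalize event, so if the dedupe key includes the generation, a reload is processed again while exact duplicate events are still skipped. A tiny sketch (names are illustrative):

```python
def dedupe_key(event: dict) -> tuple:
    """Build a ledger key for a GCS finalize event.

    The same object re-uploaded gets a new generation, so a reload
    produces a distinct key and is processed again, while a duplicate
    delivery of the same event yields the same key and is skipped.
    """
    return (event["bucket"], event["name"], event["generation"])

first   = {"bucket": "b", "name": "f.csv", "generation": "100"}
reload_ = {"bucket": "b", "name": "f.csv", "generation": "101"}  # re-upload

assert dedupe_key(first) != dedupe_key(reload_)       # reload is new work
assert dedupe_key(first) == dedupe_key(dict(first))   # duplicate event is not
```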
Hi Anjan, thank you for the video. I need urgent help. I have a use case where I need to append only the delta data of a JSON file from GCS to BigQuery. That is, if the file's data already exists in BigQuery and I load the same file with an updated record or row, how can I insert just the new entries and update the changed ones, without overwriting the existing old entries?
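A common pattern for this (not necessarily the video's approach; the table and column names below are illustrative) is to load the file into a staging table and then run a BigQuery MERGE into the target: matched rows get updated, unmatched rows get inserted, and untouched old rows are left alone. A sketch that just composes the MERGE SQL:

```python
def build_merge_sql(target: str, staging: str, key: str, cols: list) -> str:
    """Compose a BigQuery MERGE statement: update matched rows by `key`,
    insert rows that are new. All identifiers are assumed, not validated;
    adapt to your own schema."""
    set_clause = ", ".join(f"T.{c} = S.{c}" for c in cols)
    col_list = ", ".join([key] + cols)
    src_list = ", ".join(f"S.{c}" for c in [key] + cols)
    return (
        f"MERGE `{target}` T\n"
        f"USING `{staging}` S\n"
        f"ON T.{key} = S.{key}\n"
        f"WHEN MATCHED THEN UPDATE SET {set_clause}\n"
        f"WHEN NOT MATCHED THEN INSERT ({col_list}) VALUES ({src_list})"
    )

# Example: upsert an orders file that was first loaded into a staging table.
sql = build_merge_sql("proj.ds.orders", "proj.ds.orders_staging",
                      "order_id", ["amount", "status"])
print(sql)
```

You would pass the resulting string to the BigQuery client's query method after the staging load completes; this requires a stable business key (here `order_id`) to match rows on.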
Hello Anjan, could you please make a video with a complete roadmap for becoming a GCP data engineer, keeping in mind folks from non-IT and testing backgrounds who want to get into GCP? Thank you.
Hi, sure, please wait a while longer and keep watching the other videos. I have been thinking the same for the past few weeks, but after the Cloud Functions series I have planned a Composer series; if I get some time, I will try to do that in between.