Hi friends, in this video I explain a real-time scenario in PySpark. github.com/sra... The code and dataset are uploaded to GitHub: raw.githubuser... github.com/sra... Please subscribe to my channel for more interesting learnings.
If the requirement is to have the data from both DataFrames, we can use unionByName. Here, however, the requirement is to check whether a DataFrame's columns match those of the other DataFrame and, if any column is missing, to create a column with the same name filled with null values.
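A minimal sketch of that column check, not necessarily the code shown in the video. The helper names `missing_columns` and `add_missing_columns` are made up for illustration, and the sketch assumes column names are compared by exact (case-sensitive) match.

```python
def missing_columns(own_cols, other_cols):
    """Return the columns found in other_cols but not in own_cols, in other_cols order."""
    own = set(own_cols)
    return [c for c in other_cols if c not in own]


def add_missing_columns(df, other_df):
    """Return df with a null-filled column added for each column only other_df has.

    Hypothetical helper: df and other_df are assumed to be PySpark DataFrames.
    """
    # Imported here so the pure helper above can be used without Spark installed.
    from pyspark.sql import functions as F

    for name in missing_columns(df.columns, other_df.columns):
        # Cast the null to the other DataFrame's type so both schemas agree.
        target_type = other_df.schema[name].dataType
        df = df.withColumn(name, F.lit(None).cast(target_type))
    return df


# Example with plain column lists (no Spark session needed):
print(missing_columns(["id", "name"], ["id", "name", "dept"]))
```

If the goal is only to combine the rows of both DataFrames, Spark 3.1 and later also accept `df1.unionByName(df2, allowMissingColumns=True)`, which fills the missing columns with nulls during the union.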
@@sravanalakshmipisupati6533 Thanks. Under the "sampledata" branch, it looks like this specific notebook is not checked in yet. Could you commit it? Or please help me locate the file if it is checked in under a different name.
@@sravanalakshmipisupati6533 Yes, please explain the overall procedure, such as which tools are used in the project (GitHub, Jenkins, Jira, etc.) and how they fit into the flow. There is no proper video that explains a project's end-to-end process, so it would be great if you made one.
I have the same scenario. I have two DataFrames with different numbers of columns. The second DataFrame has updated values, so I want to update DataFrame 1 with the values from the second DataFrame while keeping the values of the first DataFrame where there is no change. Could you help with this?
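One possible approach to this kind of update, sketched under several assumptions that the comment does not state: both DataFrames share a key column (called `id` here purely for illustration), the second DataFrame has at most one row per key, and a null in the second DataFrame means "no change". The helper names `updatable_columns` and `apply_updates` are hypothetical.

```python
def updatable_columns(base_cols, update_cols, key):
    """Return the non-key columns of the base DataFrame that the update DataFrame also has."""
    available = set(update_cols)
    return [c for c in base_cols if c != key and c in available]


def apply_updates(base_df, update_df, key="id"):
    """Left-join update_df onto base_df and prefer its non-null values.

    Hypothetical helper: columns that exist only in update_df are ignored,
    and rows of base_df without a match keep their original values.
    """
    # Imported here so the pure helper above can be used without Spark installed.
    from pyspark.sql import functions as F

    shared = set(updatable_columns(base_df.columns, update_df.columns, key))
    b, u = base_df.alias("b"), update_df.alias("u")
    joined = b.join(u, on=F.col(f"b.{key}") == F.col(f"u.{key}"), how="left")

    exprs = []
    for c in base_df.columns:  # keep the base DataFrame's column order
        if c in shared:
            # Take the updated value when present, otherwise keep the original.
            exprs.append(F.coalesce(F.col(f"u.{c}"), F.col(f"b.{c}")).alias(c))
        else:
            exprs.append(F.col(f"b.{c}").alias(c))
    return joined.select(*exprs)


# Example with plain column lists (no Spark session needed):
print(updatable_columns(["id", "name", "city"], ["id", "city", "zip"], "id"))
```

The left join keeps every row of the first DataFrame, and `coalesce` returns its first non-null argument, so a value changes only where the second DataFrame supplies one.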
Hi, please give me a solution. I have a table with the columns name, id, and department, and the name column contains two names. I want a new column that contains all the names that are in the name column.