You have a new subscriber! I love the way you explain data engineering. You and Seattle Data Guy are my faves when it comes to Data Engineering Content Creators.
I'm really interested in this field and currently learning Python. I must say this list is great, but I'm really overwhelmed by the amount of stuff one has to learn to transition into this field! I'm gonna stick with it and hopefully come through on the other end 😁
Definitely stick with it! One thing to remember is that while there are many tools, you don't need to know ALL of them to have a successful career, and you also don't need to learn them all at once (it takes a whole career to do that). Here is a recommendation to help you get started:
1. Start with getting very comfortable w/ SQL (and/or Python if you'd like)
2. Learn more about data modeling techniques (ex. dimensional modeling, star schema) and the way data typically moves (ex. ETL vs ELT)
3. Pick a common database to study and practice on (ex. Snowflake or SQL Server)
4. Learn how to use a tool like dbt to transform data within those databases, which will also show you other important concepts like Version Control
5. Pick a data visualization tool (ex. Power BI or Tableau) and use your transformed data to make a cool dashboard
6. Pick another part of the process (ex. Extract tools, scheduling tools, etc.) and keep adding to your skillset
Good luck!
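If it helps to see steps 1, 2 and 4 in miniature, here is a rough sketch using only Python's built-in `sqlite3` module as a stand-in for a real warehouse. The table names (`raw_orders`, `dim_customer`, `fact_orders`), the sample rows and the helper functions are all made up for illustration; a real project would more likely do the transform step with a tool like dbt against Snowflake or SQL Server.

```python
import sqlite3

# Hypothetical raw data: (order_id, customer_name, region, amount)
RAW_ORDERS = [
    (1, "Acme", "West", 120.0),
    (2, "Acme", "West", 80.0),
    (3, "Globex", "East", 200.0),
]


def load_raw(conn, rows):
    """The 'E' and 'L' of ELT: land the raw rows in the database untouched."""
    conn.execute(
        "CREATE TABLE raw_orders "
        "(order_id INTEGER, customer_name TEXT, region TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO raw_orders VALUES (?, ?, ?, ?)", rows)


def build_star_schema(conn):
    """The 'T' of ELT: reshape the raw table into one dimension and one fact table."""
    conn.executescript(
        """
        CREATE TABLE dim_customer (
            customer_key  INTEGER PRIMARY KEY,  -- surrogate key
            customer_name TEXT,
            region        TEXT
        );
        INSERT INTO dim_customer (customer_name, region)
        SELECT DISTINCT customer_name, region
        FROM raw_orders
        ORDER BY customer_name;

        CREATE TABLE fact_orders AS
        SELECT r.order_id, d.customer_key, r.amount
        FROM raw_orders r
        JOIN dim_customer d
          ON d.customer_name = r.customer_name
         AND d.region = r.region;
        """
    )


def revenue_by_region(conn):
    """The kind of query a dashboard would run against the star schema."""
    return conn.execute(
        """
        SELECT d.region, SUM(f.amount)
        FROM fact_orders f
        JOIN dim_customer d USING (customer_key)
        GROUP BY d.region
        ORDER BY d.region
        """
    ).fetchall()


conn = sqlite3.connect(":memory:")
load_raw(conn, RAW_ORDERS)
build_star_schema(conn)
print(revenue_by_region(conn))
```

The point is only the shape of the workflow: load the raw data first, transform it inside the database with SQL, then query the modeled tables for reporting.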
@KahanDataSolutions I really want to thank you for this thoughtful response and the road map provided. I honestly didn't expect this swift response and it shows that you love what you do! I will defo stick with it and hopefully make a successful career out of it. Thanks again 💪🏿
@KahanDataSolutions Thank you for your extra detailed explanation to Adam 1. I would like to ask: would this video be more helpful for senior people who are deciding what their companies should use, depending on their business case and requirements?
And about the spreadsheets part, you are def right. We are using Google spreadsheets and using Python to automate writing our outputs there.
Some other alternatives for scheduling and orchestration are Dagster, Prefect, Oozie, or whatever your cloud offering might have; I know Google Cloud has Cloud Scheduler. If you suggest Jenkins as a job scheduling tool in this day and age, I will hunt you down...
Hi, thank you for your video. I know that this is old now but I wish you would put the names of each tool you listed under the tool. If you aren't familiar with the specific tool it can be hard to know how to spell it. I know I can Google but I was taking notes as I was following along. Thank you.
Thanks! Databricks would fall in the same area as "cloud databases". Spark would fit in around the "ELT Components" and is used primarily to process large amounts of data.
Hello! Thank you for your invaluable video! I find it extremely useful for beginners! I would like to ask about one thing regarding the Data Engineer career. I learnt Pandas for data wrangling and transformation. So, how about Pandas for Data Engineers? Is it a useful tool for ETL/ELT transformations? Obviously, the next step will be PySpark, but I would like to start by learning Pandas. It seems like a good path toward the next one. What do you think about it? I would appreciate it if you could share your views.
Hey man, may I ask a question? I have ETL experience with 2 ETL tools and multiple RDBMSs (on premise), and I want to shift into Data Engineering roles, which usually combine ETL tools with Python and its libraries/frameworks. Would I be considered a new graduate or an industry professional, since I don't have any experience with Python? And does it usually mean I have to take a "pay cut"? Let's say I make $500 a month as an ETL Developer and I want to shift to a Data Engineer role; does that mean I would be paid something like $300 a month since I don't have DE experience? I really need some guidance... Thank you :)
I agree that the term is a bit odd, but that's what has stuck as of today. Another term you might see used to describe that process is "Operational Analytics".
You just listed out half of the data team (DevOps Engineer, Data Engineer, DBA, SQL Developer, Server Executive, Data Analyst, Business Analyst). You don't need to learn all of this to be a data engineer...