I've worked with Palantir. It looks glitzy and easy and click click click, but it's the type of thing where, if you have to do anything that's out of the ordinary, you have to bend their rigid "objects" into new shapes that they don't like. I'm pretty sure they bought all the "Apps" from different vendors because they all seem to have different paradigms - not a cohesive development environment at all. And the documentation sucks. Definitely mostly sad path here. Notice how often he has to click around A LOT just to do something really, really simple? And he doubts whether he's done something right (or wrong) because, even after doing the little thing he's demoing, it's just confusing and a mess. There is actually a code interface, and I remember assuming that I could loop over a directory and act upon every object in the folder. But dig this: after spending a bunch of time working towards a solution with that assumption in mind, it turns out that no, you can't loop over the objects in a folder. WTF? What doesn't allow you to do that through their API or any other way? "Hopefully this doesn't break." Lol, love that quote. Exactly how I felt working with this piece of trash every day.
dbt is a game changer - it can help create a cultural change where people actually like working with data teams because they see lineage, quality and quick turnarounds built into the system. Highly recommended!!!
Thanks for the call out, Christine Carragee! Fun to hear about people creating something clear when your brain would otherwise be shuffling ideas from a "50 disc" CD player of thoughts.
This would be a great question to pose to the consulting community I am growing, feel free to join and ask - the-technical-freelancer-academy.circle.so/getting-started
Christine, the word ‘lumpy’ is a wonderful descriptor for consulting income. It makes me think of my dad describing cycling in slightly hilly terrain - lumpy.
It is! I think of it kind of like riding waves. Some days have great waves, others small ones, and still others massive ones. You just gotta respect the ocean.
The price of the model for ai summarize is a lot less than Llama 3, though. Also good to note that the price for enterprise and premium seems to be the same, which doesn’t seem to be the case in Snowflake.
I think it'll change. I like what Joe Reis said recently in his blog, where he talked about DEs having to be more aware of data science and software so we can provide data for AI, data science, analytics, etc.
I love the presentation of the "last mile" issue when it comes to purely AI-generated content. As mentioned, that is why only very few such solutions are deployed to production. IT consultants are usually brought in for the short term (up to 1 year) when the core team lacks the needed expertise or experience. The demands are high because they tend to be very expensive. The issue is that they come, do the work, and leave, so it can become really difficult to adjust the solution afterwards. Good documentation during development can help, but not many companies are willing to pay the consultant's high rate for it.
Hey youtube.com/@SeattleDataGuy, love your videos so far! Was curious if you'd like to add your insight into the following terms: batch processing vs stream processing, and OLTP vs OLAP?
Those would be some great topics. I have written on the OLTP vs OLAP topic before here - seattledataguy.substack.com/p/oltp-vs-olap-transactions-vs-analytics
Love hearing independent consulting best practices and what to expect! How do you approach accessing these clients' data and systems as a third party? Do they typically just give you a license (in the case of, say, Microsoft 365) as they would a W2 employee?
Sometimes, though it's usually more of a "seat" than a license. I haven't often needed Microsoft products that are on a laptop, and when I have, the client has sent me a physical laptop.
This guy is the best example of what happens when you spend 10 yrs of your professional life in "super-cheap-money-world": a smart kid with a very vague idea of the real world :)
@@SeattleDataGuy Try explaining that same reasoning to a bank, where people don't evaluate a technology on "how much money" it has raised. Your generation is just spoiled or scammed by cheap money culture.
Used it for years. I also tried the later 2.x version and I still don't like it, and I think there are better ways of architecting pipelines. But yeah, I was amazed when I saw Airflow the first time, and it did solve a lot of problems, but I still think it is a tool of the past. I hope I am wrong!
We’re a young data team for a large organization. Biggest roadblocks for us are issues with data governance (“you can’t have or report on our data”), budget for tooling (“prove the value of the tool, then we can purchase it”), and cloud concerns (“all my data is on-prem. You can’t just put it in the cloud”)
Yeah, those are always a struggle. In some companies you'll never win that battle (until leadership gets changed out). In other cases you have to be willing to speak your mind and say, "Hey, I can't do XYZ, which you asked me to do, under these conditions, so either things stay the same or you start opening doors". But that's easier to say as a consultant because I don't mind ending a project if a client won't work with me to get to the goal they wanted to get to (never had to go that far).
This has happened to me. Now I’m leading a team of data scientists, engineers, analysts and migration specialists. I’ve had to learn so much so quickly about strategy and people management. I’ve had to coach the people on my team to really feel empowered and own their tasks. At the beginning of being head of data I was taking on way too many “low level tasks”. Now I’m delegating and empowering. I still have a lot to learn though.
If you're looking for help setting up your data team and strategy, then feel free to set up a free consultation here - calendly.com/ben-rogojan/consultation
Just went through this process with my company over the past year. Great video. With us it went something like:
- Where is all of our data
- How are we doing reporting now
- What are the shortcomings of existing reporting solutions
- Do we need a warehouse (yes)
- What warehouse do we pick
- What ETL stack makes sense for our use case
- What do we integrate in what order to maximize value and get adoption rolling
Also, having someone on the exec level champion the BI effort and really push it forward was huge for the thing to actually materialize.
Thanks for sharing! I really appreciate it when people add more context and their own experiences. Were there any gotchas you ran into while going through this process?
I'd love to believe this! I guess the reason I have a hard time believing it is that I know there are lots of consultants who work in the space of setting up Palantir, which suggests that it still requires technical skills to set up and work with (also based on a few conversations I have had with people working with Palantir). But always happy to be wrong.
The DE role should not exist; it should just be SDEs who also own data as a product. Kind of like front end and back end, there will be a data-focused engineer that we can call a data engineer. Oh wait.
Interested! Can you please have special pricing for people in Africa? A 50% reduction is good, but our earnings are way too low, probably 20x less than those in the US or Europe. It becomes difficult for us to participate in good courses like this. Any help would be appreciated. Thanks.
@@SeattleDataGuy Yes, thx. I'm trying to understand how knowledge graphs and vector DBs will integrate into this too. Is it safe to assume both will be essential pieces of the enterprise AI layer/stack now being invested in heavily, or do you see one being more relevant in the next 2-5 yrs?
To me, one of the main benefits of Spark Structured Streaming is that you can easily switch between near real-time (micro batches) and scheduled batch processing without having to rewrite a single line of code. This is a very effective way of scaling up and down and balancing costs vs latency.
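For anyone curious what that switch can look like, here is a rough sketch, not a definitive setup. It assumes PySpark 3.3+ (for the `availableNow` trigger) and a file sink. The helper names `trigger_options` and `start_pipeline`, the mode strings, and the path parameters are all made up for illustration; only the `writeStream` / `trigger` / `start` calls are Spark's own.

```python
from typing import Any, Dict


def trigger_options(mode: str, interval: str = "1 minute") -> Dict[str, Any]:
    """Map a run mode to keyword arguments for DataStreamWriter.trigger().

    The mode names are hypothetical; pick whatever fits your config.
    """
    if mode == "streaming":
        # Near real-time: run a micro-batch every `interval`.
        return {"processingTime": interval}
    if mode == "batch":
        # Scheduled batch: process everything available, then stop.
        # Assumes Spark 3.3+, where the availableNow trigger exists.
        return {"availableNow": True}
    raise ValueError(f"unknown mode: {mode!r}")


def start_pipeline(stream_df, mode: str, output_path: str,
                   checkpoint_path: str, interval: str = "1 minute"):
    """Start the same streaming query in either mode.

    `stream_df` is assumed to be a streaming DataFrame (for example from
    spark.readStream). The transformation logic upstream of this call
    stays identical; only the trigger changes.
    """
    return (
        stream_df.writeStream
        .format("parquet")
        .option("path", output_path)
        .option("checkpointLocation", checkpoint_path)
        .trigger(**trigger_options(mode, interval))
        .start()
    )
```

With something like this, an always-on cluster could call `start_pipeline(df, "streaming", ...)`, while a scheduled job calls `start_pipeline(df, "batch", ...)` against the same checkpoint location and picks up where the last run left off.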