Databricks
Databricks is the Data and AI company. More than 10,000 organizations worldwide - including Block, Comcast, Conde Nast, Rivian, and Shell, and over 60% of the Fortune 500 - rely on the Databricks Data Intelligence Platform to take control of their data and put it to work with AI. Databricks is headquartered in San Francisco, with offices around the globe, and was founded by the original creators of Lakehouse, Apache Spark™, Delta Lake and MLflow.
Data Intelligence Day Seoul 2024
2:31
12 hours ago
An Introduction to DBRX
17:50
21 hours ago
Demo: How Do I Use DBRX?
11:08
21 hours ago
Setting up PAT and Secret Scope
2:33
21 hours ago
Comments
@TheDataArchitect 6 hours ago
Who's the speaker?
@muhammadibrahimabdullahi3840 8 hours ago
AI can do everything you need to do in times of studying and understanding AI.
@benim1917 12 hours ago
Awesome 👏🏾
@Thegameplay2 12 hours ago
🎉
@gravenguan 13 hours ago
How does parse_json handle schema evolution? From my knowledge, parsing the schema on the fly is not recommended for prod tables; it's safer to define the schema first.
@Databricks 13 hours ago
I agree, but with a lot of JSON data you don't know the schema upfront and so can't define it. It's worth noting this is different from inferring the schema, which looks at the first 1000 rows and is brittle to upstream changes - Holly
@gravenguan 12 hours ago
@Databricks We used parse_json for dev and exploration purposes as well, thanks for the clarification.
@Databricks 11 hours ago
@gravenguan No worries! Hope this clarifies for other users too
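The distinction drawn in this thread, between inferring a schema from a sample of rows and parsing every record on its own, can be sketched in plain Python. This is only an illustration under stated assumptions: the sample records, the function names, and the `sample_size` parameter are hypothetical, and the standard `json` module stands in for the idea; it is not how `parse_json` or Databricks schema inference is implemented.

```python
import json

# Hypothetical sample records; the third one gains a new field upstream.
raw_rows = [
    '{"id": 1, "name": "a"}',
    '{"id": 2, "name": "b"}',
    '{"id": 3, "name": "c", "tags": ["x"]}',
]


def infer_schema(rows, sample_size):
    """Collect top-level field names from only the first `sample_size` rows."""
    fields = set()
    for row in rows[:sample_size]:
        fields.update(json.loads(row).keys())
    return fields


def read_with_inferred_schema(rows, sample_size):
    """Project every row onto the sampled schema; unseen fields are dropped."""
    schema = infer_schema(rows, sample_size)
    return [
        {key: value for key, value in json.loads(row).items() if key in schema}
        for row in rows
    ]


def read_per_row(rows):
    """Parse each row independently, so late-arriving fields are preserved."""
    return [json.loads(row) for row in rows]


inferred = read_with_inferred_schema(raw_rows, sample_size=2)
per_row = read_per_row(raw_rows)
```

In this sketch the sampled schema never sees `tags`, so the field is silently lost from the third record, while per-row parsing keeps it. That is one way to read the "brittle to upstream changes" remark above.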
@nagendrasrinivas-cj7sr 14 hours ago
This is clearly copied from Snowflake.
@Databricks 11 hours ago
Variants in their various forms have been around for many decades. We're big fans of open source so anyone can use the implementation in other projects or products.
@TheDataArchitect 18 hours ago
That's awesome.
@matthiasmueller9340 21 hours ago
How can I specify the required runtime version when using a serverless SQL warehouse?
@Databricks 19 hours ago
Variant types will be coming to serverless early/mid July, no need to select a runtime - Holly
@afrikaniz3d 21 hours ago
Only note for these videos, since they're not Shorts, is that it would be more beneficial to use the full wide (1920 x 1080) format, so it's more readable at all resolutions.
@EranM 1 day ago
Can't you get the score (ranking score | similarity score) while fetching items from the Vector DB?
@EranM 2 days ago
Can someone explain to me how come you calculate the USER embedding when training, but when searching for similar embeddings you actually get ITEM embeddings?
@LQDEN 2 days ago
Still didn't explain what it is exactly
@gybob100 2 days ago
The shovel company telling you how valuable the gold is
@user-he1hs5vx3d 3 days ago
She is creepy because she is not an honest person. She keeps stealing others' work and ideas to pretend she is an expert. To make herself look greater, she belittles others, including her student (5:30).
@jianguo8233 3 days ago
Is 4.0 a release or preview today?
@uchechukwumadu9625 3 days ago
Insightful!
@slavenlulic7736 3 days ago
Powerful
@SnatrWhamo 4 days ago
Great video and very very useful! While implementing, I got stuck uploading the PDF to a Volume in the Unity Catalog. I am the "Owner" of my Databricks Workspace and Azure account, although I don't seem to have the option to add a Volume to a Catalog and thus don't have the option to add the PDF to a Volume. This seems to have to do with permissions and possibly setting up a metastore between Databricks and Azure Blob Storage? Might you have any insights, ideas, solutions or workarounds? Thanks again for a great video and all the resources to implement this super useful technology!
@jasondrew2087 4 days ago
Couple of things: you need USE SCHEMA and CREATE VOLUME permissions on the Schema and USE CATALOG on the Catalog. You also need CREATE EXTERNAL VOLUME permissions on the External Location you plan on using for your Volume.
@BlizzardzRS 4 days ago
While I appreciate the contributions Databricks makes to the open source community, *this video is incredibly misleading*. DBRX is *not* the highest production quality open-source model nor the best in price per performance. The graph you showed is incredibly misleading, not least because you compared your models to LLaMa2-70B. No one in their right mind at the time of this video's recording is using LLaMa2-70B. Everyone has moved on to LLaMa3, with many providers even disabling LLaMa2 on their platforms because it is more expensive and less performant than LLaMa3. A fairer comparison would be between DBRX and LLaMa3-70B and LLaMa3-8B. You didn't show that because DBRX gets roasted in these comparisons. (You talked about the cost associated with training your LLMs and how the cost has come down substantially. Really, this is an argument that the $10M Mosaic/Databricks have spent on DBRX is already redundant.) You guys are losing credibility by posting stuff like this. Databricks does some great work. Don't tarnish your reputation with borderline fraudulent content like this.
@georges7298 5 days ago
Thanks - for the open sourcing, and for the summit.
@BeginnerAlchemist 5 days ago
I have a question: why do we research small LMs just to avoid using GPUs? If we want to save money on training, we can research how to make GPUs or models more efficient, rather than avoid using higher tech.
@DamaruM 4 days ago
GPU = power consumption
@tulikabose5120 1 day ago
It's not just for GPUs... Small LMs have their own market for on-device or on-edge processing, where there are privacy concerns and customers would not want their data to go to the cloud, and secondly in many industrial use cases where internet and cloud access isn't available due to the remote nature of the use case, and model inference needs to be done on device... The demand for SLMs is increasing in such use cases... Many big tech companies are not just working on LLMs but also on SLMs under the hood, as both of them have to co-exist to cater to different user requirements.
@BeginnerAlchemist 1 day ago
@tulikabose5120 Thank you, I see. They are useful for small devices with limited compute hardware, and for privacy. That's true. So many LLMs need huge amounts of data to train, and they have to collect people's private info to become stronger. Most people hate that.
@mc.pretzel 5 days ago
Boomshakalaka!
@plartoo 5 days ago
:D Show us how to do more complex data transformations than just the simple join you demoed, and what the actual limitations are (because that's where reality meets the demo). While you are at it, tell us how to automate (schedule) this pipeline and set up notifications and data quality checks. Next, let us know how to QA that dashboard you let GenAI create (to make sure it's not hallucinating and spitting out bullshit while destroying our firm's reputation), and how to surface it to customers via URL in a secure way (without paying you through our noses). Finally, tell us how much it costs to process GBs of data per month. This is an unbearably condescending demo that assumes the attendees are stupid and don't know what serious, real-world data wrangling entails. And I know a couple of my clients who are leaving Databricks because they are freaking expensive.
@ser1ification 1 day ago
Exactly. I'm tired of these hype machines. Everything is in beta. Customers are the beta testers. Only thing these guys did good is the Unity Catalog. Of course Spark and Delta as well.
@gopi4841 6 days ago
Nice one, Darshana.
@xiaoyu2270 6 days ago
jensen from china wenzhou
@chima6291 10 hours ago
Bullshit. He was born in Taiwan.
@forrestbajbek3900 6 days ago
Wow, this is a huge improvement.
@AleksandarKrumov-pm4tk 6 days ago
wow
@cobrider2 6 days ago
2 reactions:
- By querying the table with DuckDB, authentication and permissions are handled only by Unity Catalog, and not by the underlying storage solution (AWS S3, Azure ADLS, ...), right?
- Applying column masks will only work for hosted compute like Databricks clusters, because querying with local self-hosted compute like DuckDB requires downloading the Parquet files (containing the PII data) locally and only then executing the query, meaning you actually have PII data downloaded on your local machine, right?
@cobrider2 6 days ago
had a laugh, thank you
@subhroitmecse 6 days ago
The examples are not clear about Delta Lake ACID properties.
@Clammer999 6 days ago
One of my favourite AI legends. Her passion for humanity and how AI can be leveraged to help improve people's lives is admirable and astounding.
@WonkaTruck 6 days ago
I still can't read Iceberg in Databricks, stop hoping for adoption and just fix that...
@DCC72 6 days ago
And from nothing, a college professor just evolved from the bacteria. Rubbish.
@sunnychabbi3639 6 days ago
Please provide the notebook. It is not available in dbdemos.
@GerardInnes 7 days ago
As a new YouTuber you are doing very well. He is teaching us option trading nicely. Just need to be consistent with this process of trading on binary options...
@henryebube3576 7 days ago
I followed your tutorial. I got stuck at 9:38. I typed databricks-bge-large-en as the embedding model, but the Create button is disabled, not sure why.
@jasondrew2087 6 days ago
You shouldn't have to type it in, rather it should be an option in the drop-down. If you go to Serving do you see it listed as a Foundational model?
@bloggeranonymous9400 7 days ago
Just blah blah blah of buzzwords!
@yashdagade1240 7 days ago
Are human brains binary too?
@Milhouse77BS 7 days ago
"building a dashboard" I'd rather build a "semantic model".
@geeb009 7 days ago
Good introduction!
@user-wr4yl7tx3w 7 days ago
Does anyone have the link to the paper on arXiv?
@BeginnerAlchemist 5 days ago
Impossible Distillation for Paraphrasing and Summarization: How to Make High-quality Lemonade out of Small, Low-quality Models
@kiran.khandelwal 7 days ago
Shoes !! Same shoes