
BloombergGPT: How We Built a 50 Billion Parameter Financial Language Model 

Toronto Machine Learning Series (TMLS)
4.9K subscribers · 124K views

We will present BloombergGPT, a 50 billion parameter language model, purpose-built for finance and trained on a uniquely balanced mix of standard general-purpose datasets and a diverse array of financial documents from the Bloomberg archives. Building a large language model (LLM) is a costly and time-intensive endeavor. To reduce risk, we adhered closely to model designs and training strategies from recent successful models, such as OPT and BLOOM. Nevertheless, we faced numerous challenges during the training process, including loss spikes, unexpected parameter drifts, and performance plateaus.
In this talk, we will discuss these hurdles and our responses, which included a complete training restart after weeks of effort. Our persistence paid off: BloombergGPT ultimately outperformed existing models on financial tasks by significant margins, while maintaining competitive performance on general LLM benchmarks. We will also provide several examples illustrating how BloombergGPT stands apart from general-purpose models.
Our goal is to provide valuable insights into the specific challenges encountered when building LLMs and to offer guidance for those debating whether to embark on their own LLM journey, as well as for those who are already determined to do so.
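The loss spikes mentioned in the abstract are usually caught by monitoring training statistics step by step. As a hedged sketch (not Bloomberg's actual code; the EMA scheme and thresholds are illustrative assumptions), a spike detector can track an exponential moving average of the loss and flag any step that deviates by several standard deviations:

```python
def make_spike_monitor(alpha=0.98, threshold=3.0):
    """Track an exponential moving average (EMA) of the loss and its
    variance; flag a step as a spike when the loss exceeds the EMA by
    `threshold` standard deviations."""
    state = {"mean": None, "var": 0.0}

    def check(loss):
        if state["mean"] is None:   # first observation seeds the EMA
            state["mean"] = loss
            return False
        std = state["var"] ** 0.5
        spike = std > 0 and (loss - state["mean"]) > threshold * std
        if not spike:               # fold only non-spike steps into the stats
            delta = loss - state["mean"]
            state["mean"] += (1 - alpha) * delta
            state["var"] = alpha * (state["var"] + (1 - alpha) * delta * delta)
        return spike

    return check

monitor = make_spike_monitor()
losses = [2.0, 1.9, 1.85, 1.8, 1.78, 5.0, 1.75]  # step 6 is a loss spike
flags = [monitor(x) for x in losses]
print(flags)  # [False, False, False, False, False, True, False]
```

A practitioner might respond to a flagged step by skipping the batch, lowering the learning rate, or rolling back to an earlier checkpoint, decisions of the kind the talk describes.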
David Rosenberg, Head of ML Strategy, Office of the CTO, Bloomberg

Published: Sep 6, 2024

Comments: 93
@TropicalCoder · 10 months ago
Kudos to Bloomberg for sharing their experiences with us. I would have liked to know what their model can do for them in the end. If they get to a next version that understands graphs and tables, that would be much more powerful; that is an essential advancement in generative AI.
@elmichellangelo · 9 months ago
"Certainly... powerful." AKA fewer employees.
@denijane89 · 10 months ago
I'd definitely have liked more examples of what it can do. "Does better on financial tasks" is a very vague statement. But great work. I wish I had their resources so I could train my own LLM.
@12345idiotsluggage · 10 months ago
Do you have a BBG license? If not, dumb money is always last. GS telling everyone that CRE is still alive means it's betting against CRE. BBG telling everyone they're using LLM means it's too late for John Q. Public.
@aitools24 · 10 months ago
00:06 BloombergGPT is a 50 billion parameter language model trained on financial data.
02:30 Language models use probability distributions to generate text.
06:55 Building a large language model requires consideration of three categories: code, data, and compute infrastructure.
09:02 Considerations for building a financial language model.
13:24 The training data for BloombergGPT consists of public and private data sources, totaling 710 billion tokens.
15:43 Choosing the model size and training datasets involves trade-offs.
19:53 Training instability was indicated by the increase in gradient norm during V1 model training.
21:56 Performance declined during spikes despite attempts to optimize.
25:47 The model performed well on various financial tasks, especially reading comprehension with numeric tables.
27:46 The model can generate BQL queries from natural-language input, making it easier to gather and analyze data.
31:36 Building a 50 billion parameter financial language model is possible with enough computing resources and time.
33:26 Encoder-decoder models were considered but not implemented, due to limited successful training at large scale.
37:50 Possibility of expanding the large language model to become multi-modal.
40:09 Efficiency of token encoding affects the context window for model generation.
Crafted by Merlin AI.
@philiphua2562 · 2 months ago
Had a meeting with them last week. As of now, BB AI will be released next year, but don't quote me on this.
@johanneszwilling · 10 months ago
😘 Looking forward to a time when results and problems solved are the killer headline in AI, not how complex your solution is!
@besomewheredosomething · 10 months ago
I wonder if the number of parameters became the limiting factor. I get that some smaller models were getting great results, but there is probably a reason why the best open-source Llama-2 model is 70B.
@Jonnyrockin71 · 8 months ago
Very educational. I am only a non-technical novice, especially with GenAI and foundation models, and I learned a lot from the 40 minutes.
@bvssrsguntur6338 · 8 months ago
Thank you for sharing your experience, which we don't hear from Google and OpenAI.
@user-xd4bz5py7i · 9 months ago
bringing value to shareholders - Yay 😀
@johngrabner · 10 months ago
What are the two open-source projects that shared their logs and experience training large language models?
@stuartdavid · 9 months ago
BLOOM from HuggingFace and OPT from Meta.
@AO-rb9yh · 9 months ago
I think you should have started from a checkpoint and trained on your private dataset, just monitoring test performance on MMLU and your finance metrics.
@Yewbzee · 10 months ago
How about a demo of it in action?
@devon9374 · 10 months ago
Leave it to the finance guys to call LLMs "simple," like it's been obvious since the beginning of time 😂
@macgyverfever · 8 months ago
Okay, I nominate this guy to be the future Marty McFly in "Back to the Future 4."
@alexcloak · 8 months ago
Great talk! Loving the technical details about training.
@TheMatthew34202 · 9 months ago
This man is a genius.
@erniea5843 · 10 months ago
This is awesome. I wonder if you could achieve similar results without training a new model, using a PEFT approach instead?
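For context on the PEFT suggestion: methods like LoRA freeze the pretrained weight matrix W and train only a low-rank update BA, so far fewer parameters change. A toy, plain-Python illustration (the shapes, values, and rank here are made-up assumptions for the sketch, not anything from the talk):

```python
def matmul(A, B):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def matadd(A, B):
    """Elementwise sum of two same-shaped matrices."""
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

# Frozen pretrained weight W (3x3): never updated during fine-tuning.
W = [[1.0, 0.0, 0.0],
     [0.0, 1.0, 0.0],
     [0.0, 0.0, 1.0]]

# LoRA trains only the low-rank factors B (3x1) and A (1x3): rank r=1,
# so 6 trainable numbers instead of 9.
B = [[0.5], [0.0], [0.0]]
A = [[0.0, 2.0, 0.0]]

# Effective weight used at inference: W + B @ A.
W_eff = matadd(W, matmul(B, A))
print(W_eff)  # only the (0, 1) entry changes, to 1.0
```

At 50B parameters the savings are the whole point: the frozen W stays on disk once, while only the tiny A and B factors are trained and stored per task.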
@d_b_ · 10 months ago
Would like to hear this too, from both a $-cost and a model-performance assessment. What benefit did using their own data gain them? How accurate was the model at pulling out facts that it was trained on? Won't a useful finance model require a stream of current data? BQL doesn't smell like it needs a newly pre-trained model.
@lucavalentini9238 · 8 months ago
Hard, because they wouldn't want to risk ending up sharing their very sensitive financial data with third parties, I guess.
@alextongdo · 10 months ago
It's funny that he mentions open source helped them and that they want to share what they learned... they used datasets, architecture, and likely code from open source. And yet the only novel thing I can even spot in this whole talk is their special tokenizer for numbers, which he "doesn't want to get into." How transparent.
@AmanGoelish · 8 months ago
He literally says they tokenize individual digits and use multi-word tokens, as compared to the arbitrary number tokenization of GPT-3.
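The digit-by-digit idea described in this reply can be sketched with a toy pre-tokenizer. This is an illustrative assumption about the approach, not Bloomberg's actual tokenizer; `split_digits` is a hypothetical helper:

```python
import re

def split_digits(text):
    """Pre-tokenize so every digit becomes its own token, while other
    word pieces stay intact. The idea: a number like 1234 is then seen
    compositionally as 1-2-3-4 instead of as one arbitrary vocabulary
    entry, which tends to help models do arithmetic over numerals."""
    # Each match is either a single digit or a run of non-digit,
    # non-space characters; whitespace is dropped.
    return re.findall(r"\d|[^\d\s]+", text)

print(split_digits("revenue rose 1234%"))
# ['revenue', 'rose', '1', '2', '3', '4', '%']
```

A real subword tokenizer would then run on top of these pieces; the sketch only shows the digit-splitting pre-pass.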
@rajxraj1 · 9 months ago
Awesome way to learn
@produbag2408 · 10 months ago
How well does the model predict future prices?
@FlavianoFlauber · 8 months ago
Thanks for sharing.
@beyondtoday2036 · 10 months ago
24:18 lr reduced from 6e-5 to 1e-4? Ridiculous.
@stuartdavid · 9 months ago
Ah, those numbers are transposed; good catch.
@brandonmoffett2255 · 10 months ago
Is there a demo in this video?
@Yewbzee · 10 months ago
No, just a lecture on how they did it and the difficulties they faced.
@kunwei2005 · 9 months ago
Sorry, I don't think the slide at 6:49 was explained well.
@bvssrsguntur6338 · 8 months ago
Bard got trained on hundreds of trillions of tokens, vs. 700 billion tokens for the Bloom model.
@user-sg2fw6ze7n · 9 months ago
The concept of capitalism won't exist in the future. It will remain only as a concept that existed in the past.
@brandonsager223 · 10 months ago
If you aren't picking stocks, what's the point?
@user-sg2fw6ze7n · 9 months ago
Housing can also be supplied without limit in the future, because 3D printers can print houses endlessly. It's absolutely possible, even right now. But the capitalist establishment is blocking it for fear their real-estate assets will lose value. No matter how much they block it or oppose it, though, they can't change it.
@paparaoveeragandham284 · 9 months ago
look into
@Photomonon · a year ago
Spells the end of speculative investing
@besomewheredosomething · 10 months ago
Not this model. Also, speculative investing is based on "what might be"; these models won't be able to "predict" the future with 100% certainty. Also, each model's output will be an input into a trading bot, which will affect the environment and thus require the models to adjust. At best that just speeds up speculative trading; it sure won't make it go away.
@CrazyFanaticMan · 10 months ago
Speculation means guessing what will happen in the future. So unless you have magical clairvoyance (insider information), speculative trading will not end
@frankgreco · 10 months ago
No, it won't. Models are a snapshot of history; they are databases of patterns, essentially a collection of experiences. They can imply what might happen, but history doesn't always repeat itself. Models can be good guidelines, i.e., tools for investors. Btw, AI has been used in finance since the 1980s...
@atishbhattacharya3473 · 10 months ago
AIs are getting scary smart. They may not predict the future, but they will be able to analyze vast data, guess the most likely way markets will behave in a given scenario, and take positions. But markets and the behavior of market participants evolve constantly, so these models will have to keep up with that, which may not be so easy to do... until AGI comes. AGI will be the end of trading as we know it.
@frankgreco · 10 months ago
@@atishbhattacharya3473 I wouldn't focus on AGI ending trading. I'd focus on how you can use AI to help you trade. Wall Street has been trying to use AI for trading since the 1980s. It's not something new to the street. GenAI is just another pattern recognizer tool... a powerful one, but it's still just a tool. You have to know how to use it.
@user-sg2fw6ze7n · 9 months ago
Right now every company is building its own AI separately. But there's no need for that: if every AI company, big-tech firm, and nation joined forces, trained on enormous data, and pooled their large-scale compute to build a single AI, its performance would probably be beyond imagination, already far surpassing humans. It would hold knowledge far broader than any human's. A human grows by studying one professional field as a specialist, whether doctor, judge, IT programmer, graphic designer, or novelist, while this AI would integrate expert-level knowledge of every profession, so in intellectual ability there's no contest. A system far beyond the knowledge of thousands or tens of thousands of PhD-level experts could appear right now: programming ability, the law laid out in constitutions, a hundred years of daily articles from every news outlet in the world, the contents of every novel in existence. In knowledge, humans can't compete; humans can never win. If all nations, all big tech, and all AI companies combined, an AI far surpassing humans is already complete; even built separately these systems perform superbly, and combined they already far exceed us. So can humans survive in the future? They won't find jobs; economically they're no match for AI. Will humans even be needed? By meritocracy, performance, and ability it's already no contest, and by that economic logic humans aren't needed, because they can't compete. So can humans survive the future? Isn't it nearly impossible?
@hegerwalter · 10 months ago
Does BloombergGPT suffer the same problems as GPT-4, most notably hallucinations? We talk about mistakes the financial industry made that the government had to bail out. The noteworthy ones I can think of are the tax-code changes behind Texas banks' exposure to commercial real estate in 1984, Long-Term Capital Management, the 2008 financial crisis due to mortgage default swaps, and oil-pipeline investment in Mexico. We talk about capitalism, but the level of corruption and bailouts makes the Soviets look like a bunch of amateurs. So, hey, if it makes mistakes, the taxpayer will bail them out.
@JLOOV-JLOOV · 9 months ago
If you are suffering from hallucinations using your own money, that is okay. The problem arises when you are hallucinating with other people's money.
@Greatpretender11c · 10 months ago
Markov chain... Shannon?..pff
@stuartdavid · 9 months ago
?
@ABeautifulHeartBeat · 9 months ago
BloombergGPT is now obsolete.
@user-sg2fw6ze7n · 9 months ago
Not every job will go that way. Sports will still be done by humans: soccer, baseball, basketball, volleyball. Broadcasting and concerts will be human too. Cyber singers and AI singers are appearing, though I don't know whether they'll replace humans completely. And given how well animation does at the box office, the day will come when virtual actors shoot films and dramas: movies and shows made without a single human actor, like photorealistic animation. The acting profession could disappear too, because it's cheaper.
@lobovutare · 10 months ago
I'd be freaking ashamed to work on this project. What kind of value is this adding to the world? Do we really want to live in a world where our economy is run by bots and the gains of those bots go to a handful of super-rich people? How about doing something useful, like solving our climate crisis or reducing the amount of suffering on the planet? This is just the super-rich getting even richer.
@matthewfinch7275 · 10 months ago
You should look into Bloomberg's philanthropy organization. They legitimately fund many initiatives to address the climate crisis and have a track record of helping communities in need across the globe.
@BenOgorek · 10 months ago
Well for one they’re sharing what they learned with us for free
@lobovutare · 10 months ago
@@BenOgorek Whoop-de-doo. The economy has become a battleground for AIs making trade decisions at speeds humans can't even fathom, with near-zero ethical consideration, but Bloomberg was so generous as to boast about it on stage. (Not that we can trust these institutions to do the right thing with humans in the loop either. In the end it's all about making money, not about producing value or somehow making the world a better place.)
@goldenfishes3695 · 10 months ago
... My advice for you is... The financial market is a platform for people to pool resources, aggregated and simplified by $$$ values. Stop blaming the rich for exploiting the system or doing well in the game; rise up, learn to invest, join the game, and beat them. Quite frankly, it is hard for finance people not to earn money, because commoners like you only know how to cry. The only way your world survives is if we all go back to the caveman days. As long as you start even one transaction, you trigger a resemblance of a financial market, and there will always be people who learn the system and game it. So stop sounding lofty and victimized; how about you go read a book and learn the system instead? I know you'd say there's market manipulation, speculation, and pump-and-dump schemes... You are right, but there's also financing, so you get your houses, you get construction projects going, and third-world nations get funding to build factories so they can manufacture goods for us and drive down the cost of items. Don't look only at the bad; the good outweighs the bad. In case you worry about the power of the rich: they are not so stupid as to end the world for the poor. No matter how many guards and private armies they build, they can never kill 8 billion human beings; it's numerically impossible.
@over9000andback · 10 months ago
Better a bot we can audit than a malicious and corrupt banker.
@user-sg2fw6ze7n · 9 months ago
This world already treats you as less than human if you can't earn money. Without money you can't buy what someone sells or pay anyone's wages; you're the same as a homeless person. To eat, sleep, use a smartphone or the internet, or ride in a car today, you need money, and to earn money a human has to provide labor, unless born rich. But everyone knows AI and robots will take the advantage in labor. So will humans be needed, when they're no match? Meritocracy, performance, productivity: a company choosing between a human and a robot picks whichever is more productive, so of course it picks the robot that's cheaper and works better, and robots keep multiplying. Consumers also choose robots and AI over humans: if self-driving is cheaper, the human consumer naturally chooses the self-driving car over a human driver. Humans make that choice themselves. Then a human who can't earn money can't survive; how does a homeless person survive? If all humans are pushed out by machines, robots, and AI, companies won't hire humans, and humans can't survive under today's capitalism. Before capitalism's logic of economic productivity and merit, humans have no reason to exist, because they're no match for robots; extinction is assured. That's the logic of capitalism. Every company in every country is doing this, replacing people with machines, robots, and AI, because that's how companies survive. By capitalist logic humans can't survive, and the results already prove it.
@saravanampatti1 · 2 months ago
Completely useless model. Models learn from past data; they cannot predict the future. One political event can completely turn the model parameters upside down.
@spacekraftru · 10 months ago
Wikipedia is skewed data, subjectively corrected by influencing parties with budgets. So any LLM trained on it will be amateurish and subjective.
@recursion. · 10 months ago
Agreed
@variable42 · 10 months ago
Most ignorant comment here. Bravo, sir.
@Quantnom · 9 months ago
FinGPT is better, though.
@user-sg2fw6ze7n · 9 months ago
A lot of broadcasting is already done by virtual characters. Of course people operate them for now, but this too can be AI-fied, and then humans won't be able to do that job either.
@user-sg2fw6ze7n · 9 months ago
News anchors aren't needed either. AI anchors and reporters already present the news: MBN's anchor Kim Joo-hee, was it? China's CCTV did it ten years ago. An AI anchor is just the same, hardly any difference, so anchors aren't needed. Reporters who just copy other articles, or write up whatever the presidential office hands down, aren't needed either; AI can do that. Customer-service agents too: ChatGPT already handles the consultations and answers. 90% of those jobs will switch to AI, with only the remaining 10% done by humans. So where on earth will people find jobs? If Tesla and Google's Waymo run self-driving taxis, millions of taxi drivers lose their jobs. Truck drivers will disappear too. There's nowhere to find work; humans aren't needed. Machines handle parcel logistics as well. Humans aren't needed.
@jackie6786 · 2 months ago
Not a demo? What garbage.
@zandrrlife · 8 months ago
Subpar model 😂. I legit made something better... by myself, with $500. I guess everything is so new that it seems few actually understand how to build a SOTA model. Guess this was a POC, though.