You’re right, I made a mistake here - I only really noticed this when reviewing the video; I guess since I made it I could tell the difference. Next time the graphs will be easier to distinguish!
Thanks, Daniel, for the video and for sharing the links to the materials. You're a legend. Got an M3 Pro 14" (11-core CPU, 14-core GPU, 18GB) last month and have been wondering whether it was an optimal move.
Surprised that you did not include RAM bandwidth at the beginning. Whenever you do non-batched inference, memory bandwidth becomes your main constraint instead of GPU performance, as shown in your M1 Pro to M3 Pro comparison. llama.cpp's M-series benchmarking shows really nicely why the M3 Pro with its 150GB/s (instead of 200GB/s) memory is the problem, not its (faster) GPU. If one just does inference and has large models requiring lots of RAM, the M2 Ultra really shines with its 800GB/s of bandwidth. Totally agree that with training and batching it's different, and NVIDIA's new GPU performance blows away Apple silicon.
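For anyone who wants a back-of-envelope check on that, here's a tiny sketch (numbers are illustrative, not measured): single-stream decode has to stream all the weights for every generated token, so bandwidth divided by model size gives a rough upper bound on tokens/sec.

```python
# Illustrative only: real decode speed is lower because of overheads,
# but it can't beat this bandwidth-derived ceiling.
def max_tokens_per_sec(params_billion: float, bytes_per_param: float, bandwidth_gb_s: float) -> float:
    model_gb = params_billion * bytes_per_param  # GB that must be read per generated token
    return bandwidth_gb_s / model_gb

# Hypothetical 7B model at ~4-bit quantisation (~0.5 bytes/param)
for name, bw in [("M3 Pro, 150 GB/s", 150), ("M1 Pro, 200 GB/s", 200), ("M2 Ultra, 800 GB/s", 800)]:
    print(f"{name}: <= {max_tokens_per_sec(7, 0.5, bw):.0f} tokens/s")
```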
That's YouTube-quality education: good enough, but most of the time it's missing crucial details, and that mistake ends up twisting the truth, especially on performance. Although this person has studied and gets paid a BIG salary to know such details..... weird, but I put it down to a simple human mistake. Still a good video!
Woah, I didn't know about the memory bandwidth difference between the M1 and M3. Thank you for the information. I just wanted to try raw out-of-the-box testing. Fantastic insight and thank you again.
@@gaiustacitus4242 yes, but you can split layers across multiple cards. For me, I decided on an M2 Max 96GB Mac Studio and not a 1kW+ heater PC, even though in pure GPU horsepower the 4090 is much faster. And I never regretted it. Correction - I now regret my M2 Max decision since last week, because Apple/macOS Sequoia will finally do nested virtualization, but only on M3 and above. And with this I have hopes of virtualized GPUs at some point. Nvidia/CUDA has always been virtualizable and works in Docker containers/VMs.
Not even that much - it doesn't come close for those who really use TensorFlow and PyTorch. Besides, if your production environment is in the cloud, those two libraries are better integrated than MLX. On top of that, for quick deployments you already have containers preconfigured and optimized with those libraries and CUDA, since cloud servers are dominated by NVIDIA and not Apple's "Neural Engine".
When you have the same chip you will hit the silicon lottery: one machine will have a better GPU while the other will have a better CPU, depending on dead transistors and small lottery-based differences. So I'm not surprised that an M3 Pro and an M3 Max with the same Neural Engine will perform differently. The silicon lottery is a real thing that will always be a factor in computing. Great video by the way, and very informative.
For small LLMs, you are correct. For 13B-parameter or larger LLMs, a maximum-spec Mac Studio M2 Ultra or MacBook Pro M3 Max will outperform the best Windows-based solution you can build. Of course, the new Copilot+ PCs running Snapdragon X Elite CPUs will also outperform the desktop build you've recommended when running 3B to 3.8B parameter LLMs.
Would be interesting to see how the 128GB version of the M3 Max performs compared to the RTX cards on very large datasets, since roughly 75% (~96GB) can be used as VRAM on that Apple silicon.
Hi Daniel, will you be teaching something more than image classification? You are the best programming teacher I have ever followed. Looking forward to your new deep learning course on ZTM.
Sir, I follow all of your blogs, videos, etc. I want to be an ML Engineer, so I enrolled in your 'Complete ML and Data Science' course on ZTM. What a marvellous way of teaching ❤❤
One thing is clear, even as a PC person: Macs have a steep advantage with the M3's dynamic RAM-to-VRAM conversion and low power. Sure, they don't have the hardware or software of NVIDIA, but for some AI users the entry price for the VRAM is a winner.
I believe you can also target Apple silicon's NPU rather than the GPU from PyTorch, and I am sure it will perform better. Though I'm not sure how much memory the NPU has access to. It would be great if you could explore this and do a video on it.
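For what it's worth, my understanding is that PyTorch's built-in `mps` backend only targets the GPU; reaching the Neural Engine usually means exporting the model to Core ML. A minimal, hedged sketch assuming `coremltools` is installed (the toy model and filename are made up for illustration):

```python
import torch
import coremltools as ct  # pip install coremltools

# Hypothetical toy model purely for illustration.
model = torch.nn.Sequential(
    torch.nn.Linear(128, 64),
    torch.nn.ReLU(),
    torch.nn.Linear(64, 10),
).eval()

example = torch.randn(1, 128)
traced = torch.jit.trace(model, example)

# compute_units asks Core ML to prefer the CPU + Neural Engine over the GPU;
# Core ML still decides per-layer whether the ANE can actually run it.
mlmodel = ct.convert(
    traced,
    inputs=[ct.TensorType(shape=(1, 128))],
    compute_units=ct.ComputeUnit.CPU_AND_NE,
    convert_to="mlprogram",
)
mlmodel.save("tiny_classifier.mlpackage")  # made-up filename
```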
The M3 Pro being slower/not much faster in some tests is probably because of the slower RAM. I'd be interested to see how 30- and 40-series cards stack up, but considering the cost of the laptops already, this is quite the effort, so no complaints.
@@kborak I'm not a mac user, I wouldn't buy Apple hardware for love or money. But the chips are still pretty good so it's interesting to see how they stack up to a better GPU for this kind of workload.
This is interesting! Between the M3 Pro 16GB (150GB/s) and the M3 Max 32GB (400GB/s), and considering the M1 Pro 32GB (200GB/s), would you suggest that RAM is a much more important factor for these ML tasks than memory bandwidth? Or something else? Would be keen to see a test between an M3 Pro 32GB and your M1 Pro 32GB to see whether the 50GB/s difference in memory bandwidth makes any real-world difference (also one less GPU core but a faster boost clock on the M3 Pro).
Very helpful thanks Daniel. I was going to race out and buy an M3 to do my ML work, but I will hold off for now. I suspect Apple will do something to help boost performance considerably on the software side, but who knows.
Finally a useful video. Too many “reviews” focus solely on content creators. Now I know I can do light ML on my Mac. And do the heavy lifting with my 30 series RTX card.
I'd love to see you test the M3 Ultra with 64 GB RAM when it comes out. I am using the M2 Studio Ultra at present and wonder if it will be worth upgrading. Running batches, it gets warm, but I've never heard its fan yet.
7B parameters ÷ 250,000,000 (i.e. delete 7 zeroes and divide by 25) = 28GB, which is close enough as simple maths for estimating GB of memory from model parameters (assuming fp32, 4 bytes per parameter).
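Spelled out as a quick sanity check (the rule is just bytes = parameters × bytes per parameter):

```python
def model_size_gb(params: float, bytes_per_param: float) -> float:
    return params * bytes_per_param / 1e9

print(model_size_gb(7e9, 4.0))  # fp32: 28.0 GB  (the "divide by 250 million" rule)
print(model_size_gb(7e9, 2.0))  # fp16/bf16: 14.0 GB
print(model_size_gb(7e9, 0.5))  # ~4-bit quant: 3.5 GB
```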
I wish you would make videos covering AI news - you're probably more qualified to talk about new developments in this space than 80% of these “AI channels”.
The comparison between the M1 Pro and M3 Pro is not ideal. The M3 Pro you are testing is the binned version with only 14 cores, however you're comparing it to the full M1 Pro. To get accurate performance measurements it's best to measure both full chips rather than the binned version - that way we can truly see whether the memory bandwidth makes any difference when it comes to machine learning.
Hi Daniel! What a great PyTorch tutorial you have made. Thanks for that! Also thanks for this speed-comparison video. Could you record a video comparing the speed of the different Colab tiers? I mean free, $10, and $50. The M3 Max and your Titan (which you have already done) could be added too. Maybe one of your friends has a $50 account and can run those tests for you [for all of us :)]
Yeah you're right, I also just found out that the M1 Pro has higher memory bandwidth than the M3 Pro (200GB/s vs 150GB/s), thanks to another comment. That likely adds to the performance advantage of the M1. Strange to me that a 2-year-old chip can comfortably outperform a newer chip.
I have only a 16 GB M1 Pro; on the first 2 benchmarks I get similar or slightly faster speeds. I will try to run the other benchmarks - I got sidetracked modifying the 1st benchmark to run on a quad RTX 1070 setup.
In the process of learning ML/AI-related tasks. Based on your experience, would you prefer a 13” MBP M2 24GB RAM ($1,299 new) or a 14” MBP M3 Pro 18GB RAM ($1,651 used)?
The 24GB of RAM would allow you to load larger models. But it also depends on how many GPU cores the two laptops have. Either way, both are great machines to start learning on
IMHO MacBooks are only inference machines, not training machines. They're great for running 7B, 13B, 30B LLMs locally (depending on your amount of RAM) or for quick student-scale training on something like MNIST. I personally write training code and run experiments with a small batch size on my M1 Pro, then copy the code to my 3090 PC and run long training with bigger batches and fp16. While the PC is busy, I run the next experiments in parallel on the laptop. If you load your main laptop with a big training run, you will have an uncomfortable experience if you want to browse, game, etc. in parallel with training.
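The nice thing is the same script can pick whichever accelerator is available, so nothing needs to change when you copy it from the Mac to the 3090 box. A minimal sketch:

```python
import torch

# Pick whichever accelerator the current machine has.
if torch.cuda.is_available():
    device = torch.device("cuda")   # the 3090 PC
elif torch.backends.mps.is_available():
    device = torch.device("mps")    # the M1 Pro laptop
else:
    device = torch.device("cpu")

print(f"Training on: {device}")
# model.to(device); batch = batch.to(device)  # rest of the training loop is unchanged
```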
While this is a nice buying guide for my next laptop, this is just a shining endorsement for Google Colab. What an insane value for new-comers looking to learn while not being hobbled by old equipment.
On my M1 Max 64GB... I'm getting 8208 on Core ML Neural Engine... My Core ML GPU falls more in line at 6442... All this while powering 3 screens, watching YouTube and a Twitch stream. Not that I expect those things to add much load... but it is nice to have a machine that can basically do everything at once with near-zero penalty.
If you really want to show Apple silicon's advantage, just wait till the M3 Ultra comes out with 256GB of memory and then use a model that needs that much memory. Then the only comparison would be ~3 A100s. With Apple's new MLX and flash attention we might even get better results.
Can’t wait to pick one up. I was planning on an M2 Ultra, but I’m expecting to keep this machine for a good while as part of my server rack, so M3 Ultra it is!
Although it's nice to see vision models, most people wanted to see inference with transformer LLMs, then fine-tuning with LoRA, SFT. Llama 2 at Q4_0 is hardly a test - even an 8GB Mac with Metal can run that. Would like to see different quants at 33B and 70B with different loaders: AWQ, GPTQ, exllama, etc.
At one point you say the bottleneck is memory copies from CPU to GPU and back, but the M-series doesn't have to do memory copies because it's all shared memory. In fact, one of the first optimizations for code on Apple Silicon is removing all the memory copying code because it's an easy gain. Have you accounted for this in either your code or the library code you're using, or both?
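Worth noting this depends on the framework: PyTorch's `mps` backend still has you call `.to("mps")`, whereas Apple's MLX leans on unified memory directly, with no explicit transfers. A tiny illustrative sketch, assuming the `mlx` package is installed:

```python
import mlx.core as mx  # pip install mlx (Apple silicon only)

# Arrays live in unified memory: no .to("gpu") / .cpu() copies anywhere.
a = mx.random.normal((2048, 2048))
b = mx.random.normal((2048, 2048))
c = a @ b      # dispatched to the GPU by default, operating on the same memory
mx.eval(c)     # MLX is lazy, so force the computation to actually run
print(c.shape)
```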
I am a medical doctor with a recently acquired Ph.D. in pharmacology. I am currently engaged in clinical research, focusing on identifying factors that lead to therapeutic failure in patients with various conditions. My work involves analyzing patient data files that include sociodemographic information, pathological records, clinical data, and treatment details. These datasets typically contain between 100 and 2,000 variables per patient, with a maximum of 1,000 patients in an ideal scenario. I will be using R and RStudio to process and analyze this data in various ways. Based on your experience, could you suggest a computer configuration capable of handling this type of data processing efficiently? Thanks in advance!
Thanks for this, really useful - it confirms my initial thought to just get an M1 Pro 16GB over an M3 8GB (the M1 Pro is slightly cheaper). My M1 Pro is similar to yours, 10 CPU + 16 GPU cores but just 16GB, and it has been slightly faster on both PyTorch benchmarks. I was then curious to see how it compares to a quad RTX 1070 setup. I modified your code (I will make a PR) to use all four GPUs for CIFAR100. In general it is faster than the M1 Pro; what's interesting is how a single card compares to quad cards. On small batches CIFAR100 was really bad, but by batch size 512 it was faster than a single card (34 secs at batch size 1024). It keeps improving until 3072 with 16 secs, then gets worse at 4096, back to 19 secs, similar to 2048. Also by batch size 4096 the GPU VRAM is almost full, close to 8GB.
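For anyone curious, the multi-GPU change is roughly just wrapping the model in `DataParallel` - a hypothetical sketch (not the actual PR), which also hints at why tiny batches were slow: the scatter/gather overhead dwarfs the per-GPU work until batches get big.

```python
import torch
import torch.nn as nn

# Stand-in model just to show the wrapping; the real benchmark uses a CNN on CIFAR100.
model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 100))

if torch.cuda.device_count() > 1:
    # Replicates the model on every visible GPU and splits each batch across them.
    model = nn.DataParallel(model)
model = model.to("cuda")

x = torch.randn(1024, 3, 32, 32, device="cuda")  # e.g. a 1024-image batch
out = model(x)    # scatter -> per-GPU forward -> gather on cuda:0
print(out.shape)  # torch.Size([1024, 100])
```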
The problem with reliance on nVidia GPUs is that performance takes a nosedive once the LLM can no longer be loaded into the video card's onboard RAM. Any M-series Mac with 128 GB RAM will outperform a PC equipped with 120 GB RAM and the best available nVidia GPU. I know because I've invested in builds of both sets of hardware only to learn the hard way that a Windows PC with an nVidia 4090 GPU with 24 GB RAM is extremely disappointing for 13B parameter or larger LLMs. The smaller LLMs do not yield acceptable quality of results. At present, your best approach to running a private LLM that approaches the accuracy of ChatGPT 4o is a Mac Studio M2 Ultra with 192 GB RAM and maximum CPU/GPU cores, followed by a MacBook Pro M3 MAX with 128 GB RAM and maximum CPU/GPU cores. Of course, if your goal is to just tinker with a local LLM to gain a better understanding of how AI works, then run smaller LLMs on a Windows PC with an nVidia GPU.
Both the M3 Pro and the M3 Max you tested have lower memory bandwidth than the previous M1/M2 Pro and M1/M2 Max, and since bandwidth is hugely important, that was reflected in your results. The M1/M2 Pro have 200 GB/s whereas the M3 Pro only has 150 GB/s. The M1/M2 Max have 400 GB/s of bandwidth, but the M3 Max model you chose only has 300 GB/s (there are also M3 Max models with 400 GB/s).
Wow! I didn’t even know this… excellent info. So what makes the bandwidth increase from the base models? Is it RAM upgrades or storage? Or something else?
Yes that would be a perfect laptop to start learning ML. You can get quite far with that machine. Just beware that you might want to upgrade the memory (RAM) so you can use larger models.
@@mrdbourke Sir, I have an M3 Air 16 GB and a MacBook Pro M3 Pro 18 GB. Which should I go for if I am starting to learn and grow in ML long-term? The price difference between the two is 30,000/-. Please advise, thank you.
@@krishna1-c6d You don't need such heavily powered machines to start learning ML. Just use Google Colab to learn. Maybe then, once you implement projects, you will understand which is better.
Question: I bought the M1 Max with 64 GB RAM and a 32-core GPU. Like you, I am still extremely satisfied with my purchase two years later. I like your setup using the Apple machine in conjunction with a box with that RTX 4090 installed. Would that setup run in parallel with my GPU cores? And similarly, if I added equivalent RAM to that box, would it work together with my installed 64 GB?
Hm, in my opinion, a strange metric because "effectiveness per dollar" doesn't really tell you much. My bike costs $300 and my car cost $10000. My bike averages around 20 mph and my car 75 mph. That comes out to 30x the price for 4x the speed. Did this tell you anything? In my opinion, no. What is a far more useful metric is the options the purchase makes available to you. If I have a car, traveling 10 miles for food is a very easy decision to make. If I only have a bike, traveling 10 miles is a major decision. With the right hardware, you unlock options like "iterative experimentation" whereas before, you had to carefully choose your workloads. And as he mentions, certain configurations simply lock you out of certain desired avenues. (8 GB of RAM is too little for many projects.) So yeah... spend is not a very useful metric, in my opinion. Choosing the bike over the car is a pretty pricey choice for reasons beyond money.
@@jks234 Interesting analogy, but the car has many other features (keeps you warm, carries 4 people....). When buying compute power for AI, then yes, you could also consider that a laptop might be better than a desktop for convenience, but it's not really like the car example. If you were comparing a mainframe to a laptop to a desktop it might be nearer this analogy. Guess it won't matter soon, as the cheapest will be cloud, purely by volume!
This is really a great video. The problem I have is that all my development is on a laptop, and I think this is wrong. The conundrum is simple: I will present my work, that's a given, so how do I develop on a much more powerful desktop and still have the ability to present my work? I hate PowerPoints of screenshots; I want to really show what I'm doing.
Great video. Could you please update us on whether the new MLX changes the results or your conclusion at all? Would love to know if the M-series chips are as good as others are saying.
So cool - are you able to run these tests on an M3 Max chip with a maxed-out RAM configuration? Could it be more "usable" than, say, a 4090 with "only" 24GB of dedicated VRAM?
There are too many considerations that were left out. M3 chips need more RAM because they’re sharing it with the system - you want 48-64GB for these tests. In addition, this didn’t mention the difference between performance cores and efficiency cores; the ratios changed with the latest M3 CPUs. Finally, with RAM upgrades you want to consider the memory throughput, which was capped lower unless you upgraded the M3 Max. All in all this is a good general comparison for affordable devices that students may have. I’d like to see an upgraded M3 Max/64GB/4TB, acknowledging that NVIDIA would still be faster. Of course, if speed is the game you’d put this on an AWS server somewhere and just have it churn for you.
The Titan is five years old. It would have been nice to include a current GPU like the 4090 - it can be 2.5× faster than the 3090, which is itself newer than the Titan.
Seems the M3 is crippled on most tests due to low memory, so it's less a real M-series comparison and more a “how low RAM can hurt you” test. I would have loved to see all models with the same RAM, or all being base or maxed-out models. That said, interesting insights on the effect of RAM and how NVIDIA performs when we’re talking strictly GPU.
All the M3 models are the base variant in their category. The only upgraded model was the M1 Pro (you can’t buy it anymore). But yes, you’re right, it would be cool to see them all with the same RAM!
Do Apple silicon chips handle the workload on the neural cores themselves, or do they need to be specifically invoked via an SDK from the code? What was the workload on them during each test? I wonder if they were invoked at all. If they were, it sounds like they don't matter compared to the GPU; however, it's claimed they can do something like 17 TOPS, which outperforms any Google Coral. Moreover, Apple claims the neural cores are 60% faster on the M3 compared to the M1. Confused now.
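One way to check after the fact is to watch Neural Engine power while a benchmark runs; if it stays at zero, the ANE wasn't touched. A hedged sketch - the `ane_power` sampler name is from memory, so check `man powermetrics`:

```python
import subprocess

# ane_power sampler name is an assumption from memory; cpu_power/gpu_power are standard.
cmd = [
    "sudo", "powermetrics",
    "--samplers", "cpu_power,gpu_power,ane_power",
    "-i", "1000",   # sample every 1000 ms
    "-n", "5",      # take 5 samples, then exit
]
result = subprocess.run(cmd, capture_output=True, text=True)
print(result.stdout)  # look for the ANE power lines while your benchmark is running
```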
In this video, the M3 base model has only 8GB RAM and the M1 Pro has 32GB RAM. What if I'm choosing between an M3 base with 16GB RAM and an M1 Pro that also has 16GB RAM - should I still go for the M1 Pro? Thanks
Not for pure non-batched inference, where memory bandwidth as well as memory size is the main constraint. There the M2 Ultra's 800GB/s vs the 4090's ~1,000GB/s is not so bad. The higher GPU power of the 4090 really shines with batched processing.
Seasoned ML/AI engineers know just about the only thing we use our laptops/personal machines for is a web client to log in to cloud services and train from there 😂
The M1 Pro doesn't perform better because of more GPU cores, but because the M3 Pro was seriously handicapped - not just fewer performance cores, but the memory bandwidth is severely cut back.