
Downgrading My GPU For More Performance

Novaspirit Tech
285K subscribers
36K views

Checking out an older Nvidia Tesla card that can meet my needs for AI.
○○○ LINKS ○○○
Nvidia Tesla M40 ► ebay.us/ED5oqB
Nvidia Tesla P40 ► ebay.us/HWpCZO
○○○ SHOP ○○○
Novaspirit Shop ► teespring.com/...
Amazon Store ► amzn.to/2AYs3dI
○○○ SUPPORT ○○○
💗 Patreon ► goo.gl/xpgbzB
○○○ SOCIAL ○○○
🎮 Twitch ► / novaspirit
🎮 Pandemic Playground ► / @pandemicplayground
▶️ novaspirit tv ► goo.gl/uokXYr
🎮 Novaspirit Gaming ► / @novaspiritgaming
🐤 Twitter ► / novaspirittech
👾 Discord chat ► / discord
FB Group Novaspirit ► / novasspirittech
○○○ Send Me Stuff ○○○
Don Hui
PO BOX 765
Farmingville, NY 11738
○○○ Music ○○○
From Epidemic Sounds
patreon @ / novaspirittech
Tweet me: @ / novaspirittech
facebook: @ / novaspirittech
Instagram @ / novaspirittech
DISCLAIMER: This video and description contain affiliate links, which means that if you click on one of the product links, I’ll receive a small commission.

Published: 21 Sep 2024

Comments: 113
@KomradeMikhail 1 year ago
SD, GPT, and other AI apps are _still_ not taking advantage of Tensor cores... Literally what they were invented for.
@gardenerofthesun 7 months ago
As far as I know, llama-cpp can use tensor cores.
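For anyone who wants to verify this on their own card, here is a minimal sketch using the llama-cpp-python bindings; it assumes a CUDA-enabled build (the exact build flag has changed across versions), and the model path is a placeholder:

```python
from llama_cpp import Llama

# With a CUDA build, prompt processing and offloaded layers run through
# cuBLAS, which uses Tensor Cores on GPUs that have them.
llm = Llama(
    model_path="./models/llama-2-13b.Q4_K_M.gguf",  # placeholder path
    n_gpu_layers=-1,  # offload every layer to the GPU
    n_ctx=2048,
)
out = llm("Q: What is a Tesla P40 good for? A:", max_tokens=64)
print(out["choices"][0]["text"])
```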
@joo9125 1 year ago
Turing, not TURNing lol
@nneeerrrd 1 year ago
He's a Pro, don't tell him he's wrong 😂
@igyysdaddy191 5 months ago
you just turinged him on
@subsubl 4 months ago
😂
@syspowertools3372 6 months ago
I picked one up on eBay for $45 shipped. I also had an FTW 980 Ti cooler lying around. As long as the cooler fits the stock PCB of any 970-to-Titan X card, you can just swap it. You may need to cut out or re-solder the 12V power connector in the other orientation though; in my case I moved it from the back to the top. I also thermal-glued heatsinks onto the backplate, because not being in a server case means that VRAM gets warm.
@yungdaggerdikkk 5 months ago
holy moly bro, 45? any link or tip to get one that cheap? ty and hope u enjoy it x)
@joshuachiriboga5305 5 months ago
@@yungdaggerdikkk Newegg has them at about that price
@joshuachiriboga5305 5 months ago
When running Stable Diffusion, does it run out of VRAM at 12 GB or at 24 GB? The tech docs claim the card is two systems, each with its own CUDA and VRAM, etc...
@gregorengelter1165 1 year ago
I also got myself an M40 a few months ago. But cooling with air is not really a good solution, in my opinion. I was lucky enough to get a Titan X (Maxwell) water block from EK for 40€/~44 USD. With it, the part runs perfectly and reaches a maximum of 60 °C / 140 °F under full load. If you are not so lucky, I would still recommend using one of those AiO CPU-to-GPU adapters (e.g. from NZXT). Air cooling is comparatively huge and extremely loud (most of the time).
@KiraSlith 1 year ago
I'm using a trio of P40s in my headless Z840, kinda risking running into the PSU's power limit, but there's nothing like having a nearly real-time conversation with a 13b or 30b parameter model like Meta's LLaMA.
@jaffmoney1219 1 year ago
I am looking into buying a Z840 also; how are you able to keep the P40s cool enough?
@KiraSlith 1 year ago
@@jaffmoney1219 Air ducting and cranking the PCIe zone intakes to 100%. If you buy the HP-branded P40s, supposedly their BIOS will tell the motherboard to ramp the fans automatically. I'm using a pair supposedly from PNY, so I don't know.
@strikerstrikerson8570 1 year ago
@@KiraSlith Hello! Can you make a short video on how it works for you, on both the hardware side and the language-model side (e.g. LLaMA)? If you can't or don't want to make a video, you could briefly describe your hardware configuration here, and what is better to get for this. I'm looking at the old 2011-v3 platform with an 18-22 core CPU, a gaming motherboard from ASUS or ASRock, and 128/256 GB of DDR4 ECC RAM. At first I wanted to buy a modern RTX 30xx/40xx video card, but then I came across Tesla server accelerators, which have a large amount of VRAM (16/24/32 GB) and cost about 150/250/400 euros here. Unfortunately there is little information, and in the videos you do come across on YouTube, people run Stable Diffusion, which gives very deplorable results even on a Tesla V100, which an RTX 3060 beats. Thanks in advance!
@KiraSlith 1 year ago
@@strikerstrikerson8570 Sure, when it comes down for maintenance next; it's currently training a model. If you want new cards only and don't have a fat wallet to spend from, you're stuck with consumer cards either way. Otherwise, what you want depends entirely on what your primary goal is. Apologies in advance for the sizable wall of text you're about to read, but it's necessary to understand how to actually pick a card. I'll start by breaking it down by task demand:
- Image recognition and voice synthesis models want fast CUDA cores but still benefit from higher core counts, and the larger the input or output, the more VRAM they need.
- Image generation and voice recognition models also want fast CUDA cores, but their VRAM demands expand exponentially faster.
- LLMs want enough VRAM to fit the whole model uncompressed, plus lots of CUDA cores. They aren't as affected by core speed, but still benefit.
- Model training always requires lots of VRAM and CUDA cores to complete in a reasonable amount of time, regardless of what the model you're training does.
Some models bottleneck harder than others (though the harshest bottleneck is always VRAM capacity), but ALL CUDA-compute-capable GPUs (basically anything made after 2016) are able to run all models to some degree. So I'll break it down by degree of capability, within the same generation and product tier:
- Tesla cards have the most CUDA cores and VRAM, but have the slowest cores and require your own high-CFM cooling solution to keep them from roasting themselves to death. They're reliably the 2nd-cheapest option for their performance used, and the only really "good" option for training models.
- Tesla 100 variants trade VRAM capacity for faster HBM2 memory, but don't benefit much from that faster memory outside enterprise environments with remote storage. They're usually the 2nd-most expensive card in spite of that.
- Quadro cards strike a solid balance between Tesla and consumer: fewer CUDA cores than Tesla but more than consumer, faster CUDA cores than Tesla but slower than consumer, more VRAM than consumer but usually less than Tesla. Thanks to "RTX Experience" providing solid gaming on these cards too, they're the true jack-of-all-trades option and appropriately end up with a used price right in the middle.
- Quadro "G" variants (e.g. GP100) trade their VRAM advantage over consumer for HBM2 VRAM at absurd clock speeds, giving them a unique advantage in image generation (and video editing). They're also reliably the most expensive card in their tier.
- Consumer cards are the best used option for the price if you want bulk image generation, voice synthesis, or voice recognition. They're slow with LLMs, and if you try to feed them a particularly big model (30b or more) they will bottleneck even more harshly on their lacking VRAM (be it capacity or speed), with the potential to bottleneck even further when paging out to significantly slower system RAM.
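Since the harshest bottleneck named above is always VRAM capacity, a back-of-the-envelope check helps when shopping. Here is a minimal Python sketch; the ~20% overhead factor for activations/KV cache is an assumption, not a measured value:

```python
# Rough VRAM-fit estimate: parameter count x bytes per parameter,
# inflated by an assumed ~20% overhead for activations/KV cache.
def fits_in_vram(params_billions, bytes_per_param, vram_gb, overhead=1.2):
    needed_gb = params_billions * bytes_per_param * overhead
    return needed_gb <= vram_gb

print(fits_in_vram(13, 2.0, 24))  # False: a 13B model in fp16 needs ~31 GB
print(fits_in_vram(13, 0.5, 24))  # True: 4-bit quantized, roughly 8 GB
```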
@og_tokyo 9 months ago
Stuffed a Z440 mobo into a 3U case; will be putting 2x P40s in here shortly.
@StitchTheOtter 1 year ago
I did get myself a P40 for 170€. RTX 2080 gaming performance and 24 GB GDDR5 at 694.3 GB/s. Stable Diffusion on my 2080 runs around 5-10x faster than on the P40. But it would make a good price/performance cloud gaming GPU.
@vap0rtranz 4 months ago
Great explanation. Basically Gamers vs AI hackers. The AI models want to fit into VRAM, but are huge, so the 8 GB or 12 GB VRAM cards can't run them. Getting a new, huge-VRAM GPU is hella expensive right now. So an older card with lots of VRAM works. Also, the gamer cards tend to be overclocked/overheated, but the Tesla and Quadro cards are usually datacenter liquidations, so there's less risk of getting a fried GPU. BTW: the P40 is a newer version of the M40.
@SpottedHares 10 months ago
So according to Nvidia's own specs the M40 uses the same board as the Titan X and 900 series. So theoretically any cooling system that works for either of those two should also work on the M40.
@KratomSyndicate 1 year ago
I just bought an RTX 4090 last night and all the parts for a new desktop: i9 13900K, MSI Meg Z790, 128 GB DDR5, and 4 Samsung 990 Pros, just to do SD and AI. Maybe overkill.
@Mark300win 9 months ago
Dude you’re loaded 😁$
@sa_med 7 months ago
Definitely not overkill if it's for professional use
@madman1397 11 months ago
Tesla P40 24 GB cards are on eBay for sub-$200 now. Considering one for my server.
@zilog1 1 year ago
They are going for $50 currently. Get a server rack and fill them up!
@charleswofford5515 1 year ago
For anyone wanting to do this: I found the best cooling solution is a Zotac GTX 980 AMP! Edition 4 GB model. It has the exact same footprint. The circuit board is nearly identical. Bolts right on with very few modifications. You will need to use parts from both the Tesla and Zotac GPUs to make it work. Been running mine for a while now without issue.
@schifferu 1 year ago
Got my Tesla M40 a while back, and now have a fan cooler on it (EVGA SC GTX 980 Ti cooler) to mess around with, but just seeing the power consumption 😅😅
@edgecrush3r 1 year ago
I just purchased a Tesla P4 some weeks ago, and I'm having a blast with it. The low-profile card even fits in the QNAP 472XT chassis. Passthrough works fine (minor tweaks). Currently compiling a kernel to get support for vGPU (if I ever succeed).
@FlexibleToast 1 year ago
You say you need a newer motherboard to use the P40. Does any motherboard with PCIe x16 3.0 work?
@k-osmonaut8807 1 year ago
Yes, as long as it supports Above 4G Decoding.
@DanRegalia 11 months ago
So, I picked up a P40 after watching this video... Thanks! Do you have any videos that talk about loading these LLMs, or whether I should go with Linux/Windows/etc... maybe install JetPack from the Nvidia downloads? I've screwed around a little with Hugging Face, and that made me want to get the card to run better models, but rabbit hole after rabbit hole, I'm questioning my original strategy.
@NovaspiritTech 11 months ago
I'm glad you were able to pick up a P40 and not the M40, since the Pascal arch can run 4-bit modes, which covers most LLM models. LLMs change so rapidly I can't even keep up myself, but I have been running the Docker container from github.com/Atinoda/text-generation-webui-docker. But yes, this is a deep rabbit hole; I feel your pain.
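If you go the same route, a quick way to smoke-test the container is its OpenAI-compatible API; the port and endpoint below are the project's documented defaults, but treat this stdlib-only sketch as an assumption to check against your own container flags:

```python
import json
import urllib.request

# Assumed defaults: text-generation-webui's API extension on localhost:5000.
payload = {"prompt": "List two uses for a 24 GB GPU:", "max_tokens": 64}
req = urllib.request.Request(
    "http://localhost:5000/v1/completions",
    data=json.dumps(payload).encode(),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    print(json.loads(resp.read())["choices"][0]["text"])
```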
@vap0rtranz 4 months ago
The easiest out-of-the-box apps for running local LLMs are GPT4All and AnythingLLM. Hugging Face requires lots of hugging to not sink into rabbit holes :) Apps like the ones I mention keep things simple. Both have active Discord channels that are helpful too.
@l0gic23 20 days ago
Remember how much it was at the time?
@seanoneill9130 1 year ago
Home Depot has free delivery.
@NovaspiritTech 1 year ago
😂
@garthkey 1 year ago
With them having the choice of the worst wood, no thanks
@Bjarkus3 26 days ago
If you put a P40 with a 3090, will it be bottlenecked at P40 speeds or will it be an average?
@timomustamaki5407 1 year ago
I have been planning this move as well, as the M40 is dirt cheap on eBay. But I worry about one thing you did not touch on in this video (or at least I did not notice if you did): how did you solve the power cabling issue? I believe the M40 does not take a regular PCIe GPU power cable but needs something different, an 8-pin cable?
@KiraSlith 1 year ago
That's right, the Tesla M40 and P40 use an EPS (aka "8-pin CPU") cable, which can thankfully be resolved using an adapter cable. Just a note: the 6-pin PCI power to 8-pin EPS cables some Chinese sellers offer should ONLY be used with a dedicated cable run from the PSU, to avoid cable meltdowns! Thankfully this isn't an issue if you're using an HP Z840 (which also conveniently solves the airflow issue), or a custom modular PSU with plenty of PCI power connections, but it can quickly become an issue for something like a Dell T7920.
@win7best 1 month ago
The P40 is already way better for the price; also, if you wanted more CUDA cores you could have gotten 2 K80s for the same price.
@sergiodeplata 1 year ago
You can use both cards simultaneously. There will be two CUDA devices.
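A minimal PyTorch sketch of what that looks like in practice; the device indices depend on your slot order, so treat cuda:0/cuda:1 as examples:

```python
import torch

# Each card shows up as its own CUDA device; work is placed explicitly.
for i in range(torch.cuda.device_count()):
    print(i, torch.cuda.get_device_name(i))

a = torch.randn(1024, 1024, device="cuda:0")  # e.g. the newer card
b = torch.randn(1024, 1024, device="cuda:1")  # e.g. the Tesla
c = a @ a.T  # runs on cuda:0
d = b @ b.T  # runs on cuda:1, independently of the first card
```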
@joshuachiriboga5305 5 months ago
The Tesla K80, with 24 GB of VRAM, claims a setup of 2 systems, each with its own CUDA and VRAM. When running Stable Diffusion, does it behave as one GPU with 24 GB, or does it behave as 2? Does it run out of VRAM at 12 GB or 24 GB in image production?
@truehighs7845 4 months ago
That's exactly my question.
@AChrivia 1 year ago
2:21 Actually, that Tesla card has 1,150 more CUDA cores than that 2070... 3,072 - 1,922 = 1,150. The only thing I'm curious about is how well it can mine. 🤔 If anything, why the hell wouldn't you just get a 3090 Ti? It has 10,496 CUDA cores, which is far and beyond the Tesla in both work and gaming capabilities. If it's due to sheer price, I get it, but the specs are still beyond what you currently have.
@Antassium 1 year ago
Cost:Performance...
@alignedfibers 1 year ago
I went with a K80, but Stable Diffusion only runs with torch 1.12 and CUDA 11.3, and right now it only runs on 12 GB, half the memory and half the GPU of the K80, because it is a dual-GPU card. The M40 should allow a modern CUDA and Nvidia driver, and also needs no workaround to access the full 24 GB like the K80 does.
@joshuachiriboga5305 5 months ago
Thank you, I have been looking for this info
@truehighs7845 4 months ago
Does it use the whole 24 GB of VRAM? Because it's basically two GPUs put together, is the VRAM working as one?
@jerry5566 1 year ago
The P40 is good, but my only concern is that it has probably been used for mining.
@Antassium 1 year ago
Mining has been proven not to cause any more significant wear than regular duty cycles. In fact, in some situations a mining rig would be a cleaner and safer environment than a PC case on the floor in some person's home, with toddlers sloshing their chocky milk around, for example 😂
@zygge 1 year ago
A PC doesn't need an HDMI output to boot. Any display interface is OK: VGA, DVI, or DP.
@TheRainbowdashy 6 months ago
How does the P40 perform for video editing and 3D design programs like Blender?
@beholder4465 1 year ago
I have an ASUS H410 HDV M.2, Intel chipset; is compatibility good with the Tesla M40? Ty
@fuba44 1 year ago
But wait, I was under the impression that both the M40 and the P40 are dual-GPU cards, so the 24 GB of VRAM is split between the two GPUs. Or am I mistaken? When I look up the specs it looks like only 12 GB per GPU.
@unicronbot 1 year ago
The M40 and P40 are single-GPU cards.
@yb801 11 months ago
I think you are talking about the K80 GPU.
@simpernchong 1 year ago
Great video. Thanks!!
@markconger8049 1 year ago
The Ford F150 of graphics cards. Slick!
@titopancho 2 months ago
After watching your video I tried to do the same, but I had a problem. I have the HP DL380 server and I purchased the Nvidia Tesla P100 16GB, but I can't find the power cable. Watching other people, I am afraid to buy the wrong one and fry my server... Can you please tell me the right cable to buy?
@idcrafter-cgi 1 year ago
My 4090 takes 2 seconds to make a 512x512 at 25 steps. It only has 24 GB of VRAM, which means I can only make around 2000x2000 images with no upscaling.
@brachisaurous 10 days ago
The P100 would be better for Stable Diffusion.
@nodewizard 1 month ago
Just buy a used RTX 3090 for $500. Works great with generative art, LLMs, etc.
@carlosmiguelpimientatovar8458 7 months ago
Excellent video. In my case I have a workstation with an MSI X99A TOMAHAWK motherboard and an Intel Xeon E5-2699 v3 processor (and I currently use 3 monitors). Because of this I installed an AMD FirePro W7100 GPU, which works very well for me in SolidWorks. The RAM is 32 GB non-ECC. The problem is that I am learning to use ANSYS, and this software is married to Nvidia. Looking at the ANSYS GPU compatibility lists for GPU calculation acceleration, I see that the K80 is supported, and taking into account the second-hand price, I am interested in purchasing one. How can I configure my system to install an Nvidia Tesla K80 and have the AMD GPU keep driving image and video output for my monitors as it currently does? The Nvidia K80 GPU has 24 GB of RAM; can this be affected when using this GPU in conjunction with the AMD GPU, which only has 8 GB? Would the K80 be restricted to the RAM of the FirePro W7100? My PSU is 700 watts. Thank you.
@tomaszmaciaszczyk2116 5 months ago
CUDA cores, my friend. I have this card on my table right now. Greetings from Poland.
@Robstercraw 1 year ago
You can't just plug that card in and go. There are driver issues. Did you get it working?
@gardenerofthesun 7 months ago
Owner of a P40 and a 3090 in the same PC. No problems whatsoever, just install the Studio driver.
@blackthirt33n 1 month ago
I have one of these cards; how do I use it on an Ubuntu 22.04 computer?
@akissot1402 1 year ago
Finally, I will be able to fine-tune and upgrade my Gynoid. Btw the 3090 has 10,496 CUDA cores, and it's about $850 at the cheapest brand new.
@jetfluxdeluxe 1 year ago
What is the "idle" power draw of that?! If it's on 24/7 in a server, can it power down? Can't find info on that online.
@execration_texts 1 year ago
My M40 idled at ~30 watts; the P40 is closer to 20.
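If you want to read that number on your own card, here is a minimal sketch using the NVML Python bindings (the pynvml module, installable as nvidia-ml-py); NVML reports power in milliwatts:

```python
import pynvml

pynvml.nvmlInit()
handle = pynvml.nvmlDeviceGetHandleByIndex(0)  # first GPU in the system
name = pynvml.nvmlDeviceGetName(handle)
power_w = pynvml.nvmlDeviceGetPowerUsage(handle) / 1000.0  # mW -> W
print(name, f"{power_w:.1f} W")
pynvml.nvmlShutdown()
```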
@joshuascholar3220 7 months ago
I'm about to try it with a 32 GB Radeon Instinct MI50.
@FreakyDudeEx 10 months ago
Kind of sad that the price of these cards in my region is ridiculous... it's actually cheaper to get a second-hand RTX 3090 rather than a P40... and the M40 is double the price compared to the one in this video...
@bopal93 8 months ago
What's the idle power consumption of the M40? I'm thinking of using it in my server but can't find details on the internet. Thanks
@mateuslima788 3 months ago
You could've made an actual comparison.
@MWcrazyhorse 11 months ago
How does this compare to an RTX A2000?
@robertfontaine3650 8 months ago
That is a heck of a lot cheaper than the 3090s.
@cultureshock5000 1 year ago
Is the 8 GB low-profile good for my SFF Dell? I like my RX 550, but I could play a lot more stuff. I bet I could play Starfield at 1080p on low on the 8 GB P4... is it worth the 90 bucks?
@chjpiu 6 months ago
Can you suggest a desktop workstation that can take a Tesla M40? Thank you so much
@truehighs7845 4 months ago
Look for an HP Z840, but buy the GPU separately, because you are probably going to pay way more if it's included.
@davidburgess2673 8 months ago
What about HBCC on a Vega 64 for an "unlimited" boost in RAM, albeit a little slower, but with video out etc.?
@jameswubbolt7787 1 year ago
I never knew. THANKS.
@hardbrocklife 1 year ago
So P40 > M40?
@b_28_vaidande_ayush93 1 year ago
Yes
@ghardware_3034 1 year ago
@@b_28_vaidande_ayush93 For training or FP16 inference get the P100; it has decent FP16 performance. The P40 is horrible at that; it was specialised for INT8 inference.
@bulcub 1 year ago
I have a server that I'm going to repurpose as a video renderer attached to a multiple-drive storage bay (24). I wanted to know if this is possible? Would I need Proxmox, etc.? Would the P40 model be sufficient?
@NovaspiritTech 1 year ago
I have a video on this topic using Tdarr
@MrHasie 1 year ago
Now, I have Fit, what’s its comparison? 🤭
@trumpsextratesticle8590 10 months ago
Too bad you can't slap this thing in with a gaming GPU in an SLI config and use the VRAM and computational power of the secondary card. MODDERS, WHERE ARE YOU!!!
@112Famine 1 year ago
Was anyone able to get this server graphics card to play video games? Or only to work the way you have it, running tasks? It's a "smart" card, like how cars are able to drive.
@llortaton2834 1 year ago
All Tesla cards can play games; the problem with them is the cooling, because there is no heatsink fan. You have to either buy your own 3D-printed shroud or have a server that shoots air across the chassis.
@alignedfibers 1 year ago
m40?
@gileneusz 7 months ago
isn't the 4090 faster?
@TheRealBossman7309 1 year ago
Great video👍
@garrettnilan5609 1 year ago
Can you run a Stable Diffusion test and show us how to set it up please!
@НеОбычныйПользователь
He bought a Maxwell and is bragging about it. If only it were at least a Pascal...
@skullpoly1967 1 year ago
Yay rmiddle
@shlomitgueta 1 year ago
I have an NVIDIA GeForce GTX 1080 Ti with 3,584 CUDA cores, and I was thinking it is so old lol
@MaikeLDave 1 year ago
First!
@itoxic-len7289 1 year ago
Second!
@unclejeezy674 1 year ago
Try --medvram or --lowvram. 24 GB should be able to get 2048x2048 with --lowvram.
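Those flags belong to the AUTOMATIC1111 webui. If you script Stable Diffusion with the diffusers library instead, the roughly equivalent memory-saving knobs look like the sketch below; the model ID and resolution are just examples, and CPU offload needs the accelerate package:

```python
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",  # example model id
    torch_dtype=torch.float16,
)
pipe.enable_attention_slicing()   # lower peak VRAM, like --medvram in spirit
pipe.enable_model_cpu_offload()   # keep idle submodels in system RAM
image = pipe("a test render", height=768, width=768).images[0]
image.save("out.png")
```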