As a large datacenter storage admin, this completely blew my mind. I spent probably an hour hypothesizing breathlessly with colleagues about the options and possibilities this introduces, especially for highly composable infrastructure build-outs. But then I realized the hitch: what's the price tag on these drives? Adding the hardware to make these things work as network endpoints is probably not going to make the cost of NVMe flash storage go down.
Notice they also offer an adapter, meaning normal NVMe drives are supported. On a TCO basis, I think it will be cheaper once you consider the cost of a storage server with CPUs, RAM, HBA, NIC, etc., plus a switch.
There will be an SoC that handles the network and flash storage logic. Since the requirements will be put down in a spec, the SoC and firmware can be highly optimised for the task, so it could become as cheap as a consumer-grade router SoC. But it has to become an industry standard to drive the economies of scale; till then it will be expensive.
Have you used any networked PCI-Express? In our test lab we have some PCIe-over-fabric (I won't say NVMe-oF, as there are other devices on endpoints, from accelerators to RAM tanks) and genuine PCI-Express networks. While the reach is very short (no optical stuff, as far as I know), within a single aisle at best, the scalability and ease of reconfiguration this introduced into our systems is insane. The cost of PCIe switches is also starting to fall, and with networked Gen4 it actually makes sense.
Would you be able to do away with storage servers with this and just have racks of network switches? I know people are moving more towards software-defined storage, but cutting out the middlemen seems risky to data integrity. Other than that, this blew my mind too because it seems so scalable compared to current configurations.
I mean..... I feel like that would be pretty far off. PCIe Gen 4 x4 (typical NVMe SSD) is rated at 8 GB/s (that's a big B as in bytes). A 10GbE switch is small b as in bits, or about 1250 MB/s. So 10GbE is a ~6x slower interface than PCIe Gen 4 x4. So really, you kinda need to be at the 100GbE scale for this to make sense, because even a 40GbE switch is still only 5 GB/s. There are some "affordable" switches with 40GbE at the moment, but it's usually just uplink ports. Dunno. I feel like it would be a long time before it either gets retired or reaches the scale needed to be cheap to prosumers.
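For anyone who wants to redo the arithmetic, here is a quick sketch. These are nominal line rates only; real throughput will be lower once protocol overhead is added.

```python
# Compare Ethernet line rates against a PCIe Gen4 x4 NVMe SSD (~8 GB/s nominal).

def gbps_to_mbytes(gbps: float) -> float:
    """Convert a line rate in Gbit/s to MB/s (1 byte = 8 bits)."""
    return gbps * 1000 / 8

PCIE_GEN4_X4 = 8000  # MB/s, nominal

for name, gbps in [("10GbE", 10), ("25GbE", 25), ("40GbE", 40), ("100GbE", 100)]:
    mbs = gbps_to_mbytes(gbps)
    print(f"{name}: {mbs:.0f} MB/s ({PCIE_GEN4_X4 / mbs:.1f}x slower than PCIe Gen4 x4)")
```

So 10GbE lands at 1250 MB/s (~6.4x slower), 40GbE at 5000 MB/s, and only 100GbE actually outruns a single Gen4 x4 drive.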
@@JeffGeerling it would be cool to see something like an interposer board built around a cheap Arm SoC with native 1GbE and SATA (some have PCIe as well). One of those sub-$20 designs would be great. Perhaps there will be an affordable 2.5GbE SoC soon from one of the vendors.
@@xPakrikx would that be the 504-OSD Ceph cluster blog entry from Sage back in 2016? I did wonder what happened to the He8 Microserver stuff they were testing Ceph on. Would love to see it.
I’m going to watch this three times in a row, and see if I can see how this makes the “don’t trust anything” model of data management easier and more cost-effective to execute. I don’t fancy my chances! Thank you, Patrick, for continuing to bring us the crazy new things!
I'd really want one of these to play with at home and test out various crazy use-cases. Too bad they don't seem to be available to mere mortals. NVMe key-value mode could have some really interesting implications in setups like this.
@@ServeTheHomeVideo Would be nice if the NVMe-to-NVMe-oF adapter were available for testing. There is even a test version of the Seagate X18 out there. Would be really nice to have block storage available like that.
The type of tech that is showcased here isn't going to be available to us plebes for years, and then only second-hand. I really don't understand the name of the channel. I would understand it if he bought and showcased stuff that is finally hitting the used market, for, ya know, hobbyists at home to be looking for.
@@ServeTheHomeVideo "Part of the idea also for homelab folks is that the stuff used at work today trickles down in 3-5 years to homelabs as it is decommissioned." Yeah, so what, people are supposed to remember content from a video from 3-5 years ago? THAT's my whole point.
I was wondering the same thing. If the Pi had more than a single PCIe lane and a faster NIC, I'm sure you'd create a Pi version of this @Jeff ;) But yeah, nvme-cli and NVMe over TCP have been available in Linux for a while now. Seems like if someone made a board that attaches directly to an NVMe drive and translates to TCP, we could DIY something similar. I've searched a bit for an affordable SBC with PCIe lanes + 2.5GbE to create a poor man's version. Unfortunately they're all: unavailable, too expensive, slow NIC, PCIe 2, not enough lanes, etc. It looks like even an ITX/ATX board as the NVMe host would be expensive, because only latest-gen server CPUs or HEDT support a lot of PCIe lanes (with room for a fast NIC) or the denser PCIe 4/5.
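To illustrate the nvme-cli / NVMe-over-TCP side of this, here is roughly what mounting a remote namespace looks like on a stock Linux box. The target address, port, and NQN below are made-up placeholders; substitute whatever your own target exports.

```shell
# Load the NVMe/TCP initiator (mainline kernels have had this for a while)
modprobe nvme-tcp

# Ask a target what subsystems it exports (address/port are hypothetical)
nvme discover -t tcp -a 192.168.1.50 -s 4420

# Connect to one of the discovered subsystems by its NQN
nvme connect -t tcp -a 192.168.1.50 -s 4420 \
    -n nqn.2014-08.org.example:nvme:target0

# The remote namespace now appears as a local block device, e.g. /dev/nvme1n1
nvme list
```

The kernel's `nvmet` target side can export any local block device the same way, which is essentially the poor man's version of what these drives do in silicon.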
@@j0hn7r0n in the industrial computing world there are some backplanes for video applications which have PCIe switches managing up to 14 x16 PCIe slots, and those are more affordable than any server with Xeons for the PCIe lanes. I have installed some of those, and with a Core i5 or Pentium G we can connect many to the same system.
This is a pretty cool evolution of a storage device and goes along well with the evolution of nvmet in the kernel. It'll be interesting to see what processing you'll be able to do on the way to the SSD. If they're on the network, users are going to want trust/encryption, and then they'll want to do something else. Perhaps the next video on DPUs will cover that and more!
The big thing I noticed is that each unit has its own controller, thus moving the bottleneck to the switch, provided you can max out the throughput of the switch. Everything else seems straightforward. Though at first I thought I saw 2.5Gb and not 25/50Gb. Probably due to lots of talk about 2.5Gb for home stuff. :)
Very cool. I am picturing a lower cost adapter for SATA SSD or HDD and maybe a couple of 1 or 10 gig connectors being used as a disk shelf in a home lab. My size constraints led me to a case that I am not wild about and offloading the storage would be nice.
I'd just make an adapter for our current NVMe drives. We have super fast, super big, and super cheap NVMe drives already available to everyone - here they've just wrapped them in an industrial-grade carrier/package. Try buying an industrial RPi and it will blow your mind what they're asking for what is, at heart, a Raspberry Pi.
Was thinking of using AoE with a number of Raspberry Pi 4s connected to inexpensive SSDs, and using ZFS instead of software RAID. Unfortunately, getting RPis is not too easy now, but there are cheap (if bulky) SFF PCs. Something cheap to experiment with for network storage.
Yeah, I still use AOE myself at the data center because it can basically do what is discussed here - except you don't get as many ethernet ports lol. I wonder if they are using it in the firmware.
As much as we do not like this translation between server and storage disks, it also allows you to offload the management and disk IO to another box. It would be interesting to see how it impacts the performance of the servers now that they have to manage the disks as well.
Hi there! Kind of..... FC has a SCSI payload in the FC frame. NVMe over Fabrics wraps up an NVMe payload. And here that is NVMe essentially wrapped up in InfiniBand over Ethernet. RoCE v2 = RDMA (remote direct memory access, used for InfiniBand clustering by calling on memory addressing of resources across a network) wrapped up in UDP, IP, and Ethernet. This way you can make direct memory calls to the NVMe drives as if they were a local resource, but wrap it all up and send it out across the Ethernet network.
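To put a rough number on that layering, here is a sketch of the per-packet header cost of the RoCE v2 stack described above. Header sizes are the nominal ones; real frames may add VLAN tags, RDMA extended headers, etc.

```python
# Rough per-packet framing cost of RoCE v2 (Ethernet/IP/UDP/InfiniBand BTH).

headers = {
    "Ethernet": 14,    # dst/src MAC + EtherType
    "IPv4": 20,        # no options
    "UDP": 8,          # RoCE v2 uses UDP destination port 4791
    "IB BTH": 12,      # InfiniBand Base Transport Header (the RDMA part)
    "ICRC + FCS": 8,   # invariant CRC + Ethernet frame check sequence
}

payload = 4096  # a typical 4K block
overhead = sum(headers.values())
print(f"{overhead} framing bytes per packet, "
      f"{100 * overhead / (payload + overhead):.1f}% of a 4K transfer")
```

The point is that for 4K-block I/O the encapsulation tax is small, which is part of why direct memory calls over Ethernet are viable at all.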
A few years ago, around 2015, there was the Kinetic Open Storage project. It was launched with several vendors, but only Seagate ever made a drive, and only one model. The tech was not limited to HDD, 4TB, and 1GbE, but that was unfortunately the only product. It supported OpenStack Swift and Ceph OSD. Maybe a Linux server running on each drive was too resource-inefficient in 2015. Hope Kioxia makes the Ethernet-direct-to-drive approach work commercially this time. The entire solution with dual Ethernet switches in the same chassis is appealing.
The object storage/KV implementation in Kinetic was super useful for many workloads and gives you the ability to grow horizontally across many, many disks. Combined with how Ceph does the hash map and replication, you get some really cool capabilities.
So the bottom line is having a disk-as-a-node, with a DPU and an interface. You make a configuration and add new chassis-nodes as needed. The only difference I see is the client machine asking the drive/array directly for the data. You still need some compute somewhere to handle that (the DPUs). Therefore, it's still a computer with a bunch of drives. Still, for me, it's like the size of the universe: it's hard to conceive. Maybe it will make more sense with the second video.
I'm a little late to this party, but from a purely tech-based standpoint, this is amazing! With the continuous advancements in speed, multitasking, and the increase in lanes that modern CPUs have, removing any and all pitstops, layovers, and roadblocks to their ability to compute will drastically improve efficiency and workflow, as well as reduce the amount of hardware needed to maintain that "hyperspace" workflow. Reducing power usage, etc. - all great news, right? But after my nerdgasm passed, I realized one thing. We've been giving IP addresses to resources forever, but now we're going to be doing it on steroids. We are dangerously low on properly trained security individuals to monitor and maintain our current networks. What happens when we expand our current network topologies from a couple dozen or a few hundred nodes to thousands or hundreds of thousands by giving even our HDs, GPUs, etc. IPs? What happens when someone hacks into your 22 drives simultaneously and takes the whole dataset offline or holds it for ransom? I'm all for advancement and I would never want to stop progression out of fear. But can someone please start the conversation about how we safeguard ourselves in this "hyperspace" work environment we will be advancing into?
The modularity of the technology is fascinating, but I'm trying to understand the implications for latency. Say you have an array spanning multiple racks with drives acting as network endpoints; I suspect the access speed of the array will only be determined by the furthest endpoint. It's a toss-up between the network layer vs the CPU/silicon layer of traditional systems. How this compares to traditional PCIe-attached storage in a SAN will be interesting.
the speed of any data array appears to you as whatever the latency is of the furthest piece of data you need, regardless of how it's structured or configured. if your data is already where you need it, you can even break the network temporarily without interrupting the work being done. remember when we thought the best computer was the fastest one?
lol, now do ZFS on a DPU. On a more real note, I think this is cool for things that need access to a physical drive but may not have the chassis for it, but I don't think it's going to replace a SAN for things like a clustered filesystem (VMFS/vVols).
This looks like a new variant of AoE (ATA over Ethernet), which was a pretty cool way of inexpensively attaching network-based disks directly to a host.
Thanks Patrick, truly amazing. I was waiting for this review since you showed it last year. It is a great concept and I think it'll have a huge impact on the storage industry. I only wish you could do some benchmarks, because latency is the main factor, and I believe it'll be minimal here since, as you mentioned, the drive is directly attached to the network, hence less translation. Great job, and thanks to the entire STH team!
We were not running stacks on the drives themselves other than NVMeoF, but you get the idea of where this is going. Next up, we will have a DPU version of this.
@@novafire99 Ceph actually did have an experimental drive in partnership with WD that ran the OSDs on the drives themselves and used 2.5G Ethernet (1/10th what these NVMe drives do, but it's HDD so... shrug). It was a really cool idea, but managing OSDs and Ceph on the individual drives is definitely a bit clunky, because now you need extra RAM and compute on each drive controller board. You're not eliminating those extra layers as far as the inefficiencies are concerned so much as just moving the microservers running your OSDs onto the same board. With NVMe-oF direct to the drives, instead of adding a heavyweight Linux daemon to each drive, you're only adding a network stack and an embedded Linux host mostly as a control plane, and your data plane can still offload the bulk of the work. Aside from the reduction in processing, your interface is now much simpler. It's NVMe-oF; that's a much smaller jump to bridge from NVMe to NVMe-oF than it is to bridge from SATA to Ceph OSD. Yes, it's still having to deal with authentication, encryption, session management, etc., and you can expect needing more FW updates, but nothing like having to manage a bunch of Linux servers in your Ceph cluster. Having that logical separation with a clean, stable API avoids the added complexity of combining higher-level storage cluster stuff in Ceph with lower-level drives.
I just started learning about 10G/100G Ethernet and SFP/QSFP and I was like: why don't they make a 100Gb QSFP M.2 slot? Why is networking such an overengineered, underperforming, stagnant mess!
This particular one we did not do parity since it was RAID 0. However, how we had it setup it would be in the server/workstation. In the next video in this series, we will show it being done on a DPU for full offload.
Specifically, this was regular Linux software RAID (mdadm). You could also run ZFS over the block devices after connecting the workstation to them, or perhaps even a shared-disk file system.
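For anyone who wants to picture that step, here is roughly what the mdadm RAID 0 over already-connected NVMe-oF namespaces looks like. The device names and mount point are hypothetical; check `nvme list` for yours.

```shell
# Stripe four NVMe-oF namespaces (visible as local block devices) into RAID 0
mdadm --create /dev/md0 --level=0 --raid-devices=4 \
    /dev/nvme1n1 /dev/nvme2n1 /dev/nvme3n1 /dev/nvme4n1

# Put a filesystem on the array and mount it as usual
mkfs.xfs /dev/md0
mount /dev/md0 /mnt/fabric-raid
```

From mdadm's point of view the fabric namespaces are just block devices, which is why ZFS or a shared-disk filesystem would layer on the same way.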
Like, that's iSCSI, a thing I learned about this week after having forgotten about it for 20 years. The reason I found it was I need to build a 10-node cluster out of computers from the recycling. Honestly, I'm shocked we haven't had IP-capable NVMe drives from the start. This is obviously because SAN sellers have been blocking it from existing. My whole week has been: how do I duct-tape a $40 100G PCIe card to a $200 2TB NVMe drive, preferably without using any fans.
Please let us run a Ceph OSD daemon (with Clevis/Tang for encryption) directly on this board. That would be so cool. I've dreamed of this for years, but finally we are closer. An octa-core A75 with 8GB of RAM and 4GB of NAND for the OS would be absolutely all I need. Accessing 25Gbps drives from one (or a few) servers using NVMe directly will not scale (how much can you put in one server, 800Gbps maybe?). With Ceph, each client connects independently and you can easily saturate a few Tbps with enough clients.
Are they also looking at adding spinning-rust drive switches as well, for the mass storage needs, as a lower tier of storage with hot data on the NVMe switches? Coming from the age of being some of the first users of NAS and SAN products back in the 90s, and then spending the next 20 years in enterprise storage and enterprise architecture, this new topology is sexy as hell.
@@ServeTheHomeVideo Looking forward to that, and everything else you bring out. STH the site has been a constant joy and so have your videos over the years! Thanks for these!
why do I see U as ten, and we're standing in the middle of a sand lot, talking about new stuff!🤣.....this is very helpful when gathering ideas to assemble the latest N greatest systems!! super-duper! great ideas! thanks Patrick! accessing data right from the storage will really cut down on overhead latency!!! nice! I can't believe they didn't make plastic skid covers though! it's so easy to lose ec's over usage time! good luck!
How does security / access control work in this model? As a traditional server physically has exclusive access to its directly connected drives, it's a natural place to put a security boundary. With every drive as its own node, I guess you would use network security techniques like VLANs. Sort of relatedly, it seems like the traditional model's intermediating layer of server nodes tends to compartmentalize damage from (accidental) misconfiguration. If you're reconfiguring one node's disks you might lose the data on that node, but other nodes will be unaffected. If you're configuring the network between nodes, you might lose access temporarily, but your data is still there on each node and you can recover by reconnecting them to the network. In contrast, if all your disks are in one big pool and all the configuration is in software, what stops a misconfiguration from hosing all your disks at once? In particular, there's a general system assumption that software can assume exclusive access to direct-attached disks (excluding exotic shared-disk filesystems), contrasting with the presumption that network nodes are able to simultaneously serve multiple clients that may connect to them. If you put the disks directly on the network, you would have to be very careful that your network config guarantees mutual exclusion of access to any one disk, or else a subsequent misconfiguration would cause multiple nodes to clobber each other's data on a single disk.
So that little PCB acts as a micro-Linux box and presents the NVMe to the network, right? While this is great in terms of overall management and scalability, doesn't it increase network traffic and address usage by a huge margin? But I guess that's not a problem in the datacenters; they can up the equipment like it's nothing. Something similar that I remember was the very old AoE, but that was probably more similar to FC.
Okay...that's pretty slick. I'm very keen to hear more about how one would actually use these SSDs, DPUs, or NVMEoF in production. I've always been wondering how you'd implement SSD redundancy with those. I'm assuming there isn't much compute power on each SSD so anything must be implemented on the storage consumer system. It does kind of scare me that you have to trust the storage consumer system to not accidentally screw up the disks it can access on the network. Oops...one server was acting up and so it wiped all the other namespaces on the 8 drives it was sharing...
So what I'm hearing, is that each NVMe drive is a near-zero-overhead ethernet-native NAS block device (as opposed to a limited component requiring a direct CPU attachment), and then you'd run the filesystem either on a dedicated, more traditional "heavy" NAS, or on the client itself. I wonder what kind of innovations this will enable elsewhere in the system. Maybe it'll be easier to design purpose-specific RAID accelerators (ideally compatible with a standard format like ZFS), once the drives aren't on the same PCIe bus. Maybe the individual parts of a modern filesystem (drive switching, parity calcs, caching, and file->block associations) could be separated into purpose-specific hardware modules. Or, maybe consumer operating systems start supporting NVMe-over-WiFi/Ethernet. Maybe NVMe gains network discovery features. Maybe drives with gigabit ports start showing up marketed to consumers, cheaper to use as a NAS directly than building an entire server around an NVMe-over-PCIe drive. (A single controller chip will probably under-cut even a Raspberry Pi, once optimized for cost.)
@octet33
> Or, maybe consumer operating systems start supporting NVMe-over-WiFi/Ethernet.
Already possible.
> Maybe NVMe gains network discovery features.
Already possible.
> Maybe drives with gigabit ports start showing up marketed to consumers,
1Gbps is way too slow. You are putting an expensive SSD that can do 500-4000MB/s (even cheapo SSDs can do it) behind a link limited to ~120MB/s, with worse latencies too. 10Gbps minimum for it to be useful.
Wendell from Level1Techs just did a video about how most RAID systems these days don't do error checking at the RAID-controller level; most just wait for the drive to report a data error. This must be true for this as well, correct?
I think Wendell was doing hardware RAID. These are more for software-defined storage solutions. Also, traditional "RAID" has really been used a lot less as folks move to scale-out, since you have other forms of redundancy from additional copies, erasure coding, etc. The next video in this series will be using a controller, but also software, so a bit different from what he was doing.
How is this different in concept from a bunch of ODROID HC1s? Each drive gets its own tiny Debian server and feeds a network port; sounds pretty much like this thing, only this thing is more modern and faster. So rather than managing 4 servers with 24 drives each, you now manage 96 servers with 1 drive each. What am I missing?
@@ServeTheHomeVideo All I'm saying is that sticking a tiny server on each drive has been done before, so the main new thing here is sliding 24 of those single-drive servers into a 2U enclosure with a fancy switch interface, rather than powering them and network-connecting them individually. The really exciting bit is how to manage those 24 servers so you don't need to micromanage each one individually.
Some custom ASIC in the SONiC switches could let the switches do RAID on their own - a RAID controller on steroids, LOL. Or maybe just put a Ryzen APU in there as the x86 control plane :) The integrated Vega should be able to do DPU work, I'd think.
Considering the redundancy is external to the system, this sounds like it's very easy to accidentally remove the wrong drive during a hot-swap replacement and incur very real data loss. :/ I thought this primarily when Patrick started talking about namespace slicing. I don't yet see what the advantage of this topology is.
I'm curious how this scales based on price. Presumably the dual 25GbE controllers "in" each drive are pretty expensive and drive up the cost. Is that cheaper than relying on 6x dual-200GbE DPUs to expose those PCIe lanes to the fabric?
@@ServeTheHomeVideo does it have an effect on power consumption though? Ethernet is meant for longer distances than PCIe, thus requiring more power... How does a rack of these fare power-wise compared to more "conventional" alternatives?
0:16 - Those storage bays have also had SCSI and U320 SCSI and more, I'm sure - unless my history is letting me down and those connection types only shipped with 3.5" form-factor HDDs?
I do wonder, however, if 25Gbps is fast enough, given that PCIe Gen 5 can do that on a single lane. But the configurability does seem worthwhile, and given that PCIe uses the same SerDes tech as electrical high-speed Ethernet, it should be possible to have one controller that can do both 100G Ethernet and PCIe in the same controller silicon.
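The per-lane comparison checks out on nominal numbers. A quick sketch (PCIe Gen4/Gen5 signal at 16/32 GT/s per lane with 128b/130b encoding):

```python
# Usable per-lane PCIe throughput vs. a 25GbE port, nominal figures only.

def pcie_lane_gbps(gt_per_s: float) -> float:
    """Usable Gbit/s per lane after 128b/130b encoding overhead."""
    return gt_per_s * 128 / 130

gen4 = pcie_lane_gbps(16)  # ~15.8 Gbps per lane
gen5 = pcie_lane_gbps(32)  # ~31.5 Gbps per lane
print(f"Gen4 lane: {gen4:.1f} Gbps, Gen5 lane: {gen5:.1f} Gbps, vs 25GbE: 25 Gbps")
```

So a single Gen5 lane does indeed carry more than a 25GbE port, while a Gen4 lane falls a bit short.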
Soooo, instead of having one big server for a bunch of disks, now we have a bunch of disks with an integrated server each; the main benefit which I see is more reliability. Hmmm, can a single drive or a pair be mounted as storage in Windows? :D
The lab was SUPER dark. Then add to the fact that there is a metal box (rack) around these systems. That is a Canon R5 (R5 C was not out yet) at ISO 12800 just to get something somewhat viewable.
@@ServeTheHomeVideo fair enough! Data centers aren’t really known for perfect lighting. Honestly cool to see that the R5 has such usable photos at iso 12800
Ah, so it's not running on x86, it's running on x64.... but there is still a lot of translation going on; granted, where the translation is happening has moved, but it is still happening. The ONLY real difference is the communication between the drive and the controller: rather than being serial, it is instead using Ethernet. NVMe over Fabrics SSD, so it is still a serial device (that would be the NVMe part), then there is a controller sat out in front doing the conversion.... this looks like some sort of iSCSI connection, granted they changed the command set to NVMe. The network diagram @12:15 is also not correct, or is there a large part of the network missing? Or, rather than the big box "switch", have these been connected individually to the network with a device we haven't seen?
Does this solve the problems with concurrency in protocols like iSCSI? Nevermind, I think I answered my own question. It looks like you'd need to use a filesystem or DPU that supports mounting across multiple machines
It's interesting however, I wonder how it would really be in the data center world. We do a lot of high storage types of systems linked up with 40/100g ports. It is fun to dream of the ideas and usage for it.
Don't get me wrong, I love new tech and this has applications, I'm sure, but the applications where this would excel are fairly limited to things like putting lots of these talking directly to a server. Lots of small to mid-range data centers would end up front-ending this with a normal x86 server to even use it and handle things like shares, permissions, etc. In those cases, you'd actually be adding translation steps. I'd see this just being a competitor for FC/FCoE/iSCSI-attached storage.
Is there some way to manage array ownership in a distributed manner? In other words, is it possible to create an array on one machine such that all other storage clients on the network know that disks 1...n belong to array A?
I should clarify: know that disks 1...n belong to array A even if the array is not mounted, so that reallocation to another array would be prevented. Cool vid and tech btw, thanks!
The underlying difference isn't that it uses IP, it's that NVMe is not exposed outside of the disk. It's more like: NVMe over Ethernet, therefore a MAC address is the lowest OSI layer for a directly connected device (ie. the switch chip) to access it. As opposed to: NVMe over PCI, therefore a PCI address is the lowest OSI layer for a directly connected device (ie. the CPU) to access it. No need to jump to OSI layer 3 with IP addresses. TLDR: Normal NVMe devices are "over PCI", which is OSI layer 1. With NVMe over Ethernet, the lowest accessible OSI layer is now layer 2 (Ethernet/MAC). FINAL EDIT (lol): I'm referring to all layer 2 protocols as "Ethernet" for simplicity. "Over Fabric" encompasses all layer 2 connections, such as Infiniband.
Hi Preston! Thank you for joining! I think for the general market, it is going to take some time. A lot of that is just based on how these drives are being marketed. If we saw an industry-wide push, it would be much faster. I also think that as DPUs become more common, something like the EM6 starts to make a lot more sense, since the infrastructure provider can then just pull storage targets over the network, provision/do erasure coding directly on the DPU, and present it to client VMs or even the bare-metal server.
iSCSI: TNG. 20 years later, a man talks excitedly about the next greatest thing from Microsoft while showing a Linux prompt. I like the idea and the hardware, but a small part of me feels like I am experiencing some sort of waking post-truth nightmare.
Sadly this still uses the 'normal' L2, L3, and L4 protocols with their minimum packet sizes, and therefore will not work as wonderfully for low-latency use cases as they make it sound.
This isn't what Linus used at all. Their project is standard NVMe drives directly connected to your everyday x86 server. This is way beyond that type of setup...