Do you need to understand more about storage performance? IOPS, latency and throughput? Check out this video: ru-vid.com/video/%D0%B2%D0%B8%D0%B4%D0%B5%D0%BE-JUrU_DOq6YM.htmlsi=8UQCuSlG9OL5EJaj
I beg to differ. I like the music; it helps me relax and lets me grasp the concepts more easily. There should be an option on YouTube to disable background music without muting the voice.
Very informative, unfortunately it became unbearable to focus on what you were saying with your music in the background. I see now many people were having the same problem. Thank you however for your explanation.
Excellent video! I learned a lot. BTRFS was designed for massive enterprise installations. It is slow on home desktops. DJ Ware did an excellent comparison of file systems. F2FS shines on home desktops with SSDs and is what I use on all my systems. Snapshots can lull people into thinking of them as a real full backup. It is much better to create a full system image or functional clone, plus extra backups of irreplaceable files in Home, configs in etc, and the like.
This is good. I regularly have similar conversations about the difference between snapshots, replication and backup. The only point I disagree with, which he only mentions briefly: backups are NOT archives. In one job, I had to restore 12 or so users' Notes email boxes from about 5 years prior - several years' worth of monthly full backups. If I recall correctly, they'd all expired from the NetBackup catalog, and they'd changed from DLT to LTO tape in that time, so we didn't have the catalog information for where the data was, but they still kept the tapes. So I basically spent 6-9 months running phase-1 then phase-2 imports on hundreds of DLT tapes on the 2 rusty old DLT drives they'd kept, then restoring the data to a dedicated PC and burning it to DVD to give to the lawyers (2 copies - defence and prosecution). About 20% of them failed, either at the import or the restore point. But the case went through. The prosecutors won. I have no idea if they used any of the evidence I'd restored via my very long restore process, which ended up with about 100 DVDs being created. Oh well, it kept me off the street for 9 months, even if it was boring as hell. But things have changed a lot in that time. I'm talking about 15 years ago, restoring data that was already about 5 years old.
Ahhh.... archives.... the thing that nobody cares about, talks about or wants to pay for. Until you need it and don't have it. 😁 Fortunately there are more reliable ways today of restoring super long-term archives, be it on premises, in the cloud or even through a hosted provider like Kroll. The issue remains that business stakeholders often don't see the value of investing in tech for archives.
Isn't synchronous replication better than backup, since it replicates the data in real time, whereas a backup happens at a particular time of day? Please clarify. Thanks
They are for different use cases. Say you are hit with ransomware and all your files are encrypted: you have just synchronously replicated all your encrypted files. That's not much good for recovering your data.
Nice video, and I agree with the others regarding the music. However, I'd say there is way more to it. Replication of data to a different DC is mostly done for DR, and way more is involved than just having a copy of the data sitting around somewhere. As for snapshots versus backups, for many business-critical systems going back in time more than a few hours is simply not an option, so yeah, sure, for the long term a backup is easier to keep and more useful, as it is a separate set with no relation to any other data. The first question, imo, when planning a fallback in case something happens to your data is to ask yourself which scenarios apply and what is still acceptable, in terms of RPO, if you need to fall back on a copy of your live data. Basically RPO and RTO determine 90% of your solution. Too easily it's decided to make daily backups and keep them for x amount of time. One more addition: a common backup strategy is to make a full once a week and incrementals on the rest of the days. This creates a dependency similar to the one you described with snapshots.
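The weekly-full-plus-incrementals dependency mentioned in the comment above can be sketched roughly as follows. This is only an illustration, assuming each incremental holds the changes since the previous backup of any kind; the `catalog` layout and the `restore_chain` helper are made up for the example and are not taken from any backup product.

```python
from typing import List, Tuple

# Hypothetical catalog entry: (day_number, kind), kind is "full" or "incr".
Backup = Tuple[int, str]

def restore_chain(catalog: List[Backup], target_day: int) -> List[Backup]:
    """Return the backups needed, in order, to restore the state as of target_day."""
    usable = sorted(b for b in catalog if b[0] <= target_day)
    # Find the most recent full backup at or before the target day.
    last_full = None
    for i, (_day, kind) in enumerate(usable):
        if kind == "full":
            last_full = i
    if last_full is None:
        raise ValueError("no full backup at or before the target day")
    # The full plus every incremental after it, up to the target day.
    return usable[last_full:]

# One weekly full on day 0, incrementals on days 1 to 6.
catalog = [(0, "full")] + [(d, "incr") for d in range(1, 7)]
chain = restore_chain(catalog, 4)
print(chain)
```

Under that assumption, restoring day 4 needs the full from day 0 and all four incrementals after it, so one unreadable incremental blocks every later restore point until the next full.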
Hello Charles! Thank you for the video, it is so informative! I have a question. Could you remove an intermediate snapshot and keep one from long ago? The objective would be to have some kind of "backup" once a month (with snapshots), allowing time travel to check how the data was 1 year ago and also, for example, over the last 2 days. If this is possible, you could delete a lot of data that you don't need. But I'm not sure that if I "break the chain" of snapshots I will still be able to access data from a long time ago.
Hello everyone. I'm trying to understand how much space I need to allocate to my VSS volume. The operating system is stored on a 1 TB volume and takes up around 200 GB. The data is stored on another 4 TB volume in the same RAID 5 array and takes up around 2 TB. I appreciate the help.
Thanks. It seems like snapshots are incremental file/folder backups stored within the computer's file system. Note that if the file system or computer is corrupted/hijacked/stolen/destroyed/etc., then all the snapshot data is gone. That's why it's wisest for backups to exist off the computer and at another location, I guess.
backup = copy of the file system, stored in another location; snapshot = copies of blocks of files, stored in the same location. I don't understand what the difference is between a backup of the file system (backup) and 'blocks of files' (snapshot). Is there a difference? For example, say I do chown root:root on my /home directory. Yes, this will destroy the server. Can I restore it with a snapshot? Or only with a backup?
This is confusing if you're on AWS. AWS has automated backups, which are the snapshots described in this video. AWS also has snapshots (essentially EBS snapshots), which are the full backups this video is talking about. I got so confused while watching this video, but if you're on AWS, don't get confused!!!! Btw, the background music wasn't a big deal for me... Folks here make it sound like it's the end of the world.
EBS snaps are essentially similar to traditional storage snapshots, enhanced by the ability to stage snaps to an S3 target. So in some ways they merge replication and snapshots into a best-of-breed offering. Depending on the RTO and RPO objectives, that would largely suffice for most organisations. Having said that, EBS snaps, like traditional storage snaps, are crash consistent and not application consistent, so you just have to be mindful of that. On top of that, backups are kind of a different play altogether. If granular recovery, indexing, etc. are a requirement, EBS snaps would need to be integrated with backup software to get there. DR solutions are complex and often a combination of various tools to meet an objective. No silver bullet.
Consider snapshots like a rewind button, or Edit > Undo when creating a document in Word; the backup is when you copy your file to another hard drive. I don't really understand why AWS would make things confusing. They might just use different names, I guess.
There is a lot of confusing and misleading information in this. If you don't already know what you are doing, it will be hard to distinguish between the helpful and the not-so-helpful bits of information ...
your content is incredibly useful and you present it very well, but please, PLEASE kill the background music, you are much more interesting than that audio assault
Damn, that background music! Turn it down or off. It makes the vid headache-inducing, like listening to someone talking to you in a crowd. Too hard to follow.
Personally, I've always considered snapshots to be a "System Restore" solution. Great for recovery in case of issues with updates (highly recommended when using rolling-release distributions), but not a backup solution. This is why my snapshots are configured to only create a new snapshot when I install, remove or upgrade software. The setup also makes sure there are no more than 10 snapshots on my system, by removing the old ones daily at 7 P.M.
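The "no more than 10 snapshots" rule from the comment above could look something like this. It is a minimal sketch, not how any particular snapshot tool does it; the snapshot names, the `prune` helper and the dates are invented for illustration.

```python
from datetime import datetime
from typing import List, Tuple

# Hypothetical snapshot record: (name, creation time).
Snapshot = Tuple[str, datetime]

def prune(snapshots: List[Snapshot], keep: int = 10) -> Tuple[List[Snapshot], List[Snapshot]]:
    """Split snapshots into (kept, deleted), keeping only the `keep` newest."""
    newest_first = sorted(snapshots, key=lambda s: s[1], reverse=True)
    return newest_first[:keep], newest_first[keep:]

# Thirteen snapshots, one per day, taken before software changes.
snaps = [(f"pre-upgrade-{i:02d}", datetime(2024, 1, i)) for i in range(1, 14)]
kept, deleted = prune(snaps)
print([name for name, _ in deleted])
```

A scheduled job (the 7 P.M. cleanup in the comment) would run something like `prune` and then delete whatever lands in the second list.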
subset = snapshot incremental = between snapshots roll-forwards roll-backwards concurrency read is not write, cell resolution use for replicated-read(mirror is hub) database strategies used for file-system and used for storage and used for compression database have full-set of tools and benchmarks and transactions per second and ... states(similar to network) baseline = transactional and relational and object - databases memory of Linux and segmented and disciplined. those who reinvent the wheel, reinvented old issues, and feel smart about distinctions in forgotten textbook basics of BSc but brag about MSc and PhD... might as well forget about humanities and accounting of first year undergraduate. we have in 2024 looking backwards to 2020 video and thinking about what happened in 2012 facebook and recalling pentium-3 FPGA era... millennials are their BS of BSc
So in short: snapshots give a low RPO and protect against corruption; replication (business continuity or site-level protection in case of DR) has RPO > snapshot; and backup can have RPO >> replication.
I've been thinking it's quite like this indeed. I've been using snapshots before bigger system changes that could have undesired consequences. But I still try to do weekly backups.
How do I set up a NAS recycle bin or some short-term data protection, for under 1 hour? Say I copy a file and 20 minutes later it gets deleted: how do I recover or protect it, just like with a recycle bin?
Fantastic video! I learned a lot. Thank you. I always feel like the abstraction of data storage and the accessibility/interface options have just become overwhelming. Maybe I'm dealing with FOMO haha. After a deep breath, I'm just glad to see that rsync is still being used in industry and home labs! Finally a tool that hasn't been supplanted.
It really depends on how you look at it. If you are able to recover to any of the snapshots within your required SLA, then it's good enough for you. I'm not sure if your definition of mirroring here corresponds to replication, but just be mindful to check whether the replication also includes all the snapshots. As far as I know, many don't, and if it doesn't, it simply means there is no way to recover back to a particular point in time.
@Charles Chow, one part of your video was so good that I actually copied it from the transcript and improved the punctuation just slightly: "Now let's look at replication; as the name suggests, replication simply means copying or replicating data to another storage. It can be on another system in the same data center but often it is remote to protect against DC failures as well. There are generally two types of replication: asynchronous and synchronous. Let's start with async. Async replication often means that data is replicated at a given interval, perhaps every five minutes; changes are then replicated to the remote site, so in the event of disaster the worst that can happen is that you will lose up to five minutes of data, and it's often articulated as what we call recovery point objective or RPO equals five minutes. Sync replication on the other hand replicates all IO as it is written to the storage system. It will commit both local and remote writes before acknowledging to the host that the write is good. In many cases mission-critical apps that cannot tolerate any loss of data would often opt for sync replication. Similarly, in terms of RPO, sync replication is what we term as RPO equals zero; that simply means no data loss. So why would anybody pick async replication then? As you can tell, sync replication's demand on bandwidth will be extremely high and it is latency-sensitive; comparatively, async oftentimes has generous allowances on bandwidth and latency, making it significantly cheaper in terms of processing. The advantage of replication is in its ability to recover very quickly with minimal data loss in the event of a complete data center failure or if a primary storage is completely lost. You oftentimes have a ready copy of the data and you're ready to resume business.
Having said that it is not without its caveats, because every data block written is replicated; that simply means if you have a corrupted block that is written or somebody accidentally or maliciously deleted a whole bunch of data, all this will also be replicated. Like the saying goes, "Dirty block in, Dirty block out"! So it's great for business continuity protections and insulation against primary storage failures but surely not so great if you want the ability to roll back to some specific point in time." -- if I could use that at work I'd greatly appreciate it. Let me know thanks!
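The RPO arithmetic and the "dirty block in, dirty block out" caveat in the quoted passage can be sketched like this. It is a toy model under loud assumptions: the replica is fully caught up at every multiple of the interval, each replication cycle completes instantly, and the function names and the dictionary "storage" are invented for illustration.

```python
def async_data_loss(failure_time_s: int, interval_s: int) -> int:
    """Seconds of writes lost if the primary fails at failure_time_s.

    Assumes the replica was last brought up to date at the most recent
    multiple of interval_s (a simplification of real async replication).
    """
    last_sync = (failure_time_s // interval_s) * interval_s
    return failure_time_s - last_sync

def sync_data_loss(failure_time_s: int) -> int:
    """Sync replication acknowledges a write only after both copies commit."""
    return 0

interval = 300  # five minutes, as in the quoted example
print(async_data_loss(299, interval))  # failure just before the first cycle
print(async_data_loss(301, interval))  # failure just after a cycle

# "Dirty block in, dirty block out": a replicated write carries bad data too.
primary = {"block7": "good"}
replica = dict(primary)

def replicated_write(key: str, value: str) -> None:
    """Apply the same write to both copies, as replication would."""
    primary[key] = value
    replica[key] = value

replicated_write("block7", "corrupted")
print(replica["block7"])
```

In this model the async loss is always below the interval, which is the "RPO equals five minutes" bound, while the replica offers no earlier version of `block7` to roll back to.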