In our latest episode of the Backup to Basics series, we talk about what I think is the most important invention in my career: deduplication. Without dedupe, much of what we do in backup and recovery, and disaster recovery, would simply not be possible. Without dedupe there really is no disk backup market; there is no cloud backup market. I'd be out of a job! What is dedupe, anyway, and how does it work? What are the different kinds of dedupe and does that matter? You should learn a lot about this important topic.
On this episode of Restore it All, we talk about what I think is the
Speaker:biggest advancement in backup and recovery technology during my career.
Speaker:And that's deduplication.
Speaker:I hope you enjoy the episode.
W. Curtis Preston:Hi, and welcome to Backup Central's Restore it All podcast.
W. Curtis Preston:I'm your host, W. Curtis Preston, aka Mr.
W. Curtis Preston:Backup.
W. Curtis Preston:And I have with me my network re-architect engineer.
Prasanna Malaiyandi:Hey, Curtis, whatever I could do to keep you safe, you know?
W. Curtis Preston:You know what's really funny is like I, I consider myself a
W. Curtis Preston:pretty tech savvy guy, and when we were talking today about what I'm, you know,
W. Curtis Preston:how I've, I've replaced a bunch of gear and I'm swapping out some stuff and
W. Curtis Preston:moving some cables around, and then you were like, you were yelling at me.
W. Curtis Preston:You were like, you can't do that.
W. Curtis Preston:You can't put the switch on the thing.
W. Curtis Preston:And I was like, yeah, I can, like, what are you talking about?
W. Curtis Preston:And it, and it took me like a couple of seconds and I was like, oh, wait.
W. Curtis Preston:You're right.
W. Curtis Preston:I can't, that's not, I can't do that.
W. Curtis Preston:I can't put.
W. Curtis Preston:The switch.
W. Curtis Preston:I can't put the router.
W. Curtis Preston:That's gonna be my firewall on the same switch
W. Curtis Preston:As my home LAN
Prasanna Malaiyandi:Yeah,
W. Curtis Preston:I dunno what I was thinking.
W. Curtis Preston:Yeah.
Prasanna Malaiyandi:Just another topic that I know just a little bit about.
W. Curtis Preston:I'm a little, I feel a little ashamed that that was.
W. Curtis Preston:But I'm glad I talked to you about my, you know, as, as is
W. Curtis Preston:the case with many subjects.
W. Curtis Preston:I'm glad I talked to you about, you know, what I'm up to.
W. Curtis Preston:Um,
Prasanna Malaiyandi:Glad I could help.
W. Curtis Preston:I have successfully purchased and configured for the
W. Curtis Preston:video for the video watchers.
W. Curtis Preston:Let's see if it makes it into the camera before the cable runs.
W. Curtis Preston:There it is, the ASUS AX6600, which is a mesh router.
W. Curtis Preston:And I gotta say it's much more better than what I had before,
W. Curtis Preston:and it's able, I've got two.
W. Curtis Preston:It's supposed to provide 5,500 square feet, but of course that's, that doesn't
W. Curtis Preston:include drywall and two by fours, right?
Prasanna Malaiyandi:it's crazy how much signal degrades going through drywall.
Prasanna Malaiyandi:And the other thing people don't realize is five gigahertz,
Prasanna Malaiyandi:like degrades like no tomorrow
W. Curtis Preston:Right.
W. Curtis Preston:Remind me, remind me why five gigahertz is better again.
Prasanna Malaiyandi:It's faster because it can handle more bandwidth, and also
Prasanna Malaiyandi:the channel is wider, so you can have more things talking at the same time.
Prasanna Malaiyandi:It's just as your frequency goes up, the distance goes
Prasanna Malaiyandi:down for the same power levels,
W. Curtis Preston:So is this like DC versus AC?
Prasanna Malaiyandi:Not quite DC versus AC.
Prasanna Malaiyandi:It's more about.
Prasanna Malaiyandi:You need to pump as many things as possible into, because high frequency,
Prasanna Malaiyandi:right, it's more per cycle, right, than 2.4, which is less airtime, if you will.
W. Curtis Preston:Right.
Prasanna Malaiyandi:And so every sort of peak, you can send
Prasanna Malaiyandi:more out with the five gigahertz because you're doing it more often.
W. Curtis Preston:right.
Prasanna Malaiyandi:And so it works a lot better.
Prasanna Malaiyandi:It's just the distance isn't as great.
Prasanna Malaiyandi:Now, I will tell people, so this is one of my, I'm gonna
Prasanna Malaiyandi:get up on my soapbox now, right?
Prasanna Malaiyandi:One of my rare soapbox events and tell people, a lot of times people
Prasanna Malaiyandi:think they need more wifi access points in their house to get coverage.
Prasanna Malaiyandi:And to those people, I will say, plan out your network carefully.
Prasanna Malaiyandi:Put your devices where they matter.
Prasanna Malaiyandi:And also don't put too many devices and don't crank up the power all the way
Prasanna Malaiyandi:to high, because I know Curtis, you and I were talking about this when you're
Prasanna Malaiyandi:looking at mesh, and it was like, imagine that your router can overpower your
Prasanna Malaiyandi:phone, your laptop, your iPad, so it's screaming at the top of its lungs and your
Prasanna Malaiyandi:phone can barely even scream back at it.
Prasanna Malaiyandi:And so that's actually worse for your network and for airtime than
Prasanna Malaiyandi:actually sort of balancing out power.
W. Curtis Preston:I just don't know if, like, the stuff
W. Curtis Preston:you're talking about, like is.
W. Curtis Preston:is that even, is that configuration option even on consumer class routers?
Prasanna Malaiyandi:you'll have sort of the low, medium, high power
Prasanna Malaiyandi:levels, uh, but it takes time to fine tune and tweak these, right?
Prasanna Malaiyandi:You have to walk around with a wifi analyzer on your phone, right?
Prasanna Malaiyandi:So Apple with their, uh, iPhones, right?
Prasanna Malaiyandi:They ship, what is it?
Prasanna Malaiyandi:Airport utility, which has a wifi scan.
Prasanna Malaiyandi:Option, which will show you all the wifi networks and sort of the signal
Prasanna Malaiyandi:strength, and you basically have to walk around your house with that and
Prasanna Malaiyandi:be like, okay, where is it strong?
Prasanna Malaiyandi:Where is it weak?
Prasanna Malaiyandi:Right, to figure out the placement.
Prasanna Malaiyandi:That's the ideal way, because what you want is you want coverage in the right
Prasanna Malaiyandi:places, because what you see is in a lot of high density housing areas, or
Prasanna Malaiyandi:even homes next to each other is most people end up with crummy wifi because
Prasanna Malaiyandi:their power is turned up so high, it bleeds into everyone else's area
Prasanna Malaiyandi:such that everyone has a crappy time.
Prasanna Malaiyandi:because then you get interference and then everyone sort of slows down and then it
W. Curtis Preston:Right.
W. Curtis Preston:Yeah.
W. Curtis Preston:I got a lot of wifi.
W. Curtis Preston:I got a lot of networks.
W. Curtis Preston:Um, you know, um, yeah,
Prasanna Malaiyandi:And for, and for the last bit, last bit of my
Prasanna Malaiyandi:soapbox is please, please, please do not use 40 megahertz channel widths
Prasanna Malaiyandi:on your 2.4 gigahertz channels.
Prasanna Malaiyandi:You do not need to use 40 megahertz and ruin everyone else's connectivity.
Prasanna Malaiyandi:Please only use 20 megahertz bands for 2.4 gigahertz.
W. Curtis Preston:Uh, I'll see what I can do but I, but I have this
W. Curtis Preston:new, you know, and again, I am not a wireless, I feel like a wireless newbie,
W. Curtis Preston:but I have this new fancy router where it automatically selects the right.
W. Curtis Preston:Um, that's
W. Curtis Preston:pretty cool.
Prasanna Malaiyandi:Point to go.
Prasanna Malaiyandi:Yeah.
W. Curtis Preston:Yeah.
W. Curtis Preston:Well, not just that, but also
W. Curtis Preston:2.4 versus five.
W. Curtis Preston:Yeah.
Prasanna Malaiyandi:So actually all of this is part of the wifi standard,
Prasanna Malaiyandi:so the figuring out which access point, that's part of the 802.11r standard.
Prasanna Malaiyandi:And I think that the band steering is also part of the standard as well.
W. Curtis Preston:Yeah.
Prasanna Malaiyandi:a lot of folks are implementing now.
Prasanna Malaiyandi:Some devices don't do well with band steering.
Prasanna Malaiyandi:It basically looks at sort of the difference between the five gigahertz
Prasanna Malaiyandi:and the 2.4 gigahertz and says, okay, which one should I pick?
Prasanna Malaiyandi:And most devices, if it's seven decibels difference or more, then it'll pick,
Prasanna Malaiyandi:uh, the higher the faster speed.
Prasanna Malaiyandi:And so that's kind of how it tricks your devices into picking the right band.
W. Curtis Preston:Interesting.
W. Curtis Preston:Yeah, it's kind of cool.
W. Curtis Preston:Um, all I know is that I finally have a mesh that covers the two.
W. Curtis Preston:Cuz my problem is that I have things in the garage, things embedded inside
W. Curtis Preston:walls in the garage that need wifi, not just, not just inside walls.
W. Curtis Preston:I have a device that's inside a wall, inside an electrical
W. Curtis Preston:cabinet, inside a wall.
W. Curtis Preston:Right?
W. Curtis Preston:I have a Sense, uh, app or a, a device, and that's deep inside my
W. Curtis Preston:electrical, my circuit breaker box.
W. Curtis Preston:Um, and this reached to it.
W. Curtis Preston:No problem.
W. Curtis Preston:It didn't, it it had, it had like two bars.
W. Curtis Preston:Right.
W. Curtis Preston:So clearly, and, and the thing is, it's only, it's like 20 feet from.
Prasanna Malaiyandi:yep.
W. Curtis Preston:Right.
W. Curtis Preston:But it's, you know, a couple of drywall walls and some
W. Curtis Preston:two by fours and some metal.
W. Curtis Preston:Uh, but it worked.
W. Curtis Preston:That's the important part is that it worked.
W. Curtis Preston:Um, yeah, so I th I think I might be in, I think I might
W. Curtis Preston:be in wifi heaven for a while.
W. Curtis Preston:Um, and you too can be there for the low, low price of $350 That's
W. Curtis Preston:a two, that's a two node system.
W. Curtis Preston:Um, And it's supposed like, yeah, but I'm pretty happy.
W. Curtis Preston:But, uh, that's not what we're talking about today.
Prasanna Malaiyandi:really.
Prasanna Malaiyandi:We can talk about wifi all day if you want.
W. Curtis Preston:yeah.
W. Curtis Preston:Well, you could talk about wifi all day.
W. Curtis Preston:I feel really stupid when you're talking about wifi, because I'm
W. Curtis Preston:like, this is not my bailiwick.
W. Curtis Preston:That's a cool word, by the way, bailiwick.
W. Curtis Preston:So I thought we'd talk about backups instead because that's, that's my world.
W. Curtis Preston:And I feel comfortable knowing them.
W. Curtis Preston:Most people don't know crap about this space, uh, because they, they, you know,
W. Curtis Preston:they get the job as a junior person and then next thing you know, they become a,
W. Curtis Preston:a real sys admin or a network admin or a, you know, or a security admin or a dba.
Prasanna Malaiyandi:Yeah, well, except our listeners who are all
Prasanna Malaiyandi:awesome and probably experts in the backup field and know all about this.
W. Curtis Preston:Well, certainly Daniel.
Prasanna Malaiyandi:Hi Daniel.
W. Curtis Preston:Hi Daniel.
W. Curtis Preston:The backup anorak.
W. Curtis Preston:Um, I wonder, you know, he's never, he's never, he better still be
W. Curtis Preston:listening to the show since we call out to him every once in a while.
W. Curtis Preston:Him and Stuart, although Stuart's retired.
W. Curtis Preston:I don't think Stuart's listening to our show.
W. Curtis Preston:I only tell 'em when we talk about 'em.
W. Curtis Preston:But, um, so we're continuing in our Backup to Basics series.
W. Curtis Preston:It's been a couple of weeks, uh, as the kids say it's been a minute,
W. Curtis Preston:uh, since such a, I remember the first time I heard that thing, I was
W. Curtis Preston:like, what are you talking a minute?
W. Curtis Preston:Anyway. But yeah, it's been a minute since we've done an episode of our
W. Curtis Preston:Backup to Basics series, but I am looking down at the book and of course,
W. Curtis Preston:uh, for those of you that don't know, basically we're doing a podcast version
W. Curtis Preston:of my book, Modern Data Protection.
W. Curtis Preston:Make sure it gets in camera here from O'Reilly.
W. Curtis Preston:Uh, you can purchase the, uh, the, the print version from,
W. Curtis Preston:uh, your favorite book seller.
W. Curtis Preston:Um, perhaps it's one based in the Amazon, perhaps not. Um, and,
W. Curtis Preston:uh, but if you would like an ebook version of it, you can get your
W. Curtis Preston:own by going to druva.com/ebook.
W. Curtis Preston:That's d-r-u-v-a.com/ebook.
W. Curtis Preston:They will, of course, ask for your contact information and then email the crap
W. Curtis Preston:out of you until you tell 'em to stop.
W. Curtis Preston:But, That is, that is the price that you pay.
W. Curtis Preston:Um, let's talk
W. Curtis Preston:about, oh yeah.
W. Curtis Preston:And while we're at it, uh, I'll throw out the disclaimer, uh, that
W. Curtis Preston:this is an independent podcast and, um, uh, I work for Druva,
W. Curtis Preston:Prasanna works for Zoom and, um,
W. Curtis Preston:The, um, but the opinions that you hear are ours.
W. Curtis Preston:Um, and.
W. Curtis Preston:Et cetera.
W. Curtis Preston:Please rate us, uh, by going to your, you know, most of you're on iTunes.
W. Curtis Preston:Just scroll down to the bottom there, give us five or six stars and a comment.
W. Curtis Preston:We love comments.
W. Curtis Preston:And, uh, if you'd like to join the conversation, just contact me, W. Curtis
W. Curtis Preston:Preston at Gmail, or @wcpreston on Twitter.
Prasanna Malaiyandi:What about LinkedIn?
W. Curtis Preston:But n oh yeah, LinkedIn.
W. Curtis Preston:Uh, it's linkedin.com/what is it?
W. Curtis Preston:Slash in slash Mr.
W. Curtis Preston:Backup.
W. Curtis Preston:Um, and by the way, my Twitter account already has multifactor
W. Curtis Preston:authentication configured, not using SMS, as should you, especially
W. Curtis Preston:now that they're disabling that. So weird, the way they did that.
W. Curtis Preston:What's funny is I support the decision.
W. Curtis Preston:That's just the way
W. Curtis Preston:they
Prasanna Malaiyandi:way it came out.
Prasanna Malaiyandi:Yeah.
W. Curtis Preston:Oh, Elon.
W. Curtis Preston:Okay.
W. Curtis Preston:So in our Backup to Basics series, we're continuing on, and today we are talking
W. Curtis Preston:about using disk and deduplication.
W. Curtis Preston:You know, I, I, um, couple weeks ago, I hit 30 years in the backup industry,
W. Curtis Preston:and I got interviewed by Chris Mellor
Prasanna Malaiyandi:The Register and Blocks and Files.
W. Curtis Preston:Yeah.
W. Curtis Preston:It's in his, for his Blocks and Files.
W. Curtis Preston:Um, um, blog and one of the questions was what I thought was the most,
W. Curtis Preston:um, important development in the backup industry since I joined.
W. Curtis Preston:And to me, hands down, it's not even, it's not, there's not even a close second, and
W. Curtis Preston:that is the invention of deduplication
Prasanna Malaiyandi:Yep.
W. Curtis Preston:and because.
W. Curtis Preston:I, I can't think of another technology in the backup space that has changed backup
W. Curtis Preston:architecture more than deduplication, and I can think of many other things
W. Curtis Preston:that we do that are only possible because deduplication is underneath them,
Prasanna Malaiyandi:Oh yeah, definitely.
Prasanna Malaiyandi:Yeah.
Prasanna Malaiyandi:I don't think we would be able to get, especially with the data growth
Prasanna Malaiyandi:and the size of these applications.
W. Curtis Preston:Is data growing?
W. Curtis Preston:Is
Prasanna Malaiyandi:No, not at all.
Prasanna Malaiyandi:Right.
Prasanna Malaiyandi:I don't think it would be possible to do, like I know Curtis, you've talked
Prasanna Malaiyandi:about previous, like in your early days, right, about trying to do a backup.
Prasanna Malaiyandi:And being like, oh my God, how am I gonna do this full backup in a weekend?
W. Curtis Preston:Yeah.
Prasanna Malaiyandi:And just with the fact, and I know we'll go and talk
Prasanna Malaiyandi:about more about deduplication, but yeah, just being able to now do that
Prasanna Malaiyandi:in a cost effective way, using new ways of actually doing the backups as well,
Prasanna Malaiyandi:which is enabled with deduplication.
W. Curtis Preston:Yeah.
W. Curtis Preston:So it, it's, it's like disk.
W. Curtis Preston:You could argue that disk using disk and backups is the bigger, uh, advancement.
W. Curtis Preston:But first off, not really an advancement.
W. Curtis Preston:It's just instead of tape, we're gonna use disk,
W. Curtis Preston:but
Prasanna Malaiyandi:was there to start with anyway.
Prasanna Malaiyandi:It was just sort of, the cost was so high, and especially given the type of workload
Prasanna Malaiyandi:you see with deduplication where, or with backups where you're doing periodic
Prasanna Malaiyandi:fulls or other things like that, and keeping them for long periods of time.
Prasanna Malaiyandi:Are you going to spend what, 40 x or 30 x on storage for your backup
Prasanna Malaiyandi:system versus your production?
Prasanna Malaiyandi:Right.
Prasanna Malaiyandi:That's a hard sell.
W. Curtis Preston:just, yeah, cuz that's a problem.
W. Curtis Preston:So one of the, one of the things, uh, that I remember from back in the
W. Curtis Preston:day, like I, I don't remember really thinking about this lately, but back
W. Curtis Preston:in the day, I would say that for every gigabyte of primary storage, you
W. Curtis Preston:had 20 gigabytes of backup storage.
W. Curtis Preston:And so if you're gonna do that with disk, even, you know, even once, many years ago.
W. Curtis Preston:Wow.
W. Curtis Preston:At this point, it's like 20 years ago. But, but even once they came out with
W. Curtis Preston:this idea of, Uh, SATA disk instead
W. Curtis Preston:of
Prasanna Malaiyandi:nearline
Prasanna Malaiyandi:storage.
W. Curtis Preston:Right.
W. Curtis Preston:Um, that, that helped bring the cost down significantly.
W. Curtis Preston:But, But,
W. Curtis Preston:not, But not as much as deduplication.
Prasanna Malaiyandi:Yeah.
Prasanna Malaiyandi:Because even with those price differences, right?
Prasanna Malaiyandi:Maybe it was half the price or a third of the price, but once you add
Prasanna Malaiyandi:in that 20 x that you talked about, right, Curtis, then that adds up.
Prasanna Malaiyandi:And it's not only just the storage cost, it's also you have
Prasanna Malaiyandi:to account for the power, the cooling, the floor space, right?
Prasanna Malaiyandi:All the things that go into that system.
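The arithmetic behind that point can be sketched in a few lines of Python. This is only a rough illustration: the 20-to-1 backup multiple and the "half the price" disk come from the conversation, while the 1,000 GB of primary data, the normalized price of 1.00 per GB, and the 10:1 dedupe ratio are assumptions picked to make the numbers easy to follow. It also ignores the power, cooling, and floor space just mentioned.

```python
def backup_storage_cost(primary_gb, backup_multiple, price_per_gb, dedupe_ratio=1.0):
    """Cost of the disk needed to hold the backups for some amount of primary data."""
    logical_gb = primary_gb * backup_multiple   # what all the retained backups add up to
    physical_gb = logical_gb / dedupe_ratio     # what actually has to land on disk
    return physical_gb * price_per_gb


PRIMARY_GB = 1_000     # assumed amount of primary storage
PRIMARY_PRICE = 1.00   # assumed, normalized: primary disk costs 1 unit per GB
CHEAP_PRICE = 0.50     # "half the price" disk, as in the conversation

primary_cost = PRIMARY_GB * PRIMARY_PRICE
no_dedupe = backup_storage_cost(PRIMARY_GB, 20, CHEAP_PRICE)
with_dedupe = backup_storage_cost(PRIMARY_GB, 20, CHEAP_PRICE, dedupe_ratio=10)  # 10:1 is hypothetical

# Backup disk cost expressed as a multiple of the primary disk cost.
print(no_dedupe / primary_cost)
print(with_dedupe / primary_cost)
```

Under these assumptions, half-price disk still leaves the backup tier costing ten times the primary tier, and it takes a dedupe ratio on the order of 10:1 to bring the two roughly level.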
W. Curtis Preston:Yeah.
W. Curtis Preston:Yeah.
W. Curtis Preston:Um, it's funny, um, just sort of, just sort of a, an afterthought that, that.
W. Curtis Preston:Post that, um, that Chris Mellor did about the 30 years.
W. Curtis Preston:The one group that jumped on the article and just started retweeting all kinds of
W. Curtis Preston:parts of, of, or pieces of the article was the tape group, because I said,
W. Curtis Preston:I said really good things about tape.
W. Curtis Preston:And, and the thing is that, um, you know, I, I, you know, I, I
W. Curtis Preston:believe in all of those things, but.
W. Curtis Preston:You know, all of the advancements that I've seen in backup in the last 20 plus
W. Curtis Preston:years has been disk and deduplication.
W. Curtis Preston:Right.
W. Curtis Preston:Um, so let's talk about, so what, so not everybody really
W. Curtis Preston:understands what deduplication is.
W. Curtis Preston:Some people used to describe it like, well, it's like compression, uh, the way
W. Curtis Preston:I remember it's like macro compression.
W. Curtis Preston:Um, it's like compression over time.
W. Curtis Preston:What do you think of that?
Prasanna Malaiyandi:uh, I don't quite like that, so, so, right.
W. Curtis Preston:may be some old blog posts that I might have
W. Curtis Preston:said that phrase, but go ahead.
Prasanna Malaiyandi:so in my mind, right, deduplication is.
Prasanna Malaiyandi:Finding two identical segments and tossing one away, keeping only one copy,
Prasanna Malaiyandi:but still keeping a reference to that so you can, so you still know you have two
Prasanna Malaiyandi:virtual copies, but one physical copy,
W. Curtis Preston:Mm-hmm.
Prasanna Malaiyandi:right?
Prasanna Malaiyandi:At a high level, that's what I, and now
W. Curtis Preston:you?
Prasanna Malaiyandi:what is compression is taking an object, a singular object,
Prasanna Malaiyandi:and squeezing it into a smaller space.
W. Curtis Preston:Right.
W. Curtis Preston:But how do you understand how compression works?
W. Curtis Preston:Cuz I Sure as hell don't
Prasanna Malaiyandi:yeah, so typically like you would run it
Prasanna Malaiyandi:through different types of algorithms like LZ compression and all the rest
Prasanna Malaiyandi:in order to look for patterns and throw away bits and compress it down.
Prasanna Malaiyandi:Now, the difference I would say between dedupe and compression,
Prasanna Malaiyandi:because they do sound the same,
W. Curtis Preston:Yeah.
Prasanna Malaiyandi:right?
Prasanna Malaiyandi:I would say one of the differences is with deduplication.
Prasanna Malaiyandi:It's more like a file system level compression, if you want to
Prasanna Malaiyandi:think of it that way, because it's not just I'm taking this block.
Prasanna Malaiyandi:Yeah.
Prasanna Malaiyandi:It's not just I'm taking this and I'm squeezing it down such
Prasanna Malaiyandi:that it could be, I just need to look at this and figure it out.
Prasanna Malaiyandi:Right.
Prasanna Malaiyandi:It's a lot more complex than that.
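One way to see the distinction is a small Python sketch that puts the two side by side. It assumes zlib as a stand-in for "LZ compression and all the rest" and SHA-1 as the fingerprint; the sample data is made up. The block built from SHA-256 output is effectively random, so a compressor finds no patterns inside it, yet the second copy of it is still a perfect duplicate.

```python
import hashlib
import zlib

# Compression works inside a single object: repetitive text shrinks a lot.
repetitive = b"nightly backup " * 500
print(len(repetitive), "->", len(zlib.compress(repetitive)))

# A 4,096-byte block with no internal patterns (made from SHA-256 output).
block = b"".join(hashlib.sha256(i.to_bytes(4, "big")).digest() for i in range(128))
backups = [block, block]   # the same block shows up in two backups

# Compressing each copy on its own saves nothing.
compressed_total = sum(len(zlib.compress(b)) for b in backups)

# Dedupe compares objects with each other and keeps one physical copy.
unique = {hashlib.sha1(b).hexdigest(): b for b in backups}
deduped_total = sum(len(b) for b in unique.values())

print("raw:", sum(len(b) for b in backups))
print("compressed separately:", compressed_total)
print("deduplicated:", deduped_total)
```

Real products often do both: dedupe first, then compress the unique chunks that remain.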
W. Curtis Preston:It is definitely a lot more complex than compression.
W. Curtis Preston:Right.
W. Curtis Preston:Um, I, I, I've just, I've, I've just honestly never really dug into the code
W. Curtis Preston:of how traditional compression works.
W. Curtis Preston:So the idea is that I'm looking for duplicate segments of data across many
W. Curtis Preston:places, both from different sources as well as different time periods, right?
W. Curtis Preston:I'm, I'm comparing the, this chunk of data that's coming in right
W. Curtis Preston:now and, and tonight's backup.
W. Curtis Preston:I'm comparing it literally with every chunk of data that I've ever received
W. Curtis Preston:from anywhere else.
Prasanna Malaiyandi:I would say that's
Prasanna Malaiyandi:builds their deduplication that way.
W. Curtis Preston:So,
Prasanna Malaiyandi:where
Prasanna Malaiyandi:it's
W. Curtis Preston:there are, yeah, go ahead.
Prasanna Malaiyandi:Yeah.
Prasanna Malaiyandi:So it all goes down to sort of what is your deduplication domain is another
Prasanna Malaiyandi:term that some people talk about, right?
Prasanna Malaiyandi:Which is, is it limited to a system?
Prasanna Malaiyandi:Is it limited to a cluster which might be formed of multiple systems,
Prasanna Malaiyandi:or is it limited to sort of a single backup stream coming in?
Prasanna Malaiyandi:So
Prasanna Malaiyandi:there.
W. Curtis Preston:that the question is what is your data domain?
W. Curtis Preston:Uh,
Prasanna Malaiyandi:Yeah.
Prasanna Malaiyandi:Dedupe domain.
W. Curtis Preston:So let's back up.
W. Curtis Preston:So a, as I understand it, right, so basically we're taking the data
W. Curtis Preston:that's, that's coming in or that's going to come in, we're slicing
W. Curtis Preston:it up into, I like the term chunk.
W. Curtis Preston:Right?
W. Curtis Preston:We run those chunks through a cryptographic hashing algorithm.
W. Curtis Preston:SHA-1, SHA-256, whatever you're using.
W. Curtis Preston:On the other side of that, we get an alphanumeric value; in the case of SHA-1,
W. Curtis Preston:it's a 160-bit alphanumeric value.
W. Curtis Preston:So basically you, you, depending on the algorithm you use, you get a, um, you get
W. Curtis Preston:an alphanumeric value at the end, and the size of that val, of that value is going
W. Curtis Preston:to be based on which algorithm you use.
W. Curtis Preston:In the case of SHA-1, it's 160 bits, right?
W. Curtis Preston:And.
W. Curtis Preston:You can then take the 160 bits.
W. Curtis Preston:You can't reverse engineer it.
W. Curtis Preston:You can't take the 160 bits and turn it into the chunk, but you can use that, that
W. Curtis Preston:value to uniquely identify that chunk.
W. Curtis Preston:And so if you have another chunk of data, regardless of where it came from,
W. Curtis Preston:if its 160-bit value (again, that's SHA-1; other algorithms give different sizes),
W. Curtis Preston:if its fingerprint is the same, you can say that this chunk is identical
W. Curtis Preston:to that other chunk that had the same fingerprint, and you can then
W. Curtis Preston:discard the other chunk, right?
W. Curtis Preston:the,
W. Curtis Preston:the
Prasanna Malaiyandi:Yeah, you can, you can discard the actual data,
Prasanna Malaiyandi:but you should still keep track of it somewhere in a file system,
Prasanna Malaiyandi:just because you need, still need
W. Curtis Preston:Yeah.
W. Curtis Preston:You're gonna keep track.
W. Curtis Preston:Oh, we found another one of these,
W. Curtis Preston:right?
Prasanna Malaiyandi:And so usually that lookup is in a deduplication
Prasanna Malaiyandi:index is what they called them.
Prasanna Malaiyandi:Usually a dedupe index, which keeps a list of, Hey, here are
Prasanna Malaiyandi:all the fingerprints that I have.
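The chunk, fingerprint, and index flow described here can be sketched in Python roughly as follows. This is a toy under stated assumptions, not how any particular product works: the `DedupeStore` class, the 4,096-byte fixed chunk size, and the in-memory dictionaries are all made up for illustration. SHA-1 is used because it is the example in the conversation.

```python
import hashlib

CHUNK_SIZE = 4096  # assumed fixed chunk size, for illustration only


class DedupeStore:
    """Toy dedupe index: one physical copy per fingerprint, plus reference counts."""

    def __init__(self):
        self.index = {}  # fingerprint -> chunk bytes (the single physical copy)
        self.refs = {}   # fingerprint -> how many times that chunk has been seen

    def write(self, data):
        """Store a backup and return its recipe: the ordered list of fingerprints."""
        recipe = []
        for offset in range(0, len(data), CHUNK_SIZE):
            chunk = data[offset:offset + CHUNK_SIZE]
            fingerprint = hashlib.sha1(chunk).hexdigest()  # 160 bits, shown as 40 hex characters
            if fingerprint not in self.index:
                self.index[fingerprint] = chunk            # new chunk: keep the data
            self.refs[fingerprint] = self.refs.get(fingerprint, 0) + 1
            recipe.append(fingerprint)                     # duplicates are only referenced
        return recipe

    def read(self, recipe):
        """Rebuild a backup from its recipe."""
        return b"".join(self.index[fp] for fp in recipe)


store = DedupeStore()
monday = b"A" * CHUNK_SIZE + b"B" * CHUNK_SIZE
tuesday = b"A" * CHUNK_SIZE + b"C" * CHUNK_SIZE   # first chunk unchanged since Monday

monday_recipe = store.write(monday)
tuesday_recipe = store.write(tuesday)

print(len(store.index))                 # unique chunks physically stored
print(store.refs[monday_recipe[0]])     # references to the shared "A" chunk
print(store.read(tuesday_recipe) == tuesday)
```

The recipe is the "keeping track" part: the duplicate chunk's bytes are discarded, but each backup still lists every fingerprint it needs, so a restore can be rebuilt from the index. The sketch trusts matching fingerprints completely; SHA-1 collisions have been demonstrated in practice, so a real system might use a stronger hash or compare the bytes as well.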
W. Curtis Preston:Right.
W. Curtis Preston:As we, we were alluding to before, one of the things that determines
W. Curtis Preston:sort of your effectiveness of, of dedupe is the dedupe domain, right?
W. Curtis Preston:So I've seen it file system level, meaning it only looks for
W. Curtis Preston:duplicate data within each volume.
W. Curtis Preston:I've seen it host level, I've seen it backup level, meaning
W. Curtis Preston:literally backup configuration wise.
W. Curtis Preston:right?
W. Curtis Preston:So if I, if I have a Windows server and I'm backing up the host and I'm
W. Curtis Preston:backing up SQL Server, I only look for duplicates within SQL Server
W. Curtis Preston:backups right against each other.
W. Curtis Preston:Uh, then we have, um, if we're backing up several systems to a box, right?
W. Curtis Preston:Maybe that the dedupe domain is only within that box.
W. Curtis Preston:It's only looking for.
W. Curtis Preston:Duplicates between all of that.
W. Curtis Preston:And then there's what I would call truly global dedupe, which is, we're looking
W. Curtis Preston:for duplicates from everything that's coming in, uh, from multiple sources.
W. Curtis Preston:Right?
Prasanna Malaiyandi:Mm-hmm.
W. Curtis Preston:there is a.
W. Curtis Preston:Point of decreasing marginal returns, right?
W. Curtis Preston:You can argue, and certainly if you're a company that only does d dedupe within,
W. Curtis Preston:like earlier I was, we only looked for dupes within SQL server backups.
W. Curtis Preston:You could make an argument that, well, there's not a lot of duplicate data
W. Curtis Preston:between SQL Server and Windows, right?
W. Curtis Preston:so even though we're not comparing the two, there's not, there's not gonna
W. Curtis Preston:be a lot of duplicate data there, and there's not gonna be a lot of duplicate.
W. Curtis Preston:between the SQL Server database on this host and the SQL
W. Curtis Preston:Server database on that host.
W. Curtis Preston:So that's another argument that some
Prasanna Malaiyandi:but, but I think a lot of that was because of
Prasanna Malaiyandi:architectural limitations of the products themselves rather than,
Prasanna Malaiyandi:that is really what you wanted to do.
Prasanna Malaiyandi:Right?
Prasanna Malaiyandi:Because
Prasanna Malaiyandi:that's more of a management issue.
W. Curtis Preston:they didn't, It was like, it was like, well, if we're
W. Curtis Preston:gonna do it, if we're gonna do it that way, it's gonna be much harder.
W. Curtis Preston:to, to, to design a product to do it that way.
W. Curtis Preston:And we don't think, we don't think that there's going to
W. Curtis Preston:be that much more benefit, um,
Prasanna Malaiyandi:But on the other hand, if you look
Prasanna Malaiyandi:at things like VMware, right?
Prasanna Malaiyandi:If I have a bunch of VMs, right, there's a good cha, and they all came
Prasanna Malaiyandi:from a single golden image, right?
Prasanna Malaiyandi:There's a good chance that as you're backing it up, 80, 90% of that stuff
Prasanna Malaiyandi:is all gonna be deduplicated, right?
W. Curtis Preston:Absolutely.
W. Curtis Preston:Yeah.
W. Curtis Preston:There's also a lot of duplicate data even within like a large filer, right?
W. Curtis Preston:There's gonna be lots of duplicate data there, right?
W. Curtis Preston:So if you're only doing it volume to volume or backup configuration to
W. Curtis Preston:backup configuration, you, there's a lot of duplicate data that I
W. Curtis Preston:think you would, you would miss.
W. Curtis Preston:Um,
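The effect of the dedupe domain can be sketched in Python by running the same backups through per-host indexes and then through a single global index. Everything here is made up for illustration: three VMs cloned from one golden image of eight blocks, one unique block each, and a hypothetical `stored_chunks` helper that counts how many chunks end up physically stored.

```python
import hashlib


def fingerprint(chunk):
    return hashlib.sha1(chunk).hexdigest()


def stored_chunks(backups, domain_of):
    """Count physically stored chunks when each dedupe domain has its own index.

    backups   -- list of (source_name, list_of_chunks)
    domain_of -- function mapping a source name to the name of its dedupe domain
    """
    indexes = {}
    for source, chunks in backups:
        index = indexes.setdefault(domain_of(source), set())
        for chunk in chunks:
            index.add(fingerprint(chunk))   # duplicates within a domain collapse
    return sum(len(index) for index in indexes.values())


# Three VMs cloned from one golden image, each with a little unique data.
golden_image = [b"os-block-%d" % i for i in range(8)]
backups = [
    ("vm1", golden_image + [b"vm1-data"]),
    ("vm2", golden_image + [b"vm2-data"]),
    ("vm3", golden_image + [b"vm3-data"]),
]

per_host = stored_chunks(backups, lambda source: source)       # each host is its own domain
global_scope = stored_chunks(backups, lambda source: "global")  # one domain for everything

print("per-host domains:", per_host)
print("global domain:", global_scope)
```

With per-host domains the golden image is stored once per VM; with a global domain it is stored once in total. How much that matters in practice depends on how much data the sources really share.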
Prasanna Malaiyandi:I know you talked about the domains, but I think another
Prasanna Malaiyandi:thing to also mention is, Some products do different types of chunking, if you will.
Prasanna Malaiyandi:Some do it at the file level, others do it at sort of a smaller level, right?
Prasanna Malaiyandi:And some do sort of fixed segment where each one is sort of a fixed length.
Prasanna Malaiyandi:Others do sort of variable segments where they try to figure out what is optimal,
Prasanna Malaiyandi:because depending on how you're doing your fingerprinting, right, you want to
Prasanna Malaiyandi:find the most number of matches, right?
Prasanna Malaiyandi:So you can save on storage.
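Why chunking style changes the number of matches can be sketched in Python. The code below compares fixed-length chunks with a deliberately simple content-defined (variable) chunker on data that has one byte inserted at the front. The window size, minimum chunk size, and boundary rule are assumptions chosen to keep the example short; real products use faster rolling hashes and their own tuning.

```python
import hashlib
import zlib

WINDOW = 16      # bytes examined when deciding on a boundary (assumed)
MIN_CHUNK = 64   # never cut a chunk shorter than this (assumed)
MASK = 0xFF      # cut when the low 8 bits of the window hash are all ones


def fixed_chunks(data, size=256):
    return [data[i:i + size] for i in range(0, len(data), size)]


def variable_chunks(data):
    """Cut wherever the last WINDOW bytes hash to a chosen pattern."""
    chunks, start = [], 0
    for i in range(len(data)):
        if i - start + 1 < MIN_CHUNK:
            continue
        window = data[i - WINDOW + 1:i + 1]
        if (zlib.adler32(window) & MASK) == MASK:   # boundary depends only on content
            chunks.append(data[start:i + 1])
            start = i + 1
    if start < len(data):
        chunks.append(data[start:])                 # whatever is left at the end
    return chunks


def shared_fingerprints(chunker, old, new):
    old_fps = {hashlib.sha1(c).hexdigest() for c in chunker(old)}
    new_fps = {hashlib.sha1(c).hexdigest() for c in chunker(new)}
    return len(old_fps & new_fps)


# 32 KB of pattern-free sample data, then the same data with one byte inserted in front.
original = b"".join(hashlib.sha256(i.to_bytes(4, "big")).digest() for i in range(1024))
modified = b"X" + original

fixed_shared = shared_fingerprints(fixed_chunks, original, modified)
variable_shared = shared_fingerprints(variable_chunks, original, modified)

print("fixed-length chunks still matching:", fixed_shared)
print("variable-length chunks still matching:", variable_shared)
```

The inserted byte shifts every fixed-length boundary, so the old fingerprints stop matching, while content-defined boundaries move with the data and most chunks are recognized again.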
W. Curtis Preston:right.
W. Curtis Preston:I,
Prasanna Malaiyandi:another thing that also comes up.
W. Curtis Preston:I would argue that file level dedupe isn't really
W. Curtis Preston:dedupe, it's more a single instance.
W. Curtis Preston:Right.
W. Curtis Preston:Um, that's like single instance storage of a file, you
W. Curtis Preston:know?
W. Curtis Preston:Okay.
W. Curtis Preston:It, it's, yeah.
W. Curtis Preston:But so I, I'm always thinking subfile, uh, when I think about what I think
W. Curtis Preston:of actual dedupe. There is a much, like a very big, uh, other way that
W. Curtis Preston:we divide up the dedupe industry, and that is source versus target.
Prasanna Malaiyandi:Yep.
W. Curtis Preston:Um, the, um, the first dedupe product I ever saw,
W. Curtis Preston:which was, uh, no, was not, that was not the first, no, the first one I saw
W. Curtis Preston:the product at the time was called Undoo.
W. Curtis Preston:Have we talked
W. Curtis Preston:about this?
Prasanna Malaiyandi:Mm.
W. Curtis Preston:Undoo with two Os.
W. Curtis Preston:It was really funny that the name of a dedupe vendor
W. Curtis Preston:had duplicate data in their company name.
W. Curtis Preston:It was Undoo with two Os.
W. Curtis Preston:You know this product, you just don't know that that's what it used to be called.
Prasanna Malaiyandi:What is it?
Prasanna Malaiyandi:What
W. Curtis Preston:give you a, I'll give, I'll give you a hint.
W. Curtis Preston:It.
W. Curtis Preston:The name comes from the fact that it would be a sea of availability.
W. Curtis Preston:I'm gonna, I'm gonna put the, the Jeopardy theme in here.
Prasanna Malaiyandi:What, a sea of availability?
W. Curtis Preston:That's what the name, that's where the name for the company
W. Curtis Preston:comes from, or if I want to put it in the right order, an availability sea.
Prasanna Malaiyandi:I don't know what this is.
W. Curtis Preston:Avamar
Prasanna Malaiyandi:Oh, oh, that makes sense.
W. Curtis Preston:Yeah.
W. Curtis Preston:So that's, that's where the name Avamar came from.
W. Curtis Preston:So the, the first
Prasanna Malaiyandi:I should know that
W. Curtis Preston:you shouldn't know
Prasanna Malaiyandi:it having been, uh, part of my former employer.
Prasanna Malaiyandi:Yes.
W. Curtis Preston:Yeah.
W. Curtis Preston:Well, I mean, you know, I, I have a bit of an inside track because they're, they
W. Curtis Preston:were right up the road from me, right?
W. Curtis Preston:They were up there.
W. Curtis Preston:They were up in Irvine.
W. Curtis Preston:Um, and that was, uh, the first dedupe product.
W. Curtis Preston:They were a source dedupe. So what's the difference between source
W. Curtis Preston:dedupe and target dedupe, Prasanna?
Prasanna Malaiyandi:So the biggest one is, so let's first
Prasanna Malaiyandi:talk about target dedupe, right?
Prasanna Malaiyandi:So target dedupe is: data comes into the system and then a deduplication
Prasanna Malaiyandi:algorithm runs and tosses away the duplicate data.
Prasanna Malaiyandi:It can support any type of client as long as the client supports
Prasanna Malaiyandi:whatever protocol it has.
Prasanna Malaiyandi:So if it's NFS or SMB, right?
Prasanna Malaiyandi:Whatever can write to it, the data gets deduped.
W. Curtis Preston:Hang on, hang on.
W. Curtis Preston:Before you go on to that.
W. Curtis Preston:I don't disagree with what you said.
W. Curtis Preston:I just, I think there could be a little bit more clarification.
W. Curtis Preston:It's a box that I send whatever I want to.
Prasanna Malaiyandi:Yep.
W. Curtis Preston:Typically it, the thing about target dedupe was that,
W. Curtis Preston:um, that it was, you didn't have to do a lot of re-engineering of
W. Curtis Preston:your backup system.
Prasanna Malaiyandi:it's like a VTL system, right?
Prasanna Malaiyandi:That came.
W. Curtis Preston:plug in a box.
W. Curtis Preston:Yeah.
W. Curtis Preston:And you would send you, and basically you stopped using tape and you
W. Curtis Preston:sent your backups to this box.
W. Curtis Preston:Maybe the box might even be pretending to be a tape library,
W. Curtis Preston:the virtual tape library.
W. Curtis Preston:Right.
W. Curtis Preston:Um, and then it did all the dedupe magic over there.
W. Curtis Preston:Um,
Prasanna Malaiyandi:Which was great because you can
Prasanna Malaiyandi:just plug in your box and go.
Prasanna Malaiyandi:Now the other side is called source side dedupe, instead of sending all the
Prasanna Malaiyandi:data and tossing it away, why don't we do something smart and actually figure
Prasanna Malaiyandi:out the duplicates on the client itself, on the source right, dedupe on the
Prasanna Malaiyandi:source, and only send the unique data.
Prasanna Malaiyandi:And this has the advantage of
Prasanna Malaiyandi:actually not sending the data over the wire, which is actually
Prasanna Malaiyandi:a huge benefit that people don't always understand, right?
Prasanna Malaiyandi:Is not sending the data can actually make it a lot faster, even though
Prasanna Malaiyandi:you think, oh, I'm now putting additional load on my server itself.
Prasanna Malaiyandi:But it ends up being better than trying to send all the data and just tossing
Prasanna Malaiyandi:it away like target-side dedupe does.
W. Curtis Preston:I would say it theoretically should be better
W. Curtis Preston:right?
W. Curtis Preston:Because you, I'm just saying I've seen some crappy source dedupe systems, right?
Prasanna Malaiyandi:Okay.
Prasanna Malaiyandi:Sorry.
Prasanna Malaiyandi:I've seen some, I've seen some good ones, or the ones that I've
Prasanna Malaiyandi:interacted with have been good.
Prasanna Malaiyandi:And so I've seen the performance numbers around
W. Curtis Preston:Yeah.
W. Curtis Preston:I, I do think it, it makes more sense to me.
W. Curtis Preston:It always made more sense to me.
W. Curtis Preston:The only reason why we had target dedupe was because to do source dedupe, you
W. Curtis Preston:have to redesign the backup product,
W. Curtis Preston:right?
W. Curtis Preston:It took a long time to get, to get, uh, basically you had to
W. Curtis Preston:stop using NetBackup, NetWorker, or TSM, whatever it was back in the
W. Curtis Preston:day, and you had to replace it.
W. Curtis Preston:Like in this case with Avamar, Avamar was a source dedupe product.
W. Curtis Preston:You had to do what we call a forklift upgrade.
W. Curtis Preston:You had to throw out the baby with the bathwater, whatever phrase, whatever.
W. Curtis Preston:You know,
W. Curtis Preston:uh, analogy you want to use there.
W. Curtis Preston:That was the main problem as I saw it with source dedup.
W. Curtis Preston:Right.
W. Curtis Preston:Is that, is that you, you had to change your backup product to get it,
Prasanna Malaiyandi:and that was in the beginning, right?
Prasanna Malaiyandi:At the very early
W. Curtis Preston:Well, well, You.
W. Curtis Preston:You, well, yeah.
W. Curtis Preston:Now you just have to, have to upgrade your backup product, right?
W. Curtis Preston:Because many modern backup technologies now support source dedupe, although
W. Curtis Preston:even some newer backup technologies don't. I don't know if, I dunno if that
W. Curtis Preston:came out in English; there were some double negatives in there.
W. Curtis Preston:Some newer, very new backup technologies
W. Curtis Preston:don't do source dedupe.
Prasanna Malaiyandi:which seems bonkers.
W. Curtis Preston:which does seem bonkers.
W. Curtis Preston:Um, I, you know, and, um, I'm talking about the likes of
W. Curtis Preston:Rubrik and Cohesity, right?
W. Curtis Preston:These are new, these are, you know, next gen backup products that were designed
W. Curtis Preston:in the last, less than the last 10 years.
W. Curtis Preston:Right.
W. Curtis Preston:And it's based on an appliance model,
W. Curtis Preston:and they do all the dedupe inside that box, is my understanding, right?
Prasanna Malaiyandi:And I just wanna challenge that, Curtis, because I thought
Prasanna Malaiyandi:in some cases, They do do source side deduplication, but I think because they've
Prasanna Malaiyandi:tried to be open and act as a target device, in those cases, you can't, like,
Prasanna Malaiyandi:you don't really have another option.
W. Curtis Preston:Yeah, I, I don't, well, again, I'm not,
Prasanna Malaiyandi:I, but I don't know
W. Curtis Preston:work at, I work at Druva, not at Rubrik, uh, or, or Cohesity.
W. Curtis Preston:But it is my understanding that they do target side dedup, which is, and,
W. Curtis Preston:and one of the challenges of target side dedup is you need an appliance
W. Curtis Preston:at each location.
W. Curtis Preston:Now I know that they can do virtual appliances, right?
W. Curtis Preston:So they have a, they have a VM level appliance.
W. Curtis Preston:Uh, but you need a box or something pretending to be a box at each location,
W. Curtis Preston:because if you're not eliminating the duplicates before you send it
W. Curtis Preston:to the box, um, then you need, you need something that's on-prem, right?
Prasanna Malaiyandi:Because you definitely don't wanna
Prasanna Malaiyandi:send that all over the WAN
W. Curtis Preston:No, no, that's the, to me, that's the biggest advantage
W. Curtis Preston:of a source dedupe system is that it's ultimately scalable, right?
W. Curtis Preston:That you, that assuming, assuming it doesn't slow things down, assuming,
W. Curtis Preston:assuming all these things, assuming that the product actually works, um, that
W. Curtis Preston:you, um, you could back up a laptop,
W. Curtis Preston:right?
W. Curtis Preston:You can back up a mobile phone and the, the duplicate data will be eliminated
W. Curtis Preston:before it's sent over the WAN, which is what you need to do if you're
W. Curtis Preston:backing up something over the internet.
Prasanna Malaiyandi:Mm-hmm.
W. Curtis Preston:Right.
W. Curtis Preston:Um, and, um, so the, the downside that some, you know, again, you,
W. Curtis Preston:you, you talked about it already, is that it does put additional
W. Curtis Preston:compute requirement on the client.
W. Curtis Preston:The argument is that it's offset by the,
W. Curtis Preston:um, the savings of the network bandwidth.
W. Curtis Preston:Right.
W. Curtis Preston:Um,
Prasanna Malaiyandi:There is also one more downside,
W. Curtis Preston:okay.
Prasanna Malaiyandi:which is that.
Prasanna Malaiyandi:Not all applications can do source side deduplication.
Prasanna Malaiyandi:So if you do have an application which only supports writing to like
Prasanna Malaiyandi:an NFS Mount point or an SMB Mount point, or something that doesn't
Prasanna Malaiyandi:allow the integration of this source side deduplication logic,
Prasanna Malaiyandi:then you are going to need to be able to support target side dedupe.
W. Curtis Preston:Yep.
W. Curtis Preston:Uh, agreed.
W. Curtis Preston:Um, and an example of that would be like, um, uh, Oracle, right?
Prasanna Malaiyandi:Yep.
Prasanna Malaiyandi:Incremental merge.
W. Curtis Preston:yeah.
W. Curtis Preston:Um, although I would think that you should be able, I don't know, we could, we
Prasanna Malaiyandi:No, you can't.
Prasanna Malaiyandi:You can't.
Prasanna Malaiyandi:You can't.
W. Curtis Preston:You can't take the Oracle stream and slice it and dice it.
W. Curtis Preston:I don't know.
Prasanna Malaiyandi:Did you what?
Prasanna Malaiyandi:Sorry?
Prasanna Malaiyandi:You could, um, there are companies out there which give, which provide
Prasanna Malaiyandi:a virtual file system interface
Prasanna Malaiyandi:that lives
W. Curtis Preston:So you you fake it.
W. Curtis Preston:You fake it out.
W. Curtis Preston:Yeah.
W. Curtis Preston:Okay.
W. Curtis Preston:All right.
W. Curtis Preston:And then I've got something called hybrid dedupe and this, this was
W. Curtis Preston:invented by your former employer.
Prasanna Malaiyandi:I don't even know what a hybrid dedupe is.
W. Curtis Preston:it's, it's, it's target dedupe pretending to be source dedupe.
W. Curtis Preston:It's
Prasanna Malaiyandi:D.
Prasanna Malaiyandi:Oh, see, here's my, okay, so here's my problem is I think Boost
W. Curtis Preston:Uhhuh.
Prasanna Malaiyandi:is source-side
Prasanna Malaiyandi:deduplication. I don't know if I would call it hybrid, because it is
Prasanna Malaiyandi:very similar to what Avamar did,
Prasanna Malaiyandi:right?
Prasanna Malaiyandi:It's moving the deduplication logic to the client
Prasanna Malaiyandi:such that you could do all of the computation.
Prasanna Malaiyandi:The same thing that we have talked about with source-side deduplication,
W. Curtis Preston:I, I'll tell you why I put it in a different category.
W. Curtis Preston:To me, hybrid dedupe is redoing the backup software.
W. Curtis Preston:I'm sorry, source dedupe, true source-side deduplication.
W. Curtis Preston:It's done at the backup software level,
Prasanna Malaiyandi:Okay, then.
Prasanna Malaiyandi:I
W. Curtis Preston:with, with hybrid dedupe, I'm still dumbly sending
W. Curtis Preston:everything to this source dedupe thing that's gonna redo it, right?
W. Curtis Preston:Um, it doesn't matter in the end, you get, you get roughly the same benefits, right?
W. Curtis Preston:Um, that's what, uh,
Prasanna Malaiyandi:Okay.
Prasanna Malaiyandi:So with hybrid, yeah.
Prasanna Malaiyandi:You get the benefits of source without having to upgrade and, or sorry,
Prasanna Malaiyandi:throw away your backup software.
W. Curtis Preston:Right, right, right.
W. Curtis Preston:Um, so I, I, um, we spent most of this time talking about dedupe. Um,
W. Curtis Preston:there are a bunch of different ways to use disk in your backup system,
W. Curtis Preston:some of which don't really require dedupe, right?
W. Curtis Preston:We used to do what we call disk caching, where you just had enough
W. Curtis Preston:disk for last night's backup.
W. Curtis Preston:You would back up to disk and then you would copy that to tape, and then
W. Curtis Preston:you would hand that to a man in a van.
W. Curtis Preston:Uh, then we got a bunch of different things.
W. Curtis Preston:I got D to D to T, D to D to D, D to D, D to C, and D to D to C.
W. Curtis Preston:Did I do all that?
W. Curtis Preston:So disk to disk to tape, disk to disk to disk, direct to cloud, and
W. Curtis Preston:disk to disk to cloud, right?
W. Curtis Preston:So these are all ways that people use disk in current backup systems.
W. Curtis Preston:Um, to me, D to D to C, or disk to disk to cloud, is really disk to disk
W. Curtis Preston:to disk; it's just that the
W. Curtis Preston:disk is being run by the cloud, right?
W. Curtis Preston:And I will say that dedupe, by the way, I will say that without dedupe,
W. Curtis Preston:The whole thing of using the cloud, the way we use the cloud just wouldn't work.
W. Curtis Preston:I mean, you can't send full backups to the cloud.
W. Curtis Preston:I mean, you could, with unlimited bandwidth.
Prasanna Malaiyandi:well, and yeah, with unlimited bandwidth
Prasanna Malaiyandi:it would just be expensive.
Prasanna Malaiyandi:Right.
Prasanna Malaiyandi:Just going back to the conversation we had earlier about the WAN, right?
Prasanna Malaiyandi:You don't wanna send full copies out over the WAN.
W. Curtis Preston:right.
Prasanna Malaiyandi:Um, because that gets expensive and very slow.
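A back-of-the-envelope sketch of why full backups over a WAN are slow. The 100 TB full, the 2 TB of daily unique data, and the 1 Gbps link are hypothetical figures chosen for illustration, and the calculation assumes the link is fully used with no protocol overhead.

```python
def transfer_hours(terabytes, link_gbps):
    """Hours to move `terabytes` (decimal TB) over a link of `link_gbps` gigabits/s."""
    bits = terabytes * 1e12 * 8
    seconds = bits / (link_gbps * 1e9)
    return seconds / 3600


full = transfer_hours(100, 1)  # hypothetical 100 TB full backup
incr = transfer_hours(2, 1)    # hypothetical 2 TB of unique data after dedupe
print(f"{full:.0f} h, {incr:.1f} h")  # 222 h, 4.4 h
```

Under those assumptions a full copy takes more than nine days to send, while the deduplicated daily change fits inside a normal backup window.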
Prasanna Malaiyandi:Um, the other one I was going to comment on was, uh, oh, I know we've
Prasanna Malaiyandi:been talking about disk, but I think it's also important to acknowledge
Prasanna Malaiyandi:that now it's no longer spinning disk.
Prasanna Malaiyandi:It could also be flash.
Prasanna Malaiyandi:Right.
Prasanna Malaiyandi:We've seen
W. Curtis Preston:yeah,
W. Curtis Preston:but that's a whole other thing
Prasanna Malaiyandi:I I, I, know, but I'm just saying that when it
Prasanna Malaiyandi:comes to deduplication and backup storage, or protection storage, right?
Prasanna Malaiyandi:This, it could be flash, it could be disk, it could be object storage, right?
Prasanna Malaiyandi:So I think it's important to differentiate that, like what we're
Prasanna Malaiyandi:talking about with deduplication, when we mentioned disk, right?
Prasanna Malaiyandi:The media layer itself.
Prasanna Malaiyandi:Yeah, the media layer.
Prasanna Malaiyandi:Yes.
Prasanna Malaiyandi:The media layer is not tape.
W. Curtis Preston:Right, right.
W. Curtis Preston:Hang on one second.
W. Curtis Preston:Um, I need to, didn't realize I had a, I had a, um, Meeting
Prasanna Malaiyandi:Meaning a.
W. Curtis Preston:Yeah.
W. Curtis Preston:Four.
W. Curtis Preston:Well, four 15, which is an odd, um, all right.
W. Curtis Preston:It's a, it's a pre-meeting with a podcast thing.
W. Curtis Preston:It's, um, anyway, um, so, uh, yeah, so, okay, you know, I hate the idea of flash
Prasanna Malaiyandi:know, I know, I know.
Prasanna Malaiyandi:I'm, I, I'm just saying that people will bring it up.
Prasanna Malaiyandi:So I just wanna clarify that when we talk about disk, we're
Prasanna Malaiyandi:just talking about not tape.
W. Curtis Preston:The only place.
W. Curtis Preston:Yeah.
W. Curtis Preston:Correct.
W. Curtis Preston:The only place where I think maybe Flash has a place in the backup
W. Curtis Preston:system is, and you know, you know, the folks over at Pierre and
W. Curtis Preston:Neil, they're all mad at me now.
W. Curtis Preston:Right.
W. Curtis Preston:But, uh, the only place that I, where I think Flash has a place in the backup
W. Curtis Preston:system is with like live recovery.
W. Curtis Preston:If you're gonna do, if you're gonna do instant recovery and you're actually gonna
W. Curtis Preston:run VMs off of your backups, that better be some really nice performing disk.
W. Curtis Preston:But the thing is, it doesn't need to be your whole system.
W. Curtis Preston:It just needs to be like the most
Prasanna Malaiyandi:A part, part of, and it needs to, you don't need
Prasanna Malaiyandi:your entire system to be flash,
W. Curtis Preston:Yeah.
Prasanna Malaiyandi:You just
Prasanna Malaiyandi:need enough to be able to support that use case.
W. Curtis Preston:I, I just think that where flash does really,
W. Curtis Preston:really well is in random access, right?
W. Curtis Preston:Backup isn't a random access application.
W. Curtis Preston:Backup is a streaming application.
W. Curtis Preston:Even if what we're talking about is large dedupe chunks.
W. Curtis Preston:I don't
W. Curtis Preston:know.
W. Curtis Preston:I, I,
Prasanna Malaiyandi:I,
Prasanna Malaiyandi:I,
W. Curtis Preston:say, let's just say the jury is out for me.
W. Curtis Preston:I, I am in Missouri.
W. Curtis Preston:Missouri.
W. Curtis Preston:Is that, is that the show me state?
W. Curtis Preston:That's the show me state.
W. Curtis Preston:Right?
Prasanna Malaiyandi:yeah.
W. Curtis Preston:So I'll tell you what, I'll tell you what.
W. Curtis Preston:If there's anybody that's listening to this that just got pissed off,
Prasanna Malaiyandi:what's his
Prasanna Malaiyandi:name?
Prasanna Malaiyandi:I'll come back
W. Curtis Preston:to, I welcome you to, come on and tell me why I'm wrong.
W. Curtis Preston:I, I just,
Prasanna Malaiyandi:I, I, I, I know who will come back on, you
W. Curtis Preston:who, who,
W. Curtis Preston:will come back on,
Prasanna Malaiyandi:what's his name?
Prasanna Malaiyandi:VAST Data guy.
W. Curtis Preston:uh oh.
W. Curtis Preston:Oh, are they flash
Prasanna Malaiyandi:Yeah,
W. Curtis Preston:mark?
W. Curtis Preston:Um, No, sorry, Howard.
W. Curtis Preston:Uh, Howard.
W. Curtis Preston:Yeah.
Prasanna Malaiyandi:VAST is
Prasanna Malaiyandi:pure flash.
Prasanna Malaiyandi:Yeah.
W. Curtis Preston:Yeah.
W. Curtis Preston:Um, all right.
W. Curtis Preston:All right.
W. Curtis Preston:Well, yeah, Howard, uh, you wanna tell, you wanna tell me why I'm wrong?
W. Curtis Preston:Um, I'm more than happy to have you back.
W. Curtis Preston:We can duke it out.
W. Curtis Preston:We can duke it.
W. Curtis Preston:wouldn't be the first time.
W. Curtis Preston:Howard and I have, have disagreed on something.
W. Curtis Preston:I don't know.
W. Curtis Preston:It's just, it's just there are so many area, there are so many
W. Curtis Preston:other places where I would wanna spend money in the backup system.
Prasanna Malaiyandi:Yep.
W. Curtis Preston:Um, and, um,
Prasanna Malaiyandi:comes down to what the cost is.
Prasanna Malaiyandi:Right.
Prasanna Malaiyandi:If you could get flash down to a low enough point,
W. Curtis Preston:which is the point of VAST Data, right?
W. Curtis Preston:Their architecture allows using flash in a, um, you know, a significant way,
Prasanna Malaiyandi:That's, that's why I brought
W. Curtis Preston:uh, close to cost.
W. Curtis Preston:Okay.
W. Curtis Preston:All right.
W. Curtis Preston:Okay.
W. Curtis Preston:All right.
W. Curtis Preston:All right.
W. Curtis Preston:All right.
W. Curtis Preston:Um, and then I got this whole other thing.
W. Curtis Preston:I'm not gonna go into that other thing.
W. Curtis Preston:Um, but yeah, so dedupe makes disk and, and cloud-based products both physically
W. Curtis Preston:feasible as well as economically feasible.
W. Curtis Preston:Right.
W. Curtis Preston:Um,
Prasanna Malaiyandi:is.
W. Curtis Preston:hmm.
Prasanna Malaiyandi:Is there something that a person shopping for
Prasanna Malaiyandi:a dedupe system should be asking?
Prasanna Malaiyandi:Like what are the important things that they should be
Prasanna Malaiyandi:asking in order to determine
W. Curtis Preston:yeah, that's a, that's a great question.
W. Curtis Preston:I think the, the question would be things about what's the restore performance?
W. Curtis Preston:Because in the end, that's the only thing that matters.
W. Curtis Preston:I remember.
W. Curtis Preston:A product.
W. Curtis Preston:Now, this product is still on the market, but I believe, I believe
W. Curtis Preston:they have addressed this, this issue.
W. Curtis Preston:I remember a dedupe product.
W. Curtis Preston:It was a target dedupe product that had, uh, I remember that had 400 megabytes
W. Curtis Preston:a second throughput into an appliance.
Prasanna Malaiyandi:And like 10 megabits out
W. Curtis Preston:It was 40, it was 40, it was 40, uh, megabytes out.
W. Curtis Preston:It had a 90%, what we call dedupe tax.
W. Curtis Preston:Right.
W. Curtis Preston:That the, because the problem with dedupe, depending on how
W. Curtis Preston:you store it, is that you've got everything you need all over the
Prasanna Malaiyandi:All over the place.
W. Curtis Preston:Yeah.
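The "dedupe tax" figure follows directly from the two throughput numbers Curtis quotes. A small sketch of the arithmetic, where the function name is an assumption and "tax" is taken to mean the fraction of ingest throughput that is lost on restore:

```python
def dedupe_tax(ingest_mb_per_s, restore_mb_per_s):
    """Fraction of ingest throughput that is lost when restoring."""
    return 1 - restore_mb_per_s / ingest_mb_per_s


tax = dedupe_tax(400, 40)  # 400 MB/s in, 40 MB/s out, as quoted in the episode
print(f"{tax:.0%}")        # 90%
```

Asking a vendor for both numbers, rather than only the ingest rate, is what makes this comparison possible.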
W. Curtis Preston:And so this was just a really, really, really bad design.
W. Curtis Preston:And um, uh, I believe that they addressed it and, um, because that
W. Curtis Preston:product is still on the market today.
W. Curtis Preston:But version one of that product was terrible.
W. Curtis Preston:Um, so yeah, it's about restore performance, right?
W. Curtis Preston:So one thing, oh, I'm.
W. Curtis Preston:Uh, dedupe ratio is crap.
W. Curtis Preston:Don't look at dedupe ratio.
W. Curtis Preston:dedupe ratio is a made up number.
W. Curtis Preston:Um, I will, um, I'll, I'll go back to, I'll pick on Avamar.
W. Curtis Preston:Avamar.
W. Curtis Preston:Back in the day, they used to say they had a 400 to one dedupe ratio.
W. Curtis Preston:Do you remember this?
W. Curtis Preston:Because
W. Curtis Preston:they basically considered every backup as a full backup.
W. Curtis Preston:They're like, the way we store backups, which is the same way Druva stores
W. Curtis Preston:backups, the way we store backups.
W. Curtis Preston:It's like, even though they're incremental, it's like they're a full,
W. Curtis Preston:right?
W. Curtis Preston:Because they behave like a full during a restore.
W. Curtis Preston:And so they considered every backup a full.
W. Curtis Preston:And so they said, well then therefore the dedup ratio is 400 to one.
W. Curtis Preston:Well, that was always complete nonsense.
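A sketch of how the same stored data can yield very different dedupe ratios depending on what is counted as "logical" data. All figures here (a 100 TB full, 30 daily backups, 1 TB of new unique data per day) are hypothetical and are not meant to reproduce the 400 to one claim.

```python
def dedupe_ratio(logical_tb, stored_tb):
    """Ratio of data 'protected' to data actually written to disk."""
    return logical_tb / stored_tb


# Hypothetical environment: one 100 TB full, then 29 daily incrementals
# that each add 1 TB of new unique data.
full_tb = 100
days = 30
new_unique_tb_per_day = 1

stored_tb = full_tb + (days - 1) * new_unique_tb_per_day           # 129 TB on disk

# Counting every backup as a full because it restores like one.
counted_as_fulls_tb = days * full_tb                               # 3000 TB
# Counting only what the client actually had to send.
counted_as_sent_tb = full_tb + (days - 1) * new_unique_tb_per_day  # 129 TB

inflated = dedupe_ratio(counted_as_fulls_tb, stored_tb)
conservative = dedupe_ratio(counted_as_sent_tb, stored_tb)
print(f"{inflated:.1f}:1 vs {conservative:.1f}:1")  # 23.3:1 vs 1.0:1
```

The disk usage is identical in both cases; only the numerator changes, which is why the ratio on its own says little.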
W. Curtis Preston:Um, the other would be, I remember, uh, again, I'm gonna pick on people equally.
W. Curtis Preston:I remember sales reps of a certain large target
W. Curtis Preston:dedupe company where you might've worked, where they would tell customers to go
W. Curtis Preston:and do full backups more frequently because it made their dedupe ratio better,
W. Curtis Preston:which is just, again, nonsense.
W. Curtis Preston:What matters, in my opinion, what matters is how big is a full backup versus
W. Curtis Preston:how big are all the backups, right?
W. Curtis Preston:So if I have.
W. Curtis Preston:If I, let me, let me explain what I'm saying.
W. Curtis Preston:If I have a hundred terabytes, if, if one full backup of my environment is a hundred
W. Curtis Preston:terabytes and then after three months how big is, or whatever number you want.
W. Curtis Preston:Uh, but it's just three months seems like a, a nice, long, um, what do you call it?
W. Curtis Preston:Uh, POC thing,
W. Curtis Preston:right?
W. Curtis Preston:Um, after a hundred, after, you know, three months, how.
W. Curtis Preston:How much stuff is stored over there?
W. Curtis Preston:That's what I'm saying.
W. Curtis Preston:Don't dedupe ratios is nonsense that that didn't come out in English.
W. Curtis Preston:dedupe ratios are nonsense, but if I can fit a hundred terabytes right, if I have
W. Curtis Preston:a hundred terabyte environment and then a series of incremental backups, and then
W. Curtis Preston:over there, my question is how big is.
W. Curtis Preston:How much data did I write to disk?
W. Curtis Preston:And let's say it's, it's, it's 200 terabytes after 90 days.
W. Curtis Preston:And then compare that with another product that writes a hundred terabytes.
W. Curtis Preston:You backed up the same data, but you used half as much storage on the back end.
W. Curtis Preston:That's what I'm saying.
W. Curtis Preston:The the, the problem is, and the, the other reason, and again,
W. Curtis Preston:I'm a little extra sensitive to this cuz I work for Druva.
W. Curtis Preston:People ask us what's our, what's our dedupe ratio?
W. Curtis Preston:We're like, well the thing is we're like the opposite of Avamar.
W. Curtis Preston:Well, we're actually similar to Avamar in that we're source-side dedupe,
W. Curtis Preston:but we don't use that funny math.
W. Curtis Preston:So we could say 400 to one, but that's nonsense.
W. Curtis Preston:So you know, we say, well, we.
W. Curtis Preston:Because, because we also do incremental forever backups.
W. Curtis Preston:That's, that's the problem.
W. Curtis Preston:Right.
W. Curtis Preston:So, um, but I know that on average, if we have a hundred terabyte
W. Curtis Preston:customer, we store, you know, roughly a year's worth of backups in less
W. Curtis Preston:than a hundred terabytes of disk.
Prasanna Malaiyandi:Yeah.
Prasanna Malaiyandi:And I think it's important there to also account for that increment, like
Prasanna Malaiyandi:how I look at these like numbers.
Prasanna Malaiyandi:I totally get what you said, Curtis, like you should just do an apples to apples.
Prasanna Malaiyandi:But if you don't have that ability, you should also look to say, okay,
Prasanna Malaiyandi:I have a hundred terabyte full.
Prasanna Malaiyandi:And then say, my daily change rate is 2%.
Prasanna Malaiyandi:right?
Prasanna Malaiyandi:So if I do 2% for a month, right?
Prasanna Malaiyandi:That's, what is that, 60?
Prasanna Malaiyandi:60 more terabytes, right?
Prasanna Malaiyandi:So it should be 160 terabytes worth of data that I sent over, right?
Prasanna Malaiyandi:For 160 terabytes worth of data, how much should I actually store?
Prasanna Malaiyandi:Right?
Prasanna Malaiyandi:Which will give you similar things to what you're saying, right?
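Prasanna's estimate can be written out as a short calculation. The function names are assumptions. The 100 TB full, 2% daily change rate, and 30 days come from his example; the 90-day comparison reuses Curtis's earlier figures of 200 TB versus 100 TB stored and assumes, for illustration only, that the same 2% change rate applies.

```python
def logical_tb_sent(full_tb, daily_change_rate, days):
    """One full backup plus `days` worth of daily changes."""
    return full_tb + full_tb * daily_change_rate * days


def reduction(sent_tb, stored_tb):
    """How many TB were sent for each TB that ended up on disk."""
    return sent_tb / stored_tb


month = logical_tb_sent(100, 0.02, 30)
print(f"{month:.0f} TB")  # 160 TB

# Assumed: the same change rate over 90 days, compared against two
# products that stored 200 TB and 100 TB for the same data.
quarter = logical_tb_sent(100, 0.02, 90)
product_a = reduction(quarter, 200)
product_b = reduction(quarter, 100)
print(f"{product_a:.1f}x vs {product_b:.1f}x")  # 1.4x vs 2.8x
```

Either way the useful number is the stored terabytes for a known amount of protected data, not a vendor-quoted ratio.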
Prasanna Malaiyandi:But Bec, because what you're saying is if you had the two products, then
Prasanna Malaiyandi:you could do a direct comparison.
Prasanna Malaiyandi:But I'm saying if you don't have the two products, then
Prasanna Malaiyandi:here's another way you could
W. Curtis Preston:Well, I, well, I would argue that there's no way
W. Curtis Preston:to compare them if you don't have two pro, if you, if you're not, if
W. Curtis Preston:you're not doing a true comparison.
W. Curtis Preston:Right.
Prasanna Malaiyandi:A
W. Curtis Preston:it's just, it's just that dedupe math is funny, right?
W. Curtis Preston:So different products charge differently, right?
W. Curtis Preston:You look at, um, like when you look at Metallic, which competes
W. Curtis Preston:with Druva, they have a frontend price and we have a backend price.
W. Curtis Preston:They have, they actually have the front end price, and then you also
W. Curtis Preston:need to pay for the backend storage.
W. Curtis Preston:Right?
W. Curtis Preston:So you're paying, so how do you, how do you compare that?
W. Curtis Preston:Um, it's, it's just, it's difficult
Prasanna Malaiyandi:hard.
Prasanna Malaiyandi:Yeah.
W. Curtis Preston:it's hard.
W. Curtis Preston:Uh, but all I'm saying is dedup ratio is crap and doesn't mean anything.
W. Curtis Preston:Um, but what does matter is how much data are you storing on that
W. Curtis Preston:backend because you will be paying for that one way or the other.
W. Curtis Preston:All right.
W. Curtis Preston:I don't know if we made this, if we, if this is clear as mud or what, but, uh, I
W. Curtis Preston:hope that was helpful and, uh, maybe we, maybe we ticked off Howard and Howard's
W. Curtis Preston:gonna come on next week's episode.
W. Curtis Preston:I dunno.
Prasanna Malaiyandi:Come join us,
Prasanna Malaiyandi:Howard,
W. Curtis Preston:Thanks for, thanks for, uh, thanks for helping
W. Curtis Preston:me with my network as well, so,
Prasanna Malaiyandi:anytime, Curtis.
Prasanna Malaiyandi:Just remember I am not tech support.
W. Curtis Preston:Yeah.
W. Curtis Preston:Yeah.
W. Curtis Preston:All right.
W. Curtis Preston:Well, uh, and thanks to the listeners and remember to subscribe