Dec. 4, 2023

Snap, Replicate, & Protect: Leveraging Near CDP

Tired of backup windows and 24-hour recovery point objectives? Then it's time to learn about how snapshots and replication work together to create near-continuous data protection, or near-CDP.

In this episode, backup experts W. Curtis Preston and Prasanna Malaiyandi dive into leveraging snapshots for instant point-in-time recovery and replication for an offsite copy. By combining these technologies, you can achieve recovery point objectives measured in minutes rather than hours or days.

Listen in to understand what near CDP is, how it differs from backup and true CDP, and the key capabilities it enables. Discover when to take crash-consistent vs application-consistent snapshots. Learn how near CDP integrates with backup software and how you can use replicated snapshots for automated recovery testing.

If you need tighter RPOs and near-instant RTOs for your mission-critical systems, you can’t afford to miss this explanation of how snap and replicate delivers a high-frequency, budget-friendly data protection option. Tune in to become a hero by enabling your organization to recover quickly from data corruption, ransomware, and other threats!

Transcript

Speaker:

Listeners of this podcast have heard why snapshots aren't backup.

 

Speaker:

You've also learned why replication isn't backup either.

 

Speaker:

But what if I make a snapshot on one array and replicate it to another array?

 

Speaker:

Is that a backup?

 

Speaker:

A lot of people would say yes.

 

Speaker:

We'll also need things like reporting and cataloging, of course, but I would

 

Speaker:

argue that snap and replicate, also known as near CDP, is one of the most

 

Speaker:

efficient ways we have of protecting data.

 

Speaker:

Hi, I'm W. Curtis Preston, AKA Mr.

 

Speaker:

Backup, and each episode of this podcast dives deep into a backup-related topic.

 

Speaker:

We turn unappreciated backup admins into cyber recovery heroes.

 

Speaker:

This is The Backup Wrap-Up.

 

Speaker:

Welcome to the show.

 

Speaker:

Thanks for listening today.

 

Speaker:

I am your host, W. Curtis Preston, and I have with me the guy that

 

Speaker:

made me wait for him today.

 

Speaker:

Prasanna Malaiyandi

 

Speaker:

I am good, Curtis.

 

Speaker:

I am so sorry for making you wait.

 

Speaker:

And, and why did I wait again?

 

Speaker:

So you could finish the last few minutes of a series that you have already

 

Speaker:

Well, it's, and it's not even the entire series, it is just

 

Speaker:

the episode that I was on.

 

Speaker:

Now, granted, I'm near the end of the show, so it is sort of getting

 

Speaker:

to the cliffhanger phases, but I was enjoying my lunch while watching

 

Speaker:

the last bits of a show, and

 

Speaker:

Yeah.

 

Speaker:

then Curtis, and I was like, Curtis, I need five minutes.

 

Speaker:

And then I was like, no, I need seven minutes because

 

Speaker:

there's still six minutes left.

 

Speaker:

Well.

 

Speaker:

It is time for the news of the week.

 

Speaker:

The first news of the week falls into one of my favorite news categories.

 

Speaker:

Do you know what category that is?

 

Speaker:

Things that you should be backing up that people don't

 

Speaker:

realize until the data's gone.

 

Speaker:

Yeah, I was just gonna say, I told you so.

 

Speaker:

So what do you wanna do?

 

Speaker:

You wanna jump right in on our first news

 

Speaker:

yeah.

 

Speaker:

So this is actually something that I ran into on Reddit, which

 

Speaker:

I for some reason get sysadmin

 

Speaker:

subreddit, uh, articles in my feed, and it was like, Hey, did anyone notice that

 

Speaker:

they have data missing from Google Drive?

 

Speaker:

And I was like, oh man, what is this?

 

Speaker:

And then it slowly started picking up.

 

Speaker:

I think The Register carried it.

 

Speaker:

BleepingComputer and other folks as well, but.

 

Speaker:

Basically what happened is people realized all of a sudden that some

 

Speaker:

of their data was gone and that they didn't have any of their files and

 

Speaker:

other changes since May of this year.

 

Speaker:

Since May of this year, and this is being recorded in the end of November,

 

Speaker:

so that is a long amount of time

 

Speaker:

and, and Google has officially responded.

 

Speaker:

And what they're saying as I was looking at these, these instructions and,

 

Speaker:

and it basically said like, um, don't mess around with your drive right now.

 

Speaker:

Right.

 

Speaker:

So they were, they're saying don't disconnect.

 

Speaker:

Don't do any structural changes to your drive.

 

Speaker:

And to me what that says is, there was a suggestion that maybe

 

Speaker:

somebody did a rollback of some snapshot.

 

Speaker:

Is that

 

Speaker:

Yeah, and I think we should be clear, this isn't just general

 

Speaker:

Google Drive, so if you just access it through the web portal, right?

 

Speaker:

All of that stuff still works fine.

 

Speaker:

That has all your data.

 

Speaker:

This only specifically affected customers who were using Google

 

Speaker:

Drive for desktop to connect.

 

Speaker:

So that I, I gotta

 

Speaker:

Oh, is that no longer

 

Speaker:

I.

 

Speaker:

I'm seeing comments from the, that was what we thought a few days ago, but if

 

Speaker:

you follow some comments, and again, we're, we're coming at this from,

 

Speaker:

you know, from the outside and we're not personally being impacted, but

 

Speaker:

there are people who are saying that they've never used the desktop version

 

Speaker:

and they're experiencing the problem.

 

Speaker:

Those people may be wrong, it's just a couple of people.

 

Speaker:

Um, but they're saying that they're experiencing the problem, but.

 

Speaker:

So I, whoa.

 

Speaker:

Which is in, yeah, that's news because I was worried about

 

Speaker:

that, so I went and looked.

 

Speaker:

And Curtis, I know we do, we use OBS as a backup for us recording this podcast

 

Speaker:

and we upload that to Google Drive.

 

Speaker:

And I did go and check to make sure, because we just uploaded one a couple

 

Speaker:

of weeks ago, and that still exists.

 

Speaker:

So I don't know if whatever we had shared it with, it looks like

 

Speaker:

not everyone is affected, but.

 

Speaker:

right.

 

Speaker:

It looks like there are some random set of folks who somehow,

 

Speaker:

for some reason are affected by having some of their data gone.

 

Speaker:

Yeah, so it looks like, uh, I don't remember exactly where I read it,

 

Speaker:

but the idea is that someone rolled, basically rolled the entire drive or, or

 

Speaker:

a section of this entire drive back to, uh, essentially the end of April and.

 

Speaker:

So, and that's consistent with what they're saying of like, don't do,

 

Speaker:

don't do any work in this right now and don't do any structural changes.

 

Speaker:

'cause what I think they're going to try to do is to basically undo that

 

Speaker:

action, which would then put the, all of those same customers back to what

 

Speaker:

happened before all of this happened.

 

Speaker:

And if you're making any changes in there right now, those changes

 

Speaker:

will be undone by that change.

 

Speaker:

What's a little disconcerting is that there isn't, there's no, you know, we,

 

Speaker:

we talk a lot about how companies respond and Google isn't doing those things.

 

Speaker:

It's.

 

Speaker:

very, very little information out there about this.

 

Speaker:

Yeah, there's just this one page that it basically said, Hey, don't do anything.

 

Speaker:

There isn't, I haven't seen any updated stories I checked before

 

Speaker:

we recorded this, and I'm a little, a little concerned about that

 

Speaker:

I'm also surprised that.

 

Speaker:

They gave service engineering or whoever else the capability to even roll back

 

Speaker:

production to a snapshot that far back without checks and balances in place.

 

Speaker:

Now, I don't know what happened, right?

 

Speaker:

This is all just assumption, but it's a little scary that someone

 

Speaker:

had that sort of capability.

 

Speaker:

It's a lot scary.

 

Speaker:

Uh, imagine if you're a company that is using this, you know, you've

 

Speaker:

got all sorts of stuff stored in there and you just rolled it back.

 

Speaker:

Uh, yeah, I, I, I hope that there's an update on this.

 

Speaker:

I hope we know more than we know right now, but, uh, if you're, if you're

 

Speaker:

a Google user, it's time to just double check what's going on in your

 

Speaker:

Yeah.

 

Speaker:

And this is one of the other reasons why you should back up

 

Speaker:

your data, even if you use Google

 

Speaker:

Yes, yes.

 

Speaker:

This is, this is why I put it in the "I told you so" category, you know, the,

 

Speaker:

the Google, you know, the, the cloud is, is a magical place, but it's not magic.

 

Speaker:

So, um, let's take a look at this other story.

 

Speaker:

And it has to do with a ransomware attack on a hospital chain in

 

Speaker:

Nashville, Tennessee, and they've got.

 

Speaker:

30 hospitals and 200 care sites around the country.

 

Speaker:

Oklahoma, Texas, New Jersey, New Mexico, Idaho, and Kansas.

 

Speaker:

And they were forced to divert patients from a, a number of ERs and one of

 

Speaker:

the other things was that people weren't able to book appointments

 

Speaker:

at, um, you know, their usual doctor because the patient portal was down.

 

Speaker:

I just wanna say.

 

Speaker:

It wasn't that long ago.

 

Speaker:

Do you remember when the ransomware groups, they specifically

 

Speaker:

didn't target healthcare?

 

Speaker:

Um, because it tend, you know, people can

 

Speaker:

die, but that is

 

Speaker:

clearly gone.

 

Speaker:

And remember there was the case in Germany, I think, where a patient died

 

Speaker:

because an ER was closed and they had to reroute them to a different one.

 

Speaker:

And that's, I think when it came out where ransomware actors were like,

 

Speaker:

yeah, maybe we will avoid hospitals.

 

Speaker:

Yeah, but clearly not here.

 

Speaker:

Right.

 

Speaker:

They targeted this group and, the only update that I've seen is that.

 

Speaker:

They, they are starting to restore some services.

 

Speaker:

The, the company said that they did notify law enforcement.

 

Speaker:

It did say that they, um, that they contracted a cybersecurity firm.

 

Speaker:

These are all good things.

 

Speaker:

These are the things that we like to hear.

 

Speaker:

These are what people should be doing.

 

Speaker:

Um, but we don't yet know if they were able to restore services

 

Speaker:

or if they paid the ransom.

 

Speaker:

Uh, we don't, you know, we don't know yet.

 

Speaker:

And I think the other big thing is these are your medical records, right?

 

Speaker:

And another unintended consequence is some of these particular

 

Speaker:

facilities provide specialized care for certain types of, uh, ailments.

 

Speaker:

And if they're.

 

Speaker:

They're providing that specialized care and then they're down.

 

Speaker:

It's not like they can just divert that care to some other place.

 

Speaker:

So yeah, this is a real mess.

 

Speaker:

Um, you know, I just, the, the thing I think we can take away from this is what

 

Speaker:

it's again, what, what I've already said.

 

Speaker:

I like that they contacted the law enforcement.

 

Speaker:

I like that they contracted with a cybersecurity professional.

 

Speaker:

The key there is that you want to start having those conversations.

 

Speaker:

Now you want to identify a cybersecurity firm that you can

 

Speaker:

contract with, that you can work with.

 

Speaker:

One of the ways to do this is to con, is to talk to a cybersecurity.

 

Speaker:

Uh, like if you, if you have cybersecurity insurance, to talk to them, uh, and see

 

Speaker:

if they can put you in touch with somebody now so that you can prepare, uh, you know.

 

Speaker:

Rather than, um, going to Google and saying "cybersecurity firm" in

 

Speaker:

Yeah.

 

Speaker:

the middle of your, uh, ransomware

 

Speaker:

attack.

 

Speaker:

So that's, uh, that's hopefully what we, yeah.

 

Speaker:

A little bit too late.

 

Speaker:

All right, well, that is the news of the week,

 

Speaker:

All right.

 

Speaker:

This week on The Backup Wrap-Up, we have another backup to basics

 

Speaker:

topic, and I wanted to talk this week about near CDP, which is

 

Speaker:

near continuous data protection.

 

Speaker:

And I, I think in order to do that we have to sort of back up a little bit and

 

Speaker:

talk about the things that we've talked about that have led up to this point.

 

Speaker:

Uh, these are modern backup and recovery methods.

 

Speaker:

Basically things that have been birthed in the last 20 years,

 

Speaker:

basically in the 21st century.

 

Speaker:

Before we talk about near CDP, I think we need to talk about the

 

Speaker:

things that have led up to this point.

 

Speaker:

And, uh, we're gonna talk about replication, snapshots, and what

 

Speaker:

we call continuous data protection.

 

Speaker:

So let's talk about replication first.

 

Speaker:

Do you wanna take that on?

 

Speaker:

Yeah, so replication is basically you're taking data in one system and

 

Speaker:

replicating it to the other system.

 

Speaker:

So the second system is an exact copy of the first system.

 

Speaker:

And in the case of synchronous, there's no data loss, right?

 

Speaker:

So your RPO is zero.

 

Speaker:

And yeah, so it is in sync.

 

Speaker:

It's basically a mirror.

 

Speaker:

And that also means you don't have multiple versions on that secondary side.

 

Speaker:

So if you have a logical corruption or you have a user error on the primary,

 

Speaker:

it's just gonna replicate it blindly to the other side, and that's what you get.
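
To make the "replicates it blindly" point concrete, here is a minimal Python sketch of a mirror. The class and file names are hypothetical and no real replication product is modeled; the only point is that the second copy has no older version to fall back on.

```python
class MirroredVolume:
    """Toy synchronous mirror: every change is applied to both copies."""

    def __init__(self):
        self.primary = {}
        self.replica = {}

    def write(self, path, data):
        self.primary[path] = data
        self.replica[path] = data  # acknowledged only once both copies have it

    def delete(self, path):
        self.primary.pop(path, None)
        self.replica.pop(path, None)  # the mistake is mirrored too


vol = MirroredVolume()
vol.write("orders.db", "good data")
vol.write("orders.db", "corrupted by a bad script")

# The replica is an exact copy, so it holds the corruption and nothing older.
print(vol.replica)
```

With RPO of zero, the replica is always current, which is exactly why it cannot undo a bad write.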

 

Speaker:

Exactly.

 

Speaker:

It makes your stupidity just more effective is what the way I, the way I

 

Speaker:

like to say it, and it doesn't matter whether you're, I mean, I suppose

 

Speaker:

maybe if you had an asynchronous replication, you could, may, maybe

 

Speaker:

there's a big enough buffer that maybe you could stop a disaster if

 

Speaker:

you did something really stupid.

 

Speaker:

But you'd really have to be on the ball, I would think,

 

Speaker:

uh, to, to do that in general.

 

Speaker:

The, the, the replication.

 

Speaker:

Replication will be great for DR when your site blows up, but

 

Speaker:

it will be really worthless if you're the one that blew it up,

 

Speaker:

Yep.

 

Speaker:

If, if you had dropped a table or, or you got a ransomware attack,

 

Speaker:

which I think we can all agree is.

 

Speaker:

Uh, a big deal, right, right now,

 

Speaker:

Oh yeah.

 

Speaker:

Uh, and by the way, each of these that we're summarizing have their

 

Speaker:

own episodes back before this episode.

 

Speaker:

So, so if you don't, if you're not familiar with these topics, these,

 

Speaker:

this is just a review of them.

 

Speaker:

They, they each have their own episodes.

 

Speaker:

Yeah.

 

Speaker:

So I think next in the list of topics, I think we talked about CDP next.

 

Speaker:

So do you wanna talk about continuous.

 

Speaker:

Yeah.

 

Speaker:

So basically continuous data protection is replication with a back button, right?

 

Speaker:

It, it, it, it works very similar to replication except that the way it

 

Speaker:

stores the data on the other end, it does it in such a way that you can bring.

 

Speaker:

The, you know, the destination back from the bad thing, right?

 

Speaker:

The great thing about replication is that it's incremental, right?

 

Speaker:

That it's block level and it, and, and it, you know, it can keep up with, you

 

Speaker:

know, relatively speaking, real time of what's going on in your production site.

 

Speaker:

The bad thing about replication is the exact same thing, right?

 

Speaker:

So, so CDP gives you the ability to go back in time.

 

Speaker:

If you did something stupid, like drop a table, get a ransomware attack, have

 

Speaker:

some sort of logical corruption, it gives you the ability to go back in time.
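
The "replication with a back button" idea can be sketched as a journal of writes that is replayed up to a chosen moment. This is a simplified illustration, not how any particular CDP product stores data; `CdpJournal`, the integer timestamps, and the block contents are all made up for the example.

```python
class CdpJournal:
    """Toy CDP target: keeps every write so any point in time can be rebuilt."""

    def __init__(self):
        self.entries = []  # (timestamp, block_id, data), appended in time order

    def record(self, timestamp, block_id, data):
        self.entries.append((timestamp, block_id, data))

    def image_at(self, timestamp):
        """Replay the journal up to and including `timestamp`."""
        image = {}
        for ts, block_id, data in self.entries:
            if ts > timestamp:
                break
            image[block_id] = data
        return image


journal = CdpJournal()
journal.record(100, 0, "invoice table")
journal.record(105, 1, "customer table")
journal.record(110, 0, "ENCRYPTED BY RANSOMWARE")

current = journal.image_at(110)  # what plain replication would give you
rewound = journal.image_at(109)  # one tick before the bad write
```

Note that the journal holds every single write, so it grows with the change rate of the production system.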

 

Speaker:

It has a couple of different ways that it does that.

 

Speaker:

Um.

 

Speaker:

Amazing.

 

Speaker:

Curtis, why isn't everyone using

 

Speaker:

Yeah, it wa if, if we were having this conversation, say 20 years

 

Speaker:

ago, everybody was gonna do CDP.

 

Speaker:

Uh, the only problem is it's, it is expensive AF, right?

 

Speaker:

Uh, not just the cost of the software itself, it's also the cost of all of the

 

Speaker:

IO and all of the storage required to store essentially every single change.

 

Speaker:

From, you know, during the entire recovery continuum that you are, uh,

 

Speaker:

trying to be able to support and, uh, the, and, and so there are very few

 

Speaker:

actual, I think, true CDP products.

 

Speaker:

There are some that are very specific, like Zerto, I think,

 

Speaker:

uh, would be a CDP product.

 

Speaker:

The, there are some, uh, the EMC RecoverPoint.

 

Speaker:

I know that a couple of the other products that I tracked have now been

 

Speaker:

acquired by other companies and they're just a product in their portfolio.

 

Speaker:

The, um, but the, basically the problem is it's just too dang expensive, especially

 

Speaker:

if we're gonna use it for everything.

 

Speaker:

Right.

 

Speaker:

Um, and then we have snapshots.

 

Speaker:

Now you used to work at a company that.

 

Speaker:

Did a snapshot or two.

 

Speaker:

And by snapshots we mean storage snapshots, not the ones up in

 

Speaker:

AWS, which are entirely different.

 

Speaker:

Yeah.

 

Speaker:

So storage snapshots let you take a virtual copy of a particular volume,

 

Speaker:

file system, et cetera, and keep it there so you can quickly go back to it.

 

Speaker:

If you need to restore really rapidly, it's all stored locally, which is great.

 

Speaker:

And some companies, actually, I would probably say most companies these

 

Speaker:

days allow users to browse snapshots.

 

Speaker:

So they don't need to call up the IT help desk and be like, Hey,

 

Speaker:

can you restore this file for me that I accidentally deleted?

 

Speaker:

It's already there in the system.

 

Speaker:

They can manually browse it, they can pull the data out themselves,

 

Speaker:

self service, it's awesome.

 

Speaker:

Saves the backup team a bunch of time having to do restores.

 

Speaker:

The downside of snapshots though, is it's on the local system.

 

Speaker:

And when we talk about backups and the purpose of backups, you wanna

 

Speaker:

make sure you have a copy that's independent from that primary copy.

 

Speaker:

When you have a snapshot, if something happens to that system, if someone deletes

 

Speaker:

that volume, then that snapshot is gone and you've lost your quote unquote backup.

 

Speaker:

So a snapshot is not a backup.
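
A small Python sketch of why a local snapshot is handy for restores but is not an independent copy. The `Array` class is hypothetical, and a plain dict copy stands in for the block sharing a real array would do.

```python
class Array:
    """Toy storage array: a volume and its snapshots live on the same system."""

    def __init__(self):
        self.volume = {}
        self.snapshots = {}

    def snapshot(self, name):
        # Real arrays share unchanged blocks; a dict copy stands in for that here.
        self.snapshots[name] = dict(self.volume)

    def restore_file(self, name, path):
        return self.snapshots[name][path]

    def destroy_volume(self):
        # Losing the volume (or the whole array) takes its snapshots with it.
        self.volume = {}
        self.snapshots = {}


array = Array()
array.volume["report.doc"] = "v1"
array.snapshot("hourly.0")

del array.volume["report.doc"]                             # user deletes a file
recovered = array.restore_file("hourly.0", "report.doc")   # self-service restore

array.destroy_volume()                                     # now nothing is left
```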

 

Speaker:

And I will caveat that with what Curtis said earlier.

 

Speaker:

Snapshots have changed their names based on what the vendor decides to implement.

 

Speaker:

So an EBS snapshot isn't really the same as what I've just been talking about.

 

Speaker:

It is completely different.

 

Speaker:

They actually make a copy into AWS S3 that is independent from the production,

 

Speaker:

and therefore it doesn't follow what we've been calling snapshots,

 

Speaker:

even though AWS calls it a snapshot.

 

Speaker:

Yeah, exactly, and, and I think they're not the only cloud vendor to do that.

 

Speaker:

I also know, for example, our previous employer calls their backups.

 

Speaker:

They call them snapshots, which I didn't like it when I worked there and

 

Speaker:

I don't like, I still don't like it.

 

Speaker:

But yeah, so when we're talking about snapshots here, we're talking about

 

Speaker:

storage snapshots, like what you would see in a NetApp or, uh, and there

 

Speaker:

are different kinds of snapshots.

 

Speaker:

There's copy-on-write.

 

Speaker:

There's redirect-on-write.

 

Speaker:

And again, there is a whole separate.

 

Speaker:

Episode just on that topic.

 

Speaker:

So my memory is that I coined the term near CDP back in the day.

 

Speaker:

They just, they just called it snapshots and replication.

 

Speaker:

And as you may recall, CDP was all the rage.

 

Speaker:

And I remember thinking that.

 

Speaker:

CDP was very expensive.

 

Speaker:

And because of that, very few people are going to use it.

 

Speaker:

They might use it for their severely, like tier one applications, but they're not

 

Speaker:

gonna use it for regular everyday data.

 

Speaker:

And what was more common back in that time was that most people would use.

 

Speaker:

NetApps for that type of data.

 

Speaker:

I mean, at that time, NetApp was kind of, you know, ruling the roost

 

Speaker:

of the, of the NAS world, right?

 

Speaker:

Network attached storage, and they were very big on snapshots and

 

Speaker:

then replicated snapshots, and they

 

Speaker:

and replicate.

 

Speaker:

I.

 

Speaker:

Um, you know, you could do multiple tiers of that.

 

Speaker:

They were happy and you could, you could

 

Speaker:

Yep.

 

Speaker:

Replicate the data all around the world.

 

Speaker:

exactly.

 

Speaker:

Exactly.

 

Speaker:

And I liked the term near continuous.

 

Speaker:

And I remember, um, one of the folks that I interfaced with was, um, uh,

 

Speaker:

Storagezilla, which is, uh, Mark Twomey.

 

Speaker:

Uh, he lives over there in Cork.

 

Speaker:

And, uh, that was my, that was my attempt to do a Cork accent for anyone who listens there.

 

Speaker:

And I remember he just, he just really hated my term.

 

Speaker:

Like, he's like, continuous is a binary term, you know, like, like immutable.

 

Speaker:

It's a binary term.

 

Speaker:

It's either continuous or it's not.

 

Speaker:

You can't be near continuous.

 

Speaker:

Like, like it's like saying you're near pregnant, right?

 

Speaker:

Pregnant is a binary term.

 

Speaker:

And I'm like, yes, but we do use the word like nearly dead.

 

Speaker:

Right there.

 

Speaker:

There are times when we do put the word near next to a binary term, and I

 

Speaker:

just felt that this was a world that was much closer to continuous than it

 

Speaker:

was to what we thought of as backup.

 

Speaker:

Backup at that time, and honestly, even to today.

 

Speaker:

I think, I don't know.

 

Speaker:

This is one of those, like, I don't know for a fact, but I'm

 

Speaker:

pretty darn sure that most people still just back up every night.

 

Speaker:

Yep.

 

Speaker:

Right.

 

Speaker:

It's like if your RPO is 24 hours or less, you are probably doing some

 

Speaker:

form of, I'm just gonna use quote unquote replication, which is all the

 

Speaker:

stuff we just talked about, right?

 

Speaker:

Which includes async, sync, CDP, near CDP, which by the way, I also don't like

 

Speaker:

the word near CDP, but that's just me.

 

Speaker:

Well, you just have to get over it 'cause you're on, you're

 

Speaker:

on the podcast now, buddy.

 

Speaker:

Yeah, but, and then everything beyond 24 hours is probably backup.

 

Speaker:

And I know as technologies change and everyone was like, Hey, database backups.

 

Speaker:

I wanna do it more frequently than every 24 hours.

 

Speaker:

Let me do log backups and all the rest of that stuff.

 

Speaker:

That's when things sort of backup, sort of started reducing the RPO

 

Speaker:

Right?

 

Speaker:

and started moving down into that near CDP space.

 

Speaker:

And, and again, if you're not familiar with the terms RTO and RPO, you

 

Speaker:

really should be: recovery time objective, recovery point objective.

 

Speaker:

It, it literally drives all backup design, right?

 

Speaker:

Recovery time objective is how, how long have, have, you know, us and

 

Speaker:

the, the, the, what do you call 'em?

 

Speaker:

The, um, sorry, the stakeholders.

 

Speaker:

What, what have we and the stakeholders agreed that it is an acceptable time

 

Speaker:

for the recovery to take, right?

 

Speaker:

We, we have to be able to bring the data back in four hours, right?

 

Speaker:

And then RPO is how much time, how much data we've agreed

 

Speaker:

that we are allowed to lose.

 

Speaker:

By a measurement of time, not, you know, we could lose 10 gigabytes of data.

 

Speaker:

It's, we could lose one hour or four hours or 24 hours worth of data.

 

Speaker:

That's what RPO is, and those two things drive backup design.
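
Because RPO is measured in time, a protection schedule can be checked against it with simple arithmetic. The sketch below assumes a simplified worst case (the failure happens just before the next copy lands offsite, so the loss is the snapshot interval plus any replication lag); the function names and the example numbers are illustrative only.

```python
from datetime import timedelta


def worst_case_data_loss(snapshot_interval, replication_lag=timedelta(0)):
    """Simplified worst case: one full interval plus the time to get it offsite."""
    return snapshot_interval + replication_lag


def meets_rpo(snapshot_interval, rpo, replication_lag=timedelta(0)):
    return worst_case_data_loss(snapshot_interval, replication_lag) <= rpo


rpo = timedelta(hours=1)
nightly_backup = timedelta(hours=24)
frequent_snapshots = timedelta(minutes=15)

print(meets_rpo(nightly_backup, rpo))
print(meets_rpo(frequent_snapshots, rpo, replication_lag=timedelta(minutes=5)))
```

The same comparison works for RTO by swapping in the measured restore time and the agreed recovery time.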

 

Speaker:

Yeah, and I would say that it's also useful beyond backup design.

 

Speaker:

I think anytime you're talking about.

 

Speaker:

Data protection, disaster recovery, backup, all of these things

 

Speaker:

always take into consideration.

 

Speaker:

RTO and RPO.

 

Speaker:

Yeah.

 

Speaker:

Uh, no one cares about backup window anymore.

 

Speaker:

It used to be that was that, that drove a lot of backup, uh, design.

 

Speaker:

But, uh, you know, luckily we, we've, I think we've tackled the backup

 

Speaker:

window problem, so you would probably call what we're about to talk about

 

Speaker:

snapshots and replication instead

 

Speaker:

Snap and replicate.

 

Speaker:

And actually when we went back to the replication issue or replication

 

Speaker:

episode, I would actually call async replication snap and replicate.

 

Speaker:

But that's because of how I entered the storage space and

 

Speaker:

the technology with NetApp.

 

Speaker:

So that's what I, when I think of async replication, I

 

Speaker:

think of snap and replicate.

 

Speaker:

Interesting.

 

Speaker:

Um, obviously it doesn't have to be snap and replicate.

 

Speaker:

Async replication could just have a buffer.

 

Speaker:

Right, right.

 

Speaker:

And a lag is just a snapshot.

 

Speaker:

Right.

 

Speaker:

So every six hours I do that.

 

Speaker:

That's my lag.

 

Speaker:

Gotcha, gotcha.

 

Speaker:

Yeah, I, I would say that.

 

Speaker:

Snap and replicate would be a way to do async replication for sure.

 

Speaker:

And the, I, I think the more common way people would probably

 

Speaker:

just use this term, uh, snap and replicate, and I'm fine with that.

 

Speaker:

Uh, this is one where I, where I, I'm not going to battle for the term.

 

Speaker:

I do like the term because I think it's a lot closer to continuous.

 

Speaker:

What's that?

 

Speaker:

Is it

 

Speaker:

not, it's not trademarked.

 

Speaker:

Feel free to use it.

 

Speaker:

There are some systems where you don't make the snapshot on the primary system.

 

Speaker:

You replicate the data, and then you make the snapshot over there.

 

Speaker:

My problem with that is that when you're making the snapshot, you often have to.

 

Speaker:

Interface with the thing that's writing the data to the snapshot, right?

 

Speaker:

So you want to put Oracle in backup mode.

 

Speaker:

Take a VSS snapshot, take a VMware snapshot, whatever it is, do the,

 

Speaker:

do the thing that you need to do to get the data to be consistent.

 

Speaker:

Then we take a snapshot, then we replicate that snapshot.
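
The order of operations described here (make the application consistent, snapshot on the primary, then replicate) can be sketched in Python as follows. Everything in it is a placeholder: the function names are hypothetical and the steps are only logged, since the real calls depend on the application and the storage vendor.

```python
from contextlib import contextmanager


@contextmanager
def quiesced(app, log):
    """Hold the application in a consistent state for the duration of the block."""
    log.append(f"quiesce {app}")     # e.g. put the database into backup mode
    try:
        yield
    finally:
        log.append(f"resume {app}")  # always release the application


def snap_and_replicate(app, log):
    with quiesced(app, log):
        log.append("take snapshot on primary")     # quick, so the pause is short
    log.append("replicate snapshot to secondary")  # runs after the app has resumed


steps = []
snap_and_replicate("database", steps)
print(steps)
```

Taking the snapshot inside the quiesced block is what makes the replicated copy application-consistent rather than whatever happened to be on disk when the data arrived.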

 

Speaker:

I don't like replicating

 

Speaker:

You don't, you don't, you know how people deal with that.

 

Speaker:

I'm laughing at it because I've actually worked with groups and

 

Speaker:

products that actually do that.

 

Speaker:

Right,

 

Speaker:

so, uh, one way you can solve what you're asking Curtis, is when you

 

Speaker:

take your snapshot, you are quiescing the application and you issue the

 

Speaker:

snapshot command to the target.

 

Speaker:

But, but the problem with that is that we need to make sure that the bits are

 

Speaker:

By quiescing.

 

Speaker:

Yep.

 

Speaker:

I find, I find that very.

 

Speaker:

I find that messy.

 

Speaker:

I don't like it.

 

Speaker:

I'm

 

Speaker:

I It's not clean.

 

Speaker:

Yeah, it's not

 

Speaker:

Yeah, it's not as clean.

 

Speaker:

I, I like clean.

 

Speaker:

So when we're talking about, you know, near CDP or snapshots and replication,

 

Speaker:

the really nice thing about it is that you can take essentially as many

 

Speaker:

snapshots as you'd like to take within the limits of your storage system.

 

Speaker:

I, I don't know what, do you know what ONTAP is up to these days?

 

Speaker:

I am guessing probably a thousand.

 

Speaker:

Yeah, that's a lot of snapshots,

 

Speaker:

Yeah.

 

Speaker:

You could take a snapshot every minute for the first hour.

 

Speaker:

You could take a snapshot every hour after that, et cetera, et cetera. You can

 

Speaker:

take the snapshots as much as you want.
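
One way to read "every minute for the first hour, every hour after that" is as a thinning policy: keep every recent snapshot, then one per hour for older ones. The Python sketch below works under that assumption; the function name and the dates are invented for the example and do not reflect any vendor's scheduler.

```python
from datetime import datetime, timedelta


def snapshots_to_keep(snapshot_times, now):
    """Keep every snapshot from the last hour, then one per clock hour before that."""
    keep = set()
    seen_hours = set()
    for taken in sorted(snapshot_times, reverse=True):  # newest first
        if now - taken <= timedelta(hours=1):
            keep.add(taken)
            continue
        hour = taken.replace(minute=0, second=0, microsecond=0)
        if hour not in seen_hours:  # the newest snapshot in each older hour wins
            seen_hours.add(hour)
            keep.add(taken)
    return sorted(keep)


now = datetime(2023, 11, 30, 12, 0)
taken = [now - timedelta(minutes=m) for m in range(0, 181)]  # one per minute, 3 hours
kept = snapshots_to_keep(taken, now)
print(len(taken), "taken,", len(kept), "kept")
```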

 

Speaker:

And then basically what you're doing is you're replicating the changes that are

 

Speaker:

contained within that snapshot, right?

 

Speaker:

Um, and

 

Speaker:

it's much more efficient because the storage array itself is

 

Speaker:

keeping track computing those differences and sending 'em out.

 

Speaker:

So it's much, much faster at doing that than reading the data out, figuring

 

Speaker:

out the differences and sending it.
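
The efficiency argument is that the array already knows which blocks changed, so nothing has to be read back and compared. A minimal Python sketch of that idea, with a hypothetical `TrackedVolume` class and a dict standing in for the replica:

```python
class TrackedVolume:
    """Toy volume that remembers which blocks changed since the last snapshot."""

    def __init__(self):
        self.blocks = {}
        self.dirty = set()

    def write(self, block_id, data):
        self.blocks[block_id] = data
        self.dirty.add(block_id)  # tracked at write time, so no later scan is needed

    def snapshot_delta(self):
        """Take a snapshot and return only the blocks written since the previous one."""
        delta = {block_id: self.blocks[block_id] for block_id in self.dirty}
        self.dirty = set()
        return delta


volume = TrackedVolume()
replica = {}

for block_id, data in [(0, "aaa"), (1, "bbb"), (2, "ccc")]:
    volume.write(block_id, data)
first = volume.snapshot_delta()
replica.update(first)   # the first transfer carries everything

volume.write(1, "BBB")
second = volume.snapshot_delta()
replica.update(second)  # later transfers carry only what changed
```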

 

Speaker:

Yeah.

 

Speaker:

The challenge, I think, is that it is a storage level solution,

 

Speaker:

which means that you need to do the interfacing up to the application.

 

Speaker:

Sometimes the storage vendor can help you with that.

 

Speaker:

Sometimes you're on your own.

 

Speaker:

I've been in both scenarios.

 

Speaker:

But today though, Curtis, I wanna say most backup vendors

 

Speaker:

integrate with most storage vendors.

 

Speaker:

Or, and it may not be a hundred percent, but if you're picking like the major

 

Speaker:

common ones, I'm guessing that most backup vendors have API integration

 

Speaker:

with the storage vendors' APIs in order to be able to trigger that snapshot.

 

Speaker:

Yes, you can.

 

Speaker:

So the question is, do they both interface with the application and with

 

Speaker:

the storage snapshot at the same time?

 

Speaker:

All I'm saying is you need to look into that, right?

 

Speaker:

If you're taking a snapshot, you need to do your best to make sure

 

Speaker:

that that snapshot is, is application consistent, um, ver versus the

 

Speaker:

alternative, which is crash consistent.

 

Speaker:

Right.

 

Speaker:

And, and by the way, let, let me just, yeah, let me just use that, lemme just

 

Speaker:

talk about that term for a minute.

 

Speaker:

So if you are not.

 

Speaker:

Making a snapshot with the, in, in partnership with an application,

 

Speaker:

you're creating what's called a crash consistent snapshot.

 

Speaker:

It's called that because it is as consistent as a crash.

 

Speaker:

You, you're essentially like, it's like you flip the power switch off

 

Speaker:

on an, on an operational storage array and you get what you get.

 

Speaker:

Yes.

 

Speaker:

Nothing is moving, but.

 

Speaker:

Stuff was moving.

 

Speaker:

So your mileage will vary.

 

Speaker:

Well, nothing that was committed to disk or committed by the storage array.

 

Speaker:

Any writes that were committed by a storage array have been preserved.

 

Speaker:

Anything that was in flight may not have been committed.

 

Speaker:

And as an application, you might have to do some recovery steps once

 

Speaker:

a storage array comes back because you don't know what the state is

 

Speaker:

Yeah,

 

Speaker:

because some of those in-flight writes might have been

 

Speaker:

committed, some may not have.

 

Speaker:

And there are those who say, look, you know it works 99% of the time.

 

Speaker:

You just take more snapshots.

 

Speaker:

And if this snapshot doesn't work, maybe the previous snapshot will.

 

Speaker:

I just try to avoid crash-consistent snapshots whenever I can.

 

Speaker:

Right.

 

Speaker:

I would.

 

Speaker:

I, I agree.

 

Speaker:

For the most part, but there are cases where you could use a crash

 

Speaker:

consistent snapshot on a more frequent basis and do an application

 

Speaker:

consistent snapshot, say once a day.

 

Speaker:

So even though, so you can potentially

 

Speaker:

have that as your backup of your

 

Speaker:

Yeah.

 

Speaker:

Yes.

 

Speaker:

As a backup of your backup.

 

Speaker:

Right.

 

Speaker:

And that

 

Speaker:

why?

 

Speaker:

Why would you do that?

 

Speaker:

I'm guessing the answer to that question would be,

 

Speaker:

Perhaps if doing an application consistent snapshot has an impact on

 

Speaker:

the performance of the application.

 

Speaker:

Uh, I know in the case of Oracle, for example, when you put it in backup

 

Speaker:

mode, it changes how it stores the redo logs, which can

 

Speaker:

have a minor impact on performance.

 

Speaker:

And so perhaps you only do that once a day when nobody's using the

 

Speaker:

database, and then you do the crash-consistent

 

Speaker:

snapshots more often than that. I don't have a problem with that.

 

Speaker:

Yeah.

 

Speaker:

Yeah.

 

Speaker:

But just relying on, yeah, and then you just take that snapshot

 

Speaker:

and then replicate it off.
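
As a rough sketch of that kind of schedule, the Python below plans one day of snapshots: one application-consistent snapshot in a quiet hour, and crash-consistent snapshots the rest of the time, each of which would then be replicated. The 15-minute interval and the 2 a.m. quiet hour are assumptions chosen for illustration, and `plan_day` is a hypothetical helper, not part of any product.

```python
from datetime import datetime, timedelta


def plan_day(day_start, crash_every_minutes=15, app_consistent_hour=2):
    """Return a list of (time, kind) snapshot slots for one day.

    The first slot at or after the quiet hour is application consistent;
    every other slot is crash consistent.
    """
    app_time = day_start + timedelta(hours=app_consistent_hour)
    end = day_start + timedelta(days=1)
    step = timedelta(minutes=crash_every_minutes)

    plan = []
    app_done = False
    t = day_start
    while t < end:
        if not app_done and t >= app_time:
            kind = "application-consistent"
            app_done = True
        else:
            kind = "crash-consistent"
        plan.append((t, kind))
        t += step
    return plan


plan = plan_day(datetime(2023, 12, 4))
for when, kind in plan[7:10]:
    # Each snapshot would be replicated to the target after it is taken.
    print(when.strftime("%H:%M"), kind)
```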

 

Speaker:

Right, and the beautiful thing, I think, about snapshots and replication,

 

Speaker:

or near CDP, is that what you have?

 

Speaker:

I'm glad that you find that term so amusing, what you have at the um,

 

Speaker:

I think that's why I can say that I coined this term 'cause nobody

 

Speaker:

else seems to want to use it, so I must have coined it and I love it.

 

Speaker:

And it's in at least two books, two that I wrote.

 

Speaker:

I don't know if it's anywhere else, but,

 

Speaker:

I don't care what you call it.

 

Speaker:

We're just talking about snapshots and replication.

 

Speaker:

Just don't,

 

Speaker:

the 15 years that I worked,

 

Speaker:

don't call near CDP "CDP," because NetApp definitely tried that one.

 

Speaker:

Right?

 

Speaker:

It is not continuous.

 

Speaker:

The reason I was laughing is, yeah, in the 15 years that I worked

 

Speaker:

in the storage industry, I never came across near CDP in the

 

Speaker:

way that you're talking about it.

 

Speaker:

Yeah.

 

Speaker:

Yeah, yeah.

 

Speaker:

And I'm fine.

 

Speaker:

I'm fine with that.

 

Speaker:

So again, I'm still taking credit for coining it, even if nobody uses it but me.

 

Speaker:

It's not like the 3-2-1 rule or anything like that.

 

Speaker:

Um,

 

Speaker:

One other thing I wanted to mention about snap and replicate that I don't

 

Speaker:

think you covered yet is that, with some vendors, when you're doing snap

 

Speaker:

and replicate, you may not always have to have the same snapshot retention on

 

Speaker:

your source array and your target array.

 

Speaker:

You might, for instance, decide I'm only gonna keep 30 days worth of

 

Speaker:

snapshots on my production system.

 

Speaker:

And on my secondary system, I'm gonna keep 90 days

 

Speaker:

worth of snapshots so I can go back.

 

Speaker:

Some systems allow you to set different retentions for snapshots on both sides.

 

Speaker:

Some may not.

 

Speaker:

So you should also, once again, look at your vendor, see what's possible.

 

Speaker:

But I know for some folks, instead of having to go beyond that 30 days and

 

Speaker:

say, okay, now I have to go to my backup infrastructure and pull data off of

 

Speaker:

it, they might be able to say, okay.

 

Speaker:

If it's not in production because it's beyond the 30 days, let me go

 

Speaker:

check my secondary storage system.

 

Speaker:

Okay?

 

Speaker:

I have 90 days' worth of snapshots.

 

Speaker:

Can I restore the data from there?
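
That lookup order can be written down as a small Python sketch. The 30-day and 90-day figures are the ones from the conversation; the function name `where_to_restore` and the three return labels are made up for illustration, and whether your arrays support different retention on each side is something to confirm with your vendor.

```python
from datetime import date, timedelta

SOURCE_RETENTION_DAYS = 30  # snapshots kept on the production array
TARGET_RETENTION_DAYS = 90  # snapshots kept on the secondary array


def where_to_restore(snapshot_date, today):
    """Pick the closest place that should still hold the snapshot."""
    age_days = (today - snapshot_date).days
    if age_days < 0:
        raise ValueError("snapshot date is in the future")
    if age_days <= SOURCE_RETENTION_DAYS:
        return "source"   # restore straight from production snapshots
    if age_days <= TARGET_RETENTION_DAYS:
        return "target"   # fall back to the replicated snapshots
    return "backup"       # older than both: go to the backup system


today = date(2023, 12, 4)
for days_ago in (10, 45, 120):
    print(days_ago, where_to_restore(today - timedelta(days=days_ago), today))
```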

 

Speaker:

Right.

 

Speaker:

Yeah.

 

Speaker:

I, I, I love that idea, right?

 

Speaker:

'Cause one of the nice things about this idea is that you

 

Speaker:

could maybe have a more expensive primary storage array, and you can have

 

Speaker:

a less expensive storage array that's based on SATA, for example, as

 

Speaker:

your backup system.

 

Speaker:

And another thing, by the way, that you can do with a near CDP setup is that

 

Speaker:

you can use that secondary site to give... I'll coin a new term: near CDP plus.

 

Speaker:

So near CDP plus is snap, replicate, then back up, right?

 

Speaker:

Use that snapshot that's on that

 

Speaker:

target, and then back that up with some other method that isn't... 'Cause one

 

Speaker:

of the downsides that some people pick on with snapshot and replication is that

 

Speaker:

basically your entire storage and backup infrastructure is all within one vendor.

 

Speaker:

And the, the worry is about this idea of a rolling bug that somehow

 

Speaker:

takes out all of ONTAP one day.

 

Speaker:

And it takes everybody's primary and their

 

Speaker:

secondaries along with it, so.

 

Speaker:

The other issue also, with just snap and replicate, is if you, say, have a

 

Speaker:

backup proxy, so you're backing up your NAS system, you're using a proxy, which

 

Speaker:

is basically a backup client to mount that snapshot and copy the data off.

 

Speaker:

One of the challenges you have is that when you mount it

 

Speaker:

to the storage array,

 

Speaker:

that backup client looks no different than any other production client,

 

Speaker:

and so when it ends up reading the data, it could cause performance impact

 

Speaker:

because it has to read the entire file system on the source to figure out

 

Speaker:

what's different and move the data off.

 

Speaker:

This, of course, isn't integrating with the native snapshot storage APIs that the

 

Speaker:

storage vendor provides, but is actually just reading it like a normal file system.

 

Speaker:

When you do snap and replicate, you can actually mount the

 

Speaker:

snapshot on the target system

 

Speaker:

and do your backup off of that, and therefore you're not affecting your

 

Speaker:

production application because you're not impacting the IO on that system.

 

Speaker:

Or you could use our friend Steven's favorite thing, NDMP,

 

Speaker:

Yep.

 

Speaker:

You could use NDMP

 

Speaker:

the Network Data Management Protocol.

 

Speaker:

Which was, which was another solution.

 

Speaker:

This is technically off topic at this point, but there was this

 

Speaker:

other way to back up, uh, NAS systems.

 

Speaker:

Well, it's still around.

 

Speaker:

Is that you can back up essentially to tape.

 

Speaker:

NDMP is generally meant to go to tape, or to virtual tape.

 

Speaker:

And, uh, it was meant to solve the issue that you mentioned because

 

Speaker:

it would recognize it as a backup process and then deprioritize it.

 

Speaker:

Uh, nice.

 

Speaker:

It,

 

Speaker:

I.

 

Speaker:

Yep.

 

Speaker:

There's another use case I wanna talk about with snap and replicate, and it's

 

Speaker:

not necessarily backup related, but there are many companies who have a distributed

 

Speaker:

environment and they need performance.

 

Speaker:

And so what they sometimes do is they will snap and replicate to multiple

 

Speaker:

systems as kind of a fan-out, and then they would have

 

Speaker:

clients read from those target systems because they're consistent at some

 

Speaker:

point, and use that as read

 

Speaker:

optimization rather than all these systems trying to hit a single production system.

 

Speaker:

And these secondary systems could be in the same building.

 

Speaker:

They could be spread across the world.

 

Speaker:

So you're now sort of doing read load balancing and you're leveraging

 

Speaker:

the snap and replicate technology in order to move a copy of the

 

Speaker:

data closer to the clients.

 

Speaker:

Yeah, by the way, I don't think we really mentioned

 

Speaker:

this before, but that's one of the best things here, is that that secondary

 

Speaker:

target, and maybe even a tertiary target could be very far away because

 

Speaker:

you're doing asynchronous replication, so you shouldn't be impacting the

 

Speaker:

performance of the primary array.

 

Speaker:

Uh, at least not much anyway.

 

Speaker:

But, generally speaking, we can put that as far

 

Speaker:

as we want to from the primary.

 

Speaker:

Yep.

 

Speaker:

So I'd say the final thing that we would say about snapshots and

 

Speaker:

replication is that which we've already sort of alluded to, and that

 

Speaker:

is that your backup vendor may support this as just another way to back up

 

Speaker:

production data.

 

Speaker:

Right.

 

Speaker:

Most of the popular vendors, especially NAS, are gonna

 

Speaker:

have something like this.

 

Speaker:

And then, uh, the more popular they are as a NAS product, the greater

 

Speaker:

the possibility that they will integrate with a backup app.

 

Speaker:

Right.

 

Speaker:

So this is just another way to back up, especially, your on-prem storage,

 

Speaker:

although some of these vendors are now offering, and actually have been for quite some

 

Speaker:

time now, cloud versions of these typically on-prem products.

 

Speaker:

So, can you think of anything else that we should talk about,

 

Speaker:

Prasanna?

 

Speaker:

I think that covers it all quite a

 

Speaker:

Yeah, it's a great way, I think, to have a very tight

 

Speaker:

RPO and a really tight RTO, right?

 

Speaker:

The RTO is really small.

 

Speaker:

'Cause basically you just start using the snapshot;

 

Speaker:

there's no restore.

 

Speaker:

You can start using the replicated snapshot immediately while you're

 

Speaker:

restoring the primary snapshot.

 

Speaker:

Right?

 

Speaker:

That's sort of the beautiful thing of that.

 

Speaker:

You might get

 

Speaker:

reduced performance.

 

Speaker:

But you can meet a really tight RTO. You

 

Speaker:

could do snapshots very frequently.

 

Speaker:

So you can also meet a really tight RPO.
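
A back-of-the-envelope way to see the RPO difference, written as Python: under a simple model, the most data you can lose is the time between snapshots, plus however long a snapshot takes to reach the target if you are recovering from the replicated copy. The 15-minute interval and 5-minute replication lag below are assumed numbers for illustration only, and the model ignores things like how long a nightly backup itself takes to run.

```python
def worst_case_rpo_minutes(snapshot_interval_minutes, replication_lag_minutes=0):
    """Worst-case data loss under a simple model.

    Just before the next copy becomes usable, the newest usable copy is
    one full interval old, plus the replication lag if it must be offsite.
    """
    if snapshot_interval_minutes < 0 or replication_lag_minutes < 0:
        raise ValueError("intervals must be non-negative")
    return snapshot_interval_minutes + replication_lag_minutes


nightly_backup = worst_case_rpo_minutes(24 * 60)    # one backup per day
local_snapshots = worst_case_rpo_minutes(15)        # recover on the source array
replicated = worst_case_rpo_minutes(15, 5)          # recover from the target array

print(nightly_backup, local_snapshots, replicated)
```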

 

Speaker:

I did have one thing to add since you were just talking about it.

 

Speaker:

So one thing we didn't talk about, which I think is

 

Speaker:

super awesome about snapshots: we mentioned previously that snapshots

 

Speaker:

are read only, which is great if you wanna pull some piece of data out

 

Speaker:

of it or something else like that.

 

Speaker:

But if you have applications where you need to actually do some recovery process,

 

Speaker:

you can actually take a snapshot, which is read only, and most storage vendors allow

 

Speaker:

you to clone it into a read-write volume that you can then mount and connect to

 

Speaker:

your application and do your recovery process.

 

Speaker:

Again, without occupying the full amount of space, because it's all

 

Speaker:

based on the snapshot. It spins up a copy and allows you to do the recovery process.

 

Speaker:

It's read-write.

 

Speaker:

You could do all your testing, your restore verification, which

 

Speaker:

we always talk about on the podcast: go restore your backups.

 

Speaker:

And once you're done with that and you validate, you can quickly toss

 

Speaker:

it away, and then you're good to go.

 

Speaker:

So that's another benefit of sort of snap and replicate, is you can

 

Speaker:

do all this verification on your secondary system without once

 

Speaker:

again impacting your production.

 

Speaker:

Right.

 

Speaker:

There are a lot of advantages to the snap and replicate style of,

 

Speaker:

you know, I'm calling it backup.

 

Speaker:

Right.

 

Speaker:

And one of them is basically that

 

Speaker:

the replicated copy stays in native format, and that

 

Speaker:

leads to all sorts of possibilities.

 

Speaker:

One of which, I think probably the best of which, is that you

 

Speaker:

can do automated recovery testing.

 

Speaker:

Right.

 

Speaker:

Automated cloning and then tested recovery.

 

Speaker:

And that way you're validating the actual snapshot that

 

Speaker:

you would like to use for recovery.
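
The clone-then-test idea can be modeled in a few lines of Python. This is a toy, in-memory illustration and not how any particular array implements it: the snapshot is a read-only mapping, the clone is a writable layer on top of it (so only changed items take extra space), and names like `take_snapshot`, `clone_snapshot`, `run_recovery_test`, `replay_log`, and `looks_healthy` are hypothetical.

```python
from collections import ChainMap
from types import MappingProxyType


def take_snapshot(volume):
    """Freeze the current contents as a read-only view."""
    return MappingProxyType(dict(volume))


def clone_snapshot(snapshot):
    """Writable clone: new writes land in the empty top layer,
    reads fall through to the unchanged snapshot underneath."""
    return ChainMap({}, snapshot)


def run_recovery_test(snapshot, recover, validate):
    clone = clone_snapshot(snapshot)
    recover(clone)           # e.g. the application's own crash recovery
    result = validate(clone)
    return result            # the clone is simply discarded afterward


def replay_log(vol):
    # Toy "recovery": apply pending log entries to the data file.
    vol["data.db"] = vol["data.db"] + vol["redo.log"]
    vol["redo.log"] = ()


def looks_healthy(vol):
    return vol["data.db"] == (1, 2, 3, 4, 5) and vol["redo.log"] == ()


replicated_snapshot = take_snapshot({"data.db": (1, 2, 3), "redo.log": (4, 5)})
print(run_recovery_test(replicated_snapshot, replay_log, looks_healthy))
print(dict(replicated_snapshot))  # the snapshot itself is untouched
```

Running the test on a schedule against each newly replicated snapshot is one way to automate it, since the snapshot being validated is the same one you would recover from.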

 

Speaker:

So yeah, it's a really great way that I think

 

Speaker:

maybe not enough people take advantage of.

 

Speaker:

So hopefully

 

Speaker:

you've learned a thing or two.

 

Speaker:

And, uh, with that, I wanna say thank you for, uh, joining

 

Speaker:

us and, of course, Prasanna.

 

Speaker:

This was one where you really shined, I think.

 

Speaker:

'cause you, you know, your,

 

Speaker:

This is what I lived and breathed.

 

Speaker:

Yeah.

 

Speaker:

Yeah, yeah, exactly.

 

Speaker:

Exactly.

 

Speaker:

So, great having you on again today.

 

Speaker:

And I promise I won't harp on the near CDP term as much.

 

Speaker:

It's gonna take off.

 

Speaker:

Uh, we'll see.

 

Speaker:

We'll see.

 

Speaker:

Maybe I'll, maybe I'll do it in Spanish and then, uh, it'll be, it'll be better.

 

Speaker:

Uh, and, uh, so thanks to the listeners.

 

Speaker:

Thanks for listening because that's really the

 

Speaker:

only reason that we do this.

 

Speaker:

That's a wrap.