March 24, 2025

Disaster Recovery Lessons from 9/11 and Beyond

In this eye-opening episode of The Backup Wrap-up, W. Curtis Preston and Prasanna Malaiyandi unpack crucial disaster recovery lessons from major events like 9/11. They discuss how companies lost both primary and backup data centers when both World Trade Center towers fell, highlighting why geographic separation is non-negotiable. The hosts break down the technical aspects of disaster recovery, comparing hot sites versus cold sites, and the realities of synchronous versus asynchronous replication across distances.

Beyond the technical, Curtis and Prasanna share often-overlooked disaster recovery lessons about human factors—where recovery teams will sleep, eat, and work during extended outages when infrastructure is destroyed. They examine a real case from a hurricane-stricken island where teams converted conference rooms to sleeping quarters and relied on satellite communications. Whether you're planning for natural disasters, power outages, or ransomware attacks, these disaster recovery lessons will help ensure your organization can recover when—not if—disaster strikes.

Transcript
Speaker:

You found The Backup Wrap-up, your go-to podcast for all things

Speaker:

backup recovery and cyber recovery.

Speaker:

In this episode, we talk about some hard earned disaster recovery lessons

Speaker:

from major events like 9/11.

Speaker:

We talk about what we learned about DR that day, and we talk about

Speaker:

those critical human elements of DR.

Speaker:

That people often forget, like where your recovery team is going to

Speaker:

sleep when the hotels are all gone.

Speaker:

Disasters happen and they're never convenient.

Speaker:

Whether it's terrorists, hurricanes, or ransomware, you

Speaker:

need to think through what you do.

Speaker:

If you're completely isolated from the world, the time to

Speaker:

learn these lessons is now.

Speaker:

So I hope you enjoy this episode.

Speaker:

By the way, if you don't know who I am, I'm W. Curtis Preston, AKA Mr. Backup,

Speaker:

and I've been passionate about backup and disaster recovery for over 30 years.

Speaker:

Ever since.

Speaker:

I had to tell my boss that there were no backups of the

Speaker:

big database that we just lost.

Speaker:

I don't want that to happen to you, and that's why I do this podcast.

Speaker:

On this podcast, we turn unappreciated backup admins into Cyber Recovery Heroes.

Speaker:

This is The Backup Wrap-up.

Speaker:

Welcome to the show.

Speaker:

Hi, I am W. Curtis Preston, AKA Mr. Backup, and I have with me a guy who

Speaker:

is as lazy as I am when it comes to how to move rugs around the house.

Speaker:

Prasanna Malaiyandi, how's it going?

Speaker:

Prasanna.

Speaker:

I am doing well, Curtis, by the way, isn't it amazing how quiet a room gets

Speaker:

when you put something on the floor?

Speaker:

Yeah, I, I, I wonder if anybody, uh, if anybody notices the difference

Speaker:

in sound because it, you know, and it's been, they probably just got

Speaker:

so used to the other sound, right?

Speaker:

Because I. I went with LVP, you know, upstairs and um, and then in

Speaker:

my office and it just got so echoey and I put the stuff on this wall.

Speaker:

And this over here is acoustic panels.

Speaker:

Um, you know, behind me there are acoustic panels up on the ceiling

Speaker:

and it still wasn't the same.

Speaker:

I. And then I found, you know, I, I, I, this, this rug is from actually my

Speaker:

living room because we bought a nicer rug and, uh, a nicer, bigger rug.

Speaker:

And so we put that one down there and then I moved here.

Speaker:

But when I moved it in, I, I, I did what I thought was like

Speaker:

a really weird way to do it.

Speaker:

I didn't actually move the furniture.

Speaker:

I was like lifting it up and trying to do it without having to move everything out.

Speaker:

And you said, well, that's exactly how I did it.

Speaker:

exactly.

Speaker:

Yeah.

Speaker:

And I, I, I thought it was just me, but apparently.

Speaker:

It wasn't.

Speaker:

because at least with the rug, well, it depends on how your desk is too.

Speaker:

But like you can at least unfurl part of it and just kind of like shove it down.

Speaker:

Like you just lift

Speaker:

up the front part and

Speaker:

then you lift up the back part.

Speaker:

So,

Speaker:

yeah.

Speaker:

I, I think different people might go, no, no, we're gonna,

Speaker:

we're gonna move everything out.

Speaker:

We're gonna put the rug down and we'll move everything back in.

Speaker:

And they might think of that as easier, but I was like, I

Speaker:

don't wanna move everything.

Speaker:

well, it's more than moving.

Speaker:

It's setting everything back up again.

Speaker:

Yeah, I, I looked into it.

Speaker:

I could have potentially moved the desk without disassembling everything.

Speaker:

It was just, you know, but, um, yeah, so I, it was just nice to hear

Speaker:

when I was talking to you for once.

Speaker:

You weren't like, 'cause so many times I tell you about something that I'm

Speaker:

doing and then you're like, that's the dumbest thing I've ever heard.

Speaker:

Why would you do it that way?

Speaker:

And you're like, oh no, that's, uh, that's how I do it.

Speaker:

Yeah.

Speaker:

Especially if you're space limited.

Speaker:

Like for

Speaker:

What's that?

Speaker:

especially if you are space limited

Speaker:

Yes, space limited is true.

Speaker:

I was just looking, so I have my desk right here and then

Speaker:

like that way, that way I

Speaker:

have the door, but there's no way I could fit my desk outside of

Speaker:

the door with it fully assembled.

Speaker:

Oh yeah.

Speaker:

I'd have to at least take the top off, which is just a giant chore.

Speaker:

So.

Speaker:

Oh yeah.

Speaker:

Well, in my case, I don't have that excuse 'cause I have two French doors, so I

Speaker:

wouldn't have had that excuse, but yeah.

Speaker:

But it still, it still would've been, it still would've been annoying.

Speaker:

And so, but we figured, we figured it out and now I have better sound.

Speaker:

It's a beautiful thing.

Speaker:

And, uh, so, uh, today we're gonna talk about DR and disaster recovery and

Speaker:

lessons learned, and especially, um, some lessons learned from at least one

Speaker:

major event, uh, one major disaster.

Speaker:

And, you know, I, I don't want to, um, you know, the, these, these events,

Speaker:

especially this event, disasters are, are always difficult on, on the people.

Speaker:

Uh, and, and I don't want in any way make light of those disasters

Speaker:

or I, I don't know, whatever, whatever I'm trying to say.

Speaker:

Right.

Speaker:

So proper respect to the people.

Speaker:

'cause some of the, well, especially the one, the main one that we're

Speaker:

gonna talk about, people died in these disasters and, um, you know,

Speaker:

with respect to those people.

Speaker:

Having said that.

Speaker:

We can learn from the things that happened, uh, at that time.

Speaker:

Right?

Speaker:

And so let's talk about, and and the disaster.

Speaker:

The, the main disaster I'm talking about at this point would be what

Speaker:

we all called 9/11, right?

Speaker:

So September 11th, 2001.

Speaker:

Uh, I lived through it.

Speaker:

You lived through it.

Speaker:

Uh, so how old would you have been in 2001?

Speaker:

I would rather not say

Speaker:

I was very young.

Speaker:

You were very young.

Speaker:

I, I was actually in college.

Speaker:

okay.

Speaker:

I was here, um, and um, I was in this house and my kids were young.

Speaker:

My oldest would've been, uh, seven, and my younger one would've been four.

Speaker:

And what I remember was she might've been in, she might've been five,

Speaker:

she might've been in preschool or might've been in kindergarten.

Speaker:

What I remember, and for those that don't know, I live in North County San Diego,

Speaker:

which means I live just south of Camp Pendleton, which is, uh, you know, one

Speaker:

of the biggest, uh, Marine Corps bases.

Speaker:

The, the idea was that we were under attack because there were multiple events

Speaker:

that were happening simultaneously.

Speaker:

Right.

Speaker:

Multiple planes hit the, the trade center, there was the plane that hit the Pentagon.

Speaker:

There was the plane that went down in Pennsylvania.

Speaker:

There was this feeling that, like we were being attacked

Speaker:

as a country, which we were, and living near a military base.

Speaker:

I don't know what I thought I was going to accomplish by keeping my kids home

Speaker:

from school, but that's what I did.

Speaker:

Yeah.

Speaker:

I was just like, I'm gonna, I'm gonna, I'm gonna, I'm gonna, I'm

Speaker:

gonna keep my kid, like, it, it, like, I, I, I don't even know what

Speaker:

I was thinking, but like, you know, thinking back on it now I'm thinking

Speaker:

that probably I was thinking that, um.

Speaker:

I, I, I, if the world's gonna end, I want my kids near me.

Speaker:

I

Speaker:

mean, it was like, it was, it was, it was, it's kind of morbid.

Speaker:

But that's,

Speaker:

I, I just remember that I didn't, you know, that, that I kept my kids near me.

Speaker:

Um, and I remember that I knew multiple people that worked in

Speaker:

the World Trade Center.

Speaker:

None of them were hurt that day

Speaker:

Mm.

Speaker:

and for various reasons.

Speaker:

One, uh, and I know one person that was in the World Trade Center that day.

Speaker:

I knew other people that worked in the World Trade Center, but

Speaker:

for one reason or another, chose not to go into work that day.

Speaker:

Hmm.

Speaker:

I know another person that was supposed to be on Flight 11, which

Speaker:

was from Boston to New York, which was the flight that ended up going.

Speaker:

I don't know if that's the one that went into the Pentagon or

Speaker:

if that's the one that crashed in Pennsylvania, but I knew a person that

Speaker:

was supposed to be on that flight.

Speaker:

That person did not get on that flight.

Speaker:

And, uh, and then the other person that, that I knew was the person in

Speaker:

the, in the trade center, it was, uh, Michael Hingson that was, uh, the, the

Speaker:

blind person that made it, that made it down thanks to his seeing eye dog.

Speaker:

Um, and he, you know, he.

Speaker:

Somewhat famous.

Speaker:

As a result.

Speaker:

He actually went on to become a motivational speaker.

Speaker:

And, um, and, uh, just to, just to lighten things up, I, I'll talk about,

Speaker:

uh, how I first met Michael, and that was, and I, it seems like I've told

Speaker:

this story relatively recently, but I was at my very first trade show.

Speaker:

This would've been in the early nineties, and it was at, it was in

Speaker:

New York, um, at the Javits Center.

Speaker:

In Manhattan, and it was Unix Expo, and I saw, uh, these hot swappable

Speaker:

disk drives, you know, where, where you

Speaker:

could push a button and pull the drives out.

Speaker:

And I just thought that was the coolest thing I'd ever seen.

Speaker:

'cause at the, at the time that, that was unlike anything I'd ever

Speaker:

seen it, it, that was really, really new and really, really cool.

Speaker:

Now we just take it for granted.

Speaker:

But back then it was really cool and he was, he was the one that was.

Speaker:

Standing in the booth demonstrating these, these, uh, disk drives and

Speaker:

literally his, his shtick was.

Speaker:

It's so easy a blind man can do it.

Speaker:

Right?

Speaker:

And um, and we were like, this is amazing.

Speaker:

We're like, you guys are the only ones with it.

Speaker:

And he goes, yes, we are the only ones with this product.

Speaker:

And then we walked through the trade center or the trade show and we saw

Speaker:

several other vendors with the product.

Speaker:

And one of us said to the other.

Speaker:

Well, in his defense, he can't see the other vendors.

Speaker:

So true.

Speaker:

Um, do you, do you remember, do you have memories of that day?

Speaker:

Of course You do.

Speaker:

Oh yeah, so I remember, so I went to school, I was in Pittsburgh actually.

Speaker:

Mm-hmm.

Speaker:

And so we had a lot of friends, or I had a lot of friends whose families were

Speaker:

living in the city in New York City.

Speaker:

Right.

Speaker:

And so I remember everyone just was very, very concerned.

Speaker:

Um, and like you had mentioned, there was sort of the plane flying towards.

Speaker:

Uh, over Pennsylvania.

Speaker:

And so everyone in Pittsburgh was keeping a close eye being like, Hey,

Speaker:

where is that plane actually going?

Speaker:

Because

Speaker:

it was supposed to fly over Pittsburgh,

Speaker:

right.

Speaker:

right?

Speaker:

So I remember everyone sort of being worried, concerned because they had family

Speaker:

and relatives and they weren't able, like all the phone lines were shut down, right?

Speaker:

So.

Speaker:

No one was able to figure out like what was going on.

Speaker:

I remember going with a bunch of other folks down to the common area

Speaker:

and just kind of like just being like shellshocked as we're watching the news.

Speaker:

Yeah, I, yeah, absolutely.

Speaker:

Um, there's a, there's a great line in the beginning of a movie that I like.

Speaker:

It's become problematic now.

Speaker:

Um, maybe it was always problematic, but there's a movie called Love Actually,

Speaker:

and in the beginning, which is, uh, voiceover from, um, Hugh, Hugh Grant.

Speaker:

And, and, and he was saying on 9/11.

Speaker:

There were a lot of phone calls that were made from the planes,

Speaker:

and he said, to my knowledge, none of them were messages of hate.

Speaker:

They were

Speaker:

messages of love.

Speaker:

Hmm.

Speaker:

I just got a little bit of a lump there.

Speaker:

Anyway, so, okay.

Speaker:

So, um, when we think about that event, there is, in my world, we

Speaker:

immediately started talking about.

Speaker:

The, we saw the things that happened to companies, and we're gonna talk

Speaker:

about, I'm gonna talk about sort of two, two different kinds of companies.

Speaker:

One of them.

Speaker:

So first off, let's talk about, uh, the, the difference between

Speaker:

a cold site and a hot site.

Speaker:

Do you, so when we talk about disaster recovery, there was this

Speaker:

idea that we, that we're gonna have another site, like ready to go

Speaker:

Yep.

Speaker:

and we talk about cold site and the hot site.

Speaker:

Do you want to talk about what that means?

Speaker:

Yeah, so a hot site basically is you have a site, right?

Speaker:

A disaster recovery site that is fully operational, has all the

Speaker:

equipment, has everything replicated to it, and basically once you push

Speaker:

the big red button, right?

Speaker:

Everything sort of fails over.

Speaker:

It's available, operational, all ready to go, sort of ready

Speaker:

to serve traffic and take over, usually minutes to hours within a

Speaker:

failure.

Speaker:

right.

Speaker:

And then a cold site is very much the opposite of that, right?

Speaker:

It's a site that's sort of ready to start a restore.

Speaker:

Uh, like I suppose there'd be a, there'd be a no site,

Speaker:

Yeah.

Speaker:

right?

Speaker:

That's where, uh, a bad thing happened.

Speaker:

And we're gonna go find some hardware to restore

Speaker:

to, uh, cold site, the, the.

Speaker:

Implication there is that you have some hardware ready to go, but

Speaker:

you haven't restored anything.

Speaker:

A warm site is somewhere in between those two things,

Speaker:

Have you heard of this term that some people call pilot?

Speaker:

A pilot light

Speaker:

Yeah.

Speaker:

talk to.

Speaker:

Talk to me

Speaker:

So it's basically not quite cold, but not quite warm or hot, right?

Speaker:

So

Speaker:

it's a little bit better than cold, but not quite to the extent.

Speaker:

And a lot of the trade off comes from not necessarily having to

Speaker:

eat all of the costs upfront.

Speaker:

So

Speaker:

as an example, if your pilot site is, say, in the cloud, you might have your

Speaker:

data available, but not necessarily your compute and everything else ready to go.
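
Editor's sketch: one way to keep the site types straight is to ask what is already in place at each one and what still has to happen before it can serve traffic. The small Python model below is only an illustration under stated assumptions. The `DrSite` fields, the `remaining_steps` helper, and the step names are hypothetical, and the pilot-light entry follows the cloud example just described (data present, compute not yet provisioned).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DrSite:
    name: str
    hardware_ready: bool    # servers or compute already in place
    data_present: bool      # a copy of the data is already at the site
    services_running: bool  # applications are up and waiting for traffic


def remaining_steps(site: DrSite) -> list[str]:
    """List what still has to happen before the site can take over."""
    steps = []
    if not site.hardware_ready:
        steps.append("provision compute")
    if not site.data_present:
        steps.append("restore data")
    if not site.services_running:
        steps.append("start services")
    steps.append("fail over")  # every site type still needs the final cutover
    return steps


# Hypothetical examples matching the descriptions in the conversation.
COLD = DrSite("cold", hardware_ready=True, data_present=False, services_running=False)
PILOT_LIGHT = DrSite("pilot light", hardware_ready=False, data_present=True, services_running=False)
HOT = DrSite("hot", hardware_ready=True, data_present=True, services_running=True)

for site in (COLD, PILOT_LIGHT, HOT):
    print(f"{site.name}: {', '.join(remaining_steps(site))}")
```

The fewer steps left, the more the site costs to keep running day to day, which is the trade-off being described.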

Speaker:

Yeah, I think that I, I think I would still call that a warm site, but

Speaker:

I mean, there is this concept, it comes from the concept of a pilot

Speaker:

light.

Speaker:

I. Right.

Speaker:

Which for those of you that don't know, when you have a

Speaker:

gas old school that

Speaker:

nowadays we have electronic ignition, but there used to be this in inside,

Speaker:

if you had a gas water heater or a gas furnace, there would be this

Speaker:

little flame that would burn all the

Speaker:

time.

Speaker:

And that would, that's called your pilot light.

Speaker:

And if the pilot light goes out, then you're gonna have to relight it

Speaker:

because otherwise your, your heat won't work.

Speaker:

Um, modern days we use.

Speaker:

electronic electronic ignition,

Speaker:

but, um, because a pilot light just wastes a lot of,

Speaker:

yep.

Speaker:

It's always running and you're always consuming gas.

Speaker:

Yeah.

Speaker:

Um, so there, so the, the best from a DR perspective, right?

Speaker:

The, the, the, the Cadillac, if you will, is, um.

Speaker:

The, I don't know if that's the right term anymore because nobody buys Cadillac, the

Speaker:

Ferrari, the Rolls Royce.

Speaker:

Does anybody buy, do they still make Rolls

Speaker:

Oh yeah.

Speaker:

Okay.

Speaker:

Um, is the hot site

Speaker:

that it's, it's, it's ready to go.

Speaker:

Number one.

Speaker:

It's ready to go when you need it, and, and two, it's kept and

Speaker:

it's ready to go within a, a few.

Speaker:

Minutes or seconds, right?

Speaker:

It's kept as up to date as possible, as much as technology

Speaker:

would allow you to do so.

Speaker:

And one of the challenges with a hot site is, is latency.

Speaker:

And so you might want to put the hot site

Speaker:

as close

Speaker:

as you can to, uh, the, the site that you're preparing

Speaker:

and what happened on 9/11.

Speaker:

So there were many companies that, uh, had

Speaker:

their main data center in one tower

Speaker:

and had their hot site in the other tower.

Speaker:

which makes perfect sense.

Speaker:

As long as nothing would take out both towers

Speaker:

Yep.

Speaker:

and.

Speaker:

So unfortunately, as we know, you know, both towers, I mean, that,

Speaker:

that's what I woke up to by the way.

Speaker:

I, I

Speaker:

woke up to my wife saying both of the, 'cause I'm on the west coast, so both

Speaker:

of the towers had already collapsed

Speaker:

as you know, as I was waking up.

Speaker:

And so people lost or companies lost their.

Speaker:

their.

Speaker:

primary site and their hot site in, in the same moment.

Speaker:

And so one of the things we learned from 9/11 is to make sure that you are,

Speaker:

if you're doing some sort of hot site or warm site, is to put that site, I'm

Speaker:

gonna say nowhere near, um, the site that you're protecting.

Speaker:

Now let's talk about that.

Speaker:

Um, there were actually some attempts, um, I lived through

Speaker:

this because I was working.

Speaker:

For companies at the time, and that is there were attempts at regulation to say,

Speaker:

If you're a bank or whatever you need to put, if you're in financial trading,

Speaker:

you need to put a copy of your data.

Speaker:

Uh, that's, that's hot over 200 miles away.

Speaker:

Yep.

Speaker:

was an attempt at regulation.

Speaker:

What's the problem with that?

Speaker:

Um, 200 miles away if you're, depending on the type of disaster isn't far enough.

Speaker:

And so that's

Speaker:

not the problem.

Speaker:

But the second is latency.

Speaker:

yeah, that's the problem, right?

Speaker:

It's just simply not feasible because the, the, the round trip time, the 200 mile

Speaker:

round trip time, uh, was just far too long that couldn't keep the data up to date.

Speaker:

depending on the

Speaker:

Based on the technology at the time,

Speaker:

Yeah.

Speaker:

Well, and I think it depends.

Speaker:

So I do so.

Speaker:

Many years after 9/11, I was working at a storage company and one of the

Speaker:

things that they did was they also like talking to financial customers, right?

Speaker:

Is many of 'em had what they would call dark fiber,

Speaker:

Yep.

Speaker:

right?

Speaker:

Where they would basically run fiber optics, a fiber network, between

Speaker:

their two sites, and it would.

Speaker:

You're right.

Speaker:

It wouldn't completely eliminate the latency, but it would definitely

Speaker:

help versus say routing it over a public network of any type.

Speaker:

Yeah, we definitely, you would.

Speaker:

I, I think that there was an assumption of dark fiber, uh, at that point, but even

Speaker:

200 miles on a straight piece of glass, speed of light has a speed of, speed of

Speaker:

light is not, it's not instantaneous.

Speaker:

It's whatever it is.

Speaker:

187,000 miles a second or whatever,

Speaker:

six hundred or something like

Speaker:

something like that.

Speaker:

Right.

Speaker:

Um, the, the round trip time is gonna be measured in milliseconds.

Speaker:

It's not gonna be, it's not, it's, it's, it's going to significantly increase

Speaker:

latency, especially if we start talking about synchronous transfer of data.

Speaker:

Right.
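
Editor's sketch: the "measured in milliseconds" point can be checked with a quick calculation. The speed of light in a vacuum is about 186,282 miles per second. The figure of roughly two-thirds of that speed inside glass fiber is a common rule of thumb and is treated here as an assumption, as is a perfectly straight 200-mile run with no switching or equipment delay. The function name `round_trip_ms` is hypothetical.

```python
SPEED_OF_LIGHT_MILES_PER_S = 186_282  # speed of light in a vacuum
FIBER_VELOCITY_FACTOR = 0.67          # assumption: light in fiber travels at about 2/3 of c


def round_trip_ms(distance_miles: float,
                  velocity_factor: float = FIBER_VELOCITY_FACTOR) -> float:
    """Best-case round-trip time over a straight fiber run, in milliseconds."""
    one_way_seconds = distance_miles / (SPEED_OF_LIGHT_MILES_PER_S * velocity_factor)
    return 2 * one_way_seconds * 1000


rtt = round_trip_ms(200)
print(f"200 miles of fiber: about {rtt:.1f} ms round trip")
# If every write must wait for that round trip, one stream of writes issued
# one after another tops out at about 1000 / rtt writes per second.
print(f"serialized synchronous writes per second: about {1000 / rtt:.0f}")
```

Under these assumptions the floor is a little over 3 ms per round trip, before any real-world routing, queuing, or storage commit time is added on top.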

Speaker:

Now let's talk about synchronous versus sacred.

Speaker:

Synchronous versus asynchronous transfer of data.

Speaker:

You want to give that a shot?

Speaker:

Yeah, so synchronous is.

Speaker:

Basically a write comes into your production site.

Speaker:

It gets forwarded over to your DR site.

Speaker:

The write gets, now there are different flavors, but typically the write gets

Speaker:

committed on the DR site acknowledged back to the primary site, and then the

Speaker:

primary site acknowledges the client during which, so you have to add up

Speaker:

basically the latency of going over the writes on both sides, coming back before

Speaker:

the client is acknowledged, in which case you're guaranteed that that write has hit

Speaker:

both sites and so the client can move on.

Speaker:

Which is great as long as it doesn't take too long.
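
Editor's sketch: the synchronous path just described can be written down as a simple latency sum. This is a toy model, not any vendor's implementation. The `Site` class, the `synchronous_write` function, and the millisecond figures are all hypothetical, and the model assumes the steps happen strictly one after another.

```python
from dataclasses import dataclass, field


@dataclass
class Site:
    name: str
    commit_ms: float                      # time to commit one write locally
    log: list = field(default_factory=list)

    def commit(self, data) -> float:
        self.log.append(data)
        return self.commit_ms


def synchronous_write(data, primary: Site, dr: Site, link_rtt_ms: float) -> float:
    """Return the latency the client sees before its write is acknowledged."""
    elapsed = primary.commit(data)  # the write lands on the production site
    elapsed += link_rtt_ms          # forward to DR and get the acknowledgment back
    elapsed += dr.commit(data)      # the DR site commits the write
    return elapsed                  # only now does the primary acknowledge the client


primary = Site("production", commit_ms=0.5)
dr = Site("dr", commit_ms=0.5)
latency = synchronous_write("txn-1", primary, dr, link_rtt_ms=3.2)
print(f"client waited about {latency:.1f} ms; both sites hold the write: "
      f"{primary.log == dr.log}")
```

The guarantee (both sites hold the write) and the cost (the client pays for the whole round trip) come from the same line: the acknowledgment is held until the end.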

Speaker:

Yes.

Speaker:

So that's synchronous and asynchronous is something, is, is uh, different than that.

Speaker:

And we, we've had some different, between you and I, we've had some

Speaker:

different understandings of different kinds of asynchronous, but go ahead.

Speaker:

Yeah, so for me, asynchronous is, well, what I would call semi synchronous, but

Speaker:

that's a different case is where, uh, you accept the write on the production,

Speaker:

you forward it over to the secondary or DR site, but, but while it's in

Speaker:

process of being committed on the other side, you can acknowledge the client.

Speaker:

So there is a lag.

Speaker:

Um, you could decide how long that lag is, depending on technology and the vendor.

Speaker:

Some allow you to specify at an IO level so you can say, I want

Speaker:

10 transactions outstanding.

Speaker:

Others allow you to do it in terms of seconds.

Speaker:

So I'm allowing up to 10 seconds or 30 seconds, um,

Speaker:

Before you start, before you start kicking back a performance issue to the client.

Speaker:

Before

Speaker:

you start Yeah.

Speaker:

Putting back pressure.

Speaker:

Now

Speaker:

interestingly, there is a mode, which I'm not sure if you're aware of,

Speaker:

uh, that some financial companies requested, which is called Domino mode.

Speaker:

Talk to me.

Speaker:

So it's a form of synchronous replication where it, because in

Speaker:

synchronous replication, typically you write to the client, it sends it over.

Speaker:

If the write fails on the secondary, it'll still accept it on the primary.

Speaker:

And acknowledge back to the client, right?

Speaker:

So you're not guaranteed that it'll stop writes if it can't write to

Speaker:

both sides At the same time, there's a mode called domino mode where if

Speaker:

it can't write to both sides, it will not acknowledge the client.

Speaker:

So that's why it's called a Domino.

Speaker:

One takes out the other.
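
Editor's sketch: the difference between the typical behavior and domino mode comes down to what happens when the secondary cannot take the write. The toy function below illustrates the two behaviors as described in the conversation. The name `replicated_write` and its parameters are hypothetical and do not correspond to any particular product's API.

```python
def replicated_write(data, primary_log: list, dr_log: list,
                     dr_available: bool, domino: bool) -> bool:
    """Return True if the client is acknowledged, False if the write is refused."""
    if dr_available:
        primary_log.append(data)
        dr_log.append(data)
        return True
    if domino:
        # Domino mode: if both sides cannot be written, nothing is accepted
        # and the client is not acknowledged.
        return False
    # Typical behavior described above: accept on the primary anyway and
    # acknowledge, leaving the DR copy behind until it recovers.
    primary_log.append(data)
    return True


primary, dr = [], []
print(replicated_write("txn-1", primary, dr, dr_available=False, domino=False))  # True
print(replicated_write("txn-2", primary, dr, dr_available=False, domino=True))   # False
print(primary, dr)  # ['txn-1'] []
```

In the non-domino case the application keeps running but the two copies diverge; in domino mode the copies never diverge, at the cost of the application stopping.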

Speaker:

I, I would think that,

Speaker:

this is one of those, this is one of those things where, you know, uh, this

Speaker:

is reality versus the idea, right?

Speaker:

To me.

Speaker:

domino mode that you described, that's synchronous.

Speaker:

Anything other than that is not synchronous, right?

Speaker:

And anything other than the domino mode that you described,

Speaker:

I would call asynchronous, right?

Speaker:

So these either synchronous, it's sort of like immutable and not immutable, right?

Speaker:

It's either synchronous or it's not synchronous.

Speaker:

And if it's synchronous, then it shouldn't acknowledge the write to the

Speaker:

client until both writes have been done.

Speaker:

And if one of them fails, then they're not done.

Speaker:

yeah.

Speaker:

So at least from most of the vendors I've seen,

Speaker:

they've never implemented it that way.

Speaker:

Yeah.

Speaker:

That's interesting.

Speaker:

Well, they're wrong.

Speaker:

So, um, uh, yeah, so that was, so that was a lesson we thought we

Speaker:

learned at the time, but we need to make sure we put it far enough away.

Speaker:

But then they were like, it's gotta be synchronous and

Speaker:

it's gotta be 200 miles away.

Speaker:

They're like, eh, it's not gonna work.

Speaker:

Right.

Speaker:

Um, nowadays, you know, you hinted at it earlier, nowadays we would do this with

Speaker:

the cloud and we can put it actually, because, you know, you, you did say that

Speaker:

your first problem was that it wasn't far enough, and that's probably true, right?

Speaker:

Because especially when we start talking about certain areas

Speaker:

like Southern Florida, right?

Speaker:

200 miles isn't far enough.

Speaker:

Um, and um, so with the cloud, you can put it.

Speaker:

Pretty much anywhere.

Speaker:

Now, if we're going to do that, if, especially if we're gonna use

Speaker:

public networks, we're pretty much going to have to use asynchronous

Speaker:

of some sort, right?

Speaker:

Uh, so we're gonna send the data and put it another place we're going to,

Speaker:

you know, like you said, you can have a buffer, you can have a certain amount

Speaker:

of time that it's allowed to get behind before it, uh, like you said, put, what

Speaker:

do you mean when you say back pressure?

Speaker:

So this is where you start to, um, elongate the time

Speaker:

before acknowledging a write.

Speaker:

So

Speaker:

to the client, because typically your client will sort of throttle itself

Speaker:

because at some point your latencies are gonna get into the seconds

Speaker:

and they'll be like, no, no, no.

Speaker:

Something's going on.

Speaker:

I'll slow down.

Speaker:

right.

Speaker:

Because otherwise you're just gonna start dropping the writes.
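
Editor's sketch: asynchronous replication with a bounded lag and back pressure can be modeled as a queue with a limit. The class below is a minimal illustration, assuming the limit is expressed as a number of outstanding writes (the "10 transactions outstanding" style mentioned earlier). `AsyncReplicator`, `write`, and `drain` are hypothetical names, and a real system would delay the acknowledgment rather than return `False`.

```python
from collections import deque


class AsyncReplicator:
    def __init__(self, max_outstanding: int):
        self.max_outstanding = max_outstanding
        self.pending = deque()  # acknowledged to the client, not yet on the DR site
        self.dr_log = []        # writes the DR site has committed

    def write(self, data) -> bool:
        """Return True if acknowledged right away, False if the client must wait."""
        if len(self.pending) >= self.max_outstanding:
            return False  # back pressure: hold the acknowledgment until DR catches up
        self.pending.append(data)
        return True

    def drain(self, count: int = 1) -> None:
        """Simulate the DR site committing up to `count` pending writes."""
        for _ in range(min(count, len(self.pending))):
            self.dr_log.append(self.pending.popleft())


replicator = AsyncReplicator(max_outstanding=2)
print(replicator.write("a"))  # True
print(replicator.write("b"))  # True
print(replicator.write("c"))  # False: the lag limit is reached
replicator.drain(1)           # the DR site catches up by one write
print(replicator.write("c"))  # True
```

Setting `max_outstanding` very high corresponds to never applying back pressure, while setting it to zero would refuse every write until the DR copy is current.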

Speaker:

And.

Speaker:

And, and this is a, a configuration choice on the part of the customer

Speaker:

where they can say, I don't want to ever put back pressure.

Speaker:

I wanna, uh, you know, that, that the data protection is less important

Speaker:

than actually getting the job done.

Speaker:

And then other clients would say, if I don't back it up, I don't

Speaker:

wanna write it in the first place.

Speaker:

Right.

Speaker:

Um, I, I, obviously, I tend to be more towards the latter than the former.

Speaker:

Yeah.

Speaker:

Uh, from, uh, most of the companies I've seen, or vendors.

Speaker:

Yeah.

Speaker:

They're not in line with you.

Speaker:

Meaning Meaning that they would just go ahead and do it anyway?

Speaker:

Yeah, because most customers Right.

Speaker:

Unless you

Speaker:

have very, very strict regulations.

Speaker:

Right.

Speaker:

They're like, best effort

Speaker:

Yeah.

Speaker:

I, I think it's

Speaker:

because they will

Speaker:

that I care about data protection more than the average

Speaker:

person,

Speaker:

Because the thing is, at some point it will catch back up.

Speaker:

Hopefully,

Speaker:

Hopefully yes.

Speaker:

Depending on what the problem was.

Speaker:

right.

Speaker:

Um, and, and again, as long as we're okay with the potential,

Speaker:

right, um, uh, I, I would think that there should still be some number.

Speaker:

number might be measured in hours if we're hours behind updating

Speaker:

our other copy, something.

Speaker:

Might be drastically wrong that we need to look at.

Speaker:

Yep.

Speaker:

And the other thing to also mention is with a lot of this.

Speaker:

High end.

Speaker:

I normally refer to it as tier one storage systems

Speaker:

Yeah,

Speaker:

because these are tier one applications with very strict requirements.

Speaker:

Um, usually also they provide the ability to do automatic failover.

Speaker:

So it's kind of, think of it like high availability plus clustering.

Speaker:

right.

Speaker:

So if you take a look at a lot of the tier one storage, right, they might have two

Speaker:

storage systems in both locations with the drives that are all interconnected.

Speaker:

So in case one unit fails, the other unit can take over the disks of the other side.

Speaker:

Um, the clients are also connected to both sides, so they don't

Speaker:

have to worry about failing over.

Speaker:

Um, I can't remember what it was called.

Speaker:

It's like the optimized and non-optimized connectivity for Fibre Channel,

Speaker:

which allows it to have a preferred path and a non-preferred path.

Speaker:

So you still

Speaker:

have connectivity, so your clients will automatically fail over, so you

Speaker:

don't have to do anything, and so your writes can still continue to happen.

Speaker:

Yeah.

Speaker:

So that's a big thing, is like, you know, we, we, we learned that

Speaker:

we should have it farther away.

Speaker:

We learned that maybe we shouldn't have it too far away, but, but,

Speaker:

but now with the, with the.

Speaker:

With the cloud, we can potentially have it pretty much anywhere, but we

Speaker:

definitely have to rely on some sort of asynchronous, uh, uh, communication.

Speaker:

When, I think about our, our friends that came on the show.

Speaker:

Talk about their experiences with disaster.

Speaker:

I think there's some, that was a major, that was a hurricane

Speaker:

that took out an island

Speaker:

and they had multiple data centers on that island.

Speaker:

One was more in the high ground than the other.

Speaker:

So one was flooded, the other was not.

Speaker:

And they were gonna recover from one data center to the other.

Speaker:

And that's, and they had everything they needed there.

Speaker:

They did.

Speaker:

They needed personnel.

Speaker:

They had to fly people to the island.

Speaker:

But other things happened that we can also learn from.

Speaker:

Do you, what do you remember from those?

Speaker:

So from, so there were two things I remember.

Speaker:

One was sort of.

Speaker:

The people process stuff, which I think we rarely think about.

Speaker:

And then the other was the technology piece.

Speaker:

So from a people process perspective, it was, do you have the right people in

Speaker:

country who have the expertise and the

Speaker:

skills to

Speaker:

recover?

Speaker:

Where are they going to sleep?

Speaker:

Where are they going to get food?

Speaker:

Right?

Speaker:

All the things that you kind of take for granted, right?

Speaker:

They had none of that.

Speaker:

Like how do they communicate?

Speaker:

Yeah.

Speaker:

He, as I recall, he, he turned a, a, um, a conference room into a

Speaker:

hotel room

Speaker:

Right.

Speaker:

And slept on a cot and ate rice and beans for two

Speaker:

weeks.

Speaker:

And he was lucky to

Speaker:

get rice and beans, right?

Speaker:

Because there were a lot of people who didn't even get that.

Speaker:

Yeah.

Speaker:

Um.

Speaker:

Yeah, that's, that is definitely a, a lesson learned is that, you

Speaker:

know, make sure to take the human element into your DR design.

Speaker:

Right.

Speaker:

Um, and the one that, the one that stands out to me was

Speaker:

the reliance on the mainland.

Speaker:

Yeah.

Speaker:

Right.

Speaker:

That when you have a true disaster, whether it's on an island or just,

Speaker:

you know, wherever, you might not be able to get connectivity to the rest

Speaker:

of your computing infrastructure.

Speaker:

And in this case, their authentication and authorization, their IAM system

Speaker:

relied on Active Directory, which

Speaker:

all of which was in, was in the mainland.

Speaker:

And um, so this is just, you know, the lesson there is to just make sure that,

Speaker:

to just take that into consideration, right?

Speaker:

Just, just realize that in a real disaster, you may be very

Speaker:

isolated from the rest of the world, and you need to take

Speaker:

that into consideration in your DR

Speaker:

design.
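
One way to take isolation into account, sketched under assumptions: list what recovery needs and where each piece lives, then ask what is out of reach if the site is cut off. The inventory below is hypothetical. Only the Active Directory entry reflects the story told here; the other entries and the function name are illustrative.

```python
# Hypothetical inventory: each recovery dependency and the site that hosts it.
dependencies = {
    "identity (Active Directory)": "mainland",
    "DNS": "mainland",          # assumed for illustration
    "backup server": "island",  # assumed for illustration
    "backup copies": "island",  # assumed for illustration
}


def unreachable_when_isolated(deps, local_site):
    """Return the dependencies that live anywhere other than local_site."""
    return sorted(name for name, site in deps.items() if site != local_site)


# Everything listed here needs a local replica or a documented workaround.
print(unreachable_when_isolated(dependencies, "island"))
```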

Speaker:

Yeah.

Speaker:

And, but I think this becomes so difficult, right, Curtis, because

Speaker:

like how many scenarios are you gonna play in your head and how much time

Speaker:

are you gonna focus on some of these?

Speaker:

Now granted, a hurricane on a tropical island is probably

Speaker:

a high likelihood event,

Speaker:

Yes.

Speaker:

Right.

Speaker:

Well, I,

Speaker:

of that, but.

Speaker:

well, I, I, I think it's, I think it's a totally, I think that there's two

Speaker:

things that you cannot count on, right.

Speaker:

Um, and I, I would just say one thing that covers a number of

Speaker:

things, and that is utilities, right?

Speaker:

You cannot count on the internet.

Speaker:

You cannot count on power.

Speaker:

You cannot count on, um, you know, uh, water,

Speaker:

right?

Speaker:

You can't count on those three things.

Speaker:

And so I, I'm just saying, I don't, I don't think you, you don't need

Speaker:

to think of all of the reasons that one or more of those might not be

Speaker:

Yeah.

Speaker:

available.

Speaker:

You just need to plan for them not being available.

Speaker:

If you don't have internet, if you don't have whatever it is you,

Speaker:

however you communicate between your sites, you are going to be isolated.

Speaker:

If you don't have power, you're gonna need to supply your own power,

Speaker:

right?

Speaker:

Um, these are just things that you can think, you can

Speaker:

think through these things, and these are all things that you can say either.

Speaker:

I'm just saying, have that discussion.

Speaker:

And say, you know what, if we don't have power, we're not gonna do anything.

Speaker:

We're not

Speaker:

gonna spend $15 billion on power generators

Speaker:

Yeah,

Speaker:

in case we, you know, we might, or, or you might be subject

Speaker:

to regulations, or you might

Speaker:

have, uh, financial reasons why downtime is enough,

Speaker:

or the cost of downtime is enough, that you're gonna pay for generators.

Speaker:

Um, do you remember how they got internet over there?

Speaker:

It was satellite.

Speaker:

Yeah.

Speaker:

satellite.

Speaker:

I like this idea of making sure that you think about what would

Speaker:

you do if the utilities that you're normally counting on are not

Speaker:

available?

Speaker:

If you get completely isolated, what would you do?

Speaker:

There are a number of reasons why that might end up being the case.

Speaker:

or the, or the people that you normally depend on are not available.

Speaker:

Yes.

Speaker:

This is why we had this little thing called documentation.

Speaker:

Yeah.

Speaker:

Um,

Speaker:

And,

Speaker:

and

Speaker:

go ahead.

Speaker:

and hopefully I know when we've had Mike, Dr. Mike on the podcast, right?

Speaker:

Um, he's talked about sort of doing tabletop exercises, right?

Speaker:

So have you done a tabletop exercise, which is like, Hey, what happens if the

Speaker:

main IT person, Susie, is unavailable?

Speaker:

Right, right.

Speaker:

And, and that's what you do.

Speaker:

You work through those various scenarios.

Speaker:

You, you hire somebody to come in as an outsider; that's the best way to do that.

Speaker:

When we start talking about ransomware, um, Dr. Mike Saylor's company would,

Speaker:

would, would be a great resource for that.

Speaker:

It's good to use a very negative person.

Speaker:

Think of the most negative person in your environment, the most

Speaker:

pess, pessimistic person.

Speaker:

And, uh, no, no

Speaker:

pessimist thinks they're negative.

Speaker:

They, they think they're, they're realists,

Speaker:

will realize

Speaker:

Um, yeah.

Speaker:

Sort of the final thought that I'm thinking is, and, and I, I remember

Speaker:

doing a disaster recovery of my own, and that is, I did it where

Speaker:

I had not yet tested the throughput of the

Speaker:

backup system that we were using, and the throughput of the backup system we

Speaker:

were using, it turned out to be crap.

Speaker:

Right?

Speaker:

And as a result, uh, that was a really hard day.

Speaker:

And in Curtis land, right?

Speaker:

This is

Speaker:

very early in my career, I learned it.

Speaker:

What's that?

Speaker:

Yes, it was the compression thing.

Speaker:

Yeah.

Speaker:

So make sure that when, when you make configuration changes to your

Speaker:

backup system or your DR system, make sure that you test that.

Speaker:

Just realize that in general, restore speed is slower than backup speed.

Speaker:

It just is.

Speaker:

It's the way it, you know it, you know, back in the day with tape, it

Speaker:

was because we were doing multiplexing.

Speaker:

Nowadays, with dedupe,

Speaker:

it's because of dedupe.

Speaker:

Um, and so just sort of plan for that.

Speaker:

But don't, don't just assume how much slower it is, uh, test it and see how much

Speaker:

slower it is and make sure you figure that into the disaster recovery plan.
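
A small sketch of how a measured restore rate feeds into the plan. The data size and both throughput figures are hypothetical; the point is that the restore number should come from a test, not be inferred from backup speed.

```python
def restore_hours(data_tb: float, restore_mb_per_s: float) -> float:
    """Hours to restore data_tb terabytes at a given rate (decimal units)."""
    seconds = data_tb * 1_000_000 / restore_mb_per_s  # 1 TB = 1,000,000 MB
    return seconds / 3600


data_tb = 20            # hypothetical amount of data to bring back
assumed_rate = 500      # MB/s, the backup speed you might be tempted to reuse
measured_rate = 200     # MB/s, what a restore test actually showed (made up)

print(f"assumed:  {restore_hours(data_tb, assumed_rate):.1f} hours")
print(f"measured: {restore_hours(data_tb, measured_rate):.1f} hours")
```

If the measured figure blows past the recovery time you promised, that is something to find out in a test, not during the disaster.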

Speaker:

But, um, so, um, with that, I think that's enough for now in terms of lessons learned

Speaker:

from just, you know, the pains of others.

Speaker:

Um.

Speaker:

Just think of the scenarios that might be, you know, that you might be subject to.

Speaker:

Um, make sure you've got at least one copy of your data

Speaker:

that's far away from everything.

Speaker:

Um, and the way to probably do that today is cloud. Tape can still play a role.

Speaker:

In fact, in our previous episode we talk about why tape is

Speaker:

still not dead in backup.

Speaker:

Uh, it certainly is on life support, but there

Speaker:

is, you know.

Speaker:

There is, there is a use for tape in backup, and it's disaster recovery.

Speaker:

So, um, um, especially when we start talking about disaster

Speaker:

recovery from a ransomware attack.

Speaker:

So, well, thanks for chatting again, my friend.

Speaker:

No, it was good.

Speaker:

Uh, one thing I'm surprised you, I thought you were gonna mention, but you did not,

Speaker:

The 3-2-1 rule.

Speaker:

1 rule.

Speaker:

You were just

Speaker:

You know what's funny is we were so close.

Speaker:

We were so close.

Speaker:

Yeah.

Speaker:

Yeah.

Speaker:

The whole 3-2-1 rule with, you know, three copies of your data on two

Speaker:

different media, one of which is offsite.
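
The rule as just stated can be written down as a quick check. This is a sketch: the `Copy` class and `meets_3_2_1` function are hypothetical names, and whether the production copy counts toward the three is a choice left to whoever builds the list.

```python
from dataclasses import dataclass


@dataclass
class Copy:
    media: str     # e.g. "disk", "tape", "cloud"
    offsite: bool  # True if stored away from the primary site


def meets_3_2_1(copies):
    """Three copies, on at least two media types, at least one offsite."""
    return (
        len(copies) >= 3
        and len({c.media for c in copies}) >= 2
        and any(c.offsite for c in copies)
    )


copies = [Copy("disk", False), Copy("disk", False), Copy("cloud", True)]
print(meets_3_2_1(copies))  # True
```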

Speaker:

That, that, that's a core design concept for anything.

Speaker:

Uh, backup and DR. And, and that's, yeah, you're right.

Speaker:

We never, we, we didn't mention it, but now we have, so you

Speaker:

have corrected our oversight.

Speaker:

Thank you very much.

Speaker:

And thank you to our listeners.

Speaker:

We'd be nothing without you.

Speaker:

That is a wrap.

Speaker:

The Backup Wrap-up is written, recorded, and produced by me, W. Curtis Preston.

Speaker:

If you need backup or DR consulting, content generation, or expert witness

Speaker:

work, check out backupcentral.com.

Speaker:

You can also find links to my O'Reilly books on the same website.

Speaker:

Remember, this is an independent podcast and any opinions that

Speaker:

you hear are those of the speaker and not necessarily an employer.

Speaker:

Thanks for listening.