Disaster Recovery Lessons from 9/11 and Beyond

In this eye-opening episode of The Backup Wrap-up, W. Curtis Preston and Prasanna Malaiyandi unpack crucial disaster recovery lessons from major events like 9/11. They discuss how companies lost both primary and backup data centers when both World Trade Center towers fell, highlighting why geographic separation is non-negotiable. The hosts break down the technical aspects of disaster recovery, comparing hot sites versus cold sites, and the realities of synchronous versus asynchronous replication across distances.
Beyond the technical, Curtis and Prasanna share often-overlooked disaster recovery lessons about human factors—where recovery teams will sleep, eat, and work during extended outages when infrastructure is destroyed. They examine a real case from a hurricane-stricken island where teams converted conference rooms to sleeping quarters and relied on satellite communications. Whether you're planning for natural disasters, power outages, or ransomware attacks, these disaster recovery lessons will help ensure your organization can recover when—not if—disaster strikes.
You found The Backup Wrap-up, your go-to podcast for all things
Speaker:backup recovery and cyber recovery.
Speaker:In this episode, we talk about some hard earned disaster recovery lessons
Speaker:from major events like 9/11.
Speaker:We talk about what we learned about DR that day, and we talk about
Speaker:those critical human elements of DR.
Speaker:That people often forget, like where your recovery team is going to
Speaker:sleep when the hotels are all gone.
Speaker:Disasters happen and they're never convenient.
Speaker:Whether it's terrorists, hurricanes, or ransomware, you
Speaker:need to think through what you do
Speaker:if you're completely isolated from the world. The time to
Speaker:learn these lessons is now.
Speaker:So I hope you enjoy this episode.
Speaker:By the way, if you don't know who I am, I'm W. Curtis Preston, AKA Mr. Backup,
Speaker:and I've been passionate about backup and disaster recovery for over 30 years.
Speaker:Ever since.
Speaker:I had to tell my boss that there were no backups of the
Speaker:big database that we just lost.
Speaker:I don't want that to happen to you, and that's why I do this podcast.
Speaker:On this podcast, we turn unappreciated backup admins into Cyber Recovery Heroes.
Speaker:This is The Backup Wrap-up.
Speaker:Welcome to the show.
Speaker:Hi, I am W. Curtis Preston, AKA Mr. Backup, and I have with me a guy who
Speaker:is as lazy as I am when it comes to how to move rugs around the house.
Speaker:Prasanna Malaiyandi, how's it going?
Speaker:Prasanna.
Speaker:I am doing well, Curtis, by the way, isn't it amazing how quiet a room gets
Speaker:when you put something on the floor?
Speaker:Yeah, I, I, I wonder if anybody, uh, if anybody notices the difference
Speaker:in sound because it, you know, and it's been, they probably just got
Speaker:so used to the other sound, right?
Speaker:Because I went with LVP, you know, upstairs and um, and then in
Speaker:my office and it just got so echoy and I put the stuff on this wall.
Speaker:And this over here is acoustic panels.
Speaker:Um, you know, behind me there are acoustic panels up on the ceiling
Speaker:and it still wasn't the same.
Speaker:And then I found, you know, I, I, I, this, this rug is from actually my
Speaker:living room because we bought a nicer rug and, uh, a nicer, bigger rug.
Speaker:And so we put that one down there and then I moved here.
Speaker:But when I moved it in, I, I, I did what I thought was like
Speaker:a really weird way to do it.
Speaker:I didn't actually move the furniture.
Speaker:I was like lifting it up and trying to do it without having to move everything out.
Speaker:And you said, well, that's exactly how I did it.
Speaker:exactly.
Speaker:Yeah.
Speaker:And I, I, I thought it was just me, but apparently.
Speaker:It wasn't.
Speaker:because at least with the rug, well, it depends on how your desk is too.
Speaker:But like you can at least unfurl part of it and just kind of like shove it down.
Speaker:Like you just lift
Speaker:up the front part and
Speaker:then you lift up the back part.
Speaker:So,
Speaker:yeah.
Speaker:I, I think different people might go, no, no, we're gonna,
Speaker:we're gonna move everything out.
Speaker:We're gonna put the rug down and we'll move everything back in.
Speaker:And they might think of that as easier, but I was like, I
Speaker:don't wanna move everything.
Speaker:well, it's more than moving.
Speaker:It's setting everything back up again.
Speaker:Yeah, I, I looked into it.
Speaker:I could have potentially moved the desk without disassembling everything.
Speaker:It was just, you know, but, um, yeah, so I, it was just nice to hear
Speaker:when I was talking to you for once.
Speaker:You weren't like, 'cause so many times I tell you about something that I'm
Speaker:doing and then you're like, that's the dumbest thing I've ever heard.
Speaker:Why would you do it that way?
Speaker:And you're like, oh no, that's, uh, that's how I do it.
Speaker:Yeah.
Speaker:Especially if you're space limited.
Speaker:Like for
Speaker:What's that?
Speaker:especially if you are space limited
Speaker:Yes, space limited is true.
Speaker:I was just looking, so I have my desk right here and then
Speaker:like that way, that way I
Speaker:have the door, but there's no way I could fit my desk outside of
Speaker:the door with it fully assembled.
Speaker:Oh yeah.
Speaker:I'd have to at least take the top off, which is just a giant chore.
Speaker:So.
Speaker:Oh yeah.
Speaker:Well, in my case, I don't have that excuse 'cause I have two French doors, so I
Speaker:wouldn't have had that excuse, but yeah.
Speaker:But it still, it still would've been, it still would've been annoying.
Speaker:And so, but we figured, we figured it out and now I have better sound.
Speaker:It's a beautiful thing.
Speaker:And, uh, so, uh, today we're gonna talk about DR and disaster recovery and
Speaker:lessons learned, and especially, um, some lessons learned from at least one
Speaker:major event, uh, one major disaster.
Speaker:And, you know, I, I don't want to, um, you know, the, these, these events,
Speaker:especially this event, disasters are, are always difficult on, on the people.
Speaker:Uh, and, and I don't want in any way make light of those disasters
Speaker:or I, I don't know, whatever, whatever I'm trying to say.
Speaker:Right.
Speaker:So proper respect to the people.
Speaker:'cause some of the, well, especially the one, the main one that we're
Speaker:gonna talk about, people died in these disasters and, um, you know,
Speaker:with respect to those people.
Speaker:Having said that.
Speaker:We can learn from the things that happened, uh, at that time.
Speaker:Right?
Speaker:And so let's talk about, and and the disaster.
Speaker:The, the main disaster I'm talking about at this point would be what
Speaker:we all called 9/11, right?
Speaker:So September 11th, 2001.
Speaker:Uh, I lived through it.
Speaker:You lived through it.
Speaker:Uh, so how old would you have been in 2001?
Speaker:I would rather not say
Speaker:I was very young.
Speaker:You were very young.
Speaker:I, I was actually in college.
Speaker:okay.
Speaker:I was here, um, and um, I was in this house and my kids were young.
Speaker:My oldest would've been, uh, seven, and my younger one would've been four.
Speaker:And what I remember was she might've been in, she might've been five,
Speaker:she might've been in preschool or might've been in kindergarten.
Speaker:What I remember, and for those that don't know, I live in North County San Diego,
Speaker:which means I live just south of Camp Pendleton, which is, uh, you know, one
Speaker:of the biggest, uh, Marine Corps bases.
Speaker:The, the idea was that we were under attack because there were multiple events
Speaker:that were happening simultaneously.
Speaker:Right.
Speaker:Multiple planes hit the, the trade center, there was the plane that hit the Pentagon.
Speaker:There was the plane that went down in Pennsylvania.
Speaker:There was this feeling that, like we were being attacked
Speaker:as a country, which we were, and living near a military base.
Speaker:I don't know what I thought I was going to accomplish by keeping my kids home
Speaker:from school, but that's what I did.
Speaker:Yeah.
Speaker:I was just like, I'm gonna, I'm gonna, I'm gonna, I'm gonna, I'm
Speaker:gonna keep my kid, like, it, it, like, I, I, I don't even know what
Speaker:I was thinking, but like, you know, thinking back on it now I'm thinking
Speaker:that probably I was thinking that, um.
Speaker:I, I, I, if the world's gonna end, I want my kids near me.
Speaker:I
Speaker:mean, it was like, it was, it was, it was, it's kind of morbid.
Speaker:But that's,
Speaker:I, I just remember that I didn't, you know, that, that I kept my kids near me.
Speaker:Um, and I remember that I knew multiple people that worked in
Speaker:the World Trade Center.
Speaker:None of them were hurt that day
Speaker:Mm.
Speaker:and for various reasons.
Speaker:One, uh, and I know one person that was in the World Trade Center that day.
Speaker:I knew other people that worked in the World Trade Center, but
Speaker:for one reason or another, chose not to go into work that day.
Speaker:Hmm.
Speaker:I know another person that was supposed to be on Flight 11, which
Speaker:was from Boston to New York, which was the flight that ended up going.
Speaker:I don't know if that's the one that went into the Pentagon or
Speaker:if that's the one that crashed in Pennsylvania, but I knew a person that
Speaker:was supposed to be on that flight.
Speaker:That person did not get on that flight.
Speaker:And, uh, and then the other person that, that I knew was the person in
Speaker:the, in the trade center, it was, uh, Michael Hingson that was, uh, the, the
Speaker:blind person that made it, that made it down thanks to his seeing eye dog.
Speaker:Um, and he, you know, he.
Speaker:Somewhat famous.
Speaker:As a result.
Speaker:He actually went on to become a motivational speaker.
Speaker:And, um, and, uh, just to, just to lighten things up, I, I'll talk about,
Speaker:uh, how I first met Michael, and that was, and I, it seems like I've told
Speaker:this story relatively recently, but I was at my very first trade show.
Speaker:This would've been in the early nineties, and it was at, it was in
Speaker:New York, um, at the Javits Center
Speaker:in Manhattan, and it was Unix Expo, and I saw, uh, these hot swappable
Speaker:disk drives, you know, where, where you
Speaker:could push a button and pull the drives out.
Speaker:And I just thought that was the coolest thing I'd ever seen.
Speaker:'cause at the, at the time that, that was unlike anything I'd ever
Speaker:seen it, it, that was really, really new and really, really cool.
Speaker:Now we just take it for granted.
Speaker:But back then it was really cool and he was, he was the one that was
Speaker:standing in the booth demonstrating these, these, uh, disk drives and
Speaker:literally his, his shtick was.
Speaker:It's so easy a blind man can do it.
Speaker:Right?
Speaker:And um, and we were like, this is amazing.
Speaker:We're like, you guys are the only ones with it.
Speaker:And he goes, yes, we are the only ones with this product.
Speaker:And then we walked through the trade center or the trade show and we saw
Speaker:several other vendors with the product.
Speaker:And one of us said to the other.
Speaker:Well, in his defense, he can't see the other vendors.
Speaker:So true.
Speaker:Um, do you, do you remember, do you have memories of that day?
Speaker:Of course You do.
Speaker:Oh yeah, so I remember, so I went to school, I was in Pittsburgh actually.
Speaker:Mm-hmm.
Speaker:And so we had a lot of friends, or I had a lot of friends whose families were
Speaker:living in the city in New York City.
Speaker:Right.
Speaker:And so I remember everyone just was very, very concerned.
Speaker:Um, and like you had mentioned, there was sort of the plane flying towards.
Speaker:Uh, over Pennsylvania.
Speaker:And so everyone in Pittsburgh was keeping a close eye being like, Hey,
Speaker:where is that plane actually going?
Speaker:Because
Speaker:it was supposed to fly over Pittsburgh,
Speaker:right.
Speaker:right?
Speaker:So I remember everyone sort of being worried, concerned because they had family
Speaker:and relatives and they weren't able, like all the phone lines were shut down, right?
Speaker:So.
Speaker:No one was able to figure out like what was going on.
Speaker:I remember going with a bunch of other folks down to the common area
Speaker:and just kind of like just being like shellshocked as we're watching the news.
Speaker:Yeah, I, yeah, absolutely.
Speaker:Um, there's a, there's a great line in the beginning of a movie that I like.
Speaker:It's become problematic now.
Speaker:Um, maybe it was always problematic, but there's a movie called Love Actually,
Speaker:and in the beginning, which is, uh, voiceover from, um, Hugh, Hugh Grant.
Speaker:And, and, and he was saying on 9/11
Speaker:there were a lot of phone calls that were made from the planes,
Speaker:and he said, to my knowledge, none of them were messages of hate.
Speaker:They were
Speaker:messages of love.
Speaker:Hmm.
Speaker:I just got a little bit of a lump there.
Speaker:Anyway, so, okay.
Speaker:So, um, when we think about that event, there is, in my world, we
Speaker:immediately started talking about.
Speaker:The, we saw the things that happened to companies, and we're gonna talk
Speaker:about, I'm gonna talk about sort of two, two different kinds of companies.
Speaker:One of them.
Speaker:So first off, let's talk about, uh, the, the difference between
Speaker:a cold site and a hot site.
Speaker:Do you, so when we talk about disaster recovery, there was this
Speaker:idea that we, that we're gonna have another site, like ready to go
Speaker:Yep.
Speaker:and we talk about cold site and the hot site.
Speaker:Do you want to talk about what that means?
Speaker:Yeah, so a hot site basically is you have a site, right?
Speaker:A disaster recovery site that is fully operational, has all the
Speaker:equipment, has everything replicated to it, and basically once you push
Speaker:the big red button, right?
Speaker:Everything sort of fails over.
Speaker:It's available, operational, all ready to go, sort of ready
Speaker:to serve traffic and take over, usually minutes to hours within a
Speaker:failure.
Speaker:right.
Speaker:And then a cold site is very much the opposite of that, right?
Speaker:It's a site that's sort of ready to start a restore.
Speaker:Uh, like I suppose there'd be a, there'd be a no site,
Speaker:Yeah.
Speaker:right?
Speaker:That's where, uh, a bad thing happened.
Speaker:And we're gonna go find some hardware to restore
Speaker:to, uh, cold site, the, the.
Speaker:Implication there is that you have some hardware ready to go, but
Speaker:you haven't restored anything.
Speaker:A warm site is somewhere in between those two things,
Speaker:I've have, have you heard of this term that some people call pilot?
Speaker:A pilot light
Speaker:Yeah.
Speaker:talk to.
Speaker:Talk to me
Speaker:So it's basically not quite cold, but not quite warm or hot, right?
Speaker:So
Speaker:it's a little bit better than cold, but not quite to the extent.
Speaker:And a lot of the trade off comes from not necessarily having to
Speaker:eat all of the costs upfront.
Speaker:So
Speaker:as an example, if your pilot site is, say, in the cloud, you might have your
Speaker:data available, but not necessarily your compute and everything else ready to go.
Speaker:Yeah, I think that I, I think I would still call that a warm site, but
Speaker:I mean, there is this concept, it comes from the concept of a pilot
Speaker:light.
Speaker:Right.
Speaker:Which for those of you that don't know, when you have a
Speaker:gas old school that
Speaker:nowadays we have electronic ignition, but there used to be this in inside,
Speaker:if you had a gas water heater or a gas furnace, there would be this
Speaker:little flame that would burn all the
Speaker:time.
Speaker:And that would, that's called your pilot light.
Speaker:And if the pilot light goes out, then you're gonna have to relight it
Speaker:because otherwise your, your heat won't work.
Speaker:Um, modern days we use.
Speaker:electronic electronic ignition,
Speaker:but, um, because a pilot light just wastes a lot of,
Speaker:yep.
Speaker:It's always running and you're always consuming gas.
Speaker:Yeah.
Speaker:Um, so there, so the, the best from a DR perspective, right?
Speaker:The, the, the, the Cadillac, if you will, is, um.
Speaker:The, I don't know if that's the right term anymore because nobody buys Cadillac, the
Speaker:Ferrari, the Rolls Royce.
Speaker:Does anybody buy, do they still make Rolls
Speaker:Oh yeah.
Speaker:Okay.
Speaker:Um, is the hot site
Speaker:that it's, it's, it's ready to go.
Speaker:Number one.
Speaker:It's ready to go when you need it, and, and two, it's kept and
Speaker:it's ready to go within a, a few.
Speaker:Minutes or seconds, right?
Speaker:It's kept as up to date as possible, as much as technology
Speaker:would allow you to do so.
Speaker:And one of the challenges with a hot site is, is latency.
Speaker:And so you might want to put the hot site
Speaker:as close
Speaker:as you can to, uh, the, the site that you're preparing
Speaker:and what happened on 9/11.
Speaker:So there were many companies that had
Speaker:their main data center in one tower
Speaker:and had their hot site in the other tower.
Speaker:which makes perfect sense.
Speaker:As long as nothing would take out both towers
Speaker:Yep.
Speaker:and.
Speaker:So unfortunately, as we know, you know, both towers, I mean, that,
Speaker:that's what I woke up to by the way.
Speaker:I, I
Speaker:woke up to my wife saying both of the, 'cause I'm on the west coast, so both
Speaker:of the towers had already collapsed
Speaker:as you know, as I was waking up.
Speaker:And so people lost or companies lost their.
Speaker:their.
Speaker:primary site and their hot site in, in the same moment.
Speaker:And so one of the things we learned from 9/11 is to make sure that,
Speaker:if you're doing some sort of hot site or warm site, you put that site, I'm
Speaker:gonna say, nowhere near, um, the site that you're protecting.
Speaker:Now let's talk about that.
Speaker:Um, there were actually some attempts, um, I lived through
Speaker:this because I was working.
Speaker:For companies at the time, and that is there were attempts at regulation to say,
Speaker:If you're a bank or whatever you need to put, if you're financial trading,
Speaker:you need to put a copy of your data.
Speaker:Uh, that's, that's hot over 200 miles away.
Speaker:Yep.
Speaker:was an attempt at regulation.
Speaker:What's the problem with that?
Speaker:Um, 200 miles away if you're, depending on the type of disaster isn't far enough.
Speaker:And so that's
Speaker:not the problem.
Speaker:But the second is latency.
Speaker:yeah, that's the problem, right?
Speaker:It's just simply not feasible because the, the, the round trip time, the 200 mile
Speaker:round trip time, uh, was just far too long to keep the data up to date.
Speaker:depending on the
Speaker:Based on the technology at the time,
Speaker:Yeah.
Speaker:Well, and I think it depends.
Speaker:So I do so.
Speaker:Many years after 9/11, I was working at a storage company and one of the
Speaker:things that they did was they also like talking to financial customers, right?
Speaker:Is many of 'em had what they would call dark fiber,
Speaker:Yep.
Speaker:right?
Speaker:Where they would basically run fiber optics, two fiber network between
Speaker:their two sites, and it would.
Speaker:You're right.
Speaker:It wouldn't completely eliminate the
Speaker:help versus say routing it over a public network of any type.
Speaker:Yeah, we definitely, you would.
Speaker:I, I think that there was an assumption of dark fiber, uh, at that point, but even
Speaker:200 miles on a straight piece of glass, speed of light has a speed of, speed of
Speaker:light is not, it's not instantaneous.
Speaker:It's whatever it is.
Speaker:187,000 miles a second or whatever,
Speaker:six hundred or something like
Speaker:something like that.
Speaker:Right.
Speaker:Um, the, the round trip time is gonna be measured in milliseconds.
Speaker:It's not gonna be, it's not, it's, it's, it's going to significantly increase
Speaker:latency, especially if we start talking about synchronous transfer of data.
Speaker:Right.
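The round-trip point above can be checked with a little arithmetic. The sketch below is a back-of-the-envelope estimate only: it assumes the speed of light in a vacuum is about 186,282 miles per second, that light in glass fiber travels at roughly two thirds of that (the `FIBER_VELOCITY_FACTOR` value is an assumption), and it ignores switching, protocol, and storage overhead. The function name `round_trip_ms` is made up for illustration.

```python
# Back-of-the-envelope estimate; all constants are approximations.
SPEED_OF_LIGHT_MILES_PER_SEC = 186_282  # speed of light in a vacuum
FIBER_VELOCITY_FACTOR = 0.67            # assumed: light in glass is roughly 2/3 as fast


def round_trip_ms(distance_miles: float) -> float:
    """Propagation-only round-trip time over fiber, in milliseconds."""
    speed_in_fiber = SPEED_OF_LIGHT_MILES_PER_SEC * FIBER_VELOCITY_FACTOR
    return 2 * distance_miles / speed_in_fiber * 1000


if __name__ == "__main__":
    for miles in (1, 200, 2500):
        print(f"{miles:>5} miles: {round_trip_ms(miles):.2f} ms round trip")
```

Under these assumptions, 200 miles adds a bit over 3 milliseconds to every synchronously replicated write before any equipment delay is counted, which is a large penalty for a storage system that otherwise acknowledges writes in well under a millisecond.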
Speaker:Now let's talk about synchronous versus,
Speaker:synchronous versus asynchronous transfer of data.
Speaker:You want to give that a shot?
Speaker:Yeah, so synchronous is.
Speaker:Basically a write comes into your production site.
Speaker:It gets forwarded over to your DR site.
Speaker:The write gets, now there are different flavors, but typically the write gets
Speaker:committed on the DR site, acknowledged back to the primary site, and then the
Speaker:primary site acknowledges the client. So you have to add up
Speaker:basically the latency of going over, the writes on both sides, and coming back before
Speaker:the client is acknowledged, in which case you're guaranteed that that write has hit
Speaker:both sites and so the client can move on.
Speaker:Which is great as long as it doesn't take too long.
Speaker:Yes.
Speaker:So that's synchronous and asynchronous is something, is, is uh, different than that.
Speaker:And we, we've had some different, between you and I, we've had some
Speaker:different understandings of different kinds of asynchronous, but go ahead.
Speaker:Yeah, so for me, asynchronous is, well, what I would call semi synchronous, but
Speaker:that's a different case is where, uh, you accept the write on the production,
Speaker:you forward it over to the secondary or DR site, but, but while it's in
Speaker:process of being committed on the other side, you can acknowledge the client.
Speaker:So there is a lag.
Speaker:Um, you could decide how long that lag is, depending on technology and the vendor.
Speaker:Some allow you to specify at an IO level so you can say, I want
Speaker:10 transactions outstanding.
Speaker:Others allow you to do it in terms of seconds.
Speaker:So I'm allowing up to 10 seconds or 30 seconds, um,
Speaker:Before you start, before you start kicking back a performance issue to the client.
Speaker:Before
Speaker:you start Yeah.
Speaker:Putting back pressure.
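The asynchronous behavior described above, where the client is acknowledged right away but only a limited number of writes may be outstanding before back pressure kicks in, can be sketched roughly as follows. This is a simplified single-threaded illustration, not any vendor's implementation: the class names `Replica` and `AsyncReplicator` and the `max_outstanding` parameter are hypothetical, and a real system would drain the queue in the background rather than inside `write`.

```python
from collections import deque


class Replica:
    """Stand-in for the DR site; it just records what it has committed."""

    def __init__(self):
        self.committed = []

    def commit(self, data):
        self.committed.append(data)


class AsyncReplicator:
    """Acknowledge writes locally and ship them to the DR site later."""

    def __init__(self, dr_site, max_outstanding=10):
        self.primary = []
        self.dr_site = dr_site
        self.pending = deque()              # accepted locally, not yet on the DR site
        self.max_outstanding = max_outstanding

    def drain_one(self):
        """Replicate the oldest pending write, if there is one."""
        if self.pending:
            self.dr_site.commit(self.pending.popleft())

    def write(self, data):
        """Accept a write; return True if the client had to wait (back pressure)."""
        stalled = False
        # Back pressure: too many writes are waiting, so the caller is held
        # up while the backlog is reduced.
        while len(self.pending) >= self.max_outstanding:
            self.drain_one()
            stalled = True
        self.primary.append(data)
        self.pending.append(data)
        return stalled


if __name__ == "__main__":
    replicator = AsyncReplicator(Replica(), max_outstanding=3)
    for i in range(5):
        waited = replicator.write(i)
        print(f"write {i}: client waited = {waited}, lag = {len(replicator.pending)}")
```

The length of `pending` is the lag in this sketch: it is the data that would be lost if the primary site disappeared at that moment. A time-based limit (for example, 10 or 30 seconds) would follow the same pattern with timestamps instead of a count.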
Speaker:Now
Speaker:interestingly, there is a mode, which I'm not sure if you're aware of,
Speaker:uh, that some financial co companies requested, which is called Domino mode.
Speaker:Talk to me.
Speaker:So it's a form of synchronous replication where it, because in
Speaker:synchronous replication, typically you write to the client, it sends it over.
Speaker:If the write fails on the secondary, it'll still accept it on the primary.
Speaker:And acknowledge back to the client, right?
Speaker:So you're not guaranteed that it'll stop writes if it can't write to
Speaker:both sides At the same time, there's a mode called domino mode where if
Speaker:it can't write to both sides, it will not acknowledge the client.
Speaker:So that's why it's called a Domino.
Speaker:One takes out the other.
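The difference between the two behaviors just described comes down to what happens when the secondary cannot take the write. The sketch below is a toy model under stated assumptions: `DrSite`, `ReplicaDown`, and `synchronous_write` are hypothetical names, the sites are plain Python lists, and the `domino` flag simply stands in for the mode described in the conversation.

```python
class ReplicaDown(Exception):
    """Raised when the DR site cannot commit a write."""


class DrSite:
    """Stand-in for the secondary site."""

    def __init__(self):
        self.available = True
        self.committed = []

    def commit(self, data):
        if not self.available:
            raise ReplicaDown("DR site unreachable")
        self.committed.append(data)


def synchronous_write(primary, dr_site, data, domino=False):
    """Return True if the write is acknowledged to the client."""
    try:
        dr_site.commit(data)
    except ReplicaDown:
        if domino:
            # Domino mode: no acknowledgment unless both sites have the write.
            return False
        # Without domino mode, fall through and accept on the primary only.
    primary.append(data)
    return True


if __name__ == "__main__":
    primary, dr = [], DrSite()
    print(synchronous_write(primary, dr, "a"))               # both sites healthy
    dr.available = False
    print(synchronous_write(primary, dr, "b"))               # primary only
    print(synchronous_write(primary, dr, "c", domino=True))  # refused
```

The trade-off is availability against protection: without domino mode the application keeps running while the two copies silently diverge, and with it a DR-site outage stops production writes.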
Speaker:I, I would think that,
Speaker:this is one of those, this is one of those things where, you know, uh, this
Speaker:is reality versus the idea, right?
Speaker:To me.
Speaker:domino mode that you described, that's synchronous.
Speaker:Anything other than that is not synchronous, right?
Speaker:And anything other than the domino mode that you described,
Speaker:I would call asynchronous, right?
Speaker:So it's either synchronous, it's sort of like immutable and not immutable, right?
Speaker:It's either synchronous or it's not synchronous.
Speaker:And if it's synchronous, then it shouldn't acknowledge the write to the
Speaker:client until both writes have been done.
Speaker:And if one of them fails, then they're not done.
Speaker:yeah.
Speaker:So at least from most of the vendors I've seen,
Speaker:they've never implemented it that way.
Speaker:Yeah.
Speaker:That's interesting.
Speaker:Well, they're wrong.
Speaker:So, um, uh, yeah, so that was, so that was a lesson we thought we
Speaker:learned at the time, but we need to make sure we put it far enough away.
Speaker:But then they were like, it's gotta be synchronous and
Speaker:it's gotta be 200 miles away.
Speaker:They're like, eh, it's not gonna work.
Speaker:Right.
Speaker:Um, nowadays, you know, you hinted at it earlier, nowadays we would do this with
Speaker:the cloud and we can put it actually, because, you know, you, you did say that
Speaker:your first problem was that it wasn't far enough, and that's probably true, right?
Speaker:Because especially when we start talking about certain areas
Speaker:like Southern Florida, right?
Speaker:200 miles isn't far enough.
Speaker:Um, and um, so with the cloud, you can put it.
Speaker:Pretty much anywhere.
Speaker:Now, if we're going to do that, if, especially if we're gonna use
Speaker:public networks, we're pretty much going to have to use asynchronous
Speaker:of some sort, right?
Speaker:Uh, so we're gonna send the data and put it another place we're going to,
Speaker:you know, like you said, you can have a buffer, you can have a certain amount
Speaker:of time that it's allowed to get behind before it, uh, like you said, put, what
Speaker:do you mean when you say back pressure?
Speaker:So this is where you start to, um, elongate the time
Speaker:before acknowledging a write.
Speaker:So
Speaker:to the client, because typically your client will sort of throttle itself
Speaker:because at some point your latencies are gonna get into the seconds
Speaker:and they'll be like, no, no, no.
Speaker:Something's going on.
Speaker:I'll slow down.
Speaker:right.
Speaker:Because otherwise you're just gonna start dropping the writes.
Speaker:And.
Speaker:And, and this is a, a configuration choice on the part of the customer
Speaker:where they can say, I don't want to ever put back pressure.
Speaker:I wanna, uh, you know, that, that the data protection is less important
Speaker:than actually getting the job done.
Speaker:And then other clients would say, if I don't back it up, I don't
Speaker:wanna write it in the first place.
Speaker:Right.
Speaker:Um, I, I, obviously, I tend to be more towards the latter than the former.
Speaker:Yeah.
Speaker:Uh, from, uh, most of the companies I've seen, or vendors.
Speaker:Yeah.
Speaker:They're not in line with you.
Speaker:Meaning Meaning that they would just go ahead and do it anyway?
Speaker:Yeah, because most customers Right.
Speaker:Unless you
Speaker:have very, very strict regulations.
Speaker:Right.
Speaker:They're like, best effort
Speaker:Yeah.
Speaker:I, I think it's
Speaker:because they will
Speaker:that I care about data protection more than the average
Speaker:person,
Speaker:Because the thing is, at some point it will catch back up.
Speaker:Hopefully,
Speaker:Hopefully yes.
Speaker:Depending on what the problem was.
Speaker:right.
Speaker:Um, and, and again, as long as we're okay with the potential,
Speaker:right, um, uh, I, I would think that there should still be some number.
Speaker:number might be measured in hours if we're hours behind updating
Speaker:our other copy, something.
Speaker:Might be drastically wrong that we need to look at.
Speaker:Yep.
Speaker:And the other thing to also mention is with a lot of this.
Speaker:High end.
Speaker:I normally refer to it as tier one storage systems
Speaker:Yeah,
Speaker:because these are tier one applications with very strict requirements.
Speaker:Um, usually also they provide the ability to do automatic failover.
Speaker:So it's kind of, think of it like high availability plus clustering.
Speaker:right.
Speaker:So if you take a look at a lot of the tier one storage, right, they might have two
Speaker:storage systems in both locations with the drives that are all interconnected.
Speaker:So in case one unit fails, the other unit can take over the disks of the other side.
Speaker:Um, the clients are also connected to both sides, so they don't
Speaker:have to worry about failing over.
Speaker:Um, I can't remember what it was called.
Speaker:It's like the optimized and non-optimized connectivity for Fibre Channel,
Speaker:which allows it to have a preferred path and a non-preferred path.
Speaker:So you still
Speaker:have connectivity, so your clients will automatically fail over, so you
Speaker:don't have to do anything, and so your writes can still continue to happen.
Speaker:Yeah.
Speaker:So that's a big thing, is like, you know, we, we, we learned that
Speaker:we should have it farther away.
Speaker:We learned that maybe we shouldn't have it too far away, but, but,
Speaker:but now with the, with the.
Speaker:With the cloud, we can potentially have it pretty much anywhere, but we
Speaker:definitely have to rely on some sort of asynchronous, uh, uh, communication.
Speaker:When, I think about our, our friends that came on the show.
Speaker:Talk about their experiences with disaster.
Speaker:I think there's some, that was a major, that was a hurricane
Speaker:that took out an island
Speaker:and they had multiple data centers on that island.
Speaker:One was more in the high ground than the other.
Speaker:So one was flooded, the other was not.
Speaker:And they were gonna recover from one data center to the other.
Speaker:And that's, and they had everything they needed there.
Speaker:They did.
Speaker:They needed personnel.
Speaker:They had to fly people to the island.
Speaker:But other things happened that we can also learn from.
Speaker:Do you, what do you remember from those?
Speaker:So from, so there were two things I remember.
Speaker:One was sort of.
Speaker:The people process stuff, which I think we rarely think about.
Speaker:And then the other was the technology piece.
Speaker:So from a people process perspective, it was, do you have the right people in
Speaker:country who have the expertise and the
Speaker:skills to
Speaker:recover?
Speaker:Where are they going to sleep?
Speaker:Where are they going to get food?
Speaker:Right?
Speaker:All the things that you kind of take for granted, right?
Speaker:They had none of that.
Speaker:Like how do they communicate
Speaker:Yeah.
Speaker:He, as I recall, he, he turned a, a, um, a conference room into a
Speaker:hotel room
Speaker:Right.
Speaker:And slept on a cot and ate rice and beans for two
Speaker:weeks.
Speaker:And he was lucky to
Speaker:get rice and beans right?
Speaker:Because there were a lot of people who didn't even get that.
Speaker:Yeah.
Speaker:Um.
Speaker:Yeah, that's, that is definitely a, a lesson learned is that, you
Speaker:know, make sure to take the human element into your DR design.
Speaker:Right.
Speaker:Um, and the one that, the one that stands out to me was
Speaker:the reliance on the mainland.
Speaker:Yeah.
Speaker:Right.
Speaker:That when you have a true disaster, whether it's on an island or just.
Speaker:You know, wherever you might not be able to get connectivity to the rest
Speaker:of your computing infrastructure.
Speaker:And in this case, their authentication and authorization, their IAM system
Speaker:relied on Active Directory, which
Speaker:all of which was in, was in the mainland.
Speaker:And um, so this is just, you know, the lesson there is to just make sure that.
Speaker:To just take that into consideration, right.
Speaker:Just the, the realize that in a real disaster, you may be very
Speaker:isolated from the rest, rest of the world, and you need to take
Speaker:that into consideration in your DR.
Speaker:Design.
Speaker:Yeah.
Speaker:And, but I think this becomes so difficult, right, Curtis, because
Speaker:like how many scenarios are you gonna play in your head and how much time
Speaker:are you gonna focus on some of these?
Speaker:Now granted, a hurricane on a tropical island is probably
Speaker:a high likelihood event,
Speaker:Yes.
Speaker:Right.
Speaker:Well, I,
Speaker:of that, but.
Speaker:Well, I think that there's two
Speaker:things that you cannot count on, right?
Speaker:Um, and I would just say one thing that covers a number of
Speaker:things, and that is utilities, right?
Speaker:You cannot count on the internet.
Speaker:You cannot count on power.
Speaker:You cannot count on, um, you know, uh, water,
Speaker:right?
Speaker:You can't count on those three things.
Speaker:And so I'm just saying, you don't need
Speaker:to think of all of the reasons that one or more of those might not be
Speaker:available.
Speaker:Yeah.
Speaker:You just need to plan for them not being available.
Speaker:If you don't have internet, if you don't have whatever it is you,
Speaker:however you communicate between your sites, you are going to be isolated.
Speaker:If you don't have power, you're gonna need to supply your own power,
Speaker:right?
Speaker:Um, you can
Speaker:think through these things, and for each of them you can decide either way.
Speaker:I'm just saying, have that discussion.
Speaker:You might say, you know what, if we don't have power, we're not gonna do anything.
Speaker:We're not
Speaker:gonna spend $15 billion on power generators
Speaker:Yeah,
Speaker:just in case. Or you might be subject
Speaker:to regulations, or you might
Speaker:have financial reasons: the cost of downtime
Speaker:is high enough that you're gonna pay for generators.
Speaker:Um, do you remember how they got internet over there?
Speaker:It was satellite.
Speaker:Yeah.
Speaker:satellite.
Speaker:I like this idea of making sure that you think about what would
Speaker:you do if the utilities that you're normally counting on are not
Speaker:available?
Speaker:If you get completely isolated, what would you do?
Speaker:There are a number of reasons why that might end up being the case.
Speaker:Or the people that you normally depend on are not available.
Speaker:Yes.
Speaker:This is why we had this little thing called documentation.
Speaker:Yeah.
Speaker:Um,
Speaker:And,
Speaker:and
Speaker:go ahead.
Speaker:And I know when we've had Dr. Mike on the podcast,
Speaker:he's talked about doing tabletop exercises, right?
Speaker:So have you done a tabletop exercise, which is like, Hey, what happens if the
Speaker:main IT person, Susie, is unavailable?
Speaker:Right, right.
Speaker:And, and that's what you do.
Speaker:You work through those various scenarios.
Speaker:You hire somebody to come in; an outsider is the best way to do that.
Speaker:When we start talking about ransomware, Dr. Mike Saylor's company would
Speaker:be a great resource for that.
Speaker:It's good to use a very negative person.
Speaker:Think of the most negative person in your environment, the most
Speaker:pessimistic person.
Speaker:And, uh, no
Speaker:pessimist thinks they're negative.
Speaker:They think they're realists.
Speaker:will realize
Speaker:Um, yeah.
Speaker:Sort of the final thought that I'm thinking of is, I remember
Speaker:doing a disaster recovery of my own where
Speaker:I had not yet tested the throughput of the
Speaker:backup system that we were using, and the throughput
Speaker:turned out to be crap.
Speaker:Right?
Speaker:And as a result, uh, that was a really hard day.
Speaker:And in Curtis land, right?
Speaker:This was
Speaker:very early in my career when I learned it.
Speaker:What's that?
Speaker:Yes, it was the compression thing.
Speaker:Yeah.
Speaker:So make sure that when you make configuration changes to your
Speaker:backup system or your DR system, you test them.
Speaker:Just realize that in general, restore speed is slower than backup speed.
Speaker:It just is.
Speaker:It's the way it is. You know, back in the day with tape, it
Speaker:was because we were doing multiplexing.
Speaker:Nowadays,
Speaker:it's because of dedupe.
Speaker:Um, and so just sort of plan for that.
Speaker:But don't just assume how much slower it is. Test it and see how much
Speaker:slower it is, and make sure you figure that into the disaster recovery plan.
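[Editor's sketch] The advice to measure restore speed and figure it into the plan comes down to simple arithmetic. The Python below is a minimal sketch, assuming you have a measured restore throughput from a real test and a target recovery time (commonly called an RTO, a term not used in this part of the episode). The function names and the example figures are hypothetical.

```python
def estimated_restore_hours(data_tb: float, restore_mb_per_s: float) -> float:
    """Hours needed to restore data_tb terabytes at a measured restore
    throughput, using decimal units (1 TB = 1,000,000 MB)."""
    if restore_mb_per_s <= 0:
        raise ValueError("restore throughput must be positive")
    total_mb = data_tb * 1_000_000
    return total_mb / restore_mb_per_s / 3600


def meets_recovery_target(data_tb: float, restore_mb_per_s: float,
                          target_hours: float) -> bool:
    """True if the estimated restore time fits within the target."""
    return estimated_restore_hours(data_tb, restore_mb_per_s) <= target_hours


# Hypothetical figures: backups run at 500 MB/s, but a test restore
# only achieved 200 MB/s for the same 10 TB data set.
assumed_from_backup_speed = estimated_restore_hours(10, 500)
measured_from_restore_test = estimated_restore_hours(10, 200)
fits_eight_hour_target = meets_recovery_target(10, 200, 8)
```

With these made-up numbers, planning from backup speed suggests roughly five and a half hours, while the measured restore speed puts it closer to fourteen, which misses an eight-hour target. That gap is why the measured number, not the assumed one, belongs in the plan.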
Speaker:So, with that, I think that's enough for now in terms of lessons learned
Speaker:from just, you know, the pains of others.
Speaker:Um.
Speaker:Just think of the scenarios that you might be subject to.
Speaker:Um, make sure you've got at least one copy of your data
Speaker:that's far away from everything.
Speaker:Um, and the way to probably do that today is cloud. Tape can still play a role.
Speaker:In fact, in our previous episode we talk about why tape is
Speaker:still not dead in backup.
Speaker:Uh, it certainly is on life support, but there
Speaker:is, you know,
Speaker:there is a use for tape in backup, and it's disaster recovery.
Speaker:Especially when we start talking about disaster
Speaker:recovery from a ransomware attack.
Speaker:So, well, thanks for chatting again, my friend.
Speaker:No, it was good.
Speaker:Uh, one thing I'm surprised about: I thought you were gonna mention it, but you did not.
Speaker:The 3-2-1 rule.
Speaker:1 rule.
Speaker:You were just
Speaker:You know what's funny is we were so close.
Speaker:We were so close.
Speaker:Yeah.
Speaker:Yeah.
Speaker:The whole 3-2-1 rule with, you know, three copies of your data on two
Speaker:different media, one of which is offsite.
Speaker:That's a core design concept for anything
Speaker:in backup and DR. And yeah, you're right.
Speaker:We didn't mention it, but now we have, so you
Speaker:have corrected our oversight.
Speaker:Thank you very much.
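[Editor's sketch] The 3-2-1 rule as stated above (three copies, two different media, one offsite) is easy to express as a check. The Python below is a minimal sketch. The `Copy` type, the `satisfies_3_2_1` function, and the example inventories are hypothetical, and whether the production copy counts as one of the three is left to however you build the list.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Copy:
    media: str      # e.g. "disk", "tape", "cloud"
    offsite: bool   # True if stored away from the primary site


def satisfies_3_2_1(copies: list[Copy]) -> bool:
    """At least three copies, on at least two media types,
    with at least one copy offsite."""
    return (
        len(copies) >= 3
        and len({c.media for c in copies}) >= 2
        and any(c.offsite for c in copies)
    )


# Hypothetical inventories for one data set.
good = [Copy("disk", False), Copy("disk", False), Copy("cloud", True)]
all_onsite = [Copy("disk", False), Copy("disk", False), Copy("tape", False)]

good_ok = satisfies_3_2_1(good)
all_onsite_ok = satisfies_3_2_1(all_onsite)
```

Here `good_ok` is true and `all_onsite_ok` is false, since three copies on two media still fail the rule when nothing is offsite. Note that "offsite" in this sketch is just a flag; the 9/11 lesson from earlier in the episode is that the offsite copy also needs real geographic distance.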
Speaker:And thank you to our listeners.
Speaker:We'd be nothing without you.
Speaker:That is a wrap.
Speaker:The Backup Wrap-up is written, recorded, and produced by me, W. Curtis Preston.
Speaker:If you need backup or DR consulting, content generation, or expert witness
Speaker:work, check out backupcentral.com.
Speaker:You can also find links to my O'Reilly books on the same website.
Speaker:Remember, this is an independent podcast and any opinions that
Speaker:you hear are those of the speaker and not necessarily an employer.
Speaker:Thanks for listening.