Jan. 8, 2024

Backup Fails at Archive: Billion-Dollar eDiscovery Disasters

In this episode, Curtis and Prasanna do a deep dive on the differences between data backup and data archiving. They thoroughly explain that while backup focuses on restoring systems and files to a prior point in time, archiving is all about being able to search and retrieve specific information for legal or regulatory purposes.

Key reasons you'll want to tune in:

  • Learn exactly why companies archive data and how regulatory compliance and legal eDiscovery requests require specialized archive capabilities.
  • Understand the dangers of using your backup system as an archive for eDiscovery - lacking full search and exposing too much irrelevant data risks your legal case.
  • Hear multiple real-world horror stories of companies failing legal cases due to lacking proper archives - to the tune of billions of dollars lost.
  • Get clear examples of how continuous, comprehensive archiving captures all versions of files, emails, and data - including deleted and intermediate items.
  • Get a lifeline for those of you who are still using your backup system as an archive.

 

If you need to implement archiving or fix broken archive approaches that risk legal noncompliance, this episode delivers an excellent primer on how archive differs from backup and what genuine archive systems can do.

https://support.google.com/drive/thread/245861992?sjid=15540859157109248518-NC

https://support.google.com/drive/answer/14286582?sjid=8199341837463411967-NA

https://blog.23andme.com/articles/addressing-data-security-concerns

https://www.backupwrapup.com/what-is-archive-and-retrieve-backup-to-basics/

https://www.sullivanstrickler.com

Transcript

Speaker:

W. Curtis Preston: Backup is not archive and archive is

 

Speaker:

not backup. Frequent listeners to this show know that we say this all the time.

 

Speaker:

This episode,

 

Speaker:

we're going to dig a little bit deeper and we're going to talk

 

Speaker:

W. Curtis Preston: about exactly why that is the case.

 

Speaker:

It really comes down mainly to eDiscovery.

 

Speaker:

And why backup systems make a really bad tool when you get an eDiscovery request.

 

Speaker:

If you live in a place where you are likely to get an eDiscovery request,

 

Speaker:

or if you're required to keep certain information for compliance reasons,

 

Speaker:

you need to listen to this episode.

 

Speaker:

And interestingly enough, it ends up with kind of a surprise ending,

 

Speaker:

one that came to me only after editing the episode.

 

Speaker:

If this is your first time listening.

 

Speaker:

Hi, I'm W. Curtis Preston, AKA Mr.

 

Speaker:

Backup, and I've dedicated my 30-year career to backup and recovery, disaster

 

Speaker:

recovery, and anything near to that.

 

Speaker:

And this podcast is dedicated to those unappreciated backup admins.

 

Speaker:

We turn you into cyber recovery heroes.

 

Speaker:

This is The Backup Wrap-Up.

 

Speaker:

Welcome to The Backup Wrap-Up.

 

Speaker:

I'm your host, W. Curtis Preston, and I have with me someone who

 

Speaker:

apparently has much better ability to remember to do things than I do.

 

Speaker:

Prasanna Malaiyandi

 

Speaker:

W. Curtis Preston: How's it going,

 

Speaker:

Prasanna?

 

Prasanna:

I am good.

 

Prasanna:

Curtis, what did I remember that you seemed to forget?

 

Prasanna:

Oh, that's

 

Prasanna:

right.

 

Prasanna:

W. Curtis Preston: So we, uh, you know, in keeping with, you know, my

 

Prasanna:

philosophy, we do two recordings here.

 

Prasanna:

Uh, partly because we use a cloud, well, chiefly, I think because we use a cloud

 

Prasanna:

recording service that has been 98%

 

Prasanna:

reliable.

 

Prasanna:

And then sometimes it just doesn't have the recording, which is,

 

Prasanna:

which is incredibly frustrating.

 

Prasanna:

So we make a backup recording and we use something called OBS,

 

Prasanna:

which if it was just you and me, we probably could just use OBS.

 

Prasanna:

But, uh, you know, we, we have guests, and installing OBS isn't that

 

Prasanna:

helpful?

 

Prasanna:

So anyway, you remembered today.

 

Prasanna:

And

 

Prasanna:

so we have

 

Prasanna:

our backup.

 

Prasanna:

Yeah.

 

Prasanna:

And then I know we're gonna get to the news in a second, the topics, but

 

Prasanna:

I think one of the things we should also mention is, uh, not only do we

 

Prasanna:

keep the recording of OBS locally, but we also upload it to a Google Drive.

 

Prasanna:

W. Curtis Preston: we do.

 

Prasanna:

We

 

Prasanna:

And one thing though that people may not realize.

 

Prasanna:

W. Curtis Preston: Two, one buddy.

 

Prasanna:

Exactly.

 

Prasanna:

I think one thing though people don't realize is I know you created a shared

 

Prasanna:

folder for us and throw things in, and over the weekend I got into a little bit

 

Prasanna:

of a hissy fit because I was like, why are you consuming my storage, Curtis?

 

Prasanna:

W. Curtis Preston: Yeah, I don't understand that at all.

 

Prasanna:

Where I

 

Prasanna:

share a drive with

 

Prasanna:

you

 

Prasanna:

You don't share a drive.

 

Prasanna:

You share a shared folder, which is different than in enterprises

 

Prasanna:

where you create a shared drive.

 

Prasanna:

That has different association than creating a shared

 

Prasanna:

W. Curtis Preston: So you mean when you put it in there, it's still in your

 

Prasanna:

account, but it's sharing it with me.

 

Prasanna:

Oh,

 

Prasanna:

that's,

 

Prasanna:

it's basically an organization

 

Prasanna:

mechanism.

 

Prasanna:

W. Curtis Preston: Okay.

 

Prasanna:

All right.

 

Prasanna:

Well, hey, you know, you learn something. Uh, speaking of Google Drive, though,

 

Prasanna:

Uh, I think we have good news.

 

Prasanna:

I, I think this is good news.

 

Prasanna:

This is a follow up on the big Google Drive story that I, that I

 

Prasanna:

believe we reported on last week.

 

Prasanna:

And this was this thing where, uh, a bunch of Google users, there were like

 

Prasanna:

several hundred that had clicked the "this has also happened to me" button

 

Prasanna:

and where basically all of the data since May had disappeared. And

 

Prasanna:

The, the good news is that Google released a, uh, basically instructions

 

Prasanna:

and a blog post that talked about what to do, and it turned out that

 

Prasanna:

the data was never really gone.

 

Prasanna:

It was just data.

 

Prasanna:

It that, that, so first off, this.

 

Prasanna:

And you were right the first time you said it.

 

Prasanna:

When we talked about it last time, you said that this only

 

Prasanna:

affected those with Google Drive for desktop.

 

Prasanna:

And I said, well, there are some people saying that they have the

 

Prasanna:

problem and they weren't using the desktop version of Google Drive.

 

Prasanna:

And you may recall I said they may be wrong, and it looks like they were.

 

Prasanna:

So, um, basically you needed to download the updated version of Google

 

Prasanna:

Drive and then click Restore from backup, which is interesting because.

 

Prasanna:

It means that there are files that are not quite in the drive but

 

Prasanna:

are stored, still stored locally.

 

Prasanna:

These are files that were not yet synced and that basically this

 

Prasanna:

should solve your problem, um, because it only affected files that

 

Prasanna:

had not yet been synced.

 

Prasanna:

Which I want to say, um, how do you not know that your drive

 

Prasanna:

has not been syncing since May?

 

Prasanna:

Um, right.

 

Prasanna:

Yeah.

 

Prasanna:

I, I

 

Prasanna:

don't know.

 

Prasanna:

That's

 

Prasanna:

just a thought.

 

Prasanna:

that it's kind of crazy.

 

Prasanna:

It's been like seven months and no one's noticed the problem.

 

Prasanna:

Like no one, like if you're using Google Drive for desktop, you've never gone

 

Prasanna:

to the web interface and been like, Hey, let me check if my file is there.

 

Prasanna:

Try to pull it down or use the Google Mobile app.

 

Prasanna:

You know, that's kind of crazy.

 

Prasanna:

I.

 

Prasanna:

W. Curtis Preston: Yeah, I think it's thi this really kind of drills to the heart

 

Prasanna:

of why I'm not that big of a fan of Google Drive as, or any similar drive like,

 

Prasanna:

uh,

 

Prasanna:

OneDrive.

 

Prasanna:

Yeah,

 

Prasanna:

W. Curtis Preston: Drive, um, is that the syncing process

 

Prasanna:

is

 

Prasanna:

not.

 

Prasanna:

managed and reported on the way that a backup process typically is.

 

Prasanna:

So a sync failure apparently isn't bubbled up to anyone because you would think

 

Prasanna:

if they're using Google Drive desktop version, Google Drive would be flashing

 

Prasanna:

messages to them saying, Hey, you know, you haven't synced for seven months.

 

Prasanna:

You might wanna look into that.
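
The kind of alert described here can be sketched in a few lines. This is a minimal illustration, not how Google Drive or any backup product actually works: the function name `stale_sync_alert` and the one-day threshold are assumptions made up for the example.

```python
from datetime import datetime, timedelta
from typing import Optional


def stale_sync_alert(last_sync: datetime, now: datetime,
                     max_age: timedelta = timedelta(days=1)) -> Optional[str]:
    """Return a warning if the last successful sync is older than max_age.

    Hypothetical helper: a real sync or backup client would read last_sync
    from its own state and report to a central console.
    """
    age = now - last_sync
    if age > max_age:
        return f"No successful sync for {age.days} days; check the sync client."
    return None


# A client that last synced in May and is checked in December gets a warning.
print(stale_sync_alert(datetime(2023, 5, 1), datetime(2023, 12, 1)))
```

The point of the sketch is the design choice, not the arithmetic: a backup product treats "nothing has succeeded lately" as an error to surface, while a sync client may simply stay quiet.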

 

Prasanna:

Um, yeah.

 

Prasanna:

Whereas a backup app is something that allows for centralized

 

Prasanna:

monitoring and, you know, especially when we talk about, um, companies.

 

Prasanna:

Anyway, I digress that, that that isn't what I wanted to talk about today.

 

Prasanna:

I do wanna say though, kudos to Google though for actually providing an

 

Prasanna:

update on both what the issue was, quickly resolving it and providing a patch.

 

Prasanna:

W. Curtis Preston: They did figure it out.

 

Prasanna:

They did fix it.

 

Prasanna:

And it looks like, um, my, my favorite part of the instructions

 

Prasanna:

was, if you click Restore, you're gonna get, uh, one of two messages:

 

Prasanna:

either the restore was successful or you're outta space. If you get the second one,

 

Prasanna:

Um.

 

Prasanna:

You should really go get some more space.

 

Prasanna:

It just, they did like in a nice little professional way, and I

 

Prasanna:

really wanted them to go, Hey dude, man, clear out your drive, you

 

Prasanna:

know?

 

Prasanna:

Uh,

 

Prasanna:

W. Curtis Preston: Uh, anyway.

 

Prasanna:

Anyway,

 

Prasanna:

so

 

Prasanna:

You want, you wanna talk about the 23andMe

 

Prasanna:

So I came across this article recently. So it looks like 23andMe

 

Prasanna:

had a security breach recently, and it was kind of alarming what was,

 

Prasanna:

what the media was talking about.

 

Prasanna:

They were like, oh, 23andMe was breached.

 

Prasanna:

All this information is available on something like.

 

Prasanna:

I think it was like, what was it?

 

Prasanna:

6.9 million I think was the number of users and it's like, wow.

 

Prasanna:

And they didn't have any other information.

 

Prasanna:

Right.

 

Prasanna:

So it's like, 23andMe, for those not familiar, it's

 

Prasanna:

W. Curtis Preston: Sounds very

 

Prasanna:

It's like an ancestry website where you can upload your

 

Prasanna:

DNA profile and it'll help connect with other people who may be related

 

Prasanna:

to you that you may not know about.

 

Prasanna:

And so with that DNA profile, it's like, wow, if hackers got into 23andMe,

 

Prasanna:

they're able to pull down your DNA.

 

Prasanna:

And what could they do with that?

 

Prasanna:

Like all this data could be used for nefarious purposes

 

Prasanna:

and other things like that.

 

Prasanna:

W. Curtis Preston: They could

 

Prasanna:

Prasanna a clone.

 

Prasanna:

in 10 years.

 

Prasanna:

Right.

 

Prasanna:

But, and so it was very scary.

 

Prasanna:

And just yesterday or uh, yeah, yesterday they just published an

 

Prasanna:

update from 23andMe saying, Hey, we figured out what the issue was.

 

Prasanna:

And once again, like Google, I'm glad they were very transparent, so.

 

Prasanna:

W. Curtis Preston: Yeah,

 

Prasanna:

So what ended up happening was, uh, hackers basically used credential

 

Prasanna:

stuffing to access 14,000 accounts.

 

Prasanna:

Uh, what credential stuffing is, is they basically

 

Prasanna:

figured out that there was another website that was compromised.

 

Prasanna:

They downloaded those usernames and passwords, and they used those compromised

 

Prasanna:

credentials and just started, uh, using those same logins on

 

Prasanna:

23andMe.

 

Prasanna:

And like we always talk about Curtis, people should be using

 

Prasanna:

different passwords on different websites and they should definitely

 

Prasanna:

be using a password manager, right.

 

Prasanna:

W. Curtis Preston: If, if, if everybody did what we tell them to do, which is

 

Prasanna:

don't use the same password anywhere.

 

Prasanna:

And use MFA, right?

 

Prasanna:

This hack would never have happened if, if these 14,000 and

 

Prasanna:

I, I, I, I'm not, I don't know.

 

Prasanna:

It, it sounds like victim blaming.

 

Prasanna:

It is what it is.

 

Prasanna:

This is your job as a person, as a company to, to, to prevent the, you

 

Prasanna:

know, the stealing of your information.

 

Prasanna:

And if you just do all the things that you're not supposed to do, you know,

 

Prasanna:

your chances of getting hacked are much

 

Prasanna:

much higher.

 

Prasanna:

W. Curtis Preston: An interesting update to this story since we

 

Prasanna:

actually recorded this news item.

 

Prasanna:

And that is that

 

Prasanna:

23andMe

 

Prasanna:

basically came out and said what we just said, which is, you know,

 

Prasanna:

if you hadn't reused your passwords, uh, you know, from other sites on our

 

Prasanna:

site, you wouldn't have gotten hacked.

 

Prasanna:

And they got lambasted for it.

 

Prasanna:

Basically, they said, you know, 23andMe is blaming their users for their hack.

 

Prasanna:

It wasn't a hack, right?

 

Prasanna:

It wasn't a hack of 23andMe. It was, they went and got other people's

 

Prasanna:

credentials from, you know, their user's credentials from other sites.

 

Prasanna:

And they tried those credentials on 23andMe and they logged in.

 

Prasanna:

I, you know, I I'm, I have to say.

 

Prasanna:

Um, you know, I'm pretty much on 23andMe's side here. Do not reuse your credentials

 

Prasanna:

on other sites.

 

Prasanna:

That's number one. Number two, turn on MFA.

 

Prasanna:

Right?

 

Prasanna:

If, if, if you had done both or either of those things, uh, you would

 

Prasanna:

not have been subject to this hack.

 

Prasanna:

I, you know, I don't know what else to say.

 

Prasanna:

Now, so it's not as bad, right?

 

Prasanna:

So it's 14,000 people.

 

Prasanna:

They didn't go to the backend systems and pull in all the data.

 

Prasanna:

They went through the front door, right?

 

Prasanna:

And so they saw whatever that user could have seen.

 

Prasanna:

Now.

 

Prasanna:

The downside is it wasn't just those 14,000 people and

 

Prasanna:

their information, right?

 

Prasanna:

Because with 23andMe, you could say, Hey, who else am I connected to?

 

Prasanna:

And you could look at their

 

Prasanna:

DNA relatives is what they call it.

 

Prasanna:

And you can also build a family tree, which also exposes the personal data of

 

Prasanna:

people you're connected to potentially.

 

Prasanna:

W. Curtis Preston: Right.

 

Prasanna:

Yeah, it is.

 

Prasanna:

It is an optional feature, but it's probably an optional feature

 

Prasanna:

that people use quite a bit.

 

Prasanna:

And you know, it shows, you know, it shows a lot about you, right?

 

Prasanna:

You're, you know, you know, obviously things like when you last logged in and it

 

Prasanna:

shows the people that you're related to, which may expose family relationships that

 

Prasanna:

you had not intended to expose publicly.

 

Prasanna:

I.

 

Prasanna:

Uh, there, the scariest part was, it, it, it said, it talks about DNA segments.

 

Prasanna:

It, it shows the DNA segments that you have in common.

 

Prasanna:

I, I haven't looked at that specifically, so I don't know exactly what that means,

 

Prasanna:

but I, I think it's important to say that what it doesn't, what you're not

 

Prasanna:

able to do as a 23andMe customer, is to download your actual DNA profile.

 

Prasanna:

Right.

 

Prasanna:

The thing that, that I was joking about earlier, you know, cloning,

 

Prasanna:

you're not, you're not able to do that.

 

Prasanna:

And so while this sounds like a huge breach, the only thing I think that

 

Prasanna:

23andMe could have done to prevent this is to force MFA on all customers.

 

Prasanna:

And you know, it's something that you as a company, I think should think about,

 

Prasanna:

um,

 

Prasanna:

that.

 

Prasanna:

You know, especially if you have sensitive data.

 

Prasanna:

Right.

 

Prasanna:

And I, I can't think of any data more sensitive than DNA profile.

 

Prasanna:

Right.

 

Prasanna:

Um, that's about, I mean, what the customers could have done

 

Prasanna:

is use a different password and, and turn on MFA. I'm sure

 

Prasanna:

23andMe has MFA enabled or available to you, but many people might not use it.

 

Prasanna:

Apparently 14,000 people, at least 14,000 people don't use it.

 

Prasanna:

So

 

Prasanna:

I also do wonder if companies should start integrating

 

Prasanna:

with things like, what is it?

 

Prasanna:

Am I Pwned?

 

Prasanna:

right.

 

Prasanna:

right.

 

Prasanna:

Have I Been Pwned.

 

Prasanna:

Yeah.

 

Prasanna:

Am I Pwned?

 

Prasanna:

Yeah.

 

Prasanna:

Right.

 

Prasanna:

And it's those websites that I think companies should start

 

Prasanna:

thinking about integrating with and being like, Hey, there's already

 

Prasanna:

a list of breaches out there.

 

Prasanna:

So has this username and password, the hash of the password, been used elsewhere?

 

Prasanna:

So they could

 

Prasanna:

W. Curtis Preston: Oh, that's actually, I really like that

 

Prasanna:

idea Prasanna of basically.

 

Prasanna:

Proactively going through your username, um, and password database

 

Prasanna:

using the Am I Pwned, uh, database to say here is, but you won't get, I

 

Prasanna:

guess the only thing you'll be able to notify, you'll be able to notify a

 

Prasanna:

person that this username, this email address, has been pwned on another site.

 

Prasanna:

And notifying.

 

Prasanna:

I was, I was thinking for a minute there that you could figure out if

 

Prasanna:

the actual

 

Prasanna:

password

 

Prasanna:

I wonder if you could use

 

Prasanna:

W. Curtis Preston: used.

 

Prasanna:

But

 

Prasanna:

W. Curtis Preston: Well, you could use that, but they don't put the

 

Prasanna:

password in Am I Pwned, right?

 

Prasanna:

They just say that, um,

 

Prasanna:

you

 

Prasanna:

know, um,

 

Prasanna:

Maybe that's a service that they should offer.

 

Prasanna:

W. Curtis Preston: Who, Am I

 

Prasanna:

Pwned?

 

Prasanna:

For companies to integrate, be like, Hey, we have, because they

 

Prasanna:

must have the hashes, you know, because they're getting the data from somewhere.
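
The idea floated here, checking whether a password's hash already shows up in known breach data, can be sketched as follows. The service the hosts are reaching for is Have I Been Pwned, whose Pwned Passwords range endpoint takes the first five hex characters of a SHA-1 hash and returns the matching suffixes with counts, so the full password never leaves the site doing the check. The function names below are hypothetical, and the sketch assumes the check runs at login or signup, when the site briefly has the plaintext password; it is not a description of what 23andMe or any other company does.

```python
import hashlib
import urllib.request
from typing import Callable, Tuple

RANGE_URL = "https://api.pwnedpasswords.com/range/"


def split_sha1(password: str) -> Tuple[str, str]:
    """Return (first 5 hex chars, remaining 35) of the uppercase SHA-1 digest."""
    digest = hashlib.sha1(password.encode("utf-8")).hexdigest().upper()
    return digest[:5], digest[5:]


def breach_count(suffix: str, range_body: str) -> int:
    """Find our suffix in a 'SUFFIX:COUNT' response body; 0 means not listed."""
    for line in range_body.splitlines():
        candidate, _, count = line.strip().partition(":")
        if candidate.upper() == suffix:
            return int(count)
    return 0


def fetch_range(prefix: str) -> str:
    """Network call to the range endpoint; only the 5-char prefix is sent."""
    request = urllib.request.Request(
        RANGE_URL + prefix, headers={"User-Agent": "password-reuse-sketch"}
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.read().decode("utf-8")


def times_seen_in_breaches(password: str,
                           fetch: Callable[[str], str] = fetch_range) -> int:
    """How many times this password appears in the breach corpus."""
    prefix, suffix = split_sha1(password)
    return breach_count(suffix, fetch(prefix))
```

A site could call `times_seen_in_breaches(candidate_password)` when a user sets or enters a password and prompt for a change if the count is above zero. Passing a stand-in for `fetch` keeps the parsing logic testable without touching the network.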

 

Prasanna:

W. Curtis Preston: Right.

 

Prasanna:

Yeah.

 

Prasanna:

There's a, there's a, I think there's, there's some money to

 

Prasanna:

be made there, I think, right?

 

Prasanna:

Um,

 

Prasanna:

yeah.

 

Prasanna:

The one other thing I would say is, and they didn't go into a lot of

 

Prasanna:

detail, so the breach happened back in October, so it's not known how

 

Prasanna:

long it took before they saw the issue and how also the attackers were.

 

Prasanna:

Forcing themselves in.

 

Prasanna:

For instance, if the attackers, and I'm sure attackers don't do this, right, but

 

Prasanna:

if they start coming from a region that you normally don't log in from, that

 

Prasanna:

should have been flagged immediately.

 

Prasanna:

You know?

 

Prasanna:

So their intrusion detection system should have realized, Hey, you're

 

Prasanna:

logging in from the east coast of the US.

 

Prasanna:

You normally log in from the west coast.

 

Prasanna:

Yeah, we should probably check and see.

 

Prasanna:

W. Curtis Preston: I, but I, I agree.

 

Prasanna:

I, I mean, I agree, but I'm just thinking about like, I don't know any, like

 

Prasanna:

the only, the only website or service that I log into like that where it

 

Prasanna:

says, Hey, you're in a different spot, is Netflix, and the only reason that

 

Prasanna:

that's the case is because I'm not supposed to watch outside of my home.

 

Prasanna:

They're not, They're not,

 

Prasanna:

thinking

 

Prasanna:

But, but, uh, there are other websites though, like that look

 

Prasanna:

and see, okay, are you logging in, like Google, for instance, right?

 

Prasanna:

Hey, you're logging in from this remote place.

 

Prasanna:

Are you sure that's you?

 

Prasanna:

And they

 

Prasanna:

W. Curtis Preston: a different place.

 

Prasanna:

And they send you an email, right?

 

Prasanna:

Or Apple does it

 

Prasanna:

W. Curtis Preston: If, if,

 

Prasanna:

if you've

 

Prasanna:

enabled

 

Prasanna:

No, no, no.

 

Prasanna:

But it doesn't even have to be MFA, right?

 

Prasanna:

At least a notification.

 

Prasanna:

W. Curtis Preston: sure?

 

Prasanna:

Well, well, I think at least getting a notification.

 

Prasanna:

W. Curtis Preston: Okay.

 

Prasanna:

Okay.

 

Prasanna:

That, yeah, that's true.

 

Prasanna:

That's something that they could do, right?

 

Prasanna:

They could notify users.
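
A minimal sketch of that kind of notification, assuming the site keeps a per-user record of the region and browser combinations it has already seen. The class name `LoginWatcher` and the message text are made up for illustration; real services use richer signals and, as noted in this conversation, a VPN can fake the region.

```python
from collections import defaultdict
from typing import Optional


class LoginWatcher:
    """Remembers where each user has signed in from and flags anything new."""

    def __init__(self) -> None:
        # user -> set of (region, browser) pairs already seen
        self._seen = defaultdict(set)

    def record_login(self, user: str, region: str, browser: str) -> Optional[str]:
        """Return a notification for an unfamiliar sign-in, else None."""
        fingerprint = (region, browser)
        known = self._seen[user]
        first_login = not known          # nothing to compare against yet
        is_new = fingerprint not in known
        known.add(fingerprint)
        if is_new and not first_login:
            return (f"New sign-in for {user} from {region} "
                    f"using {browser}. Was this you?")
        return None


watcher = LoginWatcher()
watcher.record_login("alice", "US West", "Firefox")      # baseline, no alert
alert = watcher.record_login("alice", "Russia", "Chrome")
print(alert)
```

This only notifies; it does not block the login. Blocking or challenging the sign-in is where MFA would come in.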

 

Prasanna:

Yeah.

 

Prasanna:

Yeah.

 

Prasanna:

Hey, uh, just so you know, someone's logging in from Russia

 

Prasanna:

with your,

 

Prasanna:

um, of course

 

Prasanna:

that,

 

Prasanna:

yeah.

 

Prasanna:

From a new browser, because you can fake VPNs, right?

 

Prasanna:

You can use VPNs to fake.

 

Prasanna:

Yeah.

 

Prasanna:

So I think there are

 

Prasanna:

W. Curtis Preston: So, I, I,

 

Prasanna:

and I

 

Prasanna:

don't know what systems they have in place at 23andMe, but hopefully they are

 

Prasanna:

reconsidering what systems they have to prevent things like this from happening.

 

Prasanna:

W. Curtis Preston: Yeah, agreed.

 

Prasanna:

I, I guess just short version here is this wasn't as bad as

 

Prasanna:

we thought it was initially.

 

Prasanna:

This wasn't an attack, this wasn't a backend attack.

 

Prasanna:

It is officially a breach because it was, you know, their company.

 

Prasanna:

This was basically a bunch of people who could have prevented the breach

 

Prasanna:

by just changing, you know, using different passwords and enabling MFA.

 

Prasanna:

Um, it sounds like they could have done some additional things to, to,

 

Prasanna:

uh, ameliorate this, uh, as a company.

 

Prasanna:

And we also, we don't yet know if they, um, um, if the, like the hack, you said

 

Prasanna:

that this, this happened in October.

 

Prasanna:

Did they sit on it for two months or did they just not know

 

Prasanna:

for two months?

 

Prasanna:

That's also

 

Prasanna:

Now the one thing is they did bring in forensic

 

Prasanna:

investigators to figure out what's going on, so I will also give 'em

 

Prasanna:

W. Curtis Preston: so maybe it, maybe it took, yeah, maybe it took that long.

 

Prasanna:

All right, well that's the news of the week

 

Prasanna:

All right.

 

Prasanna:

This week I thought we would dive, uh, a little bit deeper into archive.

 

Prasanna:

If you want the basics of the difference between backup and archive,

 

Prasanna:

there are a couple of episodes from a little bit, a little bit ago.

 

Prasanna:

Um, it just says "What is archive and retrieve," as opposed

 

Prasanna:

to "What is backup and restore."

 

Prasanna:

We will put that in the show notes, right?

 

Prasanna:

In case you want to go back and listen to that.

 

Prasanna:

W. Curtis Preston: yeah, we'll put a link.

 

Prasanna:

Yeah.

 

Prasanna:

Thanks.

 

Prasanna:

We'll put a link in the show notes to that, uh, episode.

 

Prasanna:

If, if you wanna go listen to that before you wanna listen to this, we're

 

Prasanna:

gonna go a little bit deeper into what archive is, 'cause it is very different.

 

Prasanna:

What, um, do, do you remember how I, how I, um, sort of separate the two?

 

Prasanna:

What was it?

 

Prasanna:

It was, I believe one of the points you talk about is backup is

 

Prasanna:

restoring your, what your environment looks like to a point in time

 

Prasanna:

that plausibly existed, right?

 

Prasanna:

Archive is you get a whole bunch of data back and the way you

 

Prasanna:

search is very different, right?

 

Prasanna:

Usually backup is, I wanna go back to a point in time. Archive is, I'm looking

 

Prasanna:

for all the emails from Curtis or those types of things where you're, what you

 

Prasanna:

end up with from an archive perspective

 

Prasanna:

is never a plausible point in time, or it may never be a plausible

 

Prasanna:

point in time in the life of that

 

Prasanna:

W. Curtis Preston: right.

 

Prasanna:

Absolutely.

 

Prasanna:

Is that fairly accurate?

 

Prasanna:

W. Curtis Preston: That, that's absolutely it, right?

 

Prasanna:

So it's basically a, a, a backup is to do a restore and

 

Prasanna:

an archive is to do a retrieve.

 

Prasanna:

And the difference between a restore and a retrieve is that a restore

 

Prasanna:

is basically restoring your system back to a particular point in time.

 

Prasanna:

Even if it's just one file, you're restoring one file.

 

Prasanna:

If you're just doing one file.

 

Prasanna:

Technically, I suppose you could use either, but generally we're talking about

 

Prasanna:

many files, and so with a restore, you're restoring the system to the way it looked,

 

Prasanna:

generally speaking, just a few minutes ago or perhaps yesterday,

 

Prasanna:

basically to the most recent backup.

 

Prasanna:

Um, and, and possibly to something before that, especially in the

 

Prasanna:

case of a ransomware attack.

 

Prasanna:

You want to pick a particular point in time because it's before the ransomware

 

Prasanna:

attack happened, but the whole point is to bring it back to a point in time.

 

Prasanna:

That's a point in time that you know, whereas with archive, it's about, I

 

Prasanna:

need some information that matches a particular set of criteria that I.

 

Prasanna:

Um, and I don't even know where that information might be.

 

Prasanna:

Right.

 

Prasanna:

That's, that's another big difference between a backup and an archive, is

 

Prasanna:

that with a backup, we know we're restoring system A, you know, Apollo,

 

Prasanna:

we're restoring it to the way it looked yesterday. With an archive,

 

Prasanna:

we are, you know, you gave an example, we're looking for Curtis'

 

Prasanna:

emails, um, or we're looking for any documents that Curtis created

 

Prasanna:

in this span of time.

 

Prasanna:

It could be emails, it could be written documents, it could be

 

Prasanna:

drawings, it could be whatever, right?
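
The distinction drawn here between a restore and a retrieve can be made concrete with a small sketch. Everything in it is illustrative: the `Item` fields, the two function names, and the "Project Falcon" sample data are assumptions, with only the system name "apollo" taken from the conversation. Real backup and archive products have far richer catalogs and indexes.

```python
from dataclasses import dataclass
from datetime import date
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Item:
    system: str   # where the item lived, e.g. "apollo"
    author: str
    kind: str     # "email", "document", "drawing", ...
    created: date
    text: str


def restore(backups: Dict[date, List[Item]], system: str, as_of: date) -> List[Item]:
    """Backup-style: a known system, rolled back to the latest backup on or before as_of."""
    usable = [day for day in backups if day <= as_of]
    if not usable:
        return []
    return [item for item in backups[max(usable)] if item.system == system]


def retrieve(archive: List[Item], author: str, start: date, end: date,
             phrase: Optional[str] = None) -> List[Item]:
    """Archive-style: match criteria across every system and a range of time."""
    return [
        item for item in archive
        if item.author == author
        and start <= item.created <= end
        and (phrase is None or phrase.lower() in item.text.lower())
    ]


# Hypothetical sample data.
email = Item("mail", "curtis", "email", date(2023, 3, 2), "Project Falcon: what do you think?")
drawing = Item("apollo", "curtis", "drawing", date(2023, 6, 9), "Falcon housing, rev B")
memo = Item("apollo", "prasanna", "document", date(2023, 6, 10), "Lunch menu")

backups = {date(2023, 6, 9): [drawing], date(2023, 6, 10): [drawing, memo]}
archive = [email, drawing, memo]

apollo_yesterday = restore(backups, "apollo", as_of=date(2023, 6, 10))
falcon_items = retrieve(archive, "curtis", date(2023, 1, 1), date(2023, 12, 31), phrase="falcon")
```

`restore` needs to be told which system and which point in time; `retrieve` needs neither, only who, when (as a range), and what to look for, and its result mixes items from different systems that never coexisted as one snapshot.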

 

Prasanna:

A, a perfect example of this is, let's say I work for a firm where my job is

 

Prasanna:

to design stuff, design widgets. And

 

Prasanna:

what goes into designing those widgets?

 

Prasanna:

As I work there, I'm going to create drawings.

 

Prasanna:

I'm gonna create, uh, do we still call them CAD drawings?

 

Prasanna:

I don't if we even do we still, yeah, we

 

Prasanna:

accurate.

 

Prasanna:

I think it's still accurate.

 

Prasanna:

W. Curtis Preston: Yeah, I think so too.

 

Prasanna:

Um, I'm gonna create drawings of what I'm doing.

 

Prasanna:

I'm gonna create conceptual drawings, perhaps, uh, actual physical drawings.

 

Prasanna:

I'm going to have emails where I go back and forth, Prasanna,

 

Prasanna:

what do you think of this?

 

Prasanna:

And you're like, yes, I like it.

 

Prasanna:

I'm gonna have documents where I describe the requirements

 

Prasanna:

for the widget that I'm making.

 

Prasanna:

Can you think of anything else that I might

 

Prasanna:

It's like meeting recordings.

 

Prasanna:

W. Curtis Preston: Meeting recordings.

 

Prasanna:

Absolutely.

 

Prasanna:

Meeting notes where we discuss, uh, what I'm up to. Now, why

 

Prasanna:

would all of that matter?

 

Prasanna:

Because it's, let's say a couple years down the road and I've left

 

Prasanna:

the company and I've gone to a competitor, and suddenly the competitor

 

Prasanna:

comes out with an identical widget.

 

Prasanna:

And, uh, my former employer is now accusing me of stealing

 

Prasanna:

intellectual property.

 

Prasanna:

And what, so what they want is they want proof that I worked

 

Prasanna:

on that widget at this company.

 

Prasanna:

And so, um, they're going to, they want all these things.

 

Prasanna:

They want all the emails, they want all the documents, they want all the CAD

 

Prasanna:

drawings, they want all the meeting notes.

 

Prasanna:

They want all of these things.

 

Prasanna:

And they, they all have, and the widget, you know, the widget has a name.

 

Prasanna:

Right?

 

Prasanna:

That's one of the reasons we have project names.

 

Prasanna:

Right.

 

Prasanna:

Uh, the widget has a name and so we just want every email where I mention that

 

Prasanna:

project name, I want every drawing that's named after it, et cetera, et cetera.

 

Prasanna:

Or just maybe if we, if we wanna cast a wide net, we just want

 

Prasanna:

any CAD drawings that Curtis did from this time to this time.

 

Prasanna:

Can you think of anything else?

 

Prasanna:

One other key aspect is that project may not exist on any production

 

Prasanna:

storage system anywhere else, right?

 

Prasanna:

The only copy may exist in this archive system.

 

Prasanna:

And you don't even know what system it initially existed on.

 

Prasanna:

Maybe that system has been retired, right?

 

Prasanna:

You don't know.

 

Prasanna:

You don't care.

 

Prasanna:

What you're really focused on is finding everything associated with this project,

 

Prasanna:

or like you mentioned, any CAD drawing that Curtis did in this timeframe.

 

Prasanna:

W. Curtis Preston: Yeah, exactly, because this may be happening

 

Prasanna:

a few months, a few years, many years later. You don't have any knowledge

 

Prasanna:

of the servers or the applications.

 

Prasanna:

You don't remember if you were using Google Drive or perhaps maybe there was

 

Prasanna:

an outage in Google Drive and you switched over to Microsoft 365 or vice versa.

 

Prasanna:

And you, you don't remember those things, and that's why you have an archive.

 

Prasanna:

And you can go in and just ask for information.

 

Prasanna:

Show me, uh, drawings that, um, you know, show me the CAD drawings.

 

Prasanna:

Show me the emails, show me the, um, uh, any documents that Curtis was working

 

Prasanna:

on that have these phrases in them.

 

Prasanna:

Um, because that is, um.

 

Prasanna:

The, the, the thing that you touched on is a really important part is that

 

Prasanna:

you don't know where this stuff is.

 

Prasanna:

The other thing is, it's not a point in time, it is a range of time, right?

 

Prasanna:

It's going from basically anything Curtis worked on in 2023 and, um,

 

Prasanna:

because again, we've accused him of stealing intellectual property.

 

Prasanna:

We wanna show that he created it here.

 

Prasanna:

So we wanna just see everything.

 

Prasanna:

Uh, and then you have a team once you do that.

 

Prasanna:

I think the other key though with the archive is, especially depending

 

Prasanna:

on the system, it doesn't matter if Curtis had created, like, a working

 

Prasanna:

document and then deleted it, right?

 

Prasanna:

It doesn't matter any of that, because as long as the archive

 

Prasanna:

captured that copy, it's still there.

 

Prasanna:

Regardless of what you do and the time, if you delete it or whatever else, the

 

Prasanna:

archive holds that copy, so it's there,

 

Prasanna:

W. Curtis Preston: Right, right.

 

Prasanna:

So you have all these documents, right?

 

Prasanna:

From 2023, everything.

 

Prasanna:

Now you need some ability to go through, right?

 

Prasanna:

And usually you will have your legal team or someone else go through and

 

Prasanna:

sort of filter and really figure out, okay, what is relevant, what is not?

 

Prasanna:

Because there might be thousands of documents that Curtis created in 2023,

 

Prasanna:

and so now you need that ability to figure out, okay, which are the ones

 

Prasanna:

that are relevant for this IP case versus which are the ones that are not?

 

Prasanna:

Because you wanna find the needles in the haystack, right?

 

Prasanna:

Not hand everything over for discovery purposes, right?

 

Prasanna:

For legal discovery purposes.

 

Prasanna:

W. Curtis Preston: So we've talked primarily about e-discovery as one of the

 

Prasanna:

reasons we archive, and that is probably

 

Prasanna:

the primary reason I think many companies archive. They may actually have a

 

Prasanna:

regulatory requirement to archive, to basically be able to show any version,

 

Prasanna:

any, um, you know, any conversation.

 

Prasanna:

I know that, for example,

 

Prasanna:

financial trading firms have to show any conversations with customers.

 

Prasanna:

So they archive every conversation with a customer, whether it's audio

 

Prasanna:

or text or, you know, anything.

 

Prasanna:

Like they have to archive that so that they can then search it later.

 

Prasanna:

When the customer says that you promised them a

 

Prasanna:

certain financial return and you're like, uh, never did that.

 

Prasanna:

Right?

 

Prasanna:

And then it's like, uh, yeah, actually you totally did that.

 

Prasanna:

I think that probably is the primary reason many people archive, but

 

Prasanna:

if we look at it, the other reason to archive is storage management.

 

Prasanna:

You wanna talk about that?

 

Prasanna:

Yeah.

 

Prasanna:

So before we talk about why you would archive to get storage management,

 

Prasanna:

I think it's important, like you mentioned to talk about the cost.

 

Prasanna:

So production data, right?

 

Prasanna:

Sitting on tier one storage, it's very expensive, and as data ages typically

 

Prasanna:

the value of that data starts to reduce.

 

Prasanna:

So a project that you worked on three years ago, you're probably not gonna go

 

Prasanna:

back and touch it, think about it, right?

 

Prasanna:

But at the same time, you might need to access it maybe at

 

Prasanna:

some point in time, or like the example Curtis you gave, right?

 

Prasanna:

It's a widget that you created that's no longer really needed.

 

Prasanna:

You don't need to keep all that data around.

 

Prasanna:

So you'll keep like the final version of the widget, but you don't need

 

Prasanna:

all the working copies and working examples that you created along the way.

 

Prasanna:

And so.

 

Prasanna:

You wanna save all that space.

 

Prasanna:

So what you could do, and there are different solutions, but most people

 

Prasanna:

would go about and say, okay, let's archive this project and move it

 

Prasanna:

to cheaper lower cost storage that doesn't need that high performance.

 

Prasanna:

At the same time, allowing me to search and find it because that

 

Prasanna:

is critical for these use cases.

 

Prasanna:

So what people do is they would want to archive old data sets, things

 

Prasanna:

that they don't actively need.

 

Prasanna:

And so you move it off to an archive system that allows you to store at a much lower

 

Prasanna:

cost than the tier one production storage.

 

Prasanna:

And typically they give you this additional functionality like

 

Prasanna:

being able to do the searches that we had talked about before.
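The storage-management idea described here (find data nobody has touched in a long time, move it somewhere cheaper, keep it findable) can be sketched roughly as follows. This is only an illustration: `FileRecord`, `select_archive_candidates`, `build_catalog`, the three-year threshold, and the sample paths are all assumptions made up for the example, not anything named in the episode or any product's API.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class FileRecord:
    # Hypothetical record describing one file on tier one storage.
    path: str
    project: str
    last_accessed: datetime
    size_bytes: int


def select_archive_candidates(files, now, min_idle_days):
    """Return files nobody has touched for at least min_idle_days."""
    cutoff = now - timedelta(days=min_idle_days)
    return [f for f in files if f.last_accessed <= cutoff]


def build_catalog(candidates):
    """Index archived files by project name so they stay findable after the move."""
    catalog = {}
    for f in candidates:
        catalog.setdefault(f.project, []).append(f.path)
    return catalog


now = datetime(2024, 1, 8)
files = [
    FileRecord("/proj/booyah/final.dwg", "booyah", datetime(2020, 6, 1), 40_000_000),
    FileRecord("/proj/booyah/draft1.dwg", "booyah", datetime(2020, 5, 1), 38_000_000),
    FileRecord("/proj/current/spec.docx", "current", datetime(2024, 1, 2), 200_000),
]

# Assumed policy for the example: anything idle for three years is a candidate.
candidates = select_archive_candidates(files, now, min_idle_days=3 * 365)
catalog = build_catalog(candidates)
freed_bytes = sum(f.size_bytes for f in candidates)
```

The catalog is the part that separates this from simply moving files to cheaper disk: the project name travels with the data, so a later search does not depend on remembering which server it lived on.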

 

Prasanna:

Now one of the challenges with sort of archive systems, right, and moving

 

Prasanna:

the data is you have to be able to identify the data before you can move it.

 

Prasanna:

And for some organizations that's very difficult to do.

 

Prasanna:

And so you will see a lot of times where people are like, okay, I'm just gonna keep

 

Prasanna:

things on my primary tier one storage.

 

Prasanna:

But within that system they have other tiers of storage.

 

Prasanna:

Like I can move it off to

 

Prasanna:

Serial ATA disk, or I could move it to object storage, and the storage

 

Prasanna:

array itself will automatically deal with all that tiering for me.

 

Prasanna:

So I don't need to worry about it.

 

Prasanna:

It's still all seamless, but it is important to note that that's not archive.

 

Prasanna:

You don't get all the capabilities to be able to find your data.

 

Prasanna:

You don't have that protection necessarily to make sure a user

 

Prasanna:

doesn't go manually delete the data.

 

Prasanna:

You don't have that ability to search all copies of all the data.

 

Prasanna:

It's really more like a backup system that tries to emulate

 

Prasanna:

archive, which we've talked about before, is not an archive system.

 

Prasanna:

W. Curtis Preston: Yeah, I would describe, I mean, you know, you can, you can do,

 

Prasanna:

basically one of the things that people do is they maintain the same structure

 

Prasanna:

as what they have on the primary side.

 

Prasanna:

They just move it into a less expensive place.

 

Prasanna:

They move it from S3 to Glacier Deep Archive.

 

Prasanna:

Right.

 

Prasanna:

And you can, you can maintain the same structure and, and you're right, that's

 

Prasanna:

long-term retention that isn't archive.

 

Prasanna:

The idea is, again, just like archive isn't backup, long-term

 

Prasanna:

retention isn't archive either.

 

Prasanna:

And there are backup systems that just move old data, old backups out to long,

 

Prasanna:

you know, to less expensive storage.

 

Prasanna:

And if you actually need those backups, then it, it brings it back again.

 

Prasanna:

That's more kind of an HSM style thing than it is archive.

 

Prasanna:

That would be hierarchical storage management, for those of you that,

 

Prasanna:

that haven't used that term before.

 

Prasanna:

Um, but a true archive

 

Prasanna:

is going to allow you to bring it back in a different way.

 

Prasanna:

Right?

 

Prasanna:

So we talked about earlier, so this is sort of the most basic type of

 

Prasanna:

archive, is we, we do that search, we identify all of these files, all of

 

Prasanna:

the CAD drawings, all of the emails, all of the documents, all of the phone

 

Prasanna:

calls, all of the, you know, all of the information having to do with

 

Prasanna:

the, um, you know, the Booyah widget.

 

Prasanna:

And then we archive that into a, uh, essentially a digital box.

 

Prasanna:

That's the way I like to call it.

 

Prasanna:

And we put all of it together so that when it's five years down the road and

 

Prasanna:

somebody says, you know, this reminds me a lot, this is one of the things we call,

 

Prasanna:

you know, institutional knowledge, right?

 

Prasanna:

This reminds me a lot of that thing Curtis was working on back in 2023.

 

Prasanna:

Remember that?

 

Prasanna:

What was it called?

 

Prasanna:

The Booyah Project, right?

 

Prasanna:

And then you go to the archive system and you search on Booyah and poof,

 

Prasanna:

there are all the emails, all the, you know, all the, um, whatever, all

 

Prasanna:

of this stuff having to do with this.

 

Prasanna:

One of the things I like to liken this to: have you ever watched

 

Prasanna:

an episode of the show Cold Case?

 

Prasanna:

You ever watch that?

 

Prasanna:

Okay,

 

Prasanna:

That's probably the one show like detective show that

 

Prasanna:

I've never seen or crime show.

 

Prasanna:

W. Curtis Preston: Yeah, so in Cold Case, every episode of Cold Case

 

Prasanna:

involves this warehouse where they have these boxes, you know those

 

Prasanna:

like off the kind of office type

 

Prasanna:

boxes?

 

Prasanna:

Yep.

 

Prasanna:

W. Curtis Preston: Yeah, the file folder boxes.

 

Prasanna:

Right.

 

Prasanna:

And um, and basically it's like.

 

Prasanna:

You know, Steve got murdered and there's a Steve murdered Steve box, right?

 

Prasanna:

And it'll say like, it'll have like date and time and basically they put

 

Prasanna:

all of the stuff, all of the evidence, all of the notes, all of the stuff,

 

Prasanna:

and they put that into a box and they put it on the cold case shelf.

 

Prasanna:

And then somebody's pulling that out.

 

Prasanna:

And again, they get all of the stuff.

 

Prasanna:

This is a digital version of that, so that when you remember and you say, um.

 

Prasanna:

Uh, you know, we want, we want to go back to that project, and

 

Prasanna:

I have a perfect example of this.

 

Prasanna:

I used to work for a, or I did some consulting work for a satellite

 

Prasanna:

company, and that satellite company used to make a lot of

 

Prasanna:

satellites for China, and then some time passed, some significant amount

 

Prasanna:

of time passed like several years.

 

Prasanna:

And then, uh, China came back and said, you, you remember that stuff

 

Prasanna:

that you made for us back in 1998?

 

Prasanna:

You remember those satellites?

 

Prasanna:

We want 20 more of them.

 

Prasanna:

We, we don't want any improvements.

 

Prasanna:

We just want, like, we want exactly the ones, those worked out really well.

 

Prasanna:

We want a, you know, we want a bunch of them, you know, we want 20 more.

 

Prasanna:

And they had an archive system.

 

Prasanna:

They were able to pull up all those drawings and then poof.

 

Prasanna:

And then just produce, uh, essentially carbon copies of what they made.

 

Prasanna:

That's

 

Prasanna:

sort of the old-school archive, and you need to be able to attach

 

Prasanna:

metadata to it, a project name, other things, so that when you're searching

 

Prasanna:

for it, you're able to find it.

 

Prasanna:

Yeah.

 

Prasanna:

Or another example is animation studios, right?

 

Prasanna:

Typically when you make a sequel, you're going back and pulling up old

 

Prasanna:

frames and old animations to reuse again, and so being able to quickly

 

Prasanna:

find those right and pull them back up saves your animators so much time

 

Prasanna:

rather than them trying to recreate it.

 

Prasanna:

W. Curtis Preston: Yeah, absolutely.

 

Prasanna:

And the other, I know that, you know, we

 

Prasanna:

have some insight into that.

 

Prasanna:

We've had somebody on here who's worked there, Jeff Rochlin.

 

Prasanna:

And, you know, we know that, that they do both sort of cloud versions.

 

Prasanna:

They also do like hard copy versions.

 

Prasanna:

They use optical media.

 

Prasanna:

Because they know they don't wanna lose it.

 

Prasanna:

Right?

 

Prasanna:

So they have multiple copies, so then they get back there.

 

Prasanna:

So that is one purpose of, uh, archive.

 

Prasanna:

But I, I'd say the more, do you agree with me that the more

 

Prasanna:

common reason is e-discovery?

 

Prasanna:

I think so.

 

Prasanna:

I think so.

 

Prasanna:

I think just going to storage management.

 

Prasanna:

I think a lot of people see storage management or storage costs as

 

Prasanna:

cheap enough not to hinder their users from moving the data and

 

Prasanna:

then them going, having to go and try to figure that all out.

 

Prasanna:

So I think it's not as big of a motivator as it is from

 

Prasanna:

a compliance and e-discovery case.

 

Prasanna:

W. Curtis Preston: Being told, you know, you need to do that

 

Prasanna:

from a compliance standpoint.

 

Prasanna:

And so, and you touched on this a little bit earlier in that.

 

Prasanna:

You were saying that if people like create something and delete something, it would

 

Prasanna:

still be in the archive. And I think that's an important distinction because

 

Prasanna:

we need to talk about the way this works.

 

Prasanna:

Basically, a real-time archive system is the only way that's gonna happen, because

 

Prasanna:

if you're just running batch archives.

 

Prasanna:

Let's say, just in the same way we run batch backups at the end of the night.

 

Prasanna:

If you're just running batch archives, you wouldn't get those

 

Prasanna:

intermediate versions, or files, or, you know, emails.

 

Prasanna:

Like, you might send an email and then you realize, oh, I said something.

 

Prasanna:

And go delete it.

 

Prasanna:

W. Curtis Preston: in that email, then you go delete the email.

 

Prasanna:

But if you have an actual email archiving system, it's watching and it's going to

 

Prasanna:

archive every single email that goes out and every single email that comes in.
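The difference between a nightly batch and a real-time archive can be shown with a small simulation. This is a sketch under assumptions: the `Mailbox` class, its `journal` list, and `nightly_batch` are invented for illustration and do not model any particular mail server or archive product.

```python
from dataclasses import dataclass, field


@dataclass
class Mailbox:
    messages: dict = field(default_factory=dict)  # live mailbox: id -> body
    journal: list = field(default_factory=list)   # real-time archive copy

    def send(self, msg_id, body):
        self.messages[msg_id] = body
        # A real-time archive copies the message the moment it is sent.
        self.journal.append((msg_id, body))

    def delete(self, msg_id):
        # Deleting from the live mailbox does not touch the journal.
        self.messages.pop(msg_id, None)


def nightly_batch(mailbox):
    """Batch capture: copies only what still exists when the job runs."""
    return dict(mailbox.messages)


mb = Mailbox()
mb.send("m1", "Quarterly numbers attached.")
mb.send("m2", "I guarantee a 20% return.")
mb.delete("m2")  # sender regrets it and deletes it before the batch job runs

batch_copy = nightly_batch(mb)
journal_ids = [msg_id for msg_id, _ in mb.journal]
```

In this toy run the batch copy never sees `m2`, while the journal still holds it, which is the intermediate-version gap being described.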

 

Prasanna:

You know what this reminds me of?

 

Prasanna:

W. Curtis Preston: go ahead.

 

Prasanna:

This reminds me of our CDP discussions because this is

 

Prasanna:

literally what you want is you want CDP for an archive use case.

 

Prasanna:

W. Curtis Preston: Yeah, exactly.

 

Prasanna:

You are essentially doing, uh, that's continuous data protection,

 

Prasanna:

for those that missed that episode.

 

Prasanna:

And you definitely want, you know, it's real time.

 

Prasanna:

It can be asynchronous, right?

 

Prasanna:

But it's real-time replication of every single

 

Prasanna:

whatever it is that you're archiving. Say you do file system archiving.

 

Prasanna:

You would need to do file system archiving.

 

Prasanna:

You would need some sort of plugin to the file system to be notified of any

 

Prasanna:

file changes, which I would assume would be available from most filers.

 

Prasanna:

Like it would be something that you would plug into a virus detection program.

 

Prasanna:

Right,

 

Prasanna:

And it doesn't have to be every single change.

 

Prasanna:

It's only when those changes have been committed, right?

 

Prasanna:

So when you close

 

Prasanna:

W. Curtis Preston: right, right.

 

Prasanna:

Right.

 

Prasanna:

Um, and so the idea is that you're

 

Prasanna:

storing every single thing that goes out, both good and bad.

 

Prasanna:

And, um, and so, and when you have a, this is sort of, I would call this

 

Prasanna:

like a real archive system, right?

 

Prasanna:

So, like, if it's for e-discovery, you need to be able

 

Prasanna:

to go in there, and there are,

 

Prasanna:

I don't know, 30, 40 different pieces of metadata attached to any particular

 

Prasanna:

item, object, whatever you wanna call it.

 

Prasanna:

Right?

 

Prasanna:

Obviously, these include things like the author; if there's a document

 

Prasanna:

name or a subject name to an email.

 

Prasanna:

If there's, uh, if it's an email, who it was sent from, who it was sent to,

 

Prasanna:

the date it was sent, uh, the content of the email itself or the document itself.

 

Prasanna:

Um, what, what else, what can you think of other

 

Prasanna:

I was thinking like if it's a document that was shared,

 

Prasanna:

like who else it was shared with.

 

Prasanna:

Who had access?

 

Prasanna:

When they had access?

 

Prasanna:

Who left comments?

 

Prasanna:

Who made updates?

 

Prasanna:

W. Curtis Preston: Right.

 

Prasanna:

All of that stuff would be shared or stored in a realtime, you know, in a

 

Prasanna:

realtime archive system so that you can say, I wanna see all the documents

 

Prasanna:

that Curtis created, and I wanna see everybody that saw those documents.

 

Prasanna:

Right?
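A query like "show me everything Curtis created and everyone who saw it" depends on that metadata being stored with each item. A minimal sketch, assuming a made-up `ArchiveItem` record with only a handful of fields (a real system would carry far more, as mentioned above); the names and sample data are hypothetical:

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class ArchiveItem:
    # Hypothetical subset of the metadata an archive might attach to an item.
    item_id: str
    kind: str          # "document", "email", ...
    author: str
    created: date
    viewers: set = field(default_factory=set)


def items_by_author(items, author, start, end):
    """Everything the author created inside a date range, whatever system it came from."""
    return [i for i in items if i.author == author and start <= i.created <= end]


def everyone_who_saw(items):
    """Union of all viewers across the matching items."""
    seen = set()
    for i in items:
        seen |= i.viewers
    return seen


items = [
    ArchiveItem("d1", "document", "curtis", date(2023, 3, 1), {"prasanna", "alex"}),
    ArchiveItem("d2", "document", "curtis", date(2023, 9, 9), {"sam"}),
    ArchiveItem("d3", "document", "prasanna", date(2023, 4, 4), {"curtis"}),
    ArchiveItem("d4", "document", "curtis", date(2021, 1, 1), {"pat"}),
]

hits = items_by_author(items, "curtis", date(2023, 1, 1), date(2023, 12, 31))
viewers = everyone_who_saw(hits)
```

Note that the query never names a server or application; it asks by person and time range, which is the point being made about archives.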

 

Prasanna:

So, so you can search by, you know, document owner, document creator,

 

Prasanna:

uh, email owner and creator, and, um.

 

Prasanna:

Let's say you have an employee who has, uh, accused a company of a hostile

 

Prasanna:

work environment, and they say, uh, I got all these emails from all kinds

 

Prasanna:

of people saying all kinds of things.

 

Prasanna:

Okay, well, show me all the emails that were sent to this person.

 

Prasanna:

Over the time that they worked here.

 

Prasanna:

And then we're gonna, and then we'll do culling again, right?

 

Prasanna:

We'll go in and we'll do, we'll do one big search to pull and just get

 

Prasanna:

a giant pile of emails, all of the emails that were sent to this person.

 

Prasanna:

And then we go in and we search for phrases and words.

 

Prasanna:

Uh, that should not be in a business email.

 

Prasanna:

Um.

 

Prasanna:

And so that's why it's a two-step

 

Prasanna:

process. You don't want to do 10 e-discovery pulls against the email.
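The two-step process (one broad pull, then culling the pile) might look something like this in miniature. It is a sketch only: the `Email` record, the `pull` and `cull` helpers, the names, and the phrase list are assumptions for the example, not the workflow of any specific e-discovery tool.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Email:
    sender: str
    recipient: str
    sent: date
    body: str


def pull(emails, recipient, start, end):
    """Step 1: one broad pull, every email sent to the person in the window."""
    return [e for e in emails if e.recipient == recipient and start <= e.sent <= end]


def cull(pile, phrases):
    """Step 2: narrow the pile to emails containing any flagged phrase."""
    lowered = [p.lower() for p in phrases]
    return [e for e in pile if any(p in e.body.lower() for p in lowered)]


emails = [
    Email("alex", "jordan", date(2022, 5, 1), "Meeting moved to 3pm."),
    Email("sam", "jordan", date(2022, 6, 2), "Nobody wants you on this team, Loser."),
    Email("sam", "pat", date(2022, 6, 3), "You are such a loser."),
    Email("alex", "jordan", date(2019, 1, 1), "Welcome aboard!"),
]

# Step 1 hits the archive once; step 2 works only on the resulting pile.
pile = pull(emails, "jordan", date(2021, 1, 1), date(2023, 12, 31))
flagged = cull(pile, ["loser", "idiot"])
```

The culling step can be repeated with different phrase lists against the same pile without going back to the archive each time.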

 

Prasanna:

Um, so why don't we, um, why don't we do this with a backup system?

 

Prasanna:

Because backup serves a different purpose, right?

 

Prasanna:

It's intended to restore data, not retrieve data.

 

Prasanna:

And there are some systems that try to do this and mix the two

 

Prasanna:

W. Curtis Preston: Probably the one that comes to my mind the

 

Prasanna:

most would be Commvault, right?

 

Prasanna:

They have a common technology engine between backup and archive.

 

Prasanna:

Um, I don't, and I'm not saying this as, like,

 

Prasanna:

I'm not saying that they don't do it, I just, I haven't

 

Prasanna:

spoken to anyone who does both.

 

Prasanna:

Theoretically it could be done with a single system and they do claim

 

Prasanna:

to do it with a single system.

 

Prasanna:

I just don't know anybody that does that.

 

Prasanna:

Uh, I'd love to hear from anybody that is using either Commvault or

 

Prasanna:

anything else that is doing both backup and archive with a single system.

 

Prasanna:

Again, it's, it's, it's theoretically possible, but generally speaking,

 

Prasanna:

backup systems don't store the data

 

Prasanna:

in a way that, um, you can search against it like that.

 

Prasanna:

The first thing a backup system is going to want to know is, okay,

 

Prasanna:

What system are you restoring?

 

Prasanna:

Uh, I don't know.

 

Prasanna:

Right.

 

Prasanna:

I'm just looking for emails.

 

Prasanna:

Oh, so you wanna restore the email server?

 

Prasanna:

What's the email server's name?

 

Prasanna:

Uh, I don't know.

 

Prasanna:

It was five years ago.

 

Prasanna:

I don't know.

 

Prasanna:

Um, and, you know, part of the way through we were on Google email

 

Prasanna:

and then we switched over to Microsoft 365 and then we were, you know, um, a

 

Prasanna:

good email archive system would have all of the email from all of those, regardless of

 

Prasanna:

which hosting provider you were using.

 

Prasanna:

The other thing also is that for a lot of these use cases, it's also

 

Prasanna:

that full text search index, right?

 

Prasanna:

To be able to find things. Not many backup products can do the full-text

 

Prasanna:

search across all your different data sets and data types, right?

 

Prasanna:

There are e-discovery products that allow you to search what people said in a video.

 

Prasanna:

And show that as, Hey, who was Curtis talking to in this

 

Prasanna:

meeting that got recorded?

 

Prasanna:

And be able to pull up the phrases and look at the transcripts.

 

Prasanna:

W. Curtis Preston: Yeah, that is a really good point.

 

Prasanna:

There are some limited backup systems that are able to restore a

 

Prasanna:

file based on its full text, right?

 

Prasanna:

You can search against the full text of a file, but again, you're

 

Prasanna:

gonna find one file, right?

 

Prasanna:

Um, the, and again, generally speaking, you start with the system you're

 

Prasanna:

restoring, and then you, uh, cull, if you will, from there.

 

Prasanna:

Whereas this, an archive system, is casting a

 

Prasanna:

much wider net.

 

Prasanna:

The one question I had for you, Curtis, is

 

Prasanna:

W. Curtis Preston: Yeah.

 

Prasanna:

typically who operates the archive system versus who

 

Prasanna:

operates the backup system?

 

Prasanna:

W. Curtis Preston: Well, it's going to, the answer to that

 

Prasanna:

question will depend on

 

Prasanna:

whether we did this for storage management purposes or for e-discovery

 

Prasanna:

purposes, if we're doing it for storage management purposes, uh, it

 

Prasanna:

can be just about anybody, right?

 

Prasanna:

That is qualified to operate it.

 

Prasanna:

But if we're doing it for e-discovery purposes or compliance purposes, it's

 

Prasanna:

going to be someone that is specifically trained in compliance and to make sure

 

Prasanna:

that they have the right requirements to make sure that you have both

 

Prasanna:

the initial creation of the archive and then the retention of the archive, because

 

Prasanna:

there may actually be laws and rules on

 

Prasanna:

how long you can, uh, actually retain certain amounts of data.

 

Prasanna:

And you may be told, uh, for whatever reason, a legal reason.

 

Prasanna:

There's certainly the legal hold reason, where you have to keep this

 

Prasanna:

data, but there may also be a legal reason where you're told to

 

Prasanna:

get rid of a certain set of data.
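The tension between "you must keep this" and "you must get rid of this" can be sketched as a simple disposition check. Everything here is an assumption for illustration: the `ArchivedRecord` fields, the rule ordering (legal hold first, then an ordered deletion, then the retention clock), and the seven-year period are invented, and a real policy would come from counsel and the applicable regulations, not from this example.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class ArchivedRecord:
    record_id: str
    archived_on: date
    retention_days: int
    legal_hold: bool = False          # litigation hold: must not be deleted
    deletion_ordered: bool = False    # e.g. you have been told to remove this data


def disposition(record, today):
    """Return "keep" or "delete" under the assumed, simplified rule ordering."""
    if record.legal_hold:
        return "keep"
    if record.deletion_ordered:
        return "delete"
    expires = record.archived_on + timedelta(days=record.retention_days)
    return "delete" if today >= expires else "keep"


today = date(2024, 1, 8)
seven_years = 7 * 365
records = [
    ArchivedRecord("r1", date(2015, 1, 1), seven_years),
    ArchivedRecord("r2", date(2015, 1, 1), seven_years, legal_hold=True),
    ArchivedRecord("r3", date(2023, 6, 1), seven_years, deletion_ordered=True),
    ArchivedRecord("r4", date(2023, 6, 1), seven_years),
]
decisions = {r.record_id: disposition(r, today) for r in records}
```

Even in this toy form, the decision depends on legal status rather than on storage cost or age alone, which is why it tends to sit with a compliance-trained team.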

 

Prasanna:

I can think of things like GDPR.

 

Prasanna:

Most of the time in the case of GDPR, for example, and CCPA, if you have a business

 

Prasanna:

reason to hold that data, you, you can.

 

Prasanna:

But there still may be a scenario where you're told that you need to, um, you

 

Prasanna:

know, get rid of a certain set of data.

 

Prasanna:

And these are all very specific, um, compliance-related things that

 

Prasanna:

you and I, and people that think like you and I, don't necessarily think

 

Prasanna:

about, uh, on a day-to-day basis.

 

Prasanna:

No, I think that's important because that just goes back to why backup

 

Prasanna:

and archive systems may not come together is because they are different teams

 

Prasanna:

W. Curtis Preston: Yeah.

 

Prasanna:

Yeah, a backup person, you know, a storage management person, a, a

 

Prasanna:

typical system administrator can handle a backup system.

 

Prasanna:

They quite possibly are not qualified to handle a full, uh, archive

 

Prasanna:

system, especially if it's one that's done for compliance purposes.

 

Prasanna:

And I'll just end this by telling my favorite:

 

Prasanna:

Here's what can happen if you need an email archive and you don't have one.

 

Prasanna:

Uh, I worked for a consulting company.

 

Prasanna:

It was a big consulting company that had, I don't know, they had like a few

 

Prasanna:

hundred, uh, consultants, and we were hired by, um... I actually have two stories.

 

Prasanna:

I, I, I, I think I know which story you're going to go to.

 

Prasanna:

Yeah.

 

Prasanna:

W. Curtis Preston: We were hired by a company that

 

Prasanna:

needed,

 

Prasanna:

they got an e-discovery request against their email, and what

 

Prasanna:

they had was a weekly full backup of their, uh, email in Exchange.

 

Prasanna:

And basically, what that meant was, because they

 

Prasanna:

just had a weekly backup.

 

Prasanna:

They didn't have an archive.

 

Prasanna:

It ended up costing them well over a million dollars in consulting

 

Prasanna:

time because we needed like this team of like 15 people.

 

Prasanna:

It was like three teams of five that were working

 

Prasanna:

24 hours a day, um, to be able to do these.

 

Prasanna:

Because what it meant was you restore Exchange to this week, extract the

 

Prasanna:

stuff you want, then you wipe that and you restore it to next week.

 

Prasanna:

And it was just this very, very, uh, difficult process.
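The restore, extract, wipe, repeat loop described here can be sketched as a simulation. This is an illustration under assumptions: each weekly full is modeled as a plain dictionary of message id to body, and `extract_from_weekly_fulls` is a made-up helper; a real Exchange restore is far heavier than copying a dictionary, which is exactly why the real process took so many people.

```python
def extract_from_weekly_fulls(weekly_fulls, wanted):
    """Restore each weekly full in turn, keep new matches, then move on."""
    found = {}
    for week, mailbox in enumerate(weekly_fulls, start=1):
        restored = dict(mailbox)   # stands in for a full restore of that week
        for msg_id, body in restored.items():
            if wanted(body) and msg_id not in found:
                found[msg_id] = (week, body)
        restored.clear()           # stands in for wiping before the next restore
    return found


# Three hypothetical weekly fulls; most content repeats from week to week.
weekly_fulls = [
    {"a": "lunch on friday?", "b": "re: contract terms"},
    {"a": "lunch on friday?", "b": "re: contract terms", "c": "contract signed"},
    {"c": "contract signed"},
]

found = extract_from_weekly_fulls(weekly_fulls, lambda body: "contract" in body)
restores_needed = len(weekly_fulls)
```

Every week in scope costs one full restore, most of what comes back is data already seen, and anything sent and deleted between two fulls never appears at all.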

 

Prasanna:

But the other one is this famous case.

 

Prasanna:

And I don't wanna say the company because, well, I don't wanna

 

Prasanna:

get the company wrong, but it was a large financial trading firm.

 

Prasanna:

And what happened was it became infamous because they

 

Prasanna:

didn't have an email archive system.

 

Prasanna:

They had a backup system, and it wasn't well maintained.

 

Prasanna:

And they, they changed things over time and so it took

 

Prasanna:

them forever to satisfy

 

Prasanna:

the electronic discovery request to get the emails that the

 

Prasanna:

plaintiff in this case was looking for.

 

Prasanna:

And then at some point you then tell the judge,

 

Prasanna:

We're done.

 

Prasanna:

You know, we've satisfied the discovery request.

 

Prasanna:

And then a little bit later they came back and they're like, sorry, judge.

 

Prasanna:

Uh, we found this other box of tapes.

 

Prasanna:

Right?

 

Prasanna:

And at that point, it had already gone on a really long time.

 

Prasanna:

And then they'd said they were done and then they, it

 

Prasanna:

turned out they weren't done.

 

Prasanna:

The judge ended up

 

Prasanna:

issuing what's called an adverse inference instruction, where

 

Prasanna:

they said basically whatever.

 

Prasanna:

What it is, is it's an instruction from the judge that infers something

 

Prasanna:

that is adverse to your case, hence the term adverse inference instruction.

 

Prasanna:

So they basically said, whatever the plaintiff says is on the tapes,

 

Prasanna:

it's on the tapes, because no one could possibly be this

 

Prasanna:

bad at retrieving their data.

 

Prasanna:

And so they must be doing it on purpose.

 

Prasanna:

They're trying to hide something and boom, they lost a, it was like two billion

 

Prasanna:

dollar lawsuit as a result of simply not having an email archive system.

 

Prasanna:

W. Curtis Preston (2): Since recording this episode a month ago,

 

Prasanna:

I actually learned about an 11-year-old company that has built a nice

 

Prasanna:

business that, among other things,

 

Prasanna:

helps companies who used their backup systems as archive systems.

 

Prasanna:

You know, the thing I tell you not to do.

 

Prasanna:

So if you're trying to do e-discovery using your old backup tapes,

 

Prasanna:

they'll do it for you as a service, saving you time and money on the

 

Prasanna:

extraction and culling phases.

 

Prasanna:

And reducing the amount of data that you have to give to whatever

 

Prasanna:

e-discovery system that you're using.

 

Prasanna:

All of those charge by the gigabyte.

 

Prasanna:

So everything you can reduce there, you know, goes in your favor.

 

Prasanna:

They also have a service to significantly reduce or remove your Iron Mountain

 

Prasanna:

bill while allowing you to search against any of those tapes at any time.

 

Prasanna:

They call it the intelligent tape archive.

 

Prasanna:

Also, for those of you using the Dell SourceOne archive system

 

Prasanna:

that they are sundowning,

 

Prasanna:

They've got a service for that as well.

 

Prasanna:

They can directly extract, cull, and store that for you as well.

 

Prasanna:

They are a very impressive company that really understands the

 

Prasanna:

litigation and e-discovery worlds.

 

Prasanna:

And they really surprised me with how easily they're able to extract

 

Prasanna:

data directly from backup tapes without needing the original software.

 

Prasanna:

Even if you managed to encrypt your backup tapes.

 

Prasanna:

Their name is Sullivan Strickler and I'll put a link in the show notes.

 

Prasanna:

Uh, in case you're interested.

 

Prasanna:

W. Curtis Preston: So if you have the need for it, if you have a compliance reason

 

Prasanna:

or you have other business reasons, we gave you some earlier, you know,

 

Prasanna:

this idea of do you, do you want to track who made what, you know, who

 

Prasanna:

made what, when, in case you, you know, want to be able to sue them later?

 

Prasanna:

Um, just realize a good archive is a double-edged sword in that

 

Prasanna:

if you were doing something wrong.

 

Prasanna:

It is the smoking gun.

 

Prasanna:

And you, you know, uh, it will show everything that you were

 

Prasanna:

saying, uh, to, to whom and when, and, you know, all this stuff.

 

Prasanna:

So a good email archive is really only helpful if you are

 

Prasanna:

the type of company that does the right thing.

 

Prasanna:

But, but if you've got somebody in your company that's doing the wrong

 

Prasanna:

thing, the email archive will prove that and you'll lose your case.

 

Prasanna:

But honestly, uh, maybe you should anyway.

 

Prasanna:

But here's the thing: if you are doing the right thing, a bad archive, like

 

Prasanna:

using your backup system as the archive system, it can actually do you much

 

Prasanna:

more damage. Uh, you could have trouble proving that you were right.

 

Prasanna:

Even if you were, even if you were right.

 

Prasanna:

And your company did nothing wrong.

 

Prasanna:

Uh, you could lose the lawsuit.

 

Prasanna:

Any final thoughts?

 

Prasanna:

No, I think that's, yep.

 

Prasanna:

Backup is not archive and archive is not backup.

 

Prasanna:

W. Curtis Preston: Yeah, and there are two different types of archive, right?

 

Prasanna:

There's sort of the storage management reason, and there's the real-time

 

Prasanna:

system that is for compliance reasons, that makes sure that you get every copy

 

Prasanna:

of everything and that you can search against it, uh, for the purposes of eDiscovery.

 

Prasanna:

And if you need that type, then uh, you'd better buy that type.

 

Prasanna:

Because if you ever actually need it, you're really gonna

 

Prasanna:

want that functionality.

 

Prasanna:

All right.

 

Prasanna:

Hopefully that's helped with your questions about archive.

 

Prasanna:

Thanks for joining.

 

Prasanna:

That is a wrap.

 

Prasanna:

The Backup Wrap-Up is written, recorded, and produced by me, W. Curtis Preston.

 

Prasanna:

If you need backup or DR

 

Prasanna:

consulting, content generation, or expert witness work,

 

Prasanna:

check out backupcentral.com.

 

Prasanna:

You can also find links to my O'Reilly books on the same website.

 

Prasanna:

Remember, this is an independent podcast and any opinions that you

 

Prasanna:

hear are those of the speaker

 

Prasanna:

and not necessarily an employer.

 

Prasanna:

Thanks for listening.