Oct. 28, 2024

Backup from Hell: SMB vs 400TB

Experience the backup from hell in this eye-opening episode of The Backup Wrap-up. What started as a straightforward 40TB backup spiraled into a months-long battle with 400TB of data, failing tape drives, and directories containing hundreds of millions of files.

Host W. Curtis Preston shares his first-hand account of tackling this backup from hell, including the challenges of dealing with SMB protocol limitations, tape drive failures, and the infamous "million file problem." Learn why backing up 99 million files in a single directory isn't just challenging - it's nearly impossible over standard protocols.

Discover the solutions that finally worked, from switching to disk-based backup to implementing local tar backups. Whether you're a backup admin or IT professional, this episode offers valuable insights into handling extreme backup scenarios.

Transcript

Speaker:
00:00:00

You found The Backup Wrap-up, your go-to podcast for all things

Speaker:
00:00:03

backup, recovery, and cyber recovery.

Speaker:
00:00:06

In this episode, you'll hear the harrowing tale of what I'm

Speaker:
00:00:09

calling the backup from hell.

Speaker:
00:00:12

A project that started as a simple one-time backup, a 40 terabyte

Speaker:
00:00:16

backup of two Synology boxes that turned into a 400 terabyte nightmare

Speaker:
00:00:20

that took months to complete.

Speaker:
00:00:22

We're talking hundreds of millions of files with one directory alone

Speaker:
00:00:27

containing 99 million of them.

Speaker:
00:00:29

I'll share how I dealt with failing tape drives, ridiculously slow

Speaker:
00:00:33

backup speeds, and the ultimate solution that finally got the job done.

Speaker:
00:00:38

If you've ever wondered what happens when everything that could go wrong

Speaker:
00:00:41

with the backup actually goes wrong,

Speaker:
00:00:44

this episode is for you. Plus, you'll learn some valuable lessons about what to check

Speaker:
00:00:48

before starting a massive backup job.

Speaker:
00:00:52

By the way, if you don't know who I am, I'm W. Curtis Preston, AKA Mr.

Speaker:
00:00:56

Backup, and I've been passionate about backup and recovery for

Speaker:
00:01:00

over 30 years, ever since

Speaker:
00:01:02

I had to tell my boss that we had no backups of the production

Speaker:
00:01:05

database that we just lost.

Speaker:
00:01:07

I don't want that to happen to you, and that's why I do this show.

Speaker:
00:01:11

On this podcast, we turn unappreciated backup admins into Cyber Recovery Heroes.

Speaker:
00:01:16

This is The Backup Wrap-up.

Speaker:
00:01:32

Welcome to the show, and if I could ask you to just take one quick second

Speaker:
00:01:36

and, uh, subscribe or follow us so you can make sure that you get all of this

Speaker:
00:01:40

great content, that would be great.

Speaker:
00:01:44

I'm W. Curtis Preston, AKA Mr.

Speaker:
00:01:46

Backup, and I have with me a guy that apparently owes Ben Kingsley

Speaker:
00:01:51

a huge apology, Prasanna Malaiyandin.

Speaker:
00:01:55

how's it going?

Speaker:
00:01:56

Prasanna, why do you owe

Speaker:
00:01:58

an apology?

Speaker:
00:01:59

so as everyone's probably like, who's Ben Kingsley.

Speaker:
00:02:03

So if you don't know, he is an actor and he also played Gandhi in the movie Gandhi.

Speaker:
00:02:09

He did.

Speaker:
00:02:09

Right?

Speaker:
00:02:11

And for the longest time I was a little, not upset, but like the fact that you have

Speaker:
00:02:18

like probably one of the most important Indian people in history being played

Speaker:
00:02:25

By a guy with the name Ben Kingsley.

Speaker:
00:02:27

Exactly.

Speaker:
00:02:28

Yeah.

Speaker:
00:02:29

Ben Kingsley.

Speaker:
00:02:31

And so today I found out that Ben Kingsley is actually Indian.

Speaker:
00:02:37

Half

Speaker:
00:02:37

How about that?

Speaker:
00:02:38

should say.

Speaker:
00:02:39

Yeah,

Speaker:
00:02:40

what?

Speaker:
00:02:41

he's Anglo Indian.

Speaker:
00:02:42

Anglo Indian.

Speaker:
00:02:43

Yes.

Speaker:
00:02:44

It's like us.

Speaker:
00:02:45

You and me we're Indian.

Speaker:
00:02:48

so his paternal side is from Gujarat.

Speaker:
00:02:53

Right.

Speaker:
00:02:54

And his mom's side I think is European.

Speaker:
00:02:57

His dad was a physician who was born in Kenya.

Speaker:
00:03:00

And Ben Kingsley's name is not actually Ben Kingsley.

Speaker:
00:03:04

It's like Krishna Bhanji, I think

Speaker:
00:03:07

Yeah.

Speaker:
00:03:09

Yeah.

Speaker:
00:03:09

Yeah.

Speaker:
00:03:09

And he realized that he wasn't getting called into the right casting

Speaker:
00:03:13

roles when he was looking for, when he was starting off his career.

Speaker:
00:03:17

So he is like, let me change my name.

Speaker:
00:03:19

And so he changed his name to Ben Kingsley and people started calling

Speaker:
00:03:22

him in and he started getting roles.

Speaker:
00:03:24

Racism in early Hollywood? Say it isn't so.

Speaker:
00:03:28

Racism in current Hollywood.

Speaker:
00:03:30

Say it isn't so. Wouldn't be the only one to do so.

Speaker:
00:03:34

Yeah.

Speaker:
00:03:35

yeah, so I apologize to Sir Ben Kingsley, uh, for all these years.

Speaker:
00:03:43

Yeah.

Speaker:
00:03:43

You were putting it in the same category as the quote unquote

Speaker:
00:03:46

Indian guy from the Short Circuit movie, which I don't know his name,

Speaker:
00:03:50

but he is very much not an Indian

Speaker:
00:03:53

person.

Speaker:
00:03:54

Do you know who it was?

Speaker:
00:03:55

the name?

Speaker:
00:03:56

I'm looking it up.

Speaker:
00:03:57

Or it's also like how Apu from, uh, the Simpsons is not Indian,

Speaker:
00:04:03

Yeah, he's, he's played by, um.

Speaker:
00:04:07

Oh, I know that.

Speaker:
00:04:08

I know the actor, but his name is escaping me.

Speaker:
00:04:11

So Fisher Stevens, is that

Speaker:
00:04:12

Fisher Stevens.

Speaker:
00:04:13

Yeah, Fisher Stevens.

Speaker:
00:04:15

Who?

Speaker:
00:04:15

Those of you that watch Succession

Speaker:
00:04:18

will, uh, uh, Fisher Stevens was in Succession.

Speaker:
00:04:22

He was, he was a, a lawyer, a a smarmy lawyer, which

Speaker:
00:04:26

always plays smarmy characters

Speaker:
00:04:28

Yeah, I was just thinking, because I remember him from The Blacklist

Speaker:
00:04:31

where he plays Marvin, the lawyer.

Speaker:
00:04:32

Yeah, got, he's got kind of the lawyer face.

Speaker:
00:04:36

I'm glad that you, you finally realized the error of your ways.

Speaker:
00:04:39

But did you know he was

Speaker:
00:04:41

No, no, I didn't.

Speaker:
00:04:43

I guess I always brought it up just like you, like I would bring

Speaker:
00:04:47

Ben Kingsley playing Gandhi and, um, as just another example of, uh, you

Speaker:
00:04:52

know, what would we call it, brown face, I guess we'd call it brown face.

Speaker:
00:04:57

Yeah.

Speaker:
00:04:58

Yeah.

Speaker:
00:04:58

But people taking actor and there's been a lot of those great roles throughout the

Speaker:
00:05:04

Great.

Speaker:
00:05:05

You know, great roles played by very not,

Speaker:
00:05:09

you know, people that are not of that ethnic group.

Speaker:
00:05:11

Yeah.

Speaker:
00:05:12

and I think maybe also at the time, right, there weren't many

Speaker:
00:05:16

Indian actors in Hollywood at all.

Speaker:
00:05:20

And I would rather have the fact, or I would rather it like the movie be made

Speaker:
00:05:26

with someone who is non-Indian, rather, because it's a great movie.

Speaker:
00:05:30

don't know.

Speaker:
00:05:30

You've seen it,

Speaker:
00:05:31

good movie.

Speaker:
00:05:31

Yeah.

Speaker:
00:05:32

Yeah.

Speaker:
00:05:32

So I would rather have that rather than not having the movie at all.

Speaker:
00:05:36

Hmm, I see what you're saying.

Speaker:
00:05:38

I see what you're saying.

Speaker:
00:05:39

Yeah.

Speaker:
00:05:39

And of course, you know, we have the same challenge with, uh, Asian, uh, actors,

Speaker:
00:05:45

right?

Speaker:
00:05:45

Uh, there's literally only three Chinese actors in all of Hollywood.

Speaker:
00:05:49

Like if you, if you look at like the Chinese roles, they've gone to

Speaker:
00:05:53

literally like one, there's one guy.

Speaker:
00:05:56

Uh, I forgot how many roles he's had, but he has had a prolific career playing

Speaker:
00:06:02

every Chinese person that you know.

Speaker:
00:06:04

Um, but, um, anyway, so we're gonna talk about something that we've

Speaker:
00:06:11

alluded to a little bit on the podcast.

Speaker:
00:06:14

Uh, sort of tell the final saga of what I'm calling the backup from Hell.

Speaker:
00:06:22

I may maybe, uh, we should probably phrase that slightly differently.

Speaker:
00:06:27

It's probably the,

Speaker:
00:06:31

the backup that keeps giving.

Speaker:
00:06:34

the back, the backup that, yeah.

Speaker:
00:06:37

Uh, what a mess.

Speaker:
00:06:39

The beginning of the story

Speaker:
00:06:40

that I was asked to do a backup of two Synology boxes that they

Speaker:
00:06:46

were, uh, repurposing, right?

Speaker:
00:06:49

So they were, um, going to move the data.

Speaker:
00:06:52

They, they were gonna reuse these servers, but they wanted to get a backup of the, of

Speaker:
00:06:56

the, the data before they moved it off of

Speaker:
00:06:59

Backup is good.

Speaker:
00:07:00

Yeah,

Speaker:
00:07:00

Backup is good.

Speaker:
00:07:01

Yeah.

Speaker:
00:07:01

Apparently they hadn't had a backup of the, of these servers before.

Speaker:
00:07:05

And, um, then the, the, um, and, and , they said it was

Speaker:
00:07:12

about 40 terabytes of data.

Speaker:
00:07:13

That's the information that I was given and after I had started doing

Speaker:
00:07:18

the backup, I very quickly realized that 40 terabytes might have been.

Speaker:
00:07:27

An understatement.

Speaker:
00:07:30

You, found additional data around

Speaker:
00:07:31

right as you

Speaker:
00:07:33

data.

Speaker:
00:07:33

Yeah.

Speaker:
00:07:34

Uh, so it turned out that it wasn't like 40 terabytes of data.

Speaker:
00:07:38

It was more like 400 terabytes of

Speaker:
00:07:40

Yeah, and

Speaker:
00:07:42

I'm guessing because these were systems that were kind of probably off on the

Speaker:
00:07:45

side, they hadn't been used in a while.

Speaker:
00:07:48

Like that's, I think, the problem, and I think we talked about this in one of

Speaker:
00:07:52

our episodes about sort of systems that kind of get stored away in the corner.

Speaker:
00:07:58

No one worries about

Speaker:
00:07:59

it.

Speaker:
00:07:59

Right?

Speaker:
00:08:00

And do you leave it powered on your old backup systems?

Speaker:
00:08:03

Right.

Speaker:
00:08:03

We just talked about that.

Speaker:
00:08:04

And so I think that becomes a challenge.

Speaker:
00:08:06

It's when you have these systems that are no longer actively being

Speaker:
00:08:09

used, it kind of gets away from you.

Speaker:
00:08:12

Yeah.

Speaker:
00:08:13

Yeah.

Speaker:
00:08:13

And so the customer really didn't have any idea just how much data that they

Speaker:
00:08:17

were dealing with here, out to be, like I said, like close to half a petabyte of

Speaker:
00:08:21

Yeah.

Speaker:
00:08:22

And, and for you, that changes things significantly because

Speaker:
00:08:26

changes the backup design like massively.

Speaker:
00:08:28

Yeah.

Speaker:
00:08:29

Yeah.

Speaker:
00:08:29

because your backup target, I think you had mentioned previously

Speaker:
00:08:32

that it was like a server, right?

Speaker:
00:08:33

That you were backing this data up to

Speaker:
00:08:36

I was backing it up via a server, a Windows server.

Speaker:
00:08:40

And, um, and tape, right?

Speaker:
00:08:43

Had tape, but it's sized like, um, you know, for 40 terabytes.

Speaker:
00:08:50

And so, which is, which is basically the, the, the server and the tape

Speaker:
00:08:56

library was the perfect size for that.

Speaker:
00:08:59

But as I started realizing that it was figure, it was filling up.

Speaker:
00:09:05

And again, this, this is my fault for not really looking at the size of the

Speaker:
00:09:13

data before really jumping in there, but basically I realized very quickly

Speaker:
00:09:17

that this was a whole lot more data

Speaker:
00:09:19

than,

Speaker:
00:09:20

than,

Speaker:
00:09:21

That you expected. And I think just kind of looking at lessons

Speaker:
00:09:25

learned as you're a backup admin who is being told, Hey, this new

Speaker:
00:09:29

application is coming online.

Speaker:
00:09:32

Make sure that you understand like what is the expected growth of that application.

Speaker:
00:09:36

Because what you size for, say, a five terabyte database with a 1% growth

Speaker:
00:09:42

is very different than like a file server with like a 50% growth rate.

Speaker:
00:09:46

Yeah, exactly.

Speaker:
00:09:48

Um, and just because somebody says they have 10 terabytes of data doesn't mean

Speaker:
00:09:51

that they have 10 terabytes of data.
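
One way to check a quoted size before designing the backup is to walk the tree and count files and bytes yourself. The following is a minimal Python sketch of that idea, not the method used in this project; the function name `measure_tree` is hypothetical, and it assumes you can run it on or close to the storage, since walking hundreds of millions of files over a network share would itself be slow.

```python
import os


def measure_tree(root):
    """Walk root; return (total_files, total_bytes, busiest_dir, busiest_dir_files)."""
    total_files = 0
    total_bytes = 0
    busiest_dir, busiest_count = root, 0
    stack = [root]
    while stack:
        current = stack.pop()
        count_here = 0
        try:
            with os.scandir(current) as entries:
                for entry in entries:
                    if entry.is_dir(follow_symlinks=False):
                        stack.append(entry.path)
                    elif entry.is_file(follow_symlinks=False):
                        count_here += 1
                        total_bytes += entry.stat(follow_symlinks=False).st_size
        except OSError:
            # Unreadable directory: skip it rather than abort the whole survey.
            continue
        total_files += count_here
        # Track the single directory holding the most files directly inside it.
        if count_here > busiest_count:
            busiest_dir, busiest_count = current, count_here
    return total_files, total_bytes, busiest_dir, busiest_count
```

The busiest-directory figure matters as much as the total: a single directory holding tens of millions of files is a warning sign that total size alone does not reveal.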

Speaker:
00:09:53

So you mentioned you had a backup server, you had a tape drive.

Speaker:
00:09:57

Is there a reason you chose to use tape

Speaker:
00:10:01

Well, the, I mean, tape is great for long-term retention of data,

Speaker:
00:10:05

which is what this customer wanted.

Speaker:
00:10:08

They wanted to hold onto this data for a long period of time,

Speaker:
00:10:10

and that's where tape is great.

Speaker:
00:10:12

And tape also has, uh, you know, if you're able to properly feed it,

Speaker:
00:10:18

tape is actually, can be quite fast.

Speaker:
00:10:21

The challenge that I had when backing up this data was that, for various reasons,

Speaker:
00:10:31

which I think I, I think by the end I sort of figured out the, the

Speaker:
00:10:35

core reason for various reasons.

Speaker:
00:10:39

Individual backups off of the, these filers, the, they were just

Speaker:
00:10:44

slow, just, um, you know, they were

Speaker:
00:10:47

Like how slow, slow.

Speaker:
00:10:48

like, slow, was like, like, like three and a half kilobytes a second slow.

Speaker:
00:10:58

So like slower than like a 56 K modem back

Speaker:
00:11:01

Yeah.

Speaker:
00:11:02

Right.

Speaker:
00:11:03

And you can multiplex all you want.

Speaker:
00:11:06

So first off, you know, I, I was using NetBackup, which, you know, NetBackup, it

Speaker:
00:11:11

did a great job at what we had available.

Speaker:
00:11:15

Um, the challenge was that I couldn't put

Speaker:
00:11:22

the client on the filers themselves.

Speaker:
00:11:24

So there was a way, allegedly, to put a, a backup client on the filer,

Speaker:
00:11:31

but I could never get that to work.

Speaker:
00:11:33

And so I had to back up over SMB because I'm backing up

Speaker:
00:11:38

over SMB, I'm just, I'm just.

Speaker:
00:11:41

I'm just limited at what that was, right?

Speaker:
00:11:42

What,

Speaker:
00:11:43

I could get, and because I'm backing up over SMB, the client

Speaker:
00:11:48

is just the backup server,

Speaker:
00:11:49

right?

Speaker:
00:11:51

So instead of running a backup from two clients, I'm running a backup from one

Speaker:
00:11:54

client because that's the backup server.

Speaker:
00:11:56

I'm backing it up over SMB.

Speaker:
00:11:58

And because of that, I'm limited to the number of jobs I can run at one time.

Speaker:
00:12:01

NetBackup, um, says 99 jobs, which, you'd say, gee, that

Speaker:
00:12:07

sounds like a

Speaker:
00:12:08

Nine problems.

Speaker:
00:12:09

right?

Speaker:
00:12:11

But, but the thing is, towards the end, as I was running a lot of these backups,

Speaker:
00:12:18

the aggregate speed of like 99 backups was only like 30, 40 megabytes a second,

Speaker:
00:12:28

you

Speaker:
00:12:28

you're talking about 400 terabytes of data to

Speaker:
00:12:30

400 terabytes of data. Doing the math,

Speaker:
00:12:33

I backed up for months,

Speaker:
00:12:36

right?
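
The arithmetic behind "months" can be sketched in a few lines of Python. This assumes decimal units (1 TB = 10^12 bytes, 1 MB = 10^6 bytes) and a perfectly sustained aggregate rate with no failures or restarts, which the episode makes clear was not the case, so treat the result as a lower bound.

```python
SECONDS_PER_DAY = 86_400
TB = 10**12  # decimal terabyte, an assumption about the units quoted
MB = 10**6   # decimal megabyte


def days_to_transfer(total_bytes, bytes_per_second):
    """Days needed to move total_bytes at a constant bytes_per_second."""
    return total_bytes / bytes_per_second / SECONDS_PER_DAY


for rate_mb in (30, 40):
    days = days_to_transfer(400 * TB, rate_mb * MB)
    print(f"400 TB at {rate_mb} MB/s: about {days:.0f} days")
```

At the 30 to 40 megabytes a second mentioned above, that works out to roughly 116 to 154 days of uninterrupted transfer, or about four to five months.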

Speaker:
00:12:36

And I tried all these different things.

Speaker:
00:12:38

Uh, you know, num, you know, was I running too many backups at a time?

Speaker:
00:12:42

Was I running not enough backups at a time?

Speaker:
00:12:44

You know, it, um, you know, and then the problem is every, every

Speaker:
00:12:48

test would take days or weeks.

Speaker:
00:12:50

Think we should mention one thing.

Speaker:
00:12:52

You were talking about these tests taking days or

Speaker:
00:12:55

mm-hmm.

Speaker:
00:12:56

and then do you wanna mention sort of some of the issues you ran into with these long

Speaker:
00:13:00

running jobs just due to infrastructure or

Speaker:
00:13:06

Yeah.

Speaker:
00:13:07

other issues in the environment?

Speaker:
00:13:09

Yeah, backups are not made to run over weeks or months.

Speaker:
00:13:16

Just backup infrastructure isn't made to work like that.

Speaker:
00:13:20

And so when you do backups over weeks or months,

Speaker:
00:13:24

weird things happen that cause, you know, consternation. One of the things

Speaker:
00:13:32

is LTO tape drives are great, but like we were using like the half high LTO

Speaker:
00:13:38

drives and as far as I could tell, their duty cycle was not meant to

Speaker:
00:13:43

be a hundred percent for two months.

Speaker:
00:13:46

Right.

Speaker:
00:13:46

Um, they're meant to be backed up for, you know, several hours and then give

Speaker:
00:13:51

'em a rest and then back up several hours and then give 'em a rest.

Speaker:
00:13:54

I was just beating the crap outta these things for weeks or months at a time.

Speaker:
00:13:57

And what would happen is after some significant period of time,

Speaker:
00:14:02

it would just go write error.

Speaker:
00:14:05

And that's fine when a backup runs for a few hours and then just try again.

Speaker:
00:14:08

But if you, but if it took you two weeks or three weeks to get to that point

Speaker:
00:14:12

and then you get a write error, um,

Speaker:
00:14:14

then

Speaker:
00:14:15

it's not like you could restart these jobs either, right?

Speaker:
00:14:18

I think you're running into

Speaker:
00:14:20

Yeah.

Speaker:
00:14:20

Well,

Speaker:
00:14:20

I mean, I mean, I could restart em, but, but it's like after

Speaker:
00:14:23

a period of time I became, I.

Speaker:
00:14:29

I eventually got to the point where I said, tape is not my friend.

Speaker:
00:14:33

I, anybody who

Speaker:
00:14:34

this is coming from Mr.

Speaker:
00:14:35

Backup.

Speaker:
00:14:36

know anybody who listens to this podcast knows that I am, I am a friend of tape,

Speaker:
00:14:42

right?

Speaker:
00:14:43

I believe strongly in tape for a lot of reasons, but I don't think that, uh,

Speaker:
00:14:50

specific and, and you know, maybe the, my LTO friends can chime in here, but I don't

Speaker:
00:14:54

think that these tape drives were designed to be backed up to like this for weeks and

Speaker:
00:14:59

months at a time, 24 7 with no, because as soon as one, I was multiplexing

Speaker:
00:15:04

as many backups together as I could.

Speaker:
00:15:07

And when one backup would finish, I would just add another backup onto it, right?

Speaker:
00:15:10

Because

Speaker:
00:15:11

I, I could, I could, I.

Speaker:
00:15:13

what I couldn't do is I couldn't say, well, let's do these 10 backups, let

Speaker:
00:15:18

them run until they're finished, and then we'll do the next 10 backups.

Speaker:
00:15:21

And that would've given the tape drives a, a moment to breathe, I think.

Speaker:
00:15:24

But, uh, I couldn't do that because the, because we, we just

Speaker:
00:15:32

didn't have that kind of time.

Speaker:
00:15:33

And so I

Speaker:
00:15:34

was just, I was just try, you know, tagging it

Speaker:
00:15:36

and, and I know you've always talked about like the shoe shining problem,

Speaker:
00:15:40

given that you're not going very fast with these backups, right.

Speaker:
00:15:45

Do you think that also led to some issues as well for the tape drives?

Speaker:
00:15:48

yeah.

Speaker:
00:15:49

So again, the core problem was that each individual backup was running slow.

Speaker:
00:15:54

No matter how many of them I multiplexed together, it was not enough

Speaker:
00:15:58

speed to make the tape drive happy.

Speaker:
00:16:00

And so, yes, the tape drive was shoe shining.

Speaker:
00:16:02

And when a tape drive is continually shoe shining, the tape drive will fail.

Speaker:
00:16:07

And so everything I remember learning about tape drives was

Speaker:
00:16:10

coming back to haunt me, right?

Speaker:
00:16:13

Um, this is all of the design that I was, that I had done throughout

Speaker:
00:16:17

the years on backup, um, you know,

Speaker:
00:16:22

um, backup system

Speaker:
00:16:24

And system.

Speaker:
00:16:26

all of the things that, you know, what do you do when the backups, you know?

Speaker:
00:16:29

And so I came to understand

Speaker:
00:16:33

that the only way I was gonna finish this backup was to do it to disk.

Speaker:
00:16:37

And just quickly before you move on, I think along the way, didn't

Speaker:
00:16:41

you also have a tape drive that failed that you then had to go

Speaker:
00:16:43

Oh, multiple. Multiple times.

Speaker:
00:16:46

Swap out tape drives, reboot tape drives, put in cleaning tapes and tape drives.

Speaker:
00:16:50

And by the way, that's another thing is the way tape drives normally do

Speaker:
00:16:54

is you run them for a certain number of hours and then there's a cleaning

Speaker:
00:16:58

tape that goes in there and cleans it.

Speaker:
00:16:59

And when you have a robotic library, that happens automatically.

Speaker:
00:17:03

Well, when you just run the tape drive for.

Speaker:
00:17:06

Two months, you know, that

Speaker:
00:17:10

And so at some point the tape drive just fails.

Speaker:
00:17:13

Yeah.

Speaker:
00:17:13

um, yeah.

Speaker:
00:17:14

And so I ultimately decided that the only way to get this done was to, um, you know,

Speaker:
00:17:21

buy, uh, enough disk to back this up.

Speaker:
00:17:26

And that wasn't cheap.

Speaker:
00:17:27

Uh, but I, I didn't think that there was any other way that this was ever

Speaker:
00:17:33

going to get done 'cause again, the core problem that we've had with tape

Speaker:
00:17:38

for the last three decades has been that the backup, if the backup isn't

Speaker:
00:17:42

fast enough for the tape drive, it's a, it's a fundamental mismatch,

Speaker:
00:17:47

right?

Speaker:
00:17:48

And so we use multiplexing to make that better.

Speaker:
00:17:50

But if the multi, but if the speed you're dealing with is in kilobytes a second,

Speaker:
00:17:54

Yeah.

Speaker:
00:17:55

Well, and especially 'cause you're limited by those two, uh, Synology boxes, right?

Speaker:
00:18:00

Which are limiting your bandwidth, right?

Speaker:
00:18:02

It's not like

Speaker:
00:18:02

Yeah.

Speaker:
00:18:03

Synology boxes you can then pull from,

Speaker:
00:18:06

Yeah, and I was, I was watching, like, I was running every kind of tool I could

Speaker:
00:18:11

run to see, like, I wasn't overtaxing.

Speaker:
00:18:16

The, that was the really weird part is that the, it's not like the

Speaker:
00:18:18

Synology boxes were saying, you're really beating the crap out of it.

Speaker:
00:18:22

You shouldn't do so

Speaker:
00:18:23

backups at a time.

Speaker:
00:18:25

It wasn't, it, it was, I didn't have a high I/O wait.

Speaker:
00:18:29

I didn't have high CPU, I didn't have high RAM.

Speaker:
00:18:33

There, there was no, there was no

Speaker:
00:18:35

rhyme or reason as to why. We'll get to the rhyme or reason later.

Speaker:
00:18:39

I figured it out.

Speaker:
00:18:41

Um, but, but I knew the tape and I knew the tape and this wasn't gonna work.

Speaker:
00:18:47

So, so I had to bring in, uh, a couple of other Synology disk arrays, by the

Speaker:
00:18:52

way, and populate them with enough disk to handle all of this, uh, this backup.

Speaker:
00:18:58

Right.

Speaker:
00:18:58

Yeah,

Speaker:
00:18:59

And, um.

Speaker:
00:19:01

Then

Speaker:
00:19:02

but that wasn't without its issues either.

Speaker:
00:19:04

Right?

Speaker:
00:19:04

When you, when you brought those in, that wasn't without its issues either.

Speaker:
00:19:07

No, it wasn't without issues.

Speaker:
00:19:09

And the other thing, what I needed to do was to, I felt that with, in terms of the

Speaker:
00:19:15

number of directories that were remaining, I wasn't sure like the different sizes.

Speaker:
00:19:21

So what I did was I split

Speaker:
00:19:23

those jobs into many smaller jobs.

Speaker:
00:19:26

NetBackup is really good at like running thousands of jobs, right?

Speaker:
00:19:29

So rather than just have a hundred jobs, I turned that into like 2,400 jobs.

Speaker:
00:19:34

Like I went,

Speaker:
00:19:35

I went another level deep and created a policy for each of these

Speaker:
00:19:38

directories, and then I ran those and it was running for a while.
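
Going "another level deep" amounts to enumerating every directory at a given depth and treating each one as its own unit of work. Below is a minimal Python sketch of that enumeration; `dirs_at_depth` is a hypothetical helper, and how each path becomes a NetBackup policy is deliberately left out because the episode does not describe that step.

```python
import os


def dirs_at_depth(root, depth):
    """Return sorted paths of the directories exactly `depth` levels below root."""
    level = [root]
    for _ in range(depth):
        next_level = []
        for parent in level:
            with os.scandir(parent) as entries:
                # Collect only real subdirectories; symlinks are not followed.
                next_level.extend(
                    entry.path
                    for entry in entries
                    if entry.is_dir(follow_symlinks=False)
                )
        level = next_level
    return sorted(level)
```

One caveat with this kind of split: files that sit directly in the shallower directories are not covered by any of the deeper paths, so they need a job of their own.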

Speaker:
00:19:45

It was, it was, you know, again, more time.

Speaker:
00:19:49

And what I started seeing

Speaker:
00:19:53

were these jobs that were like an individual job that was running

Speaker:
00:19:57

an inordinate amount of time.

Speaker:
00:20:00

but you also had some jobs that would finish like super fast, right?

Speaker:
00:20:03

They'd finish five, they'd finish in

Speaker:
00:20:05

Some of 'em, some of 'em finished in five minutes, some 'em would finish.

Speaker:
00:20:08

But I noticed that over time there were certain policies that were running for

Speaker:
00:20:13

really, really long periods of time, and eventually I started poking around.

Speaker:
00:20:21

That's when I discovered what ultimately was the, the true culprit.

Speaker:
00:20:28

And, uh, anyone who's been around backup for a long time

Speaker:
00:20:32

has seen this culprit before.

Speaker:
00:20:35

It's just, this is the worst example of this culprit that I've ever seen.

Speaker:
00:20:43

And what is that?

Speaker:
00:20:46

We affectionately refer to it as the million file problem.

Speaker:
00:20:51

Hmm.

Speaker:
00:20:52

Because remember, again, going back to that, um, that client back from

Speaker:
00:20:59

25 years ago, we had one server.

Speaker:
00:21:03

That was going to be storing a bunch of images and it was going

Speaker:
00:21:06

to result in millions of files.

Speaker:
00:21:08

And we knew that back then that the million file problem is, a real problem.

Speaker:
00:21:13

And the million file problem over the network is even worse, right?

Speaker:
00:21:18

Because everything is, is, is a

Speaker:
00:21:19

round trip.

Speaker:
00:21:19

The way we fixed it back then was we used a product back then called

Speaker:
00:21:23

FlashBackup, which would back up at the raw level, but store the

Speaker:
00:21:27

information, and that was not available to me.

Speaker:
00:21:32

Why?

Speaker:
00:21:34

Because that product no longer exists

Speaker:
00:21:36

No.

Speaker:
00:21:37

because it doesn't run on a Synology box.

Speaker:
00:21:40

Right.

Speaker:
00:21:42

Remember, I'm not the Synology

Speaker:
00:21:44

All it was was an SMB mount to me.

Speaker:
00:21:46

Right?

Speaker:
00:21:48

And by the way, for those curious, yes, I tested SMB, I tested NFS.

Speaker:
00:21:52

It didn't matter.

Speaker:
00:21:54

It didn't matter.

Speaker:
00:21:54

Um, the um.

Speaker:
00:21:59

And

Speaker:
00:21:59

by the way, this was a constant, you know, you know the phrase, never, never

Speaker:
00:22:03

go into battle with an untested weapon.

Speaker:
00:22:05

This was a constant example of: I am in the battle, I'm in the stuff,

Speaker:
00:22:10

and now I'm trying to test stuff

Speaker:
00:22:13

and, and everything I did to try to make things better just made it take longer

Speaker:
00:22:17

and the client just had to wait.

Speaker:
00:22:20

And the client was incredibly patient, honestly.

Speaker:
00:22:24

And, and you know, I did my best to say, look, I, I've been doing this for 30

Speaker:
00:22:30

years, I've never seen anything like this.

Speaker:
00:22:32

Right.

Speaker:
00:22:33

And that, that helped.

Speaker:
00:22:36

But in the end, I was backing up.

Speaker:
00:22:40

You know, we got down to, I, I learned a way to identify which

Speaker:
00:22:44

were the problem directories.

Speaker:
00:22:46

So I would kick off a policy and I would watch, and I would notice

Speaker:
00:22:49

that it had run for, let's say an hour.

Speaker:
00:22:54

And it listed, let's say 300,000 files backed up.

Speaker:
00:22:59

kilobytes.

Speaker:
00:23:00

Hmm.

Speaker:
00:23:02

Literally there's, there's a kilobyte column that

Speaker:
00:23:05

kilobytes of byte and there's no value in there.

Speaker:
00:23:07

We backed up 300,000 files, no kilobytes.

Speaker:
00:23:11

so that, that helped me identify these problem

Speaker:
00:23:14

Problem child.

Speaker:
00:23:15

Yeah.

Speaker:
00:23:15

it and let the other non-problem policies finish.
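
The symptom described here, many files listed but almost no kilobytes written after a long run, can be expressed as a simple check. This is a hedged Python sketch only: `JobSnapshot`, its fields, and both thresholds are hypothetical, and it does not read anything from NetBackup. The numbers would have to be copied in from whatever job monitor you use.

```python
from dataclasses import dataclass


@dataclass
class JobSnapshot:
    """Hypothetical record of one running job, filled in by hand or by a script."""
    policy: str
    elapsed_hours: float
    files: int
    kilobytes: int


def looks_like_million_file_job(job, min_hours=1.0, max_kb_per_file=1.0):
    """Flag jobs that have run a while, listed files, and moved almost no data.

    Both thresholds are illustrative assumptions, not values from the episode.
    """
    if job.elapsed_hours < min_hours or job.files == 0:
        return False
    return job.kilobytes / job.files < max_kb_per_file


suspect = JobSnapshot("policy_a", elapsed_hours=1.0, files=300_000, kilobytes=0)
healthy = JobSnapshot("policy_b", elapsed_hours=1.0, files=2_000, kilobytes=50_000_000)
problem_policies = [
    job.policy for job in (suspect, healthy) if looks_like_million_file_job(job)
]
```

Flagged policies can then be set aside so the rest finish, which is the triage described in this part of the conversation.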

Speaker:
00:23:19

And

Speaker:
00:23:19

Right.

Speaker:
00:23:20

Yeah.

Speaker:
00:23:20

up getting down to like 150 policies that were the problem policies.

Speaker:
00:23:27

And so I backed them up and I was able to get them.

Speaker:
00:23:30

Over time, I was able to get them backed up, and then finally I got down to about

Speaker:
00:23:37

20 policies, I think somewhere around

Speaker:
00:23:41

policies.

Speaker:
00:23:41

Go ahead.

Speaker:
00:23:42

And at this point when you're down to the 20, like some of these have

Speaker:
00:23:45

been running for a long time, right?

Speaker:
00:23:48

Like how?

Speaker:
00:23:49

like two months backups that have been running for two months,

Speaker:
00:23:52

successfully running for two months.

Speaker:
00:23:54

Yeah.

Speaker:
00:23:55

And what was good was at this point again.

Speaker:
00:24:00

Like this is information that would've been really helpful to have at the

Speaker:
00:24:03

beginning, but it was information that, to get all this information at the

Speaker:
00:24:07

beginning, it would've taken time to, like we, we just wanted to get started.

Speaker:
00:24:12

Yeah.

Speaker:
00:24:13

What I ended up finding was that, um, these backups, um.

Speaker:
00:24:21

The, the, there were millions and millions and millions, like one of the, one

Speaker:
00:24:24

of the directories that I was backing up, it had 99 million files in it,

Speaker:
00:24:28

one directory, 99 million files, and eventually what I realized was that

Speaker:
00:24:34

again, the problem this time was just SMB.

Speaker:
00:24:40

So the fact that every one of these files results in a round

Speaker:
00:24:44

trip conversation, possibly multiple round trip conversations.

Speaker:
00:24:47

Yep.
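
A rough model shows why per-file round trips dominate at this scale. In the Python sketch below, the file count comes from the episode, but the number of round trips per file and the latency per round trip are illustrative assumptions, not measurements from this job, and the model ignores the time spent actually moving data.

```python
SECONDS_PER_DAY = 86_400


def days_of_round_trips(file_count, round_trips_per_file, latency_seconds):
    """Days spent purely waiting on protocol round trips, one file at a time."""
    return file_count * round_trips_per_file * latency_seconds / SECONDS_PER_DAY


# 99 million files is from the episode; 3 round trips and 1 ms are assumptions.
optimistic = days_of_round_trips(99_000_000, 3, 0.001)

# The same directory with slower, chattier exchanges (again, assumed values).
pessimistic = days_of_round_trips(99_000_000, 5, 0.010)
```

Even the optimistic case is measured in days before a single byte of file content is counted, and the cost grows linearly with both latency and the number of exchanges per file. Reading the files locally removes the network round trip from every one of those operations.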

Speaker:
00:24:49

And I realized that the only way I was gonna back up these truly problem

Speaker:
00:24:53

directories was to back them up locally.

Speaker:
00:24:56

But how do I back them up locally?

Speaker:
00:24:58

Well, luckily this is when I just, you know, basically go back

Speaker:
00:25:02

to dumb, dumb old backup tools.

Speaker:
00:25:06

And so I was able to run a backup using tar logged in locally

Speaker:
00:25:12

on the filers, and then just

Speaker:
00:25:17

directing the tarball across the network. That finally worked.
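
The job in the episode used the tar command itself, and the exact invocation is not given. Purely as an illustration of the same idea (read the files locally, then send one sequential stream instead of one conversation per file), here is a minimal Python sketch using the standard `tarfile` module in streaming mode. The function name `stream_tar` is hypothetical.

```python
import tarfile


def stream_tar(src_dir, out, arcname="."):
    """Write an uncompressed tar of src_dir to the binary file object `out`.

    Mode "w|" produces a pure stream, so `out` never needs to support seeking;
    it can be a pipe, a socket file, or standard output.
    """
    with tarfile.open(fileobj=out, mode="w|") as archive:
        archive.add(src_dir, arcname=arcname)


# Example use (paths are placeholders): run this on the machine that holds the
# data, pass sys.stdout.buffer as `out`, and pipe the output across the network.
```

The design point is that the receiving side sees a single large sequential write, which is the kind of workload networks and backup targets handle well.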

Speaker:
00:25:23

That's crazy.

Speaker:
00:25:24

So you had these 20 jobs, right?

Speaker:
00:25:27

And some of them you said were running for 60 plus days, and then you sort of

Speaker:
00:25:32

were like, okay, let me start this over.

Speaker:
00:25:34

And by the way, you were kind of forced to start them over

Speaker:
00:25:37

because something happened right?

Speaker:
00:25:39

yeah.

Speaker:
00:25:39

Something some unknown thing.

Speaker:
00:25:42

Um, I think I.

Speaker:
00:25:44

I, I, I don't know.

Speaker:
00:25:45

I, I actually don't know

Speaker:
00:25:46

what caused it, but they, they did fail

Speaker:
00:25:49

and,

Speaker:
00:25:49

And you were like, I'm not gonna start these

Speaker:
00:25:51

yeah.

Speaker:
00:25:52

I'm not gonna start 'em again.

Speaker:
00:25:52

It's just, yeah.

Speaker:
00:25:53

Well, Because

Speaker:
00:25:54

like, one of the jobs, the, the one with 99, 99 million

Speaker:
00:25:58

files, we were nowhere near.

Speaker:
00:26:01

yeah.

Speaker:
00:26:02

After 60 days you were barely

Speaker:
00:26:03

yeah, yeah.

Speaker:
00:26:04

We're barely, barely scratching the surface.

Speaker:
00:26:05

so I'm like, I, I, I don't have, I don't have that, you know, I, I don't

Speaker:
00:26:10

have the amount of time that it would take, so, so I switched to, you know,

Speaker:
00:26:15

experimentally once again, experimentally, I'm experimenting on the fly, I'm

Speaker:
00:26:19

doing development in production.

Speaker:
00:26:21

Uh, I was like, well, let me see how long, how quick a tarball would run.

Speaker:
00:26:26

I ran a tarball.

Speaker:
00:26:27

I remember for like a day, you remember this?

Speaker:
00:26:29

I ran a

Speaker:
00:26:30

a day and it, I, I had a du of the size of the directory and after a day it had

Speaker:
00:26:36

done like, like a half of it or something.

Speaker:
00:26:39

Yeah.

Speaker:
00:26:40

You're like, what?

Speaker:
00:26:40

One's taking 66 days and barely scratched the

Speaker:
00:26:43

yeah,

Speaker:
00:26:44

You are mainly done.

Speaker:
00:26:45

Almost done within a day.

Speaker:
00:26:46

yeah.

Speaker:
00:26:46

And so I was like, this is the way.

Speaker:
00:26:48

Right.

Speaker:
00:26:49

So it, it, it wasn't, it wasn't a way for everything because the, the, this

Speaker:
00:26:55

was, um, because I, you know, I'm glad that I, that I use NetBackup for the

Speaker:
00:27:00

bulk of it, because then I have the catalog data and, you know, and, um,

Speaker:
00:27:03

but

Speaker:
00:27:04

on the restore side.

Speaker:
00:27:05

yeah, yeah.

Speaker:
00:27:06

So this will.

Speaker:
00:27:07

This will be the diff the restores will be more difficult for these

Speaker:
00:27:11

like remaining 20 directories.

Speaker:
00:27:14

I mean, not, not astronomically.

Speaker:
00:27:15

So like,

Speaker:
00:27:16

you know, can create a tarball, a

Speaker:
00:27:17

list of this.

Speaker:
00:27:19

So, you know, lessons learned, like,

Speaker:
00:27:21

do that.

Speaker:
00:27:21

Don't store millions of files on the other side of a, of an SMB box.

Speaker:
00:27:25

I guess

Speaker:
00:27:26

Yeah, so Well, and I think a couple things, even if it's not SMB, right?

Speaker:
00:27:32

Just having that many files, because I think what people don't realize is

Speaker:
00:27:36

even though the size of every disk has gotten significantly larger, right?

Speaker:
00:27:40

You're talking like 18 terabyte, 20 terabyte disk

Speaker:
00:27:43

Yeah.

Speaker:
00:27:44

They can only handle so many operations per disk, right?

Speaker:
00:27:48

That number hasn't changed.

Speaker:
00:27:49

It's about a hundred per second.

Speaker:
00:27:51

And so no matter how many, how big your disk is, right?

Speaker:
00:27:54

If it was 20 one-terabyte disks, right, then you get 20 times a hundred IOPS.

Speaker:
00:28:01

Versus if it's one 20-terabyte disk, you still only get that hundred.

Speaker:
00:28:04

So that's a big thing that people don't realize with these larger size disks.
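
The disk arithmetic above, worked through as a small sketch. The figure of roughly a hundred operations per second per disk is the one quoted in the episode; the function names are made up for illustration.

```python
IOPS_PER_DISK = 100  # rough per-disk figure quoted in the episode


def aggregate_iops(disk_count, iops_per_disk=IOPS_PER_DISK):
    """Total operations per second across all disks."""
    return disk_count * iops_per_disk


def iops_per_terabyte(disk_count, disk_size_tb, iops_per_disk=IOPS_PER_DISK):
    """Operations per second available for each terabyte stored."""
    return aggregate_iops(disk_count, iops_per_disk) / (disk_count * disk_size_tb)


if __name__ == "__main__":
    for count, size_tb in [(20, 1), (1, 20)]:
        print(
            f"{count} x {size_tb} TB: {aggregate_iops(count)} IOPS total, "
            f"{iops_per_terabyte(count, size_tb):.0f} IOPS per TB"
        )
```

Both layouts hold 20 TB, but the single large disk has one twentieth of the operations per second to serve it.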

Speaker:
00:28:09

Yeah.

Speaker:
00:28:09

And, and the thing was that the.

Speaker:
00:28:11

That many files.

Speaker:
00:28:13

So, because the problem, the, ultimately the problem wasn't disk I/O, the problem

Speaker:
00:28:18

was network I/O.

Speaker:
00:28:19

Right?

Speaker:
00:28:20

Network latency.

Speaker:
00:28:21

So, because

Speaker:
00:28:22

when I actually ran, I ran two tarballs.

Speaker:
00:28:27

Simultaneously is what I did.

Speaker:
00:28:28

I using

Speaker:
00:28:30

I just, I ran, I was always running two at a time.

Speaker:
00:28:33

When I was running two at a time, I/O wait was sitting at 10,

Speaker:
00:28:37

which is, is high,

Speaker:
00:28:39

but I was like, well, it's got nothing else going on, so I'm, I'm

Speaker:
00:28:43

it go.

Speaker:
00:28:43

Right?

Speaker:
00:28:44

The highest I/O wait ran during all of those hundreds of

Speaker:
00:28:48

simultaneous backups was like four.

Speaker:
00:28:52

yeah,

Speaker:
00:28:52

So like I wasn't disk bound.

Speaker:
00:28:54

I was network

Speaker:
00:28:55

bound, but not network bound in terms of throughput, network bound in terms of

Speaker:
00:28:59

latency,

Speaker:
00:29:00

and

Speaker:
00:29:01

of operations, just because SMB is very chatty.

Speaker:
00:29:05

very chatty.

Speaker:
00:29:06

It's probably the chattiest of the protocols,

Speaker:
00:29:09

and

Speaker:
00:29:10

we, you

Speaker:
00:29:11

it was just a really combination.

Speaker:
00:29:13

Yeah.

Speaker:
00:29:13

And you know why this, and this is why backup vendors have their own protocols,

Speaker:
00:29:19

like Data Domain has boost, right?

Speaker:
00:29:22

To help alleviate and solve some of these issues.

Speaker:
00:29:25

Yeah.

Speaker:
00:29:26

You talked about, don't, don't do the somewhere we were talking about.

Speaker:
00:29:29

Just don't do this.

Speaker:
00:29:31

I, I'd like, I'd like to talk today.

Speaker:
00:29:33

When I looked at these, these, uh, these directories that had these

Speaker:
00:29:38

tens of millions of files, it was a structure that was very clearly

Speaker:
00:29:42

created by some application.

Speaker:
00:29:45

one of these directories had a common structure created by some.

Speaker:
00:29:50

I'm gonna say stupid application that thought this was perfectly fine.

Speaker:
00:29:55

That it was perfectly fine to create 99 million files for

Speaker:
00:30:00

Do you know, I,

Speaker:
00:30:01

item.

Speaker:
00:30:02

I bet they were using the file system as a database

Speaker:
00:30:09

I don't know.

Speaker:
00:30:10

what it was.

Speaker:
00:30:11

given just like the number of files and the size of those files.

Speaker:
00:30:15

I know it was forensic type information

Speaker:
00:30:18

and I, I don't, I clearly

Speaker:
00:30:21

That, that's fine.

Speaker:
00:30:22

Yeah, yeah,

Speaker:
00:30:23

No, I'm just saying I clearly don't know enough about forensic stuff

Speaker:
00:30:25

to know why they would want tens of

Speaker:
00:30:27

millions of files,

Speaker:
00:30:29

but

Speaker:
00:30:30

So where are you?

Speaker:
00:30:31

So you talked about these 20 jobs that you were starting to do tarballs with.

Speaker:
00:30:35

So where are you right now?

Speaker:
00:30:37

So, so we finished all of them but one. There was one that for some reason

Speaker:
00:30:42

it, it, the file didn't look right.

Speaker:
00:30:43

It was weird.

Speaker:
00:30:44

Um, it, the, the, the backup completed, but the, some reason, the, the tarball,

Speaker:
00:30:50

it just, it just didn't look right.

Speaker:
00:30:51

I don't wanna go into details.

Speaker:
00:30:52

It just didn't look

Speaker:
00:30:53

so I'm rerunning that one.

Speaker:
00:30:55

So it, based on its size and how well it's doing, it should

Speaker:
00:30:57

finish in about a day or so.

Speaker:
00:30:59

Um, and what I'm

Speaker:
00:31:00

is a significant improvement in terms of

Speaker:
00:31:02

A significant improvement a day versus, you know, a year, um,

Speaker:
00:31:08

Or two, I think actually it might have been two.

Speaker:
00:31:10

Yeah,

Speaker:
00:31:11

Agreed.

Speaker:
00:31:12

Um, and what I'm doing is I'm, because again, I don't have the catalog.

Speaker:
00:31:16

What I'm currently running is I'm running a tar tvf

Speaker:
00:31:19

on all of those files and creating tarballs or creating, I'm sorry, text

Speaker:
00:31:25

files, a list.

Speaker:
00:31:27

of the, the files that are in there.

Speaker:
00:31:29

And then I'm gonna do a count on the files that are in there and

Speaker:
00:31:31

check it against the count of the files that are in the directory.

Speaker:
00:31:34

And, and hopefully those numbers should be the same.
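
A small sketch of that verification step: count the file entries in a tarball and compare with a count taken from the directory. It uses Python's tarfile module instead of tar tvf plus a line count, and the demo paths are temporary and hypothetical.

```python
import os
import tarfile
import tempfile


def count_tar_files(tar_path):
    """Count regular-file entries by reading the tarball sequentially."""
    with tarfile.open(tar_path, mode="r|*") as archive:
        return sum(1 for member in archive if member.isfile())


def count_directory_files(directory):
    """Count non-directory entries under directory (os.walk includes symlinks to files)."""
    return sum(len(files) for _root, _dirs, files in os.walk(directory))


def demo():
    """Build a tiny tree, tar it, and return (tar_count, directory_count)."""
    with tempfile.TemporaryDirectory() as work:
        source = os.path.join(work, "source")
        os.makedirs(os.path.join(source, "sub"))
        for relative in ("a.txt", "b.txt", os.path.join("sub", "c.txt")):
            with open(os.path.join(source, relative), "w") as handle:
                handle.write("data")
        tar_path = os.path.join(work, "backup.tar")
        with tarfile.open(tar_path, "w") as archive:
            archive.add(source, arcname=".")
        return count_tar_files(tar_path), count_directory_files(source)


if __name__ == "__main__":
    tar_count, dir_count = demo()
    print("match" if tar_count == dir_count else "MISMATCH", tar_count, dir_count)
```

Matching counts do not prove the contents are intact, but a mismatch is a cheap early warning.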

Speaker:
00:31:36

Yeah, because I believe you are even saying that to run things

Speaker:
00:31:40

like a find to get a list of all the files in a directory or a DU

Speaker:
00:31:44

Yeah.

Speaker:
00:31:45

hours, right?

Speaker:
00:31:46

Well, it was days actually.

Speaker:
00:31:48

In fact, it was why I didn't have this information in the beginning

Speaker:
00:31:51

because everything was so big and every find, every du every command

Speaker:
00:31:57

that I had DU is quicker than find.

Speaker:
00:31:59

DU is.

Speaker:
00:32:00

It just does less work than find.

Speaker:
00:32:02

But the problem that I ultimately realized was that DU wasn't

Speaker:
00:32:06

really being helpful in terms of.

Speaker:
00:32:08

The

Speaker:
00:32:09

scope of the job, what was the scope of the job was determined

Speaker:
00:32:12

by the number of these files.

Speaker:
00:32:14

And I couldn't get those numbers because that was the thing that took forever.

Speaker:
00:32:20

the number of jobs dwindled down to about 20, that's when I

Speaker:
00:32:25

was able to run these, uh, the

Speaker:
00:32:28

and they would, they would actually complete.

Speaker:
00:32:30

And that's when I realized just how bad it was.

Speaker:
00:32:33

so if you had to start this over, and hopefully you never do, but I'm just

Speaker:
00:32:39

saying, if you had to go back to day one, what would you do differently?

Speaker:
00:32:43

I know you talked about making sure you understand the size of your backups.

Speaker:
00:32:47

Right.

Speaker:
00:32:48

It just feels like some of these, you just have to go through the process

Speaker:
00:32:52

though because you don't know what to do.

Speaker:
00:32:54

Like it's not like you could just start day one and be like,

Speaker:
00:32:56

oh, I know I need to go to disk.

Speaker:
00:32:58

I need to do X, Y, and Z.

Speaker:
00:33:00

Right?

Speaker:
00:33:00

It's sort of like a learning process.

Speaker:
00:33:03

would say that I.

Speaker:
00:33:05

Yeah, because the problem is you're going off into the unknown,

Speaker:
00:33:08

you're doing a backup of something that you don't know what it is.

Speaker:
00:33:11

And I, I would say if possible, if at all possible, get things like

Speaker:
00:33:18

dus, uh, you know, disk usage, it's a Unix command, but you can load those

Speaker:
00:33:23

tools in Windows as well. Get, like if you're going to back up, if you're

Speaker:
00:33:29

gonna back up a hundred directories.

Speaker:
00:33:31

Get a du of every one of those directories so that you have an idea

Speaker:
00:33:34

of just what you're dealing with,

Speaker:
00:33:36

if at all possible.

Speaker:
00:33:37

Also, look and see if the number of files, and if the number of, and if you're

Speaker:
00:33:41

trying to do a, you know, it's not that hard, you just run a find . -,

Speaker:
00:33:46

you know, I didn't even do a -print, just find . | wc -l, right?

Speaker:
00:33:50

That was it.

Speaker:
00:33:51

Right?

Speaker:
00:33:51

Um, to, to get the number of files.
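
A sketch of that kind of pre-flight survey, gathering a file count and total size in a single pass. It is an approximation of the commands named above, not a replacement: find . | wc -l also counts directories, and du reports disk blocks, while this sums apparent file sizes.

```python
import os
import tempfile


def survey(directory):
    """Return (file_count, total_bytes) for regular files under directory."""
    file_count = 0
    total_bytes = 0
    stack = [directory]
    while stack:
        current = stack.pop()
        with os.scandir(current) as entries:
            for entry in entries:
                if entry.is_dir(follow_symlinks=False):
                    stack.append(entry.path)  # descend later, no recursion limit
                elif entry.is_file(follow_symlinks=False):
                    file_count += 1
                    total_bytes += entry.stat(follow_symlinks=False).st_size
    return file_count, total_bytes


def demo():
    """Survey a tiny temporary tree: two files, 5 + 3 bytes."""
    with tempfile.TemporaryDirectory() as root:
        os.makedirs(os.path.join(root, "sub"))
        samples = (("a.bin", b"12345"), (os.path.join("sub", "b.bin"), b"123"))
        for relative, payload in samples:
            with open(os.path.join(root, relative), "wb") as handle:
                handle.write(payload)
        return survey(root)


if __name__ == "__main__":
    print(demo())
```

Run per top-level directory, the file count is the number that predicts how painful the job will be.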

Speaker:
00:33:55

I'd say if again.

Speaker:
00:34:00

If I could go back in time, I, I would say maybe do a little bit more of this

Speaker:
00:34:04

research prior to beginning the job.

Speaker:
00:34:08

Um, but that's diff it's, it's easy to say that now,

Speaker:
00:34:12

um, because I know what

Speaker:
00:34:13

I know.

Speaker:
00:34:14

Right.

Speaker:
00:34:16

Um, but the, you know, the core problem was that you've

Speaker:
00:34:23

got these millions of files.

Speaker:
00:34:24

I mean, which is all.

Speaker:
00:34:26

Already gonna be a problem if you're backing it up in any sort of normal way.

Speaker:
00:34:29

But if you're

Speaker:
00:34:30

up remotely over the network, it's going to kill you.

Speaker:
00:34:34

Yeah.

Speaker:
00:34:35

So, um, you gotta figure out a way to do that.

Speaker:
00:34:39

And then I would just say, see if there's anything that you can do with the, with

Speaker:
00:34:42

the application that's created this data

Speaker:
00:34:44

which is why it's important to get involved early on, right when an

Speaker:
00:34:48

application is being developed or deployed, right, to get involved so

Speaker:
00:34:52

they understand the backup requirements.

Speaker:
00:34:55

yeah.

Speaker:
00:34:56

And so, this backup that would never finish, I literally was, I

Speaker:
00:35:01

was starting to think that this thing was never gonna finish.

Speaker:
00:35:04

Um.

Speaker:
00:35:05

It's essentially finally, I mean, it's not, at this point, it's

Speaker:
00:35:08

not a hundred percent, but I'm, I'm now, you know, it's just, I'm

Speaker:
00:35:11

at the finish line.

Speaker:
00:35:12

Yeah.

Speaker:
00:35:13

at the finish line.

Speaker:
00:35:14

Yeah.

Speaker:
00:35:15

Um, it's nice.

Speaker:
00:35:17

I know one of the other things you mentioned that you were using

Speaker:
00:35:19

NetBackup, but you had also looked at other tools out there as well, right?

Speaker:
00:35:24

That could potentially help you with this effort.

Speaker:
00:35:27

Right.

Speaker:
00:35:28

So do you think that that becomes valuable, like either looking at other

Speaker:
00:35:33

tools, um, I know you had reached out to like Synology support, you

Speaker:
00:35:37

had reached out to some experts, like

Speaker:
00:35:39

Yeah.

Speaker:
00:35:40

Yeah.

Speaker:
00:35:41

The problem there, there were, there were, you could do, like with Synology,

Speaker:
00:35:48

you can like copy the data from A to B.

Speaker:
00:35:50

Mm-Hmm.

Speaker:
00:35:51

They have this ability essentially like, you know, for lack of a

Speaker:
00:35:54

better word, they have SnapMirror.

Speaker:
00:35:56

they have the equivalent of SnapMirror.

Speaker:
00:35:58

Yep.

Speaker:
00:35:59

from one Synology box to another.

Speaker:
00:36:01

But to me that wasn't really a backup like I wanted in a, in a format, you know,

Speaker:
00:36:06

the end I was forced to not do what I wanted with the tar.

Speaker:
00:36:11

Um, but I wanted it in a cataloged format.

Speaker:
00:36:16

So we looked at a couple of, the problem was never NetBackup.

Speaker:
00:36:19

Right?

Speaker:
00:36:19

NetBackup made it, um, easy to script this whole thing because it was the

Speaker:
00:36:25

only way I could make sense of it.

Speaker:
00:36:27

'cause it was, it was thousands of directories and, um, and even

Speaker:
00:36:31

more thousands of sub directories under those directories.

Speaker:
00:36:35

And the only way I could make sense of this was to script it all.

Speaker:
00:36:38

And, um, the, the fact that NetBackup allowed me to do that was great.

Speaker:
00:36:43

Um, there are some other tools these days, some of the newer tools,

Speaker:
00:36:48

they want to make it easy for you.

Speaker:
00:36:51

But if you get into a complicated situation like this, some of the newer

Speaker:
00:36:54

tools don't even have the ability to sort of grab it by the horns.

Speaker:
00:36:58

The

Speaker:
00:36:59

able to do a NetBackup,

Speaker:
00:37:00

Yeah.

Speaker:
00:37:01

I think the other thing also that you were doing, which I thought was interesting,

Speaker:
00:37:06

was also your scripting, right?

Speaker:
00:37:08

Trying to automate this, like, uh, I know like scheduling your,

Speaker:
00:37:12

the backup policies to run, right?

Speaker:
00:37:14

And then you were sort of doing load balancing to make sure

Speaker:
00:37:16

that you keep the two filers

Speaker:
00:37:18

Yeah.

Speaker:
00:37:19

Yeah.

Speaker:
00:37:19

I couldn't, yeah, that was the thing.

Speaker:
00:37:21

I couldn't normally, I, I just, I believe in just throwing

Speaker:
00:37:24

everything in the NetBackup scheduler and let it figure it out.

Speaker:
00:37:27

But because again, because of the limitations of the weird thing I had,

Speaker:
00:37:32

I, I couldn't figure out a way to load balance across the two target filers.

Speaker:
00:37:39

the NetBackup scheduler.

Speaker:
00:37:41

Um, maybe I could have, uh, done that better.

Speaker:
00:37:45

I don't know.

Speaker:
00:37:45

But, uh, so the way I was doing it was I was just assigning a backup.

Speaker:
00:37:50

When a backup would finish, I would assign the next backup to the filer

Speaker:
00:37:56

that now had more space available to it.

Speaker:
00:37:59

Right.

Speaker:
00:38:00

So I just had a while loop that was running, you

Speaker:
00:38:02

know, checking to see if a backup job was done.
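
A hedged sketch of that kind of polling loop. The callables start_backup, is_done, and free_space are hypothetical hooks, not NetBackup commands; the real script is not shown in the episode, and the demo replaces them with simple fakes.

```python
import time
from collections import deque


def run_queue(pending, filers, start_backup, is_done, free_space,
              max_running=2, poll_seconds=60, sleep=time.sleep):
    """Start queued jobs as slots free up, always on the filer with most free space."""
    queue = deque(pending)
    running = []
    assignments = []
    while queue or running:
        running = [job for job in running if not is_done(job)]
        while queue and len(running) < max_running:
            job = queue.popleft()
            target = max(filers, key=free_space)  # emptiest target wins
            start_backup(job, target)
            running.append(job)
            assignments.append((job, target))
        if running:
            sleep(poll_seconds)  # wait before checking job status again
    return assignments


def demo():
    """Simulate three jobs against two targets; every job finishes by the next poll."""
    free_tb = {"filerA": 100, "filerB": 80}      # hypothetical free space
    size_tb = {"job1": 50, "job2": 10, "job3": 30}  # hypothetical job sizes

    def start_backup(job, target):
        free_tb[target] -= size_tb[job]

    return run_queue(
        ["job1", "job2", "job3"], list(free_tb), start_backup,
        is_done=lambda job: True, free_space=free_tb.get,
        max_running=1, sleep=lambda seconds: None,
    )


if __name__ == "__main__":
    print(demo())
```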

Speaker:
00:38:04

but I think that's important, right?

Speaker:
00:38:05

You can always script some of these things that if it doesn't

Speaker:
00:38:08

exist in the native tools, right?

Speaker:
00:38:10

Don't be afraid.

Speaker:
00:38:12

Yeah.

Speaker:
00:38:12

Don't be afraid.

Speaker:
00:38:14

you know, obviously I'm, I'm pretty good at scripting and

Speaker:
00:38:16

I'm pretty good in the backup.

Speaker:
00:38:17

And, um, th there are, and, and, and, and thanks.

Speaker:
00:38:21

Thanks very much to Veritas for keeping their, uh, their documentation online.

Speaker:
00:38:27

Uh, the number of times I Googled.

Speaker:
00:38:29

You know, backup job, you know, how do, how do I list, uh, you know, and

Speaker:
00:38:34

I know there's a, there's, I know there's a command to, to do this.

Speaker:
00:38:37

How do I do that?

Speaker:
00:38:38

And, you know, and then a man page would come up and I would read it

Speaker:
00:38:40

and I was like, oh, yeah, yeah, yeah.

Speaker:
00:38:42

It's

Speaker:
00:38:42

been a while.

Speaker:
00:38:43

Yeah.

Speaker:
00:38:44

Um.

Speaker:
00:38:44

you have to also thank Cygwin, of course.

Speaker:
00:38:47

Yes, special thanks to Cygwin. Without Cygwin.

Speaker:
00:38:52

That is the tool that you can download and run on any Windows

Speaker:
00:38:55

server to give you Unix capabilities.

Speaker:
00:38:58

I will say there were, there were moments where Cygwin was both helpful and

Speaker:
00:39:03

terrorizing me because it was the whole like backslash versus forward slash thing.

Speaker:
00:39:09

Because in Windows, you know, the file separator is a backslash, which

Speaker:
00:39:14

in Unix is an escape character,

Speaker:
00:39:17

Yep.

Speaker:
00:39:17

and Cygwin wasn't consistent.

Speaker:
00:39:23

When that escape character would be an escape character.

Speaker:
00:39:26

Like, like if you piped it into a file, it would do one thing.

Speaker:
00:39:28

If you piped it into a command, it would do it, it would behave differently.

Speaker:
00:39:32

And, um, so that, that definitely l lent.

Speaker:
00:39:36

The fact that I was doing constant file manipulation on directories

Speaker:
00:39:40

that were seven levels deep,

Speaker:
00:39:42

Yeah.

Speaker:
00:39:43

did not help.

Speaker:
00:39:45

Yeah.

Speaker:
00:39:45

Oh, and then I couldn't, the, the, the, the one thing with

Speaker:
00:39:51

Cygwin is that it doesn't see.

Speaker:
00:39:54

It doesn't see the, to point the backups to NetBackup, I have to point

Speaker:
00:40:00

'em at the backslash backslash filer name

Speaker:
00:40:04

share name.

Speaker:
00:40:06

Cygwin doesn't see that.

Speaker:
00:40:08

Cygwin sees only mapped drive names

Speaker:
00:40:12

and

Speaker:
00:40:13

have to map it using

Speaker:
00:40:14

you have to map it to a drive name.

Speaker:
00:40:15

Let's say you map it to,

Speaker:
00:40:17

to letter F, and then in Cygwin you would see /cygdrive/f.

Speaker:
00:40:23

Which would be the same as this backslash backslash mount.

Speaker:
00:40:27

know, I was constantly having to go back and forth between

Speaker:
00:40:30

those two and, and that was fun.
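
A small sketch of the back-and-forth translation described above, assuming a share has been mapped to a drive letter. The filer name, share name, and drive letter are hypothetical, and the mapping table has to be maintained by hand.

```python
def unc_to_cygdrive(unc_path, share_to_drive):
    """Translate \\\\filer\\share\\... into /cygdrive/<letter>/... using a mapping table."""
    for share, drive in share_to_drive.items():
        prefix = share.rstrip("\\")
        lowered = unc_path.lower()
        if lowered == prefix.lower() or lowered.startswith(prefix.lower() + "\\"):
            rest = unc_path[len(prefix):].replace("\\", "/")
            return f"/cygdrive/{drive.lower()}{rest}"
    raise ValueError(f"no mapped drive for {unc_path!r}")


def cygdrive_to_unc(cyg_path, share_to_drive):
    """Translate /cygdrive/<letter>/... back into the UNC form."""
    parts = cyg_path.split("/")  # ['', 'cygdrive', 'f', 'dir', ...]
    if len(parts) < 3 or parts[1] != "cygdrive":
        raise ValueError(f"not a cygdrive path: {cyg_path!r}")
    for share, drive in share_to_drive.items():
        if drive.lower() == parts[2].lower():
            rest = "\\".join(parts[3:])
            return share.rstrip("\\") + ("\\" + rest if rest else "")
    raise ValueError(f"no share mapped to drive {parts[2]!r}")


if __name__ == "__main__":
    mapping = {r"\\filer1\share1": "F"}  # hypothetical share mapped to F:
    print(unc_to_cygdrive(r"\\filer1\share1\cases\2024", mapping))
    print(cygdrive_to_unc("/cygdrive/f/cases/2024", mapping))
```

Keeping the conversion in one place avoids hand-editing slashes in every script that touches both forms.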

Speaker:
00:40:33

Um,

Speaker:
00:40:34

scripting

Speaker:
00:40:35

here's the thing.

Speaker:
00:40:36

After all of this experience and everything you've learned, you're probably

Speaker:
00:40:40

never gonna use any of this again.

Speaker:
00:40:42

I don't know about that.

Speaker:
00:40:43

I dunno about that.

Speaker:
00:40:44

I tell you what, I'm, I'm taking a tar, all those scripts that

Speaker:
00:40:47

I wrote, um, because I will say this, that, that the NetBackup

Speaker:
00:40:53

documentation while, uh, extensive, it doesn't give a lot of examples.

Speaker:
00:41:00

And so like, I'm thinking of like, um, like the bpduplicate command,

Speaker:
00:41:06

which is the command to copy backups from one place to another.

Speaker:
00:41:11

I couldn't, I couldn't figure out from reading the man page how to

Speaker:
00:41:16

actually do, to do what I needed to do.

Speaker:
00:41:19

So I would, I would like.

Speaker:
00:41:22

I would do, I would have to run tests, you

Speaker:
00:41:25

know, I'd, you know, um, and, um, the, you know, not like now that Cohesity's

Speaker:
00:41:31

acquiring them, it's not like they're now gonna rewrite their man pages.

Speaker:
00:41:34

I just thought that they could have used some more, some more examples.

Speaker:
00:41:38

But

Speaker:
00:41:38

Yeah.

Speaker:
00:41:39

I figured it out eventually.

Speaker:
00:41:42

You know, I think someone used to have a forum that people would post on about.

Speaker:
00:41:46

Yeah, someone used to have that and then, but people stopped posting

Speaker:
00:41:51

on that forum, so I don't know

Speaker:
00:41:52

You know?

Speaker:
00:41:53

Um, where people are getting their help now,

Speaker:
00:41:56

but, uh,

Speaker:
00:41:58

Well, I'm glad that this is almost over,

Speaker:
00:42:00

yeah.

Speaker:
00:42:01

Yeah.

Speaker:
00:42:02

nearly over and I'm glad you're still alive,

Speaker:
00:42:05

I am alive.

Speaker:
00:42:06

I didn't kill anyone along the way.

Speaker:
00:42:08

I didn't scream at anyone.

Speaker:
00:42:09

Like the, the story that

Speaker:
00:42:11

you have heard were, were Curtis Cuss Preston.

Speaker:
00:42:14

I didn't scream at anyone.

Speaker:
00:42:15

yeah.

Speaker:
00:42:16

but I really, really, really think you should do an office space on those filers.

Speaker:
00:42:23

yeah.

Speaker:
00:42:23

Well, that would sort of defeat the purpose of the

Speaker:
00:42:26

but, uh, I, yeah, I, like that idea.

Speaker:
00:42:30

Hmm.

Speaker:
00:42:31

Anyway.

Speaker:
00:42:31

Well, uh, thanks Prasanna for helping me, uh, sort of through this.

Speaker:
00:42:37

You were my constant counselor through this.

Speaker:
00:42:40

I think I learned a bunch.

Speaker:
00:42:42

I know usually I'm all about YouTube knowledge, but in this case it was

Speaker:
00:42:46

the Preston knowledge, so it was good.

Speaker:
00:42:48

Yeah.

Speaker:
00:42:48

Yeah.

Speaker:
00:42:50

uh, thanks everybody else for, uh, uh, listening along with this sad, sad story

Speaker:
00:42:56

with I think a decent, happy ending.

Speaker:
00:42:58

That is a wrap.

Speaker:
00:43:02

The Backup Wrap-up is written, recorded and produced by me, W. Curtis Preston.

Speaker:
00:43:07

If you need backup or DR

Speaker:
00:43:09

consulting, content generation, or expert witness work,

Speaker:
00:43:12

check out backupcentral.com.

Speaker:
00:43:15

You can also find links to my O'Reilly books on the same website.

Speaker:
00:43:19

Remember, this is an independent podcast and any opinions that you

Speaker:
00:43:23

hear are those of the speaker

Speaker:
00:43:25

and not necessarily an employer.

Speaker:
00:43:28

Thanks for listening.
