April 18, 2022

Vast Data really does appear to be "vast"

Vast is a massively-scalable storage system designed around multiple pieces of technology that weren't available just a few years ago (e.g. NVMe, Storage class memory, QLC) that offers both file and object functionality, immutable snapshots, and integration with the cloud to address the "smoking hole" problem. Their typical sale (of which they've made many) is north of $1 million, and they have many exabytes of disk in the wild. It's a scale-out storage system without all the typical East-West traffic such systems have. We do our best to poke holes in their offering, but Howard Marks goes toe-to-toe quite well. This one went a little long (one hour) but we truly were fascinated with the Vast story Howard was telling.

Transcript
curtis:

I'm pretty sure we've said smoking hole more times

curtis:

than we've on this podcast.

curtis:

Just for the record.

curtis:

Just saying

curtis:

It's getting a

curtis:

lot of play today.

curtis:

Hi and welcome to Backup Central's Restore it All podcast.

curtis:

I'm your host, W.

curtis:

Curtis Preston, AKA Mr.

curtis:

Backup and I have with me, my Bhangra dance consultant, Prasanna

curtis:

Malaiyandi, how's it going Prasanna?

Prasanna:

I'm good Curtis, but I have to warn you.

Prasanna:

I am not a dancer at all.

Prasanna:

So probably the wrong person to be seeking advice about dancing from.

curtis:

But you said that you knew about Bhangra dancing and that you

curtis:

could advise me on these things.

Prasanna:

I told you that it's like an Indian dance style, if you will.

Prasanna:

And you had asked a question of, have I seen it because I bet I've

Prasanna:

seen a bunch of Bollywood movies.

curtis:

You expanded my horizon. I bought my tickets. My wife and I will be going

curtis:

to see the show. It's called Bhangin' It!

curtis:

It's "bangin'" spelled with an H, so it's trying to, like,

curtis:

do an homage to the Bhangra.

curtis:

So it's a new musical at the La Jolla Playhouse, which is a very nice

curtis:

Playhouse that I've actually never been to.

curtis:

I've lived here 20 something years.

curtis:

I've never watched a show there, but a lot of like big Broadway

curtis:

shows actually start out there.

curtis:

I've never started.

curtis:

I've, always watched the Broadway shows

Prasanna:

Like

Prasanna:

gone to Broadway.

curtis:

This is the kind of show that could possibly hit big on Broadway.

curtis:

And so we'll see it and we'll see if it's any good and

curtis:

I'll

Prasanna:

waiting for

curtis:

with my review.

Prasanna:

Yes.

Prasanna:

I think our listeners will be curious.

Prasanna:

And by the way, for those in San Diego, when is it running?

Prasanna:

Do you know how long?

curtis:

It's running.

curtis:

It's running until April.

Prasanna:

Okay.

Prasanna:

So that

Prasanna:

was

Prasanna:

folks in San Diego.

curtis:

Yeah.

curtis:

Yeah, depending on when this goes live, if it goes live less than a month from

curtis:

now, then you have two days left to go see it because it runs until April

curtis:

17th at the La Jolla Playhouse, by which time all the tickets will be

curtis:

gone and you won't be able to see it.

curtis:

Sorry, I don't know what to tell you. But we have a longtime

curtis:

friend on the podcast here today.

curtis:

Prasanna.

curtis:

I'm excited to bring him on, and not just because he's one of those

curtis:

people that make me feel young.

curtis:

He's been in the IT industry for an awfully long time, makes me feel like

curtis:

a young whippersnapper sometimes.

curtis:

He is now the technologist extraordinary and plenipotentiary at Vast Data.

curtis:

Welcome to the podcast, Howard Marks.

Howard:

Thank you.

Howard:

It's very nice to be here.

Howard:

I was always a Bob Fosse guy, so I don't know much about Indian dance.

Prasanna:

Curtis didn't either before he met me.

Prasanna:

So it's fine.

curtis:

Yeah, my knowledge of Indian dance basically includes the

curtis:

reference to it in what was that movie?

Prasanna:

Millionaire.

curtis:

Bride and Prejudice.

Prasanna:

Oh

curtis:

There's a

Prasanna:

yeah.

Prasanna:

I th I think

curtis:

It's a Pride and Prejudice

Prasanna:

yeah.

curtis:

knockoff done with, what's her name?

Prasanna:

Aishwarya Rai.

curtis:

I think.

curtis:

Remember, in the movie she gives two dance moves.

curtis:

It was petting the dog and screwing in the light bulb.

curtis:

I don't know if you remember that.

curtis:

She says that.

curtis:

That's literally the extent of my knowledge of Indian dance.

curtis:

That, and the fact that I've watched a bunch of Bollywood movies, but

curtis:

that's all thanks to Prasanna.

Prasanna:

Yeah.

curtis:

So you never know what you're going to get when you're listening to the

curtis:

Backup Central Restore it All podcast.

curtis:

Speaking of which, let me throw out our usual disclaimer, Prasanna

curtis:

and I work for different companies.

curtis:

Prasanna works for Zoom.

curtis:

I work for Druva.

curtis:

This is not a podcast of either company and the opinions that you hear are

curtis:

ours. And be sure to rate this podcast at ratethispodcast.com/restore, or just

curtis:

go to your favorite podcatcher, like Apple Podcasts, and just scroll down

curtis:

to the bottom and give us some stars.

curtis:

And if you really want to make my day, actually put some words there.

curtis:

Yeah, absolutely.

curtis:

And if you are interested in the things that we're interested in, like

curtis:

backups and storage and resilience and ransomware recovery and cyber

curtis:

warfare and all of these things.

curtis:

Then just send me a note @wcpreston on Twitter, or wcurtispreston@gmail, and

curtis:

I'll be happy to get you on the podcast.

Prasanna:

friendly.

Prasanna:

We ask questions.

curtis:

we even apparently, although the last episode I said, unless your

curtis:

name was Stewart and apparently Stewart has now reached out to you Prasanna

Prasanna:

Yes, he has.

Prasanna:

He

curtis:

and,

Howard:

So even

curtis:

and

Howard:

name

curtis:

Even Stuart can get on this podcast.

curtis:

So if we're going to let, you know, a mouse on the podcast, then surely we can let

curtis:

you. His name is Stuart Little, for those of you that didn't get that reference

curtis:

anyway.

Howard:

to make me feel honored here.

curtis:

We literally let anybody in the door,

curtis:

including guys who always wear Hawaiian shirts.

Howard:

They're comfortable.

Howard:

They come in my size and at this point I'm just known for them.

Howard:

I have been known to tell people I'm going to meet at the

Howard:

Starbucks at some conference.

Howard:

Just look for Santa Claus in an Aloha shirt.

Howard:

That will be me.

curtis:

much.

curtis:

It pretty much

Prasanna:

that's.

Howard:

Yeah.

Howard:

You know how many 350 pound guys with a gray beard are there walking around the

Howard:

average tech show, wearing an Aloha shirt?

Howard:

Two

curtis:

I'm going to, yeah.

curtis:

Two, yeah.

curtis:

At most.

curtis:

Absolutely.

curtis:

And one of them is going to be you.

Howard:

Yeah.

curtis:

so how long have you been at Vast Data?

Howard:

I've been at Vast Data three years and 15 days.

curtis:

Wow.

Prasanna:

And the company is fairly new as well.

Howard:

I joined Vast Data the day before we came out of stealth.

Howard:

My, my first official act at Vast Data was a briefing for Chris Mellor followed

Howard:

the next day by Storage Field Day.

curtis:

Wow.

Howard:

Nothing like starting off running

Howard:

Now, I joined Vast from being an independent analyst.

Howard:

So there were a couple of weeks there where I was getting brought up to speed

Howard:

and such before my official start date.

Howard:

But yeah,

curtis:

And why don't you give a for those that aren't familiar with Vast

curtis:

Data, give us, you know, the elevator

Howard:

sure.

curtis:

and

Howard:

The really short form on Vast Data is that we make very large scale all

Howard:

flash file and object storage systems.

Howard:

And when I say very large scale our average selling price for

Howard:

our cluster is well on the north side of a million dollars.

Howard:

It's multiple petabytes.

Howard:

Today we're just introducing a new storage enclosure that brings

Howard:

our building block down from 675 terabytes per HA enclosure to 338.

Howard:

So we're taking it down by a factor of two.

Howard:

We're going from a two U to a one U enclosure.

Howard:

We'll talk about that in a little bit, but the innovative thing

Howard:

about Vast is the architecture.

Howard:

If you talk about a large-scale system like we build, traditionally that's been

Howard:

done with a scale out, shared nothing model where you have a lot of x86 servers.

Howard:

Each of those x86 servers owns some set of media and they communicate

Howard:

on a backend network and software makes it look like one big system.

Howard:

But those systems start to break down at really large scale.

Howard:

And so we've come up with a new model.

Howard:

We call it DASE, the disaggregated shared-everything architecture. Instead of having a field of

Howard:

peer nodes, each of which owns some media, we disaggregated the media into these HA

Howard:

enclosures that I was just talking about.

Howard:

So no single point of failure, 400 gig connections to an NVMe fabric and

Howard:

that's typically a hundred gig Ethernet.

Howard:

Some of our HPC customers like to run InfiniBand so we

Howard:

can do InfiniBand as well.

Howard:

All those enclosures do is hold data.

Howard:

There's no services there.

Howard:

All of the services, everything that you would think of as the controller function

Howard:

of the system runs in stateless Docker containers in the front end servers.

Howard:

So when a user makes a request to a protocol server to one of

Howard:

those front end servers could be NFS, could be SMB, could be S3.

Howard:

That server looks in the metadata that's stored in storage class memory

Howard:

in the enclosures, finds the data the user's requesting in the data in

Howard:

QLC flash in those same enclosures, retrieves it over the NVMe over Fabrics

Howard:

fabric and delivers it to the user.

Howard:

So there's none of the traffic from node to node required to reassemble

Howard:

data. Everything's north-south across that NVMe over Fabrics connection.

Howard:

And since the metadata is in storage class memory, it's fast enough to

Howard:

directly access by all of the front end servers that they can just share it.

Howard:

They don't have to cache it.

Howard:

And by not having the cache, we don't have all the complexities

Howard:

of keeping the cache coherent.

Prasanna:

I was just going to ask about that, Howard.

Prasanna:

So it looks like you're disaggregating the actual storage

Prasanna:

and metadata from all the front end processing, which allows,

Prasanna:

I would assume, the front end to scale independently of the backend.

Howard:

So each of those front end protocol servers, mounts all of the

Howard:

SSDs in the cluster at boot time.

Howard:

And then it looks at all of those SSDs, and those are the SCM

Howard:

SSDs that hold the metadata and the QLC SSDs that hold the data.

Howard:

So everybody has access to everything.

Howard:

And instead of sending messages back and forth between the front end servers,

Howard:

they simply write to a single source of truth in the shared metadata, so that

Howard:

you can place a lock on the metadata or update the metadata.

Howard:

But you never have to tell everybody else you updated it because if they want

Howard:

to know what the state is, they'll go look in the one place where it's true.
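
The shared-metadata idea Howard describes here can be illustrated with a small Python sketch. It is a toy model under stated assumptions, not Vast's implementation: `SharedMetadata`, `FrontEnd`, and the `cnode-a`/`cnode-b` names are hypothetical, and an in-process dictionary stands in for metadata held in storage class memory. What it shows is that the front-end servers keep no private copy of the metadata, so an update made through one is visible through any other without a server-to-server message.

```python
class SharedMetadata:
    """Hypothetical stand-in for the metadata kept in shared SCM."""

    def __init__(self):
        self.entries = {}  # path -> where the data lives
        self.locks = {}    # path -> name of the front end holding the lock

    def try_lock(self, path, owner):
        holder = self.locks.get(path)
        if holder is not None and holder != owner:
            return False
        self.locks[path] = owner
        return True

    def unlock(self, path, owner):
        if self.locks.get(path) == owner:
            del self.locks[path]


class FrontEnd:
    """Stateless protocol server: no cache, only a handle to shared metadata."""

    def __init__(self, name, meta):
        self.name = name
        self.meta = meta

    def update(self, path, location):
        if not self.meta.try_lock(path, self.name):
            return False  # another front end holds the lock
        try:
            self.meta.entries[path] = location
        finally:
            self.meta.unlock(path, self.name)
        return True

    def lookup(self, path):
        # Always read the single source of truth; there is no cache to invalidate.
        return self.meta.entries.get(path)


meta = SharedMetadata()
a = FrontEnd("cnode-a", meta)
b = FrontEnd("cnode-b", meta)
a.update("/projects/run1.dat", "qlc-enclosure-3")
found = b.lookup("/projects/run1.dat")
```

In a real cluster the lock and update would have to be atomic operations against the shared media; the dictionary here only conveys the shape of the design.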

Prasanna:

Yeah.

Prasanna:

And because everything is stateless in the front end, you don't have to worry

Prasanna:

about that necessarily to everyone

Howard:

Right,

Prasanna:

that backend

Howard:

right.

curtis:

So the backend has both SSDs and QLC.

Howard:

It has SCM, storage class memory, SSDs, and that can be

Howard:

Optane or... and it has low-end QLC SSDs.

Prasanna:

So

curtis:

And the, the, yeah the, storage class memory is what's

curtis:

holding the metadata and the

curtis:

QLC is, what's holding the data.

Howard:

Primarily.

Howard:

It's also used as a write buffer.

curtis:

Okay.

curtis:

Okay.

Howard:

So writes come into the storage class memory and get mirrored to two

Howard:

different SCM SSDs and then get ACKd.

Howard:

And then the migration from SCM to QLC happens after the ACK.

Howard:

So we have more time to do things like compress more fully.
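
The write path Howard outlines (mirror to two SCM devices, acknowledge, migrate to QLC afterwards) can be sketched in a few lines of Python. This is a minimal illustration under assumptions, not Vast's code: `Device` and `WritePath` are hypothetical names, dictionaries stand in for SSDs, and `zlib` at its highest level stands in for whatever data reduction the real system applies once the write is off the latency-critical path.

```python
import zlib
from dataclasses import dataclass, field


@dataclass
class Device:
    """Hypothetical stand-in for one SSD; blocks maps key -> bytes."""
    name: str
    blocks: dict = field(default_factory=dict)


class WritePath:
    """Toy model: mirror to two SCM devices, ACK, destage to QLC later."""

    def __init__(self, scm_a, scm_b, qlc):
        self.scm = (scm_a, scm_b)
        self.qlc = qlc

    def write(self, key, data):
        # The write is acknowledged as soon as both SCM copies exist.
        for dev in self.scm:
            dev.blocks[key] = data
        return "ACK"

    def destage(self):
        # Runs after the ACK, so it can afford slower, stronger compression.
        moved = 0
        for key in list(self.scm[0].blocks):
            data = self.scm[0].blocks[key]
            self.qlc.blocks[key] = zlib.compress(data, 9)
            for dev in self.scm:
                dev.blocks.pop(key, None)
            moved += 1
        return moved


path = WritePath(Device("scm0"), Device("scm1"), Device("qlc0"))
status = path.write("block-1", b"backup data " * 100)
```

Calling `path.destage()` later moves the buffered block to the QLC device in compressed form and frees both SCM copies.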

curtis:

This is a very different game than.

curtis:

This idea of all of the front end nodes, being able to mount the entire

Howard:

Yes.

curtis:

the backend

Howard:

Yeah.

Howard:

We eliminate the whole concept of ownership and all the

Howard:

complexity that that creates.

Howard:

And now I'm going to blow your mind because when I say the metadata is in

Howard:

the SCM, I don't mean just the element store metadata, the metadata for our

Howard:

merged file system object store, but also the data reduction metadata.

Howard:

And so when you add another enclosure to the cluster, you add more SCM, which

Howard:

means you add more room for that metadata.

Howard:

So regardless of the size of cluster, the cluster is one data reduction realm

Howard:

across tens or hundreds of petabytes.

Prasanna:

Because everything looks like one cluster, if you will, or one system.

Howard:

right.

Howard:

And, we don't have to hold the data deduplication hash

Howard:

table in memory any place.

Howard:

It's all in SCM where it's fast enough we don't need that.

Howard:

So we don't have the limitations of how big a deduplication realm can be

Howard:

that most deduplication systems have.

curtis:

right.

curtis:

They typically top out around a petabyte or so, and then you

curtis:

can't get any bigger than that.

curtis:

I don't know where to start on my questions!

Howard:

So from the backup point of view, we're discovering that

Howard:

customers are starting to demand higher restore speeds. Traditionally,

Howard:

all a customer worried about when they were picking the storage for their

Howard:

backups was, is it fast enough that I can make my backup within the window?

Howard:

And so we got systems like Data Domain and other disk based deduplicating systems,

Howard:

where there was a big write read asymmetry where you could write data faster to

Howard:

them than you could read data from them.

Howard:

Because reading data that caused the system to rehydrate turned

Howard:

sequential IO into random IO.

Howard:

And they had disks on the backend.

Howard:

And as disk drives have gotten bigger, this has gotten worse

Howard:

because a 20 terabyte disk drive today delivers exactly the same

Howard:

number of IOPS that a one terabyte disk drive delivered 10 years ago.

Howard:

So now 20 terabytes of data gets a 20th as many IOPS.

Howard:

And so you discover, yes, it takes me eight hours to back this up.

Howard:

It takes me 82 hours to restore it

Howard:

and
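
The arithmetic behind Howard's point about drive size is simple enough to write down. In this sketch the figure of 200 IOPS per drive is an assumption picked only for illustration; the ratio is what matters, and it does not depend on that number.

```python
def iops_per_terabyte(drive_iops, capacity_tb):
    """IOPS available to each terabyte stored on a single drive."""
    return drive_iops / capacity_tb


DRIVE_IOPS = 200.0  # assumed; the same figure is used for both drive generations

old_density = iops_per_terabyte(DRIVE_IOPS, capacity_tb=1)
new_density = iops_per_terabyte(DRIVE_IOPS, capacity_tb=20)
ratio = new_density / old_density  # one twentieth
```

If rehydrating a deduplicated backup turns the restore into random I/O, each terabyte on the larger drive has one twentieth of the I/O budget it had on the smaller one, which is where the gap between backup time and restore time comes from.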

curtis:

Yeah.

curtis:

Dedupe has never been very friendly for large restores, especially if

curtis:

you're doing any sort of, if you want to do a live mount, forget it, right?

curtis:

Directly from a Data Domain.

curtis:

It's possible in the same way, it's possible that...

Howard:

That's, but that's, you can bring up the Oracle or the SQL server VM.

Howard:

So that the IT guys can access the passwords database, so that everybody

Howard:

can start running ERP on it again.

Prasanna:

Yeah.

Prasanna:

Don't use it as production.

Prasanna:

That's a bad thing.

Howard:

Right.

curtis:

right.

Howard:

And we're discovering that people's requirements are getting tighter.

Howard:

You start thinking about software as a service providers where, you know, if you

Howard:

run some industry-specific accounting as a service for a thousand

Howard:

customers, that's a thousand databases.

Howard:

And when something goes wrong, you want to restore those databases

Howard:

as fast as you can, because your customers are going to be standing

Howard:

over your shoulder, yelling at you.

Howard:

And the last thing that's kicked a couple of our potential customers over

Howard:

the edge is the ransomware threat.

Howard:

Because the size of the restore grows so much with ransomware.

Howard:

You start off with, I need to protect my data against ransomware

Howard:

and use various methods to do that.

Howard:

And so we have indestructible snapshots.

Howard:

So you can say snapshot this folder at 6:00 AM when the backup window

Howard:

closes and retain it for 30 days.

Howard:

And even if the administrator wants to delete it he can't.

Prasanna:

So I

Howard:

but

Prasanna:

about that.

Prasanna:

So I did read a little small blurb about that.

Prasanna:

So

Prasanna:

What prevents, is that locked down forever?

Prasanna:

Like an admin can't delete it no matter what, or is it just, there

Prasanna:

are additional safeguards in place to make sure that someone doesn't

Prasanna:

compromise the admin password,

Howard:

Anyone who ever talked to any customer of EMC Centera knows that if you

Howard:

build a system where you literally can't delete data someone will get themselves in

Howard:

trouble and fill it a hundred percent up with junk, and it will be a bad situation.

Howard:

So you have to provide some mechanism for overriding this because customers

Howard:

will paint themselves in corners.

Howard:

As I said, our average selling price is well over a million dollars.

Howard:

We don't have small customers who we only know third hand through VARs.

Howard:

We are in relatively intimate contact with every one of our customers.

Howard:

And so we don't have a fixed policy that says, if you jump through these

Howard:

hoops, then we will let you delete the undeletable snapshots. We and the

Howard:

customer agree what the hoops are.

Howard:

Yeah, multifactor authentication must be three of the five people on this list.

Howard:

They have to know the passphrase and the proper response to the passphrase.

Howard:

And if they respond with this other response to the passphrase, then for

Howard:

the next 24 hours, do not give anybody the secret. As complicated as you want.

Howard:

As long as we can write it down, those are the rules.

Howard:

And then once you've jumped through the hoops, we give you a time limited

Howard:

token that allows you to delete snapshots for a short period of time.

Howard:

And that token is a one-time pad.

Howard:

So that you can't re it's not good for

Prasanna:

Yeah.

Howard:

an hour whenever you use it.

Howard:

It is good for the time when we issue it for some limited period of time.

Howard:

And then you have to know the next one.

Howard:

And it's just, it was the best solution we could come up with.
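
The override process Howard describes has two parts that lend themselves to a sketch: a quorum of named approvers, and a token whose validity window starts when it is issued, not when it is first used. The Python below is a hypothetical model under those assumptions only. `SnapshotGuard`, the approver names, and the one-hour window are invented for illustration, and the real procedure is agreed with each customer and involves people, not just code.

```python
import secrets


class SnapshotGuard:
    """Toy model of quorum-approved, time-limited snapshot deletion."""

    def __init__(self, approvers, quorum, token_ttl_s):
        self.approvers = set(approvers)
        self.quorum = quorum
        self.token_ttl_s = token_ttl_s
        self.snapshots = set()
        self._expiry = {}  # token -> time at which it stops working

    def issue_token(self, approvals, now):
        valid = set(approvals) & self.approvers
        if len(valid) < self.quorum:
            raise PermissionError("not enough approvers")
        token = secrets.token_hex(16)
        # The clock starts at issue time, whether or not the token is used.
        self._expiry[token] = now + self.token_ttl_s
        return token

    def delete_snapshot(self, name, token, now):
        expiry = self._expiry.get(token)
        if expiry is None or now > expiry:
            self._expiry.pop(token, None)
            return False
        self.snapshots.discard(name)
        return True


guard = SnapshotGuard(["ann", "bo", "cy", "di", "ed"], quorum=3, token_ttl_s=3600)
guard.snapshots.add("backups@06:00")
token = guard.issue_token(["ann", "bo", "cy"], now=0)
late = guard.delete_snapshot("backups@06:00", token, now=7200)  # window has passed
```

Because the window is tied to issue time, a stolen token that surfaces later is useless, and a fresh approval round is needed for the next deletion.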

Prasanna:

And this probably helps in cases where someone

Prasanna:

attacks a company, they get access to a storage system.

Prasanna:

They start deleting back-ups or what have you, it gives you

Prasanna:

that extra layer of protection.

Howard:

I've seen ransomware, you know, we think of ransomware as being on the

Howard:

order of the viruses we've dealt with.

Howard:

And the ransomware reports I see are much more frequently and this ransomware

Howard:

opened a door and then someone physically hacked for a long period of time.

Howard:

And they took over some workstation, eventually that some

Howard:

administrator logged into and they have an administrator password.

Howard:

And if we're just worried about, if we're just worried about the

Howard:

script kiddies, I can protect against the script kiddies in

Howard:

building my backup infrastructure and architecture and those permissions.

Howard:

But we're talking about more sophisticated attacks than that.

Howard:

And frankly we talk about it as ransomware, but it's also

Howard:

rogue administrator protection.

Howard:

Then it's also just the guy who is disgruntled and decides, on his

Howard:

way out the door, he's going to make life difficult for his employer.

Howard:

You're protected against that too.

curtis:

Yeah.

curtis:

Yeah.

curtis:

And, sometimes rogue administrator is a true rogue administrator, meaning

curtis:

it's a, it's someone masquerading as an administrator as well.

curtis:

That hacker that you talked about.

curtis:

So let me ask, call it a difficult question, call it

curtis:

whatever you want to call it.

curtis:

But when I hear about boxes where you're not supposed to be able to

curtis:

delete data, but then there is this other way where you can delete data.

curtis:

I have to ask the question: doesn't that suggest

curtis:

that, and I'm assuming this is a Unix-based OS, that

curtis:

there is a root account?

Howard:

We run in containers under Linux.

curtis:

So there is an account, there is a root account, and

curtis:

if someone did some sort of just the right attack against that box.

curtis:

And again you've already mentioned that there is that

curtis:

these are sophisticated attacks.

curtis:

If someone did a privilege escalation attack against

curtis:

the CoreOS, and now they've gained access to a privileged Couldn't want

Howard:

if someone

curtis:

want.

Howard:

administrative access to the management network, because the

Howard:

ports that face users as storage

Howard:

ports, can't be logged into

curtis:

Okay.

curtis:

they're

curtis:

cause they're back.

curtis:

Cause they're backend,

Howard:

so if you're wondering, if you want to log into

Howard:

Linux as root on one of our appliances, then you need,

Howard:

then the management network has to be compromised.

Howard:

And we start saying, are you looking for protection against destruction?

Howard:

Because if your data center is compromised, everything can be destroyed,

Howard:

but that's not really the level of attack that we're, concerned about.

Howard:

We're not talking about and someone walked into the data center because we

Howard:

hadn't disabled their key card and left 20 pounds of thermite in the middle of

Howard:

the floor. Who would do such a thing?

Howard:

I've done that on video. I was being paid.

Howard:

So you know, I, it is a vulnerability, but it's the

Howard:

most general of the vulnerabilities.

Howard:

You're pointing out that if I have sufficient

Howard:

access, I can destroy anything.

curtis:

But it sounds like you have protected from the rogue

curtis:

administrator, the stupid administrator.

curtis:

And someone gaining access to those.

curtis:

But let me just ask you to clarify something from your previous answer, when you said

curtis:

that means the management network has been compromised, what do you mean by that?

Howard:

So you manage the system through different ethernet ports,

Howard:

than you access the system.

Howard:

And so if there's a vulnerability where a user could log

Howard:

into the appliance as the Linux root user that Linux root user can only

Howard:

log in on the management, physical Ethernet port on the appliance, not

Howard:

on the gigabit NVMe over fabric port.

curtis:

Gotcha.

curtis:

Okay.

Howard:

so network security should keep that from being an internet

Howard:

connected network and open to attack.

curtis:

Gotcha.

curtis:

Gotcha.

curtis:

Makes sense.

curtis:

Okay.

Prasanna:

I had a

Prasanna:

question.

Prasanna:

So Howard, before we dive more into the data protection side, one thing that

Prasanna:

was curious to me was you mentioned that Vast supports file and object.

Prasanna:

Could you talk about some of the use cases that you see

Prasanna:

your customers using Vast Data?

Prasanna:

And then I think maybe some of the protection stuff will

Prasanna:

probably come alongside that.

Howard:

Sure.

Howard:

We have the majority of our customers use us for primary storage.

Howard:

And that includes one of the biggest travel sites who uses us for their

Howard:

big data analytics and are using the S3 Presto connectors to store

Howard:

all of their analytic data on us.

Howard:

So that we're much faster than a disk based object store, obviously.

Howard:

And they can do that processing faster.

Howard:

We have a lot of hedge funds who do time series analysis of trade

Howard:

data against large databases to try and predict the market.

Howard:

We have a lot of life sciences customers who are doing things like

Howard:

molecular modeling and cryo-electron microscopy, where one microscope generates

Howard:

many terabytes of data a day because we have very high resolution images.

Howard:

And we have a major motion picture studio who makes movies.

Prasanna:

And so it looks like they are using both sort of the file and the object

Prasanna:

interfaces for a lot of these use cases.

Prasanna:

So specifically around data protection and backup.

Prasanna:

A lot of times you hear vendors and customers say, object

Prasanna:

store doesn't need to be backed up.

Howard:

This is a subject that personally I find myself on the fence about. Part

Howard:

of me goes, I've built a huge amount of resiliency into this single system.

Howard:

And for availability, I may need to have it in another

Howard:

location, but for durability, assuming that the whole data center doesn't end

Howard:

up being a smoking hole in the ground I could get away without backing this up.

Howard:

I remain firmly on the fence there.

Howard:

But

curtis:

assuming you have the second copy somewhere, you're going to

curtis:

write.

Howard:

may decide that it is data that, if the whole data

Howard:

center goes away, I don't need.

curtis:

Okay.

curtis:

Yeah.

curtis:

Agreed.

curtis:

If, yeah, if we have that data, I would argue why did we make

curtis:

it in the first place, but,

Howard:

The risk of that is small enough that I'm

Howard:

going to go once every thousand years this is going to cost me a million

Howard:

dollars, but it's going to cost me a million dollars a year to protect.

Howard:

So I'm going to take that risk.
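
Howard's reasoning here is an expected-cost comparison, which a few lines of Python make explicit. The figures are the round numbers from the conversation, used as an illustration and not as real pricing; `expected_annual_loss` is a hypothetical helper name.

```python
def expected_annual_loss(loss_dollars, years_between_events):
    """Average yearly cost of an event that happens once per given interval."""
    return loss_dollars / years_between_events


loss_if_site_is_lost = 1_000_000   # cost of losing the data once
years_between_events = 1000        # "once every thousand years"
protection_cost_per_year = 1_000_000

risk_per_year = expected_annual_loss(loss_if_site_is_lost, years_between_events)
accept_risk = risk_per_year < protection_cost_per_year
```

With these inputs the expected loss is $1,000 per year against $1,000,000 per year for protection, so accepting the risk is the cheaper choice. The comparison flips quickly if the data is worth more or the event is more likely.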

curtis:

okay.

curtis:

So I will agree such data classes exist.

curtis:

I don't run into them much, but I will agree

Howard:

yeah.

Howard:

And then we get to the, okay, so this is the object store that does a

Howard:

deep dispersal coding, and they have three locations and I can lose one.

Howard:

So do I need to back that up?

Howard:

That starts getting really close to now I need to back it up because there could be

Howard:

a bug in the software that loses my data.

Howard:

'Cause that's the only thing that could cause that. I'm

Howard:

protected against one of my three data centers being a smoking hole.

Howard:

Again, I could see you going, I want to be safe, and I can

Howard:

see you going, it's not worth it.

curtis:

And.

Howard:

Now for us, most of our users use us for primary storage.

Howard:

And for someone like that, big data analytics data, they may not back it

Howard:

up because it's regeneratable, and it's not actually in the form

Howard:

it's in on the object store, but it's extracts from other things and they

Howard:

can run the ETL again and it would be really annoying, but it is replaceable.

Howard:

And then for other use cases, this is primary data.

Howard:

I gotta protect it.

Howard:

And so we can do snapshots to an S3 compatible object store

Howard:

and back ourselves up that way.

Howard:

Or you can back us up the usual ways.

curtis:

And could you use one of the ones that are like

curtis:

Glacier Deep Archive, where I hope I don't ever have to use this.

curtis:

I know it's going to cost me a crap ton of money, but it'll save me a lot of money.

curtis:

In the meantime, can you use that kind of storage?

Howard:

Reading data out of that kind of storage

Howard:

requires a few manual steps.

Howard:

If you just use S3 standard then data in those snapshots is available

Howard:

in a .Remote folder, like the .Snapshots folder in the file system.

Howard:

So users can do self-service restore, but

Howard:

that feature means the object has to be immediately readable.

Howard:

And so if you, if it went to

Howard:

Glacier, then.

Howard:

And it would be like your NetBackup

Prasanna:

Okay.

Howard:

this backup isn't in the catalog anymore.

Howard:

So I got to put those files someplace where I can catalog it and then I got

Howard:

a catalog and then I can restore it.

Howard:

So if you

curtis:

so it's possible.

curtis:

it

curtis:

doesn't sound like it's very it's the smoking hole copy, right?

Howard:

It is annoying.

Howard:

But if it's just, but if you're protecting against the smoking hole,

Howard:

then you know, you may be willing to put up with the annoyance.

curtis:

I'm pretty sure we've said smoking hole more times

curtis:

than we've on this podcast.

curtis:

Just for the record.

curtis:

Just saying

curtis:

It's getting a lot of play today.

Howard:

I spent way too long as a disaster recovery planner.

curtis:

Yeah.

curtis:

Yeah.

curtis:

So the majority of your customers use you for primary storage, but clearly

curtis:

you're trying to expand your TAM,

Howard:

Well, we deliver all flash at a substantially lower

Howard:

price than anybody else does.

Howard:

We start with using the cheapest QLC flash.

Howard:

We have a file system designed to treat that flash properly.

Howard:

So we never do small writes that would cause a lot of write amplification.

Howard:

We do very wide erasure code stripes.

Howard:

So we've got under 3% overhead, and then we do guaranteed better data reduction

Howard:

than anybody else in the business.
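
The "under 3% overhead" claim follows directly from stripe width, as the short calculation below shows. The stripe geometries are assumptions chosen for illustration and were not stated in the conversation; `parity_overhead` is a hypothetical helper.

```python
def parity_overhead(data_strips, parity_strips):
    """Fraction of raw capacity consumed by parity in one erasure-coded stripe."""
    return parity_strips / (data_strips + parity_strips)


narrow = parity_overhead(data_strips=8, parity_strips=2)    # a conventional 8+2 stripe
wide = parity_overhead(data_strips=146, parity_strips=4)    # an assumed very wide stripe
```

Here `narrow` works out to 20% and `wide` to about 2.7%. The wider stripe also carries more parity strips in absolute terms, so it tolerates more simultaneous device failures while giving up far less capacity.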

Howard:

And so that combination means that on an effective byte basis, from whatever backup

Howard:

data mover you're planning on using, we're going to be cheaper than a Data Domain.

Howard:

When you start saying that you have more than a petabyte of data

Howard:

and you need multiple Data Domains.

Howard:

And each one of those is going to be a separate deduplication realm.

Howard:

Then the gap starts to grow substantially.

Howard:

So for these very large customers who have five or 10 or 20

Howard:

petabytes of data across a bunch of Data Domains, simply the fact that we're

Howard:

one reduction realm makes us much more efficient.

Howard:

It's one system to manage.

Howard:

It's one namespace. It's one 20-petabyte or 50-petabyte system.
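
The effect of separate deduplication realms can be shown with a toy example. This is a sketch under assumptions: whole-chunk SHA-256 fingerprints stand in for whatever similarity detection a real system uses, and the chunk contents and the three-way split are invented for illustration.

```python
import hashlib


def stored_chunks(realms):
    """Count chunks kept when each realm deduplicates only within itself."""
    total = 0
    for realm in realms:
        fingerprints = {hashlib.sha256(chunk).hexdigest() for chunk in realm}
        total += len(fingerprints)
    return total


backups = [b"os-image", b"db-page-1", b"os-image", b"db-page-1", b"os-image", b"log-7"]

one_realm = stored_chunks([backups])                                   # single cluster
three_realms = stored_chunks([backups[:2], backups[2:4], backups[4:]])  # three appliances
```

The single realm keeps 3 unique chunks, while the same data spread over three independent realms keeps 6, because a chunk that lands on a second appliance is stored again. That gap is what grows as more appliances are added.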

curtis:

So you're saying, so let me just make sure I understood

curtis:

what you said there correctly.

curtis:

You're saying, regardless of the size of the system, you should

curtis:

be priced competitively with a Data Domain, but then the bigger you get,

curtis:

the better you look.

Howard:

under about 500, any pricing experiments under about 500 terabytes,

curtis:

Okay.

curtis:

Okay.

Howard:

in the large end of the business, but yes.

curtis:

Right, That is interesting though, that sort of.

curtis:

into that end of the business.

curtis:

And you had another there was another large, all flash competitor that's

curtis:

doing very well, but they have a very different architecture, they're referring

curtis:

of course, to the orange company.

curtis:

And

Howard:

Yeah, but there,

curtis:

than you.

Howard:

If you're talking about FlashBlade, that's really a shared nothing

Howard:

architecture. Instead of being pizza box servers, they're blade servers, and each

Howard:

blade has flash modules built in. And they don't scale nearly as large.

curtis:

So it sounds like you, you just took, you've built an

curtis:

architecture based on several new pieces of technology that simply

curtis:

weren't available, say, five years ago,

Howard:

Yeah.

Howard:

We are the storage system designed from a clean slate around the 2016 toolbox.

Howard:

So QLC flash,

Howard:

SCM, NVMe over fabrics. And other people shoehorn one or two of those technologies

Howard:

into an existing architecture, but we built the whole architecture

Howard:

around having those technologies.

Howard:

Yeah, putting all of the metadata in SCM with no cache meant it had to be in SCM.

Howard:

And it meant the connection between the compute server and that SCM had to be

Howard:

fast enough that we weren't going, "If we cached this, it would be a lot faster."

Howard:

So that meant it had to be NVMe over fabrics.

Howard:

And then the QLC flash gives us the cost.

Howard:

But it, really is if you look at any storage system, it's by definition built

Howard:

with the parts that the industry is making when they sat down to design it.

curtis:

Yeah.

Howard:

And when x86 processors, when Nehalem came along and the

Howard:

memory bandwidth and the number of PCIe lanes on processors got big enough.

Howard:

All of a sudden we stopped seeing FPGAs and ASICs in storage systems, we started

Howard:

seeing software defined storage, cause what was available for the designers

Howard:

changed and the NVMe over fabrics has been used by most of the storage

Howard:

vendors for that last mile connection going well, it's going to be fast and

Howard:

then Fibre Channel or iSCSI for the user machine to access the storage.

Howard:

But it hasn't been as effectively used for the server that is the logical

Howard:

controller to access the media on the back end and the way we use it, we broke the

Howard:

traditional limitation that a drive had to be owned by one or two controllers.

Howard:

'Cause a drive, a SAS drive or an NVMe drive, has one or two ports.

Prasanna:

Yea.

Howard:

We connect that NVMe SSD to what we call a fabric module, which

Howard:

is an NVMe over fabrics router.

Howard:

And in fact, in the new box, it's going to be a pair of Nvidia BlueField cards

Howard:

and the BlueField card routes NVMe over fabrics requests from the Ethernet network

Howard:

to the SSDs and routes the responses back.

Howard:

But that's all it does.

Howard:

We don't need x86 servers in the enclosure.

Howard:

We can do it on the ARM cores and the offloads in the BlueFields.

Prasanna:

and these are the DPUs, correct?

Howard:

Yes.

Howard:

Yeah.

Howard:

The BlueField is the DPU; it's the Nvidia Mellanox version of that.

Howard:

And so it has some ARM cores and NVMe over fabrics and RDMA and

Howard:

other built-in offloads in the chip.

Howard:

And so we leverage that to do the routing of requests from the front

Howard:

end servers, everything is, all the work gets done the SSDs and get that

Howard:

clean fast, more cost-effective channel

curtis:

Let me go back in time when you did that first presentation that

curtis:

you did to the Storage Field Day folks,

Howard:

Yep.

curtis:

how did that go over with, with those folks?

Howard:

It went over pretty well.

Howard:

There was a little being from Missouri and,

Howard:

you,

Howard:

know, we should show you,

curtis:

Cause you weren't because you were brand new.

curtis:

at that point.

Howard:

We were brand new.

Howard:

And now we're going, okay, look, we've sold a couple of exabytes of storage.

Howard:

Now, our go-to-market model's a little different. We sell software.

Howard:

We arrange for customers to buy the pre-approved hardware at cost.

Howard:

And the

Howard:

software licenses are,

curtis:

a little interesting.

Howard:

and the software licenses are transferable.

Howard:

So you license a petabyte of software.

Howard:

And you upgrade the hardware when you feel like you want to upgrade the hardware.

Howard:

Cause you want the denser faster one that is always coming, but we'll write

Howard:

the support contract for 10 years for any appliance from install date.

Howard:

So

Prasanna:

That's very different

Howard:

well, a typical

Howard:

vendor, you would buy an appliance, it would come with an OEM software license.

Howard:

They would write five years of support.

Howard:

And in year six they would encourage you very strongly to rebuy.

Prasanna:

yep.

Howard:

And then when you rebuy, you have to buy another appliance the

Howard:

software license isn't transferable.

Howard:

So you have to buy another software license.

Howard:

So with us, you gotta have your VAR go to a VAR, a hundred

Howard:

percent channel you go to a VAR.

Howard:

Your VAR goes to Avnet, says, I want this hardware for Vast.

Howard:

Now $1.2 million average selling price.

Howard:

One of our sales guys is involved.

Howard:

We're writing the high touch sale.

Howard:

It's not somebody went on a website someplace.

Howard:

Um, but essentially the VAR, writes two POs: one to Avnet for the hardware and one

Howard:

to us for the actually he writes one PO to Avnet, Avnet cuts us a PO for the software

Howard:

and, that's a capacity subscription.

Howard:

So if you bought a 675 terabyte, enclosure and an appliance, that's got

Howard:

four servers that provide the front end, which is our usual entry point.

Howard:

You could license a hundred terabytes for a year.

Howard:

Multiples of a hundred terabytes for multiples of a year.

curtis:

And so that, I think that addresses the question that I had.

curtis:

Cause I listened to the Chris Evans podcasts that you guys did.

Howard:

Yeah.

curtis:

and there was this talk of the 10 year And, again I'm gonna, I'm gonna just

Howard:

Perfect.

curtis:

acknowledge that I live in a SaaS world where we preach against

curtis:

large capacity licensing and capital purchases and all of that stuff.

curtis:

So when I heard 10 year purchase.

curtis:

I was like, what?

curtis:

I gotta, I got to decide now how much I need for 10 years, but that doesn't

curtis:

sound like what you're talking about.

Howard:

No, No, no.

Howard:

no.

Howard:

You th you buy the hardware.

curtis:

Right.

Howard:

We will write a support contract and software license.

Howard:

One agreement.

Howard:

For that hardware for up to 10 years from install date at the same rate.

Howard:

So if you want to keep it for 10 years, you keep it for 10 years

Howard:

Bought

curtis:

I could buy a smaller one and then add capacity.

Howard:

Oh yeah.

Howard:

Our ARR is three.

Howard:

Lots of people buy small and add capacity.

Howard:

We had a 300% NRR.

Prasanna:

I think you meant NRR,

Prasanna:

right?

curtis:

Thanks for explaining.

curtis:

Yeah.

curtis:

NRR,

curtis:

you said ARR.

curtis:

That's why you

curtis:

have me confused there for a minute.

Howard:

Yeah

curtis:

I was like an annual recurring revenue of three, three.

curtis:

You meant net retention rate, you're saying?

curtis:

yeah.

curtis:

So you're saying 300% your customers start out at X and they end up

curtis:

with three X very regularly.

curtis:

Okay.

Howard:

You know, and you can do that.

curtis:

it just grows as they need it to grow.

Howard:

Yeah.

Howard:

And you can do it in the hardware, so if you want to start really small, then

Howard:

you can buy hardware and license it

Prasanna:

oh, interesting.

Howard:

So You can buy, a 600 terabyte box and a hundred terabytes software

Howard:

license, and the 600 terabyte box you bought at what would be our cost.

Howard:

If we were still selling hardware, we negotiate the cost with Intel

Howard:

and Kioxia and those vendors.

Prasanna:

so you used to sell hardware and then you

Prasanna:

of,

Howard:

started off in an appliance model.

curtis:

Why would I do that?

curtis:

Is that just like ease of large capital purchase thing?

Howard:

Yeah.

curtis:

why

curtis:

would I buy a bigger box

Howard:

university, we had a university had this much money in this year's budget.

curtis:

Oh, okay.

Howard:

We won't put more than a hundred terabytes on it before the next budget

Howard:

comes around when we renew, we'll renew it as a 400 terabyte license.

Prasanna:

and I think this is where at the beginning, you said Howard, that you're

Prasanna:

looking at releasing a smaller unit.

Howard:

Yeah.

Howard:

So the new box is one.

Howard:

You,

Howard:

It uses the EDSFF E1.L, the ruler form factor, SSDs.

Howard:

So we have 22 15-terabyte SSDs for 338 raw, about 300 usable.

Howard:

And that's half the physical size, half the capacity, because what we

Howard:

have now, it holds 56 SSDs in two U.
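The capacity figures for the new enclosure (22 SSDs, roughly 338 TB raw, about 300 TB usable) can be checked with a little arithmetic. The sketch below assumes each "15 terabyte" SSD is really a 15.36 TB drive, a common marketed capacity; that per-drive figure is an assumption, not something stated in the conversation.

```python
def raw_capacity_tb(drive_count: int, drive_tb: float) -> float:
    """Total raw capacity of an enclosure in terabytes."""
    return drive_count * drive_tb


def usable_fraction(usable_tb: float, raw_tb: float) -> float:
    """Share of raw capacity left after protection and system overhead."""
    return usable_tb / raw_tb


# 22 drives is from the transcript; 15.36 TB per drive is an assumption.
raw = raw_capacity_tb(22, 15.36)
print(f"raw capacity: {raw:.0f} TB")
print(f"usable share at 300 TB: {usable_fraction(300, raw):.0%}")
```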

Prasanna:

Gotcha.

Howard:

Yeah, the new one is, from the fabric module is those NVMe routers today.

Howard:

Each one has to be a dual Xeon.

Howard:

So we have enough PCIe

Howard:

lanes and the processors don't do hardly anything.

Howard:

So there's just there's costs there.

Howard:

We don't need, if the Bluefield

Howard:

thing

Prasanna:

That's exciting.

curtis:

right.

curtis:

So let's, focus for a little bit on.

curtis:

The only reason I have historically been when, I historically heard the

curtis:

idea of using flash for backup, I'm like, that sounds ridiculous because

curtis:

for the same for cost reasons, too expensive I'm hearing you that so I

curtis:

would put it this way that, in, in this upcoming world, in this current world

curtis:

in a world where we have large nation states invading other nation states

curtis:

and then large ransomware organizations in those countries, we had this, was

curtis:

our last th they're talking about.

curtis:

So we're, talking about being retaliated against because of this other country.

curtis:

It's crazy.

curtis:

So you have this this, need more than ever before for large recoveries.

curtis:

And I, do believe strongly that there's really only one of two

curtis:

ways to be really successful in any sort of ransomware situation.

curtis:

And, it's basically about fighting the laws of physics. Either you

curtis:

have to have already restored it.

curtis:

So you already have a hot standby ready to go to switch over to or you're

curtis:

doing live mount directly from your backup and live mount directly from

curtis:

your backup is only going to happen if you either aren't, deduplicating

curtis:

like, the way Data Domain does, or

Howard:

Right.

curtis:

have flash as far

curtis:

Tell.

Howard:

if you're not, even if you're not, deduplicating when you start talking

Howard:

about big, hard drives the IO density just

Howard:

isn't there it's better

curtis:

Somewhere between you and Data Domain, I would put ExaGrid,

curtis:

because ExaGrid has that front end.

curtis:

It's not deduplicated. Now, they're,

curtis:

they're nowhere near the size of you.

Howard:

right, no.

Howard:

And they have some, and they, have, some flash cache.

Howard:

And if you look at guys who do integrated appliances where the

Howard:

software and the target are one thing, those are typically hybrids.

Howard:

And, so they'll do an instant recover for one or two VMs pretty well.

Howard:

Cause there's enough flash for that.

Howard:

But when you start going, I need the database server behind my ERP, instant

Howard:

recovered, or I need all 50 of these VMs, instant recovered, then it's then you

Howard:

just, don't have enough flash and you're going to get hard drive performance,

curtis:

And so

curtis:

what it sounds like you've replaced the hard drives with QLC

Howard:

right,

curtis:

Help me because I don't live in this world QLC from

curtis:

a cost perspective regular.

Howard:

it's, not just QLC.

Howard:

So QLC means quad-level cell; it holds four bits per cell.

curtis:

okay?

Howard:

The more bits you hold, the closer the voltage levels

Howard:

that represent the differences are, and the more sensitive the cells

Howard:

become to a few electrons escaping.

Howard:

If you have SLC, it's like a light switch it's on or off,

Howard:

It doesn't matter if a few electrons escape, you can still

Howard:

tell whether it's on or off.

Howard:

QLC.

Howard:

You got 16 values.

Howard:

The difference between value 13 and value 14 might only be a handful of electrons.
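The relationship between bits per cell and the number of voltage levels a cell must distinguish is a power of two. The short sketch below tabulates it for the common cell types; MLC (two bits per cell) is included for completeness although it is not mentioned in the conversation.

```python
# Bits stored per cell for common NAND flash cell types.
CELL_TYPES = {"SLC": 1, "MLC": 2, "TLC": 3, "QLC": 4}


def voltage_levels(bits_per_cell: int) -> int:
    """Number of distinct charge levels needed to encode the given bits."""
    return 2 ** bits_per_cell


for name, bits in CELL_TYPES.items():
    print(f"{name}: {bits} bit(s) per cell, {voltage_levels(bits)} levels")
```

Each added bit doubles the number of levels packed into the same voltage range, which is why the margin between neighbouring values shrinks so quickly.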

Howard:

So QLC has less endurance.

Howard:

Cause every time you erase it, the insulating layers wear down

Howard:

a little and a few more electrons have opportunities to escape.

Howard:

And it's slower to write because you have to adjust the voltage level just right

Howard:

to be one of those 16 voltage levels.

Howard:

And that takes a little bit longer.

Howard:

Now the slower to write, we don't really care about because

Howard:

we acknowledge the writes while it's still in the SCM.

Howard:

So as long as we are flushing that data out of the SCM, in bandwidth terms

Howard:

fast enough, Latency is unimportant.

Howard:

and the endurance we specifically do a lot of things in our

Howard:

software to manage endurance.

Howard:

So we write very large writes so that the SSD doesn't have to garbage collect

Howard:

internally to accommodate small writes.

Howard:

We erase very large erases so that we delete all of the data in an erase block

Howard:

in the flash so that the SSD doesn't have to garbage collect internally.

Howard:

And that means not only can we use QLC, but we can use dirt cheap QLC

Howard:

SSDs that don't have a DRAM buffer in them to protect the QLC from wear.

Howard:

If you have a DRAM buffer, then you can aggregate multiple small

Howard:

writes, but now if power fails, it's DRAM, you lose the data.

Howard:

So you need a power fail protection circuit, and you need big capacitors

Howard:

to power the power fail protection

Howard:

circuit so that you can dump the DRAM into flash,

Howard:

right, and it all starts to add up.

Howard:

So the SSDs we buy, the other customers are hyperscalers.

Howard:

They put them in servers.

Howard:

They only need one port they're writing long tail data.

Howard:

It's not like they're overwriting this stuff all the time.

Howard:

It's just too many people are looking at that drunken frat

Howard:

boy picture on Facebook for it to be on disk, so it's on flash.

curtis:

A.

Howard:

We're leveraging all of that to keep so that we can literally

Howard:

use that lowest cost flash.

Howard:

And do the 10 year support because the 10 year support includes if the

Howard:

SSD wears out, we'll replace it.

Prasanna:

cause normally QLC isn't rated for that long.

Prasanna:

I believe.

Prasanna:

Right.

Prasanna:

SLC is years

Howard:

SLC is the very high endurance flash, but the typical

Howard:

flash that you see for volume use today is TLC triple level cell.

Howard:

So it's three bits instead of four bits.

Howard:

So QLC is 30% cheaper to make because it holds more bits per cell.

Howard:

And QLC has substantially less endurance.

Howard:

So when you start looking at enterprise SSDs on Newegg,

Howard:

the 0.1 drive write per day SSD is slightly better than the ones we use.

Howard:

And the three drive write per day, SSD, you notice has less capacity because

Howard:

it's got the same amount of flash.

Howard:

It's just more over-provisioned so they can wear level across more of it.

Howard:

And the three drive write per day SSD probably has a DRAM cache

Howard:

and all this stuff to protect it.
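To make the drive-writes-per-day (DWPD) ratings concrete, the sketch below converts a rating into total data that can be written over a warranty period, and shows how over-provisioning lowers advertised capacity for the same raw flash. The drive sizes, the five-year period, and the raw flash amount are hypothetical values chosen for illustration.

```python
def lifetime_writes_tb(capacity_tb: float, dwpd: float, years: float) -> float:
    """Total terabytes that may be written at the rated DWPD over the period."""
    return capacity_tb * dwpd * 365 * years


def over_provisioning(raw_tb: float, advertised_tb: float) -> float:
    """Spare flash as a fraction of the advertised capacity."""
    return (raw_tb - advertised_tb) / advertised_tb


# Hypothetical drives built from the same 16 TB of raw flash.
for label, advertised, dwpd in [("0.1 DWPD", 15.36, 0.1), ("3 DWPD", 12.8, 3.0)]:
    total = lifetime_writes_tb(advertised, dwpd, years=5)
    spare = over_provisioning(16.0, advertised)
    print(f"{label}: {advertised} TB advertised, {spare:.0%} spare, "
          f"{total:,.0f} TB written over 5 years")
```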

Prasanna:

Yeah

Howard:

And that's what most enterprise storage systems need because how

Howard:

they put the data in the drive dates back to when it was a disk drive.

Howard:

And you were trying to keep data logically adjacent, not try and manage

Howard:

the write pool inside the drive.

Prasanna:

yeah,

Howard:

The requirements were different.

curtis:

Yeah.

curtis:

Interesting.

curtis:

Yeah.

curtis:

So again, going back to.

curtis:

the fact that you built this from scratch with that toolbox

curtis:

from 2016, and you were like we need to, manage write leveling,

Howard:

And look, our founder Renen Hallak was the chief engineer at XtremIO.

Howard:

And when he got tired of working for Michael Dell, he got to talk to XtremIO

Howard:

customers and find out what they wanted.

Howard:

And nobody said we want faster. XtremIO was already all flash.

Howard:

They were still adjusting to all flash.

Howard:

And it was plenty fast, but everybody wanted to be able to use

Howard:

that all flash for more things.

Howard:

And so our whole system is designed to provide very high, random read

Howard:

performance, across large amounts of flash at an affordable price.

curtis:

Got it.

Howard:

And so our our performance asymmetry is exactly

Howard:

the opposite of Data Domain's.

curtis:

wait, explain what you just said.

Howard:

Our performance asymmetry is exactly the opposite of Data Domain's.

Howard:

They don't publish restore speeds anymore.

Howard:

Haven't for years. We publish read speeds and write speeds, and reads

Howard:

are eight times faster than writes.

Prasanna:

That doesn't mean your writes are slow either.

Prasanna:

Just for

Howard:

No. Our smallest system does five gigabytes per second of writes.

Howard:

Yeah.

Howard:

Your source system probably doesn't keep up with that, but that's the slowest.

Howard:

But what that means is if you scale a system the traditional way, and

Howard:

you say, I need to move this many terabytes over this many hours, so you

Howard:

have to scale it by write performance.

Howard:

Your backups are going to be much faster than your restores.

Howard:

Excuse me.

Howard:

your restores are much

Howard:

faster than your

Howard:

backups.

Prasanna:

Yeah,

Howard:

Yeah

Howard:

we read much faster than we write.

Howard:

And so if you size for backup speed, your restore

Howard:

speed's going to be

curtis:

yeah.

Howard:

nice.
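The sizing argument can be sketched numerically. The example below takes the two figures quoted in the conversation (five gigabytes per second of writes on the smallest system, reads eight times faster than writes) and estimates backup and restore times for a hypothetical 100 TB data set. It assumes the source systems and network can keep up and ignores data reduction, so treat the results as upper bounds on speed rather than predictions.

```python
WRITE_GB_PER_S = 5.0      # from the conversation: smallest system's write rate
READ_TO_WRITE_RATIO = 8   # from the conversation: reads are 8x faster


def hours_to_move(terabytes: float, gb_per_second: float) -> float:
    """Hours needed to move the data at a sustained rate (1 TB = 1000 GB)."""
    seconds = terabytes * 1000 / gb_per_second
    return seconds / 3600


dataset_tb = 100  # hypothetical data set size
backup_hours = hours_to_move(dataset_tb, WRITE_GB_PER_S)
restore_hours = hours_to_move(dataset_tb, WRITE_GB_PER_S * READ_TO_WRITE_RATIO)
print(f"backup:  {backup_hours:.2f} h")
print(f"restore: {restore_hours:.2f} h")
```

A system sized to fit the backup window therefore has restore headroom built in, which is the opposite of the write-optimised behaviour described for deduplicating disk targets.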

curtis:

All right.

curtis:

Consider me impressed, Howard.

curtis:

you know, I,

Prasanna:

do by the way

curtis:

I

Howard:

I I've

curtis:

I, I,

Howard:

time.

Howard:

I've impressed him once.

Howard:

This makes twice.

Howard:

I'm really, I'm happy with that,

curtis:

yeah it sounds like you're, clearly you've been

curtis:

in the business a long time.

curtis:

You've seen those companies that have really interesting technology

curtis:

and nobody's buying anything.

curtis:

You're not that you,

Howard:

but

curtis:

the really interesting technology, but you're also actually selling it,

curtis:

right?

Howard:

I decided it was time to get a job.

Howard:

And I talked to the folks at Vast, who were still in stealth.

Howard:

And I said to myself, look, Howard, you're a storyteller.

Howard:

And this is a really good story.

Howard:

And it doesn't matter whether it succeeds or not.

Howard:

You're going to have a good story to tell.

Howard:

And lo and behold, it's one of those cases where it was a good

Howard:

story and the market requirement fit.

Howard:

And

curtis:

don't have to create the need.

Howard:

we are selling we have, for the past couple of years

Howard:

done comparisons to all the storage companies that have gone public.

Howard:

Yeah.

Howard:

We're growing faster than all of them put together

curtis:

all right Howard thanks a lot for coming on.

curtis:

We might have to have you back.

curtis:

Cause I, I know that I know we've, just begun to scratch the surface and but

curtis:

sounds like you got a good gig over there.

curtis:

I'm glad.

curtis:

Both of us could be

curtis:

employed.

Howard:

Ed

curtis:

well.

Howard:

for the people have known us a long time.

Howard:

It really must be shocking to you and I both the same job multiple years, but

Howard:

I'm still having fun at Vast.

Howard:

And there's lots of interesting stuff still to come.

Howard:

Having taken a fresh eye to the market.

Howard:

We got all sorts of good stuff coming.

curtis:

Cool.

curtis:

All right.

curtis:

I wish you the best.

curtis:

And thanks Prasanna.

curtis:

This is one of those cases where your background was very helpful.

curtis:

I think,

Prasanna:

Oh, I try.

Prasanna:

I try,

Prasanna:

Yeah Yeah.

Prasanna:

Having spent a bunch of time building storage arrays.

Prasanna:

It helps, but

Prasanna:

no, it's still interesting problems though, and, yeah.

Prasanna:

Thank you, Howard, for sharing some of the details and indulging in my questions.

Prasanna:

So.