Google Cloud explains how it accidentally deleted a customer account

rcduke · May 30, 2024

Google says there are, but those warnings are for a "customer-initiated deletion" and didn't work when using the admin tool.

So Google can delete your information whenever they deem necessary, since this statement says that a customer is not informed when the admin tool is used. Yet asking Google to delete your personal information when it's for advertising is impossible.

Will customers now be informed when their data is deleted? I feel like that would be a pretty useful notification for users.

Anony Mouse · May 30, 2024

I don't care how big the basket is. Don't put all your eggs in it.

stormcrash · May 30, 2024

That's an atrocious failing. Even if a mistake led to a default fixed term of service (which itself is asinine) in no way should that have led to a total purge of the account including all backups. At worst it should have kept the backups for some period for restoration of the account, but really it should have suspended the account and put it in holding/escrow in case the customer decided to reactivate/renew service.

This is just bad design all around, which is hilarious considering how G Cloud constantly annoys its users by deprecating services for having "bad design" in favor of a new "good design" which will then be a "bad design" in 18 months.

Chuckstar · May 30, 2024

rcduke said:
So Google can delete your information whenever they deem necessary, since this statement says that a customer is not informed when the admin tool is used. Yet asking Google to delete your personal information when it's for advertising is impossible.

Will customers now be informed when their data is deleted? I feel like that would be a pretty useful notification for users.

Aren't you referencing a completely different type of information stored in a completely different way?

50me12 · May 30, 2024

It's still not clear to me why "automatic deletion" wouldn't involve some form of SOFT or logical delete. To the outside world it is gone, but everything remains in a form that can be reverted (even if that process is laborious).

I don't write anything that doesn't have some form of soft delete that persists for a given amount of time ...

Just common sense / working with customers should teach you that people make mistakes and soft deletes are the way to go. Someone is going to footgun themselves at some point, even me, gotta be safe.
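
A minimal sketch of the soft-delete idea described above, not anything Google has published. The `SoftDeleteStore` class, its method names, and the 30-day `RETENTION` window are all hypothetical: a delete only stamps the record, reads treat stamped records as gone, and a separate purge step is the only thing that removes data for good.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Dict, List, Optional

RETENTION = timedelta(days=30)  # assumed window before a hard purge


@dataclass
class Record:
    name: str
    deleted_at: Optional[datetime] = None  # None means the record is live


class SoftDeleteStore:
    """Hypothetical store where deletion is reversible until purged."""

    def __init__(self) -> None:
        self._records: Dict[str, Record] = {}

    def put(self, name: str) -> None:
        self._records[name] = Record(name)

    def get(self, name: str) -> Optional[Record]:
        # To the outside world a soft-deleted record looks gone.
        rec = self._records.get(name)
        if rec is None or rec.deleted_at is not None:
            return None
        return rec

    def soft_delete(self, name: str, now: datetime) -> None:
        # Only mark the record; the data itself stays in place.
        self._records[name].deleted_at = now

    def restore(self, name: str) -> bool:
        # Succeeds as long as the record has not been purged yet.
        rec = self._records.get(name)
        if rec is None:
            return False
        rec.deleted_at = None
        return True

    def purge_expired(self, now: datetime) -> List[str]:
        # The only irreversible step: drop records past the retention window.
        expired = [
            name
            for name, rec in self._records.items()
            if rec.deleted_at is not None and now - rec.deleted_at >= RETENTION
        ]
        for name in expired:
            del self._records[name]
        return expired
```

Keeping the purge as its own step means an accidental delete can be undone by calling `restore` at any point inside the window.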

leonwid · May 30, 2024

I find it concerning that the storage associated with this VMware cluster was also permanently deleted after that 1 year period. I’d have expected a soft delete with cleanup after a month.

Wickwick · May 30, 2024

It seems there should be safeguards at something higher than the permissions level of an individual user to do certain things. I get that an unfilled entry on a script can cause unstable behaviors. It's just that, why does that script execution environment have the ability to delete an account and all its saved data with no grace period? This seems like a structural problem for Google, and not (only) a coding error.

Nilt · May 30, 2024

It's good that they deprecated the tool but why would any tool have existed that could do that in the first place? Was it an undocumented ~~bug~~ feature or what?

xoid · May 30, 2024

While I feel it shouldn't be the default, I'm not surprised that there is a way to set things up to immediately delete backups on the deletion of an account, as I have dealt with situations where clients have wanted that functionality.

bugabuga · May 30, 2024

System automatically deletes everything because of initial misconfiguration yet it's "not a systemic problem"? That's just... ugh.
The true statement would be "We had a systemic problem but we've fixed it".

Maybe all statements were heavily edited by PR department, accuracy be damned?

terrydactyl · May 30, 2024

"Data backups that were stored in Google Cloud Storage in the same region were not impacted by the deletion, and, along with third-party backup software, were instrumental in aiding the rapid restoration." It's hard to square these two statements, especially with the two-week recovery period.

Talk about a 'It's only a flesh wound' statement.

I do wonder about putting backup deletion in the same process of account deletion. What's the hurry? A simple safeguard would have been to schedule backup deletion on a later date.

stormcrash · May 30, 2024

bugabuga said:
System automatically deletes everything because of initial misconfiguration yet it's "not a systemic problem"? That's just... ugh.
The true statement would be "We had a systemic problem but we've fixed it".

Maybe all statements were heavily edited by PR department, accuracy be damned?

Agreed, they're trying to use systemic to mean widespread, which I guess it's not, but this is absolutely a systemic problem as it's a problem that was baked into the system itself

marsilies · May 30, 2024

Wickwick said:
It seems there should be safeguards at something higher than the permissions level of an individual user to do certain things. I get that an unfilled entry on a script can cause unstable behaviors. It's just that, why does that script execution environment have the ability to delete an account and all its saved data with no grace period? This seems like a structural problem for Google, and not (only) a coding error.

By "user," they mean "customer." This was an internal tool used only by Google employees, for certain custom deploys the customer-facing interface couldn't then handle.

The Google employees have a lot higher privileges and access, especially considering they're creating custom deployments that customers can't do themselves.

Also, neither the Google employee nor the script started an account deletion. Instead, a parameter that was left blank meant that the system did it automatically, since it treated that blank parameter as an instruction to automatically delete the account after a year.

But it's a structural issue, yes, with the system they set up for account creation and deletion. They seem to think they patched away this issue though, at least by stopping ways customers or staff could instigate it.
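
To illustrate the blank-parameter failure described above, here is a sketch under stated assumptions. The function names, the `"open-ended"` keyword, and the 365-day default are hypothetical and do not come from Google's tooling. The lenient parser silently turns a blank into a fixed one-year term, while the strict one refuses to guess.

```python
from typing import Optional

DEFAULT_TERM_DAYS = 365  # hypothetical default, chosen to mirror the one-year term


def parse_term_lenient(raw: str) -> Optional[int]:
    """Risky: a blank value silently becomes a fixed one-year term."""
    raw = raw.strip()
    if raw == "":
        return DEFAULT_TERM_DAYS
    if raw == "open-ended":
        return None  # None means no scheduled end of term
    return int(raw)


def parse_term_strict(raw: str) -> Optional[int]:
    """Safer: a blank value is an error, so the operator must choose."""
    raw = raw.strip()
    if raw == "":
        raise ValueError("term is required: give a number of days or 'open-ended'")
    if raw == "open-ended":
        return None
    days = int(raw)
    if days <= 0:
        raise ValueError("term must be a positive number of days")
    return days
```

With the strict version, a forgotten field fails at provisioning time instead of scheduling a deletion a year later.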

stormcrash · May 30, 2024

Crying Croc said:
You always need a way to totally and permanently nuke something - in case the Feds come calling, etc. Yeah, I know, I'm cynical.

That would be called obstruction of justice and evidence tampering, providing there is a valid warrant for that data

questionable · May 30, 2024

Was there any indication that the backups that were deleted were WORM/immutable storage? If so, that's even scarier that an admin tool like that exists.

Arstotzka · May 30, 2024

The silence during this entire process from Google was deafening. Having a customer posting updates, co-signed by the CEO of Google Cloud but not cross-posted to any official Google domain, made an already bad issue worse.

50me12 · May 30, 2024

Arstotzka said:
The silence during this entire process from Google was deafening. Having a customer posting updates, co-signed by the CEO of Google Cloud but not cross-posted to any official Google domain, made an already bad issue worse.

Publicly posting about one of your individual customer's problems sounds like a bad idea by default.

At least until you get through your post-mortem.

Zukunftsweber · May 30, 2024

Wickwick said:
I get that an unfilled entry on a script can cause unstable behaviors. It's just that, why does that script execution environment have the ability to delete an account and all its saved data with no grace period?

It didn't. It set a fixed term, and the end of term code did the deletion

ninjonxb · May 30, 2024

I get it; even AWS has had an issue where a script an engineer was running was missing a parameter (or had a wrong parameter) and brought things to their knees. Anyone else remember the big S3 outage several years ago? I still remember my mom messaging me "is something going on with the internet?" given how many unrelated websites were suddenly having issues. Things like this can happen.

But unless I am mistaken, there was no data loss at that time? Are there any instances of AWS losing data due to their negligence?

This continues to paint a picture of: why would you possibly use Google Cloud? It just seems like way too much of a risk to your business. First, you are stuck dealing with Google's love of deprecation; second, Google keeps increasing prices; and third, this.

As others have stated, this is very much a fundamental issue. Data should not have been immediately deleted and should instead have been locked away and soft deleted so it could be easily recovered.

Sure, it was an impact of one system having an unintended effect on another, but that other system should have never been able to do this in the first place regardless of what the cause was. This screams a lack of proper concern over critical systems on Google's part.

M3PH · May 30, 2024

reasons why my offsite data backups are with another provider.

Wickwick · May 30, 2024

marsilies said:
By "user," they mean "customer." This was an internal tool used only by Google employees, for certain custom deploys the customer-facing interface couldn't then handle.

The Google employees have a lot higher privileges and access, especially considering they're creating custom deployments that customers can't do themselves.

Also, neither the Google employee nor the script started an account deletion. Instead, a parameter that was left blank meant that the system did it automatically, since it treated that blank parameter as an instruction to automatically delete the account after a year.

But it's a structural issue, yes, with the system they set up for account creation and deletion. They seem to think they patched away this issue though, at least by stopping ways customers or staff could instigate it.

By "user" I meant in the old-school "guy sitting at a desk accessing the company's network."

Wickwick · May 30, 2024

Zukunftsweber said:
It didn't. It set a fixed term, and the end of term code did the deletion

That happened to coincide so it happened automatically. That's not how grace periods work...

jhodge · May 30, 2024

So, basically: oops.

terrydactyl · May 30, 2024

There were joint statements from the Google Cloud CEO and UniSuper CEO on the matter, a lot of apologies, and presumably a lot of worried customers who wondered if their retirement fund had disappeared.

I have to imagine there are a lot of Google Cloud customers with mission critical data wondering if they picked the right horse. This whole thing is not about appeasing UniSuper, but the rest of Google's customers.

I had a couple of coworkers who worked in financial IT. Those IT departments were really conservative. I can envision some exec extolling the advantages of moving to the cloud. This should give a bank CEO pause.

MrMalthus · May 30, 2024

Arstotzka said:
The silence during this entire process from Google was deafening. Having a customer posting updates, co-signed by the CEO of Google Cloud but not cross-posted to any official Google domain, made an already bad issue worse.

Eh, 28 days for a public post mortem seems pretty typical (and obviously better than companies with no public post mortems), especially with two weeks of it working on recovery.

Ildatch · May 30, 2024

xoid said:
While I feel it shouldn't be the default, I'm not surprised that there is a way to set things up to immediately delete backups on the deletion of an account, as I have dealt with situations where clients have wanted that functionality.

I don't think anyone is arguing against the existence of an immediate delete option, just that it should require a human in the loop. Soft deletes can be automated because they're, well, soft and can be restored but if you really want to nuke everything that should require manual intervention. The tool shouldn't even be capable of doing a full delete without confirmation.
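
A rough sketch of that split, assuming a hypothetical `Deleter` class and `Approval` record (neither is a real Google Cloud API): soft deletes run unattended, while a hard delete is rejected unless a second person has approved that specific resource.

```python
from dataclasses import dataclass
from typing import Optional, Set


class ConfirmationRequired(Exception):
    """Raised when a hard delete lacks a valid second-person approval."""


@dataclass(frozen=True)
class Approval:
    resource: str  # the exact resource the approver signed off on
    approver: str  # who approved it


class Deleter:
    """Hypothetical tool: automation may soft-delete, humans gate purges."""

    def __init__(self) -> None:
        self.soft_deleted: Set[str] = set()
        self.purged: Set[str] = set()

    def soft_delete(self, resource: str) -> None:
        # Reversible, so it is safe to automate.
        self.soft_deleted.add(resource)

    def hard_delete(
        self,
        resource: str,
        requested_by: str,
        approval: Optional[Approval] = None,
    ) -> None:
        # Irreversible, so refuse without an approval for this resource.
        if approval is None or approval.resource != resource:
            raise ConfirmationRequired(f"no approval on file for {resource}")
        # The approver must be someone other than the requester.
        if approval.approver == requested_by:
            raise ConfirmationRequired("approval must come from a second person")
        self.soft_deleted.discard(resource)
        self.purged.add(resource)
```

Because the check lives inside `hard_delete`, no script or end-of-term job can reach the purge path without a human having signed off first.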

jbjhillbilly · May 30, 2024

xoid said:
While I feel it shouldn't be the default, I'm not surprised that there is a way to set things up to immediately delete backups on the deletion of an account, as I have dealt with situations where clients have wanted that functionality.

Which would be fine if the client had asked for it, and someone at Google then pressed Y for “Are you sure you want to delete every damn thing?”

Leaving an unfilled entry in a script seems to be the Google equivalent of smashing shit with a hammer in secret. Did Google even know this happened before the client called them?

Also, good on the client for keeping multiple backups in multiple places. I know that’s best practice, but too many times on Ars we read about the horror of not doing the right thing when it comes to data.

schnackenpfefferhausen · May 30, 2024

50me12 said:
It's still not clear to me why "automatic deletion" wouldn't involve some form of SOFT or logical delete. To the outside world it is gone, but everything remains in a form that can be reverted (even if that process is laborious).

I don't write anything that doesn't have some form of soft delete that persists for a given amount of time ...

Just common sense / working with customers should teach you that people make mistakes and soft deletes are the way to go. Someone is going to footgun themselves at some point, even me, gotta be safe.

The counterpoint would be an Apple “ghost photo” scenario, where what was thought deleted appears again.

If you’re dealing with sensitive information (classified, medical, financial, etc) deleted needs to mean deleted.

choco bo · May 30, 2024

Chuckstar said:
Aren't you referencing a completely different type of information stored in a completely different way?

Even if he is, I am pretty certain Google has the ability to delete or manage any information stored in any way, within their organization.

If you think otherwise, you are incredibly naive.

NetMage · May 30, 2024

Wickwick said:
It's just that, why does that script execution environment have the ability to delete an account and all its saved data with no grace period?

Why do you think the grace period didn’t expire as well? Since the deployment was misconfigured as fixed term but the account was actually open ended, it seems like any warnings that your term was over would never be triggered.

NetMage · May 30, 2024

Ildatch said:
The tool shouldn't even be capable of doing a full delete without confirmation.

While I realize Google Cloud doesn’t have the largest customer base, that really doesn’t scale well.

LukeSchlather · May 30, 2024

Describing this as "accidentally deleting a customer account" seems very misleading and doesn't do a good job of explaining what happened. My understanding of this incident is that Google basically deleted some sort of VMWare cluster in a single region.

But Google doesn't even have "accounts" if you're familiar with AWS Accounts. It goes Organization -> Project. And accounts are essentially just IAM users in GCP nomenclature.

I think people familiar with AWS are reading this headline as equivalent to "AWS deleted a customer account" but that's not accurate and doesn't make any sense in GCP terms.

The terrifying thing would be if Google accidentally deleted a project or an organization, which is what I read this headline as. (I'm also pretty confident that's as unlikely as AWS deleting an organization or an account by accident.)

druid318 · May 30, 2024

The lack of controls over account deletion is frightening. Some things should not be automated.

RuralNinja · May 30, 2024

So what Google is saying, is that if you're on a fixed term contract, the moment it expires, your data is purged, instantly and irrevocably? Most service providers offer a grace period, then a period where your data is read only, and then terminate your account.
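
The staged wind-down described above (grace period, then read-only, then termination) can be sketched as a small state function. The state names and the 30-day and 60-day durations are assumptions for illustration, not any provider's actual policy.

```python
from enum import Enum

GRACE_DAYS = 30      # assumed: full service continues, warnings are sent
READ_ONLY_DAYS = 60  # assumed: data is kept but writes are blocked


class State(Enum):
    ACTIVE = "active"
    GRACE = "grace"
    READ_ONLY = "read_only"
    TERMINATED = "terminated"


def state_for(days_since_expiry: int) -> State:
    """Map days elapsed since the contract end date to a lifecycle state."""
    if days_since_expiry < 0:
        return State.ACTIVE  # the term has not ended yet
    if days_since_expiry < GRACE_DAYS:
        return State.GRACE
    if days_since_expiry < GRACE_DAYS + READ_ONLY_DAYS:
        return State.READ_ONLY
    return State.TERMINATED  # only now would data become eligible for deletion
```

Under these assumed numbers, data would not be eligible for deletion until 90 days after expiry, which leaves the customer time to notice and call.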

meisanerd · May 30, 2024

NetMage said:
Why do you think the grace period didn’t expire as well? Since the deployment was misconfigured as fixed term but the account was actually open ended, it seems like any warnings that your term was over would never be triggered.

Probably because UniSuper didn't notice something was wrong. If Google had a grace period, UniSuper's system would have gone down, they would have called Google, and Google would have been "Oops, my bad, let me flip this flag to turn it back on for you". Since Google couldn't just turn the system back on, but had to have things restored from UniSuper's backups, there clearly wasn't a grace period.

Any provider I have dealt with, if there is a billing issue or other reason the service didn't get renewed, it was pretty much immediately restored a few days later once things were cleared up, and no backup restoration was needed.

Wickwick · May 30, 2024

NetMage said:
Why do you think the grace period didn’t expire as well? Since the deployment was misconfigured as fixed term but the account was actually open ended, it seems like any warnings that your term was over would never be triggered.

I assume the customer was writing to their databases all this time. If you're actively updating storage, it's not in a grace period. Unless Google didn't implement that sort of thing properly.

Nilt · May 30, 2024

stormcrash said:
That would be called obstruction of justice and evidence tampering, providing there is a valid warrant for that data

Also a great way to get an adverse inference in a civil context, which is quite a bit more common but still fits "the feds calling".

Kjella · May 30, 2024

Crying Croc said:
You always need a way to totally and permanently nuke something - in case the Feds come calling, etc. Yeah, I know, I'm cynical.

Even if there's nothing nefarious, there are usually situations where you're required by law to hard delete. I work with healthcare records and 99.99% of the time an in-system correction (visible in history) or soft delete (only visible on back end) is good enough. But if you get a court to agree that these entries are both provably false and very damaging you can get them expunged. And that means "nuke it from orbit" gone; being unable to recover the contents is the desired result.

There have also been situations where providers have sent us records we're not legally allowed to have due to poor system configuration. Those too get nuked because if anyone - even our most trusted sysadmins - can extract that information from our systems we're in deep trouble, since we have no legal basis whatsoever for having that data. So yeah, maybe those controls should be guarded like the launch keys for a nuclear submarine but I'm not surprised that somewhere there's "really, really, REALLY delete this" code.
