Mission Impossible: Complete Disaster Recovery for Google Workspace

I'm a frequent user of Google Workspace and even accepted the switch from free to paid for my family domain. One topic has always been on my to-do list: proper backups to support disaster recovery after a major problem.

It turns out that Google Workspace has a significant flaw: It is technically impossible to create a full backup of all data and restore it. Google simply doesn't offer an API for that, so all backup vendors are forced to work with the regular APIs. As a result, not everything in Google Workspace can be stored in a backup, e.g. Google Sites (new, not classic). Some content, such as Google Drawings, can only be backed up as a static file (e.g. PDF) and cannot be restored into a new Google Drawing.
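To make this limitation concrete, here is a minimal sketch of how a backup tool might decide what to do with each Drive file based on its MIME type. The MIME type strings reflect my reading of the Drive API documentation; the function name `backup_plan` and the returned labels are purely hypothetical, and choosing PDF for Drawings is just one of several static formats a tool could pick.

```python
# Export target per Google-native MIME type. These pairs are my reading of
# the Drive API docs, not an authoritative or complete list.
EXPORT_FORMATS = {
    "application/vnd.google-apps.document":
        "application/vnd.openxmlformats-officedocument.wordprocessingml.document",
    "application/vnd.google-apps.spreadsheet":
        "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet",
    "application/vnd.google-apps.presentation":
        "application/vnd.openxmlformats-officedocument.presentationml.presentation",
    # Drawings only export to static formats; PDF is an arbitrary choice here.
    "application/vnd.google-apps.drawing": "application/pdf",
}

GOOGLE_NATIVE_PREFIX = "application/vnd.google-apps."
FOLDER = "application/vnd.google-apps.folder"


def backup_plan(mime_type: str) -> str:
    """Return a hypothetical label describing how a file can be backed up."""
    if mime_type == FOLDER:
        return "folder"  # structure only, nothing to download
    if not mime_type.startswith(GOOGLE_NATIVE_PREFIX):
        return "download"  # regular file, can be copied byte for byte
    if mime_type in EXPORT_FORMATS:
        return "export:" + EXPORT_FORMATS[mime_type]
    return "unsupported"  # e.g. new Google Sites: no export format available
```

Anything that ends up as "export" is a converted copy, not the original object, and anything that ends up as "unsupported" is simply missing from the backup.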

Google itself doesn't offer much on backup and disaster recovery:

The white paper doesn't even mention backup and disaster recovery as a recommended solution; it only lists standard recommendations to increase user account security.

The built-in solutions from Google only allow recovering deleted Drive and Gmail content, and only if fewer than 25 days have passed since deletion. They don't offer point-in-time recovery or help with large-scale file changes, e.g. after a ransomware attack. And there are no centrally managed solutions for other Google Workspace content.
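For readers unfamiliar with the term, point-in-time recovery means going back to the last known good state before an incident. A minimal sketch, assuming a backup tool keeps a list of snapshot timestamps (the function name `pick_snapshot` and the data layout are my own illustration, not any vendor's API):

```python
from datetime import datetime
from typing import Optional, Sequence


def pick_snapshot(snapshots: Sequence[datetime],
                  incident: datetime) -> Optional[datetime]:
    """Return the newest snapshot taken strictly before the incident.

    Returns None if no snapshot predates the incident, in which case
    there is nothing clean to restore from.
    """
    candidates = [s for s in snapshots if s < incident]
    return max(candidates, default=None)


# Usage: three nightly snapshots, ransomware hit on March 3rd at 14:00.
nightly = [datetime(2022, 3, 1), datetime(2022, 3, 2), datetime(2022, 3, 4)]
clean = pick_snapshot(nightly, datetime(2022, 3, 3, 14, 0))
```

A trash bin with a retention window cannot do this: it knows about deleted items, but it has no notion of what a file looked like before it was overwritten.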

Restore vs. Disaster Recovery

When thinking about the problem I like to discuss disaster recovery rather than backup and restore. The reason is the difference in the problem space:

  • Restore is required if some or all user data becomes unusable and needs to be restored into the same infrastructure or platform.
  • Disaster recovery is required if the infrastructure or platform becomes unusable and all user data needs to be recovered either somewhere else or onto new infrastructure.

Any solution to the disaster recovery problem will also be able to cover the restore problem, but a solution to the restore problem won't necessarily also cover the disaster recovery problem.

Data Ownership

An additional aspect unique to Cloud services is the problem of "having" my data. While I might legally own the data that is stored in Cloud services, I am actually at the mercy of the vendor to have access to the data. To take control of my own data I want to have a copy of the data that I legally and technically own and that nobody can take away from me with the push of a button.

As a Google Workspace domain admin I also owe my users that level of data sovereignty. I wouldn't be surprised if there is also a legal or compliance requirement for companies to have such a "fully owned" copy of their data.

Tool Vendors

I did a little vendor survey for backup & disaster recovery solutions and by and large they all told me the same thing: Restore happens to a different file with a different document ID, and restored Google Docs, Sheets and Slides lack all "smart" content like embedded charts, Apps Script code or add-on settings.

The change in the document ID is in my opinion the biggest problem of all: In the Google Workspace ecosystem we only share links to documents. These links are based on the document ID and we assume that they never change. This holds until such a file has to be recovered from the backup. Then it will have a new document ID and therefore a new link. Every place where the old link was used will be broken and require manual updating.
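To illustrate what that manual updating amounts to, here is a minimal sketch that rewrites links in a piece of text, assuming the backup tool can produce a mapping from old to new document IDs. The function name `rewrite_links` and the mapping are hypothetical, and the pattern only covers the common `/d/<ID>` link form on docs.google.com and drive.google.com, not every link variant Google uses.

```python
import re

# Matches the prefix and the ID of links such as
#   https://docs.google.com/document/d/<ID>/edit
#   https://drive.google.com/file/d/<ID>/view
LINK_RE = re.compile(
    r"(https://(?:docs|drive)\.google\.com/[a-z]+/d/)([A-Za-z0-9_-]+)"
)


def rewrite_links(text: str, id_map: dict) -> str:
    """Replace old document IDs with new ones; unknown IDs stay untouched."""
    def swap(match: re.Match) -> str:
        prefix, old_id = match.group(1), match.group(2)
        return prefix + id_map.get(old_id, old_id)

    return LINK_RE.sub(swap, text)


# Usage with made-up IDs:
note = ("Budget: https://docs.google.com/spreadsheets/d/OLD123/edit, "
        "scan: https://drive.google.com/file/d/keep_me-1/view")
fixed = rewrite_links(note, {"OLD123": "NEW456"})
```

Even with such a helper, the hard part remains finding every place a link was pasted: emails, chat messages, bookmarks and documents outside the domain are all out of reach.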

Some vendors, like Afi.ai (from Cloud Architecture Center), offer somewhat better features and can restore document IDs. However, this only works if the file is restored to an existing file, thereby replacing it. If the original file was deleted more than 25 days ago, the document ID is lost and even Afi.ai cannot restore it.

My Recommendation: NAS Appliance

The surprising conclusion of my research is that a NAS appliance from QNAP or Synology would probably be the best solution, at least for small domains like my family's. All the backup vendors work with the same Google APIs and support more or less the same feature set, especially for the Google Workspace core services: Mail, Drive, Contacts and Calendar. With the NAS appliances the software is free, so I would only pay for the storage. An additional benefit is that the data is usable offline and doesn't need to be downloaded from the Cloud first. This approach gives me a lot of confidence that I will keep my data, even if all the Cloud services suffer a major outage.

I'm not able to say which of the two NAS vendors offers the better solution. I hope to be able to try them out and compare them in detail.

I talked about this at the Workspace Arena: Backup-Lösungen für Google Workspace (Backup Solutions for Google Workspace) event on April 6th, 2022. You can watch the video (in German) or flip through the slides.
