For Resilience, Go Glacier

Three core rules of resilience: backup, backup, and backup. I appreciate there is a whole lot more to resilience than backups, especially in supporting the failover of systems, but backups and snapshots must be at the core of a company's data resilience.

I remember backing up to floppy disks, where each disk had a capacity of 1.2 MB, and where you just had to hope that there were no errors on the disks when you recovered from them. But, with the advent of the public cloud, we increasingly back up to cloud-based systems. A great advantage of these is that while an intruder may delete files, the cloud provider will often support the recovery of deleted files (and even of previous versions). You can also keep backups at arm's length from any intruder.
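On S3, this recovery of deleted files depends on versioning being switched on for the bucket. As a minimal sketch (the bucket name is a hypothetical example, and the boto3 call is shown but not executed here), the setting is a single request payload:

```python
# Sketch: enabling versioning on an S3 bucket so that deleted or
# overwritten objects remain recoverable as earlier versions.
# This dict is the shape the S3 API expects for the setting.
versioning_config = {"Status": "Enabled"}

# With the AWS SDK for Python (boto3), this would be applied as:
#   s3 = boto3.client("s3")
#   s3.put_bucket_versioning(
#       Bucket="my-backup-bucket",           # hypothetical bucket name
#       VersioningConfiguration=versioning_config)
print(versioning_config)
```

Once enabled, a "delete" merely adds a delete marker, and the previous versions of the object can still be listed and restored.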

Overall, the bottom line comes down to cost and how often you take the backups. Snapshots can be expensive, but they allow you to quickly revert running systems, whereas backups often provide the lowest-cost storage of key files. AWS S3 buckets are one way to back up data, and S3 Standard costs $0.023 per GB per month; thus, 1 TB of data costs around $23 per month to store.
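As a quick check of that figure (using the published S3 Standard rate quoted above):

```python
# Monthly S3 storage cost for a given volume, at a per-GB-month price.
# Default price is the S3 Standard rate of $0.023 per GB-month.
def monthly_cost_usd(gb: float, price_per_gb: float = 0.023) -> float:
    return gb * price_per_gb

print(monthly_cost_usd(1000))  # 1 TB (1,000 GB) -> about $23 per month
```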

But, for less expensive storage, S3 provides the Glacier storage classes. These include S3 Glacier Instant Retrieval (millisecond retrieval), S3 Glacier Flexible Retrieval (retrieval between one minute and 12 hours) and S3 Glacier Deep Archive (retrieval within 12 hours). The costs drop significantly:

  • S3 Glacier Instant Retrieval. $0.004 per GB ($4 per TB).
  • S3 Glacier Flexible Retrieval. $0.0036 per GB ($3.60 per TB).
  • S3 Glacier Deep Archive. $0.00099 per GB ($0.99 per TB).

And so we see that a Dropbox-type solution for 1 TB costs around $48 per year for storage in Instant Retrieval (obviously, there would be an additional cost for data retrieval). For resilience, a full system backup could be put into Deep Archive, but there would be up to a 12-hour wait for it to be retrieved.
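Putting the per-GB rates above together, the annual cost of keeping 1 TB in each class works out as follows (storage only, ignoring retrieval and request charges):

```python
# Annual storage cost of 1 TB (1,000 GB) in each S3 storage class,
# using the per-GB monthly prices quoted in the article.
PRICES_PER_GB_MONTH = {
    "S3 Standard": 0.023,
    "Glacier Instant Retrieval": 0.004,
    "Glacier Flexible Retrieval": 0.0036,
    "Glacier Deep Archive": 0.00099,
}

def annual_cost_usd(price_per_gb: float, gb: float = 1000) -> float:
    return price_per_gb * gb * 12

for name, price in PRICES_PER_GB_MONTH.items():
    print(f"{name}: ${annual_cost_usd(price):.2f} per year")
```

This gives $276 per year for S3 Standard, $48 for Instant Retrieval, $43.20 for Flexible Retrieval, and $11.88 for Deep Archive.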

To enable data objects in a bucket to move into Glacier storage, we first enable a Lifecycle rule on the bucket.

After this, we define the storage class to be applied, and how long it will take for a data object to move into the Glacier storage.

Finally, our lifecycle configuration is enabled.
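The steps above can be sketched as a single lifecycle configuration. The payload below is the shape the S3 API expects (for example, via boto3's put_bucket_lifecycle_configuration); the rule name, key prefix, bucket name and day counts are illustrative assumptions, not fixed values:

```python
# Sketch: an S3 lifecycle rule that transitions objects to Glacier
# storage classes after a number of days, expressed as the JSON payload
# the S3 lifecycle API expects.
import json

lifecycle_config = {
    "Rules": [
        {
            "ID": "archive-backups",           # hypothetical rule name
            "Status": "Enabled",               # the rule is switched on
            "Filter": {"Prefix": "backups/"},  # apply only to this key prefix
            "Transitions": [
                # After 30 days, move to Glacier Flexible Retrieval.
                {"Days": 30, "StorageClass": "GLACIER"},
                # After 180 days, move to Glacier Deep Archive.
                {"Days": 180, "StorageClass": "DEEP_ARCHIVE"},
            ],
        }
    ]
}

# With boto3 this would be applied as (not run here):
#   s3 = boto3.client("s3")
#   s3.put_bucket_lifecycle_configuration(
#       Bucket="my-backup-bucket",            # hypothetical bucket name
#       LifecycleConfiguration=lifecycle_config)
print(json.dumps(lifecycle_config, indent=2))
```

The same configuration can be set through the S3 console, as described in the steps above; the transition days are per-rule choices, and Glacier Instant Retrieval uses the storage-class value GLACIER_IR.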

Conclusions

So, no more swapping disk drives, or inserting tapes into a machine that you hope will eventually find the files you need to recover. The cloud now provides a fairly secure and available method of data recovery.

Three core rules of resilience: backup, backup, and backup. A CEO should ask their IT team about the time to recover the data, network and compute infrastructure for a range of disaster situations. In the worst case, we have a complete disaster, and for that, a Deep Archive solution is one alternative. It is relatively inexpensive, but it could save your company.

We will be running AWS Academy courses in Data Engineering, Cloud Security and Cloud Architecture. Get in contact if you are interested.