Updated 8 hours ago

Your data will die. The question is whether you'll get it back.

Hardware fails. Humans make mistakes. Ransomware encrypts. Buildings flood. The threats are endless, but they cluster into three categories: corruption (the data goes bad), medium failure (the storage dies), and location disaster (everything in one place is destroyed at once).

Understanding these three death modes is understanding backup strategy.

The 3-2-1 Rule

The 3-2-1 rule isn't arbitrary. Each number defends against a specific way your data dies:

3 copies defend against corruption. If one copy goes bad—silent bit rot, accidental overwrite, malware encryption—you have others to compare and restore from. Two copies might both be corrupted before you notice. Three gives you a tiebreaker.

2 different storage types defend against medium-specific failure. Hard drives fail in hard drive ways. Tape fails in tape ways. Cloud storage fails in cloud ways. If all your copies live on the same type of medium, a single failure mode can take them all. Mix local disk with cloud, or disk with tape—technologies that don't share failure modes.

1 copy offsite defends against location disaster. Fire doesn't care that you had three copies on two different storage types if they were all in the same building. Geographic separation means the disaster that destroys your primary location leaves your offsite backup untouched.

A working example: the production database on your servers (the original), a nightly backup to a local NAS (second copy, same location, different hardware), and a weekly backup to cloud storage (third copy, different medium, different location). Each component of 3-2-1 is covered.
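The rule is simple enough to check mechanically. The sketch below is one possible way to do that, not a standard tool: the `Copy` class, the `check_321` function, and the medium and location labels are all names invented for this illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Copy:
    """One copy of the data. All field values are free-form labels (hypothetical)."""
    name: str
    medium: str    # e.g. "server-disk", "nas", "cloud-object"
    location: str  # e.g. "hq", "cloud-region"


def check_321(copies, primary_location):
    """Report which parts of the 3-2-1 rule a set of copies satisfies."""
    return {
        "three_copies": len(copies) >= 3,
        "two_media": len({c.medium for c in copies}) >= 2,
        "one_offsite": any(c.location != primary_location for c in copies),
    }


# The working example from the text: production, local NAS, cloud.
copies = [
    Copy("production database", "server-disk", "hq"),
    Copy("nightly NAS backup", "nas", "hq"),
    Copy("weekly cloud backup", "cloud-object", "cloud-region"),
]
result = check_321(copies, primary_location="hq")
print(result)
```

Dropping the cloud copy from the list makes both `three_copies` and `one_offsite` come back false, which is the kind of gap this check is meant to surface.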

Full, Incremental, and Differential

Three approaches to what gets copied:

Full backups copy everything. Simple to restore—everything you need is in one place—but slow to create and expensive to store. If your data is 10TB, every full backup is 10TB.

Incremental backups copy only what changed since the last backup of any type. Monday's incremental has Monday's changes. Tuesday's has Tuesday's. Fast and storage-efficient, but restoration is complex: you need the last full backup plus every incremental since then, applied in order.

Differential backups copy everything that changed since the last full backup. Monday's differential has Monday's changes. Tuesday's differential has Monday's AND Tuesday's changes. Larger than incremental but simpler to restore: you need only the last full plus the last differential.
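The difference in restore effort can be made concrete. The following is a minimal sketch, assuming backups are described as `(label, kind)` pairs in chronological order; `restore_chain` and the labels are hypothetical names for this example, not part of any backup product.

```python
def restore_chain(backups):
    """Return the backup labels needed to restore to the newest point, in apply order.

    backups: list of (label, kind) tuples, oldest first, where kind is
    "full", "incremental", or "differential".
    """
    full_positions = [i for i, (_, kind) in enumerate(backups) if kind == "full"]
    if not full_positions:
        raise ValueError("no full backup to restore from")

    last_full = full_positions[-1]
    chain = [backups[last_full][0]]
    later = backups[last_full + 1:]

    # A differential already contains every change since the last full,
    # so only the newest one is needed and anything before it can be skipped.
    diff_positions = [i for i, (_, kind) in enumerate(later) if kind == "differential"]
    if diff_positions:
        newest_diff = diff_positions[-1]
        chain.append(later[newest_diff][0])
        later = later[newest_diff + 1:]

    # Incrementals only hold changes since the previous backup,
    # so every remaining one must be applied, in order.
    chain.extend(label for label, kind in later if kind == "incremental")
    return chain


incremental_week = [("sun-full", "full"), ("mon-inc", "incremental"),
                    ("tue-inc", "incremental"), ("wed-inc", "incremental")]
differential_week = [("sun-full", "full"), ("mon-diff", "differential"),
                     ("tue-diff", "differential"), ("wed-diff", "differential")]

print(restore_chain(incremental_week))
print(restore_chain(differential_week))
```

By Wednesday the incremental schedule needs four pieces to restore and the differential schedule needs two, which is the trade-off described above.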

The common pattern: weekly full backups with daily incrementals (or differentials). You get storage efficiency during the week and a clean restore point each weekend.

How Often to Back Up

The question isn't "how often should I back up?" It's "how much data can I afford to lose?"

This is your Recovery Point Objective (RPO). If your database loses the last hour of transactions and that's catastrophic, your RPO is under an hour—you need continuous backup or backups every few minutes. If losing a day of work is annoying but survivable, daily backups suffice.

Different data has different RPOs:

  • Transaction databases processing orders or payments: minutes or continuous
  • Application servers where work can be recreated: daily
  • User files: daily or weekly depending on how irreplaceable the work is
  • Archives of data that rarely changes: weekly or monthly

Automate this. Backups that depend on someone remembering to run them don't happen.
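An automated schedule can also be checked automatically against the RPO. This is a rough sketch under simple assumptions: you can obtain the timestamp of the last successful backup from somewhere, and `backup_is_stale` is a hypothetical helper, not a function from any backup tool.

```python
from datetime import datetime, timedelta


def backup_is_stale(last_success, now, rpo):
    """True if a failure right now would lose more data than the RPO allows."""
    return (now - last_success) > rpo


# Illustrative timestamps: the nightly job last succeeded ten hours ago.
now = datetime(2024, 6, 30, 12, 0)
last_nightly = datetime(2024, 6, 30, 2, 0)

ok_for_daily_rpo = not backup_is_stale(last_nightly, now, rpo=timedelta(days=1))
ok_for_hourly_rpo = not backup_is_stale(last_nightly, now, rpo=timedelta(hours=1))
print(ok_for_daily_rpo, ok_for_hourly_rpo)
```

The same ten-hour-old backup is acceptable for data with a one-day RPO and a violation for data with a one-hour RPO, so the check has to run per data class rather than once for the whole system.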

How Long to Keep Backups

Retention policy answers: "How far back might I need to recover?"

Short-term retention (7-30 days of daily backups) handles the common case—you deleted something yesterday, a file got corrupted last week, you need to undo a recent mistake.

Medium-term retention (3-6 months of weekly backups) handles problems discovered later. The data corruption happened two months ago but you just noticed. The deleted file was actually important but no one realized until the project resumed.

Long-term retention (years of monthly backups) handles compliance requirements and historical reconstruction. Auditors want records from three years ago. Legal discovery requires emails from 2019.

The classic Grandfather-Father-Son rotation balances these: keep daily backups for a week, weekly backups for a month, monthly backups for a year. Multiple recovery points, manageable storage costs.
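A retention rule like this reduces to a small decision function. The sketch below is one simplified interpretation, assuming a backup is taken every day, that Sunday backups serve as the weeklies, and that first-of-month backups serve as the monthlies. The function name `gfs_keep` and the exact cutoffs (7, 31, and 366 days) are choices made for this illustration.

```python
from datetime import date, timedelta


def gfs_keep(backup_day, today):
    """Decide whether a daily backup survives Grandfather-Father-Son pruning."""
    age = (today - backup_day).days
    if age < 0:
        return False  # dated in the future; treat as invalid
    if age < 7:
        return True   # son: every daily backup from the last week
    if backup_day.weekday() == 6 and age < 31:
        return True   # father: Sunday backups from roughly the last month
    if backup_day.day == 1 and age < 366:
        return True   # grandfather: first-of-month backups from the last year
    return False


today = date(2024, 6, 30)
first_backup = date(2023, 1, 1)
all_days = [first_backup + timedelta(days=n)
            for n in range((today - first_backup).days + 1)]
kept = [d for d in all_days if gfs_keep(d, today)]
print(len(all_days), "backups taken,", len(kept), "kept")
```

Real tools differ in how they pick the weekly and monthly survivors (for example, the newest backup in each period instead of a fixed weekday), so treat the cutoffs here as placeholders.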

Testing: The Only Rule That Matters

Untested backups aren't backups. They're hopes.

You think your data is protected. You've followed the 3-2-1 rule, you've got incrementals running every night, you've got cloud copies syncing to another continent. But until you've actually restored from those backups—until you've proven they contain what you expect and the restoration process works—you have assumptions, not assurance.

Schedule restoration tests. Quarterly at minimum, monthly if you can. Verify:

  • The backup contains the data you expect
  • The restoration procedure actually works
  • Recovery completes within acceptable time (your Recovery Time Objective)
  • Someone other than the person who set it up can perform the restore
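The first check in that list, whether the backup contains the data you expect, can be partly automated by comparing checksums. This is a minimal sketch, assuming a manifest of SHA-256 hashes was recorded from the source at backup time; `manifest` and `verify_restore` are hypothetical helper names.

```python
import hashlib
from pathlib import Path


def sha256_of(path):
    """Hash a file in 1 MiB chunks so large files do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def manifest(root):
    """Map each file's path (relative to root) to its SHA-256 hash."""
    root = Path(root)
    return {
        path.relative_to(root).as_posix(): sha256_of(path)
        for path in root.rglob("*")
        if path.is_file()
    }


def verify_restore(expected_manifest, restored_root):
    """Compare a restored tree against the manifest taken at backup time.

    Returns (missing, corrupted): files absent from the restore, and files
    present but with different content.
    """
    actual = manifest(restored_root)
    missing = sorted(set(expected_manifest) - set(actual))
    corrupted = sorted(
        name for name, digest in expected_manifest.items()
        if name in actual and actual[name] != digest
    )
    return missing, corrupted
```

A restore test would call `manifest()` on the source when the backup is taken, restore into a scratch directory, and then expect `verify_restore()` to return two empty lists. It proves the files came back intact; it does not prove the application can start from them, which still needs a real drill.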

Document the procedures. The person who configured the backups might not be available during the disaster. Written runbooks mean anyone can restore.

Run disaster recovery drills. Not just "can we restore this file" but "can we recover operations if this building burns down." Test the offsite backups. Test the failover. Test under pressure, because that's how real disasters feel.

Security: Attackers Know the 3-2-1 Rule Too

Backups are targets.

Ransomware operators understand that victims with good backups don't pay ransoms. So modern ransomware specifically hunts for backups and encrypts or deletes them before revealing itself. Your backup server is a priority target, not an afterthought.

Encrypt backups at rest and in transit. If attackers steal your backup files, encryption keeps the data unreadable, provided the keys are stored separately from the backups themselves.

Control access tightly. The credentials that can delete backups should not be the same credentials used for daily operations. Compromising a workstation shouldn't give access to destroy backups.

Make backups immutable. Many backup systems now support write-once storage: backups can be created but not modified or deleted for a specified retention period. When the lock is enforced by the storage layer itself, even attackers with administrative access to the backup software can't destroy those backups before the retention period expires.

Air-gap your offsite copies. Truly critical backups should be physically disconnected from networks that could be compromised. Tape in a vault. Offline drives. Cloud storage with credentials that exist nowhere in your production environment.

The threat model has evolved. Plan accordingly.

Cloud Backup Realities

Cloud storage makes offsite backup easy. A few considerations:

Storage is cheap until it isn't. Pennies per gigabyte per month sounds trivial until you're storing 50TB with 7-year retention. Model the costs over time.

Uploads are usually free; downloads cost money. Getting data into cloud storage typically costs nothing, but getting it out during a restore is typically billed per gigabyte. A large restore can be expensive. Budget for it.
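Both cost points are easy to model before you commit. The sketch below is illustrative only: the per-gigabyte prices are made-up placeholders, not any provider's actual rates, and the function names are invented for this example. It also assumes the stored volume stays constant, which understates the cost if your data grows.

```python
def storage_cost(size_tb, months, price_per_gb_month):
    """Total cost of holding a constant volume of data for a number of months."""
    return size_tb * 1000 * price_per_gb_month * months


def restore_cost(size_tb, egress_price_per_gb):
    """Download (egress) cost of pulling the full volume back out once."""
    return size_tb * 1000 * egress_price_per_gb


# Hypothetical prices; substitute your provider's published rates.
PRICE_PER_GB_MONTH = 0.02
EGRESS_PRICE_PER_GB = 0.09

seven_year_storage = storage_cost(50, months=7 * 12, price_per_gb_month=PRICE_PER_GB_MONTH)
full_restore = restore_cost(50, egress_price_per_gb=EGRESS_PRICE_PER_GB)
print(f"storage over 7 years: ${seven_year_storage:,.0f}")
print(f"one full restore:     ${full_restore:,.0f}")
```

With these placeholder prices, 50TB held for seven years comes to about $84,000 in storage, and a single full restore adds about $4,500 in download charges.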

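Initial upload takes time. Your first full backup of 10TB over a 100Mbps connection takes a little over 9 days at full line rate, and 10 days or more once protocol overhead and competing traffic are counted. Some cloud providers offer physical transfer: you ship them drives.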
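The upload estimate is plain arithmetic and worth redoing with your own numbers. A minimal sketch, assuming decimal units (1TB = 10^12 bytes, 1Mbps = 10^6 bits per second); the `efficiency` parameter is a hypothetical knob for the fraction of the link the backup actually gets.

```python
def transfer_days(size_tb, link_mbps, efficiency=1.0):
    """Days needed to move size_tb terabytes over a link of link_mbps megabits/s."""
    bits = size_tb * 1e12 * 8
    seconds = bits / (link_mbps * 1e6 * efficiency)
    return seconds / 86_400


ideal = transfer_days(10, 100)                       # full line rate
realistic = transfer_days(10, 100, efficiency=0.9)   # assumed 90% usable throughput
print(f"{ideal:.1f} days ideal, {realistic:.1f} days at 90% efficiency")
```

At full line rate the result is about 9.3 days, and at an assumed 90% efficiency about 10.3 days. The same function also tells you how long a full restore would take over the same link, which feeds directly into your recovery time.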

Proprietary formats create lock-in. If your backups are in a vendor-specific format, switching providers means backing up everything again. Open formats keep you flexible.

Backup vs. Archive

These serve different purposes:

Backups are for recovery—getting operations running again after failure. Relatively short retention, optimized for restoration speed.

Archives are for preservation—maintaining records for compliance, legal, or historical purposes. Long retention, optimized for storage cost, rarely accessed.

Don't conflate them. Your backup system optimized for fast recovery is expensive for 10-year retention. Your archive system optimized for cheap long-term storage is slow for operational recovery. Use appropriate tools for each.

Define Your Objectives Before Disaster

Two numbers matter:

Recovery Point Objective (RPO): How much data can you lose? If the answer is "one hour," you need backups at least hourly. If the answer is "one day," daily backups suffice.

Recovery Time Objective (RTO): How long can you be down? If the answer is "four hours," your backups need to be accessible and your restoration process needs to complete within that window. If the answer is "one week," you have more flexibility.

Different systems have different tolerances. Your payment processing database might have RPO of 5 minutes and RTO of 1 hour. Your marketing website might have RPO of 24 hours and RTO of 48 hours. Size your backup strategy to each system's actual requirements.
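Once the objectives are written down, they can be compared against what the backup setup actually delivers. The sketch below uses the RPO and RTO figures from the paragraph above; the backup intervals and measured restore times are invented for illustration, and `SystemPlan` and `gaps` are hypothetical names.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass
class SystemPlan:
    name: str
    rpo: timedelta               # how much data loss is acceptable
    rto: timedelta               # how much downtime is acceptable
    backup_interval: timedelta   # how often backups actually run
    measured_restore: timedelta  # how long the last restore test took


def gaps(plan):
    """List the ways a system's backup setup misses its stated objectives."""
    problems = []
    if plan.backup_interval > plan.rpo:
        problems.append("backup interval exceeds RPO")
    if plan.measured_restore > plan.rto:
        problems.append("measured restore time exceeds RTO")
    return problems


plans = [
    SystemPlan("payment database", rpo=timedelta(minutes=5), rto=timedelta(hours=1),
               backup_interval=timedelta(minutes=15),
               measured_restore=timedelta(minutes=45)),
    SystemPlan("marketing website", rpo=timedelta(hours=24), rto=timedelta(hours=48),
               backup_interval=timedelta(hours=24),
               measured_restore=timedelta(hours=3)),
]
report = {plan.name: gaps(plan) for plan in plans}
print(report)
```

In this made-up data the payment database fails its RPO because backups run every 15 minutes against a 5-minute objective, while the marketing website meets both numbers. The `measured_restore` value only means something if it comes from a real restore test.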

Define these numbers now, while you're calm and thinking clearly. During a disaster is the wrong time to discover you have no idea how much data loss is acceptable.

The Mistakes That Kill You

No testing. The backups ran every night for three years. The restoration failed in the first five minutes. You never knew because you never tested.

No offsite copy. Three copies on three different drives in the same server room. The sprinkler system destroyed all of them.

Same failure domain. Backups stored on the same physical machine as the original data. The disk controller failed and took both.

No monitoring. Backups silently failed six months ago. You discovered this during the incident.

Unencrypted backups. Stolen backup drives exposed every customer record you had.

Single point of knowledge. Only one person knew how the backup system worked. They left the company. Then the disaster happened.

Every one of these has happened. Multiple times. To organizations that thought they were protected.
