Introduction

We all store more and more of our lives in digital form: spreadsheets, résumés, wedding speeches, novels, tax information, schedules, and of course digital photographs and video. All of this data is easy to store, transmit, copy, and share, but how easy is it to get back?

All of this data can be a harsh reminder that computers are not without fault. For years, storage costs have been dropping while the amount of storage in any one computer has been increasing almost exponentially. We are at a point where a single hard drive can contain multiple terabytes of information and, with a single mishap, lose it all forever. Everyone knows someone who has had a computer stop working and wanted their information back.

It’s always been possible to safeguard your data, but now it’s not only necessary thanks to the explosion of personal data, it’s also more affordable than ever. When you think of the costs of backing up your data, just remember what it would cost you if you were to ever lose it all. This guide will walk you through saving your data in multiple ways, with the end goal being to have a backup system that is simple, effective, and affordable. In this day and age, you really can have it all.

It’s prudent at this point to define what a backup is, because there are a lot of misconceptions out there which can cause much consternation when the unthinkable happens, and people who thought they were protected find out they were not.

Backups are simply duplicates of data which are archived, and which can be restored to a previous point in time. The key is the data must be duplicated, and you have to be able to go back to an earlier time. Anything that doesn’t meet both of those requirements is not a backup.

As an example, many people trust their data to network storage devices with RAID (Redundant Array of Independent Disks). Without going into the intricacies of the various forms of RAID, none of these Network Attached Storage (NAS) devices is any sort of backup on its own. RAID is designed to protect a system from a hard disk failure and nothing more. Depending on the RAID level, it either mirrors data across disks or calculates parity information, which can be used to reconstruct the data that was on a failed disk. While RAID is an excellent mechanism to keep a system operational in the event of a disk failure, it is not a backup: if a file is changed or deleted, the change is applied immediately across the array, and there is no way to roll it back. RAID is excellent for use as a file share, and can even be used effectively as the target for backups, but important data kept on the array still needs a file backup system of its own.
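
To make the parity idea concrete, here is a minimal sketch in Python of XOR parity, the kind of calculation used by single-parity RAID levels. It is an illustration only, not how any particular NAS implements it: the three "disks" are just short byte strings, and the helper names `xor_blocks` and `rebuild_missing` are made up for this example.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR several equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

def rebuild_missing(surviving_blocks, parity_block):
    """Recover the one missing block from the survivors plus the parity."""
    return xor_blocks(list(surviving_blocks) + [parity_block])

# Three hypothetical data disks, each holding one 4-byte block of a stripe.
data_disks = [b"ABCD", b"EFGH", b"IJKL"]
parity = xor_blocks(data_disks)

# Disk 1 fails: its contents can be recalculated from the other disks.
rebuilt = rebuild_missing([data_disks[0], data_disks[2]], parity)

# Now a file on disk 1 is overwritten. Parity is updated to match,
# so a rebuild returns the NEW contents; the old data is gone.
changed_disks = [b"ABCD", b"XXXX", b"IJKL"]
changed_parity = xor_blocks(changed_disks)
rebuilt_after_change = rebuild_missing(
    [changed_disks[0], changed_disks[2]], changed_parity
)
```

The second half of the sketch is the important part for this guide: parity faithfully follows every change, so it can rebuild a failed disk but cannot bring back an earlier version of a file.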

Another similar example is cloud storage. Properly configured, cloud storage can be a backup target, and some services can even perform proper backups, but the average person with the average Google Drive or OneDrive account can’t simply copy their files there and assume they are protected. As with RAID, it is more robust file storage than any single hard drive, but if you delete a file, or overwrite one with another, it can be difficult or impossible to go back to a previous version.

Both RAID and cloud storage suffer from the same problem: you can’t go back to an earlier time, and therefore neither is a true backup. True backups will allow you to recover from practically any scenario: fire, flood, theft, equipment failure, or the inevitable user error. This guide will walk you through several methods of performing backups, starting with simple ones and moving up to elaborate systems that will truly protect your data. These methods work for home and business alike; only the type of equipment is likely to differ.

There is some common terminology used in backups that should be defined before we start discussing the intricacies of backups:

  • Archive Flag: A bit stored with each file which indicates whether or not the file has been modified since the last time the flag was cleared.
  • Full Backup: A backup of all files which resets the archive flag.
  • Differential Backup: A backup of all files with the archive flag set, but it does not clear the archive flag.
  • Incremental Backup: A backup of all files with the archive flag set which resets the archive flag.
  • Image or System Based Backup: A complete disk level backup which would allow you to image a machine back to a previous state.
  • Deduplication: A software algorithm which removes duplicate parts of files to reduce the amount of storage required.
  • Source Deduplication: Removing duplicate file information on the client end. This requires more CPU and memory usage on the client, but means much less data has to be transferred to the backup target.
  • Target Deduplication: Removing duplicate file information on the target end. This saves client CPU and memory usage, and is used to reduce the amount of storage space required on the backup target.
  • Block Level: A backup or system process which accesses a sequence of bytes of data directly on the disk.
  • File Level: A backup or system process which accesses files by querying the Operating System for the entire file.
  • Versioning: A list of previous versions of a file or folder.
  • Recovery Point Objective (RPO): The amount of recent data, measured as time since the last backup, deemed acceptable to lose in a disaster scenario. For example, if you perform backups nightly, your RPO is 24 hours, because the most recent point you can restore to is the previous night’s backup. Anything created in between backups is assumed to be recoverable through other methods, or an acceptable loss.
  • Recovery Time Objective (RTO): The amount of time deemed acceptable between the loss of data and the recovery of data. For home use, there’s really no RTO but many commercial companies will have this defined either with in-house IT or with a Service Level Agreement (SLA) to a support company.
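
The difference between full, differential, and incremental backups comes down to what each one does with the archive flag, and a small simulation makes that easier to see. The sketch below is a toy model in Python, assuming an in-memory list of files rather than a real file system; the `File` class, the function names, and the file names are all hypothetical, and nothing here reads or changes real file attributes.

```python
from dataclasses import dataclass

@dataclass
class File:
    name: str
    archive: bool = True  # a new or modified file has its archive flag set

def full_backup(files):
    """Copy every file, then clear the archive flag on all of them."""
    copied = [f.name for f in files]
    for f in files:
        f.archive = False
    return copied

def differential_backup(files):
    """Copy flagged files but leave the flag alone."""
    return [f.name for f in files if f.archive]

def incremental_backup(files):
    """Copy flagged files, then clear the flag on them."""
    copied = [f.name for f in files if f.archive]
    for f in files:
        f.archive = False
    return copied

def modify(f):
    """Simulate the operating system setting the flag when a file changes."""
    f.archive = True

files = [File("taxes.xls"), File("speech.doc"), File("photo.jpg")]

full = full_backup(files)             # everything is copied, flags cleared
modify(files[0])
diff_1 = differential_backup(files)   # only taxes.xls
modify(files[1])
diff_2 = differential_backup(files)   # taxes.xls again, plus speech.doc
inc_1 = incremental_backup(files)     # same two files, but flags now cleared
modify(files[2])
inc_2 = incremental_backup(files)     # only photo.jpg
```

In this model each differential grows until the next full backup, since it always contains everything changed since then, while each incremental contains only what changed since the previous backup of any kind. The trade-off is at restore time: a differential scheme needs the last full backup plus the latest differential, whereas an incremental scheme needs the last full backup plus every incremental taken since.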
 

133 Comments


  • wumpus - Thursday, May 22, 2014

    The whole point of RAID *is* to protect you from things like bit rot. The difference between RAID5 and RAID6 is that RAID6 protects you from two rotted bits in a single sector (more specifically, two different drives with failures in the same location). You should be able to avoid this with RAID5 by periodically reading the entire drive and correcting any single error you find (called "scrubbing").
  • Mr Perfect - Thursday, May 22, 2014

    It's not really a sure thing with the RAID though. The array has no idea which version is correct, and which one is rotten. The best it can do is take a consensus and go with whatever version of the file the most drives agree is correct. They did an article about bit rot over at Ars Technica, and the author's RAID 5 happily used the rotten version.

    http://arstechnica.com/information-technology/2014...
  • bsd228 - Thursday, May 22, 2014

    Not really, wumpus. The whole point of RAID (minus 0) is to protect you from a disk failure. By itself it does not deal with bit rot at all. On a mirror, who is right? In typical implementations, disk 0 is presumed to have the correct copy. ZFS (and I believe MS's knockoff, ReFS) implemented scrubbing with checksumming to give a means of identifying the correct copy.
  • beginner99 - Thursday, May 22, 2014

    I use Microsoft's free tool SyncToy. With it you can synchronize folders to anywhere else, like an external hdd. And of course only updates are synched and you can specify in which direction to sync. I use it to backup my media collection. The external hard drive can then be stored off-site (at work). The advantage I see with this is that the media files are copied over and are readable on the backup directly. You can take the external hdd on the road and have your full media collection at hand. With image files you will have to first restore them before being able to use them.

    Important documents should be stored in the "cloud". This can be a simple encrypted zip sent by email and it will be stored on the email server (say gmail) or whatever. That was possible like over a decade ago already.
  • gsvelto - Thursday, May 22, 2014

    I do most of my backups from Linux: I use rsync to sync my home directory and other relevant files outside of /home and ntfsclone to backup my Windows drives. The latter option is definitely slower than incremental backups or somesuch but allows me to restore a Windows installation very quickly w/o need for reinstalling. It's also handy when moving Windows from a hard drive to another.
  • AlexIsAlex - Thursday, May 22, 2014

    Another aspect to backups is bit rot. Both on the backup media (are the files in the backup still good?) and on the live media (do I need to restore this file from backup, as it has become corrupted?)

    For a decent backup system, I want checksums stored with the backed up data, and verified regularly. I also want the backup to actually read all files to be backed up from the source, even if they are not supposed to be modified since the last backup, and check that they still have the same checksum. Unfortunately, this takes rather a long time, but I don't see any alternative to discovering months down the line that some rarely accessed files have become corrupted, and worse, been backed up in a corrupted state.
  • boomie - Thursday, May 22, 2014

    >Windows 8 fixes that issue, but creates new ones by no longer allowing automated image backups
    Well, I didn't think supposed IT pros at anandtech would be so casual as to be afraid of command line.
    If you cannot live in this world without regular image backups, who prevents you from adding a task in task scheduler with wbadmin call?
    Come on now.
  • ruthan - Thursday, May 22, 2014

    There are extended tutorials for setting up native Windows backup, but for Windows Server Essentials there is only a very compressed description. Could you explain it more? For example, "Once the connector software is installed" is a big shortcut: after installation, is the backup set up from the server or from the local machine?
    How is backup support for Linux and Macs? That is really different; backing up Windows isn't a big problem now. In my experience the best solutions are from Acronis and Paragon, but they have lots of limitations and known issues.
  • davidpappleby - Thursday, May 22, 2014

    We have two laptops, and two desktops. Each has a boot drive and a separate physical backup drive for images using Acronis. All pictures/music/data reside on the server which has separate backup drives for its OS and data (again with Acronis). I'll be looking into S3 again as a result of this article (last time I looked I thought 2TB was too much). My wife has an external drive we use as off site backup of her important data (downside is that that is current only).
  • Mikuni - Thursday, May 22, 2014

    Mega gives 10GB for free, encrypted storage, why wasn't it mentioned?
