[AccessD] WHS and backups

jwcolby jwcolby at colbyconsulting.com
Thu Dec 27 00:59:20 CST 2007


Precisely.

As part of my research into doing this I examined backing up files out to
the internet.  I have 5 megabit down 500K up.  Backing up 70 gigs of data
turns into a multi-week affair, and that is if the ISP doesn't suddenly slap
throttling on me.
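
A rough check of that estimate, as a sketch only. It assumes "500K up" means 500 kilobits per second and that gigabytes are decimal; the `efficiency` parameter is a hypothetical fudge factor for protocol overhead, not a measured figure.

```python
def upload_days(size_gb: float, uplink_kbit_per_s: float, efficiency: float = 1.0) -> float:
    """Days needed to push size_gb gigabytes through an uplink of the given speed."""
    bits = size_gb * 1e9 * 8                                  # decimal gigabytes to bits
    seconds = bits / (uplink_kbit_per_s * 1e3 * efficiency)   # usable bits per second
    return seconds / 86_400                                   # seconds in a day


ideal = upload_days(70, 500)                       # link saturated around the clock
realistic = upload_days(70, 500, efficiency=0.8)   # assumed 80% usable throughput
print(f"{ideal:.1f} days at full line rate, {realistic:.1f} days at 80%")
```

At full line rate that works out to roughly 13 days of continuous uploading, and anything short of a saturated link pushes it past two weeks.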

Backups in the modern age are just problematic.  WHS is a truly radical
technology but they cut some corners that should have been left square.
There are a lot of people banging on them about this very issue so maybe
they will acknowledge the problem and fix it.  Unfortunately given the
radical stuff they are doing (the cool part) they may have painted
themselves into a corner not easily squared back up.

All I know is that I like what they are trying to do, and if a RAID
foundation is the cost of doing business then that is what I will do.

From various readings:



Unlike most backup products that operate at the file level, the Windows Home
Server computer backup solution works on "clusters". Clusters are the lower
level constructs of the file system. They are usually 4k bytes in size on
most NTFS disks. The "magic" you are seeing is a result of the fact that
Windows Home Server makes sure that any particular cluster is stored only
once on the server...even if that cluster is found on multiple disks and
within multiple files. This is known as "single instance storage" in geeky
circles.

Here's some more detail on how this works:

    * The server side of the solution is a database (not some off the shelf
database, but one developed specifically for this application). The
"records" in the database are clusters and hashes of those clusters (a hash
is a number that uniquely identifies a cluster based on its contents). The
database also contains information on the structure of a volume (NTFS file
system information).  If a cluster on the C: drive of Mom's computer has the
first 4096 characters of "War and Peace" in it, and a cluster on the E:
drive of Joey's computer has those same 4096 characters in it then their
hashes will be the same.
    * When a computer is backed up to the server, the code on the client
computer figures out what clusters have changed since the last backup.
    * The software then calculates a hash for each of these clusters and sends
just the hashes to the server.
    * The server looks in its database of clusters to see if any have hashes
that match those it just received. If a hash matches then that cluster is
already stored on the server.
    * If they are NOT stored on the server already, then the computer sends
them to the server and the server adds them to the database.
    * All file system information is transferred and stored on the server
such that a volume (from any machine) at any backup point (time) can be
reconstituted from the database.

And this is how 220GB of data spread out across 4 computers can be stored in
98GB of space on your home server.
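
A toy sketch of the single instance storage idea described above, not WHS's actual code. It is built on several assumptions: SHA-256 stands in for whatever hash WHS really uses, a Python dict stands in for the custom database, and `ClusterStore`, `backup` and `split_clusters` are made-up names. The "what changed since the last backup" step is left out, so every cluster is hashed each time. Strictly speaking a hash identifies a cluster only with overwhelming probability, not with absolute certainty.

```python
import hashlib

CLUSTER_SIZE = 4096  # bytes, the typical NTFS cluster size quoted above


class ClusterStore:
    """Toy server: keeps each distinct cluster once, keyed by its hash."""

    def __init__(self):
        self.clusters = {}  # hash -> cluster bytes
        self.backups = {}   # (machine, volume) -> ordered list of hashes

    def missing(self, hashes):
        """Return the hashes the server does not hold yet."""
        return {h for h in hashes if h not in self.clusters}

    def store(self, machine, volume, hashes, new_clusters):
        """Add the new clusters and record the volume layout for this backup."""
        self.clusters.update(new_clusters)
        self.backups[(machine, volume)] = list(hashes)

    def restore(self, machine, volume):
        """Rebuild a volume by looking up each hash in order."""
        return b"".join(self.clusters[h] for h in self.backups[(machine, volume)])


def split_clusters(data):
    return [data[i:i + CLUSTER_SIZE] for i in range(0, len(data), CLUSTER_SIZE)]


def backup(store, machine, volume, data):
    """Client side: send hashes first, then only the clusters the server lacks."""
    clusters = split_clusters(data)
    hashes = [hashlib.sha256(c).hexdigest() for c in clusters]
    wanted = store.missing(hashes)
    payload = {h: c for h, c in zip(hashes, clusters) if h in wanted}
    store.store(machine, volume, hashes, payload)
    return len(payload)  # number of clusters actually transferred


# Two machines that share two clusters of the same book.
shared = (b"War and Peace, page 1".ljust(CLUSTER_SIZE, b".")
          + b"War and Peace, page 2".ljust(CLUSTER_SIZE, b"."))
mom = shared + b"Mom's letters".ljust(CLUSTER_SIZE, b".")
joey = b"Joey's games".ljust(CLUSTER_SIZE, b".") + shared

store = ClusterStore()
sent_mom = backup(store, "mom", "C:", mom)
sent_joey = backup(store, "joey", "E:", joey)
print(sent_mom, sent_joey, len(store.clusters))
```

Mom's backup transfers all three of her clusters, Joey's transfers only the one the server has not seen, and the store ends up holding four clusters for six clusters' worth of data. Note that every backup depends on that one shared store, which is why losing the disk it lives on is so costly.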

 
So as you can see they are using cool stuff to do the magic, but there are
chinks in their armor, chinks that would be fixed by RAID.  Basically you
need to protect WHS from a drive failure.  Unfortunately motherboard RAID
solutions are pretty much trash; they work but come with their own issues,
including abysmal speed.  Dedicated RAID controllers with XOR capabilities
are lightning fast and oh so expensive.  But if you pay the price then
WHS becomes a super efficient backup solution (AFAICT YMMV DSMIYLE).
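
For anyone wondering what the XOR part buys you, here is a minimal sketch of parity-based recovery. It assumes a RAID 5 style layout with a single parity block per stripe, which is my assumption and not something stated above; the drive contents and the `xor_blocks` helper are made up for illustration. A real controller does this in hardware across whole stripes.

```python
from functools import reduce


def xor_blocks(blocks):
    """Bytewise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three data drives, each holding one block of a stripe, plus one parity block.
data_drives = [b"backup-1", b"backup-2", b"backup-3"]
parity = xor_blocks(data_drives)

# Simulate losing the second drive, then rebuild it from the survivors plus parity.
survivors = [data_drives[0], data_drives[2], parity]
rebuilt = xor_blocks(survivors)
print(rebuilt == data_drives[1])
```

Because XOR is its own inverse, combining the surviving blocks with the parity block yields the missing one. That is the protection the backup database lacks when it sits on a single unduplicated disk.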

John W. Colby
Colby Consulting
www.ColbyConsulting.com 
-----Original Message-----
From: accessd-bounces at databaseadvisors.com
[mailto:accessd-bounces at databaseadvisors.com] On Behalf Of Jim Lawrence
Sent: Thursday, December 27, 2007 12:41 AM
To: 'Access Developers discussion and problem solving'
Subject: Re: [AccessD] WHS and backups

Hi John:

Thanks for the info. Backups are never any better than the media they are
on. There is nothing like a stack of hot-swappable 500GB drives to add a
degree of confidence to any backup solution.

Jim

-----Original Message-----
From: accessd-bounces at databaseadvisors.com
[mailto:accessd-bounces at databaseadvisors.com] On Behalf Of jwcolby
Sent: Wednesday, December 26, 2007 8:27 PM
To: 'Access Developers discussion and problem solving'; 'Access Developers
discussion and problem solving'
Subject: [AccessD] WHS and backups

Oh man oh man.  Before anyone rushes out to implement WHS as a backup
solution you need to know that it has a glaring weakness: the backup data
is not duplicated, nor can you force it to be duplicated.  Thus backups will
be (or may be) trashed if a disk fails.  I have to say I LOVE what WHS is
doing here but you need to be aware of this issue.

My intention is to soldier on, but I will be implementing a RAID solution in
order to prevent disk failures from impacting the solution.  It appears that
WHS has some pretty cool technologies, and it claims to allow access to
daily, weekly and monthly backups back as far as 10 years.  IF you have a
RAID solution you win, else you lose BIG-TIME if you have a hard disk
failure.


Sigh.



John W. Colby
Colby Consulting
www.ColbyConsulting.com 

--
AccessD mailing list
AccessD at databaseadvisors.com
http://databaseadvisors.com/mailman/listinfo/accessd
Website: http://www.databaseadvisors.com
