jwcolby
jwcolby at colbyconsulting.com
Thu Dec 27 00:59:20 CST 2007
Precisely. As part of my research into doing this I examined backing up files out to the internet. I have 5 megabit down, 500K up. Backing up 70 gigs of data turns into a multi-week affair, and that is if the ISP doesn't suddenly slap throttling on me. Backups in the modern age are just problematic.

WHS is a truly radical technology, but they cut some corners that should have been left square. There are a lot of people banging on them about this very issue, so maybe they will acknowledge the problem and fix it. Unfortunately, given the radical stuff they are doing (the cool part), they may have painted themselves into a corner not easily squared back up. All I know is that I like what they are trying to do, and if a RAID foundation is the cost of doing business then that is what I will do.

From various readings:

Unlike most backup products that operate at the file level, the Windows Home Server computer backup solution works on "clusters". Clusters are the lower-level constructs of the file system. They are usually 4 KB in size on most NTFS disks. The "magic" you are seeing is a result of the fact that Windows Home Server makes sure that any particular cluster is stored only once on the server, even if that cluster is found on multiple disks and within multiple files. This is known as "single instance storage" in geeky circles.

Here's some more detail on how this works:

* The server side of the solution is a database (not some off-the-shelf database, but one developed specifically for this application). The "records" in the database are clusters and hashes of those clusters (a hash is a number that, for practical purposes, uniquely identifies a cluster based on its contents). The database also contains information on the structure of a volume (NTFS file system information).
  If a cluster on the C: drive of Mom's computer has the first 4096 characters of "War and Peace" in it, and a cluster on the E: drive of Joey's computer has those same 4096 characters in it, then their hashes will be the same.
* When a computer is backed up to the server, the code on the client computer figures out which clusters have changed since the last backup.
* The software then calculates a hash for each of these clusters and sends just the hashes to the server.
* The server looks in its database of clusters to see if any have hashes that match those it just received. If a hash matches, then that cluster is already stored on the server.
* If they are NOT stored on the server already, then the computer sends them to the server and the server adds them to the database.
* All file system information is transferred and stored on the server such that a volume (from any machine) at any backup point (time) can be reconstituted from the database.

And this is how 220GB of data spread out across 4 computers can be stored in 98GB of space on your home server.

So as you can see they are using cool stuff to do the magic but... there are chinks in their armor, chinks that would be fixed by RAID. Basically you need to protect WHS from a drive failure. Unfortunately motherboard RAID solutions are pretty much trash; they work but come with their own issues, including abysmal speed. Dedicated RAID controllers with XOR capabilities are lightning fast and oh so expensive. But... if you pay the price then WHS becomes a super efficient backup solution (AFAICT YMMV DSMIYLE).

John W. Colby
Colby Consulting
www.ColbyConsulting.com

-----Original Message-----
From: accessd-bounces at databaseadvisors.com [mailto:accessd-bounces at databaseadvisors.com] On Behalf Of Jim Lawrence
Sent: Thursday, December 27, 2007 12:41 AM
To: 'Access Developers discussion and problem solving'
Subject: Re: [AccessD] WHS and backups

Hi John:

Thanks for the info.
Backups are never any better than the media they are on. There is nothing like a stack of hot-swappable 500GB drives to add a degree of confidence to any backup solution.

Jim

-----Original Message-----
From: accessd-bounces at databaseadvisors.com [mailto:accessd-bounces at databaseadvisors.com] On Behalf Of jwcolby
Sent: Wednesday, December 26, 2007 8:27 PM
To: 'Access Developers discussion and problem solving'; 'Access Developers discussion and problem solving'
Subject: [AccessD] WHS and backups

Oh man oh man. Before anyone rushes out to implement WHS as a backup solution you need to know that it has a glaring weakness: the backup data is not duplicated, nor can you force it to duplicate. Thus backups will be trashed (or may be) if a disk fails.

I have to say I LOVE what WHS is doing here, but you need to be aware of this issue. My intention is to soldier on, but I will be implementing a RAID solution in order to prevent disk failures from impacting the solution.

It appears that WHS has some pretty cool technologies, and it claims to allow access to daily, weekly and monthly backups back as far as 10 years. IF you have a RAID solution you win, else you lose BIG-TIME if you have a hard disk failure.

Sigh.

John W. Colby
Colby Consulting
www.ColbyConsulting.com

--
AccessD mailing list
AccessD at databaseadvisors.com
http://databaseadvisors.com/mailman/listinfo/accessd
Website: http://www.databaseadvisors.com
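
Editor's note: the cluster-hash steps quoted in the first message of this thread can be sketched roughly as follows. This is only an illustration of single instance storage under stated assumptions, not WHS's actual implementation. The names ClusterStore, cluster_hash and backup are hypothetical, SHA-256 is an assumed hash (the thread does not say which one WHS uses), and the sketch hashes every cluster of a volume rather than only the clusters changed since the last backup.

```python
from __future__ import annotations

import hashlib

CLUSTER_SIZE = 4096  # bytes; the "usually 4k" NTFS figure quoted in the thread


def split_clusters(data: bytes, size: int = CLUSTER_SIZE) -> list[bytes]:
    """Cut a volume image into fixed-size clusters (the last one may be short)."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def cluster_hash(cluster: bytes) -> str:
    """Assumed hash function: SHA-256 of the cluster contents."""
    return hashlib.sha256(cluster).hexdigest()


class ClusterStore:
    """Hypothetical server side: keeps one copy of each distinct cluster."""

    def __init__(self) -> None:
        # hash -> cluster contents, stored once no matter how many volumes use it
        self.clusters: dict[str, bytes] = {}
        # (machine, backup_id) -> ordered hashes, enough to rebuild the volume
        self.volumes: dict[tuple[str, str], list[str]] = {}

    def missing(self, hashes: list[str]) -> set[str]:
        """Return the hashes the server has never seen."""
        return {h for h in hashes if h not in self.clusters}

    def store(self, machine: str, backup_id: str,
              hashes: list[str], uploads: dict[str, bytes]) -> None:
        """Add newly uploaded clusters and record the volume layout."""
        self.clusters.update(uploads)
        self.volumes[(machine, backup_id)] = list(hashes)

    def restore(self, machine: str, backup_id: str) -> bytes:
        """Reconstitute a volume from its recorded list of hashes."""
        return b"".join(self.clusters[h] for h in self.volumes[(machine, backup_id)])


def backup(store: ClusterStore, machine: str, backup_id: str, volume: bytes) -> int:
    """Client side: send hashes first, then only the clusters the server lacks.

    Returns the number of cluster bytes actually uploaded.
    """
    clusters = split_clusters(volume)
    hashes = [cluster_hash(c) for c in clusters]
    needed = store.missing(hashes)
    uploads = {h: c for h, c in zip(hashes, clusters) if h in needed}
    store.store(machine, backup_id, hashes, uploads)
    return sum(len(c) for c in uploads.values())
```

If two machines are backed up and they share a cluster (the "War and Peace" case above), the second backup uploads only the clusters the store has not seen, and restore() can still rebuild either volume in full. Note that the sketch also shows the weakness discussed in this thread: every volume depends on the single stored copy of each cluster, so losing that copy damages every backup that references it.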