This is the third in a series of articles about a new backup process I have implemented for my home network. In the previous article I covered a mirror backup process that maintains a storage-efficient backup history. In this article I'll cover the tools I used and the issues I had to overcome while using them.
Common tools and a not-so-common use of them
Once I had decided to create a backup system that creates space-conserving mirror backups by leveraging NTFS hard links, I set out to build a simple prototype. It occurred to me that I already had a very good tool for copying data around: robocopy, a free tool from the Windows Resource Kit. Robocopy is a very powerful file-copying tool that can be configured in a multitude of ways, including the ability to copy files in backup mode, a special mode of file access that can bypass file security for the purposes of backing up files. It is also faster and more reliable than the file copy tools that come with Windows, and it has a very good set of options to control which files to copy. However, robocopy knows nothing about creating hard links to previous versions of files. That step I would have to do myself.
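For example, a single robocopy command can mirror a whole tree in backup mode while filtering what gets copied (the paths here are just illustrative):

# mirror C:\Data into D:\Backup using backup-mode access, skipping temp files
robocopy C:\Data D:\Backup /MIR /B /XF *.tmp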
In searching for information on how to create hard links, it wasn't long before I ran across references to the fsutil tool that is included in Windows XP and Windows Server 2003. Using this tool you can create NTFS hard links from the command line.
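The syntax takes the new link name first and the existing file second, and both paths must be on the same NTFS volume (the paths here are just illustrative):

fsutil hardlink create D:\Backups\new\report.doc D:\Backups\old\report.doc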
Together with robocopy and a bit of creative CMD scripting, I was able to throw together a prototype that could create mirror backups while hard-linking to the files that had not changed since the previous backup, just as rsync does. I started by duplicating the directory structure of the old backup, using robocopy to copy just the directories. Next I traversed the old backup directories and used fsutil to create a hard link to each of the older files in the new directories. Then I used robocopy to generate a list of the files that had changed since the last backup, including files that were no longer present, and deleted those files from the newly created mirror backup. Finally, I used robocopy to copy just the newer files into the new mirror backup. While it wasn't the most efficient method, it worked pretty well, but it had one important limitation: fsutil only works on local disks. It was also a pretty hacky bit of CMD script, since I had to do string manipulation to create the hard links. I had considered re-writing the whole process in C#, but then something else popped up on my radar.
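To make those steps concrete, here is a rough sketch of the prototype's logic. My original was CMD script full of string manipulation; for readability this sketch uses PowerShell syntax, and all the paths are illustrative:

# previous backup, new backup, and the live data to back up
$old = 'D:\Backups\2006-11'
$new = 'D:\Backups\2006-12'
$src = 'C:\Data'

# 1. duplicate the old backup's directory structure (directories only, no files)
robocopy $old $new /E /XF *

# 2. hard link every file from the old backup into the new tree
Get-ChildItem $old -Recurse | Where-Object { -not $_.PSIsContainer } | ForEach-Object {
    fsutil hardlink create ($_.FullName.Replace($old, $new)) $_.FullName > $null
}

# 3. delete from $new the files robocopy reports as changed or removed
#    (robocopy $src $new /MIR /L lists them without copying anything)
# 4. copy just the newer files into the new mirror
robocopy $src $new /MIR /B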
PowerShell, isn't that some sort of new gasoline?
It was about this time that Microsoft released RC2 of PowerShell (which has just recently gone RTM). PowerShell is Microsoft's new administrative scripting language for the future. Besides being a very good replacement for command shell scripting and VBScript, it is also the new foundation of the management tools for the next version of Microsoft Exchange. It is an amazingly powerful scripting language, easily learned, easily extended, and easily the most important tool I have learned in a long time.
PowerShell is different from other scripting languages because it is based on the concept of pipelining objects. Many scripting languages, including the native Windows shell, support pipelining text data from command to command. PowerShell is different in that it pipelines complete .NET objects instead of just textual data. As full .NET objects, each object in the pipeline has state, properties, and methods. Objects can be passed as parameters to functions, extended dynamically, coerced into other types, and placed back into the pipeline. Functions in PowerShell can also be treated as objects, allowing you to do some kinds of functional programming that are not easily done in other .NET languages. It is a very powerful idea, and my brief description doesn't even scratch the surface of the power that lies within PowerShell. It is all still very new to me, but already I am finding many uses for it.
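As a quick illustration, each stage of this pipeline receives real FileInfo objects, so it can filter and sort on typed properties instead of parsing text:

# list files over 1 MB in size, largest first
Get-ChildItem C:\Windows |
    Where-Object { $_.Length -gt 1MB } |
    Sort-Object Length -Descending |
    ForEach-Object { '{0,12:N0}  {1}' -f $_.Length, $_.Name }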
Tip: Here's a PowerShell gotcha to keep in mind. Every expression in PowerShell that produces output places that output in the pipeline. This can lead to pretty weird debugging issues if you aren't careful. I had more than one case where a function was returning more than I wanted because I was calling a command that placed things in the pipeline without my realizing it. There are two ways to avoid this: assign the output of the command to a variable, or redirect the output to $null (i.e. do-something > $null).
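A minimal example of the gotcha - ArrayList.Add returns the new index, which silently leaks into the function's output:

function Get-Answer {
    $list = New-Object System.Collections.ArrayList
    $list.Add('item')            # Add returns the index 0, which leaks into the output!
    return 42                    # the caller actually receives @(0, 42)
}

function Get-AnswerFixed {
    $list = New-Object System.Collections.ArrayList
    $index = $list.Add('item')   # fix 1: capture the output in a variable
    $list.Add('more') > $null    # fix 2: or redirect it to $null
    return 42                    # now the caller receives just 42
}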
PowerShell's object-pipeline nature, along with a rich set of built-in commands known as cmdlets, makes for a perfect system for administrative computer tasks. There are cmdlets for accessing PowerShell providers such as the file system and the registry, as well as WMI objects, COM objects, and the full .NET 2.0 Framework (a few one-liners below give a taste). I've seen examples of everything from simple file-parsing scripts to a simple but complete HTTP server written in PowerShell in just a few lines of code. To me it appeared to be the perfect language for scripting a new backup process. However, PowerShell does not offer support for creating NTFS hard links either. For this I would need to extend PowerShell.
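Here's a taste of that reach (the specific names are just illustrative):

Get-ChildItem HKLM:\SOFTWARE                             # the registry through a PowerShell provider
Get-WmiObject Win32_LogicalDisk                          # WMI objects
(New-Object -ComObject WScript.Shell).CurrentDirectory   # a COM object
[IO.Path]::GetTempPath()                                 # a direct .NET Framework call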
Extending PowerShell through custom C# objects and P/Invoke
Starting with Windows XP there is a new API for creating hard links, CreateHardLink. In previous versions of Windows, creating hard links was somewhat of a black art: you had to use the complex and sparsely documented Win32 Backup APIs. It could be done, and there are examples of how to do it out there, but it was not for the faint of heart. The CreateHardLink API solves that, making it almost trivial to create hard links on NTFS. Furthermore, unlike fsutil, the CreateHardLink API fully supports creating hard links on remote network NTFS drives. PowerShell cannot easily call native APIs on its own, though. To do that, you need to extend PowerShell with a bit of .NET code.
PowerShell is very easy to extend. You can write complete cmdlets, objects that fully plug into the PowerShell pipeline framework, or you can just create simple .NET objects that can be instantiated and invoked thanks to PowerShell's ability to access the .NET Framework.
Using C# and a bit of P/Invoke, it was almost trivial to solve the problem of not being able to create hard links in PowerShell (and .NET) by writing a simple object that called the Win32 CreateHardLink API. Once that was done, I could easily create my new .NET object in PowerShell and use it to create all the hard links I wanted. Now I could build a more complete backup script from the ground up using PowerShell.
If you'd like to access the CreateHardlink API in PowerShell or .NET, here is a C# code snippet to help you. Simply create a new class in a .DLL and add this method. I added this method as a static member since it does not require any state from the class. This also makes it very easy to call from PowerShell.
[DllImport("kernel32.dll", SetLastError = true, CharSet = CharSet.Auto)]
internal static extern int CreateHardLink(string lpFileName,
string lpExistingFileName, IntPtr lpSecurityAttributes);
static public void CreateHardlink(string strTarget, string strSource)
{
if(CreateHardLink(strTarget, strSource, IntPtr.Zero) == 0)
{
throw new System.ComponentModel.Win32Exception(Marshal.GetLastWin32Error());
}
}
To call this code from PowerShell, you simply load the .NET assembly and then call the static method on your class. Note that the method will throw an exception if it fails, so make sure you have a PowerShell trap handler somewhere in your script (a minimal example follows the snippet below).
# load the custom .NET assembly (redirect to $null so the returned
# Assembly object doesn't end up in the pipeline - see the tip above)
[System.Reflection.Assembly]::LoadFrom('YourLibrary.dll') > $null
# create a hard link
[YourLibraryName.YourClass]::CreateHardlink($Target, $Source) > $null
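A minimal trap handler for this might look like the following (the message text is just illustrative):

# catch the Win32Exception thrown when the hard link cannot be created
trap [System.ComponentModel.Win32Exception] {
    Write-Host "Failed to create hard link: $_"
    break   # stop the script rather than continue with a bad backup
}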
Whoops, that file is in use
There was still one more issue to tackle before I could write a robust backup system: accessing files that are in use. Starting with Windows XP, Microsoft introduced a new system for accessing files that are currently in use, the Volume Shadow Copy Service (VSS for short, not to be confused with Microsoft's VSS source control system, Visual SourceSafe).
One of the ideas behind VSS is that, when requested, the OS will make a read-only copy of the drive, a snapshot frozen in time, available to a backup program. Other programs can continue to change the original disk files, but this shadow copy, or snapshot, remains frozen and completely accessible to the program that created it. Furthermore, when a backup program requests that a shadow copy be created, the OS coordinates with the VSS writers to ensure that the data on the disk is in a consistent state before the shadow copy is created. This ensures that the files the backup program has access to are in a consistent enough state on disk to be backed up. It is especially useful for files that are either always open or always changing, like the system registry, user profiles, or Exchange and SQL databases. Once the backup program is finished with this temporary read-only shadow copy, it releases it and the snapshot disappears from the system. By using VSS, backup programs can gain access to every file on the drive, even files exclusively in use by other programs. For me it was essential to use VSS in any backup process I implemented.
There were a few tough problems, though. On Windows XP these VSS snapshots are very temporary: they exist only for as long as you hold a reference to them via COM. Once released, they auto-delete themselves. And unlike VSS on Windows Server 2003, they cannot be exposed as a drive letter for easy access. You have to access them via the native NT kernel's method of addressing NT namespace objects, the GLOBALROOT namespace. On XP, when you ask the VSS service to create a snapshot, what you get is an NT GLOBALROOT path that looks like this:
\\?\GLOBALROOT\Device\HarddiskVolumeShadowCopy1. Unfortunately, this is something that not even the native Windows command shell fully understands, and if you try to access it from PowerShell or .NET you'll get an exception telling you that you really shouldn't be accessing internal NT paths in .NET. To solve this I would need another bit of custom code to extend PowerShell.
VSHADOW.EXE and exposing a snapshot as a drive letter
VSHADOW is a sample tool that ships with the VSS SDK. It is a command-line interface to the VSS API: with it you can create and release VSS snapshots at will. It even has a way around the COM auto-destruction of snapshots on Windows XP, by allowing you to run an external program once the snapshot has been created so that you can access the snapshot while VSHADOW is still keeping it alive. It will even create a set of environment variables telling you the names of the GLOBALROOT shadow copies it has created. This still didn't solve my problem of not being able to access the snapshots from PowerShell (or robocopy, for that matter), but having the source code was a good start.
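A typical invocation looks something like this: vshadow snapshots C:, writes the environment variables to setvars.cmd via -script, and runs backup.cmd via -exec while it is still holding the snapshot alive (the file names here are illustrative):

vshadow.exe -script=setvars.cmd -exec=backup.cmd C: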
All physical devices in Windows, like hard drives, exist in the GLOBALROOT namespace. It is only through device name mapping that we can access them via their friendly DOS names like C:, D:, etc. Normally the OS creates these device mappings automatically at startup or whenever a new device is connected. VSS snapshots, however, don't automatically get recognized and mapped. Mapping a friendly name to a VSS snapshot has to be done directly through the Win32 DefineDosDevice API. Using this API you can create and remove DOS device mappings to VSS snapshots on the fly, even on Windows XP. But since VSS snapshots are temporary, you have to manage the mappings carefully or the system can become unstable.
Creating a VSS snapshot and mapping it to a DOS device name is well beyond what I wanted to attempt in C#. Luckily for me, the VSHADOW C++ source code was written in a very reusable manner, and I could easily reuse it by wrapping a COM object around it.
The not-so-nice COM interop experience with .NET 2.0
Creating a snapshot is not the simplest of procedures. You have to query for the list of VSS writers, map them against the target volume, determine which ones to include in the process, and finally request that the snapshot be created. You have to hold on to the VSS COM interface to keep the snapshot alive on XP for the duration of its use. When you are done, you have to release it in a controlled manner, or the VSS system can degrade completely, in most cases requiring a system restart to recover. It is not the fastest process either, something that would come back to bite me later. However, the VSHADOW source was written in such a way that it was very easy to turn into a COM object using ATL. It was as simple as creating a new ATL COM object project in Visual Studio and including the core VSHADOW source files in the project. Once I had it building as a COM object, it didn't take me long to put a .NET-friendly interface on it, exposing methods to create and destroy VSS snapshots as well as map them to DOS device names.
PowerShell has native support for creating and calling COM objects that is even easier than in other .NET languages. There is no need to create .NET interop classes; you just dynamically create the COM object and use it much as you would in VBScript. Once I had my new VSS COM object, it was trivial to create VSS snapshots on the fly and map them to DOS device names from PowerShell. I now had complete access to VSS snapshots from any tool that could access a standard drive. The approach has some limitations, but for this backup process it works very well.
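In use it looked something like this. The ProgId and method names below are placeholders for my custom object, not a published API:

$vss = New-Object -ComObject 'MyBackup.VssSnapshot'   # hypothetical ProgId
$vss.CreateSnapshot('C:\')                # create the snapshot and hold it alive
$vss.MapToDrive('S:')                     # expose it as S: via DefineDosDevice
robocopy S:\Data D:\Backups\new /MIR /B   # any tool can now read the snapshot
$vss.Cleanup()                            # release everything (more on this below)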
Releasing the VSS snapshot in PowerShell, however, was another story. There is no clean way that I could find to force a COM object to be released in PowerShell; you have to wait for the .NET garbage collector to do its thing, which is usually not until the PowerShell process is exiting. My COM object had its clean-up code in its Release method, so that when it was released it would tear down the VSS state in the proper way, ensuring that the system remained stable. Unfortunately for me, relying on a COM object's Release method to run during the .NET shutdown process proved to be one huge headache.
After many, many hours of debugging and not really believing what I was seeing, I finally had to accept what was going on. From what I observed and the research I have done, it is my understanding that finalizers in .NET, which are called when an object is being destroyed and which are also responsible for calling a COM object's Release method in PowerShell, are not guaranteed to complete when a process shuts down. Usually this is not a problem, as the process is going away anyway. It is a problem, however, when you have native resources to release.
What I was seeing, and not believing for literally hours and hours, was that in the middle of my COM object's Release method the PowerShell process would just exit normally. No exceptions, no faults, nothing - just poof, it's gone. And every time it did this it would leave the VSS system in such a state that the machine had to be restarted, because my VSS clean-up code, which can be a lengthy process, was never given the chance to run to completion. It seems that the PowerShell shutdown process was timing out my clean-up code. It was a complete mess, and one that I still cannot believe is acceptable, but apparently to the folks who created .NET it is (you can read about it here in way more detail than anyone should have to know; just search for "timeout" and "watchdog" on that page). The thought that external native code can have the plug pulled just blows me away.
The fix was rather simple once I accepted that I could not count on my COM object's Release method to always complete: I moved all the critical clean-up code into a public method that my PowerShell script always calls. Luckily PowerShell has pretty decent error handling, and it wasn't too hard to ensure that I always call the clean-up method on my COM object before PowerShell terminates normally. I'm still not thrilled about this, though. I would have preferred that my COM object be allowed to clean up after itself as necessary.
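Continuing with the hypothetical object from above, the pattern is simply a trap plus an explicit call on the normal path, so the clean-up method runs no matter how the script ends:

$vss = New-Object -ComObject 'MyBackup.VssSnapshot'
trap { $vss.Cleanup(); break }   # clean up even on a terminating error
$vss.CreateSnapshot('C:\')
# ... perform the backup ...
$vss.Cleanup()                   # clean up on the normal path too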
The moral of this story is that you are responsible for all complex clean-up even when calling native code. Don't depend on the .NET framework to always play nice.
Now that I had this behind me I had all the pieces that I needed: a robust file copy tool, a powerful scripting language, the ability to create hard links, and full access to Volume Shadow Copy snapshots.
In part four I'll cover the process overview and implementation details of the intelligent mirror backup process that I chose as the foundation of my new backup strategy.