Thursday, July 16, 2009

VMware Fusion and Time Machine

Been a while since I blogged; swamped in work.

This is a quick post to share a solution I found on Mac OS X that makes VMware Fusion (or Parallels) and Apple's Time Machine backup work better together.

I use a fair number of virtual Windows and Linux machines, and these are stored primarily in large virtual hard disk files - 40GB, 60GB, ... pretty massive.

Because Time Machine finds and copies any modified file, and merely running a virtual Windows machine marked these large files as 'modified', I was faced with endless copies of massive virtual hard disk files to my Time Machine hard drive.

Figuratively speaking, a one-byte change in one of those massive Windows virtual hard disk files would cause a 60GB copy operation - which also indirectly forced older backup files to be erased from the Time Machine drive to make room for these behemoths.
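
To put a number on that, here is a minimal sketch of whole-file backup accounting. It is only a model, not Time Machine's actual logic, and the file names and sizes are made up for illustration.

```python
GB = 1024 ** 3  # bytes per gigabyte (binary)


def bytes_copied(file_sizes, modified):
    """Return the bytes a whole-file backup copies for one backup run.

    file_sizes maps a file name to its size in bytes; modified is the set
    of names that changed since the previous run. Any modified file is
    copied in full, however small the change was.
    """
    return sum(size for name, size in file_sizes.items() if name in modified)


# Hypothetical VM folder: one big virtual disk plus a small settings file.
vm_files = {"Windows.vmdk": 60 * GB, "Windows.vmx": 4096}

# A one-byte change inside the disk image still marks the whole file.
copied = bytes_copied(vm_files, modified={"Windows.vmdk"})
print(f"{copied / GB:.0f} GB copied")
```

The size of the change never enters the calculation; only the size of the file that contains it does.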

Initially, I put all my virtual machines in a single folder, and explicitly excluded that folder from the Time Machine backup - not a good solution, but at least, my backup drive was not swamped with those big files.

But then, a few days ago it dawned on me - I could have my cake and eat it too!

The solution is to use the 'snapshot' feature of VMware (Parallels also has it) - you can make a snapshot of a virtual machine, so you can always 'undo' whatever happened to that virtual machine since the last snapshot.

The way VMware handles this is by 'freezing' the underlying virtual hard disk, and storing any changes made since the snapshot to the frozen hard disk in separate files.

And that's the solution: I first make sure my virtual machine is in a useful, stable state, and then I make a snapshot (I call it 'Base').

That effectively 'locks' the massive many-GB virtual drive - so running VMware does not cause it to be modified any more, and any changes from then on are kept in a bunch of much smaller snapshot files.

Time Machine now makes a copy of my 'frozen' many-GB virtual drive once, and from then on only backs up the changes that are kept in the snapshot files - which results in a much smaller backup set.
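
A rough back-of-the-envelope comparison of the two setups, as a sketch. It assumes the changes since the snapshot live in a single file that grows by a fixed amount per session and is re-copied in full after each session; the 60 GB and 0.5 GB figures are illustrative, not measurements.

```python
def total_backup_gb(sessions, base_gb, growth_gb, use_snapshot):
    """Estimate the total GB written to the backup drive.

    The first backup copies the base disk in both setups. After that,
    each VM session triggers one backup run:
      - without a snapshot, the whole base disk is copied again;
      - with a snapshot, only the (growing) changes file is copied.
    """
    total = base_gb
    changes_file_gb = 0.0
    for _ in range(sessions):
        if use_snapshot:
            changes_file_gb += growth_gb
            total += changes_file_gb
        else:
            total += base_gb
    return total


without = total_backup_gb(10, base_gb=60, growth_gb=0.5, use_snapshot=False)
with_snapshot = total_backup_gb(10, base_gb=60, growth_gb=0.5, use_snapshot=True)
print(f"no snapshot:   {without} GB")
print(f"with snapshot: {with_snapshot} GB")
```

Note that in this model the changes file is itself re-copied each time, so its cost creeps up as it grows.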

After a while, when the snapshot files have grown to many GB, I make a new snapshot and delete the old one. That 'merges' all changes that occurred 'between' the two snapshots into the main virtual drive, and I start again with a clean slate.

After I do that, Time Machine backs up the behemoth once more, and from then on it again backs up only the changes in the new set of snapshot files.
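
For anyone who prefers to script this rotation, here is a rough sketch. It assumes VMware's `vmrun` command-line tool is installed and on the PATH, and that its `snapshot` and `deleteSnapshot` commands are available in your Fusion version. The .vmx path and the snapshot names are made up. The script only builds and prints the commands; it does not run them.

```python
import shlex


def snapshot_rotation_commands(vmx_path, old_name, new_name, vmrun="vmrun"):
    """Build the two vmrun invocations for one snapshot rotation.

    Order matters: take the new snapshot first, then delete the old one,
    so the VM is never left without a frozen base.
    """
    return [
        [vmrun, "-T", "fusion", "snapshot", vmx_path, new_name],
        [vmrun, "-T", "fusion", "deleteSnapshot", vmx_path, old_name],
    ]


# Hypothetical VM location and snapshot names.
commands = snapshot_rotation_commands(
    "/Users/me/Virtual Machines/Windows.vmwarevm/Windows.vmx",
    old_name="Base",
    new_name="Base 2",
)
for argv in commands:
    print(shlex.join(argv))
```

To actually run them, each list could be passed to `subprocess.run(argv, check=True)`, ideally after trying it on a VM you can afford to lose.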

I now have a good backup of my virtual machines again without having to jump through hoops!




kR said...

You can also split the VM disk into 2GB files so only the changed chunks will be backed up instead of the entire disk each time.

You can enable that by going to Hard Disk in the VM's properties and clicking the checkbox that says "Split into 2 GB files".

Kris Coppieters said...

Hi kR, thanks for your comment!

Yes, I am aware of that - I have tried that approach too, but for me it did not work.

Any 'reasonable' kind of VMware session (i.e. where I did some real work - not just boot up followed by shut down) seemed to modify just about all of my 2GB segment files, so Time Machine again ended up copying the whole lot each time I used the virtual machine, independent of whether I kept it in a 60GB disk or 30 x 2GB slices.

The 'snapshot' approach on the other hand works really well.
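
A small sketch of why the split-disk approach can end up copying nearly everything. It assumes, purely for illustration, that a working session produces writes scattered at random positions across the virtual disk; real guest file systems will behave differently, but the observation above suggests the effect is similar.

```python
import random

GB = 1024 ** 3  # bytes per gigabyte (binary)


def touched_segments(write_offsets, segment_size):
    """Return the indexes of the segment files hit by at least one write."""
    return {offset // segment_size for offset in write_offsets}


disk_size = 60 * GB
segment_size = 2 * GB

# Hypothetical session: 200 small writes at random disk offsets.
rng = random.Random(0)
offsets = [rng.randrange(disk_size) for _ in range(200)]

touched = touched_segments(offsets, segment_size)
print(f"{len(touched)} of {disk_size // segment_size} segments would be re-copied")
```

Every touched segment is a 2GB file that the backup copies in full, so a few hundred scattered writes are enough to drag in most of the disk.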

cool1o said...

Given that snapshots are stored in the same directory as the VM files, how does this solve the problem?

Seems like the VM files and the snapshots need to live in different directories. AFAIK you cannot split them apart in VMware Fusion 3.

Hopefully I am wrong about that...?

Kris Coppieters said...

Time Machine makes backups on a per-file basis.

Whether these files are together in a folder or not has no relevance.

If a file has not been modified it is not backed up. Instead, Time Machine makes a hard link to the unmodified file - so if a file has not been modified over the course of 10 Time Machine backups, you will see 10 copies of that file, but they'll all be hard links to a single, unmodified copy.

The trick makes sure that the 'bulk' of the virtual machine is in a single, big, unmodified file, and all changes are channeled into a much smaller 'changes to the base' file.

Hope that clarifies it for you...
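
The "many copies, one file" effect described above is what file-system hard links give you. Here is a minimal sketch of the idea using a temporary folder. It is not Time Machine's own code, and the folder and file names are made up.

```python
import os
import tempfile


def hard_link_demo():
    """Create a file in one 'backup' folder and hard-link it into a second.

    Returns (same_inode, link_count): whether both names refer to the same
    underlying file, and how many names that file now has.
    """
    with tempfile.TemporaryDirectory() as root:
        first = os.path.join(root, "backup-1")
        second = os.path.join(root, "backup-2")
        os.mkdir(first)
        os.mkdir(second)

        original = os.path.join(first, "disk.vmdk")
        with open(original, "wb") as handle:
            handle.write(b"x" * 1024)

        # The second backup gets a new name for the same data, not a copy.
        linked = os.path.join(second, "disk.vmdk")
        os.link(original, linked)

        a, b = os.stat(original), os.stat(linked)
        return a.st_ino == b.st_ino, a.st_nlink


print(hard_link_demo())
```

Both folders show a full-size file, yet the data is stored only once on disk.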



Chris said...

Kris, just to be clear: did you *remove* the Time Machine exclusion for your Virtual Machines folder to make this setup work? (I assume it put the snapshot files in this folder, or inside the .vmwarevm bundle.)
And are you using the AutoProtect feature to make your snapshots, or manually?

Kris Coppieters said...

I have no exclusions in Time Machine - i.e. my whole machine is backed up - including my VMware virtual machines. Because of the snapshotting, the size of the modified files inside the virtual machine stays reasonable - the 'main' snapshot file remains unchanged and is not re-backed up.
No, I am not using AutoProtect - though you could; it would work (but probably you still need to occasionally 'merge' them back into the main 'base' snapshot - after which you get a one-time 'whopper' backup, and then the regular regime resumes).

respuestafacil said...

I used this article, amongst some others, to write an entry on my blog entitled VMware Fusion Backup Time Machine. My Latin readers will say thanks for sure!

Anonymous said...

Thanks a lot! I was wondering how to handle this with Time Machine, then wondered whether splitting the VM hard disk would do the trick, and happily I found your page!