Part II: Rebuilding ZEUS – The Operating System, FileSystem & Virtualisation
Now that I’ve decided what I want out of the server (and the hardware I’ve got), it’s time to work out what operating system to run it on. Currently, ZEUS is running Ubuntu Gutsy (7.10) with LVM and an XFS volume holding approximately 2.5TB worth of data. There’s a cron job that defrags the XFS volume to keep things in order.
The Operating System
As the operating system is no longer maintained (my oversight as to how long it would survive), I have to find an OS that supports the hardware platform without hacky hacky bits (by this I mean avoiding buggy ACPI, issues with the NForce4 chipset and IRQ problems) and has a file system that will serve me well long term.
There were a few considerations:
I like Ubuntu, I’m comfortable with the user land and find the Debian package system (in particular the dependency resolving) most impressive. Hardware is well supported and 8.04.3 (at the time of writing) boots on the hardware I originally selected (Intel) and the new configuration I recently selected (AMD). I could most definitely use Ext4 but the problems with data-loss (which I’ve reproduced on several occasions on desktop machines) scare me.
FileSystem: I’d have to adopt either XFS or Ext4 on an LVM to factor in future-proofing, maybe get some fakeRAID happening for redundancy.
Installation: comes with a Server edition that’s bare bones allowing it to be a minimalistic installation which is always nice!
Initially, when I started to rebuild Zeus back in April, I wanted to use Ubuntu 9.04. I was really excited about Ext4 and the promise of a brand-spanking new file-system and what it would bring to the table. Unfortunately, after using Ext4 with 9.04 I’ve come to realise it’s probably not the wisest to trust your data with it just yet, unless you get yourself a UPS! Laptop seems to be chugging nicely though.
Installation: Like LTS, comes with a Server edition that’s bare bones allowing it to be a minimalistic installation which is always nice! (copy/paste!) Unfortunately, picking 9.04 when 9.10 is just around the corner is not going to be ideal; I’ll be stuck with where I am right now in a year or so.
CentOS 5.3 or wait for CentOS 5.4 coming soon!
CentOS is based off Redhat Enterprise Linux and having only started using CentOS after joining HSD (though my first splash in Linux was with Bowtie Redhat Linux 6.0 – I still have the CD!) I find the RHEL clone to be stable and solid as a rock! CentOS 5.4 will (thanks to RHEL 5.4) bring some KVM goodness that is a high priority on my list.
FileSystem: Ext4 is not in the RHEL5 stream (have to wait for RHEL6 – which is a good thing!) so the best bet here is to use XFS with LVM.
Installation: Customised via kickstarts or setup a cobbler server to install via PXE.
OpenSolaris has some really exciting technologies that are worth a look, ZFS in particular. ZFS breaks all the rules, and for a good reason. I’ve yet to really use Solaris (quite cosy with Linux), but there’s nothing like getting used to another OS by jumping straight into the deep end!
FileSystem: ZFS all the way baby!
Installation: Installation is via a LiveCD. I’ve looked into customising the installation so it removes some bits I don’t want, which I’ll discuss later, but it’s quite fiddly.
- Very simple administration – you only use two commands, zpool and zfs.
- Highly scalable – ZFS is a 128-bit file system. A single file can be up to 16 exabytes (2^64 bytes, or roughly 18 million terabytes), and the 128-bit design puts the limit for a whole pool far beyond that. More porn for you! XFS can no doubt handle the TBs we use for our home boxes now, but there’s no chance you can get the performance or benefits of ZFS in Ext3/Ext4 or XFS.
- Data integrity to heal a filesystem (no fsck’ing around!) – 256-bit checksumming protects the data; if ZFS detects a problem it will attempt to reconstruct the bad block (utilising available redundancy) and continue on its merry way.
- Compression – you can elect to compress a particular file-system or a hierarchy with just one command! I’m thinking things like logs here.
- No hardware dependency – JBOD on a controller, let ZFS maintain the RAID volumes in software. Check out Michael Pryc’s crazy adventure with ZFS using USB thumb drives and Constantin’s original voyage with USB drives! RAID-Z is essentially RAID-5 without the write-hole problem that has plagued RAID-5 when power is lost during a write. It can also survive the loss of a drive (with RAID-Z2 you can lose two drives).
- Happy snaps for free! Snapshot a (live) file-system as many times as you like, again with one easy command. It’s like that tendency to hit CTRL+S when you’re working in Windows from back in the days of Windows 9x: snapshot regularly!
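To make the data-integrity point above a bit more concrete, here is a toy model in Python of "checksum every block, and repair a bad copy from a good one". This is only a sketch of the idea, not how ZFS is actually implemented: the `ToyMirror` class and its block layout are made up for illustration, and SHA-256 is simply a convenient 256-bit checksum to stand in for whatever a real pool is configured to use.

```python
import hashlib


def checksum(block: bytes) -> bytes:
    """Return a 256-bit checksum of a block (SHA-256 is used as a stand-in)."""
    return hashlib.sha256(block).digest()


class ToyMirror:
    """Hypothetical two-way mirror: two copies of each block plus a checksum
    that is stored separately from the data it protects."""

    def __init__(self):
        self.copies = [{}, {}]   # two pretend disks: block number -> bytes
        self.checksums = {}      # block number -> expected checksum

    def write(self, blockno: int, data: bytes) -> None:
        for disk in self.copies:
            disk[blockno] = data
        self.checksums[blockno] = checksum(data)

    def read(self, blockno: int) -> bytes:
        expected = self.checksums[blockno]
        good = None
        damaged = []
        # Verify every copy against the stored checksum.
        for disk in self.copies:
            if checksum(disk[blockno]) == expected:
                if good is None:
                    good = disk[blockno]
            else:
                damaged.append(disk)
        if good is None:
            raise IOError("no intact copy of block %d" % blockno)
        # Self-heal: overwrite damaged copies with the verified one.
        for disk in damaged:
            disk[blockno] = good
        return good


mirror = ToyMirror()
mirror.write(0, b"holiday photos")
mirror.copies[0][0] = b"holiday ph0tos"   # simulate silent corruption on one disk
data = mirror.read(0)                     # read detects and repairs the bad copy
print(data, mirror.copies[0][0] == mirror.copies[1][0])
```

The point of the toy is that the read path does the checking, so a bad copy gets noticed and fixed the next time the block is touched rather than at some later fsck.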
So in case the sudden influx of OpenSolaris posts didn’t give you the hint, I decided on OpenSolaris to power the new iZeus 2.0. Actually no, that sounds lame; zeusy will be the new ZEUS until ZEUS is retired, in which case zeusy becomes zeus (confused?).
ZFS is one of those file-systems you look at and think, wow! Why didn’t anyone else think of that before?
So ZFS sounds much like marketing spiel right now: best thing since sliced bread, cooler than a cucumber. And you’d be right, it is cool, and it is the best thing since filesystems came into being. Over the coming days I’ll post some more on my musings with ZFS, keeping in mind that I’m still learning these things. It helps to have lots of hardware to play with, but even if you don’t, you can knock up a virtual version of OpenSolaris in VirtualBox, create some virtual disks and try it out.
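For a rough idea of what "trying it out" might involve, here is a small Python sketch that only prints a plausible sequence of commands; it does not run anything. The assumptions are all mine: the pool name `testpool`, the directory `/export/zfs-test`, the `logs` file system and the snapshot name are made up, and instead of VirtualBox virtual disks (whose device names vary from box to box) it uses plain files created with `mkfile` as stand-in disks. Check the `zpool` and `zfs` man pages on your own install before running any of it.

```python
import shlex


def zfs_trial_commands(pool="testpool", disk_dir="/export/zfs-test",
                       disks=3, size="128m"):
    """Build (but do not run) commands for a throwaway RAID-Z test pool.

    All names here are hypothetical placeholders for illustration.
    """
    paths = ["%s/disk%d" % (disk_dir, i) for i in range(disks)]
    cmds = [["mkfile", size, path] for path in paths]          # stand-in disks
    cmds.append(["zpool", "create", pool, "raidz"] + paths)    # RAID-Z pool
    cmds.append(["zfs", "create", pool + "/logs"])             # a file system
    cmds.append(["zfs", "set", "compression=on", pool + "/logs"])
    cmds.append(["zfs", "snapshot", pool + "/logs@before-changes"])
    cmds.append(["zpool", "status", pool])
    return cmds


# Dry run: print each command so it can be reviewed before typing it in.
for cmd in zfs_trial_commands():
    print(" ".join(shlex.quote(arg) for arg in cmd))
```

Printing rather than executing is deliberate: the commands need root on an OpenSolaris box, and a throwaway pool is best created by hand the first few times anyway.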
There are a few caveats that I’ve come across using ZFS though, and one is memory! ZFS will try to cache as much data as it can in RAM, so if you have 8GB of RAM (as I have in this box) it will happily use as much of it as it can afford. And rightfully so: I was getting ~96MB/s transferring a 16GB MPEG from one box to the other over our Gigabit link (that’s from one end of the house to the other!). Mind you, this was just a test configuration using 2x 74GB Western Digital Raptors (WD740ADFD) in a RAID-0 style hitting a single 150GB Western Digital Raptor (WD1500ADFD). They could have gone much higher, but I was happy with that.
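To put that ~96MB/s in context, here is a quick back-of-the-envelope calculation in Python. It assumes decimal units (1MB = 1,000,000 bytes, 1GB = 1,000MB) and ignores Ethernet and protocol overhead, so treat the numbers as rough; the function names are just for illustration.

```python
GIGABIT_BITS_PER_SEC = 1_000_000_000  # nominal line rate of a Gigabit link


def link_ceiling_mb_per_sec(bits_per_sec=GIGABIT_BITS_PER_SEC):
    """Theoretical ceiling in MB/s, ignoring all protocol overhead."""
    return bits_per_sec / 8 / 1_000_000


def transfer_seconds(size_gb, rate_mb_per_sec):
    """Seconds needed to move size_gb gigabytes at a steady rate."""
    return size_gb * 1000 / rate_mb_per_sec


ceiling = link_ceiling_mb_per_sec()      # 125.0 MB/s
utilisation = 96 / ceiling               # fraction of the ceiling actually used
seconds = transfer_seconds(16, 96)       # time for the 16GB file at 96MB/s

print(f"{ceiling:.0f} MB/s ceiling, {utilisation:.0%} used, "
      f"{seconds:.0f} s for 16 GB")
# prints: 125 MB/s ceiling, 77% used, 167 s for 16 GB
```

So the transfer was running at roughly three quarters of what the wire could carry in theory, and the whole file moved in under three minutes.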
There are also (as of writing) no recovery tools for ZFS, though these are slated to arrive soon (Q4 2009). That’s quite scary after you read this post about a guy losing 10TB worth of data; however, reverting to an older uberblock may fix some problems.
Initially I wanted to concentrate quite a bit on virtualisation, so I tried Xen on OpenSolaris. It was quite easy to set up a Xen Dom0 in OpenSolaris, but with the 2009.06 release you had to tweak the Xen setup a bit. I wasn’t too enthusiastic about using Xen after seeing the performance lag in Windows in my musings. Instead I’m opting for my crush, VirtualBox.
So why use VirtualBox when you can get a bare-metal hypervisor? Firstly, performance seems to be sluggish with Xen for me (I didn’t investigate this too much). Secondly, I want to be able to run the latest and greatest OSes out there without worrying about upgrading Xen (I’m a sucker for OSes!). VirtualBox development has accelerated at a feverish pace; I started with VirtualBox 1.3 in 2007 and it’s come an insanely long way since then. When a new release comes along, it’s as easy as updating VirtualBox and getting all the benefits. Plus, with Sun/Oracle’s backing of VirtualBox you know things are going to work well on OpenSolaris, and the Extras repository of VirtualBox makes it as easy as doing a pkg update.
I’m still quite intrigued by the way KVM is heading and how it will pan out, but for the future zeus, it will be VirtualBox.