This post is the result of a challenge given to me by Seth Vidal, which showed up in his weekend blog post. He was musing about whether it's possible to actually do a kickstart, or even an interactive install, in a cloud instance. I have to put some disclaimers around this post, because I am _not_ advocating this approach, and I'm going to show you a feature of Eucalyptus 3 that could void your warranty if used in anger. As my friend Michael likes to say, if you break it, you get to keep both pieces.
What we hashed out on Friday was that, in order to be able to kickstart inside an instance, you have to be able to pass boot parameters. In Eucalyptus 2, the only real way to do this was by patching the node controller with something similar to the NEuca patches. In Eucalyptus 3, we've implemented a sort of "escape hatch" called nc-hooks to allow folks to customize behaviors at instance definition and launch time. There's an example shell script in /etc/eucalyptus/nc-hooks/ which shows how you might write your own hooks.
Knowing that the nc-hooks feature existed, I had to think about exactly how to pass boot parameters and get them into libvirt.xml before instance launch. Passing them via userData was the obvious choice. I came up with a couple of XSLT files and this script to make the magic happen:
#!/bin/sh
# nc-hook: copy boot parameters from userData into libvirt.xml before boot
event=$1
euca_scripts=/home/eucalyptus/scripts
inst_home=$3
rewrite_libvirt_xml() {
    # Get only the value of the "bootparams=..." line from userData.
    # The "|| exit 1" sits outside $( ... ); inside, it would only leave the subshell.
    BP=$( xsltproc "$euca_scripts/get-user-data.xsl" "$inst_home/instance.xml" \
          | base64 -d \
          | sed -r "/bootparams=/!d; s/^.*bootparams=(.*)/\1/" ) || exit 1
    # No bootparams in userData: leave libvirt.xml untouched
    [ -n "$BP" ] || return 0
    # Substitute the value of $BP into the stylesheet
    # (assumes $BP contains no "!", "&" or "\", which are special to sed here)
    sed -e "s!@@BOOTPARAMS@@!$BP!" < "$euca_scripts/insert-boot-params.xsl" \
        > "$inst_home/insert-boot-params.xsl" || exit 2
    # Rewrite and replace libvirt.xml for this instance, keeping a backup
    xsltproc "$inst_home/insert-boot-params.xsl" "$inst_home/libvirt.xml" \
        > "$inst_home/libvirt.xml.new" || exit 3
    cp "$inst_home/libvirt.xml" "$inst_home/libvirt.xml.orig"
    mv -f "$inst_home/libvirt.xml.new" "$inst_home/libvirt.xml"
}
case "$event" in
    euca-nc-pre-boot)
        rewrite_libvirt_xml
        exit 0
        ;;
    *)
        exit 0
        ;;
esac
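The decode-and-filter step of the hook can be tried on its own, away from a node controller. The sketch below fakes the base64 payload that the first stylesheet would pull out of instance.xml (the sample userData text is made up for illustration) and runs it through an equivalent pipeline. It uses `sed -n 's/...//p'` instead of the `-r` form in the script; that is a substitution chosen here for portability, not what the hook itself runs.

```shell
# Made-up userData: one unrelated line plus the bootparams line
USERDATA='# some other user data
bootparams=ksdevice=link ip=dhcp vnc'

# Stand-in for what xsltproc would extract from instance.xml (base64 text)
ENCODED=$(printf '%s\n' "$USERDATA" | base64)

# Decode, keep only the bootparams line, and strip everything up to "bootparams="
BP=$(printf '%s\n' "$ENCODED" | base64 -d | sed -n 's/^.*bootparams=//p')

printf '%s\n' "$BP"
```

Lines that do not contain `bootparams=` are dropped, so other userData content can sit alongside the boot parameters.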
I don't have a vast amount of experience when it comes to XML processing, so forgive the horror of these stylesheets. The first one, get-user-data.xsl, is quite simple:
<?xml version="1.0" encoding="UTF-8"?>
<xsl:transform xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
<xsl:output encoding="UTF-8" indent="yes" method="text"/>
<xsl:template match="/instance">
<xsl:value-of select="/instance/userData"/>
</xsl:template>
</xsl:transform>
The second is a little stranger, and was done with some help from StackOverflow:
<?xml version="1.0" encoding="UTF-8"?>
<xsl:transform xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
<xsl:output encoding="UTF-8" omit-xml-declaration="yes" indent="yes" method="xml"/>
<xsl:template match='node()|@*'>
<xsl:copy>
<xsl:apply-templates select='node()|@*'/>
</xsl:copy>
</xsl:template>
<xsl:template match="cmdline">
<cmdline>@@BOOTPARAMS@@</cmdline>
</xsl:template>
</xsl:transform>
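One caveat with filling in @@BOOTPARAMS@@ via sed: an `&` in a sed replacement stands for the matched text, and a bare `&` is not well-formed inside an XML stylesheet either, so a kickstart URL with a query string would not survive the substitution intact. The following is a minimal sketch of one way to guard against that. The helper name `escape_bootparams` and the sample URL are hypothetical, and a one-line string stands in for the real stylesheet.

```shell
# Hypothetical helper (not part of the original hook): XML-escape the value,
# then escape the characters that are special in a sed replacement when "!"
# is the delimiter (backslash, ampersand and "!").
escape_bootparams() {
    printf '%s\n' "$1" \
        | sed -e 's/&/\&amp;/g' -e 's/</\&lt;/g' \
        | sed -e 's/[\\&!]/\\&/g'
}

# Made-up boot parameters containing an ampersand
BP='ks=http://example.com/ks.cfg?a=1&b=2 vnc'
ESCAPED=$(escape_bootparams "$BP")

# Same substitution as in the hook, applied to a one-line stand-in
# for insert-boot-params.xsl
RESULT=$(printf '%s\n' '<cmdline>@@BOOTPARAMS@@</cmdline>' \
    | sed -e "s!@@BOOTPARAMS@@!$ESCAPED!")

printf '%s\n' "$RESULT"
```

The `&amp;` left in the stylesheet is what an XML parser expects; the boot parameters end up with a plain `&` once libvirt reads the rewritten file.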
So with these files in place, I now need to configure an installer kernel and ramdisk. These come from the /fedora/releases/16/Fedora/x86_64/os/images/pxeboot/ directory of your favorite Fedora mirror site. The kernel and ramdisk registration process is the usual:
- euca-bundle-image --kernel true -i vmlinuz
- euca-upload-bundle -b f16 -m /tmp/vmlinuz.manifest.xml
- euca-register f16/vmlinuz.manifest.xml
- euca-bundle-image --ramdisk true -i initrd.img
- euca-upload-bundle -b f16 -m /tmp/initrd.img.manifest.xml
- euca-register f16/initrd.img.manifest.xml
I don't really *need* a disk image here, but an EMI cannot be registered without one, so I fake it:
- dd if=/dev/zero of=fake-emi.img bs=1k count=10000
- mke2fs fake-emi.img
- euca-bundle-image -i fake-emi.img
- euca-upload-bundle -b f16 -m /tmp/fake-emi.img.manifest.xml
- euca-register --kernel eki-EA183EA8 --ramdisk eri-6ED23EF2 f16/fake-emi.img.manifest.xml
Note that even if you aren't trying to do key injection, the disk image needs to have an ext2-compatible filesystem on it.
Next, I need a volume to install into:
- euca-create-volume -s 10 -z PARTI00
Before I boot the instance, here's where I have to be honest with you readers: I make a lot of mistakes when I test things like this. Typos, logic errors, you name it. So for debugging purposes, I uncomment this line in /etc/eucalyptus/libvirt.xsl:
<graphics type='vnc' port='-1' autoport='yes' keymap='en-us' listen='0.0.0.0'/>
You definitely should not have this line uncommented for normal use, as it will allocate a port for vnc for every instance you launch, and without some extra configuration, it doesn't even require a password to connect. For quick debugging on a safe network, though, it's a good way to see what's going wrong during the boot process.
Now to launch my installer instance:
euca-run-instances -t m1.xlarge \
-d "bootparams=ksdevice=link ip=dhcp vnc keymap=us lang=en_US console=ttyS0" \
emi-BA8F405E
This boots into an interactive install, which listens for vnc connections. Note that due to the size of the initrd, this instance needs a significant amount of RAM; I used 2GB, but 1GB would have worked. Before proceeding, I attach the volume (which I could have done via block device mapping):
euca-attach-volume -i i-447E3E89 -d sdd vol-14AE3F68
I check euca-describe-instances for the instance's IP address, connect to it with a vnc client, and proceed with the install. Once the install completes, I detach the volume and terminate the instance:
- euca-detach-volume vol-14AE3F68
- euca-terminate-instances i-447E3E89
Finally, I convert the volume to a snapshot and register it:
- euca-create-snapshot vol-14AE3F68
- euca-register -n f16-test -s snap-2CBB42D9
I boot an instance of my new EMI, and ... it comes up without networking. There were multiple problems with the networking configuration:
- The MAC address is hard-coded.
- The device name has changed from eth0 to eth1 (maybe related to the hard-coded MAC address)
- The NIC is configured to be controlled by NetworkManager
This is when I'm happy to have a vnc connection provided at the libvirt layer to debug the instance. A quick setup of ifcfg-eth1 and a restart of the network gives me connectivity, and I'm up and running with a Fedora 16 instance installed entirely in the cloud.
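The post does not show the ifcfg-eth1 file itself, so the following is only a sketch of what a file addressing the three problems above might contain: no HWADDR line, the eth1 device name, and NetworkManager control switched off. The keys are standard Fedora ifcfg settings, but the exact values are assumptions. It is written to a scratch directory here; on the instance it would live in /etc/sysconfig/network-scripts/.

```shell
# Assumption: a scratch directory stands in for /etc/sysconfig/network-scripts
CFG_DIR=$(mktemp -d)

# No HWADDR line, so the config is not tied to one MAC address;
# NM_CONTROLLED=no hands the device to the classic network service.
cat > "$CFG_DIR/ifcfg-eth1" <<'EOF'
DEVICE=eth1
BOOTPROTO=dhcp
ONBOOT=yes
NM_CONTROLLED=no
EOF

cat "$CFG_DIR/ifcfg-eth1"
```

After putting such a file in place on the instance, restarting the network service would pick it up.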
The whole process took me about an hour or so this morning (not counting writing the xsl and shell script yesterday), and I imagine that the process would be much faster for subsequent attempts, and even faster when a kickstart is used. Still, I'm not convinced that an approach like this has significant value over something like BoxGrinder or ami-creator. Let the debate begin! :-)