Friday, April 19, 2013

Cloud-in-a-box-in-a-VM (in a nutshell)

For the past few months, one of my projects at Eucalyptus has been our CentOS 6-based "Silvereye" (a.k.a. FastStart) installer, which assembles multiple repositories, custom install classes, and a few helper scripts onto a DVD-sized ISO image.  One of the challenges is that I'm a remote employee, and downloading an ISO file from our Jenkins server takes too long.  At the same time, loading the ISO into Cobbler doesn't exercise the same code paths as booting a DVD.  So how can I test my cloud installer?  Nested virtualization, of course!

As of Fedora 18, I've found nested KVM to be fairly reliable.  The performance isn't earth-shattering, but it's usable.  There are several things about the setup that may not be obvious, though, so I'll walk through everything I did for my test machine.

The server I'm using has two Xeon CPUs, 8 GB of RAM, and 2 TB of disk.  I've allocated 50 GB of LVM space for the root filesystem, and when I create new VMs, I give each one a 100 GB logical volume as its disk ( lvcreate -n vmName -L 100G vg01 ).

The server is on a larger network, and I "own" the 10.101.7.x block within it, part of which I'll also use for "private" addressing.  I've connected bridge device br0 to my primary ethernet device, em1:
  • /etc/sysconfig/network-scripts/ifcfg-br0
    DEVICE=br0
    TYPE=Bridge
    ONBOOT=yes
    BOOTPROTO=dhcp  # the IP is reserved in DHCP
  • /etc/sysconfig/network-scripts/ifcfg-em1
    DEVICE=em1
    ONBOOT=yes
    BRIDGE=br0
I enabled nested virtualization on the system by placing "options kvm_intel nested=1" in /etc/modprobe.d/kvm.conf and reloading the kvm_intel module.
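For reference, the modprobe drop-in is a one-liner:

```
# /etc/modprobe.d/kvm.conf
options kvm_intel nested=1
```

After reloading (as root, with no VMs running: "modprobe -r kvm_intel && modprobe kvm_intel"), you can verify the setting took effect with "cat /sys/module/kvm_intel/parameters/nested", which prints Y (or 1 on newer kernels) when nesting is enabled.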

To allow the VMs to be connected to the bridge interface, I've added "allow br0" to /etc/qemu/bridge.conf.

I also set /proc/sys/net/ipv4/ip_forward to 1, so that packet forwarding works.

For now, I've completely turned off iptables.  This is not ideal, but I'll leave iptables rule building for another post.
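To make these last two host settings survive a reboot, the persistent versions are each a single config line (a sketch; a drop-in under /etc/sysctl.d/ works equally well for the second):

```
# /etc/qemu/bridge.conf
allow br0

# /etc/sysctl.conf
net.ipv4.ip_forward = 1
```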

Booting an image into the FastStart installer looks like this:

qemu-kvm -m 2048 -cpu qemu64,+vmx -drive file=/dev/vg01/ciab,if=virtio \
         -net nic,model=virtio,macaddr=52:54:00:12:34:60 -net bridge,br=br0 \
         -cdrom silvereye-nightly-3.3-m5.iso -boot d -vnc :1

Let's look at the individual pieces of that.

1) "-m 2048" allocates 2 GB of RAM to the VM
2) "-cpu qemu64,+vmx" specifies that we are emulating a CPU capable of virtualization
3) "-drive file=/dev/vg01/ciab,if=virtio" is our LVM-backed "disk"
4) "-net nic,model=virtio,macaddr=52:54:00:12:34:60" is our virtual network interface. Specifying a unique MAC address is very important!  If you don't do this, every VM will get the same MAC, and they won't be able to communicate with each other.  They will all be able to send and receive other traffic, though, which can be maddening if you don't realize what's going on.
5) "-net bridge,br=br0" says to connect the host TAP device to bridge br0.  This gets you non-NAT access to the physical network, since the bridge is also connected to em1.
6) "-cdrom silvereye-nightly-3.3-m5.iso -boot d" connects our ISO image and boots from it.
7) "-vnc :1" starts a VNC server on the host (on port 5901) for this VM's console.
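Since every additional VM needs its own MAC address and VNC display, it can help to derive both from a single index.  A minimal sketch (VM_INDEX, VM_NAME, and the defaults below are just the values from this post; the script only prints the command, so it's safe to run anywhere):

```shell
#!/bin/sh
# Build (but don't run) the qemu-kvm command for VM number VM_INDEX,
# deriving a unique MAC address and VNC display from the index.
VM_INDEX=${VM_INDEX:-1}
VM_NAME=${VM_NAME:-ciab}
ISO=${ISO:-silvereye-nightly-3.3-m5.iso}

# 0x60 == 96, so VM 1 gets ...:60, VM 2 gets ...:61, and so on.
MAC=$(printf '52:54:00:12:34:%02x' $((96 + VM_INDEX - 1)))

CMD="qemu-kvm -m 2048 -cpu qemu64,+vmx \
-drive file=/dev/vg01/${VM_NAME},if=virtio \
-net nic,model=virtio,macaddr=${MAC} -net bridge,br=br0 \
-cdrom ${ISO} -boot d -vnc :${VM_INDEX}"

echo "$CMD"
```

Running it with VM_INDEX=2 and a different VM_NAME prints a command for a second VM that won't collide with the first.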

I connect to the VM's console (display :1 on the host) with a VNC client, and proceed with the install, selecting "Cloud in a Box" at the boot menu.  Since I own 10.101.7.x, I give this VM an IP from that block, and configure its "public IP range" to be a subset of the remaining addresses.  The subset of IPs here is arbitrary; I wanted to leave some for my other test clouds.  For private IPs, I'll use half of my allotted range.  The default gateway and nameserver are exactly the same as for the host system.  The host bridge is just a pass-through, not an extra routing "hop".

Once installed, I can access the cloud through the normal channels -- SSH, the admin UI on port 8443, and the user console on port 8888.  I test by logging into the user console, generating and downloading a new key, launching an instance, and SSHing into the instance's public IP.

The chain of devices through which data flows may not be entirely clear.  When you launch an instance inside the virtual cloud and it gets a public IP, here's the list of "devices" through which packets flow when you connect to that IP from a different physical system:
  1. the physical interface of the host, em1
  2. the host bridge, br0
  3. the host tap interface
  4. the VM's "eth0" interface
  5. the VM's "br0" bridge (which is not enslaving eth0 in this case)
  6. the VM's tap interface (vnet0)
  7. the nested VM's eth0
The only place that NAT happens here is between 4 and 5, and that's only necessary because I chose to use Eucalyptus's "MANAGED-NOVLAN" mode.

So that's a cloud-in-a-box-in-a-VM (in a nutshell).

For a more traditional deployment, simply boot more VMs into the installer using the same qemu-kvm command format shown above, choosing "Frontend" for one, and "Node Controller" for the other(s).  For each one you boot, you need to create a new LVM volume and change the MAC address and the VNC port to avoid conflicts.  When running multiple clouds, you should also make sure that your IP ranges never overlap (i.e., don't let two clouds use the same public IPs or private subnet range).
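That bookkeeping is easy to script.  A sketch that prints the commands for one Frontend and two Node Controllers (the VM names are illustrative, and it echoes rather than executes, since lvcreate and qemu-kvm need root and real hardware):

```shell
#!/bin/sh
# Print the lvcreate and qemu-kvm commands for a Frontend and two NCs.
# MACs continue from :61, since the cloud-in-a-box VM above used :60
# and VNC display :1.
i=1
for vm in frontend node01 node02; do
    mac=$(printf '52:54:00:12:34:%02x' $((96 + i)))
    echo "lvcreate -n $vm -L 100G vg01"
    echo "qemu-kvm -m 2048 -cpu qemu64,+vmx -drive file=/dev/vg01/$vm,if=virtio" \
         "-net nic,model=virtio,macaddr=$mac -net bridge,br=br0" \
         "-cdrom silvereye-nightly-3.3-m5.iso -boot d -vnc :$((i + 1))"
    i=$((i + 1))
done
```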

Thursday, January 3, 2013

Using aws-cli with Eucalyptus

Just before the holidays, Amazon released awscli, a new command-line interface for managing AWS resources.  The code is based on botocore, the core python library for the next major version of boto.  I took awscli for a spin to see if it worked with the Eucalyptus Community Cloud, and as is often the case, the answer was ... almost.

First, it's useful to understand the fundamental problems that awscli was trying to address.  The most obvious is profiles.  Cloud users deal with multiple regions, accounts, users, etc., and keeping separate configurations for each one is a hassle.  awscli uses a section-based config file format which allows for multiple profiles, each of which can reference its own region, access keys, etc.
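A config file with two profiles might look something like this (a sketch; the "euca" region name is hypothetical, and the exact section syntax has shifted between awscli releases, so treat this as illustrative):

```
[default]
aws_access_key_id = <your AWS access key>
aws_secret_access_key = <your AWS secret key>
region = us-east-1

[profile ecc]
aws_access_key_id = <your ECC access key>
aws_secret_access_key = <your ECC secret key>
region = euca
```

You then pick a non-default profile per invocation with "aws --profile ecc ...".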

Another problem that this new code solves is the centralization of region and service data into JSON files which are easy to read, write, and parse.  See _regions.json and _services.json in botocore for examples.

What I found was that rather than altering the existing data files, what I really wanted was a Eucalyptus "provider" with its own JSON files.  I'll spare you all my trial and error, and simply explain what worked:

  1. git clone
  2. git clone (note that this is my fork -- upstream is )
  3. Install botocore and aws-cli however you prefer ( I use "python setup.py install --user" in each directory)
  4. create a provider data directory, and a "euca" directory inside it.  I'll use /var/tmp/providers as the top directory.
  5. create _regions.json and _services.json under the "euca" directory (the linked examples here should work for ECC verbatim)
  6. symlink to botocore/data/aws/ec2.json and botocore/data/aws/iam.json in the euca provider directory
  7. Create your ~/.awsconfig file (or whatever you'd like to call it):
  8. export AWS_CONFIG_FILE=$HOME/.awsconfig
  9. export AWS_DATA_PATH=/var/tmp/providers
  10. try some commands, such as:
    aws ec2 create-volume --size 1 --availability-zone partner01
    aws ec2 describe-volumes
    aws ec2 describe-images
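Steps 4 through 9 can be scripted.  A sketch (it assumes you cloned botocore into the current directory; the symlinks will dangle until that clone exists, and _regions.json/_services.json still have to be written by hand):

```shell
#!/bin/sh
# Set up the "euca" provider data directory for botocore/aws-cli.
BOTOCORE=${BOTOCORE:-$PWD/botocore}
TOP=/var/tmp/providers

mkdir -p "$TOP/euca"

# Reuse the stock EC2 and IAM operation definitions (step 6);
# _regions.json and _services.json are hand-written (step 5).
ln -sf "$BOTOCORE/data/aws/ec2.json" "$TOP/euca/ec2.json"
ln -sf "$BOTOCORE/data/aws/iam.json" "$TOP/euca/iam.json"

# Point awscli at the config file and provider data (steps 7-9).
export AWS_CONFIG_FILE=$HOME/.awsconfig
export AWS_DATA_PATH=$TOP
```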
It may take a couple of iterations for the patch I've proposed to be accepted upstream, but in the meantime, I hope this is useful information.  As I've mentioned in the pull request, the solution is not ideal, as it requires that your default profile in a config file reference the euca provider, but I went for the least invasive fix first.  Note that even with this version, you can use profiles to group all of your eucalyptus cloud credentials into a single config file, and then have a second file for AWS profiles.  Switching back and forth is just a matter of setting AWS_CONFIG_FILE.