VMs in my home lab

TL;DR: Read the scripts here.  There is a simple “if; this; then; that” at the bottom.

I often see some new bit of software I’d like to deploy in my home lab to test against my infrastructure. I’m not super keen on loading it on a machine I use for some other function, especially if I end up tossing the software.

With this in mind, I started using Linux’s Kernel-based Virtual Machine (KVM) and Quick Emulator (QEMU) to test things out. This worked quite well, with one exception: spin-up time was too long. WAY too long. Waiting 20 minutes for a new VM wasn’t acceptable to me. So I started digging around.

Caveat emptor: I am running dhcpd, kvm/qemu, and all the rest on one massive box.  This stuff CAN be split out into multiple hosts, but will require a bit more work to pull it together. I know you can do it. You are a smart cookie, after all…

I found that cloning VMs was possible and, actually, pretty straightforward. It was just a matter of copying the image file, mounting it on a loopback device, editing a few OS-specific files and importing a new libvirt host definition from a modified XML file.
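To make the "modified XML file" step a bit more concrete, here is a minimal sketch of rewriting a dumped definition for a clone. It is not lifted from new-vm.sh: the function name clone_domain_xml and its arguments are made up for illustration, and it assumes the XML looks like what virsh dumpxml normally prints (one element per line, single-quoted attributes) and that the image file is named after the domain.

```shell
# Hypothetical helper: turn a golden-host definition into one for a clone.
# Reads libvirt domain XML on stdin and writes the modified XML to stdout.
clone_domain_xml() {
    golden="$1"   # name of the source domain, e.g. golden-host
    clone="$2"    # name of the new domain, e.g. testhost
    mac="$3"      # MAC address the clone's NIC should use

    # 1. rename the domain
    # 2. drop the UUID line so libvirt generates a fresh one on define
    # 3. point the disk at the copied image
    # 4. swap in the clone's MAC address
    sed -e "s|<name>${golden}</name>|<name>${clone}</name>|" \
        -e '/<uuid>.*<\/uuid>/d' \
        -e "s|${golden}\.qcow2|${clone}.qcow2|" \
        -e "s|<mac address='[^']*'/>|<mac address='${mac}'/>|"
}

# Possible usage (needs libvirt, so it is left commented out here):
#   virsh dumpxml golden-host \
#     | clone_domain_xml golden-host testhost "$mac" > /tmp/testhost.xml
#   virsh define /tmp/testhost.xml
```

The sed patterns are deliberately naive. A definition with several disks or NICs would need something smarter than a blanket substitution.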

Obviously, doing this over and over meant putting it in a Bash script, new-vm.sh, which can be found at https://github.com/sawasy/build_vm. Of course, there is also a teardown script for VMs called destroy-vm.sh.

There are a few extra bits or gotchas in this script, which you’ll want to pay attention to. It checks dhcpd for an existing MAC definition and uses that for the hostname provided, if available. If dhcpd is going to be on a different host, you’ll probably want to pull that over or check it remotely.
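The dhcpd lookup could look something like the sketch below. It is an illustration rather than the code in new-vm.sh: the function names are hypothetical, the default config path /etc/dhcp/dhcpd.conf is an assumption, and it only understands host blocks laid out with one statement per line. The fallback that invents a MAC with the 52:54:00 prefix commonly used for KVM guests is an assumption as well.

```shell
# Hypothetical helper: print the MAC that dhcpd.conf pins to a hostname.
# Prints nothing if there is no matching "host <name> {" block.
dhcp_mac_for_host() {
    host="$1"
    conf="${2:-/etc/dhcp/dhcpd.conf}"   # assumed default location
    awk -v h="$host" '
        # entering the block for the host we care about
        $1 == "host" && $2 == h { in_host = 1 }
        # inside it, grab "hardware ethernet aa:bb:cc:dd:ee:ff;"
        in_host && $1 == "hardware" && $2 == "ethernet" {
            sub(/;$/, "", $3); print $3; exit
        }
        # closing brace ends the block
        in_host && /}/ { in_host = 0 }
    ' "$conf"
}

# Hypothetical fallback: make up a MAC in the 52:54:00 range.
random_kvm_mac() {
    od -An -N3 -tx1 /dev/urandom |
        awk '{ printf "52:54:00:%s:%s:%s\n", $1, $2, $3 }'
}

# Possible usage:
#   mac=$(dhcp_mac_for_host testhost)
#   [ -n "$mac" ] || mac=$(random_kvm_mac)
```

If dhcpd lives on another box, the same function could be fed a copy of the config pulled over beforehand.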

For a long while this worked pretty well. Test hosts could now be rebuilt very quickly, in under 5 minutes.

$ time new-vm.sh testhost
testhost
Domain golden-host is being shutdown

error: failed to get domain 'testhost'
error: Domain not found: no domain with matching name 'testhost'

error: failed to get domain 'testhost'
error: Domain not found: no domain with matching name 'testhost'

Copying image...
sending incremental file list
golden-host.qcow2
3,130,195,968 100% 43.68MB/s 0:01:08 (xfr#1, to-chk=0/1)
Dumping XML...
Domain golden-host started

Checking for dhcp definition
Clean up stale mounted....
umount: /mnt/loopback: not mounted
Fixing hostname
/dev/nbd0 disconnected
Importing testhost's definition
Domain testhost defined from /tmp/testhost.xml

Starting testhost
Domain testhost started

new-vm.sh testhost 15.24s user 4.98s system 21% cpu 1:33.50 total

As you can see in the example above, there is a golden host that runs. This golden host is just that: the base host you load all your custom bits and bobs onto and then clone for each new VM.

This worked pretty well for about a year. Then the disk on the golden host became corrupted and I had to rebuild it from scratch. This worked well for a long while again, until the host wouldn’t boot. At this point, I decided to spend some time automating the golden-image build process. Having worked with Packer for AWS AMIs, I used it for this. I cooked up some JSON to do the Packer things I needed; it is in the above GitHub repo as ubuntu-14.04-server-amd64.json. If you open the JSON file, you’ll see I have the Ubuntu ISO on an internal host. This is the same host I run Packer from. If you look in the build.sh script, you’ll see it reaches out and checks that we have the freshest ISO available. If not, it downloads it and puts it where the webserver can access it.
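One way to do a "do we have the freshest ISO" check is to compare the local file against a published checksum list. The sketch below is an assumption about how such a check could work, not a copy of build.sh: the function name iso_is_current is hypothetical, it expects a SHA256SUMS-style file (hash, then filename, optionally prefixed with *) that has already been fetched, and it relies on sha256sum from coreutils.

```shell
# Hypothetical helper: succeed only if the local ISO matches the hash
# listed for its filename in a SHA256SUMS-style file.
iso_is_current() {
    iso="$1"    # path to the local ISO
    sums="$2"   # path to a previously downloaded checksum list

    # no local copy at all means we are definitely stale
    [ -f "$iso" ] || return 1

    # look up the expected hash for this filename ("*name" or "name")
    want=$(awk -v f="$(basename "$iso")" '
        { name = $2; sub(/^\*/, "", name) }
        name == f { print $1; exit }
    ' "$sums")

    # hash of what is actually on disk
    have=$(sha256sum "$iso" | awk '{ print $1 }')

    [ -n "$want" ] && [ "$want" = "$have" ]
}

# Possible usage:
#   iso_is_current /path/to/the.iso /path/to/SHA256SUMS || echo "need a new ISO"
```

The download itself, and dropping the file where the webserver can see it, is left out here since it depends on where your mirror and web root live.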

If you plan to use a separate host for serving the various pieces for this, you can just dump it in a cron job. I digress…

The build.sh script sets up the ISO and preseed.txt file, cleans up the old build, opens up the UFW firewall (because you are using firewalls, right? RIGHT?) and then kicks off the build.

Side note: Those with a keen eye will notice some scary passwords, sshd settings and such. Those are all fixed up by my Ansible playbooks post install. They are there for bootstrapping.

Once that is done, Jenkins will run the post-build.sh script. Sorry? What do I mean, Jenkins? Shit. Right. I have a Jenkins install managing the automated recurring build of all this goodness. I’ve set up Jenkins to pull down a fresh copy of my Ansible repo (soon SaltStack, because it is so badass), and handle the build and post-build stuff. It also sends me a little mail when it’s done. The post-install script, really, just sets up a bunch of the environment variables for Jenkins.

And, that’s about it.

In a nutshell: Every morning at 3AM, Jenkins kicks off. -> Jenkins runs build.sh. -> build.sh sets up the ISO and preseed.txt in the webserver. -> Jenkins runs Packer. -> Packer creates a VM and preseeds it. -> Jenkins runs the post-install.sh. -> post-install.sh cleans up, if needed, kicks off Ansible and sends an email out when done!

 

xmlrpc and bots

Some time ago, I was told I should give back to the Internet and post some technical musings. As one does, I grabbed some off the shelf software and set up Apache on a little tiny virtual server. Off we went.

Some time later, I started getting alarms that the host was not responding or that one of the services had crashed. I’d fix the issue and some time later it would happen again.

Digging into this a little deeper, I saw that all of the available Apache threads were active but there was little activity in the access/error logs. Something was causing them to hang. In response to this, I cranked down the time-outs in Apache.

This worked a little.

The site was afloat for longer periods of time before it would crash or hang. In digging deeper, I noticed that xmlrpc.php was being hit more than occasionally. What the heck is xmlrpc.php?

From xmlrpc.com:

It’s a spec and a set of implementations that allow software running on disparate operating systems, running in different environments to make procedure calls over the Internet.

It’s remote procedure calling using HTTP as the transport and XML as the encoding. XML-RPC is designed to be as simple as possible, while allowing complex data structures to be transmitted, processed and returned.

Okay, so it’s a service in WordPress that allows for remote calls to be made, much like RPC. I’m not sure I need that service… Wait… What’s that link a few down in my Google search?

WordPress “Pingback” DDoS Attacks

Crap snacks…

Turns out there is an opening in xmlrpc which allows malicious types to send requests from my host to victims’ systems. WordPress had this service enabled by default. I went through and disabled it in 3 different ways! Lo and behold, my site stopped crashing!

Looking at the logs, I saw that more bots than available Apache threads were trying to open xmlrpc.php. They would connect and hold the connection open while DDoSing some poor soul. This is what was causing the site to stop responding.
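If you want to check your own logs for the same thing, a quick tally of xmlrpc.php requests per client does the trick. This is a rough sketch, assuming Apache's common/combined log format (client address in the first field, request path in the seventh); the function name xmlrpc_hits_by_client is made up, and the log path in the usage line varies by distro.

```shell
# Hypothetical helper: count requests for xmlrpc.php per client address,
# busiest clients first. Expects an Apache common/combined format log.
xmlrpc_hits_by_client() {
    log="$1"
    awk '
        # field 7 is the request path in common/combined log lines
        $7 ~ /^\/xmlrpc\.php/ { hits[$1]++ }
        END { for (ip in hits) print hits[ip], ip }
    ' "$log" | sort -rn
}

# Possible usage:
#   xmlrpc_hits_by_client /var/log/apache2/access.log | head
```

A long list of distinct addresses each hitting xmlrpc.php is a fair hint that bots, not readers, are holding your threads.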

Hmmm…I wonder if those Internet Security researchers have my site in their databa..oh..yeah…there I am…shucks.

Sorry poor hapless DDoS recipients. Mea culpa!