Stop wasting bandwidth with vagrant-cachier
If you have done any kind of Puppet manifest or Chef cookbook development using Vagrant, chances are that you’ve been staring at your screen waiting for the machine to be provisioned for really long periods of time, especially when you need to destroy the VM and start over.
A while ago I came across this gist which solves part of the issue by caching downloaded packages on the host machine and sharing them among similar VM instances. After copying and pasting it into different projects, I decided to extract it to a Vagrant plugin and expand its usage by supporting multiple Linux distros and package managers, allowing others to benefit from it as well.
I started spiking the plugin a while ago and, after using it on a couple of projects, today I went ahead and open sourced it. The code is not the best you’ll find around. Right now it supports caching for APT, Yum, Pacman and RubyGems packages, and I’m planning to add others as needed.
On a side note, this is probably the first Vagrant plugin to make use of guest capabilities that I’m aware of ;)
How does it work?
From the project’s README:
Under the hood, the plugin will hook into calls to vagrant reload and will set things up for each configured cache bucket. Before halting the machine, it will revert the changes required to set things up by hooking into calls to Vagrant::Builtin::GracefulHalt so that you can repackage the machine for others to use without requiring users to install the plugin as well.
Cache buckets will be available from /tmp/vagrant-cachier on your guest and the appropriate folders will get symlinked to the right path after the machine is up but right before it gets provisioned. We could potentially do it on one go and share bucket’s folders directly to the right path if we were only using VirtualBox since it shares folders after booting the machine, but the LXC provider does that as part of the boot process (shared folders are actually lxc-start parameters) and as of now we are not able to get some information that this plugin requires about the guest machine before it is actually up and running.
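To make the symlinking step more concrete, here is a minimal Ruby sketch of the idea. It is not the plugin's actual code: the `link_bucket` helper is hypothetical, the whole thing runs inside a throwaway temp directory instead of a real guest, and the APT archive folder is only assumed as an example of a "right path".

```ruby
require "tmpdir"
require "fileutils"

# Hypothetical helper: point a guest cache path at a bucket folder that
# lives under the shared cache root. Returns the bucket path.
def link_bucket(cache_root, bucket_name, guest_path)
  bucket_path = File.join(cache_root, bucket_name)
  FileUtils.mkdir_p(bucket_path)               # make sure the bucket exists
  FileUtils.mkdir_p(File.dirname(guest_path))  # parent of the target path
  FileUtils.rm_rf(guest_path)                  # drop whatever was there before
  FileUtils.ln_s(bucket_path, guest_path)      # guest path now points at the bucket
  bucket_path
end

# Everything happens under a sandbox so the real filesystem is untouched.
SANDBOX    = Dir.mktmpdir("cachier-sketch")
CACHE_ROOT = File.join(SANDBOX, "tmp", "vagrant-cachier")         # stands in for /tmp/vagrant-cachier
APT_PATH   = File.join(SANDBOX, "var", "cache", "apt", "archives") # assumed example target

BUCKET = link_bucket(CACHE_ROOT, "apt", APT_PATH)

# A package "downloaded" to the guest path ends up inside the shared bucket.
File.write(File.join(APT_PATH, "example.deb"), "fake package")
puts File.symlink?(APT_PATH)
puts File.exist?(File.join(BUCKET, "example.deb"))
```

Because the package manager keeps writing to its usual location, nothing inside the guest needs to know about the cache; removing the symlink before halting is enough to leave the box clean.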
Please keep in mind that this plugin won’t do magic: if you are compiling things during provisioning or manually downloading packages that do not fit into a “cache bucket”, you won’t see that much of an improvement.
UPDATE: Please refer to the project’s docs for the most up-to-date information about it. Things have changed a bit lately and are likely to change a bit more ;)
Show me the numbers!
The times shown below are just for provisioning after the machine has already been brought up on a 35mb connection:
| First provision | Second provision | Diff. | APT cache |
As I said, the plugin does not do any magic and it will just save you from downloading packages. For instance, my rails-base-box compiles Ruby 2.0 from source using ruby-build and there’s nothing much we can do about it (apart from using precompiled rubies of course).
If you do the maths, on average those numbers represent a ~41% drop in provisioning time.
In my opinion this alone represents a huge win, especially if you are running a CI server, as it means a faster feedback loop. It also means that if a mirror is slower than usual for some weird reason, or if you are on a 3G connection, it’ll save you a few MBs worth of package downloads. Not to mention that it is “against etiquette towards package hosters“ to download those files over and over throughout the day.
So please, be nice and stop wasting your own and others’ bandwidth :)
Here is what people have been saying about the plugin on Twitter. It looks like some have experienced an even bigger drop in provisioning time:
vagrant-cachier took my vagrant spin up from 30 to 5 minutes and reduced my caffeine intake by 3 cups bit.ly/114CIr3— Russell Cardullo (@russellcardullo) June 7, 2013
Tested vagrant-cachier. Saved 60% of vagrant up time installing 10 rpms with chef. Pretty awesome. Check it out! github.com/fgrehm/vagrant…— Miguel. (@miguelcnf) June 9, 2013
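For anyone who wants to run the same maths on their own provisioning times, a small sketch follows. The `percent_drop` helper is a hypothetical name, and the only input used is the "30 to 5 minutes" figure from the first tweet above.

```ruby
# Percentage drop in provisioning time between an uncached first run
# and a cached second run (both in the same unit, e.g. minutes).
def percent_drop(first_run, second_run)
  raise ArgumentError, "first_run must be positive" unless first_run.positive?
  ((first_run - second_run) * 100.0 / first_run).round(1)
end

# Figures taken from the first tweet: spin up went from 30 to 5 minutes.
TWEET_DROP = percent_drop(30, 5)
puts "#{TWEET_DROP}% drop in provisioning time"
```

Feeding it the first and second provision times of your own boxes gives a number you can compare against the ~41% average mentioned earlier.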