
wirbelsturm


THIS PROJECT IS NO LONGER MAINTAINED

Wirbelsturm

Wirbelsturm is a Vagrant- and Puppet-based tool for
1-click local and remote deployments, with a focus on big data infrastructure.
Wirbelsturm’s goal is to make tasks such as “I want to deploy a multi-node Storm cluster” simple, easy, and
fun.
It has been called the “Cluster Vagrant” and “Big Data Vagrant” by some of its users, although in our opinion that makes
Wirbelsturm appear to be more than it really is, and it doesn’t give enough credit to the tools on which it
is based.
Its direct value proposition is two-fold:

  1. Provide a working integration of Vagrant and Puppet.
    Vagrant creates and manages the machines; Puppet provisions them (e.g. installs and
    configures software packages). Because Wirbelsturm uses Vagrant, you can in principle deploy to any target platform
    that Vagrant supports (local VMs, AWS, OpenStack, etc.), although Wirbelsturm does not support all of those out
    of the box yet. Wirbelsturm’s Puppet setup is slightly opinionated, with its preference for
    Hiera and its notion of environments and roles, but these conventions
    should help new users get started and, of course, you can change this behavior if needed.
  2. Add a thin wrapper layer around Vagrant to simplify deploying multiple machines of the same kind.
    This is very helpful when deploying software such as Storm,
    Kafka and Hadoop clusters, where most of the machines look
    the same. In native Vagrant you would be required to (say) manually maintain 30 configuration sections in
    Vagrantfile for deploying 30 Storm slave nodes, even though only their hostnames and IP addresses would change from
    one to the next.
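The idea behind this wrapper layer can be sketched in a few lines of shell: take a name prefix and a count, and generate one hostname/IP pair per machine. This is only an illustration of the concept, not Wirbelsturm's actual implementation; the helper name expand_nodes and the 10.0.0.x addressing scheme are assumptions made for this example.

```shell
# Hypothetical helper (not part of Wirbelsturm): print one "hostname ip" line
# for each of COUNT machines that differ only in their index.
# Usage: expand_nodes PREFIX COUNT SUBNET FIRST_HOST_OCTET
expand_nodes() {
    prefix=$1
    count=$2
    subnet=$3        # assumed addressing scheme, e.g. 10.0.0
    first_octet=$4   # last octet of the first machine, e.g. 101
    i=1
    while [ "$i" -le "$count" ]; do
        printf '%s%d %s.%d\n' "$prefix" "$i" "$subnet" "$((first_octet + i - 1))"
        i=$((i + 1))
    done
}

# Four supervisors: supervisor1 10.0.0.101 through supervisor4 10.0.0.104
expand_nodes supervisor 4 10.0.0 101
```

Changing the count argument is the only edit needed to describe more machines, which is the kind of repetition the wrapper is meant to spare you in Vagrantfile.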

There is also an indirect, third value proposition:

  • Because we happen to maintain Wirbelsturm-compatible Puppet modules such as
    puppet-kafka, puppet-graphite,
    and puppet-storm, you can benefit from Wirbelsturm’s ease of use to
    conveniently deploy those software packages. As you may have noticed, most of these Puppet modules are related to
    large-scale data processing infrastructure and to DevOps tools for operating and monitoring such infrastructures, all
    of which are based on free and open source software. See Supported Puppet modules for
    details.

We hope you find Wirbelsturm as useful as we do. And most importantly, have fun!


Quick start (local Storm cluster)

Assuming you are using a reasonably powerful computer and have already installed Vagrant
(1.7.2+) and VirtualBox, you can launch a multi-node
Apache Storm cluster on your local machine with the following commands. This
Storm cluster is the default configuration example that ships with Wirbelsturm. Note that the bootstrap command
needs to be run only once, after a fresh checkout.

$ git clone https://github.com/miguno/wirbelsturm.git
$ cd wirbelsturm
$ ./bootstrap     # <<< May take a while depending on how fast your Internet connection is.
$ vagrant up      # <<< ...and this step also depends on how powerful your computer is.

Done — you now have a fully functioning Storm cluster up and running on your computer! The deployment should have
taken you less time and effort than brewing yourself an espresso. 🙂

Tip: If you run into networking-related issues (e.g. “unknown host” errors), try deploying the cluster via our
./deploy script instead of running vagrant up. The only additional prerequisite for ./deploy is the
GNU parallel tool; see section Install Prerequisites for details.

Let’s take a look at which virtual machines back this cluster behind the scenes:

$ vagrant status
Current machine states:

zookeeper1                running (virtualbox)
nimbus1                   running (virtualbox)
supervisor1               running (virtualbox)
supervisor2               running (virtualbox)

Storm also ships with a web UI that shows you the cluster’s state, e.g. how many nodes it has, whether any processing
jobs (topologies) are being executed, etc. Wait 20-30 seconds after the deployment is done and then open the Storm UI
at http://localhost:28080/.
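Rather than waiting a fixed amount of time, you could poll the UI until it answers. Here is a minimal sketch, assuming curl is installed; wait_for_url is a hypothetical helper and not part of Wirbelsturm.

```shell
# Hypothetical helper (not part of Wirbelsturm): poll URL until it answers
# with a successful HTTP status, or give up after ATTEMPTS tries.
# Usage: wait_for_url URL [ATTEMPTS] [DELAY_SECONDS]
wait_for_url() {
    url=$1
    attempts=${2:-30}
    delay=${3:-2}
    try=1
    while [ "$try" -le "$attempts" ]; do
        # --fail makes curl exit non-zero on HTTP error responses.
        if curl --silent --fail --output /dev/null --max-time 5 "$url"; then
            return 0
        fi
        sleep "$delay"
        try=$((try + 1))
    done
    return 1
}
```

For example, wait_for_url http://localhost:28080/ 30 2 would try up to 30 times with a two-second pause between tries, and return a non-zero exit status if the UI never responds.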
What’s more, Wirbelsturm also allows you to use Ansible to interact with the deployed
machines via our ansible wrapper script:

$ ./ansible all -m ping
zookeeper1 | success >> {
    "changed": false,
    "ping": "pong"
}

supervisor1 | success >> {
    "changed": false,
    "ping": "pong"
}

nimbus1 | success >> {
    "changed": false,
    "ping": "pong"
}

supervisor2 | success >> {
    "changed": false,
    "ping": "pong"
}

Want to run more Storm slaves? As long as your computer has enough horsepower, you only need to change a single number
in wirbelsturm.yaml:

# wirbelsturm.yaml
nodes:
  ...
  storm_slave:
      count: 2     # <<< changing 2 to 4 is all it takes
  ...

Then run vagrant up again; shortly afterwards supervisor3 and supervisor4 will be up and running.
Want to run a Kafka broker? Uncomment the kafka_broker section in your
wirbelsturm.yaml (remove only the leading # characters, not any whitespace), then run vagrant up kafka1.
Once you have finished playing around, you can stop the cluster and delete its virtual machines by executing vagrant destroy.
Note that running a small, local Storm cluster is just the default example. You can do much more with Wirbelsturm than
this.

Features

  • Launching machines: Wirbelsturm uses Vagrant to launch the machines that make up your infrastructure
    as VMs running locally in VirtualBox (default) or remotely in Amazon AWS/EC2 (OpenStack support is in the works).
  • Provisioning machines: Machines are provisioned via Puppet.

    • Wirbelsturm uses a master-less Puppet setup, i.e. provisioning is ultimately performed through puppet apply.
    • Puppet modules are managed via librarian-puppet.
  • (Some) batteries included: We maintain a number of standard Puppet modules that work well with Wirbelsturm, some
    of which are included in the default configuration of Wirbelsturm. However, you can use any Puppet module with
    Wirbelsturm, of course. See Supported Puppet modules for more information.
  • Ansible support: The Ansible aficionados amongst us can use Ansible to interact with
    machines once deployed through Wirbelsturm and Puppet.
  • Host operating system support: Wirbelsturm has been tested with Mac OS X 10.8+ and RHEL/CentOS 6 as host machines.
    Debian/Ubuntu should work, too.
  • Guest operating system support: The target OS version for deployed machines is RHEL/CentOS 6 (64-bit). Amazon
    Linux is supported, too.

    • For local deployments (via VirtualBox) and AWS deployments Wirbelsturm uses a
      CentOS 6 box created by PuppetLabs.
    • Switching to RHEL 6 only requires specifying a different Vagrant box
      in bootstrap (for VirtualBox) or a different AMI image in wirbelsturm.yaml (for Amazon
      AWS).
  • When using tools other than Vagrant to launch machines: Wirbelsturm-compatible Puppet modules are standard Puppet
    modules, so of course they can be used standalone, too. This way you can deploy against bare metal machines even if
    you are not able to or do not want to run Wirbelsturm and/or Vagrant directly.
    See Wirbelsturm-less deployment documentation for details.

Is Wirbelsturm for me?

Here are some…