THIS PROJECT IS NO LONGER MAINTAINED
Wirbelsturm
Wirbelsturm is a Vagrant- and Puppet-based tool to perform
1-click local and remote deployments, with a focus on big data related infrastructure.
Wirbelsturm’s goal is to make tasks such as “I want to deploy a multi-node Storm cluster” simple, easy, and
fun.
It has been called the “Cluster Vagrant” and “Big Data Vagrant” by some of its users, although in our opinion that makes
Wirbelsturm appear to be more than it really is, and it doesn’t give enough credit to the tools on which it
is based.
Its direct value proposition is two-fold:
- Provide a working integration of Vagrant and Puppet.
  Vagrant is used to create and manage machines; Puppet is used for provisioning those machines (e.g. to install and
  configure software packages). Because Wirbelsturm uses Vagrant you can basically deploy to any target platform
  that Vagrant supports (local VMs, AWS, OpenStack, etc.), although Wirbelsturm does not support all of those out
  of the box yet. While Wirbelsturm’s Puppet setup is slightly opinionated with its preference for
  Hiera and with its notion of environments and roles, these conventions
  should help to jumpstart new users and, of course, you can change this behavior if needed.
- Add a thin wrapper layer around Vagrant to simplify deploying multiple machines of the same kind.
  This is very helpful when deploying software such as Storm, Kafka and Hadoop clusters, where most of the machines look
  the same. In native Vagrant you would be required to (say) manually maintain 30 configuration sections in
  Vagrantfile for deploying 30 Storm slave nodes, even though only their hostnames and IP addresses would change from
  one to the next.
There is also an indirect, third value proposition:
- Because we happen to maintain Wirbelsturm-compatible Puppet modules such as
  puppet-kafka, puppet-graphite, and puppet-storm, you can benefit from Wirbelsturm’s ease of use to
  conveniently deploy those software packages. As you may have noticed, most of these Puppet modules are related to
  large-scale data processing infrastructure and to DevOps tools for operating and monitoring such infrastructures, all
  of which are based on free and open source software. See Supported Puppet modules for
  details.
We hope you find Wirbelsturm as useful as we do. And most importantly, have fun!
Table of Contents
- Quick start
- Features
- Is Wirbelsturm for me?
- Default configuration
- Getting started
- Usage
- Configuration
- Supported deployment platforms
- Supported Puppet modules
- Known issues and limitations
- FAQ
- How it works
- Wishlist
- Appendix
- Change log
- Contributing to Wirbelsturm
- License
- Credits
Quick start (local Storm cluster)
Assuming you are using a reasonably powerful computer and have already installed Vagrant
(1.7.2+) and VirtualBox you can launch a multi-node
Apache Storm cluster on your local machine with the following commands. This
Storm cluster is the default configuration example that ships with Wirbelsturm. Note that the bootstrap command
needs to be run only once, after a fresh checkout.
$ git clone https://github.com/miguno/wirbelsturm.git
$ cd wirbelsturm
$ ./bootstrap     # <<< May take a while depending on how fast your Internet connection is.
$ vagrant up      # <<< ...and this step also depends on how powerful your computer is.
Done — you now have a fully functioning Storm cluster up and running on your computer! The deployment should have
taken you less time and effort than brewing yourself an espresso. 🙂
Tip: If you run into networking related issues (e.g. “unknown host” errors), try to deploy the cluster via our
./deploy script instead of running vagrant up. The only additional prerequisite for ./deploy is the
installation of the GNU parallel tool; see section Install Prerequisites for details.
Let’s take a look at which virtual machines back this cluster behind the scenes:
$ vagrant status
Current machine states:
zookeeper1 running (virtualbox)
nimbus1 running (virtualbox)
supervisor1 running (virtualbox)
supervisor2 running (virtualbox)
Storm also ships with a web UI that shows you the cluster’s state, e.g. how many nodes it has, whether any processing
jobs (topologies) are being executed, etc. Wait 20-30 seconds after the deployment is done and then open the Storm UI
at http://localhost:28080/.
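Instead of waiting a fixed 20-30 seconds, you could poll the UI until it responds. The following is a minimal sketch, not part of Wirbelsturm: the function name `wait_for_url` and its arguments are made up for illustration, and it assumes `curl` is installed on your host.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with Wirbelsturm).
# usage: wait_for_url <url> <timeout_seconds> [interval_seconds]
# Returns 0 once the URL answers with an HTTP success status,
# or 1 if roughly <timeout_seconds> have been spent sleeping without success.
wait_for_url() {
  url=$1
  timeout=$2
  interval=${3:-5}
  waited=0
  while :; do
    # --fail makes curl exit non-zero on HTTP error responses (4xx/5xx);
    # --max-time bounds a single attempt to 5 seconds.
    if curl --silent --fail --output /dev/null --max-time 5 "$url"; then
      return 0
    fi
    if [ "$waited" -ge "$timeout" ]; then
      return 1
    fi
    sleep "$interval"
    waited=$((waited + interval))
  done
}
```

For example, `wait_for_url http://localhost:28080/ 60 && echo "Storm UI is up"` would keep probing the default Storm UI address for about a minute.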
What’s more, Wirbelsturm also allows you to use Ansible to interact with the deployed
machines via our ansible wrapper script:
$ ./ansible all -m ping
zookeeper1 | success >> {
"changed": false,
"ping": "pong"
}
supervisor1 | success >> {
"changed": false,
"ping": "pong"
}
nimbus1 | success >> {
"changed": false,
"ping": "pong"
}
supervisor2 | success >> {
"changed": false,
"ping": "pong"
}
Want to run more Storm slaves? As long as your computer has enough horsepower you only need to change a single number
in wirbelsturm.yaml:
# wirbelsturm.yaml
nodes:
  ...
  storm_slave:
    count: 2   # <<< changing 2 to 4 is all it takes
    ...
Then run vagrant up again and shortly after supervisor3 and supervisor4 will be up and running.
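Conceptually, a single count is expanded into one machine per index, and only the hostname and IP address differ between machines. The sketch below illustrates that idea only: the function name, its arguments, and the 10.0.0.x addresses are assumptions made for this example and are not taken from Wirbelsturm's code or configuration.

```shell
#!/bin/sh
# Illustrative sketch, not Wirbelsturm's actual implementation.
# usage: expand_nodes <hostname_prefix> <count> <ip_prefix> <first_octet>
# Prints one "hostname ip" pair per machine.
expand_nodes() {
  prefix=$1
  count=$2
  ip_prefix=$3
  first=$4
  i=1
  while [ "$i" -le "$count" ]; do
    # Only the numeric hostname suffix and the last IP octet change per machine.
    printf '%s%d %s.%d\n' "$prefix" "$i" "$ip_prefix" "$((first + i - 1))"
    i=$((i + 1))
  done
}

# Hypothetical address range; four slaves instead of two.
expand_nodes supervisor 4 10.0.0 101
```

Raising the count from 2 to 4 simply adds two more generated entries, which is why no per-machine configuration sections are needed.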
Want to run a Kafka broker? Uncomment the kafka_broker section in your wirbelsturm.yaml
(only remove the leading # characters, do not remove any whitespace), then run vagrant up kafka1.
Once you have finished playing around, you can stop the cluster again by executing vagrant destroy.
Note that running a small, local Storm cluster is just the default example. You can do much more with Wirbelsturm than
this.
Features
- Launching machines: Wirbelsturm uses Vagrant to launch the machines that make up your infrastructure
  as VMs running locally in VirtualBox (default) or remotely in Amazon AWS/EC2 (OpenStack support is in the works).
- Provisioning machines: Machines are provisioned via Puppet.
  - Wirbelsturm uses a master-less Puppet setup, i.e. provisioning is ultimately performed through puppet apply.
  - Puppet modules are managed via librarian-puppet.
- (Some) batteries included: We maintain a number of standard Puppet modules that work well with Wirbelsturm, some
  of which are included in the default configuration of Wirbelsturm. However you can use any Puppet module with
  Wirbelsturm, of course. See Supported Puppet modules for more information.
- Ansible support: The Ansible aficionados amongst us can use Ansible to interact with
  machines once deployed through Wirbelsturm and Puppet.
- Host operating system support: Wirbelsturm has been tested with Mac OS X 10.8+ and RHEL/CentOS 6 as host machines.
  Debian/Ubuntu should work, too.
- Guest operating system support: The target OS version for deployed machines is RHEL/CentOS 6 (64-bit). Amazon
  Linux is supported, too.
  - For local deployments (via VirtualBox) and AWS deployments Wirbelsturm uses a
    CentOS 6 box created by PuppetLabs.
  - Switching to RHEL 6 only requires specifying a different Vagrant box
    in bootstrap (for VirtualBox) or a different AMI image in wirbelsturm.yaml (for Amazon AWS).
- When using tools other than Vagrant to launch machines: Wirbelsturm-compatible Puppet modules are standard Puppet
  modules, so of course they can be used standalone, too. This way you can deploy against bare metal machines even if
  you are not able to or do not want to run Wirbelsturm and/or Vagrant directly.
  See Wirbelsturm-less deployment documentation for details.
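To make the master-less provisioning mentioned above more concrete, here is a rough sketch of what such a run can look like in general. The directory layout (a Puppetfile in the current directory, a modules/ directory, and manifests/site.pp) and the helper names are assumptions for illustration and not necessarily how Wirbelsturm arranges or invokes things. By default the script only prints the commands.

```shell
#!/bin/sh
# Generic sketch of a master-less Puppet run; paths and helper names are
# hypothetical and not taken from Wirbelsturm.

# Print the command instead of executing it unless DRY_RUN=0 is set.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

provision_masterless() {
  # 1. Fetch the modules listed in the Puppetfile (into ./modules by default).
  run librarian-puppet install
  # 2. Compile and apply the catalog locally; no Puppet master is contacted.
  run puppet apply --modulepath=modules manifests/site.pp
}

provision_masterless
```

The point of the sketch is the division of labor: librarian-puppet resolves and downloads modules, and puppet apply applies the manifests directly on the machine being provisioned.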
Is Wirbelsturm for me?
Here are some…