Notes from 4th Monitoring SIG (Ganglia, January 2007)

January 12, 2007 - 9:41 am

Jennifer Davis at BayLISA has posted these to the BayLISA site at:

http://www.baylisa.info/?q=node/110

Is this a good place for them to live? Can people comment on and interact with them there, or does the resource need to be more dynamic?

In the meantime, here they are too:
==============================
Thomas Stocking’s SIG notes:

Matt Massie’s talk about Ganglia was informative for those who were unfamiliar with the mechanics of Ganglia. I had heard this level of detail before, but was interested in the mini-history, the guiding principles, and mostly in the statement that Ganglia is a niche product - it is not designed to be all things monitoring.

Questions for Matt were mostly about the details of the mechanics: how does Ganglia determine what to monitor? What are the operating system dependencies? What are the config options? When does it make sense to use Ganglia over Nagios? How does Ganglia communicate the instantiation of new nodes? Things like that.
The answers were:
Ganglia takes a compile-time spec of what to monitor, but can be configured to adjust that spec through config files.
The list of what it monitors is hard to change (it’s not modular).
Ganglia makes sense when you have a lot of similar hosts - Peter Loh made a case for using it in a less controlled, more dynamic environment, where hosts are deployed and undeployed faster than it is practical for an admin to add them to Nagios (or other centrally configured systems).
Ganglia uses multicast clouds, and optionally unicast connections, to announce the presence of new nodes (see the configuration sketch below).
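
As a rough illustration of that last point, here is a hypothetical gmond.conf fragment in the Ganglia 3.x syntax; the multicast group and port shown are the stock defaults, and the unicast collector host is made up:

    /* Default multicast "cloud": every gmond on this channel
       hears every other gmond's announcements and metrics. */
    udp_send_channel {
      mcast_join = 239.2.11.71
      port = 8649
    }
    udp_recv_channel {
      mcast_join = 239.2.11.71
      port = 8649
      bind = 239.2.11.71
    }
    /* Optional unicast channel to a designated collector
       (host name is hypothetical): */
    udp_send_channel {
      host = collector.example.com
      port = 8649
    }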

Peter Loh presented a cogent, intelligent, detailed summary of the Ganglia2Nagios integration he has been working on, and gave a live demo of the “Seurat” view. Both were really well received, with several questions about the details of the databases used, the mechanics of importing hosts into Nagios, the trade-offs of importing a single service into Nagios for all Ganglia services, and the Seurat view itself.
People were blown away by the Seurat view. Some suggested going to an even denser display (one pixel!), and one person joked that a clever attacker could spell out a message in Seurat view screens by selectively attacking hosts to create a pattern!

Glenn Herteg sent out an email with all sorts of info he learned from Matt Massie:

==============================
Subject: using Ganglia effectively

Folks,

I attended tonight’s Monitoring SIG meeting, and it had some
unexpected direct benefits to my project. I spoke with Matt
Massie before the meeting, and he offered some insight on how
best to configure Ganglia for scalability. I thought I would
write up these ideas in a clear form and share them with everyone.

(*) Configure Ganglia so it writes RRD archives to RAMdisk, and
run a separate rsync of that file tree to disk every few
minutes. Essentially all of the heavy lifting is then just
in-memory operations. Matt claims this scales very well to
thousands of nodes.

There is some possibility of the RRD archive ending up
in a corrupted state if the disk-sync activity happens
while RRDtool is in the middle of updating the archive,
or if the machine crashes in the middle of the rsync.
Apparently one way to protect against the
asynchronous-update problem is to use rsync’s checksum
capability to compare source and target files after the
copy and make sure they are identical; if they are not,
rsync will copy them again.

I suspect that even this is not perfect protection against
mid-transaction copying, but it will probably do for all
practical purposes. Matt hasn’t heard of any problems with it.
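
    A minimal sketch of that sync loop; the paths and the
    five-minute interval are made up for illustration:

        #!/usr/bin/env python
        # Hypothetical sync loop: gmetad writes RRDs to a
        # RAM-backed tree; we periodically copy that tree
        # to persistent disk.
        import subprocess, time

        RAM_TREE  = "/mnt/ramdisk/rrds/"      # assumed tmpfs mount
        DISK_TREE = "/var/lib/ganglia/rrds/"  # assumed disk copy

        while True:
            # First pass: ordinary rsync (mtime/size comparison).
            subprocess.call(["rsync", "-a", RAM_TREE, DISK_TREE])
            # Second pass with --checksum: a file caught
            # mid-update on the first pass will show a checksum
            # mismatch here and be copied again.
            subprocess.call(["rsync", "-a", "--checksum",
                             RAM_TREE, DISK_TREE])
            time.sleep(300)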

(*) When you need to write lots of RRD files, don’t fork every
time you need to run an update. Rather, write a daemon
process you can connect to via sockets or other lightweight
connections. This daemon can even be multithreaded if need
be, with proper implementation. Have the daemon open one
or more pipes to separate RRDtool processes run with the “-”
option so they read commands from standard input. And then
when you need to update a file, connect to the daemon, have
it write an RRD command to one of the pipes, and you’re done.
All the work is done in a relatively lightweight manner -
no forks involved. Sure, there is still a socket connection
setup and teardown if you don’t want that transport to be
permanent, and RRDtool itself will open and close the files
it touches, but relatively speaking, those are much lighter
operations than forking, especially if you combine this
approach with the RAMdisk setup as above.

If you use a multi-thread daemon in such a configuration,
you need to ensure that Create operations are prioritized
over Update operations, so you don’t miss an Update because
the Create that should have preceded it hasn’t been run yet
due to race conditions.

The O’Reilly book on Pthreads has some good thread-pool code
that works well and can be used as a model for the worker
threads in such a daemon.
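
    Here is a stripped-down, single-threaded sketch of the
    daemon idea described above, assuming a Unix-domain socket
    and a one-command-per-connection protocol (both invented
    for illustration); a real version would add the thread
    pool and the Create-before-Update ordering:

        #!/usr/bin/env python
        # Hypothetical no-fork RRD update daemon: one
        # long-lived "rrdtool -" child reads commands on its
        # standard input, and clients hand us commands over
        # a Unix socket.
        import os, socket, subprocess

        SOCK_PATH = "/tmp/rrd-daemon.sock"   # assumed path

        # Start rrdtool exactly once; the "-" argument puts
        # it in remote mode, reading commands from stdin.
        rrd = subprocess.Popen(["rrdtool", "-"],
                               stdin=subprocess.PIPE,
                               universal_newlines=True)

        if os.path.exists(SOCK_PATH):
            os.unlink(SOCK_PATH)
        server = socket.socket(socket.AF_UNIX,
                               socket.SOCK_STREAM)
        server.bind(SOCK_PATH)
        server.listen(5)

        while True:
            conn, _ = server.accept()
            # e.g. "update /ram/rrds/foo.rrd N:42\n"
            line = conn.makefile("r").readline()
            if line:
                rrd.stdin.write(line)  # no fork per update
                rrd.stdin.flush()
            conn.close()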

(*) If you have multiple data points to write in tandem, it’s
much more efficient to write them as multiple data sources
in a single RRD archive than as single data sources in
multiple RRD archives. We’re talking order-of-magnitude
performance differences.

Such archives might be a little more difficult to work with
if the set of metrics to be recorded occasionally changes,
but it’s probably worth dealing with those complications
to take advantage of the real performance benefits.
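
    A quick sketch of the efficient form, with made-up metric
    names; one update against one file records all three
    values:

        #!/usr/bin/env python
        # Hypothetical: one archive with three data sources,
        # so a single update touches a single file.
        import subprocess

        subprocess.call([
            "rrdtool", "create", "host_net.rrd", "--step", "15",
            "DS:bytes_in:GAUGE:30:0:U",
            "DS:bytes_out:GAUGE:30:0:U",
            "DS:pkts_in:GAUGE:30:0:U",
            "RRA:AVERAGE:0.5:1:5760",
        ])
        # One update, one file, three values (in DS order):
        subprocess.call(["rrdtool", "update", "host_net.rrd",
                         "N:120000:80000:95"])
        # The slow alternative would be three separate .rrd
        # files, each needing its own update call and its own
        # file open/close.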

(*) Ganglia doesn’t have a standard mechanism for the gmetad to
produce non-standard graphs (i.e., graphs other than those
normally presented through the Ganglia Web Interface).
What people do, though, is modify the graph[s].php file
that comes with Ganglia and that produces the existing
graphs. It’s not too difficult to see how the current set
of graphs is produced and to extend the set as needed.
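
    graph.php itself is PHP, but the underlying idea is just
    composing an "rrdtool graph" command over the RRDs that
    gmetad maintains. A hypothetical sketch of the same idea
    from the command side; the path and the "sum" data-source
    name are assumptions about a typical install:

        #!/usr/bin/env python
        # Hypothetical custom graph: render the last hour of
        # the cluster's one-minute load from a gmetad summary
        # RRD.
        import subprocess

        RRD = ("/var/lib/ganglia/rrds/MyCluster/"
               "__SummaryInfo__/load_one.rrd")   # assumed path

        subprocess.call([
            "rrdtool", "graph", "load_one.png",
            "--start", "-3600",
            "--title", "Cluster load_one, last hour",
            "DEF:load=%s:sum:AVERAGE" % RRD,
            "LINE2:load#0000FF:1-minute load",
        ])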

Glenn
