Tuesday, June 03, 2008

Nagios and its limitations?

Lately, I've been experimenting with the Nagios monitoring system, a GPL open source project, to monitor Hadoop clusters.
So far so good, but I still have to see how many nodes and services Nagios can monitor in a relatively short span (e.g. a 3-second period).

Note: A "service" is whatever is being monitored. It could be "check disk space", "check users logged on", etc.

Since Nagios spawns a child process for each service check, it can theoretically run at most N checks at once, where N is the number of processes the OS can create and handle concurrently. The service being monitored is more likely to be on a remote host, and in that case N could instead be the number of sockets the Nagios server's OS can handle.
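
As a rough way to see those two ceilings on the Nagios host, here is a minimal sketch that assumes a Unix-like OS and uses Python's standard resource module. It only reports the limits that apply to the user running it (RLIMIT_NPROC is a per-user process limit, RLIMIT_NOFILE a per-process descriptor limit that also covers sockets), so it is an approximation of N, not a measurement of what Nagios itself can sustain. The helper names are made up for illustration.

```python
import resource


def describe_limit(name, limit_id):
    """Return (name, soft, hard) for one resource limit of the current user/process."""
    soft, hard = resource.getrlimit(limit_id)
    return name, soft, hard


def fmt(value):
    """Render a limit value, showing RLIM_INFINITY as 'unlimited'."""
    return "unlimited" if value == resource.RLIM_INFINITY else str(value)


# Assumption: these two limits stand in for "N" in the paragraph above.
limits = [
    describe_limit("max processes (RLIMIT_NPROC)", resource.RLIMIT_NPROC),
    describe_limit("max open files/sockets (RLIMIT_NOFILE)", resource.RLIMIT_NOFILE),
]

for name, soft, hard in limits:
    print(f"{name}: soft={fmt(soft)} hard={fmt(hard)}")
```

The soft limit is the one actually enforced; the hard limit is how far it can be raised without extra privileges.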

My current goal is to see if it can monitor 600 services (100 nodes x 6 services per node) within a 1-minute period.
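
To put that goal in numbers, here is a small back-of-the-envelope sketch. The node and service counts come from the goal above; the average check duration is purely an assumed value for illustration, not something I have measured, and the function names are hypothetical.

```python
import math


def required_rate(nodes, services_per_node, window_seconds):
    """Return (total checks, checks per second) needed to cover every service once per window."""
    total = nodes * services_per_node
    return total, total / window_seconds


def required_concurrency(rate_per_second, avg_check_seconds):
    """Estimate checks in flight at once: rate x average duration (Little's law)."""
    return math.ceil(rate_per_second * avg_check_seconds)


total, rate = required_rate(nodes=100, services_per_node=6, window_seconds=60)

# Assumption: each check (fork + remote round trip) takes about 2 seconds.
ASSUMED_CHECK_SECONDS = 2.0
concurrency = required_concurrency(rate, ASSUMED_CHECK_SECONDS)

print(f"{total} checks per minute = {rate:.1f} checks/second")
print(f"about {concurrency} checks in flight if each takes {ASSUMED_CHECK_SECONDS}s")
```

Under that assumed duration the target works out to 10 checks per second and about 20 concurrent child processes, well below typical process and socket limits, so the real question is more likely scheduling and check latency than the raw value of N.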

Let's see how it goes.

1 comment:

Joe said...

Did you get anywhere with Nagios monitoring Hadoop?