Configuration - Hosts
The hosts files are what the whole configuration has been working
toward. Here we tell remstats which hosts we're interested in and what we
want to monitor. Here's a sample host file called clark.dgim.crc.ca:
desc DNS and Web
ip 142.92.39.18
aliases ns1.crc.ca
via 142.92.32.10
keys server sun
group Servers
contact Thomas Erskine <thomas.erskine@crc.ca>
tools ping traceroute telnet http clark-special:special
rrd ping
rrd cpu
noalert cpu user
community xyzzy
rrd load
nograph load users
rrd if-le0
alert if-le0 ierr < 1000 5000 10000
alert if-le0 in WARN
rrd df-/var
rrd df-/tmp
rrd port-http critical
rrd port-ssh
rrd port-whois
noavailability port-whois status
noavailability port-whois response
rrd port-domain critical
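To make the layout of the sample concrete, here is a minimal sketch in Python of how such a file could be read. It assumes only what the sample shows: one directive per line, with the first word naming the directive and the rest of the line holding its arguments. The function name parse_host_file is made up for illustration and this is not how remstats itself reads the file.

```python
def parse_host_file(text):
    """Split host-file text into (directive, arguments) pairs, one per line."""
    entries = []
    for line in text.splitlines():
        parts = line.split(None, 1)  # first word, then the rest of the line
        if not parts:
            continue  # skip blank lines
        directive = parts[0]
        arguments = parts[1].strip() if len(parts) > 1 else ""
        entries.append((directive, arguments))
    return entries


sample = """\
desc DNS and Web
ip 142.92.39.18
group Servers
rrd ping
rrd if-le0
"""

entries = parse_host_file(sample)
# Collect the arguments of every rrd line in the sample.
rrds = [arguments for directive, arguments in entries if directive == "rrd"]
print(rrds)
```

Keeping the entries as an ordered list rather than a dictionary matters here, because directives such as rrd and alert can appear many times in one host file.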
The name of the file (clark.dgim.crc.ca) is the host that you're
interested in. The name should be a fully-qualified domain name, but anything
which perl's gethostbyname can resolve should work.
The ip line saves the IP number from having to be looked up and
can also be used to deal with hosts which aren't in the DNS. If you want the
IP number to be looked up each time, you can leave this line out.
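The choice between a configured address and a lookup can be sketched as follows. This is only an illustration in Python, not the remstats code: host_address and its directives dictionary are hypothetical names, and the resolver is passed in as a parameter so the behaviour can be checked without touching the DNS.

```python
import socket


def host_address(hostname, directives, resolver=socket.gethostbyname):
    """Return the address from an ip line if there is one, else look it up."""
    ip = directives.get("ip")
    if ip:
        return ip  # configured address: no lookup needed
    return resolver(hostname)  # no ip line: resolve the name each time


# With an ip line present, the resolver is never called.
print(host_address("clark.dgim.crc.ca", {"ip": "142.92.39.18"}))
```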
The desc line gives this host a description, which page-writer
will put on pages about this host.
The aliases line tells remstats about other names for this host. This
is mainly for the ping-collector, to allow it to tell for sure when
it has got a response from this host.
The via line is used by the topology-monitor to specify networking
gear (like hubs and switches) which are in the path to the host, but won't
show up in a traceroute.
The keys line is used to attach arbitrary attributes to hosts, for
selection purposes. This is to make it easier to write scripts which
act on specified collections of hosts. For an example, see the
update-switch-ports script.
The group line is required and tells which group this host
belongs to. Remember that you defined all the groups back in the
groups file.
The contact line tells who to contact for this host. If a line in
the alerts config-file refers to a recipient called CONTACT, the value
of the host's contact line will be substituted.
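The CONTACT substitution amounts to a simple placeholder replacement. A minimal sketch in Python, assuming the recipients of an alert are available as a list of strings (the alerts config-file format is not shown here, so the list and the function name expand_recipients are hypothetical):

```python
def expand_recipients(recipients, host_contact):
    """Replace the placeholder CONTACT with the host's contact value."""
    return [host_contact if r == "CONTACT" else r for r in recipients]


contact = "Thomas Erskine <thomas.erskine@crc.ca>"
# "operators" is an invented recipient name, used only for illustration.
print(expand_recipients(["operators", "CONTACT"], contact))
```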
The tools line tells which tools (defined in the tools config-file)
you want to appear for this host. For example, if a host doesn't have a
web-server, there's no point in providing a link to connect to it.
To accommodate host-specific tools, a toolname can be given as
real-tool-name:display-name. This means that the tool will be defined
in the tools config-file as real-tool-name, but will be displayed
as display-name.
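The real-tool-name:display-name form can be illustrated with a short Python sketch. It assumes tool names are separated by whitespace and that a colon appears only as the separator between the two names. The function parse_tools is a made-up name, not part of remstats.

```python
def parse_tools(arguments):
    """Turn the arguments of a tools line into (real_name, display_name) pairs."""
    tools = []
    for item in arguments.split():
        real, sep, display = item.partition(":")
        # Without a colon, the tool is displayed under its real name.
        tools.append((real, display if sep else real))
    return tools


for real, display in parse_tools("ping traceroute telnet http clark-special:special"):
    print(real, "->", display)
```

In the sample, only the last tool has a display name that differs from the name it is defined under.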
The rrd lines tell which rrds to collect for this host.
If the rrd was defined as a wildcard, it will have the instance
specified here. In the example there are three wildcard lines, referring
to if-le0, df-/var and df-/tmp. The first is looking at the data
for network interface le0 and the others are getting data on the /var and
/tmp file-systems, respectively.
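How a wildcard rrd name carries its instance can be sketched like this. The Python below assumes that the instance is whatever follows the first hyphen, that the wildcard definitions are written as if-* and df-*, and that the set of wildcard prefixes is known in advance. All three are assumptions made for illustration, as is the name split_instance.

```python
WILDCARD_PREFIXES = {"if", "df"}  # assumed set of rrds defined as wildcards


def split_instance(rrd_name, wildcards=WILDCARD_PREFIXES):
    """Return (rrd definition name, instance); instance is None if not a wildcard."""
    prefix, sep, instance = rrd_name.partition("-")
    if sep and prefix in wildcards:
        return prefix + "-*", instance
    return rrd_name, None


for name in ["if-le0", "df-/var", "df-/tmp", "ping"]:
    print(name, split_instance(name))
```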
[Alert definitions are fully specified in alert-monitor.]
The first alert line sets the alert thresholds for the ierr variable of
if-le0 (1000, 5000 and 10000). If this host file was from the same
configuration as the previous rrd sample, the alert here would override
the one in the rrd file.
There is also a noalert line, which cancels an alert set in the
rrd without setting a replacement alert. The alert line for a host
must specify the rrd as well, but is otherwise the same as an alert on an
rrd.
The second alert line specifies the status (WARN) to use when data for
the in variable is missing.
There can also be descriptions for rrds. If you append to an
rrd line something like desc='xyzzy', then you'll see
that description on pages dealing with it. I added this for labelling
network interfaces, but you can use it for anything you want.
The community line specifies the SNMP community string to use when
fetching SNMP data from this host. If the host config-file doesn't specify
any RRDs collected by the snmp-collector, you don't need to specify a community.
If this host uses any rrds collected by the snmp-collector, it can also
specify a port to use like:
snmpport 3401
If the RRD itself specifies a port, then the RRD-specified port will be
used instead, for that RRD.
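The precedence between the two port settings can be written down as a small Python sketch. The fallback to port 161, the standard SNMP port, is an assumption about what happens when neither the host nor the RRD specifies a port. The function name snmp_port is hypothetical.

```python
DEFAULT_SNMP_PORT = 161  # standard SNMP port; assumed fallback


def snmp_port(host_port=None, rrd_port=None):
    """Pick the port for one RRD: RRD setting first, then host, then default."""
    if rrd_port is not None:
        return rrd_port
    if host_port is not None:
        return host_port
    return DEFAULT_SNMP_PORT


print(snmp_port(host_port=3401))                 # host-level snmpport applies
print(snmp_port(host_port=3401, rrd_port=1161))  # RRD-specified port wins
```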
If you don't want a particular graph for this host, you can include a
nograph line. It looks like:
nograph rrdname graphname
There can also be a statusfile line, looking like:
statusfile NNN
with NNN replaced by the name of a status file from that host's
data directory. This permits the main index pages to show the status
of an un-pingable host as the status of something else, like the
reachability of its web-server (STATUS-port-http).
Related to that, there can be multiple headerfile directives, which look
like:
headerfile FFF LLL
with FFF being the name of a file in that host's data directory, and
LLL being a label for the header. This can be used to inject arbitrary
information into the header.
The noavailability lines tell the availability-report program
not to report on certain rrd/variable combinations. In this case,
we don't want to see availability stats on the whois server. Maybe
it's too embarrassing?
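The effect of the noavailability lines can be sketched as a filter over rrd/variable pairs. This Python is an illustration only. It assumes each line has exactly the form shown in the sample (noavailability, then the rrd, then the variable), and the function names are invented.

```python
def parse_noavailability(lines):
    """Collect the (rrd, variable) pairs named on noavailability lines."""
    excluded = set()
    for line in lines:
        parts = line.split()
        if len(parts) == 3 and parts[0] == "noavailability":
            excluded.add((parts[1], parts[2]))
    return excluded


def reportable(pairs, excluded):
    """Keep only the (rrd, variable) pairs that should appear in the report."""
    return [pair for pair in pairs if pair not in excluded]


excluded = parse_noavailability([
    "noavailability port-whois status",
    "noavailability port-whois response",
])
candidates = [("port-http", "status"), ("port-whois", "status"),
              ("port-whois", "response")]
print(reportable(candidates, excluded))
```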
RRD-specific Information
There may also be RRD-specific information on an rrd directive: anything
after the rrd-name.
Any of them can include desc="whatever" as part of the extra information.
This description will be included in the headers of graph pages for that RRD.
For RRDs collected by the port-collector, there may optionally be the
string critical, which will cause a status of ERROR to be elevated to
CRITICAL. This is a relic, which may go away, so don't count on it. Instead,
use the alert directive to override the RRD's alert specifications.
For RRDs collected by the log-collector, the extra information is not
optional, and consists of the full path to the log-file from which that
RRD's information will be collected.
For RRDs collected by the dbi-collector, there may be several different
pieces of extra information, like:
rrd rrdname CONNECT="cc" SELECT="ss" USER="uu" PASSWORD="pp" DATABASE="dd"
This permits using an RRD definition while overriding many parts of that
definition.
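The override behaviour can be pictured with a short Python sketch that pulls the KEY="value" pairs off an rrd line and lays them over a base definition. The regular expression, the function name parse_rrd_extras and the contents of the base definition are all assumptions for illustration. The real dbi-collector may parse these differently.

```python
import re

# KEY="value" pairs; values may contain spaces but not double quotes.
PAIR = re.compile(r'(\w+)="([^"]*)"')


def parse_rrd_extras(line):
    """Return (rrd name, dict of KEY="value" extras) for one rrd line."""
    parts = line.split(None, 2)  # "rrd", the rrd name, then the extras
    name = parts[1]
    extras = dict(PAIR.findall(parts[2])) if len(parts) > 2 else {}
    return name, extras


line = 'rrd rrdname CONNECT="cc" SELECT="ss" USER="uu" PASSWORD="pp" DATABASE="dd"'
name, extras = parse_rrd_extras(line)

# Hypothetical values from the RRD definition; host-file extras take precedence.
definition = {"CONNECT": "default-cc", "SELECT": "default-ss", "USER": "default-uu"}
effective = {**definition, **extras}
print(name, effective["CONNECT"], effective["DATABASE"])
```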