Nagios Installation, Configuration, and Usage

2013-02-28

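1. Overview

This manual describes how to install and configure Nagios Core, Nagios Plugins, NRPE, and NDOUtils, and how Horizon uses Nagios to monitor the hardware resources [1] and services [2] of OpenStack controller and compute nodes.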

Notes:

[1]: CPU, Mem, Disk, Network

[2]: keystone, glance-api, glance-registry, nova-api, nova-compute, nova-network, nova-scheduler, nova-volume, nova-objectstore, mysql, dnsmasq, rabbitmq, etc.

2. References

Official Nagios documentation: http://www.nagios.org/documentation

Reference manuals: http://library.nagios.com/library/products/nagioscore/manuals

Plugin resources: http://exchange.nagios.org/

Source tarballs: http://sourceforge.net/projects/nagios/files/?source=navbar

3. Environment Preparation

Operating system: Ubuntu 12.04 LTS 64-bit server

Nagios Core version: nagios-3.4.4

NRPE version: nrpe-2.14

NDOUtils version: ndoutils-1.5.2

Dependency list:

apache2

libapache2-mod-php5

build-essential

libgd2-xpm-dev

make

gcc

xinetd
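
The dependency list above can be installed in one step. The following is a minimal sketch, assuming the packages come from the standard Ubuntu 12.04 repositories and are installed with apt-get (the manual itself does not state the install method). The helper name print_install_cmd is hypothetical; it only prints the command so it can be reviewed before being run.

```shell
#!/bin/sh
# Dependencies copied from the list above.
DEPS="apache2 libapache2-mod-php5 build-essential libgd2-xpm-dev make gcc xinetd"

# Print (do not run) the install command.
# Using apt-get here is an assumption, not something the manual specifies.
print_install_cmd() {
    echo "sudo apt-get install -y $DEPS"
}

print_install_cmd
```

Pasting the printed line into a terminal performs the actual installation.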

-d DEVICE            DEVICE must be without /dev (ex: -d sda)
-w/c TPS,READ,WRITE  TPS means transfer per seconds (aka IO/s)
                     READ and WRITE are in sectors per seconds
Example:
[Local environment]
$ sudo /usr/local/nagios/check_diskstat.sh -d vda -w 200,100000,100000 -c 300,200000,200000
> summary: 0 io/s, read 8 sectors (0kB/s), write 56 sectors (4kB/s) in 6 seconds | tps=0io/s;;; read=682b/s;;; write=4778b/s;;;
[Remote environment]
Add the following to /usr/local/nagios/etc/nrpe.cfg:
command[check_diskstat]=/usr/local/nagios/libexec/check_diskstat.sh -d vda -w 200,100000,100000 -c 300,200000,200000
$ /usr/local/nagios/libexec/check_nrpe -H 10.0.1.14 -c check_diskstat
> Same as above
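
The part of the plugin output after the "|" is performance data in the form label=value;warn;crit;min. When a consumer needs a single value from that string, it can be extracted as in the sketch below. The function name perfdata_value is hypothetical, and the sketch assumes unquoted labels without spaces, as in the check_diskstat output above.

```shell
#!/bin/sh
# Extract one performance-data value from a plugin output line.
# $1: full plugin output, $2: label (for example tps, read, write)
perfdata_value() {
    # Everything after the first '|' is performance data.
    perf=${1#*|}
    for item in $perf; do
        case $item in
            "$2"=*)
                value=${item#*=}      # drop "label="
                echo "${value%%;*}"   # drop the thresholds after the first ';'
                return 0
                ;;
        esac
    done
    return 1                          # label not found
}

# Output line taken from the local example above.
out='summary: 0 io/s, read 8 sectors (0kB/s), write 56 sectors (4kB/s) in 6 seconds | tps=0io/s;;; read=682b/s;;; write=4778b/s;;;'
perfdata_value "$out" read
```

For the sample line this prints 682b/s; asking for a label that is not present returns a non-zero status and prints nothing.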
---------------------------------------------------
Plugin name: check_disk
http://exchange.nagios.org/directory/Plugins/Operating-Systems/Linux/check_disk--2D-%25-used-space/details
Plugin description: Built on the df command; the -d value must match an entry in the "Mounted on" column printed by df.
Plugin parameters:
This plugin shows the % of used space of a mounted partition, using the 'df' utility
./check_disk:
-c <integer>  If the % of used space is above <integer>, returns CRITICAL state
-w <integer>  If the % of used space is below CRITICAL and above <integer>, returns WARNING state
-d <device>   The partition or mountpoint to be checked. eg. /dev/sda1, /home, /
Example:
[Local environment]
$ df -h
Filesystem      Size  Used Avail Use% Mounted on
/dev/vda1       9.9G  1.7G  7.8G  18% /
udev            998M   12K  998M   1% /dev
tmpfs           401M  224K  401M   1% /run
none            5.0M     0  5.0M   0% /run/lock
none           1002M     0 1002M   0% /run/shm
/dev/vdb         20G  173M   19G   1% /mnt
$ /usr/local/nagios/check_disk -d /mnt -c 80 -w 10
> OK - /mnt space used=1% | '/mnt usage'=1%;10;80;
[Remote environment]
Add the following to /usr/local/nagios/etc/nrpe.cfg:
command[check_disk]=/usr/local/nagios/libexec/check_disk -d /mnt -c 80 -w 10
$ /usr/local/nagios/libexec/check_nrpe -H 10.0.1.14 -c check_disk
> Same as above
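
The threshold rule described in the check_disk help text (above -c is CRITICAL, otherwise above -w is WARNING) can be sketched as follows. This is not the plugin's actual source. The function names disk_state and used_percent are hypothetical, and the sketch relies on the standard Nagios plugin exit codes (0 OK, 1 WARNING, 2 CRITICAL).

```shell
#!/bin/sh
# Map a used-space percentage to a Nagios state and exit code.
# $1: used percent (integer), $2: warning threshold, $3: critical threshold
disk_state() {
    if [ "$1" -gt "$3" ]; then
        echo "CRITICAL"; return 2
    elif [ "$1" -gt "$2" ]; then
        echo "WARNING"; return 1
    fi
    echo "OK"; return 0
}

# Read the Use% column for a mount point.
# df -P keeps each filesystem on one line, so column 5 is the percentage.
used_percent() {
    df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Values from the example above: /mnt is 1% used, -w 10, -c 80.
disk_state 1 10 80
```

On a live system the first argument would come from the helper, for example disk_state "$(used_percent /mnt)" 10 80.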
---------------------------------------------------
Plugin name: check_lvm
http://exchange.nagios.org/directory/Plugins/Operating-Systems/Linux/check_lvm/details
Plugin description: Runs only when a volume group (VG) exists.
Plugin parameters:
NOTE - This script only works on _mounted_ volumes!
Usage: ./check_lvm -w -c
Description:
This plugin finds all LVM logical volumes, checks their used space, and compares against the supplied thresholds.
Example:

5.2 Controller and Compute Services
Plugin name: check_procs
Plugin description: Based on ps; can be used to check whether the processes of the relevant services exist.
Plugin parameters:
check_procs -w <range> -c <range> [-m metric] [-s state] [-p ppid]
 [-u user] [-r rss] [-z vsz] [-P %cpu] [-a argument-array]
 [-C command] [-t timeout] [-v]
Options:
 -h, --help
    Print detailed help screen
 -V, --version
    Print version information
 -w, --warning=RANGE
    Generate warning state if metric is outside this range
 -c, --critical=RANGE
    Generate critical state if metric is outside this range
 -m, --metric=TYPE
    Check thresholds against metric. Valid types:
    PROCS   - number of processes (default)
    VSZ     - virtual memory size
    RSS     - resident set memory size
    CPU     - percentage CPU
    ELAPSED - time elapsed in seconds
 -t, --timeout=INTEGER
    Seconds before connection times out (default: 10)
 -v, --verbose
    Extra information. Up to 3 verbosity levels
Filters:
 -s, --state=STATUSFLAGS
    Only scan for processes that have, in the output of `ps`, one or
    more of the status flags you specify (for example R, Z, S, RS,
    RSZDT, plus others based on the output of your 'ps' command).
 -p, --ppid=PPID
    Only scan for children of the parent process ID indicated.
 -z, --vsz=VSZ
    Only scan for processes with VSZ higher than indicated.
 -r, --rss=RSS
    Only scan for processes with RSS higher than indicated.
 -P, --pcpu=PCPU
    Only scan for processes with PCPU higher than indicated.
 -u, --user=USER
    Only scan for processes with user name or ID indicated.
 -a, --argument-array=STRING
    Only scan for processes with args that contain STRING.
 --ereg-argument-array=STRING
    Only scan for processes with args that contain the regex STRING.
 -C, --command=COMMAND
    Only scan for exact matches of COMMAND (without path).
Example:
$ /usr/local/nagios/check_procs -w 3 -c 5 -a nagios
> PROCS OK: 2 processes with args 'nagios'
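
To check that each OpenStack service process exists on a remote node, one NRPE command per service could be defined, following the nrpe.cfg pattern used for the disk plugins. The sketch below only generates the definition lines. The check_proc_<service> command names and the function gen_nrpe_commands are hypothetical, the libexec plugin path is assumed from the other nrpe.cfg entries, and matching by -a on the service name is an assumption about how the processes appear in ps.

```shell
#!/bin/sh
# Print one NRPE command definition per service name.
# The check_proc_<service> command names are hypothetical.
PLUGIN=/usr/local/nagios/libexec/check_procs

gen_nrpe_commands() {
    for svc in "$@"; do
        # "-c 1:" is a Nagios range: CRITICAL when fewer than 1 process matches.
        echo "command[check_proc_$svc]=$PLUGIN -c 1: -a $svc"
    done
}

gen_nrpe_commands nova-api nova-compute
```

The printed lines can be appended to /usr/local/nagios/etc/nrpe.cfg and then queried with check_nrpe -H <host> -c check_proc_nova-api.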

5.3 Other Optional Monitoring Plugins
[LOG]
http://exchange.nagios.org/directory/Plugins/Operating-Systems/Linux/check_log-2Esh/details
http://exchange.nagios.org/directory/Plugins/Log-Files
[DNS]
http://exchange.nagios.org/directory/Plugins/Operating-Systems/Linux/check_dig/details
[DHCP]
http://exchange.nagios.org/directory/Plugins/Network-Protocols/DHCP-and-BOOTP
[AMQP]
http://exchange.nagios.org/directory/Plugins/Software/check_rabbitmq/details
[MYSQL]
http://exchange.nagios.org/directory/Plugins/Databases/MySQL
[ROUTE]
http://exchange.nagios.org/directory/Plugins/Network-Protocols/%2A-Routing
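Notes
Nagios has its own web interface, which obtains information by interacting with the Nagios Core process. Nagios Core in turn collects information through plugins and stores the data in a MySQL database.
Because the current environment only needs node monitoring information gathered through Nagios plugins, Nagios Core, NDOUtils, and the Nagios web interface are not described in depth here. For details, see the References.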
