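Straight to the point: this post is a set of personal notes and may read as disorganized. It records the problems I ran into, one by one, for my own later reference and to help anyone stuck on similar issues.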
System version and information
cat /etc/redhat-release
CentOS release 6.2 (Final)
uname -a
Linux 2.6.32-220.el6.x86_64 x86_64 x86_64 x86_64 GNU/Linux
ifconfig | sed -n 1,2p
eth0 Link encap:Ethernet HWaddr 40:F2:E9:29:5F:EA
inet addr:192.168.0.2 Bcast:192.168.69.255 Mask:255.255.255.0
iptables and SELinux are turned off.
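The notes only state that iptables and SELinux are turned off. A minimal sketch of how that is commonly done on CentOS 6 might look like the following. The function name `disable_selinux_in` and its file argument are illustrative assumptions, not part of the original notes, and the root-only commands are left as comments.

```shell
# Hypothetical helper: rewrite the SELINUX= line in a config file.
# On a real CentOS 6 host the file would be /etc/selinux/config.
disable_selinux_in() {
    cfg="$1"
    # Replace whatever mode is set with "disabled"; other lines are untouched.
    sed -i 's/^SELINUX=.*/SELINUX=disabled/' "$cfg"
}

# Typical root-only commands on CentOS 6 (not run here):
#   service iptables stop && chkconfig iptables off
#   setenforce 0
#   disable_selinux_in /etc/selinux/config
```

The permanent SELinux change takes effect after a reboot; `setenforce 0` only switches the running system to permissive mode.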
Software versions
The LAMP/LNMP stack is not covered here; either environment works. Mine is an LNMP environment installed with yum.
nagios-4.0.5.tar.gz
nagios-plugins-1.4.16.tar.gz
nrpe-2.15.tar.gz
pnp4nagios-0.6.19.tar.gz
Preparation before installing Nagios
Make sure yum works; a network yum repository is recommended. Install the libraries the system needs:
yum groupinstall "Compatibility libraries" "Base" "Development tools"
Install LAMP and the required packages:
yum -y install http* php* MySQL* perl* net-snmp* openssl* glibc rrdtool rrdtool-devel rrdtool-perl rrdtool-php
chkconfig mysqld on
chkconfig httpd on
chkconfig snmpd on
service httpd start
service mysqld start
service snmpd start
Once everything tests OK, continue to the next step:
ps -ef | grep -v grep | grep http   # repeat with mysql and snmp, one at a time, then test access to the web page
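The per-service check above can be wrapped so that all three daemons are checked in one pass. This is only a sketch; `report_services` is a hypothetical helper that reads a process listing on standard input, which also makes it easy to try without the services installed.

```shell
# Hypothetical helper: read a process listing on stdin and report whether
# each named daemon appears in it.
report_services() {
    listing=$(cat)
    for name in "$@"; do
        if printf '%s\n' "$listing" | grep -q -- "$name"; then
            echo "$name: running"
        else
            echo "$name: NOT running"
        fi
    done
}

# Usage on the monitoring host:
#   ps -ef | grep -v grep | report_services httpd mysqld snmpd
```

Matching is a plain substring search, so a name such as `http` would also match `httpd`.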
Installing Nagios
1. Create the nagios user and group
[root@nagios ~]# useradd -s /sbin/nologin nagios
[root@nagios ~]# mkdir /usr/local/nagios
[root@nagios ~]# chown -R nagios.nagios /usr/local/nagios/
2. Compile and install nagios
[root@nagios tools]# tar zxf nagios-4.0.5.tar.gz
[root@nagios tools]# cd nagios-4.0.5
[root@nagios nagios-4.0.5]# ./configure --prefix=/usr/local/nagios
[root@nagios nagios-4.0.5]# make all && make install && make install-init && make install-commandmode && make install-config && make install-webconf
[root@nagios nagios-4.0.5]# echo $?
0
3. Enable start at boot
chkconfig --add nagios
chkconfig nagios on
chkconfig --list nagios
Installing nagios-plugins
[root@nagios tools]# tar zxf nagios-plugins-1.4.16.tar.gz
[root@nagios tools]# cd nagios-plugins-1.4.16
[root@nagios nagios-plugins-1.4.16]# ./configure --prefix=/usr/local/nagios/
[root@nagios nagios-plugins-1.4.16]# make
[root@nagios nagios-plugins-1.4.16]# make install
[root@nagios nagios-plugins-1.4.16]# echo $?
0
Editing the httpd.conf configuration file
cd /etc/httpd/conf
cp -a httpd.conf httpd.conf.bak
vim httpd.conf
# Append the following at the end of the file
####### setting for nagios #######
ScriptAlias /nagios/cgi-bin "/usr/local/nagios/sbin"
<Directory "/usr/local/nagios/sbin">
    AuthType Basic
    Options ExecCGI
    AllowOverride None
    Order allow,deny
    Allow from all
    AuthName "nagios access"
    AuthUserFile /usr/local/nagios/etc/htpasswd
    Require valid-user
</Directory>
Alias /nagios "/usr/local/nagios/share"
<Directory "/usr/local/nagios/share">
    AuthType Basic
    Options ExecCGI
    AllowOverride None
    Order allow,deny
    Allow from all
    AuthName "nagios access"
    AuthUserFile /usr/local/nagios/etc/htpasswd
    Require valid-user
</Directory>
Change
DirectoryIndex index.html index.html.var
to
DirectoryIndex index.php index.html index.html.var
Change
Options Indexes FollowSymLinks
to
Options FollowSymLinks   # prevents directory listing
service httpd restart
Add the Nagios login authentication file. Using the default nagiosadmin user is simplest; any other user name requires changes to cgi.cfg, as done below for admin. Back up files before editing them (the backup is skipped here).
[root@nagios etc]# cd /usr/local/nagios/etc
[root@nagios etc]# sed -i 's@nagiosadmin@nagiosadmin,admin@g' cgi.cfg
[root@nagios etc]# sed -i 's@#default_user_name=guest@default_user_name=admin@g' cgi.cfg
[root@nagios nagios]# htpasswd -c /usr/local/nagios/etc/htpasswd admin
New password: ******
Re-type new password: ******
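To see what the first cgi.cfg substitution does before touching the real file, it can be tried on a copy. The sketch below wraps the same sed expression in a hypothetical helper, `add_cgi_user`; the helper name and its arguments are assumptions made for illustration.

```shell
# Hypothetical helper: append a user after every "nagiosadmin" in a cgi.cfg copy,
# using the same substitution as in the notes.
add_cgi_user() {
    file="$1"
    user="$2"
    sed -i "s@nagiosadmin@nagiosadmin,${user}@g" "$file"
}

# Usage on a scratch copy:
#   cp /usr/local/nagios/etc/cgi.cfg /tmp/cgi.cfg.test
#   add_cgi_user /tmp/cgi.cfg.test admin
#   grep '^authorized_for' /tmp/cgi.cfg.test
```

The substitution is not idempotent: running it twice appends the user twice, which is one more reason to keep a backup.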
Installing NRPE
[root@nagios tools]# tar zxf nrpe-2.15.tar.gz
[root@nagios tools]# cd nrpe-2.15
[root@nagios nrpe-2.15]# ./configure; make all; make install-plugin; make install-daemon; make install-daemon-config
Start NRPE
[root@nagios nrpe-2.15]# /usr/local/nagios/bin/nrpe -c /usr/local/nagios/etc/nrpe.cfg -d
[root@nagios nrpe-2.15]# netstat -antl | grep 5666
tcp 0 0 0.0.0.0:5666 0.0.0.0:* LISTEN
[root@nagios libexec]# /usr/local/nagios/libexec/check_nrpe -H 127.0.0.1
NRPE v2.15
Stop NRPE
[root@nagios libexec]# ps -ef | grep -v grep | grep nrpe
[root@nagios libexec]# kill -9 <PID>
Checking the Nagios configuration
[root@nagios etc]# /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg
Total Warnings: 0
Total Errors: 0
Zero warnings and zero errors mean the configuration is OK.
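The pre-flight summary can also be checked by a script, for example before an automated restart. The following is a sketch only; `preflight_ok` is a hypothetical helper that reads the `nagios -v` output on standard input and looks at the `Total Errors:` line.

```shell
# Hypothetical helper: succeed only if the pre-flight summary on stdin
# reports zero errors. A missing summary line counts as a failure.
preflight_ok() {
    errors=$(sed -n 's/^Total Errors:[[:space:]]*\([0-9][0-9]*\).*/\1/p' | tail -n 1)
    [ -n "$errors" ] && [ "$errors" -eq 0 ]
}

# Usage:
#   /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg | preflight_ok \
#       && service nagios restart
```

Warnings are deliberately ignored here; a stricter variant could parse `Total Warnings:` the same way.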
Starting Nagios
[root@nagios etc]# service nagios start
Use stop or restart in place of start to stop or restart the service. The web interface is at http://IP/nagios
Installing pnp4nagios
[root@nagios tools]# tar zxf pnp4nagios-0.6.19.tar.gz
[root@nagios tools]# cd pnp4nagios-0.6.19
[root@nagios pnp4nagios-0.6.19]# ./configure
make all
make install
make install-config
make install-init
make install-webconf
Create the default configuration files:
cd /usr/local/pnp4nagios/etc
cp misccommands.cfg-sample misccommands.cfg
cp nagios.cfg-sample nagios.cfg
cp rra.cfg-sample rra.cfg
cd pages
cp web_traffic.cfg-sample web_traffic.cfg
cd ../check_commands/
cp check_all_local_disks.cfg-sample check_all_local_disks.cfg
cp check_nrpe.cfg-sample check_nrpe.cfg
cp check_nwstat.cfg-sample check_nwstat.cfg
cp /usr/local/pnp4nagios/libexec/* /usr/local/nagios/libexec/
vim /usr/local/nagios/etc/nagios.cfg
Check these settings:
enable_environment_macros=1
process_performance_data=1
host_perfdata_command=process-host-perfdata
service_perfdata_command=process-service-perfdata
Note: with Nagios 4.x, the configuration above prevents the traffic graphs from being generated later, and PNP4Nagios reports the following error:
PNP4Nagios Version 0.6.19
Please check the documentation for information about the following error.
perfdata directory "/usr/local/pnp4nagios/var/perfdata/localhost" for host "localhost" does not exist.
Read FAQ online
file [line]: application/models/data.php [148]:
back

This error comes from following the configuration above.
The fix is to use Bulk Mode.
vim /usr/local/nagios/etc/nagios.cfg
Check these settings:
enable_environment_macros=1
process_performance_data=1
Append the following at the end of the file:
# service performance data
service_perfdata_file=/usr/local/pnp4nagios/var/service-perfdata
service_perfdata_file_template=DATATYPE::SERVICEPERFDATA\tTIMET::$TIMET$\tHOSTNAME::$HOSTNAME$\tSERVICEDESC::$SERVICEDESC$\tSERVICEPERFDATA::$SERVICEPERFDATA$\tSERVICECHECKCOMMAND::$SERVICECHECKCOMMAND$\tHOSTSTATE::$HOSTSTATE$\tHOSTSTATETYPE::$HOSTSTATETYPE$\tSERVICESTATE::$SERVICESTATE$\tSERVICESTATETYPE::$SERVICESTATETYPE$
service_perfdata_file_mode=a
service_perfdata_file_processing_interval=15
service_perfdata_file_processing_command=process-service-perfdata-file
# host performance data starting with Nagios 3.0
host_perfdata_file=/usr/local/pnp4nagios/var/host-perfdata
host_perfdata_file_template=DATATYPE::HOSTPERFDATA\tTIMET::$TIMET$\tHOSTNAME::$HOSTNAME$\tHOSTPERFDATA::$HOSTPERFDATA$\tHOSTCHECKCOMMAND::$HOSTCHECKCOMMAND$\tHOSTSTATE::$HOSTSTATE$\tHOSTSTATETYPE::$HOSTSTATETYPE$
host_perfdata_file_mode=a
host_perfdata_file_processing_interval=15
host_perfdata_file_processing_command=process-host-perfdata-file
Save the file.
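Each record that Nagios writes to the perfdata files follows the template above: fields separated by tabs (the `\t` in the template), each field written as `KEY::VALUE`. To inspect a record by hand, something like the sketch below may help. `split_perfdata_record` is a hypothetical helper, and the sample values in the usage comment are made up.

```shell
# Hypothetical helper: print one KEY=VALUE pair per line from a bulk-mode
# perfdata record read on stdin (tab-separated fields of the form KEY::VALUE).
split_perfdata_record() {
    awk -F '\t' '{
        for (i = 1; i <= NF; i++) {
            sep = index($i, "::")
            if (sep > 0) print substr($i, 1, sep - 1) "=" substr($i, sep + 2)
        }
    }'
}

# Usage with an invented record:
#   printf 'DATATYPE::SERVICEPERFDATA\tHOSTNAME::cacti\tSERVICEDESC::http\n' | split_perfdata_record
# On a live system, while the file still exists between processing runs:
#   tail -n 1 /usr/local/pnp4nagios/var/service-perfdata | split_perfdata_record
```

Only the first `::` in a field is treated as the separator, so values that themselves contain `::` are kept whole.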
vim /usr/local/nagios/etc/objects/commands.cfg
define command{
command_name check_nrpe
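command_line $USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$
}
# The check_nrpe block above can be placed near the top of the file.
Next, append the bulk-mode command definitions below to the end of the file. Note that commands.cfg contains perfdata command definitions by default; find them and comment them out before adding the ones below, otherwise the Nagios configuration check reports errors.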
define command{
command_name process-service-perfdata-file
command_line /usr/local/pnp4nagios/libexec/process_perfdata.pl --bulk=/usr/local/pnp4nagios/var/service-perfdata
}
define command{
command_name process-host-perfdata-file
command_line /usr/local/pnp4nagios/libexec/process_perfdata.pl --bulk=/usr/local/pnp4nagios/var/host-perfdata
}
Define two PNP templates, one for hosts and one for services, and append them to the end of the file:
vim /usr/local/nagios/etc/objects/templates.cfg
define host {
name host-pnp
action_url /pnp4nagios/index.php/graph?host=$HOSTNAME$&srv=_HOST_
register 0
}
define service {
name service-pnp
action_url /pnp4nagios/index.php/graph?host=$HOSTNAME$&srv=$SERVICEDESC$
register 0
}
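The two `action_url` values are URL patterns in which Nagios substitutes `$HOSTNAME$` and `$SERVICEDESC$`. As an illustration of the resulting links, the sketch below builds the same strings by hand. `pnp_graph_url` is a hypothetical helper, and it performs no URL encoding, so it is only meaningful for simple names.

```shell
# Hypothetical helper: build the PNP4Nagios graph URL for a host, or for a
# service on that host when a second argument is given.
pnp_graph_url() {
    host="$1"
    srv="${2:-_HOST_}"   # _HOST_ is the placeholder used for host graphs
    printf '/pnp4nagios/index.php/graph?host=%s&srv=%s\n' "$host" "$srv"
}

# Usage:
#   pnp_graph_url cacti        # host graph
#   pnp_graph_url cacti http   # graph for the http service on cacti
```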
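Alternatively, add the action_url line to the existing host and service template definitions (their other parameters are omitted below). This saves a lot of time later, because hosts then get PNP enabled without extra per-host configuration.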
vim /usr/local/nagios/etc/objects/templates.cfg
define host {
action_url /pnp4nagios/index.php/graph?host=$HOSTNAME$&srv=_HOST_
}
define service {
action_url /pnp4nagios/index.php/graph?host=$HOSTNAME$&srv=$SERVICEDESC$
}
First run the pnp4nagios environment test. Append the following at the end of httpd.conf:
vim /etc/httpd/conf/httpd.conf
Alias /pnp4nagios "/usr/local/pnp4nagios/share"
<Directory "/usr/local/pnp4nagios/share">
    AllowOverride None
    Order allow,deny
    Allow from all
    AuthName "Nagios Access"
    AuthType Basic
    AuthUserFile /usr/local/nagios/etc/htpasswd
    Require valid-user
    RewriteEngine On
    Options FollowSymLinks
    RewriteBase /pnp4nagios
    RewriteRule ^(application|modules|system) - [F,L]
    RewriteCond %{REQUEST_FILENAME} !-f
    RewriteCond %{REQUEST_FILENAME} !-d
    RewriteRule .* index.php/$0 [PT,L]
</Directory>
service httpd restart
Open http://IP/pnp4nagios in a browser.

Once the environment test page passes, rename the installer page:
cd /usr/local/pnp4nagios/share/
mv install.php install.php.bak
Editing nagios.cfg
vim /usr/local/nagios/etc/nagios.cfg
cfg_file=/usr/local/nagios/etc/objects/commands.cfg
cfg_file=/usr/local/nagios/etc/objects/contacts.cfg
cfg_file=/usr/local/nagios/etc/objects/timeperiods.cfg
cfg_file=/usr/local/nagios/etc/objects/templates.cfg
cfg_file=/usr/local/nagios/etc/objects/localhost.cfg
cfg_file=/usr/local/nagios/etc/objects/hosts.cfg
cfg_file=/usr/local/nagios/etc/objects/hostgroup.cfg
cfg_file=/usr/local/nagios/etc/objects/service.cfg
or
cfg_file=/usr/local/nagios/etc/objects/commands.cfg
cfg_file=/usr/local/nagios/etc/objects/contacts.cfg
cfg_file=/usr/local/nagios/etc/objects/timeperiods.cfg
cfg_file=/usr/local/nagios/etc/objects/templates.cfg
cfg_file=/usr/local/nagios/etc/objects/localhost.cfg
cfg_dir=/usr/local/nagios/etc/objects/app
Note: this enables monitoring of Linux hosts only. Windows and switch monitoring are not enabled; to enable them, uncomment the corresponding lines. Either layout works. The difference is that in the first, all hosts share one set of configuration files, while in the second each host has its own file. Both are demonstrated below, labelled method 1 and method 2.
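Every `cfg_file=` entry must point at a file that actually exists, and a mismatch in a file name is easy to miss until the pre-flight check fails. A quick way to spot such entries might be the sketch below; `missing_cfg_files` is a hypothetical helper, and it only looks at uncommented `cfg_file=` lines, not `cfg_dir=`.

```shell
# Hypothetical helper: list cfg_file= entries in a nagios.cfg whose target
# file does not exist. Commented-out entries are skipped.
missing_cfg_files() {
    sed -n 's/^cfg_file=//p' "$1" | while IFS= read -r path; do
        [ -f "$path" ] || echo "$path"
    done
}

# Usage:
#   missing_cfg_files /usr/local/nagios/etc/nagios.cfg
```

No output means every referenced object file is present.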
Adding host configuration, method 1
By default nagios/etc/objects/ does not contain service.cfg, hosts.cfg, or hostgroup.cfg; create these files by hand.
vim hosts.cfg
define host{
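use linux-server,host-pnp ; templates from templates.cfg. If action_url was added directly to the existing host and service templates instead, host-pnp can be dropped here because linux-server already includes it
host_name cacti ; must be the host name of the monitored machine
alias cacti-web ; any alias you like
address 192.168.0.3 ; IP address of the host
contact_groups admins ; mail contact group, demonstrated below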
}
define host{
use linux-server,host-pnp
host_name nginx
alias nginx-web
address 192.168.0.4
contact_groups admins
}
Add one such block for each machine.
vim hostgroup.cfg
define hostgroup{
hostgroup_name servers ; group name
alias servers_group ; alias
members cacti,nginx ; host names, separated by commas
}
vim service.cfg
# All hosts share this one file, which gets messy.
#### set cacti host
define service{
use local-service,service-pnp
host_name cacti
service_description http
check_command check_http
contact_groups admins
flap_detection_enabled 0
}
define service{
use local-service,service-pnp
host_name cacti
service_description SSH_port
check_command check_tcp!22
contact_groups admins
flap_detection_enabled 0
}
define service{
use local-service,service-pnp
host_name cacti
service_description check_/
check_command check_nrpe!check_/ ; checked through NRPE; the command must be defined on the client
contact_groups admins
flap_detection_enabled 0
}
#### set nginx host
define service{
use local-service,service-pnp
host_name nginx
service_description Check_free_mem
check_command check_nrpe!check_free_mem
contact_groups admins
flap_detection_enabled 0
}
define service{
use local-service,service-pnp
host_name nginx
service_description check_/
check_command check_nrpe!check_/ ; checked through NRPE; the command must be defined on the client
contact_groups admins
flap_detection_enabled 0
}
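Add one definition for each service you need. End of method 1.
Adding host configuration, method 2
cd /usr/local/nagios/etc/objects/
mkdir app
cd app
vim 192.168.0.4.cfg
# All monitored objects for one host are defined in a single file. No host group is defined here because it adds little in this layout.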
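### define the host
define host{
use linux-server,host-pnp ; templates from templates.cfg. If action_url was added directly to the existing host and service templates instead, host-pnp can be dropped here because linux-server already includes it
host_name nginx ; must be the host name of the monitored machine
alias nginx-web ; any alias you like
address 192.168.0.4 ; IP address of the host
contact_groups admins ; mail contact group, demonstrated below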
}
### define the services
define service{
use local-service,service-pnp
host_name nginx
service_description Check_free_mem
check_command check_nrpe!check_free_mem
contact_groups admins
flap_detection_enabled 0
}
define service{
use local-service,service-pnp
host_name nginx
service_description check_/
check_command check_nrpe!check_/ ; checked through NRPE; the command must be defined on the client
contact_groups admins
flap_detection_enabled 0
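}
vim 192.168.0.3.cfg
### define the host
define host{
use linux-server,host-pnp ; templates from templates.cfg. If action_url was added directly to the existing host and service templates instead, host-pnp can be dropped here because linux-server already includes it
host_name cacti ; must be the host name of the monitored machine
alias cacti-web ; any alias you like
address 192.168.0.3 ; IP address of the host
contact_groups admins ; mail contact group, demonstrated below
}
### define the services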
define service{
use local-service,service-pnp
host_name cacti
service_description Check_free_mem
check_command check_nrpe!check_free_mem
contact_groups admins
flap_detection_enabled 0
}
define service{
use local-service,service-pnp
host_name cacti
service_description check_/
check_command check_nrpe!check_/ ; checked through NRPE; the command must be defined on the client
contact_groups admins
flap_detection_enabled 0
}
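Because every per-host file in method 2 has the same shape, the host block can be generated rather than typed. The sketch below is one possible way to do that; `gen_host_cfg` is a hypothetical helper, and the template names and contact group are simply the ones used in these notes.

```shell
# Hypothetical generator for the host block of a per-host object file.
# Arguments: host name, alias, IP address.
gen_host_cfg() {
    name="$1"
    alias_name="$2"
    addr="$3"
    cat <<EOF
define host{
    use             linux-server,host-pnp
    host_name       $name
    alias           $alias_name
    address         $addr
    contact_groups  admins
}
EOF
}

# Usage:
#   gen_host_cfg nginx nginx-web 192.168.0.4 > /usr/local/nagios/etc/objects/app/192.168.0.4.cfg
```

Service blocks still have to be appended to the generated file by hand or by a similar function.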
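This approach is much more convenient than the first. End of the two methods for adding hosts.
Nagios email alert settings
[root@nagios objects]# vim contacts.cfg
# For details on each parameter, search online.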
define contact{
contact_name nagiosadmin
use generic-contact
alias Nagios Admin
service_notification_period 24x7
host_notification_period 24x7
service_notification_options w,u,c,r
host_notification_options d,u,r
service_notification_commands notify-service-by-email
host_notification_commands notify-host-by-email
email xxxx@163.com
}
define contactgroup{
contactgroup_name admins ; the admins group referenced in the host and service definitions above
alias Nagios Administrators
members nagiosadmin
}
Check the configuration files for errors:
/usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg
Total Warnings: 0
Total Errors: 0
Things look okay - No serious problems were detected during the pre-flight check
service nagios restart
End of the server-side configuration.
Client installation and configuration
net-snmp must be installed. If other errors appear, resolve them according to the messages.
yum -y install net-snmp*
1. Create the nagios user and group
[root@nagios ~]# useradd -s /sbin/nologin nagios
[root@nagios ~]# mkdir /usr/local/nagios
[root@nagios ~]# chown -R nagios.nagios /usr/local/nagios/
2. Install nagios-plugins
[root@nagios tools]# tar zxf nagios-plugins-1.4.16.tar.gz
[root@nagios tools]# cd nagios-plugins-1.4.16
[root@nagios nagios-plugins-1.4.16]# ./configure --prefix=/usr/local/nagios/
[root@nagios nagios-plugins-1.4.16]# make
[root@nagios nagios-plugins-1.4.16]# make install
[root@nagios nagios-plugins-1.4.16]# echo $?
0
3. Install NRPE
[root@nagios tools]# tar zxf nrpe-2.15.tar.gz
[root@nagios tools]# cd nrpe-2.15
[root@nagios nrpe-2.15]# ./configure; make all; make install-plugin; make install-daemon; make install-daemon-config
Edit nrpe.cfg:
sed -i 's/allowed_hosts=127.0.0.1/allowed_hosts=127.0.0.1,192.168.0.2/g' /usr/local/nagios/etc/nrpe.cfg
vim /usr/local/nagios/etc/nrpe.cfg
command[check_swap]=/usr/local/nagios/libexec/check_swap -w 20% -c 10%
command[check_data]=/usr/local/nagios/libexec/check_disk -w 20% -c 10% -p /data
command[check_/]=/usr/local/nagios/libexec/check_disk -w 20% -c 10% -p /
command[check_zombie_procs]=/usr/local/nagios/libexec/check_procs -w 5 -c 10 -s Z
command[check_total_procs]=/usr/local/nagios/libexec/check_procs -w 150 -c 200
Save the file.
echo "/usr/local/nagios/bin/nrpe -c /usr/local/nagios/etc/nrpe.cfg -d" >> /etc/rc.local
Start NRPE
[root@nagios nrpe-2.15]# /usr/local/nagios/bin/nrpe -c /usr/local/nagios/etc/nrpe.cfg -d
[root@nagios nrpe-2.15]# netstat -antl | grep 5666
tcp 0 0 0.0.0.0:5666 0.0.0.0:* LISTEN
Run the next command on the server and make sure it succeeds. If it does not, check whether the client firewall and the network allow the connection.
[root@nagios libexec]# /usr/local/nagios/libexec/check_nrpe -H 192.168.0.3
NRPE v2.15
Stop NRPE
[root@nagios libexec]# ps -ef | grep -v grep | grep nrpe
[root@nagios libexec]# kill -9 <PID>
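Each `check_nrpe!NAME` used in the server-side service definitions must match a `command[NAME]` entry in the client's nrpe.cfg. To list the names a client actually defines, a sketch such as the following may be useful; `list_nrpe_commands` is a hypothetical helper and ignores commented-out entries.

```shell
# Hypothetical helper: print the names of all command[...] entries
# defined in an nrpe.cfg file, one per line.
list_nrpe_commands() {
    sed -n 's/^command\[\([^]]*\)]=.*/\1/p' "$1"
}

# Usage on the client:
#   list_nrpe_commands /usr/local/nagios/etc/nrpe.cfg
# Each name can then be tried from the server with:
#   /usr/local/nagios/libexec/check_nrpe -H 192.168.0.3 -c <name>
```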
When PNP produces no graphs, check the log
vim /usr/local/pnp4nagios/etc/process_perfdata.cfg
Change
LOG_LEVEL = 0
to
LOG_LEVEL = 2
more /usr/local/pnp4nagios/var/perfdata.log
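The same log-level change can be made without an editor, which is handy when it has to be switched on for debugging and back off afterwards. This is only a sketch; `set_pnp_log_level` is a hypothetical helper built around a plain sed substitution.

```shell
# Hypothetical helper: set LOG_LEVEL in a process_perfdata.cfg file.
# Arguments: path to the file, new level (for example 0 or 2).
set_pnp_log_level() {
    file="$1"
    level="$2"
    sed -i "s/^LOG_LEVEL[[:space:]]*=.*/LOG_LEVEL = ${level}/" "$file"
}

# Usage:
#   set_pnp_log_level /usr/local/pnp4nagios/etc/process_perfdata.cfg 2
#   ...reproduce the problem, read perfdata.log...
#   set_pnp_log_level /usr/local/pnp4nagios/etc/process_perfdata.cfg 0
```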
Note: when Nagios monitors processes, no graph is produced even if PNP is configured correctly. For example, these checks:
| OK | 10-20-2014 16:44:45 | 83d 1h 9m 16s | 1/3 | PROCS OK: 503 processes |
| OK | 10-20-2014 16:46:00 | 83d 1h 7m 58s | 1/3 | PROCS OK: 0 processes with STATE = Z |
lead to this PNP4Nagios error:
PNP4Nagios Version 0.6.19
Please check the documentation for information about the following error.
XML file "/usr/local/pnp4nagios/var/perfdata/app-11/Total_Processes.xml" not found.
Read FAQ online
file [line]: application/models/data.php [312]:
back

For the cause, see this very detailed article:
http://storysky.blog.51cto.com/628458/583787/
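One thing worth checking in this situation: PNP4Nagios graphs are built only from performance data, the part of a plugin's output after the `|` character, and the two process checks shown above print none. The sketch below tests a line of plugin output for that part; `has_perfdata` is a hypothetical helper, and whether this is the full explanation given in the linked article is an assumption.

```shell
# Hypothetical helper: succeed if a line of plugin output carries
# non-empty performance data after the first "|".
has_perfdata() {
    perf=${1#*|}                      # text after the first "|", or the whole line if none
    [ "$perf" != "$1" ] && [ -n "$(printf '%s' "$perf" | tr -d '[:space:]')" ]
}

# Usage:
#   has_perfdata "$(/usr/local/nagios/libexec/check_procs -w 150 -c 200)" \
#       && echo "graphable" || echo "no perfdata, no graph"
```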
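If the stock Nagios monitoring plugins do not meet your needs, you can write your own.
For example, below is a memory monitoring plugin. I found it through a web search, it works quite well, and I am borrowing it here.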
vim /usr/local/nagios/libexec/check_mem
#!/bin/bash
# Nagios plugin: checks memory usage (percent used) against warning and critical thresholds.
STAT_OK=0
STAT_WARNING=1
STAT_CRITICAL=2
STAT_UNKNOWN=3
# Row 3 of `free -m` is the "-/+ buffers/cache" line on CentOS 6; newer versions of free differ.
total_mem=$(free -m | awk 'NR==2{print $2}')
used_mem=$(free -m | awk 'NR==3{print $3}')   # memory the system has really used
free_mem=$(free -m | awk 'NR==3{print $4}')   # free + cache memory
# Integer percentage. This avoids values such as "08", which bash rejects as invalid octal.
use_per=$(( used_mem * 100 / total_mem ))
help() {
    echo "USAGE: $(basename "$0") -w <warning%> -c <critical%> [-h]"
    exit $STAT_UNKNOWN
}
while getopts ":w:c:h" opt
do
    case $opt in
        w) warning=$OPTARG
        ;;
        c) critical=$OPTARG
        ;;
        h) help
        ;;
        *) echo "error, please check the help. USAGE: ./$(basename "$0") -h"
           exit $STAT_UNKNOWN
        ;;
    esac
done
# Both thresholds are required.
if [[ -z $warning || -z $critical ]]; then
    help
fi
if [[ $use_per -lt $warning ]]; then
    echo "OK - total:$total_mem MB,used:$used_mem MB,free:$free_mem MB | total_mem=$total_mem used_mem=$used_mem free_mem=$free_mem"
    exit $STAT_OK
elif [[ $use_per -lt $critical ]]; then
    echo "WARNING - total:$total_mem MB,used:$used_mem MB,free:$free_mem MB | total_mem=$total_mem used_mem=$used_mem free_mem=$free_mem"
    exit $STAT_WARNING
else
    echo "CRITICAL - total:$total_mem MB,used:$used_mem MB,free:$free_mem MB | total_mem=$total_mem used_mem=$used_mem free_mem=$free_mem"
    exit $STAT_CRITICAL
fi
Save the file.
chown nagios.nagios check_mem
chmod +x check_mem
./check_mem -w 80 -c 90
OK - total:15926 MB,used:1839 MB,free:14086 MB | total_mem=15926 used_mem=1839 free_mem=14086
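Nagios decides the service state from the plugin's exit code: 0 for OK, 1 for WARNING, 2 for CRITICAL, 3 for UNKNOWN. The threshold logic of the plugin can be looked at in isolation, as in the sketch below. `classify_usage` is a hypothetical function written for illustration and is not part of the plugin itself.

```shell
# Hypothetical function: map a usage percentage to a Nagios state.
# Arguments: percent used, warning threshold, critical threshold (integers).
# Prints the state name and returns the matching plugin exit code.
classify_usage() {
    pct="$1"
    warn="$2"
    crit="$3"
    if [ "$pct" -lt "$warn" ]; then
        echo OK
        return 0
    elif [ "$pct" -lt "$crit" ]; then
        echo WARNING
        return 1
    else
        echo CRITICAL
        return 2
    fi
}

# Usage: 1839 MB used out of 15926 MB is 11 percent.
#   classify_usage 11 80 90
```

Running the real plugin followed by `echo $?` shows the exit code that Nagios would see.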
vim /usr/local/nagios/etc/nrpe.cfg
Add:
command[check_free_mem]=/usr/local/nagios/libexec/check_mem -w 80 -c 90
Restart NRPE.
Then edit the host's file under /usr/local/nagios/etc/objects/app/
and add:
define service{
use local-service,service-pnp
host_name cacti
service_description Check_free_mem
check_command check_nrpe!check_free_mem
contact_groups admins
flap_detection_enabled 0
}
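Check the Nagios configuration, then restart Nagios.
Monitoring Windows hosts and switches is not hard to configure; with a clear approach you will get there. Nagios configuration is not really difficult, just somewhat tedious. Once the relationships between the configuration files are understood, everything becomes simple.
That is everything.
Article title: Nagios Monitoring Setup and Configuration (Notes)
Shared URL: http://www.jxjierui.cn/article/pdhioo.html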

