Monitoring MySQL Database Servers with Cacti: Setup Procedure and Troubleshooting Summary

Reposted from: http://blog.itpub.net/26230597/viewspace-1172655/

Preface: for installing the Cacti server itself, see: http://blog.itpub.net/26230597/viewspace-1170579/

1. Install the MySQL templates on the Cacti server

wget https://mysql-cacti-templates.googlecode.com/files/better-cacti-templates-1.1.8.tar.gz

tar -xvf better-cacti-templates-1.1.8.tar.gz

cd better-cacti-templates-1.1.8

The script ss_get_mysql_stats.php, found under better-cacti-templates-1.1.8/scripts, must be placed on the Cacti server.

For example, if Cacti is deployed under /var/www/html, copy it to /var/www/html/cacti/scripts/:

cp /root/better-cacti-templates-1.1.8/scripts/ss_get_mysql_stats.php /var/www/html/cacti/scripts/

 

Edit ss_get_mysql_stats.php, around line 30:

$mysql_user = 'cacti_user';
$mysql_pass = 'cacti';
$cache_dir = "/xok.la/cacti/cache/";

Give the apache account ownership of the scripts directory and set its permissions:

chown -R apache:apache /var/www/html/cacti/scripts

chmod -R 755 /var/www/html/cacti/scripts
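
The copy and permission steps above can be wrapped in a small helper. This is only a sketch under stated assumptions: the function name install_poller_script is hypothetical and not part of the template package, and chown is left to the caller because it needs root.

```shell
# Hypothetical helper: copy a poller script into Cacti's scripts directory
# and set mode 755 on the copy, mirroring the manual steps above.
install_poller_script() {
    src=$1        # e.g. /root/better-cacti-templates-1.1.8/scripts/ss_get_mysql_stats.php
    dest_dir=$2   # e.g. /var/www/html/cacti/scripts
    if [ ! -f "$src" ]; then
        echo "missing source file: $src" >&2
        return 1
    fi
    if [ ! -d "$dest_dir" ]; then
        echo "missing destination directory: $dest_dir" >&2
        return 1
    fi
    cp "$src" "$dest_dir/" && chmod 755 "$dest_dir/$(basename "$src")"
}
```

Failing early on a missing source or destination avoids the silent mistake of copying to a mistyped path.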

Open the Cacti web interface over HTTP and import the host template:

/root/better-cacti-templates-1.1.8/templates/cacti_host_template_x_mysql_server_ht_0.8.6i-sver1.1.8.xml

 

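2. Create the MySQL account

On each monitored MySQL server, create an account for Cacti with the PROCESS, SUPER, and REPLICATION CLIENT privileges. The user name and password must match $mysql_user and $mysql_pass in ss_get_mysql_stats.php. The SQL is:

GRANT PROCESS, SUPER, REPLICATION CLIENT ON *.* TO 'cacti'@'%' IDENTIFIED BY '';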
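
After creating the account, it can be worth confirming that the three privileges really were granted. The sketch below reads the output of SHOW GRANTS on standard input; the function name has_monitor_privs is hypothetical, and the check is a plain text match rather than anything MySQL provides.

```shell
# Hypothetical check: read `SHOW GRANTS` output on stdin and confirm that the
# three privileges named above are all present.
# Limitation: an account holding ALL PRIVILEGES is reported as missing them,
# because this is only a substring match.
has_monitor_privs() {
    grants=$(cat)
    for priv in "PROCESS" "SUPER" "REPLICATION CLIENT"; do
        case $grants in
            *"$priv"*) ;;
            *) echo "missing privilege: $priv" >&2; return 1 ;;
        esac
    done
    return 0
}
```

A possible use, run from the Cacti server with the monitoring account: mysql -h <monitored-host> -u cacti -p -e 'SHOW GRANTS' | has_monitor_privs && echo ok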

 

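3. Add the host in Cacti

3.1 Click Create devices.

3.2 On the next page, click the Add button to add a host.

3.3 Enter a description and the hostname or IP address, then click the Create button at the bottom right.

3.4 The page reports an error.

When a monitored host is added in Cacti and the message "SNMP error" appears, work through the following checks:

(1) Make sure the Cacti host can ping the monitored host. If it cannot, check the network configuration and the IP settings of the monitored host.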

[root@squid-2 templates]# ping 10.xxx.3.xx

PING 10.254.3.72 (10.254.3.72) 56(84) bytes of data.

64 bytes from 10.xx.3.xx: icmp_seq=1 ttl=64 time=0.427 ms

64 bytes from 10.xx.3.xx: icmp_seq=2 ttl=64 time=0.389 ms

64 bytes from 10.xx.3.xx: icmp_seq=3 ttl=64 time=0.402 ms

64 bytes from 10.xx.3.xx: icmp_seq=4 ttl=64 time=0.415 ms

The ping succeeds, so this is not a network fault.

(2) Check whether the snmpd service is running on the monitored host:

[root@xxx ~]# ps -eaf|grep snmpd

root     4540 27133  0 17:15 pts/0    00:00:00 grep snmpd

[root@xxx ~]#

[root@xxx ~]# service snmpd start

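snmpd: unrecognized service

[root@xxx ~]#

The snmpd service is not installed on the monitored host, so it has to be installed with yum. The package is not called snmpd or snmp, as the attempts below show.

[root@xxx ~]# service snmpd restart

snmpd: unrecognized service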

[root@xxx ~]#

[root@db-m2-slave-1 ~]# yum -y install snmp

Loaded plugins: fastestmirror, security

Loading mirror speeds from cached hostfile

*base: mirror.neu.edu.cn

*extras: mirror.neu.edu.cn

*updates: mirror.neu.edu.cn

Setting up Install Process

No package snmp available.

Error: Nothing to do

No such package exists. Trying yum install -y net-snmp instead succeeds:

[root@xxx ~]# yum install -y net-snmp

Loaded plugins: fastestmirror, security

Loading mirror speeds from cached hostfile

*base: mirror.neu.edu.cn

*extras: mirror.neu.edu.cn

*updates: mirror.neu.edu.cn

Setting up Install Process

Resolving Dependencies

--> Running transaction check

---> Package net-snmp.x86_64 1:5.5-49.el6_5.1 will be installed

--> Processing Dependency: net-snmp-libs = 1:5.5-49.el6_5.1 for package: 1:net-snmp-5.5-49.el6_5.1.x86_64

--> Processing Dependency: libsensors.so.4()(64bit) for package: 1:net-snmp-5.5-49.el6_5.1.x86_64

--> Processing Dependency: libnetsnmptrapd.so.20()(64bit) for package: 1:net-snmp-5.5-49.el6_5.1.x86_64

--> Processing Dependency: libnetsnmpmibs.so.20()(64bit) for package: 1:net-snmp-5.5-49.el6_5.1.x86_64

[root@xxx ~]# service snmpd restart

Stopping snmpd:                                            [FAILED]

Starting snmpd:                                            [  OK  ]

[root@xxx ~]#

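You can also reload the configuration with service snmpd reload.

(3) At this point the host status in Cacti shows as recovering.

You can also run snmpwalk on the Cacti server to check:

snmpwalk -c public -v 2c 10.xxx.1.xx    # 10.xxx.1.xx is the IP address of the monitored host

If data comes back from the monitored machine, the SNMP configuration on the monitored host is complete and correct.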
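
The ping and snmpwalk checks above can be combined into one function that reports the first check that fails. This is a rough sketch: the function name check_monitored_host and its messages are made up for illustration, and walking the standard system group is just one convenient choice of subtree.

```shell
# Hypothetical wrapper around the checks above: ping first, then SNMP.
# Prints one line naming the first check that fails, or "ok".
check_monitored_host() {
    host=$1
    community=${2:-public}   # the community string used in this article
    if ! ping -c 2 "$host" >/dev/null 2>&1; then
        echo "unreachable: check network and IP settings"
        return 1
    fi
    # "system" is the standard MIB-II system group; any small subtree would do.
    if ! snmpwalk -c "$community" -v 2c "$host" system >/dev/null 2>&1; then
        echo "no SNMP answer: check that snmpd is installed, running, and allows this community"
        return 1
    fi
    echo "ok"
}
```

Run on the Cacti server, for example: check_monitored_host 10.xxx.1.xx public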

 

4. Add graphs for the monitored host

On the right side of the Console page, click the Create devices link.

Then click the link for the host name.

Then click the Create Graphs for this Host link at the top right.


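5. Add a host group

To add a tree under graphs, click console, choose Graph Trees in the left sidebar, and click the Add button on the right.

Enter a name for the tree, choose Natural Ordering as the sorting type, and click the Create button.

Then select the new graph tree and click the add button to add the database hosts to it.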

 

 

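Then click graphs at the very top, and the host group that was just created is shown.

Reference: http://blog.csdn.net/hw_libo/article/details/6881480

Preface: roughly 50 graphs for the MySQL servers monitored by Cacti are now set up and rendering. Several problems came up along the way; the more memorable ones are recorded below.

(I): Adding I/O monitoring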

 

Clicking Create Graphs for this Host to create the I/O graphs produced this error:

This data query returned 0 rows, perhaps there was a problem executing this data query. You can run this data query in debug mode to get more information.

After switching on *Turn On Graph Debug Mode, the following is shown:

RRDTool Command:

/usr/bin/rrdtool graph - \
--imgformat=PNG \
--start=-86400 \
--end=-300 \
--title='db-m2-slave-1 - Traffic' \
--rigid \
--base=1000 \
--height=120 \
--width=500 \
--alt-autoscale-max \
--lower-limit='0' \
--vertical-label='bits per second' \
--slope-mode \
--font TITLE:10: \
--font AXIS:7: \
--font LEGEND:8: \
--font UNIT:7: \
CDEF:cdefa='a,8,*' \
AREA:cdefa#00CF00FF:'Inbound'  \
GPRINT:cdefa:LAST:' Current\:%8.2lf %s'  \
GPRINT:cdefa:AVERAGE:'Average\:%8.2lf %s'  \
GPRINT:cdefa:MAX:'Maximum\:%8.2lf %s\n'  \
LINE1:cdefa#002A97FF:'Outbound'  \
GPRINT:cdefa:LAST:'Current\:%8.2lf %s'  \
GPRINT:cdefa:AVERAGE:'Average\:%8.2lf %s'  \
GPRINT:cdefa:MAX:'Maximum\:%8.2lf %s\n'

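RRDTool Says:

ERROR: invalid rpn expression in: a,8,*

The command above contains no DEF line that defines a, which is why the CDEF expression a,8,* is rejected: the data query returned no rows, so there is no data source to graph.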

 

 

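Edit /etc/snmp/snmpd.conf on the monitored Linux host:
Find:      com2sec notConfigUser  default       public
Change to: com2sec notConfigUser  all       public
Find:      access  notConfigGroup ""      any       noauth    exact  systemview none none
Change to: access  notConfigGroup ""      any       noauth    exact  all none none
Find: #view all    included  .1     80 and remove the leading # from that line.
Find: #view mib2   included  .iso.org.dod.internet.mgmt.mib-2 fc and remove the leading # from that line.
Restart snmpd: /etc/init.d/snmpd restart
(In net-snmp, default already matches any source address, so the com2sec change is probably optional; the access and view changes are what open up the full MIB tree.)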
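
The view and access edits above can also be scripted with sed. The following is a sketch only: the function name patch_snmpd_conf is hypothetical, it assumes the stock line layout quoted above, it leaves the com2sec line untouched, and it prints the patched file to standard output so the result can be reviewed before anything is overwritten.

```shell
# Hypothetical helper: print a patched copy of an snmpd.conf on stdout.
#  - uncomments the "#view all  included .1 80" line
#  - uncomments the "#view mib2 included ..." line
#  - switches the notConfigGroup access line from systemview to the "all" view
patch_snmpd_conf() {
    sed -e 's/^#[[:space:]]*\(view[[:space:]]\{1,\}all[[:space:]]\{1,\}included[[:space:]]\{1,\}\.1[[:space:]]\)/\1/' \
        -e 's/^#[[:space:]]*\(view[[:space:]]\{1,\}mib2[[:space:]]\)/\1/' \
        -e '/^access[[:space:]]\{1,\}notConfigGroup/s/systemview/all/' \
        "$1"
}
```

One way to use it: patch_snmpd_conf /etc/snmp/snmpd.conf > /tmp/snmpd.conf.new, compare the two files with diff, then copy the new file into place and restart snmpd.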

(II): No graphs appear after adding the MySQL host

[root@squid-2 test]# service httpd restart

Stopping httpd:                                            [  OK  ]

Starting httpd: httpd: Could not reliably determine the server's fully qualified domain name, using 127.0.0.1 for ServerName

                                                           [  OK  ]

1) Go to the configuration directory:

cd /etc/httpd/conf/

2) Edit httpd.conf, search for "#ServerName", and add ServerName localhost:80 below it:
[root@server conf]# ls
extra  httpd.conf  magic  mime.types  original
[root@server conf]# vi httpd.conf
#ServerName www.example.com:80
ServerName localhost:80
3) Restart Apache.
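
The same ServerName edit can be made without opening an editor. A minimal sketch, assuming it is acceptable to append the directive at the end of the file instead of directly under the commented example; the function name ensure_servername is hypothetical.

```shell
# Hypothetical helper: append "ServerName localhost:80" to an httpd.conf
# unless an uncommented ServerName directive is already present.
ensure_servername() {
    conf=$1
    if grep -q '^[[:space:]]*ServerName[[:space:]]' "$conf"; then
        echo "ServerName already set"
    else
        printf '%s\n' 'ServerName localhost:80' >> "$conf"
        echo "ServerName added"
    fi
}
```

Because the function checks before appending, running it twice does not duplicate the directive. For example: ensure_servername /etc/httpd/conf/httpd.conf && service httpd restart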

 

(III): Errors at startup

[root@squid-2 error]# tail -f /var/log/httpd/error_log

[Sat May 31 22:49:02 2014] [notice] caught SIGTERM, shutting down

[Sat May 31 22:49:02 2014] [notice] SELinux policy enabled; httpd running as context unconfined_u:system_r:httpd_t:s0

[Sat May 31 22:49:02 2014] [notice] suEXEC mechanism enabled (wrapper: /usr/sbin/suexec)

[Sat May 31 22:49:02 2014] [notice] Digest: generating secret for digest authentication ...

[Sat May 31 22:49:02 2014] [notice] Digest: done

[Sat May 31 22:49:02 2014] [notice] Apache/2.2.15 (Unix) DAV/2 PHP/5.3.3 configured -- resuming normal operations

Fix: disable SELinux and the firewall.

(IV): Error when graphing MySQL items

[Sat May 31 23:20:10 2014] [error] [client 192.168.171.71] PHP Fatal error:  Allowed memory size of 134217728 bytes exhausted (tried to allocate 523800 bytes) in /var/www/html/cacti/lib/adodb/adodb.inc.php on line 833

The cacti.sql file needs to be imported:

mysql -u root -p cacti < /var/www/html/cacti/cacti.sql

 

(V): SNMP - Interface Statistics error

Creating the SNMP - Interface Statistics graphs fails as follows:

Created graph: db-m2-slave-2 - Traffic - |query_ifName|

ERROR: no Data Source associated. Check Template

[root@squid-2 html]# snmpwalk -c public -v 2c 10.254.3.73 ifHCInOctets

IF-MIB::ifHCInOctets = No more variables left in this MIB View (It is past the end of the MIB tree)

[root@squid-2 html]#

[root@squid-2 html]# snmpwalk -c public -v 2c 10.254.3.73 if

IF-MIB::ifTable = No Such Object available on this agent at this OID

 

So snmpd.conf was modified again and snmpd restarted. Change
access   notConfigGroup ""       any       noauth     exact   systemview none none
to
access   notConfigGroup ""       any       noauth     exact   all     none none

[root@db-m2-slave-2 ~]# service snmpd restart

Stopping snmpd:                                            [  OK  ]

Starting snmpd:                                            [  OK  ]

[root@db-m2-slave-2 ~]#

[root@squid-2 html]# snmpwalk -c public -v 2c 10.254.3.73 if

IF-MIB::ifTable = No more variables left in this MIB View (It is past the end of the MIB tree)

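Fix

The access line now refers to a view named all, so that view has to be defined. In snmpd.conf, find the following lines:

##           incl/excl subtree                          mask

#view all    included  .1                               80

Remove the leading "#" from the view all line.

Restarting the snmpd service afterwards resolves the problem.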

 

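(VI): Removing broken graphs

Go to Console -> Graph Management -> Host (select the host with the errors) -> Search (search for the keyword from the error, here Used Space). The broken graphs are listed in the Graph Title column. Tick the select-all checkbox on the right and click the Go button to delete these invalid graphs.

On the confirmation page that follows, click the Continue button to delete them.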

 

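(VII): Memory Free value is nan

Analysis: Memory Free shows no data because the maximum configured for the data source in the RRD file is about 10 GB, and readings above the maximum are stored as unknown.

[root@squid-2 local]# find / -name '*mem*.rrd'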

/var/www/html/cacti/rra/db-m2-slave-1_mem_buffers_189.rrd

/var/www/html/cacti/rra/db-master-2_mem_free_156.rrd

/var/www/html/cacti/rra/db-m2-slave-1_lock_system_memory_20.rrd

/var/www/html/cacti/rra/db-m2-slave-2_total_mem_alloc_74.rrd

/var/www/html/cacti/rra/db-m2-slave-1_total_mem_alloc_23.rrd

/var/www/html/cacti/rra/db-m2-slave-2_lock_system_memory_71.rrd

/var/www/html/cacti/rra/localhost_mem_swap_4.rrd

/var/www/html/cacti/rra/db-master-2_total_mem_alloc_117.rrd

/var/www/html/cacti/rra/db-master-2_mem_cache_155.rrd

/var/www/html/cacti/rra/db-master-2_mem_buffers_154.rrd

/var/www/html/cacti/rra/db-m2-slave-1_mem_free_191.rrd

/var/www/html/cacti/rra/localhost_mem_buffers_3.rrd

/var/www/html/cacti/rra/db-m2-slave-2_mem_free_164.rrd

/var/www/html/cacti/rra/db-m2-slave-2_mem_buffers_162.rrd

/var/www/html/cacti/rra/db-m2-slave-1_mem_buffers_54.rrd

/var/www/html/cacti/rra/db-m2-slave-1_mem_swap_55.rrd

/var/www/html/cacti/rra/db-master-2_lock_system_memory_114.rrd

/var/www/html/cacti/rra/db-m2-slave-2_mem_cache_163.rrd

/var/www/html/cacti/rra/db-m2-slave-1_mem_cache_190.rrd

/var/www/html/cacti/rra/db-master-2_mem_free_146.rrd

[root@squid-2 local]#

[root@squid-2 local]# rrdtool info /var/www/html/cacti/rra/db-m2-slave-1_mem_free_191.rrd |grep mem_free

filename = "/var/www/html/cacti/rra/db-m2-slave-1_mem_free_191.rrd"

ds[mem_free].type = "GAUGE"

ds[mem_free].minimal_heartbeat = 120

ds[mem_free].min = 0.0000000000e+00

ds[mem_free].max = 1.0000000000e+07

ds[mem_free].last_ds = "34166500"

ds[mem_free].value = NaN

ds[mem_free].unknown_sec = 2

[root@squid-2 local]#

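Note: ds[mem_free].max = 1.0000000000e+07 caps the data source at 10,000,000, which is about 10 GB when the values are taken as KB. The last reading, 34166500, is above that cap, so the stored value is NaN.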
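
The comparison made by eye above, last_ds against max, can be automated for any RRD file. The filter below is a sketch that reads rrdtool info output on standard input; the function name rrd_over_max is hypothetical, and it relies only on the ds[...].max and ds[...].last_ds lines shown in the output above.

```shell
# Hypothetical filter: read `rrdtool info FILE` output on stdin and print
# "name last_ds max" for each data source whose last raw value exceeds its max.
# Data sources with no max configured (reported as NaN) are skipped.
rrd_over_max() {
    awk -F' = ' '
        /^ds\[[^]]*\]\.max = / && $2 != "NaN" {
            name = $1; sub(/^ds\[/, "", name); sub(/\]\.max$/, "", name)
            max[name] = $2 + 0
        }
        /^ds\[[^]]*\]\.last_ds = / {
            name = $1; sub(/^ds\[/, "", name); sub(/\]\.last_ds$/, "", name)
            v = $2; gsub(/"/, "", v)     # last_ds is printed as a quoted string
            last[name] = v + 0
        }
        END { for (n in last) if ((n in max) && last[n] > max[n]) print n, last[n], max[n] }
    '
}
```

For example: rrdtool info /var/www/html/cacti/rra/db-m2-slave-1_mem_free_191.rrd | rrd_over_max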

To see how rrdtool can change this, run it with --help:

[root@squid-2 local]# rrdtool --help

RRDtool 1.3.8  Copyright 1997-2009 by Tobias Oetiker <tobi@oetiker.ch>

Compiled Aug 21 2010 10:57:18

Usage: rrdtool [options] command command_options

Valid commands: create, update, updatev, graph, graphv,  dump, restore,

last, lastupdate, first, info, fetch, tune,

resize, xport

RRDtool is distributed under the Terms of the GNU General

Public License Version 2. (www.gnu.org/copyleft/gpl.html)

For more information read the RRD manpages

[root@squid-2 local]#

Use the tune command to change the maximum:

[root@squid-2 rra]# rrdtool tune *_mem_free_*.rrd mem_free:100000000

DS[mem_free] typ: GAUGE     hbt: 120   min: 0.0000      max: 10000000.0000

[root@squid-2 rra]#

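The command only prints the current settings, which shows that the tune did not take effect: the -a (--maximum) option was missing. The corrected commands are: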

[root@squid-2 rra]# rrdtool tune *_mem_cache_*.rrd -a mem_cache:3000000000

[root@squid-2 rra]# rrdtool tune *_mem_free_*.rrd -a mem_free:3000000000

[root@squid-2 rra]# rrdtool tune *_mem_buffers_*.rrd -a mem_buffers:3000000000

[root@squid-2 rra]#

After running these, the nan turned into numbers for only one host; the others did not change. The reason is that rrdtool tune * -a ... only takes effect on a single .rrd file, so the command has to be run again by hand for each of the remaining files.

To simplify this, here is a small shell script, saved as /root/rrdtool_increate_mem.sh:

#!/bin/sh
# Raise the max of the memory data sources in every matching RRD file,
# running rrdtool tune once per file.
cd /var/www/html/cacti/rra || exit 1
for f in *_mem_free_*.rrd; do
    rrdtool tune "$f" -a mem_free:300000000
done
for f in *_mem_cache_*.rrd; do
    rrdtool tune "$f" -a mem_cache:300000000
done
for f in *_mem_buffers_*.rrd; do
    rrdtool tune "$f" -a mem_buffers:300000000
done

Then simply run sh /root/rrdtool_increate_mem.sh.
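[Addendum]

To debug a Cacti graph, proceed as follows:

(1) Go to Console, then Graph Management, select the relevant Host, search for Memory, and pick the graph you want; here it is Memory Usage.

(2) Click the Memory Usage link, then click Debug mode at the top right.

(3) The debug page appears, where you can study the RRDTool Command to work out why the value is -nan.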

 

 

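(VIII): Traffic graphs on a dual-NIC host

em1 and em2 both point to the same IP address. Only em2 is actually in effect, yet the IP address is listed against em1 and not in the em2 row.

As a result the graph has no data; every value is -nan.

Check from the Cacti server: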

[root@squid-2 rra]# snmpwalk -v 2c -c public 10.254.3.72 IF-MIB::ifDescr

IF-MIB::ifDescr.1 = STRING: lo

IF-MIB::ifDescr.2 = STRING: em1

IF-MIB::ifDescr.3 = STRING: em2

IF-MIB::ifDescr.4 = STRING: em3

IF-MIB::ifDescr.5 = STRING: em4

[root@squid-2 rra]#

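All four NICs are listed, so the SNMP data itself is fine. After careful checking, the cause turned out to be the graph type drop-down: choose Interface - Traffic (bits/sec), not Interface - Traffic (bytes/sec).

Once Interface - Traffic (bits/sec) is selected, the graph shows data.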

 

(IX): InnoDB Active/Locked Transactions

RRDTool Says:

ERROR: opening '/var/www/html/cacti/rra/db-m1-slave-1_locked_transactions_215.rrd': No such file or directory

The cause is that the MySQL account Cacti uses had not been created on that MySQL server. Once the account is created, the problem is resolved.

 

(X): Tomcat - Connection Rate

RRDTool Says:

ERROR: invalid y-grid format

Go to Console -> Graph Templates -> Tomcat - Connection Rate -> Unit Grid Value (--unit/--y-grid).
The default value is 1; change it to 0.
