mysql日志主要包括:错误日志、查询日志、慢查询日志、二进制日志。本文将依次讲解这些日志如何配置、使用。

1. 错误日志


错误日志主要用于定位mysql启动和运行中的一些问题,错误日志是默认开启的,可以通过show global variables查看错误日志文件的位置:

mysql> show global variables like '%log_error%';

+---------------+--------------------------+
| Variable_name | Value                    |
+---------------+--------------------------+
| log_error     | /var/log/mysql/error.log |
+---------------+--------------------------+

1 row in set (0.00 sec)

也可以通过mysql配置文件my.cnf查看错误日志文件的位置:

# Error log - should be very few entries.
log_error = /var/log/mysql/error.log

常见的错误日志也包括存储引擎相关的信息,常见的格式如下:

140716 20:24:11 [Note] Plugin 'FEDERATED' is disabled.
140716 20:24:11 InnoDB: The InnoDB memory heap is disabled
140716 20:24:11 InnoDB: Mutexes and rw_locks use GCC atomic builtins
140716 20:24:11 InnoDB: Compressed tables use zlib 1.2.3.4
140716 20:24:11 InnoDB: Initializing buffer pool, size = 128.0M
140716 20:24:11 InnoDB: Completed initialization of buffer pool
140716 20:24:11 InnoDB: highest supported file format is Barracuda.
140716 20:24:11  InnoDB: Waiting for the background threads to start
140716 20:24:12 InnoDB: 5.5.37 started; log sequence number 20673572
140716 20:24:13 [Note] Server hostname (bind-address): '127.0.0.1'; port: 3306
140716 20:24:13 [Note]   - '127.0.0.1' resolves to '127.0.0.1';
140716 20:24:13 [Note] Server socket created on IP: '127.0.0.1'.
140716 20:24:13 [Note] Event Scheduler: Loaded 0 events
140716 20:24:13 [Note] /usr/sbin/mysqld: ready for connections.

2. 查询日志


查询日志用于记录所用的增删查改信息,由于在并发量大时会产生大量信息,所以默认是关闭的。可以通过show global variables查看查询日志的位置以及是否开启。


mysql> show global variables like '%general_log%';
+------------------+----------------------------+
| Variable_name    | Value                      |
+------------------+----------------------------+
| general_log      | OFF                        |
| general_log_file | /var/lib/mysql/mysql.log |
+------------------+----------------------------+

本例中查询日志是关闭的。我们可以在my.cnf中开启查询日志,开启后会影响mysql服务的性能,所以一般只用在开发环境中。

# Be aware that this log type is a performance killer.
# As of 5.1 you can enable the log at runtime!
general_log_file        = /var/log/mysql/mysql.log
general_log             = 1

查询日志包括了mysql执行的每一条sql信息以及会话信息,日志格式如下:

140716 20:33:46     9 Connect   user@localhost on domain
            9 Query select url,url_token from articles where url_md5 = 'c473c205d1ee72cecf2546d332abbbcd'
            9 Query select title,url from hot_articles limit 60 ,10
            9 Quit

3. 慢查询日志


慢查询日志应该是对调试程序最有用的日志了,可以通过慢查询日志找到哪些sql语句是性能瓶颈。一般情况下,正常的web应用时不会出现大量的慢查询日志的,因此强烈建议开启。可以通过show global variables查看慢查询日志的位置以及是否开启。

mysql> SHOW  GLOBAL VARIABLES LIKE '%slow%';               
+---------------------+-------------------------------+
| Variable_name       | Value                         |
+---------------------+-------------------------------+
| log_slow_queries    | ON                            |
| slow_launch_time    | 2                             |
| slow_query_log      | ON                            |
| slow_query_log_file | /var/log/mysql/mysql-slow.log |
+---------------------+-------------------------------+

慢查询默认记录超过10秒的查询语句,可以精确到毫秒:

mysql> SHOW  GLOBAL VARIABLES LIKE '%long_query_tim%';
+-----------------+-----------+
| Variable_name   | Value     |
+-----------------+-----------+
| long_query_time | 10.000000 |
+-----------------+-----------+

可以通过在my.cnf中设置慢查询相关选项,比如超时时间、记录无索引查询等:

# Here you can see queries with especially long duration
log_slow_queries    = /var/log/mysql/mysql-slow.log
long_query_time = 10
log-queries-not-using-indexes

常见的慢查询日志格式如下,其中包含用户、查询耗时、sql语句、结果集数量等信息:

# Time: 140716 20:47:59
# User@Host: user[user] @ localhost []
# Query_time: 12.00012  Lock_time: 0.000136 Rows_sent: 1  Rows_examined: 1
use test;
SET timestamp=1405514879;
select url,url_token from articles where url_md5 = '817563bd7ef4b2a476f1f55d0b558cd1';

4. 二进制日志


二进制日志也叫作变更日志,主要用于记录修改数据或有可能引起数据改变的mysql语句,可以在my.cnf中配置二进制日志相关参数,如文件路径、过期时间、文件大小等。也可以从二进制日志中导出sql,用于恢复数据库。

log_bin         = /var/log/mysql/mysql-bin.log
expire_logs_days    = 10
max_binlog_size         = 100M

使用show binary log查看mysql产生的bin log信息,包括文件名、文件大小(单位为字节)等。

mysql> show binary logs;
+------------------+-----------+
| Log_name         | File_size |
+------------------+-----------+
| mysql-bin.000001 |      6252 |
| mysql-bin.000002 |       295 |
| mysql-bin.000003 |       126 |
| mysql-bin.000004 |       107 |
+------------------+-----------+
4 rows in set (0.00 sec)

4.1 查看二进制日志

可以通过show master statusshow binlog events查询定位binlog信息:

#使用show master status查看下一条binlog的偏移起始位置:
mysql> show master status;
+------------------+----------+--------------+------------------+
| File             | Position | Binlog_Do_DB | Binlog_Ignore_DB |
+------------------+----------+--------------+------------------+
| mysql-bin.000004 |      870 |              |                  |
+------------------+----------+--------------+------------------+

#使用show binlog events查询某条binglog记录
#语法:show binlog events [int 'log_file'] [from position] [limit [offset,] row_count]
#一下是查看一个完整的事务执行binlog记录
mysql> show binlog events in 'mysql-bin.000004' from 870 limit 4\G;
*************************** 1. row ***************************
   Log_name: mysql-bin.000004
        Pos: 870
 Event_type: Query
  Server_id: 1
End_log_pos: 941
       Info: BEGIN
*************************** 2. row ***************************
   Log_name: mysql-bin.000004
        Pos: 941
 Event_type: Intvar
  Server_id: 1
End_log_pos: 969
       Info: INSERT_ID=277
*************************** 3. row ***************************
   Log_name: mysql-bin.000004
        Pos: 969
 Event_type: Query
  Server_id: 1
End_log_pos: 1247
       Info: use `user`; insert into test (title,url,url_md5,url_token,view_time) values('test','www.baidu.com','ae98f26d3b883f80b3eadb8709467607','1438524738','1405517003')
*************************** 4. row ***************************
   Log_name: mysql-bin.000004
        Pos: 1247
 Event_type: Xid
  Server_id: 1
End_log_pos: 1274
       Info: COMMIT /* xid=137 */
4 rows in set (0.00 sec)

4.2 使用二进制日志恢复数据

使用mysql的命令行工具,可以方便的将binlog中的sql语句导出,导出后的sql语句可以方便的用于恢复数据库。具体格式可以参考mysqlbinlog --help的输出,下面是两个最常用的例子:

#将mysql中的命令点870到命令点1274之间的sql语句导出到文件中
mysqlbinlog --start-position=870 --stop-position=1274 mysql-bin.000004  > /tmp/mysql_restore.sql 
#将mysql中的binlog中某段时间内的sql导出,可用于恢复一段时间的数据
mysqlbinlog --start-datetime="2012-07-16 00:00:00" --stop-datetime="2012-07-17 00:00:00" mysql-bin.000004 > /tmp/mysql_restore.sql 

weka是机器学习和数据挖掘库java库,提供有GUI界面版本和类库版本。weka是GNU通用公共许可证下发布的开源软件,这意味着我们在自己的项目自由的使用,并可以在开放源代码的情况下改进weka。在使用weka之前,我们首先要将自己的数据集转换为weka支持的格式。weka的数据格式请参考arff格式说明。学习weka库最好先有一下机器学习方面的基础知识,然后学习官方example。接下来简单说明如何使用weka库的kmeans算法。


训练数据集集

@relation race

@attribute height NUMERIC
@attribute weight NUMERIC
@attribute eyeColor NUMERIC
@attribute hairColor NUMERIC

@data
170,70,1,1
180,70,2,2
175,70,1,1
200,80,2,3
210,80,2,1
200,70,2,2
150,50,1,1
155,50,1,1
170,70,1,1
173,70,1,1
170,70,1,2

测试数据集集

@relation race

@attribute height NUMERIC
@attribute weight NUMERIC
@attribute eyeColor NUMERIC
@attribute hairColor NUMERIC

@data
170,71,1,1
200,80,2,2

1.代码

public class KmeansTest {
      public static void main(String[] args) throws Exception {
            File trainFile = new File("race_train.arff");
            File testFile  = new File("race_test.arff");
            ArffLoader atf = new ArffLoader();  
            // load data
            atf.setFile(trainFile);  
            Instances train = atf.getDataSet();
            atf.setFile(testFile);  
            Instances test = atf.getDataSet(); 
            // build clusterer
            SimpleKMeans clusterer = new SimpleKMeans();
            clusterer.setNumClusters(2);
            clusterer.buildClusterer(train);   
            System.out.println(clusterer.toString());
            // output predictions       
            for (int i = 0; i < test.numInstances(); i++) {
              int cluster = clusterer.clusterInstance(test.instance(i));
              double[] dist = clusterer.distributionForInstance(test.instance(i));
              System.out.print((i+1));
              System.out.print(" - ");
              System.out.print(cluster);
              System.out.print(" - ");
              System.out.print(Utils.arrayToString(dist));
              System.out.println();
            } 
          }
}

2.结果如下

//kMeans聚类结果信息如下,比较重要的是每个聚类的中心以及误差
kMeans
======

Number of iterations: 2
Within cluster sum of squared errors: 1.7463888888888888
Missing values globally replaced with mean/mode

Cluster centroids:
                         Cluster#
Attribute    Full Data          0          1
                  (11)        (4)        (7)
============================================
height        177.5455      197.5   166.1429
weight         68.1818         75    64.2857
eyeColor        1.3636          2          1
hairColor       1.4545          2     1.1429

//分类结果如下
1 - 1 - 0.0,1.0
2 - 0 - 1.0,0.0

更多资料请阅读weka官网