MySQL repair table live

hotcopy crash

2013-05-20 11:15:01,333 __main__ INFO     proceed with MySQLHotCopy /usr/bin/mysqlhotcopy  tmp_ligs_offline_db /var/dbbackup/mysqlhotcopy/dybdb1.ihep.ac.cn/tmp_ligs_offline_db/20130520_1115
340     DBD::mysql::db do failed: Table './tmp_ligs_offline_db/DqChannelStatus' is marked as crashed and should be repaired at /usr/bin/mysqlhotcopy line 467.

all queries fail for crashed table

mysql> select count(*) from DqChannelStatus   ;
ERROR 145 (HY000): Table './tmp_ligs_offline_db_0/DqChannelStatus' is marked as crashed and should be repaired

relevant INSERTs

mysql> select * from  tmp_ligs_offline_db_0.DqChannelStatusVld where SEQNO in (323575,340817,341125) ;
+--------+---------------------+---------------------+----------+---------+---------+------+-------------+---------------------+---------------------+
| SEQNO  | TIMESTART           | TIMEEND             | SITEMASK | SIMMASK | SUBSITE | TASK | AGGREGATENO | VERSIONDATE         | INSERTDATE          |
+--------+---------------------+---------------------+----------+---------+---------+------+-------------+---------------------+---------------------+
| 323575 | 2013-04-01 09:59:43 | 2013-04-01 10:12:13 |        2 |       1 |       2 |    0 |          -1 | 2013-04-01 09:59:43 | 2013-04-30 10:14:06 |   ## corrupted SEQNO
| 340817 | 2013-05-16 08:11:38 | 2013-05-16 08:24:05 |        2 |       1 |       1 |    0 |          -1 | 2013-05-16 08:11:38 | 2013-05-16 11:14:59 |   ## max SEQNO in payload table DqChannelStatus
| 341125 | 2013-05-11 10:26:58 | 2013-05-11 10:43:11 |        4 |       1 |       1 |    0 |          -1 | 2013-05-11 10:26:58 | 2013-05-19 22:26:55 |   ## max SEQNO in validity table DqChannelStatus
+--------+---------------------+---------------------+----------+---------+---------+------+-------------+---------------------+---------------------+
3 rows in set (0.00 sec)

Extraction of dybdb1.ihep.ac.cn tarball onto belle7 into tmp_ligs_offline_db_0

The tarball obtained by coldcopy on dybdb1 extracted onto belle7 without incident. The command creates the database tmp_ligs_offline_db_0.

  • the repeatable nature of the extraction means recovery efforts can proceed without excessive caution; a fresh copy can always be re-extracted
[root@belle7 ~]# mysqlhotcopy.py -t 20130522_1541 --node dybdb1.ihep.ac.cn --rename tmp_ligs_offline_db_0 --containerdir /var/lib/mysql --ALLOWEXTRACT  tmp_ligs_offline_db examine extract
2013-05-23 12:06:33,546 env.mysqlhotcopy.mysqlhotcopy INFO     /home/blyth/env/bin/mysqlhotcopy.py -t 20130522_1541 --node dybdb1.ihep.ac.cn --rename tmp_ligs_offline_db_0 --containerdir /var/lib/mysql --ALLOWEXTRACT tmp_ligs_offline_db examine extract
2013-05-23 12:06:33,546 env.mysqlhotcopy.mysqlhotcopy INFO     backupdir /var/dbbackup/mysqlhotcopy/dybdb1.ihep.ac.cn/tmp_ligs_offline_db
2013-05-23 12:06:33,561 env.mysqlhotcopy.mysqlhotcopy INFO     failed to instanciate connection to database tmp_ligs_offline_db with exception Error 1049: Unknown database 'tmp_ligs_offline_db'
2013-05-23 12:06:33,561 env.mysqlhotcopy.mysqlhotcopy INFO     ================================== examine
2013-05-23 12:06:33,562 env.mysqlhotcopy.tar INFO     examining /var/dbbackup/mysqlhotcopy/dybdb1.ihep.ac.cn/tmp_ligs_offline_db/20130522_1541.tar.gz
2013-05-23 12:07:08,913 env.mysqlhotcopy.tar INFO     archive contains 7 items with commonprefix "" flattop True
2013-05-23 12:07:08,913 env.mysqlhotcopy.mysqlhotcopy INFO     seconds {'_examine': 35.351444005966187, 'examine': 35.35143518447876}
2013-05-23 12:07:08,913 env.mysqlhotcopy.mysqlhotcopy INFO     ================================== extract
2013-05-23 12:07:08,914 env.mysqlhotcopy.mysqlhotcopy WARNING  no valid db connection using static opts.mb_required 2000
2013-05-23 12:07:08,914 env.mysqlhotcopy.mysqlhotcopy INFO     sufficient free space,      required 2000 MB less than    free 494499.882812 MB
DO YOU REALLY WANT TO extract Tar /var/dbbackup/mysqlhotcopy/dybdb1.ihep.ac.cn/tmp_ligs_offline_db/20130522_1541.tar.gz tmp_ligs_offline_db gz  into containerdir /var/lib/mysql    ? ENTER "YES" TO PROCEED : YES
2013-05-23 12:07:48,589 env.mysqlhotcopy.mysqlhotcopy INFO     proceeding
2013-05-23 12:07:48,589 env.mysqlhotcopy.mysqlhotcopy INFO     extract Tar /var/dbbackup/mysqlhotcopy/dybdb1.ihep.ac.cn/tmp_ligs_offline_db/20130522_1541.tar.gz tmp_ligs_offline_db gz  into containerdir /var/lib/mysql
2013-05-23 12:07:48,589 env.mysqlhotcopy.tar INFO     _flat_extract opening tarfile /var/dbbackup/mysqlhotcopy/dybdb1.ihep.ac.cn/tmp_ligs_offline_db/20130522_1541.tar.gz
2013-05-23 12:08:23,906 env.mysqlhotcopy.tar INFO     _flat_extract into target /var/lib/mysql/tmp_ligs_offline_db_0 for 7 members with toplevelname tmp_ligs_offline_db_0
2013-05-23 12:09:06,346 env.mysqlhotcopy.tar INFO     total 2429412
-rw-rw---- 1 mysql mysql       8746 Feb  4 16:07 DqChannelStatus.frm
-rw-rw---- 1 mysql mysql 1439608104 May 16 19:15 DqChannelStatus.MYD
-rw-rw---- 1 mysql mysql 1024402432 May 16 19:42 DqChannelStatus.MYI
-rw-rw---- 1 mysql mysql       8908 May 13 13:16 DqChannelStatusVld.frm
-rw-rw---- 1 mysql mysql   17397375 May 20 06:26 DqChannelStatusVld.MYD
-rw-rw---- 1 mysql mysql    3826688 May 20 06:26 DqChannelStatusVld.MYI

2013-05-23 12:09:06,347 env.mysqlhotcopy.mysqlhotcopy INFO     seconds {'_examine': 35.351444005966187, 'examine': 35.35143518447876, 'extract': 77.757769107818604, '_extract': 117.43390297889709}
[root@belle7 ~]#

Repairing crashed DqChannelStatus table in tmp_ligs_offline_db_0

  1. the crashed state was propagated along with the table files, as expected
mysql> use tmp_ligs_offline_db_0
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Database changed
mysql> show tables ;
+---------------------------------+
| Tables_in_tmp_ligs_offline_db_0 |
+---------------------------------+
| DqChannelStatus                 |
| DqChannelStatusVld              |
+---------------------------------+
2 rows in set (0.00 sec)

mysql> select count(*) from DqChannelStatusVld   ;
+----------+
| count(*) |
+----------+
|   341125 |
+----------+
1 row in set (0.00 sec)

mysql> select count(*) from DqChannelStatus   ;
ERROR 145 (HY000): Table './tmp_ligs_offline_db_0/DqChannelStatus' is marked as crashed and should be repaired
mysql>
mysql>
mysql> check table  DqChannelStatus ;
+---------------------------------------+-------+----------+-----------------------------------------------------------+
| Table                                 | Op    | Msg_type | Msg_text                                                  |
+---------------------------------------+-------+----------+-----------------------------------------------------------+
| tmp_ligs_offline_db_0.DqChannelStatus | check | warning  | Table is marked as crashed                                |
| tmp_ligs_offline_db_0.DqChannelStatus | check | warning  | 3 clients are using or haven't closed the table properly  |
| tmp_ligs_offline_db_0.DqChannelStatus | check | error    | Record-count is not ok; is 65436731   Should be: 65436732 |
| tmp_ligs_offline_db_0.DqChannelStatus | check | warning  | Found 22 deleted space.   Should be 0                     |
| tmp_ligs_offline_db_0.DqChannelStatus | check | warning  | Found 1 deleted blocks       Should be: 0                 |
| tmp_ligs_offline_db_0.DqChannelStatus | check | error    | Corrupt                                                   |
+---------------------------------------+-------+----------+-----------------------------------------------------------+
6 rows in set (25.21 sec)

Using LOCAL (an alias for NO_WRITE_TO_BINLOG) prevents the repair from being written to the binary log, and hence from being replicated, if the server were in a replication chain:

mysql> repair local table  DqChannelStatus ;
+---------------------------------------+--------+----------+--------------------------------------------------+
| Table                                 | Op     | Msg_type | Msg_text                                         |
+---------------------------------------+--------+----------+--------------------------------------------------+
| tmp_ligs_offline_db_0.DqChannelStatus | repair | warning  | Number of rows changed from 65436732 to 65436731 |
| tmp_ligs_offline_db_0.DqChannelStatus | repair | status   | OK                                               |
+---------------------------------------+--------+----------+--------------------------------------------------+
2 rows in set (3 min 34.62 sec)

Wouldn't skipping the repair from replication cause master/slave divergence? Good thing this table is excluded from replication.
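The check-then-repair decision above can be automated by inspecting the CHECK TABLE result rows. A minimal sketch, assuming the rows arrive as (Table, Op, Msg_type, Msg_text) tuples from a DB-API cursor; `needs_repair` is a hypothetical helper, not part of any existing tool, and the connection handling is omitted:

```python
def needs_repair(check_rows):
    """Return True if any CHECK TABLE row reports an error.

    check_rows: iterable of (table, op, msg_type, msg_text) tuples,
    as returned by a DB-API cursor after "CHECK TABLE t".
    Warnings alone (e.g. unclosed clients) do not trigger a repair.
    """
    return any(msg_type == "error" for _, _, msg_type, _ in check_rows)

# Abbreviated rows from the CHECK TABLE output above:
rows = [
    ("tmp_ligs_offline_db_0.DqChannelStatus", "check", "warning",
     "Table is marked as crashed"),
    ("tmp_ligs_offline_db_0.DqChannelStatus", "check", "error",
     "Record-count is not ok; is 65436731   Should be: 65436732"),
    ("tmp_ligs_offline_db_0.DqChannelStatus", "check", "error", "Corrupt"),
]
assert needs_repair(rows)   # so issue: REPAIR LOCAL TABLE DqChannelStatus
```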

DqChannelStatus health checks

mysql> select count(*) from  DqChannelStatus ;
+----------+
| count(*) |
+----------+
| 65436731 |
+----------+
1 row in set (0.06 sec)
mysql> select max(SEQNO) from DqChannelStatus ;
+------------+
| max(SEQNO) |
+------------+
|     340817 |
+------------+
1 row in set (0.00 sec)


mysql> select min(SEQNO),max(SEQNO),min(ROW_COUNTER),max(ROW_COUNTER) from DqChannelStatus ;
+------------+------------+------------------+------------------+
| min(SEQNO) | max(SEQNO) | min(ROW_COUNTER) | max(ROW_COUNTER) |
+------------+------------+------------------+------------------+
|          1 |     340817 |                0 |              192 |
+------------+------------+------------------+------------------+
1 row in set (26.50 sec)
mysql> select ROW_COUNTER, count(*) as N from DqChannelStatus group by ROW_COUNTER ;
+-------------+--------+
| ROW_COUNTER | N      |
+-------------+--------+
|           0 |      1 |
|           1 | 340817 |
|           2 | 340817 |
|           3 | 340817 |
|           4 | 340817 |
...
|          52 | 340817 |
|          53 | 340817 |
|          54 | 340817 |
|          55 | 340817 |
|          56 | 340817 |
|          57 | 340817 |
|          58 | 340817 |      #  transition
|          59 | 340816 |      #  from single SEQNO partial payload
|          60 | 340816 |
|          61 | 340816 |
|          62 | 340816 |
|          63 | 340816 |
|          64 | 340816 |
|          65 | 340816 |
...
|         188 | 340816 |
|         189 | 340816 |
|         190 | 340816 |
|         191 | 340816 |
|         192 | 340816 |
+-------------+--------+
193 rows in set (44.89 sec)


mysql> /* excluding the bad SEQNO get back to regular structure */

mysql>  select ROW_COUNTER, count(*) as N from DqChannelStatus where SEQNO != 323575 group by ROW_COUNTER ;
+-------------+--------+
| ROW_COUNTER | N      |
+-------------+--------+
|           1 | 340816 |
|           2 | 340816 |
|           3 | 340816 |
...
|         190 | 340816 |
|         191 | 340816 |
|         192 | 340816 |
+-------------+--------+
192 rows in set (47.06 sec)
mysql> select * from DqChannelStatus where ROW_COUNTER=0 ;
+--------+-------------+-------+--------+-----------+--------+
| SEQNO  | ROW_COUNTER | RUNNO | FILENO | CHANNELID | STATUS |
+--------+-------------+-------+--------+-----------+--------+
| 323575 |           0 |     0 |      0 |         0 |      0 |
+--------+-------------+-------+--------+-----------+--------+
1 row in set (20.37 sec)
mysql> select SEQNO, count(*) as N from DqChannelStatus group by SEQNO having N != 192 ;
+--------+----+
| SEQNO  | N  |
+--------+----+
| 323575 | 59 |
+--------+----+
1 row in set (25.72 sec)
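The ROW_COUNTER distribution fully accounts for the repaired row count. A quick arithmetic check, using only numbers taken from the transcripts above:

```python
# Every good SEQNO carries 192 channel rows (ROW_COUNTER 1..192);
# the corrupted SEQNO 323575 stops at ROW_COUNTER 58 and additionally
# carries a spurious all-zero ROW_COUNTER=0 row.
GOOD_SEQNO = 340817 - 1      # all SEQNO up to max 340817, except 323575
PARTIAL_ROWS = 59            # ROW_COUNTER 0..58 for SEQNO 323575

total = GOOD_SEQNO * 192 + PARTIAL_ROWS
assert total == 65436731     # matches select count(*) after the repair
```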


mysql> select * from  DqChannelStatus where SEQNO = 323575 ;
+--------+-------------+-------+--------+-----------+--------+
| SEQNO  | ROW_COUNTER | RUNNO | FILENO | CHANNELID | STATUS |
+--------+-------------+-------+--------+-----------+--------+
| 323575 |           0 |     0 |      0 |         0 |      0 |
| 323575 |           1 | 38347 |     43 |  33687041 |      1 |
| 323575 |           2 | 38347 |     43 |  33687042 |      1 |
| 323575 |           3 | 38347 |     43 |  33687043 |      1 |
| 323575 |           4 | 38347 |     43 |  33687044 |      1 |
| 323575 |           5 | 38347 |     43 |  33687045 |      1 |
| 323575 |           6 | 38347 |     43 |  33687046 |      1 |
...
| 323575 |          52 | 38347 |     43 |  33687812 |      1 |
| 323575 |          53 | 38347 |     43 |  33687813 |      1 |
| 323575 |          54 | 38347 |     43 |  33687814 |      1 |
| 323575 |          55 | 38347 |     43 |  33687815 |      1 |
| 323575 |          56 | 38347 |     43 |  33687816 |      1 |
| 323575 |          57 | 38347 |     43 |  33687817 |      1 |
| 323575 |          58 | 38347 |     43 |  33687818 |      1 |
+--------+-------------+-------+--------+-----------+--------+
59 rows in set (0.00 sec)

Make mysqldump with bad SEQNO excluded

  • note that no locks are applied, but this is acceptable since the table is no longer being actively written
[blyth@belle7 DybPython]$ dbdumpload.py tmp_ligs_offline_db_0 dump ~/tmp_ligs_offline_db_0.DqChannelStatus.sql --where 'SEQNO != 323575' --tables 'DqChannelStatus DqChannelStatusVld'         ## check the dump  command
[blyth@belle7 DybPython]$ dbdumpload.py tmp_ligs_offline_db_0 dump ~/tmp_ligs_offline_db_0.DqChannelStatus.sql --where 'SEQNO != 323575' --tables 'DqChannelStatus DqChannelStatusVld' | sh    ## do it

Surprisingly, a mysqldump producing 2.1 GB of SQL is very quick:

[blyth@belle7 DybPython]$ dbdumpload.py tmp_ligs_offline_db_0 dump ~/tmp_ligs_offline_db_0.DqChannelStatus.sql --where 'SEQNO != 323575' --tables 'DqChannelStatus DqChannelStatusVld' | sh

real    1m36.505s
user    1m14.353s
sys     0m6.705s
[blyth@belle7 DybPython]$
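dbdumpload.py wraps mysqldump; the equivalent raw invocation can be sketched as below. `dump_command` is a hypothetical helper and the exact flags dbdumpload.py passes are assumed, but `--where` and `--no-create-db` are standard mysqldump options:

```python
import shlex

def dump_command(db, tables, where, outfile):
    """Build a mysqldump command line restricting every dumped table
    with a --where clause, here used to exclude the bad SEQNO."""
    cmd = ["mysqldump", "--no-create-db", "--where=%s" % where, db] + tables
    return " ".join(shlex.quote(a) for a in cmd) + " > " + shlex.quote(outfile)

cmd = dump_command("tmp_ligs_offline_db_0",
                   ["DqChannelStatus", "DqChannelStatusVld"],
                   "SEQNO != 323575",
                   "/home/blyth/tmp_ligs_offline_db_0.DqChannelStatus.sql")
print(cmd)
```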

Inspecting the dump file

[blyth@belle7 DybPython]$ du -h  ~/tmp_ligs_offline_db_0.DqChannelStatus.sql
2.1G    /home/blyth/tmp_ligs_offline_db_0.DqChannelStatus.sql
[blyth@belle7 DybPython]$ grep CREATE  ~/tmp_ligs_offline_db_0.DqChannelStatus.sql
CREATE TABLE `DqChannelStatus` (
CREATE TABLE `DqChannelStatusVld` (
[blyth@belle7 DybPython]$ grep DROP  ~/tmp_ligs_offline_db_0.DqChannelStatus.sql
[blyth@belle7 DybPython]$
[blyth@belle7 DybPython]$ head -c 2000 ~/tmp_ligs_offline_db_0.DqChannelStatus.sql    ## looked OK,
[blyth@belle7 DybPython]$ tail -c 2000 ~/tmp_ligs_offline_db_0.DqChannelStatus.sql    ## no truncation

Digest, compress, publish, test url and digest

[blyth@belle7 ~]$ md5sum tmp_ligs_offline_db_0.DqChannelStatus.sql
46b747d88ad74caa4b1d21be600265a4  tmp_ligs_offline_db_0.DqChannelStatus.sql
[blyth@belle7 ~]$ gzip -c tmp_ligs_offline_db_0.DqChannelStatus.sql > tmp_ligs_offline_db_0.DqChannelStatus.sql.gz
[blyth@belle7 ~]$ du -hs tmp_ligs_offline_db_0.DqChannelStatus.sql*
2.1G    tmp_ligs_offline_db_0.DqChannelStatus.sql
335M    tmp_ligs_offline_db_0.DqChannelStatus.sql.gz
[blyth@belle7 ~]$ sudo mv tmp_ligs_offline_db_0.DqChannelStatus.sql.gz $(nginx-htdocs)/data/
[blyth@belle7 ~]$ cd /tmp
[blyth@belle7 tmp]$ curl -O http://belle7.nuu.edu.tw/data/tmp_ligs_offline_db_0.DqChannelStatus.sql.gz
[blyth@belle7 tmp]$ du -h tmp_ligs_offline_db_0.DqChannelStatus.sql.gz
335M    tmp_ligs_offline_db_0.DqChannelStatus.sql.gz
[blyth@belle7 tmp]$ gunzip tmp_ligs_offline_db_0.DqChannelStatus.sql.gz
[blyth@belle7 tmp]$ md5sum tmp_ligs_offline_db_0.DqChannelStatus.sql
46b747d88ad74caa4b1d21be600265a4  tmp_ligs_offline_db_0.DqChannelStatus.sql
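The digest-and-verify step can be scripted; a sketch of a hypothetical `md5_of` helper equivalent to the md5sum commands above, reading in chunks so multi-GB dump files are digested without loading them into memory:

```python
import hashlib

def md5_of(path, chunk=1 << 20):
    """Chunked md5 hex digest of a file, suitable for multi-GB dumps."""
    h = hashlib.md5()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(chunk), b""):
            h.update(block)
    return h.hexdigest()

# After curl + gunzip, md5_of(downloaded_path) must match the digest
# published alongside the dump (46b747d88ad74caa4b1d21be600265a4 above).
```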

Features of the dump tmp_ligs_offline_db_0.DqChannelStatus.sql

  1. bad SEQNO 323575 is excluded
  2. 308 SEQNO > 340817 are validity only, namely 340818:341125

Recreate tables from the dump into tmp_ligs_offline_db_1

[blyth@belle7 ~]$ echo create database tmp_ligs_offline_db_1 | mysql
[blyth@belle7 ~]$ cat ~/tmp_ligs_offline_db_0.DqChannelStatus.sql |  mysql  tmp_ligs_offline_db_1     ## taking much longer to load than to dump, lunchtime
  • it looks like the Vld table continued to be written to even after the payload table had crashed
mysql> show tables ;
+---------------------------------+
| Tables_in_tmp_ligs_offline_db_1 |
+---------------------------------+
| DqChannelStatus                 |
| DqChannelStatusVld              |
+---------------------------------+
2 rows in set (0.00 sec)

mysql> select count(*) from DqChannelStatus  ;
+----------+
| count(*) |
+----------+
| 65436672 |
+----------+
1 row in set (0.00 sec)

mysql> select count(*) from DqChannelStatusVld  ;
+----------+
| count(*) |
+----------+
|   341124 |
+----------+
1 row in set (0.00 sec)

mysql> select min(SEQNO),max(SEQNO),max(SEQNO)-min(SEQNO)+1, count(*) as N  from DqChannelStatusVld ;
+------------+------------+-------------------------+--------+
| min(SEQNO) | max(SEQNO) | max(SEQNO)-min(SEQNO)+1 | N      |
+------------+------------+-------------------------+--------+
|          1 |     341125 |                  341125 | 341124 |
+------------+------------+-------------------------+--------+
1 row in set (0.00 sec)

mysql> select min(SEQNO),max(SEQNO),max(SEQNO)-min(SEQNO)+1, count(*) as N  from DqChannelStatus ;
+------------+------------+-------------------------+----------+
| min(SEQNO) | max(SEQNO) | max(SEQNO)-min(SEQNO)+1 | N        |
+------------+------------+-------------------------+----------+
|          1 |     340817 |                  340817 | 65436672 |
+------------+------------+-------------------------+----------+
1 row in set (0.01 sec)

mysql> select 341125 -  340817 ;   /* huh 308 more validity SEQNO than payload SEQNO : DBI is not crashed payload table savvy   */
+------------------+
| 341125 -  340817 |
+------------------+
|              308 |
+------------------+
1 row in set (0.03 sec)

Compare the repaired with the recreated from dump

tmp_ligs_offline_db_0
DB in which DqChannelStatus was repaired
tmp_ligs_offline_db_1
freshly created DB populated via the mysqldump obtained from _0 with the bad SEQNO excluded
  1. the SEQNO counts indicate that the validity table continued to be updated even after the payload table had crashed
mysql> select min(SEQNO),max(SEQNO),max(SEQNO)-min(SEQNO)+1, count(*) as N  from tmp_ligs_offline_db_0.DqChannelStatusVld ;
+------------+------------+-------------------------+--------+
| min(SEQNO) | max(SEQNO) | max(SEQNO)-min(SEQNO)+1 | N      |
+------------+------------+-------------------------+--------+
|          1 |     341125 |                  341125 | 341125 |
+------------+------------+-------------------------+--------+
1 row in set (0.04 sec)

mysql> select min(SEQNO),max(SEQNO),max(SEQNO)-min(SEQNO)+1, count(*) as N  from tmp_ligs_offline_db_1.DqChannelStatusVld ;
+------------+------------+-------------------------+--------+
| min(SEQNO) | max(SEQNO) | max(SEQNO)-min(SEQNO)+1 | N      |
+------------+------------+-------------------------+--------+
|          1 |     341125 |                  341125 | 341124 |    /* expected difference of 1 due to the skipped bad SEQNO */
+------------+------------+-------------------------+--------+
1 row in set (0.00 sec)

mysql> select min(SEQNO),max(SEQNO),max(SEQNO)-min(SEQNO)+1, count(*) as N  from tmp_ligs_offline_db_0.DqChannelStatus ;
+------------+------------+-------------------------+----------+
| min(SEQNO) | max(SEQNO) | max(SEQNO)-min(SEQNO)+1 | N        |
+------------+------------+-------------------------+----------+
|          1 |     340817 |                  340817 | 65436731 |
+------------+------------+-------------------------+----------+
1 row in set (0.05 sec)

mysql> select min(SEQNO),max(SEQNO),max(SEQNO)-min(SEQNO)+1, count(*) as N  from tmp_ligs_offline_db_1.DqChannelStatus ;
+------------+------------+-------------------------+----------+
| min(SEQNO) | max(SEQNO) | max(SEQNO)-min(SEQNO)+1 | N        |
+------------+------------+-------------------------+----------+
|          1 |     340817 |                  340817 | 65436672 |
+------------+------------+-------------------------+----------+
1 row in set (0.00 sec)

mysql> select 65436731 -  65436672,  341125 -  340817 ;    /* the expected 59 more payloads, 308 more vld */
+----------------------+------------------+
| 65436731 -  65436672 | 341125 -  340817 |
+----------------------+------------------+
|                   59 |              308 |
+----------------------+------------------+
1 row in set (0.00 sec)

Validity/Payload divergence

  • 2 to 3 days of validity-only writes
mysql> select * from tmp_ligs_offline_db_0.DqChannelStatusVld where SEQNO in (340817,341125) ;
+--------+---------------------+---------------------+----------+---------+---------+------+-------------+---------------------+---------------------+
| SEQNO  | TIMESTART           | TIMEEND             | SITEMASK | SIMMASK | SUBSITE | TASK | AGGREGATENO | VERSIONDATE         | INSERTDATE          |
+--------+---------------------+---------------------+----------+---------+---------+------+-------------+---------------------+---------------------+
| 340817 | 2013-05-16 08:11:38 | 2013-05-16 08:24:05 |        2 |       1 |       1 |    0 |          -1 | 2013-05-16 08:11:38 | 2013-05-16 11:14:59 |
| 341125 | 2013-05-11 10:26:58 | 2013-05-11 10:43:11 |        4 |       1 |       1 |    0 |          -1 | 2013-05-11 10:26:58 | 2013-05-19 22:26:55 |
+--------+---------------------+---------------------+----------+---------+---------+------+-------------+---------------------+---------------------+
2 rows in set (0.03 sec)

mysql> select * from tmp_ligs_offline_db_1.DqChannelStatusVld where SEQNO in (340817,341125) ;
+--------+---------------------+---------------------+----------+---------+---------+------+-------------+---------------------+---------------------+
| SEQNO  | TIMESTART           | TIMEEND             | SITEMASK | SIMMASK | SUBSITE | TASK | AGGREGATENO | VERSIONDATE         | INSERTDATE          |
+--------+---------------------+---------------------+----------+---------+---------+------+-------------+---------------------+---------------------+
| 340817 | 2013-05-16 08:11:38 | 2013-05-16 08:24:05 |        2 |       1 |       1 |    0 |          -1 | 2013-05-16 08:11:38 | 2013-05-16 11:14:59 |
| 341125 | 2013-05-11 10:26:58 | 2013-05-11 10:43:11 |        4 |       1 |       1 |    0 |          -1 | 2013-05-11 10:26:58 | 2013-05-19 22:26:55 |
+--------+---------------------+---------------------+----------+---------+---------+------+-------------+---------------------+---------------------+
2 rows in set (0.00 sec)

Validity-only writes, 308 SEQNO

Somehow DBI continued to write into the validity table between 2013-05-16 and 2013-05-19, despite the payload table being crashed and unwritable.

mysql> select * from  tmp_ligs_offline_db_0.DqChannelStatusVld where INSERTDATE > '2013-05-16 10:30:00' ;
+--------+---------------------+---------------------+----------+---------+---------+------+-------------+---------------------+---------------------+
| SEQNO  | TIMESTART           | TIMEEND             | SITEMASK | SIMMASK | SUBSITE | TASK | AGGREGATENO | VERSIONDATE         | INSERTDATE          |
+--------+---------------------+---------------------+----------+---------+---------+------+-------------+---------------------+---------------------+
| 340808 | 2013-05-16 08:09:49 | 2013-05-16 08:19:41 |        1 |       1 |       2 |    0 |          -1 | 2013-05-16 08:09:49 | 2013-05-16 10:30:35 |
| 340809 | 2013-05-16 08:09:49 | 2013-05-16 08:19:41 |        1 |       1 |       1 |    0 |          -1 | 2013-05-16 08:09:49 | 2013-05-16 10:30:37 |
| 340810 | 2013-05-16 07:59:53 | 2013-05-16 08:09:49 |        1 |       1 |       2 |    0 |          -1 | 2013-05-16 07:59:53 | 2013-05-16 10:41:41 |
| 340811 | 2013-05-16 07:59:53 | 2013-05-16 08:09:49 |        1 |       1 |       1 |    0 |          -1 | 2013-05-16 07:59:53 | 2013-05-16 10:41:43 |
| 340812 | 2013-05-16 07:53:39 | 2013-05-16 08:09:57 |        4 |       1 |       4 |    0 |          -1 | 2013-05-16 07:53:39 | 2013-05-16 10:48:29 |
| 340813 | 2013-05-16 07:53:39 | 2013-05-16 08:09:57 |        4 |       1 |       2 |    0 |          -1 | 2013-05-16 07:53:39 | 2013-05-16 10:48:31 |
| 340814 | 2013-05-16 07:53:39 | 2013-05-16 08:09:57 |        4 |       1 |       3 |    0 |          -1 | 2013-05-16 07:53:39 | 2013-05-16 10:48:32 |
| 340815 | 2013-05-16 07:53:39 | 2013-05-16 08:09:57 |        4 |       1 |       1 |    0 |          -1 | 2013-05-16 07:53:39 | 2013-05-16 10:48:35 |
| 340816 | 2013-05-16 08:11:38 | 2013-05-16 08:24:05 |        2 |       1 |       2 |    0 |          -1 | 2013-05-16 08:11:38 | 2013-05-16 11:14:58 |
| 340817 | 2013-05-16 08:11:38 | 2013-05-16 08:24:05 |        2 |       1 |       1 |    0 |          -1 | 2013-05-16 08:11:38 | 2013-05-16 11:14:59 |
| 340818 | 2013-05-03 03:38:35 | 2013-05-03 03:38:51 |        2 |       1 |       2 |    0 |          -1 | 2013-05-03 03:38:35 | 2013-05-19 08:22:20 |   <<< validity only SEQNO begin
| 340819 | 2013-05-03 03:38:35 | 2013-05-03 03:38:51 |        2 |       1 |       1 |    0 |          -1 | 2013-05-03 03:38:35 | 2013-05-19 08:22:21 |
| 340820 | 2013-05-08 23:49:10 | 2013-05-08 23:49:28 |        4 |       1 |       4 |    0 |          -1 | 2013-05-08 23:49:10 | 2013-05-19 08:24:37 |
| 340821 | 2013-05-08 23:49:10 | 2013-05-08 23:49:28 |        4 |       1 |       2 |    0 |          -1 | 2013-05-08 23:49:10 | 2013-05-19 08:24:39 |
| 340822 | 2013-05-08 23:49:10 | 2013-05-08 23:49:28 |        4 |       1 |       3 |    0 |          -1 | 2013-05-08 23:49:10 | 2013-05-19 08:24:40 |
| 340823 | 2013-05-08 23:49:10 | 2013-05-08 23:49:28 |        4 |       1 |       1 |    0 |          -1 | 2013-05-08 23:49:10 | 2013-05-19 08:24:41 |
| 340824 | 2013-05-03 02:11:12 | 2013-05-03 02:18:29 |        1 |       1 |       2 |    0 |          -1 | 2013-05-03 02:11:12 | 2013-05-19 09:13:33 |
| 340825 | 2013-05-03 02:11:12 | 2013-05-03 02:18:29 |        1 |       1 |       1 |    0 |          -1 | 2013-05-03 02:11:12 | 2013-05-19 09:13:35 |
| 340826 | 2013-05-09 17:37:11 | 2013-05-09 17:53:25 |        4 |       1 |       4 |    0 |          -1 | 2013-05-09 17:37:11 | 2013-05-19 09:15:57 |
| 340827 | 2013-05-09 17:37:11 | 2013-05-09 17:53:25 |        4 |       1 |       2 |    0 |          -1 | 2013-05-09 17:37:11 | 2013-05-19 09:15:59 |
mysql> select max(SEQNO) from DqChannelStatus ;
+------------+
| max(SEQNO) |
+------------+
|     340817 |
+------------+
1 row in set (0.00 sec)

mysql> select * from DqChannelStatusVld where SEQNO > 340817  ;
+--------+---------------------+---------------------+----------+---------+---------+------+-------------+---------------------+---------------------+
| SEQNO  | TIMESTART           | TIMEEND             | SITEMASK | SIMMASK | SUBSITE | TASK | AGGREGATENO | VERSIONDATE         | INSERTDATE          |
+--------+---------------------+---------------------+----------+---------+---------+------+-------------+---------------------+---------------------+
| 340818 | 2013-05-03 03:38:35 | 2013-05-03 03:38:51 |        2 |       1 |       2 |    0 |          -1 | 2013-05-03 03:38:35 | 2013-05-19 08:22:20 |
| 340819 | 2013-05-03 03:38:35 | 2013-05-03 03:38:51 |        2 |       1 |       1 |    0 |          -1 | 2013-05-03 03:38:35 | 2013-05-19 08:22:21 |
| 340820 | 2013-05-08 23:49:10 | 2013-05-08 23:49:28 |        4 |       1 |       4 |    0 |          -1 | 2013-05-08 23:49:10 | 2013-05-19 08:24:37 |
| 340821 | 2013-05-08 23:49:10 | 2013-05-08 23:49:28 |        4 |       1 |       2 |    0 |          -1 | 2013-05-08 23:49:10 | 2013-05-19 08:24:39 |
| 340822 | 2013-05-08 23:49:10 | 2013-05-08 23:49:28 |        4 |       1 |       3 |    0 |          -1 | 2013-05-08 23:49:10 | 2013-05-19 08:24:40 |
| 340823 | 2013-05-08 23:49:10 | 2013-05-08 23:49:28 |        4 |       1 |       1 |    0 |          -1 | 2013-05-08 23:49:10 | 2013-05-19 08:24:41 |
| 340824 | 2013-05-03 02:11:12 | 2013-05-03 02:18:29 |        1 |       1 |       2 |    0 |          -1 | 2013-05-03 02:11:12 | 2013-05-19 09:13:33 |
...
| 341122 | 2013-05-11 10:26:58 | 2013-05-11 10:43:11 |        4 |       1 |       4 |    0 |          -1 | 2013-05-11 10:26:58 | 2013-05-19 22:26:30 |
| 341123 | 2013-05-11 10:26:58 | 2013-05-11 10:43:11 |        4 |       1 |       2 |    0 |          -1 | 2013-05-11 10:26:58 | 2013-05-19 22:26:38 |
| 341124 | 2013-05-11 10:26:58 | 2013-05-11 10:43:11 |        4 |       1 |       3 |    0 |          -1 | 2013-05-11 10:26:58 | 2013-05-19 22:26:47 |
| 341125 | 2013-05-11 10:26:58 | 2013-05-11 10:43:11 |        4 |       1 |       1 |    0 |          -1 | 2013-05-11 10:26:58 | 2013-05-19 22:26:55 |
+--------+---------------------+---------------------+----------+---------+---------+------+-------------+---------------------+---------------------+
308 rows in set (0.02 sec)
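Identifying the validity-only SEQNO reduces to comparing each validity SEQNO against the payload table's maximum; a sketch using a hypothetical `validity_only` helper, with the maxima taken from the queries above:

```python
def validity_only(payload_max, vld_seqnos):
    """SEQNO present in the validity table but beyond the payload
    table's maximum are orphans written after the payload crashed."""
    return [s for s in vld_seqnos if s > payload_max]

# Payload max is 340817; validity SEQNO run 1..341125 inclusive.
orphans = validity_only(340817, range(1, 341126))
assert len(orphans) == 308                         # matches the 308 rows
assert orphans[0] == 340818 and orphans[-1] == 341125
```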

Full Server backup

  1. note that ChannelQuality continues to be updated
mysql> show tables ;
+-------------------------------+
| Tables_in_tmp_ligs_offline_db |
+-------------------------------+
| ChannelQuality                |
| ChannelQualityVld             |
| DaqRawDataFileInfo            |
| DaqRawDataFileInfoVld         |
| DqChannel                     |
| DqChannelStatus               |
| DqChannelStatusVld            |
| DqChannelVld                  |
| LOCALSEQNO                    |
+-------------------------------+
9 rows in set (0.07 sec)

mysql> select * from DqChannelStatusVld order by SEQNO desc limit 1 ;
+--------+---------------------+---------------------+----------+---------+---------+------+-------------+---------------------+---------------------+
| SEQNO  | TIMESTART           | TIMEEND             | SITEMASK | SIMMASK | SUBSITE | TASK | AGGREGATENO | VERSIONDATE         | INSERTDATE          |
+--------+---------------------+---------------------+----------+---------+---------+------+-------------+---------------------+---------------------+
| 341125 | 2013-05-11 10:26:58 | 2013-05-11 10:43:11 |        4 |       1 |       1 |    0 |          -1 | 2013-05-11 10:26:58 | 2013-05-19 22:26:55 |
+--------+---------------------+---------------------+----------+---------+---------+------+-------------+---------------------+---------------------+
1 row in set (0.06 sec)

mysql> select * from DqChannelVld order by SEQNO desc limit 1 ;
+--------+---------------------+---------------------+----------+---------+---------+------+-------------+---------------------+---------------------+
| SEQNO  | TIMESTART           | TIMEEND             | SITEMASK | SIMMASK | SUBSITE | TASK | AGGREGATENO | VERSIONDATE         | INSERTDATE          |
+--------+---------------------+---------------------+----------+---------+---------+------+-------------+---------------------+---------------------+
| 341089 | 2013-05-11 10:26:58 | 2013-05-11 10:43:11 |        4 |       1 |       1 |    0 |          -1 | 2013-05-11 10:26:58 | 2013-05-19 22:26:54 |
+--------+---------------------+---------------------+----------+---------+---------+------+-------------+---------------------+---------------------+
1 row in set (0.06 sec)

mysql> select * from ChannelQualityVld order by SEQNO desc limit 1 ;
+-------+---------------------+---------------------+----------+---------+---------+------+-------------+---------------------+---------------------+
| SEQNO | TIMESTART           | TIMEEND             | SITEMASK | SIMMASK | SUBSITE | TASK | AGGREGATENO | VERSIONDATE         | INSERTDATE          |
+-------+---------------------+---------------------+----------+---------+---------+------+-------------+---------------------+---------------------+
|  9093 | 2013-04-20 09:41:26 | 2038-01-19 03:14:07 |        4 |       1 |       4 |    0 |          -1 | 2012-12-07 07:13:46 | 2013-04-22 15:32:27 |
+-------+---------------------+---------------------+----------+---------+---------+------+-------------+---------------------+---------------------+
1 row in set (0.07 sec)

mysql>

Before and during the table crash:

mysql> select table_name,table_type, engine, round((data_length+index_length-data_free)/1024/1024,2) as MB  from information_schema.tables where table_schema = 'tmp_ligs_offline_db' ;
+-----------------------+------------+-----------+---------+
| table_name            | table_type | engine    | MB      |
+-----------------------+------------+-----------+---------+
| ChannelQuality        | BASE TABLE | MyISAM    |   47.31 |
| ChannelQualityVld     | BASE TABLE | MyISAM    |    0.53 |
| DaqRawDataFileInfo    | BASE TABLE | FEDERATED |   67.04 |
| DaqRawDataFileInfoVld | BASE TABLE | FEDERATED |   13.23 |
| DqChannel             | BASE TABLE | MyISAM    | 3570.58 |
| DqChannelStatus       | BASE TABLE | MyISAM    | 2338.56 |
| DqChannelStatusVld    | BASE TABLE | MyISAM    |   20.12 |
| DqChannelVld          | BASE TABLE | MyISAM    |   19.91 |
| LOCALSEQNO            | BASE TABLE | MyISAM    |    0.00 |
+-----------------------+------------+-----------+---------+
9 rows in set (0.09 sec)

mysql> select table_name,table_type, engine, round((data_length+index_length-data_free)/1024/1024,2) as MB  from information_schema.tables where table_schema = 'tmp_ligs_offline_db' ;
+-----------------------+------------+-----------+---------+
| table_name            | table_type | engine    | MB      |
+-----------------------+------------+-----------+---------+
| ChannelQuality        | BASE TABLE | MyISAM    |   47.31 |
| ChannelQualityVld     | BASE TABLE | MyISAM    |    0.53 |
| DaqRawDataFileInfo    | BASE TABLE | FEDERATED |   67.73 |
| DaqRawDataFileInfoVld | BASE TABLE | FEDERATED |   13.37 |
| DqChannel             | BASE TABLE | MyISAM    | 3591.27 |
| DqChannelStatus       | BASE TABLE | NULL      |    NULL |
| DqChannelStatusVld    | BASE TABLE | MyISAM    |   20.24 |
| DqChannelVld          | BASE TABLE | MyISAM    |   20.03 |
| LOCALSEQNO            | BASE TABLE | MyISAM    |    0.00 |
+-----------------------+------------+-----------+---------+
9 rows in set (0.08 sec)

Extraction of DqChannel tarball into tmp_ligs_offline_db_0

This adds the IHEP tmp_ligs_offline_db hotcopy, which contains the DqChannel tables, into tmp_ligs_offline_db_0 together with the repaired DqChannelStatus:

[root@belle7 tmp_ligs_offline_db]# mysqlhotcopy.py -t 20130523_1623 --node dybdb1.ihep.ac.cn --rename tmp_ligs_offline_db_0 tmp_ligs_offline_db --ALLOWEXTRACT --ALLOWCLOBBER examine extract
2013-05-24 19:51:36,983 env.mysqlhotcopy.mysqlhotcopy INFO     /home/blyth/env/bin/mysqlhotcopy.py -t 20130523_1623 --node dybdb1.ihep.ac.cn --rename tmp_ligs_offline_db_0 tmp_ligs_offline_db --ALLOWEXTRACT --ALLOWCLOBBER examine extract
2013-05-24 19:51:36,984 env.mysqlhotcopy.mysqlhotcopy INFO     backupdir /var/dbbackup/mysqlhotcopy/dybdb1.ihep.ac.cn/tmp_ligs_offline_db
2013-05-24 19:51:37,004 env.mysqlhotcopy.mysqlhotcopy INFO     db size in MB 0.0
2013-05-24 19:51:37,004 env.mysqlhotcopy.mysqlhotcopy INFO     ================================== examine
2013-05-24 19:51:37,004 env.mysqlhotcopy.tar INFO     examining /var/dbbackup/mysqlhotcopy/dybdb1.ihep.ac.cn/tmp_ligs_offline_db/20130523_1623.tar.gz
2013-05-24 19:51:37,004 env.mysqlhotcopy.tar WARNING  load pickled members file /var/dbbackup/mysqlhotcopy/dybdb1.ihep.ac.cn/tmp_ligs_offline_db/20130523_1623.tar.gz.pc
2013-05-24 19:51:37,007 env.mysqlhotcopy.tar INFO     archive contains 7 items with commonprefix "" flattop True
2013-05-24 19:51:37,007 env.mysqlhotcopy.mysqlhotcopy INFO     seconds {'_examine': 0.0028290748596191406, 'examine': 0.0028209686279296875}
2013-05-24 19:51:37,007 env.mysqlhotcopy.mysqlhotcopy INFO     ================================== extract
2013-05-24 19:51:37,008 env.mysqlhotcopy.mysqlhotcopy INFO     sufficient free space,      required 0.0 MB less than    free 477552.570312 MB
DO YOU REALLY WANT TO extract Tar /var/dbbackup/mysqlhotcopy/dybdb1.ihep.ac.cn/tmp_ligs_offline_db/20130523_1623.tar.gz tmp_ligs_offline_db gz  into containerdir /var/lib/mysql/    ? ENTER "YES" TO PROCEED : YES
2013-05-24 19:51:39,842 env.mysqlhotcopy.mysqlhotcopy INFO     proceeding
2013-05-24 19:51:39,843 env.mysqlhotcopy.mysqlhotcopy INFO     extract Tar /var/dbbackup/mysqlhotcopy/dybdb1.ihep.ac.cn/tmp_ligs_offline_db/20130523_1623.tar.gz tmp_ligs_offline_db gz  into containerdir /var/lib/mysql/
2013-05-24 19:51:39,843 env.mysqlhotcopy.tar INFO     _flat_extract opening tarfile /var/dbbackup/mysqlhotcopy/dybdb1.ihep.ac.cn/tmp_ligs_offline_db/20130523_1623.tar.gz
2013-05-24 19:52:51,413 env.mysqlhotcopy.tar WARNING  ./                                                                                                             :  SKIP TOPDIR
2013-05-24 19:52:51,413 env.mysqlhotcopy.tar INFO     extraction into target /var/lib/mysql/tmp_ligs_offline_db_0 does not clobber any existing paths
2013-05-24 19:52:51,413 env.mysqlhotcopy.tar INFO     _flat_extract into target /var/lib/mysql/tmp_ligs_offline_db_0 for 7 members with toplevelname tmp_ligs_offline_db_0
2013-05-24 19:54:04,216 env.mysqlhotcopy.tar INFO     total 6044204
-rw-rw---- 1 mysql mysql       8892 Feb  4 16:07 DqChannel.frm
-rw-rw---- 1 mysql mysql 2750541696 May 20 06:26 DqChannel.MYD
-rw-rw---- 1 mysql mysql 1015181312 May 20 06:26 DqChannel.MYI
-rw-rw---- 1 mysql mysql       8746 May 23 12:28 DqChannelStatus.frm
-rw-rw---- 1 mysql mysql 1439608082 May 23 12:28 DqChannelStatus.MYD
-rw-rw---- 1 mysql mysql  935564288 May 23 12:28 DqChannelStatus.MYI
-rw-rw---- 1 mysql mysql       8908 May 13 13:16 DqChannelStatusVld.frm
-rw-rw---- 1 mysql mysql   17397375 May 20 06:26 DqChannelStatusVld.MYD
-rw-rw---- 1 mysql mysql    3826688 May 20 06:26 DqChannelStatusVld.MYI
-rw-rw---- 1 mysql mysql       8908 Feb  4 16:07 DqChannelVld.frm
-rw-rw---- 1 mysql mysql   17395539 May 20 06:26 DqChannelVld.MYD
-rw-rw---- 1 mysql mysql    3606528 May 20 06:26 DqChannelVld.MYI

2013-05-24 19:54:04,217 env.mysqlhotcopy.mysqlhotcopy INFO     seconds {'_examine': 0.0028290748596191406, 'examine': 0.0028209686279296875, 'extract': 144.37399792671204, '_extract': 147.20948314666748}
[root@belle7 tmp_ligs_offline_db]#

basic check

mysql> use tmp_ligs_offline_db_0

mysql> show tables ;
+---------------------------------+
| Tables_in_tmp_ligs_offline_db_0 |
+---------------------------------+
| DqChannel                       |
| DqChannelStatus                 |
| DqChannelStatusVld              |
| DqChannelVld                    |
+---------------------------------+
4 rows in set (0.00 sec)

mysql> select count(*) from DqChannel ;
+----------+
| count(*) |
+----------+
| 65489088 |
+----------+
1 row in set (0.00 sec)

mysql> select count(*) from DqChannelStatus ;
+----------+
| count(*) |
+----------+
| 65436731 |
+----------+
1 row in set (0.00 sec)

mysql> select count(*) from DqChannelStatusVld ;
+----------+
| count(*) |
+----------+
|   341125 |
+----------+
1 row in set (0.00 sec)


mysql> select count(*) from DqChannelVld ;
+----------+
| count(*) |
+----------+
|   341089 |
+----------+
1 row in set (0.00 sec)

establishing correspondence

What are the criteria for establishing correspondence between DqChannel and DqChannelStatus?

Quick and dirty

mysql> select max(cs.seqno) from DqChannelStatusVld cs, DqChannelVld c where cs.seqno=c.seqno and cs.insertdate=c.insertdate;
+---------------+
| max(cs.seqno) |
+---------------+
|        323573 |
+---------------+
1 row in set (1.64 sec)

This query indicates when the synchronized writing starts to go badly astray, but it is not a reliable technique because it rests on flawed assumptions:

  • inserts to the two tables happen within the same second
  • SEQNO correspondence between the two tables

Better but slower way

Based on run-range comparisons: run “group by SEQNO” queries against each table and compare the RUNNO/FILENO:

mysql> select SEQNO, count(*) as N, RUNNO, FILENO from DqChannelStatus group by SEQNO limit 10 ;
+-------+-----+-------+--------+
| SEQNO | N   | RUNNO | FILENO |
+-------+-----+-------+--------+
|     1 | 192 | 21223 |      1 |
|     2 | 192 | 21223 |      1 |
|     3 | 192 | 21223 |      1 |
|     4 | 192 | 37322 |    442 |
|     5 | 192 | 37322 |    442 |
|     6 | 192 | 37322 |    441 |
|     7 | 192 | 37322 |    441 |
|     8 | 192 | 37325 |    351 |
|     9 | 192 | 37325 |    351 |
|    10 | 192 | 37325 |    352 |
+-------+-----+-------+--------+
10 rows in set (0.01 sec)

mysql> select SEQNO, count(*) as N, RUNNO, FILENO from DqChannel group by SEQNO limit 10 ;
+-------+-----+-------+--------+
| SEQNO | N   | RUNNO | FILENO |
+-------+-----+-------+--------+
|     1 | 192 | 21223 |      1 |
|     2 | 192 | 21223 |      1 |
|     3 | 192 | 21223 |      1 |
|     4 | 192 | 37322 |    442 |
|     5 | 192 | 37322 |    442 |
|     6 | 192 | 37322 |    441 |
|     7 | 192 | 37322 |    441 |
|     8 | 192 | 37325 |    351 |
|     9 | 192 | 37325 |    351 |
|    10 | 192 | 37325 |    352 |
+-------+-----+-------+--------+
10 rows in set (0.01 sec)
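The check performed by cq_zip_check.py can be sketched as follows: build a map of SEQNO to (RUNNO, FILENO) per table from the “group by SEQNO” queries above and look for where the maps diverge. The function and the sample maps below are illustrative, not the actual script:

```python
def first_divergence(a, b):
    """Return the smallest common SEQNO whose (RUNNO, FILENO) differs
    between the two maps, or None if they agree everywhere."""
    for seqno in sorted(set(a) & set(b)):
        if a[seqno] != b[seqno]:
            return seqno
    return None

# tiny fabricated example of an ordering swap like those observed
chan   = {6: (37322, 441), 7: (37322, 441)}
status = {6: (37322, 441), 7: (37322, 442)}
print(first_divergence(chan, status))   # 7
```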

I checked correspondence between DqChannel and the repaired DqChannelStatus in tmp_ligs_offline_db_0 at NUU.

http://dayabay.ihep.ac.cn/tracs/dybsvn/browser/dybgaudi/trunk/Database/Scraper/python/Scraper/dq/cq_zip_check.py

Many ordering swaps are apparent.

Presumably the explanation is that multiple instances of the filling script close their ingredients and summary writers concurrently. This breaks the sequentiality of the closing of the two writers from any one instance of the script, preventing them from (reliably) being assigned the same SEQNO in the two tables.

If sequential KUP job running is not possible, then in order to achieve synchronized SEQNO writing to the two tables you will need to try wrapping the closing in lock/unlock. Something like:

db("lock tables DqChannel WRITE, DqChannelVld WRITE, DqChannelStatus WRITE, DqChannelStatusVld WRITE")   # block other instances
wseqno = wrt.Close()                  # both writers close while the lock is held,
wseqno_status = wrt_status.Close()    # so they are assigned the same SEQNO
db("unlock tables")                   # let the next instance proceed
assert wseqno == wseqno_status

In this way the first instance of the script to take the lock will be able to sequentially perform its writes before releasing its lock. Other scripts will hang around until the first is done and so on.

This should allow synchronized writing in future, but it does not fix the existing lack of synchronization in the tables. I will prepare a dump with the “SEQNO <= 323573” cut to allow you to check my observations.

Did this with env:source:dybgaudi/trunk/Database/Scraper/python/Scraper/dq/cq_zip_check.py

Concurrent DBI Writing

Some small DBI modifications allow the DBI locking to be disabled; this, together with another trick of using a single session, gives controlled concurrent writing.

Most of the time this works, providing controlled concurrent writing with external locking. But there is enough concurrency flakiness (failed writes in maybe 1 out of 5 runs of the above test) that it cannot be recommended at the moment.
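One form the external locking could take is MySQL's advisory GET_LOCK/RELEASE_LOCK. A minimal sketch, assuming the db() helper and writer Close() API from the lock-tables snippet above; it is untested against real DBI:

```python
def synced_close(db, writers, lockname="dq_writers", timeout=60):
    """Close several DBI writers under a MySQL advisory lock so that
    they are assigned the same SEQNO (sketch; db() and Close() are
    assumptions mirroring the earlier snippet)."""
    if not db("SELECT GET_LOCK('%s', %d)" % (lockname, timeout)):
        raise RuntimeError("could not obtain advisory lock %r" % lockname)
    try:
        seqnos = [w.Close() for w in writers]
    finally:
        db("SELECT RELEASE_LOCK('%s')" % lockname)   # always release
    assert len(set(seqnos)) == 1, "writers got differing SEQNO %r" % seqnos
    return seqnos[0]
```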

The case for synced DBI writing to multiple tables is not strong enough to merit much more work on this.

Four Table Dump/load, 8/70 min

mysqldump is fast to dump (8 min) but very slow to load (70 min)

  • possibly load options can be tweaked to go faster
  • or alternate dump technique used
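As a sketch of the first bullet, one untested tweak would be to raise the session variables that govern bulk MyISAM inserts before replaying the dump; the buffer sizes here are guesses that would need benchmarking:

```python
def load_command(dump_path, db, buf_mb=256):
    """Build a shell pipeline that raises bulk-insert buffers for the
    session, then replays the dump (sketch; sizes are assumptions)."""
    prelude = ("SET SESSION bulk_insert_buffer_size = %(b)d; "
               "SET SESSION myisam_sort_buffer_size = %(b)d;"
               % {"b": buf_mb * 1024 * 1024})
    return '( echo "%s" ; cat %s ) | mysql %s' % (prelude, dump_path, db)

print(load_command("~/tmp_ligs_offline_db_0.DqChannel_and_DqChannelStatus.sql",
                   "tmp_ligs_offline_db_2"))
```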
[blyth@belle7 DybPython]$ dbdumpload.py tmp_ligs_offline_db_0 dump ~/tmp_ligs_offline_db_0.DqChannel_and_DqChannelStatus.sql --where 'SEQNO <= 323573' --tables 'DqChannelStatus DqChannelStatusVld DqChannel DqChannelVld'
[blyth@belle7 DybPython]$ dbdumpload.py tmp_ligs_offline_db_0 dump ~/tmp_ligs_offline_db_0.DqChannel_and_DqChannelStatus.sql --where 'SEQNO <= 323573' --tables 'DqChannelStatus DqChannelStatusVld DqChannel DqChannelVld'  | sh

real    8m37.035s
user    3m3.306s
sys     0m23.131s
[blyth@belle7 DybPython]$ du -h  ~/tmp_ligs_offline_db_0.DqChannel_and_DqChannelStatus.sql
5.7G    /home/blyth/tmp_ligs_offline_db_0.DqChannel_and_DqChannelStatus.sql

[blyth@belle7 DybPython]$ tail -c 1000  ~/tmp_ligs_offline_db_0.DqChannel_and_DqChannelStatus.sql
[blyth@belle7 DybPython]$ head -c 1000  ~/tmp_ligs_offline_db_0.DqChannel_and_DqChannelStatus.sql
[blyth@belle7 DybPython]$ grep CREATE ~/tmp_ligs_offline_db_0.DqChannel_and_DqChannelStatus.sql
CREATE TABLE `DqChannelStatus` (
CREATE TABLE `DqChannelStatusVld` (
CREATE TABLE `DqChannel` (
CREATE TABLE `DqChannelVld` (
[blyth@belle7 DybPython]$ grep DROP ~/tmp_ligs_offline_db_0.DqChannel_and_DqChannelStatus.sql
[blyth@belle7 DybPython]$ md5sum ~/tmp_ligs_offline_db_0.DqChannel_and_DqChannelStatus.sql
ea8a5a4d076febbfd940a90171707a72  /home/blyth/tmp_ligs_offline_db_0.DqChannel_and_DqChannelStatus.sql

Alternative dump/load technique, 8/32 min

[blyth@belle7 DybPython]$ time ./dbsrv.py  tmp_ligs_offline_db_0 dumplocal ~/tmp_ligs_offline_db_0 --where 'SEQNO <= 323573' -l debug
DEBUG:__main__:MyCnf read ['/home/blyth/.my.cnf']
DEBUG:__main__:translate mysql config {'host': 'belle7.nuu.edu.tw', 'password': '***', 'user': 'root', 'database': 'tmp_ligs_offline_db_0'} into mysql-python config {'passwd': '***', 'host': 'belle7.nuu.edu.tw', 'db': 'tmp_ligs_offline_db_0', 'user': 'root'}
DEBUG:__main__:connecting to {'passwd': '***', 'host': 'belle7.nuu.edu.tw', 'db': 'tmp_ligs_offline_db_0', 'user': 'root'}
DEBUG:__main__:select distinct(table_name) from information_schema.tables where table_schema='tmp_ligs_offline_db_0'
DEBUG:__main__:show create table DqChannel
DEBUG:__main__:select * from DqChannel where SEQNO <= 323573 into outfile '/home/blyth/tmp_ligs_offline_db_0/DqChannel.csv' fields terminated by ',' optionally enclosed by '"'
DEBUG:__main__:show create table DqChannelStatus
DEBUG:__main__:select * from DqChannelStatus where SEQNO <= 323573 into outfile '/home/blyth/tmp_ligs_offline_db_0/DqChannelStatus.csv' fields terminated by ',' optionally enclosed by '"'
DEBUG:__main__:show create table DqChannelStatusVld
DEBUG:__main__:select * from DqChannelStatusVld where SEQNO <= 323573 into outfile '/home/blyth/tmp_ligs_offline_db_0/DqChannelStatusVld.csv' fields terminated by ',' optionally enclosed by '"'
DEBUG:__main__:show create table DqChannelVld
DEBUG:__main__:select * from DqChannelVld where SEQNO <= 323573 into outfile '/home/blyth/tmp_ligs_offline_db_0/DqChannelVld.csv' fields terminated by ',' optionally enclosed by '"'

real    8m11.323s
user    0m0.269s
sys     0m0.087s
[blyth@belle7 DybPython]$
[blyth@belle7 DybPython]$ time ./dbsrv.py tmp_ligs_offline_db_4 loadlocal ~/tmp_ligs_offline_db_0  -l debug --DB_DROP_CREATE -C
...
  PRIMARY KEY  (`SEQNO`)
) ENGINE=MyISAM AUTO_INCREMENT=341090 DEFAULT CHARSET=latin1
DEBUG:__main__:LOAD DATA LOCAL INFILE '/home/blyth/tmp_ligs_offline_db_0/DqChannelVld.csv' IGNORE INTO TABLE DqChannelVld FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"' IGNORE 0 LINES

real    32m38.231s
user    0m1.639s
sys     0m6.183s
[blyth@belle7 DybPython]$


[blyth@belle7 DybPython]$ ./dbsrv.py tmp_ligs_offline_db_4 summary
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~  ~~~~~~~~~~  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
TABLE_NAME                      TABLE_ROWS  CREATE_TIME                     CHECK_TIME
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~  ~~~~~~~~~~  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
DqChannel                       62126016    2013-05-30 13:54:33             2013-05-30 14:11:53
DqChannelStatus                 62126016    2013-05-30 14:11:54             2013-05-30 14:26:55
DqChannelStatusVld              323573      2013-05-30 14:26:56             None
DqChannelVld                    323573      2013-05-30 14:26:58             None
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~  ~~~~~~~~~~  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Fake LOCALSEQNO

[blyth@belle7 ~]$ path=~/LOCALSEQNO.sql
[blyth@belle7 ~]$ dbdumpload.py -t LOCALSEQNO --no-data tmp_offline_db dump $path | sh
[blyth@belle7 ~]$ maxseqno=323573
[blyth@belle7 ~]$ echo "INSERT INTO LOCALSEQNO VALUES ('*',0),('DqChannel',$maxseqno),('DqChannelStatus',$maxseqno);" >> $path
[blyth@belle7 ~]$ echo drop database if exists test_localseqno | mysql
[blyth@belle7 ~]$ echo create database test_localseqno | mysql
[blyth@belle7 ~]$ cat $path | mysql test_localseqno
[blyth@belle7 ~]$ echo select \* from LOCALSEQNO | mysql test_localseqno -t
+-----------------+---------------+
| TABLENAME       | LASTUSEDSEQNO |
+-----------------+---------------+
| *               |             0 |
| DqChannel       |        323573 |
| DqChannelStatus |        323573 |
+-----------------+---------------+

Append LOCALSEQNO onto the dump

[blyth@belle7 ~]$ cat $path >> ~/tmp_ligs_offline_db_0.DqChannel_and_DqChannelStatus.sql
[blyth@belle7 ~]$ du -hs ~/tmp_ligs_offline_db_0.DqChannel_and_DqChannelStatus.sql
5.7G    /home/blyth/tmp_ligs_offline_db_0.DqChannel_and_DqChannelStatus.sql
[blyth@belle7 ~]$ md5sum ~/tmp_ligs_offline_db_0.DqChannel_and_DqChannelStatus.sql
8aed64440efb14d3676b8fda1bc85e5e  /home/blyth/tmp_ligs_offline_db_0.DqChannel_and_DqChannelStatus.sql

Create DB _2 from the four table dump with faked LOCALSEQNO

[blyth@belle7 ~]$ db=tmp_ligs_offline_db_2
[blyth@belle7 ~]$ echo drop database if exists $db | mysql
[blyth@belle7 ~]$ echo create database $db | mysql
[blyth@belle7 ~]$ time cat ~/tmp_ligs_offline_db_0.DqChannel_and_DqChannelStatus.sql | mysql $db
real    72m18.139s
user    3m0.786s
sys     0m24.214s
  • OUCH: 72 min to load the dump; this is liable to kill the server for other users

Warning

disk space usage from the cat could easily be more than 3 times the size of the dump due to the new DB and mysql logging
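A rough pre-flight check codifying the rule of thumb above (the factor of 3 is the assumption):

```python
def enough_space_for_load(dump_mb, free_mb, factor=3.0):
    """True if free space exceeds factor x dump size, covering the new
    DB files plus mysql logging (rule-of-thumb sketch)."""
    return free_mb > factor * dump_mb

# the 5.7G dump against the free space reported earlier on belle7
print(enough_space_for_load(5.7 * 1024, 477552.570312))   # True
```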

try alternative load to check time

Almost the same time as the piped cat:

[blyth@belle7 ~]$ db=tmp_ligs_offline_db_3 && echo drop database if exists $db | mysql && echo create database $db | mysql
[blyth@belle7 ~]$ time mysql $db < ~/tmp_ligs_offline_db_0.DqChannel_and_DqChannelStatus.sql
real    72m24.332s
user    2m44.720s
sys     0m13.221s
[blyth@belle7 ~]$

loading alternatives

  1. slow mysqldump
  2. CSV style forced_rloadcat with --local on the server, thus using the fast LOAD DATA LOCAL INFILE
  3. mysqlhotcopy.py archive and extract
    • it's really fast
    • BUT: concern about the MySQL version difference between the table-creation server and the table-repair server

dump alternatives

mysql> show tables ;
+---------------------------------+
| Tables_in_tmp_ligs_offline_db_0 |
+---------------------------------+
| DqChannel                       |
| DqChannelStatus                 |
| DqChannelStatusVld              |
| DqChannelVld                    |
+---------------------------------+
4 rows in set (0.00 sec)

mysql> select * from DqChannel where SEQNO < 100 into outfile '/tmp/DqChannel.csv' fields terminated by ',' optionally enclosed by '"' ;
Query OK, 19008 rows affected (0.38 sec)

Deciding how to proceed

I have a recovery dump file for tmp_ligs_offline_db; however, the load times are too long to be acceptable on the primary server.

  • 70 min : from mysqldump
  • 35 min : from CSV based data with “LOAD DATA LOCAL INFILE”

An alternative would be to extract a “mysqlhotcopy” tarball created elsewhere onto dybdb1.ihep.ac.cn. That would probably take less than 10 min and it does not impose such a high load on the server.

I could make the hotcopy on belle7 (server version 5.0.77) and archive it into a tarball to be extracted on dybdb1.ihep.ac.cn (server version 5.0.45). But that might cause problems in future: creating tables on one MySQL version restricts the repair techniques that can safely be used if the tables ever need repairing on a different version.

(The recent incident required repairing elsewhere as we had no available backup in hand and you never want to attempt a repair without having an available and verified backup.)

  • decide to install MySQL 5.0.45 RPM on DB virgin belle1

MySQL Server versions and repair table limitation

Server versions, our primary servers use 5.0.45

node                       server version
~~~~~~~~~~~~~~~~~~~~~~~~~  ~~~~~~~~~~~~~~~~~~~~~~
dybdb1.ihep.ac.cn          5.0.45
dybdb2.ihep.ac.cn          5.0.45
dayabay.ihep.ac.cn         5.1.36
belle7.nuu.edu.tw          5.0.77
belle1.nuu.edu.tw          5.0.45 (see mysqlrpm-)
cms01.phys.ntu.edu.tw      4.1.22

version shifting repair issue

Prior to MySQL 5.0.62, do not use USE_FRM if your table was created by a different version of the MySQL server. Doing so risks the loss of all rows in the table. It is particularly dangerous to use USE_FRM after the server returns this message:

Table upgrade required. Please do
"REPAIR TABLE `tbl_name`" to fix it!

Does “a different version of the MySQL server” refer to major or minor versions?

what this means

It is better for tables to be created on the same server version on which they will be used and potentially repaired. Thus install 5.0.45 from RPM on belle1 in order to be able to create a same-version hotcopy for extraction into dybdb1. See mysqlrpm- for the install saga.
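The documentation leaves “different version” ambiguous; a conservative reading treats only servers in the same major.minor series as compatible. A sketch codifying that assumption (identical versions remain the only safe certainty):

```python
def same_series(v1, v2):
    """True if two server version strings share the same major.minor
    series, e.g. 5.0.77 and 5.0.45 (series-level compatibility is an
    assumption, not something the MySQL docs guarantee)."""
    series = lambda v: tuple(int(x) for x in v.split("-")[0].split(".")[:2])
    return series(v1) == series(v2)

print(same_series("5.0.77", "5.0.45"))            # True  : belle7 vs dybdb1
print(same_series("5.1.36", "5.0.45-community"))  # False : dayabay vs dybdb1
```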

Compare MySQL servers on belle1 and dybdb1

dybdb1

Remote connection to dybdb1 from belle7:

mysql> status ;
--------------
/data1/env/local/dyb/external/mysql/5.0.67/i686-slc5-gcc41-dbg/bin/mysql  Ver 14.12 Distrib 5.0.67, for redhat-linux-gnu (i686) using  EditLine wrapper

Connection id:          610209
Current database:       tmp_ligs_offline_db
Current user:           ligs@belle7.nuu.edu.tw
SSL:                    Not in use
Current pager:          stdout
Using outfile:          ''
Using delimiter:        ;
Server version:         5.0.45-community-log MySQL Community Edition (GPL)
Protocol version:       10
Connection:             dybdb1.ihep.ac.cn via TCP/IP
Server characterset:    latin1
Db     characterset:    latin1
Client characterset:    latin1
Conn.  characterset:    latin1
TCP port:               3306
Uptime:                 12 days 6 hours 51 min 8 sec

Threads: 8  Questions: 171104994  Slow queries: 79  Opens: 335  Flush tables: 1  Open tables: 302  Queries per second avg: 161.197
--------------
mysql>  select table_schema, table_name, table_collation from information_schema.tables where table_schema = 'tmp_ligs_offline_db' ;
+---------------------+-----------------------+-------------------+
| table_schema        | table_name            | table_collation   |
+---------------------+-----------------------+-------------------+
| tmp_ligs_offline_db | ChannelQuality        | latin1_swedish_ci |
| tmp_ligs_offline_db | ChannelQualityVld     | latin1_swedish_ci |
| tmp_ligs_offline_db | DaqRawDataFileInfo    | latin1_swedish_ci |
| tmp_ligs_offline_db | DaqRawDataFileInfoVld | latin1_swedish_ci |
| tmp_ligs_offline_db | DqChannel             | latin1_swedish_ci |
| tmp_ligs_offline_db | DqChannelStatus       | NULL              |
| tmp_ligs_offline_db | DqChannelStatusVld    | latin1_swedish_ci |
| tmp_ligs_offline_db | DqChannelVld          | latin1_swedish_ci |
| tmp_ligs_offline_db | LOCALSEQNO            | latin1_swedish_ci |
+---------------------+-----------------------+-------------------+
9 rows in set (0.07 sec)

belle1

Local connection to belle1:

mysql> status
--------------
mysql  Ver 14.12 Distrib 5.0.45, for pc-linux-gnu (i686) using readline 5.0

Connection id:          28
Current database:       information_schema
Current user:           root@localhost
SSL:                    Not in use
Current pager:          stdout
Using outfile:          ''
Using delimiter:        ;
Server version:         5.0.45-community MySQL Community Edition (GPL)
Protocol version:       10
Connection:             127.0.0.1 via TCP/IP
Server characterset:    latin1
Db     characterset:    utf8
Client characterset:    latin1
Conn.  characterset:    latin1
TCP port:               3306
Uptime:                 50 min 57 sec

Threads: 2  Questions: 114  Slow queries: 0  Opens: 23  Flush tables: 1  Open tables: 17  Queries per second avg: 0.037

The only difference is the Db characterset:

mysql> select @@character_set_database ;
+--------------------------+
| @@character_set_database |
+--------------------------+
| utf8                     |
+--------------------------+
1 row in set (0.00 sec)

The character set and collation for the default database can be determined from the values of the character_set_database and collation_database system variables. The server sets these variables whenever the default database changes. If there is no default database, the variables have the same value as the corresponding server-level system variables, character_set_server and collation_server.

mysql> select table_name, table_collation from tables where table_schema = 'channelquality_db' ;
+--------------------+-------------------+
| table_name         | table_collation   |
+--------------------+-------------------+
| DqChannel          | latin1_swedish_ci |
| DqChannelStatus    | latin1_swedish_ci |
| DqChannelStatusVld | latin1_swedish_ci |
| DqChannelVld       | latin1_swedish_ci |
| LOCALSEQNO         | latin1_swedish_ci |
+--------------------+-------------------+
5 rows in set (0.00 sec)

Load the dump into belle1

[blyth@belle1 ~]$ md5sum tmp_ligs_offline_db_0.DqChannel_and_DqChannelStatus.sql
8aed64440efb14d3676b8fda1bc85e5e  tmp_ligs_offline_db_0.DqChannel_and_DqChannelStatus.sql
[blyth@belle1 ~]$ echo 8aed64440efb14d3676b8fda1bc85e5e    # matches digest from belle7
8aed64440efb14d3676b8fda1bc85e5e
[blyth@belle1 ~]$
[blyth@belle1 ~]$ echo create database channelquality_db | mysql
[blyth@belle1 ~]$ time mysql channelquality_db < ~/tmp_ligs_offline_db_0.DqChannel_and_DqChannelStatus.sql
real    77m19.981s
user    2m45.547s
sys     0m12.736s
[blyth@belle1 ~]$

Checking the load as it progresses:

mysql> select TABLE_NAME, TABLE_TYPE, ENGINE, TABLE_ROWS, CREATE_TIME, UPDATE_TIME from information_schema.tables where table_schema = 'channelquality_db' ;
+--------------------+------------+--------+------------+---------------------+---------------------+
| TABLE_NAME         | TABLE_TYPE | ENGINE | TABLE_ROWS | CREATE_TIME         | UPDATE_TIME         |
+--------------------+------------+--------+------------+---------------------+---------------------+
| DqChannel          | BASE TABLE | MyISAM |   59651813 | 2013-05-30 18:52:51 | 2013-05-30 19:33:07 |
| DqChannelStatus    | BASE TABLE | MyISAM |   62126016 | 2013-05-30 18:17:42 | 2013-05-30 18:52:44 |
| DqChannelStatusVld | BASE TABLE | MyISAM |     323573 | 2013-05-30 18:52:44 | 2013-05-30 18:52:51 |
+--------------------+------------+--------+------------+---------------------+---------------------+
3 rows in set (0.00 sec)

At completion:

mysql> select TABLE_NAME, TABLE_TYPE, ENGINE, TABLE_ROWS, CREATE_TIME, UPDATE_TIME from information_schema.tables where table_schema = 'channelquality_db' ;
+--------------------+------------+--------+------------+---------------------+---------------------+
| TABLE_NAME         | TABLE_TYPE | ENGINE | TABLE_ROWS | CREATE_TIME         | UPDATE_TIME         |
+--------------------+------------+--------+------------+---------------------+---------------------+
| DqChannel          | BASE TABLE | MyISAM |   62126016 | 2013-05-30 18:52:51 | 2013-05-30 19:34:55 |
| DqChannelStatus    | BASE TABLE | MyISAM |   62126016 | 2013-05-30 18:17:42 | 2013-05-30 18:52:44 |
| DqChannelStatusVld | BASE TABLE | MyISAM |     323573 | 2013-05-30 18:52:44 | 2013-05-30 18:52:51 |
| DqChannelVld       | BASE TABLE | MyISAM |     323573 | 2013-05-30 19:34:55 | 2013-05-30 19:35:02 |
| LOCALSEQNO         | BASE TABLE | MyISAM |          3 | 2013-05-30 19:35:02 | 2013-05-30 19:35:02 |
+--------------------+------------+--------+------------+---------------------+---------------------+
5 rows in set (0.00 sec)

belle1 hotcopy

After dealing with a mysqlhotcopy perl issue, mysqlrpm-

[root@belle1 ~]#  mysqlhotcopy.py -l debug channelquality_db hotcopy archive
2013-05-30 20:29:40,578 env.mysqlhotcopy.mysqlhotcopy INFO     /home/blyth/env/bin/mysqlhotcopy.py -l debug channelquality_db hotcopy archive
2013-05-30 20:29:40,582 env.mysqlhotcopy.mysqlhotcopy INFO     backupdir /var/dbbackup/mysqlhotcopy/belle1.nuu.edu.tw/channelquality_db
2013-05-30 20:29:40,582 env.mysqlhotcopy.db DEBUG    MyCnf read ['/root/.my.cnf']
2013-05-30 20:29:40,582 env.mysqlhotcopy.db DEBUG    translate mysql config {'host': 'localhost', 'user': 'root', 'database': 'information_schema', 'password': '***', 'socket': '/var/lib/mysql/mysql.sock'} into mysql-python config {'unix_socket': '/var/lib/mysql/mysql.sock', 'host': 'localhost', 'user': 'root', 'passwd': '***', 'db': 'information_schema'}
2013-05-30 20:29:40,582 env.mysqlhotcopy.db DEBUG    connecting to {'unix_socket': '/var/lib/mysql/mysql.sock', 'host': 'localhost', 'user': 'root', 'passwd': '***', 'db': 'information_schema'}
2013-05-30 20:29:40,583 env.mysqlhotcopy.mysqlhotcopy INFO     failed to instanciate connection to database channelquality_db with exception 'NoneType' object has no attribute 'Error'
2013-05-30 20:29:40,583 env.mysqlhotcopy.mysqlhotcopy INFO     ================================== hotcopy
2013-05-30 20:29:40,583 env.mysqlhotcopy.mysqlhotcopy WARNING  no valid db connection using static opts.mb_required 2000
2013-05-30 20:29:40,583 env.mysqlhotcopy.mysqlhotcopy INFO     sufficient free space,      required 2000 MB less than    free 72771.5898438 MB
2013-05-30 20:29:40,583 env.mysqlhotcopy.mysqlhotcopy INFO     hotcopy of database channelquality_db into outd /var/dbbackup/mysqlhotcopy/belle1.nuu.edu.tw/channelquality_db/20130530_2029
2013-05-30 20:29:40,586 env.mysqlhotcopy.mysqlhotcopy INFO     proceed with MySQLHotCopy /usr/bin/mysqlhotcopy  channelquality_db /var/dbbackup/mysqlhotcopy/belle1.nuu.edu.tw/channelquality_db/20130530_2029
2013-05-30 20:29:40,586 env.mysqlhotcopy.cmd DEBUG    MySQLHotCopy /usr/bin/mysqlhotcopy  channelquality_db /var/dbbackup/mysqlhotcopy/belle1.nuu.edu.tw/channelquality_db/20130530_2029
2013-05-30 20:34:38,323 env.mysqlhotcopy.mysqlhotcopy INFO     seconds {'_hotcopy': 297.73979902267456}
2013-05-30 20:34:38,323 env.mysqlhotcopy.mysqlhotcopy INFO     ================================== archive
2013-05-30 20:34:38,324 env.mysqlhotcopy.mysqlhotcopy WARNING  no valid db connection using static opts.mb_required 2000
2013-05-30 20:34:38,324 env.mysqlhotcopy.mysqlhotcopy INFO     sufficient free space,      required 2000 MB less than    free 63394.0234375 MB
2013-05-30 20:34:38,324 env.mysqlhotcopy.mysqlhotcopy INFO     tagd /var/dbbackup/mysqlhotcopy/belle1.nuu.edu.tw/channelquality_db/20130530_2029  into Tar /var/dbbackup/mysqlhotcopy/belle1.nuu.edu.tw/channelquality_db/20130530_2029.tar.gz channelquality_db gz
2013-05-30 20:34:38,324 env.mysqlhotcopy.tar INFO     creating /var/dbbackup/mysqlhotcopy/belle1.nuu.edu.tw/channelquality_db/20130530_2029.tar.gz from /var/dbbackup/mysqlhotcopy/belle1.nuu.edu.tw/channelquality_db/20130530_2029/channelquality_db

The hotcopy step took only 5 min for the 9.2 GB hotcopied directory:

[root@belle1 ~]# du -hs /var/dbbackup/mysqlhotcopy/belle1.nuu.edu.tw/channelquality_db/20130530_2029/channelquality_db/
9.2G    /var/dbbackup/mysqlhotcopy/belle1.nuu.edu.tw/channelquality_db/20130530_2029/channelquality_db/
[root@belle1 ~]#
[root@belle1 ~]# du -hs /var/dbbackup/mysqlhotcopy/belle1.nuu.edu.tw/channelquality_db/20130530_2029/channelquality_db/*
4.0K    /var/dbbackup/mysqlhotcopy/belle1.nuu.edu.tw/channelquality_db/20130530_2029/channelquality_db/db.opt
12K     /var/dbbackup/mysqlhotcopy/belle1.nuu.edu.tw/channelquality_db/20130530_2029/channelquality_db/DqChannel.frm
2.5G    /var/dbbackup/mysqlhotcopy/belle1.nuu.edu.tw/channelquality_db/20130530_2029/channelquality_db/DqChannel.MYD
2.8G    /var/dbbackup/mysqlhotcopy/belle1.nuu.edu.tw/channelquality_db/20130530_2029/channelquality_db/DqChannel.MYI
12K     /var/dbbackup/mysqlhotcopy/belle1.nuu.edu.tw/channelquality_db/20130530_2029/channelquality_db/DqChannelStatus.frm
1.3G    /var/dbbackup/mysqlhotcopy/belle1.nuu.edu.tw/channelquality_db/20130530_2029/channelquality_db/DqChannelStatus.MYD
2.8G    /var/dbbackup/mysqlhotcopy/belle1.nuu.edu.tw/channelquality_db/20130530_2029/channelquality_db/DqChannelStatus.MYI
12K     /var/dbbackup/mysqlhotcopy/belle1.nuu.edu.tw/channelquality_db/20130530_2029/channelquality_db/DqChannelStatusVld.frm
16M     /var/dbbackup/mysqlhotcopy/belle1.nuu.edu.tw/channelquality_db/20130530_2029/channelquality_db/DqChannelStatusVld.MYD
3.5M    /var/dbbackup/mysqlhotcopy/belle1.nuu.edu.tw/channelquality_db/20130530_2029/channelquality_db/DqChannelStatusVld.MYI
12K     /var/dbbackup/mysqlhotcopy/belle1.nuu.edu.tw/channelquality_db/20130530_2029/channelquality_db/DqChannelVld.frm
16M     /var/dbbackup/mysqlhotcopy/belle1.nuu.edu.tw/channelquality_db/20130530_2029/channelquality_db/DqChannelVld.MYD
3.3M    /var/dbbackup/mysqlhotcopy/belle1.nuu.edu.tw/channelquality_db/20130530_2029/channelquality_db/DqChannelVld.MYI
12K     /var/dbbackup/mysqlhotcopy/belle1.nuu.edu.tw/channelquality_db/20130530_2029/channelquality_db/LOCALSEQNO.frm
4.0K    /var/dbbackup/mysqlhotcopy/belle1.nuu.edu.tw/channelquality_db/20130530_2029/channelquality_db/LOCALSEQNO.MYD
4.0K    /var/dbbackup/mysqlhotcopy/belle1.nuu.edu.tw/channelquality_db/20130530_2029/channelquality_db/LOCALSEQNO.MYI
[root@belle1 ~]#

Compressing this into an archive is too slow to do interactively:

[root@belle1 ~]# du -h /var/dbbackup/mysqlhotcopy/belle1.nuu.edu.tw/channelquality_db/20130530_2029.tar.gz
479M    /var/dbbackup/mysqlhotcopy/belle1.nuu.edu.tw/channelquality_db/20130530_2029.tar.gz
[root@belle1 ~]#

So it was run from cron instead:

[root@belle1 ~]# crontab -l
SHELL = /bin/bash
PATH=/home/blyth/env/bin:/usr/bin:/bin
04 21 * * * ( mysqlhotcopy.py -l debug -t 20130530_2029 channelquality_db archive > /root/mysqlhotcopy.log 2>&1 )
[root@belle1 ~]#

Archiving the 9.2G directory down to a 2.3G tarball took about 4 hrs. An unimportant error at the end came from forgetting the no-confirm option for sourcedir deletion (cron provides no stdin, so the confirmation prompt hits EOF):

[root@belle1 ~]# cat mysqlhotcopy.log
2013-05-30 21:04:01,229 env.mysqlhotcopy.mysqlhotcopy INFO     /home/blyth/env/bin/mysqlhotcopy.py -l debug -t 20130530_2029 channelquality_db archive
2013-05-30 21:04:01,232 env.mysqlhotcopy.mysqlhotcopy INFO     backupdir /var/dbbackup/mysqlhotcopy/belle1.nuu.edu.tw/channelquality_db
2013-05-30 21:04:01,233 env.mysqlhotcopy.db DEBUG    MyCnf read ['/root/.my.cnf']
2013-05-30 21:04:01,233 env.mysqlhotcopy.db DEBUG    translate mysql config {'host': 'localhost', 'user': 'root', 'database': 'information_schema', 'password': '***', 'socket': '/var/lib/mysql/mysql.sock'} into mysql-python config {'unix_socket': '/var/lib/mysql/mysql.sock', 'host': 'localhost', 'user': 'root', 'passwd': '***', 'db': 'information_schema'}
2013-05-30 21:04:01,233 env.mysqlhotcopy.db DEBUG    connecting to {'unix_socket': '/var/lib/mysql/mysql.sock', 'host': 'localhost', 'user': 'root', 'passwd': '***', 'db': 'information_schema'}
2013-05-30 21:04:01,233 env.mysqlhotcopy.mysqlhotcopy INFO     failed to instanciate connection to database channelquality_db with exception 'NoneType' object has no attribute 'Error'
2013-05-30 21:04:01,234 env.mysqlhotcopy.mysqlhotcopy INFO     ================================== archive
2013-05-30 21:04:01,234 env.mysqlhotcopy.mysqlhotcopy WARNING  no valid db connection using static opts.mb_required 2000
2013-05-30 21:04:01,234 env.mysqlhotcopy.mysqlhotcopy INFO     sufficient free space,      required 2000 MB less than    free 63394.015625 MB
2013-05-30 21:04:01,234 env.mysqlhotcopy.mysqlhotcopy INFO     tagd /var/dbbackup/mysqlhotcopy/belle1.nuu.edu.tw/channelquality_db/20130530_2029  into Tar /var/dbbackup/mysqlhotcopy/belle1.nuu.edu.tw/channelquality_db/20130530_2029.tar.gz channelquality_db gz
2013-05-30 21:04:01,234 env.mysqlhotcopy.tar INFO     creating /var/dbbackup/mysqlhotcopy/belle1.nuu.edu.tw/channelquality_db/20130530_2029.tar.gz from /var/dbbackup/mysqlhotcopy/belle1.nuu.edu.tw/channelquality_db/20130530_2029/channelquality_db
2013-05-31 00:59:05,021 env.mysqlhotcopy.tar INFO     deleting sourcedir /var/dbbackup/mysqlhotcopy/belle1.nuu.edu.tw/channelquality_db/20130530_2029 with leaf 20130530_2029 as the leaf is a dated folder
enter "YES" to confirm deletion of sourcedir /var/dbbackup/mysqlhotcopy/belle1.nuu.edu.tw/channelquality_db/20130530_2029 :Traceback (most recent call last):
  File "/home/blyth/env/bin/mysqlhotcopy.py", line 4, in ?
    main()
  File "/usr/lib/python2.4/site-packages/env/mysqlhotcopy/mysqlhotcopy.py", line 721, in main
    hb(verb)
  File "/usr/lib/python2.4/site-packages/env/mysqlhotcopy/mysqlhotcopy.py", line 470, in __call__
    self._archive()
  File "/usr/lib/python2.4/site-packages/env/mysqlhotcopy/common.py", line 13, in wrapper
    res = func(*arg,**kw)
  File "/usr/lib/python2.4/site-packages/env/mysqlhotcopy/mysqlhotcopy.py", line 570, in _archive
    self.tar.archive(self.tagd, self.opts.deleteafter, self.opts.flattop)
  File "/usr/lib/python2.4/site-packages/env/mysqlhotcopy/common.py", line 13, in wrapper
    res = func(*arg,**kw)
  File "/usr/lib/python2.4/site-packages/env/mysqlhotcopy/tar.py", line 155, in archive
    confirm = raw_input("enter \"YES\" to confirm deletion of sourcedir %s :" % sourcedir )
EOFError: EOF when reading a line
[root@belle1 ~]#
[root@belle1 ~]#
[root@belle1 ~]# du -hs /var/dbbackup/mysqlhotcopy/belle1.nuu.edu.tw/channelquality_db/20130530_2029/
9.2G    /var/dbbackup/mysqlhotcopy/belle1.nuu.edu.tw/channelquality_db/20130530_2029/

[root@belle1 ~]# du -h /var/dbbackup/mysqlhotcopy/belle1.nuu.edu.tw/channelquality_db/20130530_2029.tar.gz
2.3G    /var/dbbackup/mysqlhotcopy/belle1.nuu.edu.tw/channelquality_db/20130530_2029.tar.gz

[root@belle1 ~]# tar ztvf  /var/dbbackup/mysqlhotcopy/belle1.nuu.edu.tw/channelquality_db/20130530_2029.tar.gz
drwxr-x--- mysql/mysql       0 2013-05-30 20:34:38 channelquality_db/
-rw-rw---- mysql/mysql    8618 2013-05-30 19:35:02 channelquality_db/LOCALSEQNO.frm
-rw-rw---- mysql/mysql 3646464 2013-05-30 20:29:40 channelquality_db/DqChannelStatusVld.MYI
-rw-rw---- mysql/mysql    8746 2013-05-30 18:17:42 channelquality_db/DqChannelStatus.frm
-rw-rw---- mysql/mysql 2609292672 2013-05-30 19:34:55 channelquality_db/DqChannel.MYD
-rw-rw---- mysql/mysql 2901941248 2013-05-30 20:29:40 channelquality_db/DqChannel.MYI
-rw-rw---- mysql/mysql         65 2013-05-30 18:17:10 channelquality_db/db.opt
-rw-rw---- mysql/mysql       2048 2013-05-30 20:29:40 channelquality_db/LOCALSEQNO.MYI
-rw-rw---- mysql/mysql 2905288704 2013-05-30 20:29:40 channelquality_db/DqChannelStatus.MYI
-rw-rw---- mysql/mysql 1366772352 2013-05-30 18:52:44 channelquality_db/DqChannelStatus.MYD
-rw-rw---- mysql/mysql   16502223 2013-05-30 18:52:51 channelquality_db/DqChannelStatusVld.MYD
-rw-rw---- mysql/mysql       8908 2013-05-30 19:34:55 channelquality_db/DqChannelVld.frm
-rw-rw---- mysql/mysql       8892 2013-05-30 18:52:51 channelquality_db/DqChannel.frm
-rw-rw---- mysql/mysql   16502223 2013-05-30 19:35:02 channelquality_db/DqChannelVld.MYD
-rw-rw---- mysql/mysql        207 2013-05-30 19:35:02 channelquality_db/LOCALSEQNO.MYD
-rw-rw---- mysql/mysql       8908 2013-05-30 18:52:44 channelquality_db/DqChannelStatusVld.frm
-rw-rw---- mysql/mysql    3427328 2013-05-30 20:29:40 channelquality_db/DqChannelVld.MYI
[root@belle1 ~]#
[root@belle1 ~]#
[root@belle1 ~]#

OOPS, didn't use --flattop, and it takes too long to rerun for this. Actually that simplifies manual extraction, but makes database renaming problematic; as the name is already the desired “channelquality_db”, that's no problem here.
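The EOFError above is simply cron having no stdin when the raw_input confirmation in tar.py fires. A sketch of a cron-safe guard (hypothetical confirm_delete helper, not the actual tar.py code):

```python
import sys

def confirm_delete(sourcedir, no_confirm=False):
    """Return True if deletion of sourcedir is approved.

    Under cron there is no terminal on stdin, so input()/raw_input()
    raises EOFError; skip the prompt when running non-interactively.
    """
    if no_confirm:
        return True    # caller passed the no-confirm option
    if not sys.stdin.isatty():
        return False   # non-interactive (e.g. cron): refuse rather than crash
    return input('enter "YES" to confirm deletion of sourcedir %s :' % sourcedir) == "YES"
```

With such a guard the cron job would simply have left the 9.2G sourcedir in place (as it in fact did) without the traceback.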

Test extraction of belle1 hotcopy onto belle7

Prepare directory for the tarball on belle7 and scp it over from belle1, taking 3.5 min:

[root@belle7 ~]# mkdir -p  /var/dbbackup/mysqlhotcopy/belle1.nuu.edu.tw/channelquality_db/
[root@belle7 ~]# time scp N1:/var/dbbackup/mysqlhotcopy/belle1.nuu.edu.tw/channelquality_db/20130530_2029.tar.gz /var/dbbackup/mysqlhotcopy/belle1.nuu.edu.tw/channelquality_db/
real    3m28.167s
user    1m19.160s
sys     0m24.959s

Verify the digests match:

[root@belle7 ~]# ssh N1 md5sum /var/dbbackup/mysqlhotcopy/belle1.nuu.edu.tw/channelquality_db/20130530_2029.tar.gz
2631bcc9b9c747e238338a4b50c04ad5  /var/dbbackup/mysqlhotcopy/belle1.nuu.edu.tw/channelquality_db/20130530_2029.tar.gz
[root@belle7 ~]# md5sum /var/dbbackup/mysqlhotcopy/belle1.nuu.edu.tw/channelquality_db/20130530_2029.tar.gz
2631bcc9b9c747e238338a4b50c04ad5  /var/dbbackup/mysqlhotcopy/belle1.nuu.edu.tw/channelquality_db/20130530_2029.tar.gz

Check to see what the mysql datadir is:

[root@belle7 ~]# vim .my.cnf   # check the "client" section is appropriate
[root@belle7 ~]# echo select \@\@datadir | mysql
@@datadir
/var/lib/mysql/

Extract into the belle7 datadir; extracting out to 9.2G took less than 5 min:

[root@belle7 ~]# cd /var/lib/mysql
[root@belle7 mysql]# time tar zxvf /var/dbbackup/mysqlhotcopy/belle1.nuu.edu.tw/channelquality_db/20130530_2029.tar.gz
channelquality_db/
channelquality_db/LOCALSEQNO.frm
channelquality_db/DqChannelStatusVld.MYI
channelquality_db/DqChannelStatus.frm
channelquality_db/DqChannel.MYD
channelquality_db/DqChannel.MYI
channelquality_db/db.opt
channelquality_db/LOCALSEQNO.MYI
channelquality_db/DqChannelStatus.MYI
channelquality_db/DqChannelStatus.MYD
channelquality_db/DqChannelStatusVld.MYD
channelquality_db/DqChannelVld.frm
channelquality_db/DqChannel.frm
channelquality_db/DqChannelVld.MYD
channelquality_db/LOCALSEQNO.MYD
channelquality_db/DqChannelStatusVld.frm
channelquality_db/DqChannelVld.MYI

real    4m30.838s
user    1m34.536s
sys     0m40.571s
[root@belle7 mysql]#
[root@belle7 mysql]# du -hs channelquality_db
9.2G    channelquality_db

basic checks OK

mysql> use channelquality_db
Database changed
mysql> show tables ;
+-----------------------------+
| Tables_in_channelquality_db |
+-----------------------------+
| DqChannel                   |
| DqChannelStatus             |
| DqChannelStatusVld          |
| DqChannelVld                |
| LOCALSEQNO                  |
+-----------------------------+
5 rows in set (0.00 sec)

mysql> select max(SEQNO) from  DqChannel ;
+------------+
| max(SEQNO) |
+------------+
|     323573 |
+------------+
1 row in set (0.01 sec)

place tarball on cms01 where Qiumei has access

Prepare directory on S:

[root@cms01 ~]# mkdir -p /data/var/dbbackup/mysqlhotcopy/belle1.nuu.edu.tw/channelquality_db/
[root@cms01 ~]# chown -R dayabayscp.dayabayscp  /data/var/dbbackup/mysqlhotcopy/belle1.nuu.edu.tw/channelquality_db/

Go around the NUU-NTU blockade via my laptop:

simon:~ blyth$ scp N:/var/dbbackup/mysqlhotcopy/belle1.nuu.edu.tw/channelquality_db/20130530_2029.tar.gz .
simon:~ blyth$ scp 20130530_2029.tar.gz S:/data/var/dbbackup/mysqlhotcopy/belle1.nuu.edu.tw/channelquality_db/20130530_2029.tar.gz
simon:~ blyth$ ssh S md5sum /data/var/dbbackup/mysqlhotcopy/belle1.nuu.edu.tw/channelquality_db/20130530_2029.tar.gz          ## nope the dayabayscp user has restricted shell
simon:~ blyth$ ssh C md5sum /data/var/dbbackup/mysqlhotcopy/belle1.nuu.edu.tw/channelquality_db/20130530_2029.tar.gz
2631bcc9b9c747e238338a4b50c04ad5  /data/var/dbbackup/mysqlhotcopy/belle1.nuu.edu.tw/channelquality_db/20130530_2029.tar.gz

dybdb1 extraction instructions for Qiumei

replication config to exclude channelquality_db

Before proceeding with the extraction, exclude channelquality_db from the replication, in addition to the already excluded tmp_ligs_offline_db.

Check free space on the server

Before extraction, check the free space on the server. The uncompressed DB directory is 9.2G and the tarball 2.3G, so you should have at least 25G free in order to proceed.

Get the tarball and check digest

Grab the tarball from S, and check its digest matches those above:

dybdb1> mkdir -p /var/dbbackup/mysqlhotcopy/belle1.nuu.edu.tw/channelquality_db/
dybdb1> scp S:/data/var/dbbackup/mysqlhotcopy/belle1.nuu.edu.tw/channelquality_db/20130530_2029.tar.gz /var/dbbackup/mysqlhotcopy/belle1.nuu.edu.tw/channelquality_db/20130530_2029.tar.gz
         ## it's 2.3 GB, so it will probably take 30-60 min

dybdb1> md5sum  /var/dbbackup/mysqlhotcopy/belle1.nuu.edu.tw/channelquality_db/20130530_2029.tar.gz    ## should be 2631bcc9b9c747e238338a4b50c04ad5

Remove empty channelquality_db

Drop the empty “channelquality_db” database:

mysql > status                               # check we are talking to dybdb1
mysql > drop database channelquality_db ;
mysql > select @@datadir ;                  # check where mysql keeps its data, I expect /data/mysql

Perform extraction into mysql datadir

From that datadir, check the paths within the tarball, then extract it; this should create the directory “channelquality_db”:

dybdb1 > cd /data/mysql
dybdb1 > tar ztvf  /var/dbbackup/mysqlhotcopy/belle1.nuu.edu.tw/channelquality_db/20130530_2029.tar.gz   # check the paths of what will be extracted
          ## a minute or so

dybdb1 > tar zxvf  /var/dbbackup/mysqlhotcopy/belle1.nuu.edu.tw/channelquality_db/20130530_2029.tar.gz   # do the extraction, creating channelquality_db
          ## took less than 5 min on belle7

Basic checks of channelquality_db on belle7

The RUNNO/FILENO duplication presumably comes from different sites/subsites:

mysql> show tables ;
+-----------------------------+
| Tables_in_channelquality_db |
+-----------------------------+
| DqChannel                   |
| DqChannelStatus             |
| DqChannelStatusVld          |
| DqChannelVld                |
| LOCALSEQNO                  |
+-----------------------------+
5 rows in set (0.00 sec)

mysql>
mysql>
mysql> select SEQNO,count(*) N,RUNNO,FILENO from DqChannelStatus group by SEQNO order by SEQNO desc limit 10 ;
+--------+-----+-------+--------+
| SEQNO  | N   | RUNNO | FILENO |
+--------+-----+-------+--------+
| 323573 | 192 | 38860 |    284 |
| 323572 | 192 | 38886 |    343 |
| 323571 | 192 | 38886 |    343 |
| 323570 | 192 | 38860 |    285 |
| 323569 | 192 | 38860 |    285 |
| 323568 | 192 | 38886 |    340 |
| 323567 | 192 | 38886 |    340 |
| 323566 | 192 | 38886 |    336 |
| 323565 | 192 | 38886 |    336 |
| 323564 | 192 | 38886 |    339 |
+--------+-----+-------+--------+
10 rows in set (0.02 sec)

mysql> select SEQNO,count(*) N,RUNNO,FILENO from DqChannel group by SEQNO order by SEQNO desc limit 10 ;
+--------+-----+-------+--------+
| SEQNO  | N   | RUNNO | FILENO |
+--------+-----+-------+--------+
| 323573 | 192 | 38860 |    284 |
| 323572 | 192 | 38886 |    343 |
| 323571 | 192 | 38886 |    343 |
| 323570 | 192 | 38860 |    285 |
| 323569 | 192 | 38860 |    285 |
| 323568 | 192 | 38886 |    340 |
| 323567 | 192 | 38886 |    340 |
| 323566 | 192 | 38886 |    336 |
| 323565 | 192 | 38886 |    336 |
| 323564 | 192 | 38886 |    339 |
+--------+-----+-------+--------+
10 rows in set (0.05 sec)

Large Table Comparisons using grouped digests

mysql> select SEQNO,ROW_COUNTER,md5(concat_ws(",",RUNNO,FILENO,CHANNELID,OCCUPANCY,DADCMEAN,DADCRMS,HVMEAN,HVRMS)) from DqChannel limit 10 ;
+-------+-------------+------------------------------------------------------------------------------------+
| SEQNO | ROW_COUNTER | md5(concat_ws(",",RUNNO,FILENO,CHANNELID,OCCUPANCY,DADCMEAN,DADCRMS,HVMEAN,HVRMS)) |
+-------+-------------+------------------------------------------------------------------------------------+
|     1 |           1 | 6f05eeae022b72a59ce109702579e963                                                   |
|     1 |           2 | 237af55b149a4113cb76ba211e9c780c                                                   |
|     1 |           3 | 2c0f04d886247c1b2332d5de4343f121                                                   |
|     1 |           4 | f25c8f6f0f3863b39549c5b70f4d0d7b                                                   |
|     1 |           5 | 38da30183a502faa5edffa0b66a9d7fd                                                   |
|     1 |           6 | a33ee7df209680d0bc5c24587d0b7b69                                                   |
|     1 |           7 | 606c4aa7971b46ab8535d0f1c436100c                                                   |
|     1 |           8 | 614c10c35498c3181d560a193e210a55                                                   |
|     1 |           9 | 70b9b4b77d1cea95410a4a72628b6114                                                   |
|     1 |          10 | 3aac210ae7274d3e34e6f0b55af66f58                                                   |
+-------+-------------+------------------------------------------------------------------------------------+
10 rows in set (0.00 sec)
mysql> select 33*192 ;
+--------+
| 33*192 |
+--------+
|   6336 |
+--------+
1 row in set (0.00 sec)

Each SEQNO group concatenates 192 row digests of 33 chars each (32 hex plus a separator), i.e. roughly the 6336 chars computed above; this exceeds the default group_concat_max_len of 1024 and would be silently truncated:

mysql> show variables like '%group%' ;
+----------------------------+-------+
| Variable_name              | Value |
+----------------------------+-------+
| group_concat_max_len       | 1024  |
| innodb_log_files_in_group  | 2     |
| innodb_log_group_home_dir  | ./    |
| innodb_mirrored_log_groups | 1     |
+----------------------------+-------+
4 rows in set (0.00 sec)

mysql> set @@group_concat_max_len = ( 1024 << 3 ) ;
Query OK, 0 rows affected (0.00 sec)

mysql> select @@group_concat_max_len ;
+------------------------+
| @@group_concat_max_len |
+------------------------+
|                   8192 |
+------------------------+
1 row in set (0.00 sec)

A per-SEQNO digest that can be compared between tables in separate DBs that should be the same:

mysql> select SEQNO,md5(group_concat(md5(concat_ws(",",ROW_COUNTER,RUNNO,FILENO,CHANNELID,OCCUPANCY,DADCMEAN,DADCRMS,HVMEAN,HVRMS)) separator ",")) as digest from DqChannel group by SEQNO limit 10 ;
+-------+----------------------------------+
| SEQNO | digest                           |
+-------+----------------------------------+
|     1 | 234d05bac921e752a830b725b8e7025d |
|     2 | 7c47447ac58bf99cfb1e1619d1ae497b |
|     3 | 74dd38ac411bb10f61288f871d9c9bf1 |
|     4 | 0ed802219ade8d9aa3b3033d75b2f62f |
|     5 | 32a76390c03c6a4bd3d0d1a958725238 |
|     6 | 06a6d73226d390e1556450edd8fd54ec |
|     7 | ebcdbfb042bf60e934de2d5ee0ec84db |
|     8 | d96107391a788a147e861e9c154d9258 |
|     9 | 57dac2ede0fe9e48be87896480fd6d84 |
|    10 | 29e1fec1affc0ebf814af1777c455078 |
+-------+----------------------------------+
10 rows in set (0.03 sec)

Could also make a full table digest.
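The per-SEQNO digest construction can be mirrored outside MySQL, e.g. to verify a CSV dump against the live DB. A minimal Python sketch of the same two-level md5 scheme (note: for a guaranteed-stable digest the SQL group_concat would also need an ORDER BY, since GROUP_CONCAT row order is otherwise unspecified):

```python
import hashlib

def row_digest(values):
    # mirrors md5(concat_ws(",", ...)): comma-join the column values
    return hashlib.md5(",".join(str(v) for v in values).encode()).hexdigest()

def seqno_digest(rows):
    # mirrors md5(group_concat(md5(...) separator ",")) for one SEQNO group;
    # rows must arrive in a deterministic order (e.g. by ROW_COUNTER)
    return hashlib.md5(",".join(row_digest(r) for r in rows).encode()).hexdigest()

# identical groups give identical digests; any field change alters the digest
a = seqno_digest([(1, 38860, 284), (2, 38860, 284)])
b = seqno_digest([(1, 38860, 284), (2, 38860, 284)])
c = seqno_digest([(1, 38860, 285), (2, 38860, 284)])
```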

In situ repair and trim

DBI trim:

delete P,V from CableMap P join CableMapVld V on P.SEQNO = V.SEQNO  where P.SEQNO > 398 ;
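The same payload-plus-validity trim can be sketched with sqlite3 as a stand-in (SQLite lacks MySQL's multi-table DELETE ... JOIN, so two deletes in one transaction; table names and the 398 cutoff follow the DBI example above):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE CableMap (SEQNO INTEGER, ROW_COUNTER INTEGER)")
cur.execute("CREATE TABLE CableMapVld (SEQNO INTEGER PRIMARY KEY)")
cur.executemany("INSERT INTO CableMap VALUES (?, ?)", [(s, 1) for s in range(1, 501)])
cur.executemany("INSERT INTO CableMapVld VALUES (?)", [(s,) for s in range(1, 501)])

# trim payload and validity tables together, as the DBI delete does
with conn:   # one transaction: both deletes commit or neither does
    conn.execute("DELETE FROM CableMap WHERE SEQNO > 398")
    conn.execute("DELETE FROM CableMapVld WHERE SEQNO > 398")

remaining = conn.execute("SELECT max(SEQNO) FROM CableMapVld").fetchone()[0]
```

Keeping both deletes in one transaction avoids the payload/validity tables disagreeing if the second delete fails.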

Aside on MySQL comments

Comments in the mysql log can be handy for seeing which commands led to which SQL:

[blyth@belle7 ~]$ echo "/* hello without -c does not get into the mysql log */" | mysql
[blyth@belle7 ~]$ echo "/* hello with -c gets into the mysql log */" | mysql -c

How to do that from mysql-python:

In [2]: from DybPython import DB
In [3]: db = DB()
In [4]: db("/* hello from DybPython.DB */")
Out[4]: ()

Or via DBI:

In [5]: from DybDbi import gDbi
In [7]: gDbi.comment("/* hello from gDbi.comment */")    ## HMM NOT WORKING ANY MORE ?

Review belle7 Databases

database summaries

[blyth@belle7 DybPython]$ ./dbsrv.py channelquality_db summary

channelquality_db
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~  ~~~~~~~~~~  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
TABLE_NAME                      TABLE_ROWS  CREATE_TIME                     CHECK_TIME
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~  ~~~~~~~~~~  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
DqChannel                       62126016    2013-05-30 18:52:51             2013-05-30 18:52:51
DqChannelStatus                 62126016    2013-05-30 18:17:42             2013-05-30 18:17:42
DqChannelStatusVld              323573      2013-05-30 18:52:44             None
DqChannelVld                    323573      2013-05-30 19:34:55             None
LOCALSEQNO                      3           2013-05-30 19:35:02             None
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~  ~~~~~~~~~~  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

[blyth@belle7 DybPython]$ ./dbsrv.py tmp_ligs_offline_db_\\d summary      ## regexp argument to dbsrv.py

tmp_ligs_offline_db_0
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~  ~~~~~~~~~~  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
TABLE_NAME                      TABLE_ROWS  CREATE_TIME                     CHECK_TIME
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~  ~~~~~~~~~~  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
DqChannel                       65489088    2013-05-27 13:10:54             2013-05-27 13:26:26
DqChannelStatus                 65436731    2013-05-27 13:36:35             2013-05-27 13:51:36
DqChannelStatusVld              341125      2013-02-04 16:07:56             2013-05-13 13:16:02
DqChannelVld                    341089      2013-02-04 16:07:51             None
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~  ~~~~~~~~~~  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

tmp_ligs_offline_db_1
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~  ~~~~~~~~~~  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
TABLE_NAME                      TABLE_ROWS  CREATE_TIME                     CHECK_TIME
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~  ~~~~~~~~~~  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
DqChannelStatus                 65436672    2013-05-23 14:03:34             None
DqChannelStatusVld              341124      2013-05-23 14:13:29             None
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~  ~~~~~~~~~~  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

tmp_ligs_offline_db_2
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~  ~~~~~~~~~~  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
TABLE_NAME                      TABLE_ROWS  CREATE_TIME                     CHECK_TIME
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~  ~~~~~~~~~~  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
DqChannel                       62126016    2013-05-29 14:59:36             2013-05-29 14:59:37
DqChannelStatus                 62126016    2013-05-29 14:26:08             2013-05-29 14:26:09
DqChannelStatusVld              323573      2013-05-29 14:59:30             None
DqChannelVld                    323573      2013-05-29 15:38:21             None
LOCALSEQNO                      3           2013-05-29 15:38:27             None
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~  ~~~~~~~~~~  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

tmp_ligs_offline_db_3
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~  ~~~~~~~~~~  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
TABLE_NAME                      TABLE_ROWS  CREATE_TIME                     CHECK_TIME
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~  ~~~~~~~~~~  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
DqChannel                       62126016    2013-05-29 18:15:48             2013-05-29 18:15:49
DqChannelStatus                 62126016    2013-05-29 17:42:09             2013-05-29 17:42:10
DqChannelStatusVld              323573      2013-05-29 18:15:42             None
DqChannelVld                    323573      2013-05-29 18:54:28             None
LOCALSEQNO                      3           2013-05-29 18:54:34             None
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~  ~~~~~~~~~~  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

tmp_ligs_offline_db_4
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~  ~~~~~~~~~~  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
TABLE_NAME                      TABLE_ROWS  CREATE_TIME                     CHECK_TIME
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~  ~~~~~~~~~~  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
DqChannel                       62126016    2013-05-30 13:54:33             2013-05-30 14:11:53
DqChannelStatus                 62126016    2013-05-30 14:11:54             2013-05-30 14:26:55
DqChannelStatusVld              323573      2013-05-30 14:26:56             None
DqChannelVld                    323573      2013-05-30 14:26:58             None
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~  ~~~~~~~~~~  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

tmp_ligs_offline_db_5
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~  ~~~~~~~~~~  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
TABLE_NAME                      TABLE_ROWS  CREATE_TIME                     CHECK_TIME
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~  ~~~~~~~~~~  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
DqChannel                       9600        2013-06-03 19:44:16             2013-06-03 19:44:16
DqChannelStatus                 9600        2013-06-03 19:44:17             2013-06-03 19:44:17
DqChannelStatusVld              50          2013-06-03 19:44:17             None
DqChannelVld                    50          2013-06-03 19:44:17             None
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~  ~~~~~~~~~~  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
[blyth@belle7 DybPython]$

tmp_ligs_offline_db_0

  1. DB in which DqChannelStatus was repaired, and DqChannel added in, both from hotcopy tarballs
  2. retains AUTO_INCREMENT=341126

tmp_ligs_offline_db_1

  1. freshly created DB populated via the mysqldump obtained from _0 with the bad SEQNO excluded
  2. AUTO_INCREMENT not preserved

tmp_ligs_offline_db_2

  1. four tables with faked LOCALSEQNO, obtained from a mysqldump of _0 with SEQNO <= 323573
  2. AUTO_INCREMENT not preserved

tmp_ligs_offline_db_3

  1. looks to be the same as _2

tmp_ligs_offline_db_4

  1. created while testing dumplocal/loadlocal, omitted LOCALSEQNO
  2. AUTO_INCREMENT=341126 dbsrv.py dumplocal/loadlocal from _0
[blyth@belle7 DybPython]$ time ./dbsrv.py tmp_ligs_offline_db_0 dumplocal ~/tmp_ligs_offline_db_0 --where 'SEQNO <= 323573' -l debug
[blyth@belle7 DybPython]$ time ./dbsrv.py tmp_ligs_offline_db_4 loadlocal ~/tmp_ligs_offline_db_0  -l debug --DB_DROP_CREATE -C

tmp_ligs_offline_db_5

  1. AUTO_INCREMENT=341126 small partitioned load check with dbsrv.py dumplocal/loadlocal from _0

channelquality_db

  1. DB created from extraction of belle1 hotcopy tarball /data/var/dbbackup/mysqlhotcopy/belle1.nuu.edu.tw/channelquality_db/20130530_2029.tar.gz
  2. no AUTO_INCREMENT, as that was lost via the mysqldump that created it

channelquality_db_0

  1. DB created from partitioned dump of channelquality_db

Partitioned dumplocal/loadlocal:

#17 19 * * * ( $DYBPYTHON_DIR/dbsrv.py channelquality_db dumplocal /tmp/cq/channelquality_db --partition --partitioncfg 10000,0,33 ) >  $CRONLOG_DIR/dbsrv_dump_.log 2>&1
#03 20 * * * ( $DYBPYTHON_DIR/dbsrv.py channelquality_db_0 loadlocal  /tmp/cq/channelquality_db --partition --partitioncfg 10000,0,33 --DB_DROP_CREATE -C ) > $CRONLOG_DIR/dbsrv_load_.log 2>&1
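Assuming --partitioncfg means size,offset,count (consistent with the dumplocal log below, where partition 0 covers SEQNO 1:10000), the SEQNO ranges it generates can be sketched as:

```python
def partition_ranges(size, offset, count):
    # hypothetical reading of --partitioncfg size,offset,count:
    # inclusive (first, last) SEQNO range for each partition
    return [(i * size + 1, (i + 1) * size) for i in range(offset, offset + count)]
```

On this reading, 33 partitions of 10000 cover SEQNO 1 to 330000, comfortably beyond the 323573 max SEQNO in channelquality_db.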

testing DB diffing

Dumping 10k SEQNO takes about 30s:

[blyth@belle7 ~]$ dbsrv tmp_ligs_offline_db_0 dumplocal /tmp/cq/tmp_ligs_offline_db_0 --partition --partitioncfg 10000,0,2
2013-06-05 17:50:34,911 __main__ INFO     /* 10000-partition 1  /2  */ SEQNO >= 1 and SEQNO <= 10000
2013-06-05 17:50:48,550 __main__ INFO     dumplocal partition 0 SEQNO,0 1:10000 --partitioncfg 10000,0,2
2013-06-05 17:50:48,550 __main__ INFO     partition_dumplocal___ /* 10000-partition 1  /2  */ SEQNO >= 1 and SEQNO <= 10000  writing /tmp/cq/tmp_ligs_offline_db_0/10000/0/DqChannel.csv
2013-06-05 17:51:05,411 __main__ INFO     partition_dumplocal___ /* 10000-partition 1  /2  */ SEQNO >= 1 and SEQNO <= 10000  writing /tmp/cq/tmp_ligs_offline_db_0/10000/0/DqChannel.csv took 16.86 seconds
2013-06-05 17:51:05,411 __main__ INFO     partition_dumplocal___ /* 10000-partition 1  /2  */ SEQNO >= 1 and SEQNO <= 10000  writing /tmp/cq/tmp_ligs_offline_db_0/10000/0/DqChannelStatus.csv
2013-06-05 17:51:12,441 __main__ INFO     partition_dumplocal___ /* 10000-partition 1  /2  */ SEQNO >= 1 and SEQNO <= 10000  writing /tmp/cq/tmp_ligs_offline_db_0/10000/0/DqChannelStatus.csv took  7.03 seconds
2013-06-05 17:51:12,442 __main__ INFO     partition_dumplocal___ /* 10000-partition 1  /2  */ SEQNO >= 1 and SEQNO <= 10000  writing /tmp/cq/tmp_ligs_offline_db_0/10000/0/DqChannelStatusVld.csv
2013-06-05 17:51:12,514 __main__ INFO     partition_dumplocal___ /* 10000-partition 1  /2  */ SEQNO >= 1 and SEQNO <= 10000  writing /tmp/cq/tmp_ligs_offline_db_0/10000/0/DqChannelStatusVld.csv took  0.07 seconds
2013-06-05 17:51:12,515 __main__ INFO     partition_dumplocal___ /* 10000-partition 1  /2  */ SEQNO >= 1 and SEQNO <= 10000  writing /tmp/cq/tmp_ligs_offline_db_0/10000/0/DqChannelVld.csv
2013-06-05 17:51:12,794 __main__ INFO     partition_dumplocal___ /* 10000-partition 1  /2  */ SEQNO >= 1 and SEQNO <= 10000  writing /tmp/cq/tmp_ligs_offline_db_0/10000/0/DqChannelVld.csv took  0.28 seconds
2013-06-05 17:51:13,092 __main__ INFO     /* 10000-partition 2  /2  */ SEQNO >= 10001 and SEQNO <= 20000
2013-06-05 17:51:29,136 __main__ INFO     dumplocal partition 1 SEQNO,0 10001:20000 --partitioncfg 10000,0,2
2013-06-05 17:51:29,195 __main__ INFO     partition_dumplocal___ /* 10000-partition 2  /2  */ SEQNO >= 10001 and SEQNO <= 20000  writing /tmp/cq/tmp_ligs_offline_db_0/10000/1/DqChannel.csv
2013-06-05 17:51:46,344 __main__ INFO     partition_dumplocal___ /* 10000-partition 2  /2  */ SEQNO >= 10001 and SEQNO <= 20000  writing /tmp/cq/tmp_ligs_offline_db_0/10000/1/DqChannel.csv took 17.15 seconds
2013-06-05 17:51:46,345 __main__ INFO     partition_dumplocal___ /* 10000-partition 2  /2  */ SEQNO >= 10001 and SEQNO <= 20000  writing /tmp/cq/tmp_ligs_offline_db_0/10000/1/DqChannelStatus.csv
2013-06-05 17:51:53,409 __main__ INFO     partition_dumplocal___ /* 10000-partition 2  /2  */ SEQNO >= 10001 and SEQNO <= 20000  writing /tmp/cq/tmp_ligs_offline_db_0/10000/1/DqChannelStatus.csv took  7.06 seconds
2013-06-05 17:51:53,410 __main__ INFO     partition_dumplocal___ /* 10000-partition 2  /2  */ SEQNO >= 10001 and SEQNO <= 20000  writing /tmp/cq/tmp_ligs_offline_db_0/10000/1/DqChannelStatusVld.csv
2013-06-05 17:51:53,468 __main__ INFO     partition_dumplocal___ /* 10000-partition 2  /2  */ SEQNO >= 10001 and SEQNO <= 20000  writing /tmp/cq/tmp_ligs_offline_db_0/10000/1/DqChannelStatusVld.csv took  0.06 seconds
2013-06-05 17:51:53,468 __main__ INFO     partition_dumplocal___ /* 10000-partition 2  /2  */ SEQNO >= 10001 and SEQNO <= 20000  writing /tmp/cq/tmp_ligs_offline_db_0/10000/1/DqChannelVld.csv
2013-06-05 17:51:53,674 __main__ INFO     partition_dumplocal___ /* 10000-partition 2  /2  */ SEQNO >= 10001 and SEQNO <= 20000  writing /tmp/cq/tmp_ligs_offline_db_0/10000/1/DqChannelVld.csv took  0.21 seconds
[blyth@belle7 ~]$

Hmm, a difference in the validity table SEQNO auto_increment?:

[blyth@belle7 ~]$ diff -r --brief /tmp/cq/channelquality_db/10000/0/ /tmp/cq/tmp_ligs_offline_db_0/10000/0/
Files /tmp/cq/channelquality_db/10000/0/DqChannelStatusVld.schema and /tmp/cq/tmp_ligs_offline_db_0/10000/0/DqChannelStatusVld.schema differ
Files /tmp/cq/channelquality_db/10000/0/DqChannelVld.schema and /tmp/cq/tmp_ligs_offline_db_0/10000/0/DqChannelVld.schema differ
[blyth@belle7 ~]$
[blyth@belle7 ~]$ diff /tmp/cq/channelquality_db/10000/0/DqChannelStatusVld.schema   /tmp/cq/tmp_ligs_offline_db_0/10000/0/DqChannelStatusVld.schema -y
CREATE TABLE `DqChannelStatusVld` (                             CREATE TABLE `DqChannelStatusVld` (
  `SEQNO` int(11) NOT NULL,                                   |   `SEQNO` int(11) NOT NULL auto_increment,
  ...
  `TIMEEND` datetime NOT NULL,                                    `TIMEEND` datetime NOT NULL,
  PRIMARY KEY  (`SEQNO`)                                          PRIMARY KEY  (`SEQNO`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1                        | ) ENGINE=MyISAM AUTO_INCREMENT=341126 DEFAULT CHARSET=latin1[blyth@belle7 ~]$

Cause of the diff:

[blyth@belle7 ~]$ echo show create table DqChannelStatusVld | tmp_ligs_offline_db_0 | grep ENGINE
) ENGINE=MyISAM AUTO_INCREMENT=341126 DEFAULT CHARSET=latin1 |

[blyth@belle7 ~]$ echo show create table DqChannelStatusVld | channelquality_db  | grep ENGINE
) ENGINE=MyISAM DEFAULT CHARSET=latin1 |

[blyth@belle1 ~]$ echo show create table DqChannelStatusVld | channelquality_db | grep ENGINE
) ENGINE=MyISAM DEFAULT CHARSET=latin1 |

Checking for autoincrement:

echo show create table DqChannelStatusVld | tmp_ligs_offline_db_0   # AUTO_INCREMENT=341126   from hotcopy extractions
echo show create table DqChannelStatusVld | tmp_ligs_offline_db_1   # nope
echo show create table DqChannelStatusVld | tmp_ligs_offline_db_2   # nope
echo show create table DqChannelStatusVld | tmp_ligs_offline_db_3   # nope
echo show create table DqChannelStatusVld | tmp_ligs_offline_db_4   # AUTO_INCREMENT=341126    dbsrv.py dumplocal from _0 then loadlocal
echo show create table DqChannelStatusVld | tmp_ligs_offline_db_5   # AUTO_INCREMENT=341126    small partitioned check using dbsrv.py dumplocal locallocal
echo show create table DqChannelStatusVld | channelquality_db       # nope  from dbdumpload.py
echo show create table DqChannelStatusVld | channelquality_db_0     # nope

Is dbdumpload.py omitting the AUTO_INCREMENT setting?

[blyth@belle7 ~]$ dbdumpload.py --no-data --tables "DqChannelVld DqChannelStatusVld" tmp_ligs_offline_db_0 dump /dev/stdout  | sh

mysql bug wrt dumping AUTO_INCREMENT

Somehow the AUTO_INCREMENT setting either got lost, or was included when it should not have been.

create table and autoincrement

An integer or floating-point column can have the additional attribute AUTO_INCREMENT. When you insert a value of NULL (recommended) or 0 into an indexed AUTO_INCREMENT column, the column is set to the next sequence value. Typically this is value+1, where value is the largest value for the column currently in the table. AUTO_INCREMENT sequences begin with 1.

The initial AUTO_INCREMENT value for the table. In MySQL 5.0, this works for MyISAM and MEMORY tables. It is also supported for InnoDB as of MySQL 5.0.3. To set the first auto-increment value for engines that do not support the AUTO_INCREMENT table option, insert a dummy row with a value one less than the desired value after creating the table, and then delete the dummy row.

For engines that support the AUTO_INCREMENT table option in CREATE TABLE statements, you can also use ALTER TABLE tbl_name AUTO_INCREMENT = N to reset the AUTO_INCREMENT value. The value cannot be set lower than the maximum value currently in the column.
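The NULL-insert behaviour quoted above can be demonstrated with any auto-incrementing key; a small sketch using Python's bundled sqlite3, whose INTEGER PRIMARY KEY behaves analogously when NULL is inserted (this is an illustration, not the MySQL engine itself):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE Vld (SEQNO INTEGER PRIMARY KEY, payload TEXT)")
con.execute("INSERT INTO Vld VALUES (NULL, 'first')")   # SEQNO becomes 1
con.execute("INSERT INTO Vld VALUES (NULL, 'second')")  # SEQNO becomes max+1 = 2
print(con.execute("SELECT SEQNO FROM Vld ORDER BY SEQNO").fetchall())
# [(1,), (2,)]
```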

altering auto_increment

  • DBI will probably ignore the AUTO_INCREMENT and come up with its own next SEQNO
  • but the table can be altered to set it anyhow
mysql> show create table CableMapVld ;
+-------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Table       | Create Table                                                                                                                                                                                                                                                                                                                                                                                                                                                                      |
+-------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| CableMapVld | CREATE TABLE `CableMapVld` (
  `SEQNO` int(11) NOT NULL auto_increment,
  `TIMESTART` datetime NOT NULL,
  `TIMEEND` datetime NOT NULL,
  `SITEMASK` tinyint(4) default NULL,
  `SIMMASK` tinyint(4) default NULL,
  `SUBSITE` int(11) default NULL,
  `TASK` int(11) default NULL,
  `AGGREGATENO` int(11) default NULL,
  `VERSIONDATE` datetime NOT NULL,
  `INSERTDATE` datetime NOT NULL,
  PRIMARY KEY  (`SEQNO`)
) ENGINE=MyISAM AUTO_INCREMENT=476 DEFAULT CHARSET=latin1 |
+-------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)

mysql> alter table CableMapVld AUTO_INCREMENT = 500 ;
Query OK, 398 rows affected (0.02 sec)
Records: 398  Duplicates: 0  Warnings: 0

mysql> show create table CableMapVld ;
...
) ENGINE=MyISAM AUTO_INCREMENT=500 DEFAULT CHARSET=latin1 |

mysql> alter table CableMapVld AUTO_INCREMENT = 476 ;
Query OK, 398 rows affected (0.00 sec)
Records: 398  Duplicates: 0  Warnings: 0

mysql> show create table CableMapVld ;
...
) ENGINE=MyISAM AUTO_INCREMENT=476 DEFAULT CHARSET=latin1 |
mysql> select concat(table_schema,'.',table_name) as dbtable,  auto_increment, create_time, update_time from information_schema.tables where table_schema like 'tmp_ligs_offline_db%' or table_schema like 'channelquality_db%'  ;
+------------------------------------------+----------------+---------------------+---------------------+
| dbtable                                  | auto_increment | create_time         | update_time         |
+------------------------------------------+----------------+---------------------+---------------------+
| channelquality_db.DqChannel              |           NULL | 2013-05-30 18:52:51 | 2013-05-30 19:34:55 |
| channelquality_db.DqChannelStatus        |           NULL | 2013-05-30 18:17:42 | 2013-05-30 18:52:44 |
| channelquality_db.DqChannelStatusVld     |           NULL | 2013-05-30 18:52:44 | 2013-05-30 18:52:51 |
| channelquality_db.DqChannelVld           |           NULL | 2013-05-30 19:34:55 | 2013-05-30 19:35:02 |
| channelquality_db.LOCALSEQNO             |           NULL | 2013-05-30 19:35:02 | 2013-05-30 19:35:02 |
| channelquality_db_0.DqChannel            |           NULL | 2013-06-04 20:03:01 | 2013-06-04 21:18:00 |
| channelquality_db_0.DqChannelStatus      |           NULL | 2013-06-04 20:03:22 | 2013-06-04 21:19:18 |
| channelquality_db_0.DqChannelStatusVld   |           NULL | 2013-06-04 20:03:41 | 2013-06-04 21:19:18 |
| channelquality_db_0.DqChannelVld         |           NULL | 2013-06-04 20:03:41 | 2013-06-04 21:19:18 |
| tmp_ligs_offline_db_0.DqChannel          |           NULL | 2013-05-27 13:10:54 | 2013-05-27 13:18:17 |
| tmp_ligs_offline_db_0.DqChannelStatus    |           NULL | 2013-05-27 13:36:35 | 2013-05-27 13:43:50 |
| tmp_ligs_offline_db_0.DqChannelStatusVld |         341126 | 2013-02-04 16:07:56 | 2013-05-20 06:26:55 |
| tmp_ligs_offline_db_0.DqChannelVld       |         341090 | 2013-02-04 16:07:51 | 2013-05-20 06:26:54 |
| tmp_ligs_offline_db_1.DqChannelStatus    |           NULL | 2013-05-23 14:03:34 | 2013-05-23 14:13:28 |
| tmp_ligs_offline_db_1.DqChannelStatusVld |           NULL | 2013-05-23 14:13:29 | 2013-05-23 14:13:36 |
| tmp_ligs_offline_db_2.DqChannel          |           NULL | 2013-05-29 14:59:36 | 2013-05-29 15:38:21 |
| tmp_ligs_offline_db_2.DqChannelStatus    |           NULL | 2013-05-29 14:26:08 | 2013-05-29 14:59:30 |
| tmp_ligs_offline_db_2.DqChannelStatusVld |           NULL | 2013-05-29 14:59:30 | 2013-05-29 14:59:35 |
| tmp_ligs_offline_db_2.DqChannelVld       |           NULL | 2013-05-29 15:38:21 | 2013-05-29 15:38:26 |
| tmp_ligs_offline_db_2.LOCALSEQNO         |           NULL | 2013-05-29 15:38:27 | 2013-05-29 15:38:27 |
| tmp_ligs_offline_db_3.DqChannel          |           NULL | 2013-05-29 18:15:48 | 2013-05-29 18:54:27 |
| tmp_ligs_offline_db_3.DqChannelStatus    |           NULL | 2013-05-29 17:42:09 | 2013-05-29 18:15:41 |
| tmp_ligs_offline_db_3.DqChannelStatusVld |           NULL | 2013-05-29 18:15:42 | 2013-05-29 18:15:46 |
| tmp_ligs_offline_db_3.DqChannelVld       |           NULL | 2013-05-29 18:54:28 | 2013-05-29 18:54:33 |
| tmp_ligs_offline_db_3.LOCALSEQNO         |           NULL | 2013-05-29 18:54:34 | 2013-05-29 18:54:34 |
| tmp_ligs_offline_db_4.DqChannel          |           NULL | 2013-05-30 13:54:33 | 2013-05-30 14:03:58 |
| tmp_ligs_offline_db_4.DqChannelStatus    |           NULL | 2013-05-30 14:11:54 | 2013-05-30 14:19:39 |
| tmp_ligs_offline_db_4.DqChannelStatusVld |         341126 | 2013-05-30 14:26:56 | 2013-05-30 14:26:58 |
| tmp_ligs_offline_db_4.DqChannelVld       |         341090 | 2013-05-30 14:26:58 | 2013-05-30 14:27:01 |
| tmp_ligs_offline_db_5.DqChannel          |           NULL | 2013-06-03 19:44:16 | 2013-06-03 19:44:17 |
| tmp_ligs_offline_db_5.DqChannelStatus    |           NULL | 2013-06-03 19:44:17 | 2013-06-03 19:44:17 |
| tmp_ligs_offline_db_5.DqChannelStatusVld |         341126 | 2013-06-03 19:44:17 | 2013-06-03 19:44:17 |
| tmp_ligs_offline_db_5.DqChannelVld       |         341090 | 2013-06-03 19:44:17 | 2013-06-03 19:44:17 |
+------------------------------------------+----------------+---------------------+---------------------+
33 rows in set (0.01 sec)


mysql> select max(SEQNO) from tmp_ligs_offline_db_0.DqChannelStatusVld ;
+------------+
| max(SEQNO) |
+------------+
|     341125 |
+------------+
1 row in set (0.03 sec)

mysql> select max(SEQNO) from tmp_ligs_offline_db_0.DqChannelVld  ;
+------------+
| max(SEQNO) |
+------------+
|     341089 |
+------------+
1 row in set (0.04 sec)

mysql> select concat(table_schema,'.',table_name) as dbtable,  auto_increment, create_time, update_time from information_schema.tables where auto_increment > 10 and auto_increment < 1000 ;
+--------------------------------------+----------------+---------------------+---------------------+
| dbtable                              | auto_increment | create_time         | update_time         |
+--------------------------------------+----------------+---------------------+---------------------+
| fake_dcs.DBNS_AD_HV_Imon             |            295 | 2012-08-31 09:09:25 | 2012-08-31 09:09:25 |
| fake_dcs.DBNS_AD_HV_Pw               |            296 | 2012-08-31 09:09:25 | 2012-08-31 09:09:25 |
| fake_dcs.DBNS_AD_HV_Vmon             |            295 | 2012-08-31 09:09:25 | 2012-08-31 09:09:25 |
| fake_dcs.ins_location                |             23 | 2012-08-31 09:09:26 | 2012-08-31 09:09:26 |
| fix_offline_db.CableMapVld           |            476 | 2013-06-05 19:54:21 | 2013-06-05 19:54:21 |
| mydb.auth_permission                 |             34 | 2009-05-27 15:26:38 | 2009-08-14 18:01:09 |
| mydb.auth_user_user_permissions      |             27 | 2009-05-27 15:25:54 | 2009-06-08 16:13:34 |
| mydb.django_content_type             |             14 | 2009-05-27 15:25:54 | 2009-08-14 18:01:09 |
| tmp_cascade_1.DbiDemoData1Vld        |            204 | 2011-08-19 19:03:49 | 2011-08-19 19:03:49 |
| tmp_cascade_1.DbiDemoData2Vld        |            635 | 2011-08-19 19:03:49 | 2011-08-19 19:03:49 |
| tmp_dbitest_1.DbiDemoData1Vld        |            204 | 2013-06-04 13:05:14 | 2013-06-04 13:05:14 |
| tmp_dbitest_1.DbiDemoData2Vld        |            635 | 2013-06-04 13:05:14 | 2013-06-04 13:05:14 |
| tmp_dybdbitest_1.DbiDemoData1Vld     |            204 | 2013-06-04 13:05:33 | 2013-06-04 13:05:33 |
| tmp_dybdbitest_1.DbiDemoData2Vld     |            635 | 2013-06-04 13:05:33 | 2013-06-04 13:05:33 |
| tmp_dybdbitest_2.DbiDemoData1Vld     |            204 | 2013-06-04 13:05:33 | 2013-06-04 13:05:33 |
| tmp_dybdbitest_2.DbiDemoData2Vld     |            635 | 2013-06-04 13:05:33 | 2013-06-04 13:05:33 |
| tmp_offline_db.DemoVld               |             50 | 2013-05-28 18:26:02 | 2013-05-28 18:39:55 |
| tmp_tmp_offline_db_1.DbiDemoData1Vld |            204 | 2011-03-29 18:59:10 | 2011-03-29 18:59:10 |
| tmp_tmp_offline_db_1.DbiDemoData2Vld |            635 | 2011-03-29 18:59:10 | 2011-03-29 18:59:10 |
+--------------------------------------+----------------+---------------------+---------------------+
19 rows in set (0.10 sec)

mysql>
mysql>
mysql> select max(SEQNO) from tmp_tmp_offline_db_1.DbiDemoData1Vld ;
+------------+
| max(SEQNO) |
+------------+
|        203 |
+------------+
1 row in set (0.00 sec)

mysql> select max(SEQNO) from tmp_offline_db.DemoVld  ;
+------------+
| max(SEQNO) |
+------------+
|         49 |
+------------+
1 row in set (0.00 sec)
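The spot checks above generalize: for an untouched DBI Vld table the AUTO_INCREMENT counter should sit at max(SEQNO)+1, so tables where the counter is missing or inconsistent can be flagged mechanically. A sketch over a pre-fetched mapping (the helper and data shapes are mine, not dbsrv.py's):

```python
def flag_auto_increment(tables):
    """tables: {name: (auto_increment_or_None, max_seqno)}
    Return names whose counter is missing or != max(SEQNO)+1."""
    return sorted(name for name, (ai, mx) in tables.items()
                  if ai is None or ai != mx + 1)

tables = {
    "tmp_ligs_offline_db_0.DqChannelStatusVld": (341126, 341125),  # consistent
    "channelquality_db.DqChannelStatusVld":     (None,   341125),  # counter lost
}
print(flag_auto_increment(tables))
# ['channelquality_db.DqChannelStatusVld']
```

The two inputs could be filled from information_schema.tables and a max(SEQNO) query per table, as done interactively above.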

altering auto increment

  • it seems the query and the ALTER cannot be combined into one statement; even trying to use variables failed
mysql> select max(SEQNO)+1 from CableMapVld ;
+--------------+
| max(SEQNO)+1 |
+--------------+
|          399 |
+--------------+
1 row in set (0.00 sec)

mysql> alter table CableMapVld auto_increment = 399 ;
Query OK, 398 rows affected (0.01 sec)
Records: 398  Duplicates: 0  Warnings: 0
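Since ALTER TABLE ... AUTO_INCREMENT = N only accepts a literal value, the two steps have to be combined client-side: fetch max(SEQNO), then issue the ALTER with the computed number. A hypothetical helper sketching the statement construction (names are mine):

```python
def next_seqno_statement(table, max_seqno):
    """Build the ALTER statement from a previously fetched max(SEQNO);
    ALTER TABLE does not accept a subquery or expression for the value."""
    return "ALTER TABLE %s AUTO_INCREMENT = %d" % (table, max_seqno + 1)

# after e.g. cursor.execute("SELECT max(SEQNO) FROM CableMapVld") returns 398:
print(next_seqno_statement("CableMapVld", 398))
# ALTER TABLE CableMapVld AUTO_INCREMENT = 399
```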

Test use of partitioned dumplocal for large DB diffing

Diff all tables one partition of 10k SEQNO at a time, each partition taking approx 2-5 s. Thus 2-3 minutes to diff 10 GB of DB.

[blyth@belle7 cq]$  diff -r --brief channelquality_db/10000/_ tmp_ligs_offline_db_0/10000/_                     ## the auto increment issue
Files channelquality_db/10000/_/DqChannelStatusVld.schema and tmp_ligs_offline_db_0/10000/_/DqChannelStatusVld.schema differ
Files channelquality_db/10000/_/DqChannelVld.schema and tmp_ligs_offline_db_0/10000/_/DqChannelVld.schema differ

[blyth@belle7 cq]$ diff -r --brief channelquality_db/10000/1 tmp_ligs_offline_db_0/10000/0                      ## oops must line up the partitions
Files channelquality_db/10000/1/DqChannel.csv and tmp_ligs_offline_db_0/10000/0/DqChannel.csv differ
Files channelquality_db/10000/1/DqChannelStatus.csv and tmp_ligs_offline_db_0/10000/0/DqChannelStatus.csv differ
Files channelquality_db/10000/1/DqChannelStatusVld.csv and tmp_ligs_offline_db_0/10000/0/DqChannelStatusVld.csv differ
Files channelquality_db/10000/1/DqChannelVld.csv and tmp_ligs_offline_db_0/10000/0/DqChannelVld.csv differ

[blyth@belle7 cq]$ diff -r --brief channelquality_db/10000/0 tmp_ligs_offline_db_0/10000/0
[blyth@belle7 cq]$ diff -r --brief channelquality_db/10000/1 tmp_ligs_offline_db_0/10000/1
[blyth@belle7 cq]$ diff -r --brief channelquality_db/10000/2 tmp_ligs_offline_db_0/10000/2
[blyth@belle7 cq]$ diff -r --brief channelquality_db/10000/3 tmp_ligs_offline_db_0/10000/3
[blyth@belle7 cq]$ diff -r --brief channelquality_db/10000/4 tmp_ligs_offline_db_0/10000/4
[blyth@belle7 cq]$ time diff -r --brief channelquality_db/10000/5 tmp_ligs_offline_db_0/10000/5
real    0m5.352s
user    0m0.736s
sys     0m0.265s

[blyth@belle7 cq]$ time diff -r --brief channelquality_db/10000/6 tmp_ligs_offline_db_0/10000/6
real    0m2.658s
user    0m0.648s
sys     0m0.235s

[blyth@belle7 cq]$ time diff -r --brief channelquality_db/10000/7 tmp_ligs_offline_db_0/10000/7
real    0m6.171s
user    0m0.686s
sys     0m0.277s
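The chunk-by-chunk comparisons above can be scripted rather than typed out per index. A minimal sketch using Python's filecmp over two dump trees, assuming the <db>/<chunksize>/<index>/ layout shown (directory and file names in the demo are illustrative):

```python
import filecmp, os, tempfile

def diff_chunks(left_root, right_root):
    """Compare same-index chunk directories under two dump trees;
    return {index: names of files that differ or are missing}.
    Indices must line up, as noted above."""
    bad = {}
    for idx in sorted(os.listdir(left_root)):
        cmp = filecmp.dircmp(os.path.join(left_root, idx),
                             os.path.join(right_root, idx))
        diffs = cmp.diff_files + cmp.left_only + cmp.right_only
        if diffs:
            bad[idx] = sorted(diffs)
    return bad

# demo with two tiny dump trees: chunk 0 identical, chunk 1 differing
root = tempfile.mkdtemp()
for side, payload in (("L", "1,2013-04-01\n"), ("R", "1,CORRUPT\n")):
    os.makedirs(os.path.join(root, side, "0"))
    os.makedirs(os.path.join(root, side, "1"))
    open(os.path.join(root, side, "0", "DqChannelVld.csv"), "w").write("same\n")
    open(os.path.join(root, side, "1", "DqChannelStatus.csv"), "w").write(payload)

result = diff_chunks(os.path.join(root, "L"), os.path.join(root, "R"))
print(result)   # {'1': ['DqChannelStatus.csv']}
```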

Validating Recovered DB

Procedure if verified backups had been available

  1. a verified and daily updated backup would be available on a remote node
  2. on finding corruption in a table, repairs could be performed in situ and the result could be compared against the verified backup.
    • if losses/problems were seen as a result of the repair, it would be necessary to revert to the backup and refill the entries that were lost

Our procedure without verified backups

Warning

A poor situation, arising from a non-backed-up tmp_ DB morphing into a critical one.

  1. corruption found
  2. hotcopy of the tables made and transferred to a remote node, where a repair was done, appearing to be successful
    • but that cannot be verified without a backup to compare against
  3. forced to assume the repair did no harm

Strategy

  1. get confidence in the techniques developed by validating the three non-crashed tables up to SEQNO <= 323573 by comparison of partitioned dumps:

    DqChannel
    DqChannelVld
    DqChannelStatusVld
    

    success here indicates we are doing no harm, at least for these three tables

In situ repair

Doing a repair on the server and comparing with the repair done on belle7 would give a crosscheck, but not certainty.

Although the repair was simple to do on belle7 (and took only ~4 min) it can potentially lose all data in the table. Thus it should never normally be done without a verified backup available.

Databases compared

home         sect                        note
dybdb1_ligs  tmp_ligs_offline_db_dybdb1  original on dybdb1
dybdb2_ligs  channelquality_db_dybdb2    recovered on dybdb2
loopback     channelquality_db_belle7    recovered onto belle7 from hotcopy created on belle1

Partitioning

Partition tables in SEQNO chunks. The chunking allows DB diffs to be made using filesystem "diff -r --brief" applied to the chunk directories. This results in drastically faster diffing than, for example, creating digests from queries. It will also allow efficient incremental backups of very large tables.
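The chunking rule assumed here is simple: SEQNO 1..10000 lands in chunk 0, 10001..20000 in chunk 1, and so on. A sketch (the helper names are mine, not dbsrv.py's):

```python
CHUNK = 10000

def chunk_index(seqno):
    """Map a SEQNO to its partition directory index (SEQNO starts at 1)."""
    return (seqno - 1) // CHUNK

def chunk_where(index):
    """WHERE clause selecting one partition, for a per-chunk CSV dump."""
    lo, hi = index * CHUNK + 1, (index + 1) * CHUNK
    return "WHERE SEQNO >= %d AND SEQNO <= %d" % (lo, hi)

print(chunk_index(323575))   # 32 : the corrupted SEQNO's partition
print(chunk_where(0))        # WHERE SEQNO >= 1 AND SEQNO <= 10000
```

Because chunk boundaries are fixed by SEQNO rather than by row count, an unchanged SEQNO range always produces an identical chunk file, which is what makes both the diffing and the incremental backups cheap.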

Using

Partitioned dumping (in 10000 SEQNO chunks) created CSV files for each table:

[blyth@belle7 10000]$ pwd
/tmp/cq/channelquality_db_dybdb2/10000
[blyth@belle7 10000]$ l 0/
total 115512
-rw-r--r-- 1 blyth blyth   1038894 Jun  6 18:54 DqChannelStatusVld.csv
-rw-r--r-- 1 blyth blyth   1038894 Jun  6 18:54 DqChannelVld.csv
-rw-r--r-- 1 blyth blyth 116074010 Jun  6 18:54 DqChannel.csv
[blyth@belle7 10000]$

This allowed comparison using commands like:

diff -r --brief /tmp/cq/channelquality_db_belle7/10000/0 /tmp/cq/channelquality_db_dybdb2/10000/0

Only an insignificant formatting difference was encountered, in one file.

More details at