文章关键字 ‘MySQL’

optimizer_search_depth

2010年04月20日,星期二

optimizer_search_depth
The maximum depth of search performed by the query optimizer. Values larger than the number of relations in a query result in better query plans, but take longer to generate an execution plan for a query. Values smaller than the number of relations in a query return an execution plan quicker, but the resulting plan may be far from being optimal.
其中的the number of relations in a query到底指什么?Mark一下

MySQL的Clustered Table?

2010年04月9日,星期五

http://www.yupoo.com/photos/nosql/73299304/

Mark一下。

Midpoint Insertion Strategy的翻译和理解

2010年04月7日,星期三

7.4.5.3. Midpoint Insertion Strategy

By default, the key cache management system uses a simple LRU strategy for choosing key cache blocks to be evicted, but it also supports a more sophisticated method called the midpoint insertion strategy.

Key Cache管理系统默认使用LRU策略来选择丢弃的block,它同样支持一种更复杂的方法:midpoint insertion strategy

When using the midpoint insertion strategy, the LRU chain is divided into two parts: a hot sublist and a warm sublist. The division point between two parts is not fixed, but the key cache management system takes care that the warm part is not “too short,” always containing at least key_cache_division_limit percent of the key cache blocks. key_cache_division_limit is a component of structured key cache variables, so its value is a parameter that can be set per cache.

当使用midpoint insertion strategy策略时,LRU的链被分成两部分:hot子链和warm子链。分割点不是固定的,但是Key Cache管理系统控制着warm子链,不会让它过于短,至少是key_cache_division_limit的百分比于Key Cache的长度。key_cache_division_limitKey Cache的系统变量,所以它可以为每个Cache设置。

When an index block is read from a table into the key cache, it is placed at the end of the warm sublist. After a certain number of hits (accesses of the block), it is promoted to the hot sublist. At present, the number of hits required to promote a block (3) is the same for all index blocks.

当一个索引块被从表中读取到Key Cache中时,它被放在warm子链的尾部。在一定数量的访问(点击)后,(其实在这之前该block已经在warm子链的开始部分了)它被提升到hot子链中。在目前来说,对该block的访问次数是所有的index blocks的总和(数字的等同)

A block promoted into the hot sublist is placed at the end of the list. The block then circulates within this sublist. If the block stays at the beginning of the sublist for a long enough time, it is demoted to the warm sublist. This time is determined by the value of the key_cache_age_threshold component of the key cache.

 当一个block被提升到hot子链时,它是放在hot子链的尾部。然后该block开始在hot子链中循环。如果该blockhot子链的开始部分待了很长时间,它会被降到warm子链。这次决定的是key_cache_age_threshold系统变量。

 

The threshold value prescribes that, for a key cache containing N blocks, the block at the beginning of the hot sublist not accessed within the last N × key_cache_age_threshold / 100 hits is to be moved to the beginning of the warm sublist. It then becomes the first candidate for eviction, because blocks for replacement always are taken from the beginning of the warm sublist.

假定Key Cache中有Nblocks,临界值规定了:在hot子链开始部分的block在最近的(N × key_cache_age_threshold / 100)次访问中没有涉及的话被放回warm子链的开始部分。key_cache_age_threshold决定了频率的高低(最小值100)然后该block成为最有可能被丢弃的,因为新的block替换就发生在warm子链的开始部分。

The midpoint insertion strategy allows you to keep more-valued blocks always in the cache. If you prefer to use the plain LRU strategy, leave the key_cache_division_limit value set to its default of 100.

midpoint insertion strategy能让你保持重要的block始终在Cache(设置key_cache_age_threshold即可)。如果你还是想使用LRU策略,那么将key_cache_division_limit设置成默认的100

The midpoint insertion strategy helps to improve performance when execution of a query that requires an index scan effectively pushes out of the cache all the index blocks corresponding to valuable high-level B-tree nodes. To avoid this, you must use a midpoint insertion strategy with the key_cache_division_limit set to much less than 100. Then valuable frequently hit nodes are preserved in the hot sublist during an index scan operation as well.

midpoint insertion strategy帮助你提升执行全索引扫描时几乎所有的Cache(warm子链的部分)都要替换成新的B树节点的性能。为了避免对性能产生较大的影响,你必须将key_cache_division_limit设置成小于100(有效地控制其影响范围)。这样重要的经常访问的节点在全索引扫描时就被保护在hot子链中。

总结:相当于在一个LRU链上虚拟了两个链(hotwarm),全索引扫描时不会波及全部。

MySQL Refman 7.4.5 The MyISAM Key Cache 译文

2010年04月7日,星期三

原文:

7.4.5. The MyISAM Key Cache

To minimize disk I/O, the MyISAM storage engine exploits a strategy that is used by many database management systems. It employs a cache mechanism to keep the most frequently accessed table blocks in memory:

  • For index blocks, a special structure called the key cache (or key buffer) is maintained. The structure contains a number of block buffers where the most-used index blocks are placed.

  • For data blocks, MySQL uses no special cache. Instead it relies on the native operating system file system cache.

This section first describes the basic operation of the MyISAM key cache. Then it discusses features that improve key cache performance and that enable you to better control cache operation:

  • Multiple sessions can access the cache concurrently.

  • You can set up multiple key caches and assign table indexes to specific caches.

To control the size of the key cache, use the key_buffer_size system variable. If this variable is set equal to zero, no key cache is used. The key cache also is not used if the key_buffer_size value is too small to allocate the minimal number of block buffers (8).

MySQL Enterprise.  For expert advice on identifying the optimum size for key_buffer_size, subscribe to the MySQL Enterprise Monitor. See http://www.mysql.com/products/enterprise/advisors.html.

When the key cache is not operational, index files are accessed using only the native file system buffering provided by the operating system. (In other words, table index blocks are accessed using the same strategy as that employed for table data blocks.)

An index block is a contiguous unit of access to the MyISAM index files. Usually the size of an index block is equal to the size of nodes of the index B-tree. (Indexes are represented on disk using a B-tree data structure. Nodes at the bottom of the tree are leaf nodes. Nodes above the leaf nodes are nonleaf nodes.)

All block buffers in a key cache structure are the same size. This size can be equal to, greater than, or less than the size of a table index block. Usually one these two values is a multiple of the other.

When data from any table index block must be accessed, the server first checks whether it is available in some block buffer of the key cache. If it is, the server accesses data in the key cache rather than on disk. That is, it reads from the cache or writes into it rather than reading from or writing to disk. Otherwise, the server chooses a cache block buffer containing a different table index block (or blocks) and replaces the data there by a copy of required table index block. As soon as the new index block is in the cache, the index data can be accessed.

If it happens that a block selected for replacement has been modified, the block is considered “dirty.” In this case, prior to being replaced, its contents are flushed to the table index from which it came.

Usually the server follows an LRU (Least Recently Used) strategy: When choosing a block for replacement, it selects the least recently used index block. To make this choice easier, the key cache module maintains all used blocks in a special list (LRU chain) ordered by time of use. When a block is accessed, it is the most recently used and is placed at the end of the list. When blocks need to be replaced, blocks at the beginning of the list are the least recently used and become the first candidates for eviction.

译文:

为了减小diskI/OMyISAM的存储引擎和一般的数据库系统一样,使用了缓存机制保存经常访问的block(index block)

  • 对于索引块(index block)Key Cache中保存了常用的索引块。
  • 对于数据块(data block)MySQL没有使用Cache,只是使用了操作系统中文件系统本身的Cache

基本的功能机制和性能优化选项。

  • session并发访问Cache
  • 可以set up多重Key Cache,并指定一个table index给指定的Cache

参数key_buffer_size设置为0或者过小时不使用Key Cache,也就是和数据块一样只能使用操作系统中文件系统自带的Cache

一个索引块一般是MyISAM的索引文件的连续访问的组合。一般索引块的大小是MyISAM的索引结构中节点的大小(MyISAM的索引以BTREE结构存放)

所有Key Cache中的block buffer是一样大的,可以大于、等于、小于索引块的大小。

当需要访问索引数据时,服务器检查是否有现成的block bufferKey Cache中。若有,则从Key Cache中读取而不是disk上读取。也就是说,对索引的读和写均在Key Cache(写的部分到时候要flushdisk)。若没有,服务器将Cache中包含的不同表的索引块的block buffer替换成请求的索引块的拷贝,然后再提供读写。

当一个block被替换之前已经被修改过了,这个block就被标记为。这种情况下,在再次替换前,先将这个block flush到所属表的index文件上(disk)

服务器一般使用LRU策略:当选择被替换的block时,选择最近最少使用的索引块。为了更好地实现该算法,Key Cache模块将所有的block按时间顺序放在一个list(LRU),当一个block被访问时,它被放在最近最常使用的位置(LRU链的结尾,MRU)。当一个block需要被替换时,LRU的开头部分(LRU)就是优先考虑的部分,没有再到MRU部分找。

 

由concurrent_insert参数想到的,MyISAM和InnoDB的插入性能和InnoDB的行锁定

2010年03月29日,星期一

    今天看到concurrent_insert这个参数,解释一下:

 

    MyISAM存储引擎有一个系统变量concurrent_insert,专门用以控制其并发插入的行为,其值分别可以为0、1或2。

    当 concurrent_insert设置为0时,不允许并发插入。

    当concurrent_insert设置为1时,如果MyISAM表中没有空洞(即表的中间没有被删除的行),MyISAM允许在一个进程读表的同时,另一个进程从表尾插入记录。这也是MySQL的默认设置。如果有空洞的话虽然不能很好的并发,但是MySQL还是可以使用INSERT DELAYED来提升插入性能(仅适用于MyISAM,MEMORY和ARCHIVE引擎)。

    当 concurrent_insert设置为2时,无论MyISAM表中有没有空洞,都允许在表尾并发插入记录,这时MySQL允许INSERTSELECT语句在中间没有空数据块的MyISAM表中并行运行。

    注:holes– 空洞这个概念是myisam引擎特有的,myisam在存储时数据文件是顺序的,当有删除操作时,会发生数据文件的空洞.
当有新数据插入时,先查看空洞空间能否能够插入这条数据(concurrent_insert为2时不用判断),否则插入到末端,
此设置只对myisam引擎有效,innodb存储方式有自己独有的方法,不会产生数据空洞。

    那么InnoDB呢,它能不能做到INSERT和SELECT并发呢?这里有个情况说明一下,当这个InnoDB表有索引时当然可以并发(行级锁,只要不连累到,因为因为InnoDB行锁的实现是在索引项上加锁实现的,不同于Oracle的在数据块对行进行加锁,而语句写得不好连累到不相关的行都会被锁定。),但是没有呢?

    先建立一个测试表CurrentTable(注意没有索引):

    然后插入999条数据。

    接着显式开始事务1:执行插入操作

另一个终端上对该表进行select是可以的,这个可以理解,因为只是S锁。

但是另一个终端上进行select … for update的话就出现hanging了。

hanging

从上面的例子也能很清楚的理解(InnoDB行锁的实现是在索引项上加锁实现的)这句话。

而同样的表,我创建了基于ID的主键聚集索引以后,同样的场景,select … for update是可以顺利执行的。

而当该表使用的是MyISAM引擎时,不但select可以执行,select … for update也可以(在数据文件层面实现了新插入行和更改现有行的隔离并发执行)。

 

    从上面的一些只言片语中应该能感觉到是不是插入性能上MyISAM比InnoDB强呢,虽然这里是讨论的insert和select并发及锁定的机制问题。

    确实,MyISAM的插入性能优于InnoDB,具体参见这篇realzyy的博文

COMMIT AND CHAIN and ROLLBACK AND CHAIN

2010年03月29日,星期一

COMMIT AND CHAIN and ROLLBACK AND CHAIN COMMIT RELEASE and ROLLBACK RELEASE 应用场景或者是什么实际的例子,只知道CHAIN是以相同隔离等级开始新事物后者是释放并断开与服务器连接,谁有,不理解它们的使用背景,Mark一下。

Packed (Prefix-Compressed) Indexes对性能影响的一个例子

2010年03月29日,星期一

    在HighPerformanceMySQL的Redundant and Duplicate Indexes章节中,对于Q1:SELECT count(*) FROM userinfo WHERE state_id=5;

    在这两种索引的情况下:KEY state_id_1 (state_id);

                                                KEY state_id_2 (state_id, city, address);

    为什么在MyISAM引擎时,Q1的QPS差距那么大,而InnoDB时,Q1的QPS差距那么小,原文上作者说是MyISAM的前缀索引的缘故,但是对于Q1而言,都是走索引即可,统计state_id=5的总数,前缀索引怎么使效率差那么多的?这个一直没有理解。下面附上原文,并标记了相关的四个QPS值。

SQL服务器模式(sql_mode)中的HIGH_NOT_PRECEDENCE参数理解

2010年03月26日,星期五

    在看refman时看到sql_mode中的这个参数HIGH_NOT_PRECEDENCE时,刚开始有点理解不了官方举的例子,就是关于not的优先级的例子。

    原文部分:

 

·         HIGH_NOT_PRECEDENCE

NOT操作符的优先顺序是表达式例如NOT a BETWEEN b AND c被解释为NOT (a BETWEEN b AND c)。在一些旧版本MySQL中, 表达式被解释为(NOT a) BETWEEN b AND c。启用HIGH_NOT_PRECEDENCESQL模式,可以获得以前旧版本的更高优先级的结果。

mysql> SET sql_mode = '';
mysql> SELECT NOT 1 BETWEEN -5 AND 5;
        -> 0
mysql> SET sql_mode = 'broken_not';
mysql> SELECT NOT 1 BETWEEN -5 AND 5;
        -> 1
为什么例子中的sql_mode是broken_not呢?
我自己测试了一下(把-5和5换成了-2和2):
第二个SQL语句相当于select not (select 1 between -2 and 2);
即select not 1;所以是0.
而将sql_mode改成HIGH_NOT_PRECEDENCE后,not的优先级变成更高了(和旧版本一样),变成select (not 1) between -2 and 2;
即select 0 between -2 and 2;这个是成立的,所以结果为1。(后来突然想到的,这个当时没想到,鄙视一下自己)。

between and 和 in()一个对比

2010年02月26日,星期五

现有一个表(数据量不到3000):

CREATE TABLE `Test` (
  `id` int(11) NOT NULL,
  `city` varchar(50) NOT NULL,
  `addr` varchar(50) NOT NULL,
  KEY `city` (`city`,`addr`),
  KEY `id` (`id`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8
看两种语句的执行计划是一样的:

mysql> explain select SQL_NO_CACHE id from Test where id in (1,2,5,7,9,10,11,12,13,14,15,16,17,18,19,20,21,22);
+—-+————-+——-+——-+—————+——+———+——+——+————————–+
| id | select_type | table | type  | possible_keys | key  | key_len | ref  | rows | Extra                    |
+—-+————-+——-+——-+—————+——+———+——+——+————————–+
|  1 | SIMPLE      | Test  | range | id            | id   | 4       | NULL |   18 | Using where; Using index |
+—-+————-+——-+——-+—————+——+———+——+——+————————–+
1 row in set (0.02 sec)

mysql> explain select SQL_NO_CACHE id from Test where id between 0 and 22;
+—-+————-+——-+——-+—————+——+———+——+——+————————–+
| id | select_type | table | type  | possible_keys | key  | key_len | ref  | rows | Extra                    |
+—-+————-+——-+——-+—————+——+———+——+——+————————–+
|  1 | SIMPLE      | Test  | range | id            | id   | 4       | NULL |   18 | Using where; Using index |
+—-+————-+——-+——-+—————+——+———+——+——+————————–+
1 row in set (0.00 sec)

但是实际的profiles:
mysql> show profiles;
+———-+————+————————————————————————————————-+
| Query_ID | Duration   | Query                                                                                           |
+———-+————+————————————————————————————————-+
|        1 | 0.01375400 | select SQL_NO_CACHE id from Test where id between 0 and 22                                      |
|        2 | 0.00043300 | select SQL_NO_CACHE id from Test where id in (1,2,5,7,9,10,11,12,13,14,15,16,17,18,19,20,21,22) |
+———-+————+————————————————————————————————-+
差距是后者时间只是前者的3.15%。同样是走索引的。
详细的:
mysql> show profile for query 1;
+——————————–+———-+
| Status                         | Duration |
+——————————–+———-+
| starting                       | 0.000047 |
| checking query cache for query | 0.000070 |
| Opening tables                 | 0.000013 |
| System lock                    | 0.000007 |
| Table lock                     | 0.000026 |
| init                           | 0.000020 |
| optimizing                     | 0.000011 |
| statistics                     | 0.013088 |
| preparing                      | 0.000337 |
| executing                      | 0.000007 |
| Sending data                   | 0.000090 |
| end                            | 0.000005 |
| end                            | 0.000003 |
| query end                      | 0.000005 |
| freeing items                  | 0.000014 |
| closing tables                 | 0.000005 |
| logging slow query             | 0.000003 |
| cleaning up                    | 0.000003 |
+——————————–+———-+
18 rows in set (0.00 sec)

mysql> show profile for query 2;
+——————————–+———-+
| Status                         | Duration |
+——————————–+———-+
| starting                       | 0.000033 |
| checking query cache for query | 0.000093 |
| Opening tables                 | 0.000011 |
| System lock                    | 0.000005 |
| Table lock                     | 0.000008 |
| init                           | 0.000023 |
| optimizing                     | 0.000010 |
| statistics                     | 0.000132 |
| preparing                      | 0.000011 |
| executing                      | 0.000004 |
| Sending data                   | 0.000074 |
| end                            | 0.000003 |
| end                            | 0.000003 |
| query end                      | 0.000003 |
| freeing items                  | 0.000009 |
| closing tables                 | 0.000005 |
| logging slow query             | 0.000002 |
| cleaning up                    | 0.000004 |
+——————————–+———-+
18 rows in set (0.00 sec)
差距主要在statistics这一步上,在同一索引上我猜测in()的这种显示的指定范围比between and效率高。

手工update mysql.user和mysql.db修改用户权限的另类总结

2010年01月27日,星期三
  1. 千万记得update执行如果加where条件过滤时也是需要Select_priv权限的。
  2. 修改’root’@’localhost’的权限会影响到’root’@’127.0.0.1’的。
  3. 手工暴力修改这些表时,必须FLUSH PRIVILEGES,否则退出再进入也白搭。
  4. 当我将user表中user=roothost=localhostSelect_priv修改为‘N’时,所有的库都不能select了但是问题出在test库,这里root用户还是能select的。后来发现db表中有条记录的user是为空,dbtest,只要我将db表中的user为空的记录的Select_priv修改为‘N’时,root才不能在test进行select.看来是即时指明登陆名是root,但是在user表和db表中没有找到匹配值后,它被鉴定为匿名用户,就是匹配dbuser为空的记录,所以当时root能在test中查询。也就是说用户权限等于当前用户权限+空用户权限,realzyy兄也建议一般初始化完了就要把空用户删掉,这有可能会是隐患的,谢谢realzyy的提醒!