4. 构造一个Bad case
因为关联条件中MySQL使用索引统计信息做成本预估,所以数据分布不均匀的时候,就容易做出错误的判断。简单的我们构造下面的案例:
表和索引结构不变,按照下面的方式构造数据:
| for i in `seq 1 10000` ; do mysql -uroot test -e 'insert into department values (600000*rand(),repeat(char(65+rand()*58),rand()*20))'; done for i in `seq 1 10000` ; do mysql -uroot test -e 'insert into employee values (repeat(char(65+rand()*58),rand()*20),600000*rand())'; done for i in `seq 1 1` ; do mysql -uroot test -e 'insert into employee values ("zhou",27760)'; done for i in `seq 1 10` ; do mysql -uroot test -e 'insert into department values (27760,"TBX")'; done for i in `seq 1 1000` ; do mysql -uroot test -e 'insert into department values (27760,repeat(char(65+rand()*58),rand()*20))'; done explain select * from employee as A,department as B where A.LastName = 'zhou' and B.DepartmentID = A.DepartmentID and B.DepartmentName = 'TBX'; +----+-------------+-------+------+-----------------+---------+---------+---------------------+------+-------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+-------------+-------+------+-----------------+---------+---------+---------------------+------+-------------+ | 1 | SIMPLE | A | ref | IND_L_D,IND_DID | IND_L_D | 43 | const | 1 | Using where | | 1 | SIMPLE | B | ref | IND_D,IND_DN | IND_D | 5 | test.A.DepartmentID | 1 | Using where | +----+-------------+-------+------+-----------------+---------+---------+---------------------+------+-------------+ |
可以看到这里,MySQL执行计划对表department使用了索引IND_D,那么A表命中一条记录为(zhou,27760);根据B.DepartmentID=27760将返回1010条记录,然后根据条件DepartmentName = 'TBX'进行过滤。
这里可以看到如果B表选择索引IND_DN,效果要更好,因为DepartmentName = 'TBX'仅仅返回10条记录,再根据条件A.DepartmentID=B.DepartmentID过滤之。










