I was looking for some information on some high-performance MySQL questions lurking around in the back of my mind and found this very useful slideshow:
On slide 40, there is a fairly complex and nearly unreadable (without going full screen) MySQL query that finds "useless" MySQL indexes by analyzing their cardinality. Since it is not able to be copy-and-pasted, I figure I'll save someone the trouble. It has been slightly modified for average data sets and to fix a case-sensitive bug:
FROM information_schema.STATISTICS AS s INNER JOIN information_schema.TABLES AS t ON (s.TABLE_SCHEMA = t.TABLE_SCHEMA AND s.TABLE_NAME = t.TABLE_NAME)
WHERE t.TABLE_SCHEMA <> 'mysql' AND t.TABLE_ROWS > 100 AND s.CARDINALITY IS NOT NULL AND (s.CARDINALITY / IFNULL(t.TABLE_ROWS, 0.01)) < 1.00
ORDER BY `SEL %`, TABLE_SCHEMA, TABLE_NAME
LIMIT 25;
The rest of the slideshow is pretty good too.
The second to last slide on using 'auto_increment' is a key performance improvement that I've always suspected exists but I have yet to run into a corporate drone, I mean, DBA who agrees with me. I've always held that integer lookups are many, many times faster than doing multiple string-based lookups when joining multiple tables together while the DBAs I've run into won't budge from their silly little "3rd normal form" to inject a "redundant" auto-increment field (aka Surrogate Primary Key). The slideshow is straight from an expert on MySQL performance who worked at/on MySQL AB - so take that DBAs!
Then again, I don't know many DBAs.
On slide 40, there is a fairly complex and nearly unreadable (without going full screen) MySQL query that finds "useless" MySQL indexes by analyzing their cardinality. Since it is not able to be copy-and-pasted, I figure I'll save someone the trouble. It has been slightly modified for average data sets and to fix a case-sensitive bug:
SELECT t.TABLE_SCHEMA, t.TABLE_NAME, s.INDEX_NAME, s.COLUMN_NAME, s.SEQ_IN_INDEX, (SELECT MAX(SEQ_IN_INDEX) FROM information_schema.STATISTICS AS s2 WHERE s.TABLE_SCHEMA = s2.TABLE_SCHEMA AND s.TABLE_NAME = s2.TABLE_NAME AND s.INDEX_NAME = s2.INDEX_NAME) AS `COLS_IN_INDEX`, s.CARDINALITY, t.TABLE_ROWS AS `ROWS`, ROUND(((s.CARDINALITY / IFNULL(t.TABLE_ROWS, 0.01)) * 100), 2) AS `SEL %`
FROM information_schema.STATISTICS AS s INNER JOIN information_schema.TABLES AS t ON (s.TABLE_SCHEMA = t.TABLE_SCHEMA AND s.TABLE_NAME = t.TABLE_NAME)
WHERE t.TABLE_SCHEMA <> 'mysql' AND t.TABLE_ROWS > 100 AND s.CARDINALITY IS NOT NULL AND (s.CARDINALITY / IFNULL(t.TABLE_ROWS, 0.01)) < 1.00
ORDER BY `SEL %`, TABLE_SCHEMA, TABLE_NAME
LIMIT 25;
The rest of the slideshow is pretty good too.
The second to last slide on using 'auto_increment' is a key performance improvement that I've always suspected exists but I have yet to run into a corporate drone, I mean, DBA who agrees with me. I've always held that integer lookups are many, many times faster than doing multiple string-based lookups when joining multiple tables together while the DBAs I've run into won't budge from their silly little "3rd normal form" to inject a "redundant" auto-increment field (aka Surrogate Primary Key). The slideshow is straight from an expert on MySQL performance who worked at/on MySQL AB - so take that DBAs!
Then again, I don't know many DBAs.
Comments
Post a Comment