Some of our clients have asked us recently for an overview of our MySQL health check service. I wanted to take some time and summarize the MySQL health check service and some of the steps we take.
Bitlancer health checks are a monthly review of one or more MySQL servers in a customer’s environment. We have a large checklist that covers many critical installation and configuration issues, along with other less critical issues that can potentially cause problems down the road. We look for low-hanging fruit and perform a detailed investigation of potential optimizations, needed performance tuning, and long-term planning strategies. We generate detailed reports of immediately actionable items and impending future problems. For recurring MySQL health checks, reports are delivered monthly with comparisons from previous months to outline growth and possible upcoming bottlenecks. Our health checks can prevent costly, unexpected server maintenances and the need for disaster recovery by finding problems ahead of time. With verification of proper backups, we can help prevent downtime, data loss, and respective recovery costs. If you’re looking to supplement your in-house MySQL talent, or you lack in-house MySQL expertise, our MySQL health check would likely beneficial to you.
In time, we’ll post a full and detailed description of our health check process. In fact, I’m looking forward to breaking each of these items down into further blog posts to help explain why we check them, and what to look for. For now, here are a sampling of the items we evaluate during a MySQL health check:
Identify and Document Operating System Information
- What version of MySQL?
- Is MySQL custom compiled or a standard installation (RPM, FreeBSD Port, etc.)?
- What storage engines are in use? (InnoDB, MyISAM, etc.)
- What is the maximum size of files on disk?
- What is the underlying hardware configuration?
- What other software is running on the box besides MySQL?
- What is the underlying disk configuration? Is it redundant? Is it optimized for size or performance?
- Is the server running on dedicated or shared hardware? If shared, is it in a public or private cloud? Are there potential resource issues from competing tenants?
- How much memory is available to MySQL?
- How are the log files configured? What size/shape are the data files?
Identify and Document Current Resource Utilization
- How much memory is in use by MySQL?
- What are the I/O and data access patterns?
- What’s average CPU utilization like? Are there bursts or is it consistent?
- What kind of init script is in use? Is it overriding any default options?
- What does the my.cnf look like? What default options are being overwritten?
- What is the maximum memory utilization of MySQL?
- Is there obvious necessary tuning that was missed during a MySQL upgrade (for instance, 5.1 -> 5.5)?
- If MyISAM, what is the
key_buffer_sizein comparison to all indexes on disk?
- If InnoDB, what is the
innodb_buffer_pool_sizein comparison to all data and indexes on disk?
- Is the query cache enabled? If so, is it being used efficiently?
- What is
max_connectionsset to? Is
max_user_connectionsset, and if not, should it be?
- We perform many other tuning checks, specifically with InnoDB. We’ll cover those in a later blog post.
Database Maintenance Procedures and Log Files
- Review of monitoring and log rotation
- Review monitoring of storage space and growth rates
- Are there any procedures in place to auto-optimize?
- How are log files distributed across underlying disk(s)? What about the data and indexes?
- What log files are enabled? Is the slow query log enabled? If so, how busy is it? What about the mysql general log?
- How big are the MySQL grant tables?
- What password hash is being used for passwords?
- Is a root password set?
- Are there shared user accounts? What about for applications?
- Are triggers or stored procedures in use? Is anything defined by an individual user? If so, what if that user leaves the company and that account is removed?
- Is replication enabled?
- Do all replicating nodes have different server IDs?
- Are report hosts in use?
skip_slave_errorsin use? If so, why?
- Are any databases or tables skipped in replication?
- Is there any replication configured in
- Is master to master setup? If so, is it configured properly?
These are just some of the items we check during our MySQL health check service.