linkedin Skip to Main Content
Categories

How to Optimize Query Performance in MySQL Databases

Development

The built-in MySQL query optimizer does an excellent job of optimizing the execution of your queries.

However, poorly written queries can prevent the optimizer from performing well. Even if you do other optimization techniques such as good schema design or indexing, if your queries are wrong, they will still affect your database’s performance. This guide aims to help you improve your MySQL database speed by optimizing your queries.

In this guide, you will:

  • Learn how to identify slow queries in MySQL using several query profiling techniques
  • Learn how you can optimize your queries for faster response times
  • Understand how MySQL optimizes and executes your queries
  • Learn how to control and modify the default query execution plan of your query
  • Learn how the EXPLAIN and EXPLAIN ANALYZE keywords provide information about how MySQL databases execute queries

A quick introduction to query processing

MySQL processes your query in a series of steps. Understanding how these steps work can unlock powerful insights into how you can optimize your queries. These steps are complex internally but can be summarized below:

MySQL client talking to MySQL server. First SQL goes to Parser, then preprocessor, then query optimizer, then query execution plan, then query execution engine. Then it makes an API call to the storage engine (eg InnoDB) to retrieve the data. The storage engine then returns data to the query execution engine and then back to the mysql client.
MySQL query lifecycle by Elvis Duru
  1. The MySQL client sends your query to the MySQL server using the MySQL Client/Server Protocol.
  2. The query is parsed, preprocessed, and finally optimized into a query execution plan by the MySQL query optimizer. The optimizer may ask the Storage engine for statistics about the tables referenced in your query before execution.
  3. The Query Execution Engine executes the plan by making calls to the Storage engine through special handler interfaces.
  4. The MySQL server sends the results to the MySQL client.

In the following sections, we will see how we can get information about the default execution plan and explore ways to optimize our queries to influence the plan.

Identify and analyze slow queries in MySQL

Slow queries can be annoying, but the good news is that we can identify and fix them.

The amount of time it takes MySQL to execute your query is known as its response time. It is the most critical metric used in measuring your query’s speed. In other words, the performance of an individual query or transaction is directly proportional to its response time.

There are several reasons why a query may be slow. These reasons can range from your current hardware configuration, to the permissions you may have set, improper index usage, bad schema design, or even the intention of your queries. This guide focuses on the last point – the query.

Whenever you send a query to the MySQL server, you send a series of instructions that you want it to perform.

Some instructions are simple — e.g., a query that searches using a single parameter — and some instructions can be complex — e.g., queries that involve complex joins and subqueries. In whatever form the MySQL server receives your queries, they will consume time.

Luckily, several built-in tools can allow you to see how your query performed. The process of finding out how MySQL spends time processing and executing your query is known as query profiling. The following sections explore some query profiling tools and explain how we can use them in analyzing your query’s performance. If you want to follow along, you can download and install the sakila sample database we use throughout this article.

The INFORMATION_SCHEMA.PROFILING table

The INFORMATION_SCHEMA.PROFILING table stores profiling information about queries you run in a current interactive session. It’s disabled by default, but you can enable query profiling for your current session by setting the profiling session variable as shown below. Note that the profiling information is lost when a session ends:

mysql> SET SESSION profiling = 1; Query OK, 0 rows affected, 1 warning (0.00 sec)

After that, you can select the database you want to use:

mysql> USE sakila; Database changed
Code language: PHP (php)

Next, you run your query (Make sure you do not use the EXPLAIN statement)

mysql> SELECT * FROM customer;

Run the SHOW PROFILES query

mysql> SHOW PROFILES; +----------+------------+------------------------+ | Query_ID | Duration   | Query                  | +----------+------------+------------------------+ |        1 | 0.02364600 | SELECT DATABASE()      | |        2 | 0.04395425 | show databases         | |        3 | 0.00854575 | show tables            | |        4 | 0.00213000 | SELECT * FROM customer | +----------+------------+------------------------+ 4 rows in set, 1 warning (0.00 sec)
Code language: JavaScript (javascript)

The SHOW PROFILES query fetches all the list of queries in the current session. It has three columns:

  1. The Query_ID – a unique numerical identifier for a query
  2. The Duration is the time taken to execute a query
  3. The Query column shows the query that the MySQL server executed

To see more information about a particular query, we can run the following:

mysql> SELECT * FROM INFORMATION_SCHEMA.PROFILING WHERE QUERY_ID=4; +----------+-----+--------------------------------+----------+----------+------------+-->| QUERY_ID | SEQ | STATE | DURATION | CPU_USER | CPU_SYSTEM | C>+----------+-----+--------------------------------+----------+----------+------------+-->| 4 | 2 | starting | 0.000107 | 0.000046 | 0.000019 | >| 4 | 3 | Executing hook on transaction | 0.000005 | 0.000003 | 0.000001 | >| 4 | 4 | starting | 0.000007 | 0.000005 | 0.000002 | >| 4 | 5 | checking permissions | 0.000007 | 0.000004 | 0.000002 | >| 4 | 6 | Opening tables | 0.000337 | 0.000238 | 0.000100 | >| 4 | 7 | init | 0.000007 | 0.000005 | 0.000001 | >| 4 | 8 | System lock | 0.000010 | 0.000006 | 0.000003 | >| 4 | 9 | optimizing | 0.000004 | 0.000003 | 0.000001 | >| 4 | 10 | statistics | 0.000011 | 0.000008 | 0.000004 | >| 4 | 11 | preparing | 0.000016 | 0.000011 | 0.000004 | >| 4 | 12 | executing | 0.000802 | 0.000573 | 0.000239 | >| 4 | 13 | end | 0.000017 | 0.000005 | 0.000002 | >| 4 | 14 | query end | 0.000003 | 0.000002 | 0.000001 | >| 4 | 15 | waiting for handler commit | 0.000008 | 0.000006 | 0.000002 | >| 4 | 16 | closing tables | 0.000007 | 0.000005 | 0.000002 | >| 4 | 17 | freeing items | 0.000029 | 0.000020 | 0.000009 | >| 4 | 18 | cleaning up | 0.000019 | 0.000013 | 0.000005 | >+----------+-----+--------------------------------+----------+----------+------------+-->(END)

The results show some interesting information about how MySQL processed your query and the time each step took. We discussed some of those steps earlier when explaining the MySQL query lifecycle. Closely observe SEQ (Sequence) 9 – 12.

The slow_query_log

Another way we can identify slow queries is by inspecting the slow query logs. This built-in feature enables you to log queries that exceed a time limit you set using the long_query_time system variable. The default time limit is 10 seconds, i.e., MySQL will log any query that runs longer than 10 seconds. This slow query log, like the INFORMATION_SCHEMA.PROFILING, is not enabled by default. To use it, you must first enable it by setting the slow_query_log global variable to 'ON':

mysql> SET GLOBAL slow_query_log = 'ON'; Query OK, 0 rows affected (0.08 sec)
Code language: PHP (php)

To confirm, you could always view a global variable’s value this way:

mysql> SHOW GLOBAL VARIABLES LIKE 'slow_query_log'; +----------------+-------+ | Variable_name  | Value | +----------------+-------+ | slow_query_log | ON    | +----------------+-------+
Code language: PHP (php)

You can change the time limit in seconds by typing the following command:

mysql> SET GLOBAL long_query_time = 60; Query OK, 0 rows affected (0.08 sec)
Code language: PHP (php)

You can also change the default location for the slow query log file, usually found at var/lib/mysql/hostname-slow.log, to any destination of your choice:

mysql> SET GLOBAL slow_query_log_file = '/somepath/filename.log'; Query OK, 0 rows affected (0.08 sec)
Code language: PHP (php)

Once you’ve run any query that exceeds the time limit you configured, it will be logged in the slow query log by the MySQL server. You can always inspect the file to see those slow queries. 

To demonstrate this, I have set my slow query log file to the path “/tmp/slow_queries.log”.  I’m using the default long_query_time, which is 10 seconds. We can force a slow query that runs for 15 seconds using the SLEEP() function as shown below:

mysql> SELECT SLEEP(15); +-----------+ | SLEEP(15) | +-----------+ |         0 | +-----------+ 1 row in set (15.00 sec)
Code language: JavaScript (javascript)

Let’s inspect the slow query log and see if the MySQL server logged our new query:

❯ sudo cat /tmp/slow_queries.log /usr/sbin/mysqld, Version: 8.0.29-0ubuntu0.20.04.3 ((Ubuntu)). started with: Tcp port: 3306  Unix socket: /var/run/mysqld/mysqld.sock Time                 Id Command    Argument # Time: 2022-08-16T15:47:39.323248Z # [email protected]: root[root] @ localhost []  Id:    17 # Query_time: 15.000317  Lock_time: 0.000000 Rows_sent: 1  Rows_examined: 1 SET timestamp=1660664844; SELECT SLEEP(15);
Code language: PHP (php)

For larger log files, MySQL provides the mysqldumpslow tool to parse and summarize slow query log files:

❯ sudo mysqldumpslow /tmp/slow_queries.log Reading mysql slow query log from /tmp/slow_queries.log Count: 1  Time=15.00s (15s)  Lock=0.00s (0s)  Rows=1.0 (1), root[root]@localhost   SELECT SLEEP(N)
Code language: JavaScript (javascript)

If you’re done inspecting, you can turn off the feature by setting the  slow_query_log global variable to 'OFF':

mysql> SET GLOBAL slow_query_log = 'OFF'; Query OK, 0 rows affected (0.08 sec)
Code language: PHP (php)

The EXPLAIN statement

Another tool MySQL provides for query analysis is the EXPLAIN statement. This statement shows us how the MySQL server will execute a plan. It’s quite easy to use. Anytime you want to see how MySQL will execute your query before running it, just prepend the EXPLAIN statement to it:

mysql> EXPLAIN SELECT * FROM customer\G *************************** 1. row ***************************           id: 1   select_type: SIMPLE         table: customer   partitions: NULL         type: ALL possible_keys: NULL           key: NULL       key_len: NULL           ref: NULL         rows: 599     filtered: 100.00         Extra: NULL 1 row in set, 1 warning (0.00 sec)
Code language: PHP (php)

Ending the SQL statement with “\G” instead of “;” causes MySQL to return the query result in vertical format, which makes it easier to read tables with many long columns.

The above example tells us that the server only needs to scan 599 rows to get the result. A common use case for EXPLAIN is analyzing how MySQL will execute joins. Let’s run a query that joins some tables in our sakila database to demonstrate:

mysql> SELECT first_name, last_name, city, country FROM customer INNER JOIN address USING(address_id) INNER JOIN city USING(city_id) INNER JOIN country USING(country_id); +-------------+--------------+----------------------------+---------------+ | first_name  | last_name    | city                       | country                               +-------------+--------------+----------------------------+---------------+ | VERA        | MCCOY        | Kabul                      | Afghanistan                           | MARIO       | CHEATHAM     | Batna                      | Algeria                               | JUDY        | GRAY         | Bchar                      | Algeria                               | JUNE        | CARROLL      | Skikda                     | Algeria                               | ANTHONY     | SCHWAB       | Tafuna                     | American Samoa                        | CLAUDE      | HERZOG       | Benguela                   | Angola                                | MARTIN      | BALES        | Namibe                     | Angola                                | BOBBY       | BOUDREAU     | South Hill                 | Anguilla                              | WILLIE      | MARKHAM      | Almirante Brown            | Argentina                             | JORDAN      | ARCHULETA    | Avellaneda                 | Argentina                             | JASON       | MORRISSEY    | Baha Blanca                | Argentina                             | KIMBERLY    | LEE          | Crdoba                     | Argentina                             | MICHEAL     | FORMAN       | Escobar                    | Argentina                             | DARRYL      | ASHCRAFT     | Ezeiza                     | Argentina                             | JULIA       | FLORES       | La Plata                   | Argentina                             | FLORENCE    | WOODS        | Merlo                      | Argentina                             | PERRY       | SWAFFORD     | Quilmes                    | Argentina                             | LYDIA       | BURKE        | San Miguel de Tucumn       | Argentina                             | ERIC        | ROBERT       | Santa F                    | Argentina                             | LEONARD     | SCHOFIELD    | Tandil                     | Argentina                             | WILLIE      | HOWELL       | Vicente Lpez               | Argentina                             | STEPHANIE   | MITCHELL     | Yerevan                    | Armenia                               | AUDREY      | RAY          | Graz                       | Austria                               | JILL        | HAWKINS      | Linz                       | Austria                               ---

Our query is simple. We join our customer table with the address table using the address_id key; we then join the city and the country using their respective ids. Now closely observe the order of our query. Let’s see the plan MySQL used in running our query by prepending the EXPLAIN keyword in our query:

mysql> EXPLAIN SELECT first_name, last_name, city, country FROM customer INNER JOIN address USING(address_idINNER JOIN city USING(city_idINNER JOIN country USING(country_id)\G *************************** 1. row ***************************           id: 1   select_type: SIMPLE         table: country   partitions: NULL         type: ALL possible_keys: PRIMARY           key: NULL       key_len: NULL           ref: NULL         rows: 109     filtered: 100.00         Extra: NULL *************************** 2. row ***************************           id: 1   select_type: SIMPLE         table: city   partitions: NULL         type: ref possible_keys: PRIMARY,idx_fk_country_id           key: idx_fk_country_id       key_len: 2           ref: sakila.country.country_id         rows: 5     filtered: 100.00         Extra: NULL *************************** 3. row ***************************           id: 1   select_type: SIMPLE         table: address   partitions: NULL         type: ref possible_keys: PRIMARY,idx_fk_city_id           key: idx_fk_city_id       key_len: 2           ref: sakila.city.city_id         rows: 1     filtered: 100.00         Extra: Using index *************************** 4. row ***************************           id: 1   select_type: SIMPLE         table: customer   partitions: NULL         type: ref possible_keys: idx_fk_address_id           key: idx_fk_address_id       key_len: 2           ref: sakila.address.address_id         rows: 1     filtered: 100.00         Extra: NULL (END)
Code language: CSS (css)

So, this is different from the order we expected. According to this result, we can see that MySQL wants to start with the country table and then scan the city table, the address table, and finally the customer table. But why is it choosing this route?

The answer is the join optimizer. Join optimization is one of the most potent ways MySQL helps optimize queries that involve joins. It does this by arranging the joins in several orders and then estimates the various costs for each order before finally selecting the least expensive option that gives the same result as the initial order of a query.

Back to our last example, by using EXPLAIN, we see that MySQL would only need to scan 109 rows in the country table before scanning the small number of rows in the other tables using the indexed columns – 5, 1, and 1 respectively.

We can force MySQL to run our query in the order we intend by using the STRAIGHT_JOIN clause. Let’s see information about using that:

mysql> EXPLAIN SELECT STRAIGHT_JOIN first_name, last_name, city, country FROM customer INNER JOIN address USING(address_id) INNER JOIN city USING(city_id) *************************** 1. row ***************************           id: 1   select_type: SIMPLE         table: customer   partitions: NULL         type: ALL possible_keys: idx_fk_address_id           key: NULL       key_len: NULL           ref: NULL         rows: 599     filtered: 100.00         Extra: NULL *************************** 2. row ***************************           id: 1   select_type: SIMPLE         table: address   partitions: NULL         type: eq_ref possible_keys: PRIMARY,idx_fk_city_id           key: PRIMARY       key_len: 2           ref: sakila.customer.address_id         rows: 1     filtered: 100.00         Extra: NULL *************************** 3. row ***************************           id: 1   select_type: SIMPLE         table: city   partitions: NULL         type: eq_ref possible_keys: PRIMARY,idx_fk_country_id           key: PRIMARY       key_len: 2           ref: sakila.address.city_id         rows: 1     filtered: 100.00         Extra: NULL *************************** 4. row ***************************           id: 1   select_type: SIMPLE         table: country   partitions: NULL         type: eq_ref possible_keys: PRIMARY           key: PRIMARY       key_len: 2           ref: sakila.city.country_id         rows: 1     filtered: 100.00         Extra: NULL (END)
Code language: PHP (php)

Judging by the number of rows that the MySQL server will scan on the customer table (599) in this example, we can tell that the plan suggested by the join optimizer is more performant. Note that there are still a few times that the optimizer could be wrong; in such cases, you can check your query against STRAIGHT_JOIN or check if there are opportunities to improve the written query.

The EXPLAIN statement can help you identify where you should add indexes to tables so that queries can perform faster by using the indexes to find rows in those tables.

The EXPLAIN ANALYZE statement

The profiling techniques we have discussed won’t be complete without mentioning EXPLAIN ANALYZE.

It works similarly to the EXPLAIN statement. The key difference is that the former is more verbose. You get more statistics about the query execution.

Another difference is that EXPLAIN doesn’t execute queries, but EXPLAIN ANALYZE does. That’s because to fetch the missing statistics, it needs to communicate with the Storage engine. We’ll explain how this communication works in a later section. Meanwhile, let’s demonstrate this profiling technique. To profile a query with the EXPLAIN ANALYZE statement, you just need to prepend the query with the text like this:

mysql> EXPLAIN ANALYZE SELECT first_name, last_name, city, country FROM customer INNER JOIN address USING(address_id) INNER JOIN city USING(city_id) INNER JOIN country USING(country_id); +----------------------------------------------------------------------------------------------------------------------> | EXPLAIN                                                                                                              > +----------------------------------------------------------------------------------------------------------------------> | -> Nested loop inner join  (cost=690.10 rows=599) (actual time=0.073..8.259 rows=599 loops=1)     -> Nested loop inner join  (cost=480.45 rows=599) (actual time=0.064..5.635 rows=599 loops=1)         -> Nested loop inner join  (cost=270.80 rows=599) (actual time=0.057..3.083 rows=599 loops=1)             -> Table scan on customer  (cost=61.15 rows=599) (actual time=0.043..0.478 rows=599 loops=1)             -> Single-row index lookup on address using PRIMARY (address_id=customer.address_id)  (cost=0.25 rows=1) (actual time=0.002..0.002 rows=1 loops=599)         -> Single-row index lookup on city using PRIMARY (city_id=address.city_id)  (cost=0.25 rows=1) (actual time=0.002..0.002 rows=1 loops=599)     -> Single-row index lookup on country using PRIMARY (country_id=city.country_id)  (cost=0.25 rows=1) (actual time=0.002..0.002 rows=1 loops=599) | +----------------------------------------------------------------------------------------------------------------------> (END)

As you can see, the results show how the MySQL server executed the plan. It introduces a few more statistics:

  • The actual time to fetch the first row in a table (in milliseconds)
  • The actual time to fetch all rows in a table (in milliseconds)
  • The estimated costs of the query
  • The actual number of rows read
  • The actual number of loops made

The estimated cost of a query’s execution plan is the cumulative sum of the costs of all operations within the plan, such as I/O or CPU operations.

Inspecting the last_query_cost session variable

Another quick tool to find out the estimated cost of a query is by inspecting the last_query_cost session variable:

mysql> SHOW STATUS LIKE 'last_query_cost'; +-----------------+------------+ | Variable_name   | Value      | +-----------------+------------+ | Last_query_cost | 520.262389 | +-----------------+------------+ 1 row in set (0.00 sec)
Code language: JavaScript (javascript)

The results show the cost of the last compiled query’s plan computed by the query optimizer. It is a useful way to compare two or more versions of your query. In the following example, we run two similar queries and compare the cost for each one. The results show that the execution plan of our first query would cost less:

mysql> SELECT first_name, last_name, city, country FROM customer INNER JOIN address USING(address_id) INNER JOIN city USING(city_id) INNER JOIN country USING(country_id); +-------------+--------------+----------------------------+---------------+ | first_name  | last_name    | city                       | country                               +-------------+--------------+----------------------------+---------------+ | VERA        | MCCOY        | Kabul                      | Afghanistan                           | MARIO       | CHEATHAM     | Batna                      | Algeria                               | JUDY        | GRAY         | Bchar                      | Algeria                               | JUNE        | CARROLL      | Skikda                     | Algeria                               | ANTHONY     | SCHWAB       | Tafuna                     | American Samoa                        --skipping-- 599 rows in set (0.06 sec) mysql> SHOW STATUS LIKE 'last_query_cost'; +-----------------+------------+ | Variable_name   | Value      | +-----------------+------------+ | Last_query_cost | 642.952482 | +-----------------+------------+ 1 row in set (0.00 sec) mysql> SELECT STRAIGHT_JOIN first_name, last_name, city, country FROM customer INNER JOIN address USING(address_id) INNER JOIN city USING(city_id) INNER JOIN country USING(country_id); +-------------+--------------+----------------------------+---------------+ | first_name  | last_name    | city                       | country                               +-------------+--------------+----------------------------+---------------+ | MARY        | SMITH        | Sasebo                     | Japan                                 | PATRICIA    | JOHNSON      | San Bernardino             | United States                         | LINDA       | WILLIAMS     | Athenai                    | Greece                                | BARBARA     | JONES        | Myingyan                   | Myanmar                               | ELIZABETH   | BROWN        | Nantou                     | Taiwan                                --skipping-- 599 rows in set (0.01 sec) mysql> SHOW STATUS LIKE 'last_query_cost'; +-----------------+------------+ | Variable_name   | Value      | +-----------------+------------+ | Last_query_cost | 690.099000 | +-----------------+------------+ 1 row in set (0.00 sec)
Code language: JavaScript (javascript)

Optimize slow queries

Now that we’ve seen the several ways we can identify and analyze slow queries, the next thing would be to fix the slow-performing queries. Instinctively, the first thing we would do is to check the query itself. The following techniques demonstrate how we can optimize slow queries.

Only retrieve the data you need

While the server scans all the rows your query asks for, it locks any resources available in the session, and the client cannot interrupt the process without you forcefully quitting the session. Also, the client cannot selectively access data until the server has scanned all the rows.

The query you’re sending may be asking for too much data, which causes the MySQL server to waste time scanning many rows of data. But we can fix it by doing the following:

Fetch the number of rows you need only (pagination)

Imagine you’re building a movie database website like IMDB that only needs to fetch about 10 rows of movie information at a time (per page). Usually, you will not fetch all the data like this:

mysql> SELECT * FROM film; +---------+-----------------------------+---------------------------------> | film_id | title                       | description                                                                  +---------+-----------------------------+---------------------------------> |       1 | ACADEMY DINOSAUR            | A Epic Drama of a Feminist And a> |       2 | ACE GOLDFINGER              | A Astounding Epistle of a Databa> |       3 | ADAPTATION HOLES            | A Astounding Reflection of a Lif> |       4 | AFFAIR PREJUDICE            | A Fanciful Documentary of a Fris> |       5 | AFRICAN EGG                 | A Fast-Paced Documentary of an A> |       6 | AGENT TRUMAN                | A Intrepid Panorama of a Robot a> |       7 | AIRPLANE SIERRA             | A Touching Saga of a Hunter And > |       8 | AIRPORT POLLOCK             | A Epic Tale of a Moose And a Gir> |       9 | ALABAMA DEVIL               | A Thoughtful Panorama of a Datab>--skipping-- |     991 | WORST BANGER                | A Thrilling Drama of a Madman An> |     992 | WRATH MILE                  | A Intrepid Reflection of a Techn> |     993 | WRONG BEHAVIOR              | A Emotional Saga of a Crocodile > |     994 | WYOMING STORM               | A Awe-Inspiring Panorama of a Ro> |     995 | YENTL IDAHO                 | A Amazing Display of a Robot And> |     996 | YOUNG LANGUAGE              | A Unbelieveable Yarn of a Boat A> |     997 | YOUTH KICK                  | A Touching Drama of a Teacher An> |     998 | ZHIVAGO CORE                | A Fateful Yarn of a Composer And> |     999 | ZOOLANDER FICTION           | A Fateful Reflection of a Waitre> |    1000 | ZORRO ARK                   | A Intrepid Panorama of a Mad Sci> +---------+-----------------------------+---------------------------------> (END)

Doing so would make the MySQL server scan the complete 1000 rows of the table for the little data we need.

 We can fix our query using a common database technique known as pagination. We can use the LIMIT and OFFSET clauses to specify the number of rows we need and from which row it should start fetching data. The aim of this guide is not to teach you pagination in-depth (we’ll talk about this in another article), but our optimized version of our query would look like this:

mysql> SELECT * from film LIMIT 10; +---------+------------------+--------------------+--------------+-> | film_id | title            | description        | release_year | > +---------+------------------+--------------------+--------------+-> |       1 | ACADEMY DINOSAUR | A Epic Drama of... |         2006 | > |       2 | ACE GOLDFINGER   | A Astounding Ep... |         2006 | > |       3 | ADAPTATION HOLES | A Astounding Re... |         2006 | > |       4 | AFFAIR PREJUDICE | A Fanciful Docu... |         2006 | > |       5 | AFRICAN EGG      | A Fast-Paced Do... |         2006 | > |       6 | AGENT TRUMAN     | A Intrepid Pano... |         2006 | > |       7 | AIRPLANE SIERRA  | A Touching Saga... |         2006 | > |       8 | AIRPORT POLLOCK  | A Epic Tale of ... |         2006 | > |       9 | ALABAMA DEVIL    | A Thoughtful Pa... |         2006 | > |      10 | ALADDIN CALENDAR | A Action-Packed... |         2006 | > +---------+------------------+--------------------+--------------+-> (END) 10 rows in set (0.01 sec)
Code language: JavaScript (javascript)
mysql> SELECT * from film LIMIT 10 OFFSET 10; +---------+---------------------+--------------------+-------------> | film_id | title               | description        | release_year> +---------+---------------------+--------------------+-------------> |      11 | ALAMO VIDEOTAPE     | A Boring Epistl... |         2006> |      12 | ALASKA PHANTOM      | A Fanciful Saga... |         2006> |      13 | ALI FOREVER         | A Action-Packed... |         2006> |      14 | ALICE FANTASIA      | A Emotional Dra... |         2006> |      15 | ALIEN CENTER        | A Brilliant Dra... |         2006> |      16 | ALLEY EVOLUTION     | A Fast-Paced Dr... |         2006> |      17 | ALONE TRIP          | A Fast-Paced Ch... |         2006> |      18 | ALTER VICTORY       | A Thoughtful Dr... |         2006> |      19 | AMADEUS HOLY        | A Emotional Dis... |         2006> |      20 | AMELIE HELLFIGHTERS | A Boring Drama ... |         2006> +---------+---------------------+--------------------+------------->
Code language: JavaScript (javascript)

Fetch only the columns you need

Our queries in the last example are still not fully optimized. If you observe them, you will notice we fetched our data in all the table columns using SELECT * (select all). A better way to have accessed our data would have been by naming the columns we need in our query. Let’s see the columns our film table has:

mysql> SELECT COLUMN_NAME FROM INFORMATION_SCHEMA.COLUMNS WHERE TABLE_NAME = 'film'; +----------------------+ | COLUMN_NAME          | +----------------------+ | description          | | film_id              | | language_id          | | last_update          | | length               | | original_language_id | | rating               | | release_year         | | rental_duration      | | rental_rate          | | replacement_cost     | | special_features     | | title                | +----------------------+ 13 rows in set (0.00 sec)
Code language: JavaScript (javascript)

For our movie data application, we might only need the films’ titles, descriptions, ratings, and release years. Here’s how we would ask for such data:

mysql> SELECT title, description, rating, release_year FROM film LIMIT 10 OFFSET 10; +---------------------+--------------------+--------+--------------+ | title               | description        | rating | release_year | +---------------------+--------------------+--------+--------------+ | ALAMO VIDEOTAPE     | A Boring Epistl... | G      |         2006 | | ALASKA PHANTOM      | A Fanciful Saga... | PG     |         2006 | | ALI FOREVER         | A Action-Packed... | PG     |         2006 | | ALICE FANTASIA      | A Emotional Dra... | NC-17  |         2006 | | ALIEN CENTER        | A Brilliant Dra... | NC-17  |         2006 | | ALLEY EVOLUTION     | A Fast-Paced Dr... | NC-17  |         2006 | | ALONE TRIP          | A Fast-Paced Ch... | R      |         2006 | | ALTER VICTORY       | A Thoughtful Dr... | PG-13  |         2006 | | AMADEUS HOLY        | A Emotional Dis... | PG     |         2006 | | AMELIE HELLFIGHTERS | A Boring Drama ... | R      |         2006 | +---------------------+--------------------+--------+--------------+ 10 rows in set (0.00 sec)
Code language: JavaScript (javascript)

Fetch the columns you need when joining tables

Just like the previous example. You can make your joins faster by only asking for the columns you need in the result set. Let’s say you want to fetch only the first and last names of all the customers in our earlier join example. We would write our query in this manner:

SELECT first_name, last_name FROM customer INNER JOIN address USING(address_id) INNER JOIN city USING(city_id) INNER JOIN country USING(country_id);

Doing so will prevent the MySQL server from scanning all the columns of the merged tables.

Use caching strategies to prevent fetching the same data repeatedly

If your application uses the same piece of data frequently, instead of running the query every time the app needs the data, you can cache the data the first time the query runs and simply reuse the data subsequently.

The only exception is if the data changes frequently or when you need your queried information to be up-to-date every time. There used to be an in-built Cache layer in MySQL called “The Query Cache,” but it was deprecated for scalability reasons. Luckily, some tools can help cache data, such as Memcached and Redis

Avoid LIKE expressions with leading wildcards

Wildcards are used in conjunction with LIKE clauses. They are characters used to help search for data that match complex criteria.

The position of wildcard characters in the LIKE clause can cause unexpected performance behavior. Using leading wildcards has been known to be the culprit most times, and this is because MySQL is not able to utilize indexes efficiently when you have a leading wildcard:

SELECT * FROM customer WHERE first_name LIKE '%AB%'; // Leading wildcard causes poor performance
Code language: JavaScript (javascript)

UNION vs. UNION ALL

The MySQL query optimizer doesn’t apply as many optimizations to UNION queries as other queries, but you can specify clauses such as WHERE, LIMIT, ORDER BY, etc., to help the optimizer. But an even better option would be to use the UNION ALL clause if you need to eliminate duplicate rows in your query result.

That’s because omitting ALL adds the DISTINCT option to the temporary table, which results in the MySQL server making expensive scans on all the rows to determine uniqueness. The key takeaway is to avoid using DISTINCT and UNION unless necessary.

Optimize JOIN queries

You should use INNER JOIN instead of OUTER JOIN where possible. You can also optimize your JOIN queries by ensuring that indexes are present on the columns specified in the ON and USING clauses.

GROUP BY and ORDER BY

 GROUP BY and ORDER BY expressions should always refer to columns from a single table so that the server can use indexes for such operations.

Optimize COUNT() queries

The COUNT() function is a special aggregate function used in two ways. First, when you specify a column name or expression in the COUNT() function, it counts the number of times that expression has value (i.e. a non-NULL expression).

The second way to use COUNT() is for counting the number of rows in a query result using the COUNT(*) expression. It ignores the columns in the table and counts rows instead; therefore, it’s best to use this version when you only need to count the number of rows in the query result.

How to help the MySQL optimizer

There are a few cases where the optimizer may not choose the best execution path. The reasons could be that MySQL does not have sufficient information about the current data and must make “educated” guesses about the data.

In such cases, there are some methods available to help MySQL:

  • Using the EXPLAIN statement to view information about how MySQL processed your query. I showed you how to do that earlier.
  • Using FORCE INDEX, USE INDEX, and IGNORE INDEX (Index hints) for the scanned table to tell MySQL that the table scans are expensive compared to using the given index. Index hints give you  total control over how the optimizer chooses indexes for a query execution plan. It’s best to use this as a last resort after trying other optimization techniques discussed. The syntax for using index hints looks like this:
SELECT * FROM table1, table2 FORCE INDEX (index_of_column) WHERE table1.col_name=table2.col_name;
  • Applying Optimizer Hints at the different scope levels: global, query block, table-level, or index-level to control the optimizer’s strategies based on certain criteria. Optimizer hints are usually applied per statement and can be used together with Index hints. They look like comments but are recognized by the query parser. The syntax for adding optimizer hints looks like this:
SELECT /*+ NO_RANGE_OPTIMIZATION(t3 PRIMARY, f2_idx) */ f1   FROM t3 WHERE f1 > 30 AND f1 < 33; SELECT /*+ BKA(t1) NO_BKA(t2) */ * FROM t1 INNER JOIN t2 WHERE ...; SELECT /*+ NO_ICP(t1, t2) */ * FROM t1 INNER JOIN t2 WHERE ...; SELECT /*+ SEMIJOIN(FIRSTMATCH, LOOSESCAN) */ * FROM t1 ...; EXPLAIN SELECT /*+ NO_ICP(t1) */ * FROM t1 WHERE ...;
Code language: JavaScript (javascript)
  • Using the ANALYZE TABLE table_name to update the key distributions of the scanned table.
  • Using Global and table-level STRAIGHT_JOIN.
  • Tuning global or thread-specific system variables. For example, you can start the mysqld with the --max-seeks-for-key=1000 option or use SET max_seeks_for_key=1000 to tell the optimizer to assume that no key scan causes more than 1,000 key seeks.
  • Re-writing your query: Sometimes, your query might need refactoring to improve its response time. We discussed some scenarios we could do in the previous sections.
  • Adding Indexes to your tables: For tables that do not need frequent batch inserts or updates, you might want to use an index. Indexes also speed up JOIN operations and help MySQL retrieve data faster, thereby improving your query response time. For smaller tables, adding indexes may not be optimal. Also, you should not index everything, as it increases disk space usage and makes updating or inserting records slow.
  • Re-designing your schema: To access tables with the best possible speed, you must consider schema optimization. A poorly designed database schema forces MySQL to take a longer and utilize more system resources when processing your queries.

The next step: schema optimization and indexing

Query optimization is a crucial subject in relational databases such as MySQL. Understanding how MySQL executes queries allows you to determine where it spends time on your queries. You also understand how the response time can be affected by the quality of the queries you send to the MySQL server. 

Query optimization is just one piece of the puzzle in improving MySQL performance. Schema optimization and indexing are other options to consider. In fact, schemas and indexing can affect query speed and vice versa. Our next article will explain how schema optimization and indexing can affect your MySQL database performance and query execution speed.

I’m Elvis Duru. I create helpful content for web developers with a focus on React, Node.js, SQL/NoSQL Databases, and more! Let’s connect on Twitter.

Further reading