Skip to main content

Improving query performance - Subqueries (*aka Derived Queries)

I recently asked this question on StackOverflow about how to join to a derived table using an index.  I found out that even if your derived table is generated with tables that use all of the proper indexes, and you try to join that derived table to a base table in your outer query, no indexes are used.  Ever...

So, let's start with a contrived example (as we often do)...


SELECT * FROM t1
  JOIN (SELECT * FROM t2) AS derived_t2 ON t1.f1=derived_t2.f1;


OK... so let me get this straight... even if t1.f1 and t2.f1 are indexed, the join will still be a full table scan?  You betcha!

Let me say that again... whenever you write a subquery and try to join the derived table to your outer query tables, no indexes are ever used.  It's a full table scan.

So, I think to myself... "Wow... so that's really stupid, actually... because I have tables that only have a few thousand rows in them, and the query performance is shit.  Like... over 2 seconds... terrible and unacceptable."  What's the point of subqueries in MySQL if they throw foul-smelling excrement all over your query performance?  Eww... gross.

After some research, I found out that the query optimizer in MySQL 5.6.3+, which is a preview release at the time of this writing, is actually smart enough to create an index on the derived table if it can improve query performance.  YAY!  Unfortunately, everyone is running 5.1 right now; 5.6 is many months away.  *sigh* Bummer, dudes.

But, fortunately, I found a workaround, as discussed by Venu Anuganti (kudos, dude!) in one of his blog posts.  His blog post talks about creating a temporary table containing the rows of the derived table, adding an index to it, and then running your main query, joining to the temporary table, instead of writing the derived query.  This approach is a workaround, but it's almost an insult to the MySQL query optimizer... you have to hold MySQL's hand to tell it how to not be stupid.

For those who don't like this solution... consider this...

Things I tried that don't work:

  • Moving the derived table into a view - no noticeable performance gains here... there are no indexes on a view either. Duh!
  • Using "USE/FORCE INDEX" syntax on the derived table. MySQL doesn't like this - invalid syntax.
  • Curse at MySQL (although this did help... it didn't solve the problem)
  • Cry/weep - what are you, some kind of cry-baby?  Geez!
Things that might work:
  • Joining directly to t2 and using a GROUP BY in the query, eliminating the subquery altogether.  I like this approach, too, but it involves re-writing your query and GROUPing BY every single column in t1.  Lame.  I wonder what that does to query performance?
  • Strangle a MySQL RDBMS Software Engineer
And, finally, I'd like to say that neither the workaround above nor the query optimizer improvements in MySQL 5.6.3+ solve the entire problem.  Sometimes you might not need to generate a key on the derived table because the field in the derived table is already indexed in the first place.  In this case, MySQL should not regenerate the key... which takes O(n * log(n) ) time, by the way... it should just re-use the index that already exists to join the derived table to the outer query.  In our example above, if t2.f1 is already indexed, there is no need for the optimizer to create an index on the derived table because the field used to join to the outer query is already indexed.

Comments

Popular posts from this blog

Developing a lightweight WebSocket library

Late in 2016, I began development on a lightweight, isomorphic WebSocket library for Node.js called ws-wrapper .  Today, this library is stable and has been successfully used in many production apps. Why?  What about socket.io ?  In my opinion, socket.io and its dependencies are way too heavy .  Now that the year is 2018, this couldn't be more true.  Modern browsers have native WebSocket support meaning that all of the transports built into the socket.io project are just dead weight.  On the other hand, ws-wrapper and its dependencies weigh about 3 KB when minified and gzipped.  Similarly, ws-wrapper consists of about 500 lines of code; whereas, socket.io consists of thousands of lines of code.  As Dijkstra once famously said: "Simplicity is prerequisite for reliability." ws-wrapper also provides a few more features out of the box.  The API exposes a two-way, Promise-based request/response interface.  That is, clients can request data from servers just as easily as se

Computer Clocks Cause More Issues

Two nights ago, a leap second was added to system clocks running Linux, causing much-undesired havoc. On July 1st at 12:00 AM UTC, both of my Amazon EC2 instances fired an alarm indicating high CPU usage. I investigated to find that it was MySQL that was eating all of the CPU. I logged in and ran SHOW PROCESSLIST to find that no queries were running (these servers don't get hit much after business hours). I stopped MySQL, CPU utilization dropped back down to 1-3% (as normal). I restarted MySQL, and it started eating a lot of CPU again. Then, I restarted the server (shutdown -r now), and the problem went away. Both servers had the exact same problem (running Ubuntu 12.04 LTS). In my particular case, MySQL began eating CPU, even after being restarted.  It was a livelock. The only relevant item I saw in the syslog was: Jun 30 23:59:59 hostname kernel: [14152976.187987] Clock: inserting leap second 23:59:60 UTC Oh yeah... leap seconds.  Those are super important.

JavaScript Sticky Footer and Scroll Effect

This post talks about two different HTML/JavaScript effects: How to keep a page footer stuck at the bottom of the browser window. How to create a scrolling <div> without using a scroll bar OK. So... you have a website. You want a header stuck at the top of your page and the footer stuck at the bottom of your page. The stuff in the middle, you want to be able to scrollable. But, you don't want those ugly scrollbars to the right of your scrollable text. Maybe, instead, you'll have up arrows and down arrows above and below your <div>. When you mouseover the arrows, the text in the <div> will move up or down and create a scrolling effect. Suppose your page looks like this... <html> <head> <title>Test</title> </head> <body> <div style="position: relative; width: 700px; margin-left: auto; margin-right: auto;"> <div id="header">Header</div> <div id="scrollUp&q