Randomly select a MySQL record

This is a reminder to myself as I have this problem every now and then. I thought I share this with you.

Ok the problem is to random(ish)ly select a record from a MySQL database in an efficient way without any scripting or programming. There are several more or less simple ways described everywhere on the web, one being

SELECT * FROM table_name ORDER BY RAND() DESC LIMIT 1;

It does the job. Very slowly though, it can take MySQL several seconds or even minutes to complete the query. After fiddling around for some time, I came up with this:

set @randomId=FLOOR(RAND()*(SELECT MAX(pk_field) FROM table));
PREPARE randomStmt FROM "SELECT * FROM table WHERE pk_field > ? LIMIT 1";
EXECUTE randomStmt USING @randomId;

…where `pk_field` is a numeric primary key. Ok the query goes like this:

  1. Get the highest value for pk_field from table_name, multiply with a random number between 0…1, cut off decimals and store in variable randomId.
  2. Prepare a statement where we can insert randomId and retrieve the desired record with.
  3. Take the query, insert randomId and execute.

This runs very fast even with a large dataset. I hear your objections already… :)

“This is stupid. Why didn’t you use `COUNT(pk_field)` to find the end of the number range? Even then this approach would still be stupid BTW.” Because it is slow. I currently have a table with >1.8 million records and `COUNT(pk_field)` takes >1 seconds to complete. Not good. `MAX(pk_field)` is good enough in most cases. YMMV though.

“The query is not truly random and prefers records at the end of a large pk_field range gap!” True. That’s what I meant with “YMMV” :) My solution is not truly random and it is biased, especially when there are large gaps in `pk_field`. It works for me though and is random enough when there are no large pk_field gaps.


Posted

in

by

Tags:

Comments

Leave a Reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.