Very Simple Subquery Returning Strange Results

All I want is a single random value from this MySQL query. Both the outer inner and outer queries reference the same table but I’m assigning them different aliases. Yet, this query is still performing like a correlated subquery, often returning multiple results.


select img.* from images img where img.image_id = (select floor(max(i.image_id) * rand()) + 1 from images i)

The inner query performs as expected if extracted and ran by itself, returning only a single random value.

in that case, why not just run two consecutive queries ?

Figuring out the reasoning for this query’s behavior was mostly for educational purposes. But for practical purposes, I’d like to minimize calls from the application to the DB.

the subquery cannot return more than one result (if it did, you’d get an error message)

therefor, if you’re getting more than one result returned by the outer query, this can only be because there are multiple rows with the same value of image_id

That’s exactly what I thought. Test this out for yourself. You might be surprised by the results…


CREATE TABLE `images` (
  `image_id` int(11) NOT NULL AUTO_INCREMENT,
  `url` varchar(55) DEFAULT NULL,
  PRIMARY KEY (`image_id`)
);
INSERT INTO images (`url`) VALUES ('cat.png'),('dog.png'),('bike.png'),('car.png');

weird

looks to me like it’s calculating RAND() separately for each row of the table

aha, yes, i just checked da manual, and that’s what it says

so…

run two separate queries

No need for two separate queries. There IS a way to do it with just one.


select images.* from images
   join (
      select ceil(rand() * (select max(image_id) from images)) as rand_id
   ) as tmp
on images.image_id >= tmp.rand_id 
limit 1