Alter This Query to Grab First Five Rows by First Letter (all 26 letters)

Here is the query I’ve been using to properly grab the correct title without “the” and all that nonsense.


$getfilms = mysql_query("select films.film_id,films.film_title,films.studio,films.director,films.writer,films.cast,films.release,films.film_keyw,films.image,films.mobile
     , case when substring_index(film_title,' ',1)
                  in ('a','an','the')
            then concat(
                    substring(film_title,instr(film_title,' ')+1)
                   ,', '
                   ,substring_index(film_title,' ',1)
                       )
            else film_title
         end as title2
  from films
 where case when substring_index(film_title,' ',1)
                  in ('a','an','the')
            then concat(
                    substring(film_title,instr(film_title,' ')+1)
                   ,', '
                   ,substring_index(film_title,' ',1)
                       )
            else film_title
         end                like '$letter%' AND films.mobile='y'
order
    by title2",$cn);

This query has done me great. But now I need it to grab the first five rows (films) that begin with each letter in the alphabet as a preview display of our database by the letter. My brain is rattled on how to do this efficiently.

All feedback appreciated.

Thanks!
Ryan

My gut feeling is that you’re better off doing it with the server-side language that you’re using (PHP?).

You could grab the list of films from the db, then you would have a films array with a sub array for each letter or number and then for each letter or number there would be arrays with the film details in. You could write one function that snips “an” “a” and “the” from the film title and it would then sort each film to the relevant letter array.

The next function would grab the details of the first 5 films in each letter array and place them in an array which would then be handed over to whatever script you’re using on the film details

films =>
a
=> American Pie
=> Avatar
=> Director
=> Year Released

    b
        => Back To The Future
        => Battleship

hey, nice CASE expression – i recognize it :wink:

in order for the following query to remain simple, you will need to declare a view that incorporates the CASE expression for title2

CREATE VIEW films_v
AS
SELECT films.*
     , CASE WHEN SUBSTRING_INDEX(film_title,' ',1)
                  IN ('a','an','the')
            THEN CONCAT(
                    SUBSTRING(film_title,INSTR(film_title,' ')+1)
                   ,', '
                   ,SUBSTRING_INDEX(film_title,' ',1)
                       )
            ELSE film_title  END AS title2
  FROM films 

imagine the following query with five of those CASE expressions :confused:

the view keeps it simple so that you can see what’s going on


SELECT t.film_id
     , t.film_title
     , t.studio
     , t.director
     , t.writer
     , t.cast
     , t.release
     , t.film_keyw
     , t.image
     , t.mobile
     , t.[color=blue]title2[/color]
  FROM films_v AS t
 WHERE ( SELECT COUNT(*)
           FROM films_v
          WHERE LEFT([color=blue]title2[/color],1) = LEFT(t.[color=blue]title2[/color],1)
            AND [color=blue]title2[/color] < t.[color=blue]title2[/color] ) < 5

caution: this won’t perform well if there are very many rows, but in that case, reading the entire table into php probably won’t perform well either :wink:

r937 -

Thought you might have recognized this. :slight_smile: When it comes to mysql, you are a god amongst insects.

I’ll give this a try and let you know! The good news is that the out put will cache to a remote system and only update about once daily, so as long as it processes quick enough the resource hog shouldn’t be much of an issue.

Cheers!
Ryan

I’d probably maintain a separate Top5 table.
Initially, I’d be populating it with the top five titles per letter.
Upon modifications on parent table, I’d check if the records being inserted/modified qualify to feature in the Top5.

<hr>

I’d also be considering partitioning the original table by 26 RANGES… and go for the window functions.

<hr>

Another thing I’d do, is consider storing the titles like this: “Amazing Spider Man, The”, “Dark Knight Rises, The”, “Clockwork Orange, A”… or have something like two columns: title|prefix, instead just the title column: Amazing Spider Man|The

Wow, yeah, that took some time to process and gave some results (like films with odd characters starting them). I ended up creating an “app_db” table, and ran this:


$letters = array('a','b','c','d','e','f','g','h','i','j','k','l','m','n','o','p','q','r','s','t','u','v','w','x','y','z');

foreach($letters as $letter) {
$getfilms = mysql_query("select films.film_id,films.film_title,films.mobile 
     , case when substring_index(film_title,' ',1)
                  in ('a','an','the')
            then concat(
                    substring(film_title,instr(film_title,' ')+1)
                   ,', '
                   ,substring_index(film_title,' ',1)
                       )
            else film_title
         end as title2   
  from films
 where case when substring_index(film_title,' ',1)
                  in ('a','an','the')
            then concat(
                    substring(film_title,instr(film_title,' ')+1)
                   ,', '
                   ,substring_index(film_title,' ',1)
                       )
            else film_title
         end                like '$letter%' AND films.mobile='y'
order 
    by title2 ASC LIMIT 5",$ch)
or die(mysql_error());
	$titles = array();
	while($films=mysql_fetch_array($getfilms)) {
	$titles[] = $films['film_title'];
	}
	echo $newtitles = implode(", ", $titles);
	echo $letter = trim($letter);
	echo "<br />";
	$updaterow = mysql_query("UPDATE app_db SET films='$newtitles' WHERE letter='$letter' LIMIT 1",$cn); 
}

This responds instantly, but I am hitting a weird bug. It is not updating rows e, h, j, k, l, o, u, v, w. I checked and can see films are showing under each letter (so not a blank insert), but the table remains blank in those rows. Why is that?

Cheers!
Ryan

doing it that way though you are running your mysql query 26 separate times.

Yes. 26 times. I’m not worried about it, as simple solution to a quick function that will run once a week. But why is it only updating specific rows. The echo confirms the data is there for the other rows.

Confused by that.

Thanks
Ryan

Okay but WHY are you running that query 26 times when the solution given to you in post 3 allows you to get the first five films for each letter of the alphabet as you have asked in a single query?

Sorry for the delay. I was playing with the queries more today.

One reason I did the 26-alphabet loop was the script/queries worked instantaneously, while the single query took about 25 seconds to complete.

Yes, I prefer running most things in a single query and r937 version worked beautifully. Except that it was also pulling rows with special characters at the beginning. I could have added another WHERE condition to clear those out, but decided the loop was easier for the table I was updating. The info is going to be accessed a lot (thousands of requests per minute for peak) I found out, so I didn’t want that query running too often.

So instead did the loop to find the information I needed and updated a special table with the info correlated to starting letter. This script only runs twice per day, and then the script being called repeatedly just requests whats in the updated table without any special conditions.

I might rewrite the script to just write/create a static XML file to even save more on resources twice a day.

But, my final decision on query choice was what required the cleaner (and easier to understand) PHP coding after the fact. Doing it this way was easier on my eyes.

Cheers!
Ryan