Hello, I am making a movie database and wanted to ask opinions on how to separate the actors from the title tables.
I was thinking of a title table with duration, comments, etc but then another table for actors and an actorId field in the title table to refer to the actors. However this results in a many to many relationship so I need to build a junction table for that. Pretty standard stuff, but was wondering if anyone had any other ideas on how to better do this.
GROUP BY produces one row for every distinct combination of values of the GROUP BY columns
so if you write GROUP BY foo,bar, you get one row for every unique combination of values of the foo and bar columns
if you write GROUP BY titleId, you get one row for every titleId
thus if your subquery accesses a table that has multiple rows per titleId, and you GROUP BY titleId, then the subquery will produce only one row per titleId, which can then be safely joined to the other tables in the outer query
Quite so, thanks! Forgetting the GROUP BY was causing all the languages to show up in the first movie edition only. Not sure why that is, but glad it works.
So does the whole edition and current query seem ok to you?
If I may try your patience for a bit, could you tell me why the GROUP BY clause causes the subqueries to work properly. There’s always RTFM, but in the case of subqueries I am a bit confused, trying to make sense of all of this
Back again to the table structure because I am unsure how to handle something.
I have the main table, with a junction table for director and another table for actor which has one row for each actor and role and a foreign key for the main table to link them.
I am doing a keyword search, where the keyword can be either title (dvdpedia table), director (director table via the junction table), or actor (actor table). I don’t know what people will be looking for, could be any of these three things.
So this works for matching title and director:
SELECT DISTINCT dvdpedia.id
, dvdpedia.title
, locale
, d.director
FROM dvdpedia
LEFT OUTER JOIN ( SELECT director2title.titleId,
GROUP_CONCAT(director SEPARATOR ', ') AS director
FROM director
INNER JOIN director2title
ON director.id = director2title.pid
GROUP BY director2title.titleId ) AS d
ON d.titleId = dvdpedia.id
WHERE MATCH(title) AGAINST(:title IN BOOLEAN MODE)
OR d.director LIKE :director";
:title and :director are placeholder variables
However, I’m unsure how to match also actors. Since the actor table does not work through a junction table, I can’t do the same trick. I tried doing a
WHERE dvdpedia.id IN (SELECT titleId FROM actor WHERE MATCH(title) AGAINST(:actor IN BOOLEAN MODE))
Which works, but takes an obscene amount of time to run. So I’m hoping there is a much better way to do this.
Hello, I’m coming back to a fulltext issue I’m having:
If I search the actor table for actor and role (2 fields in the table). When the role field is empty, I’m getting no results with a fulltext search.
Sometimes this column WILL be empty. How can I get results for this combination? It seems odd that if the column is empty, there wouldn’t be a match no? I mean, I do get a result using a similar query with LIKE instead of fulltext search. I haven’t seen anything about this in the manual.
let’s use a portion of your database to illustrate
one movie can have multiple actors, and one movie can have multiple writers
try to combine these relationships in a single query, and boom, you get cross join effects
one thing you could do is collapse one of these relationships in a subquery, so that the subquery produces a single row per movie, with GROUP_CONCAT to collapse multiple values to one (concatenated) value
SELECT dvdpedia.id
, dvdpedia.title
, rated.rated
, w.writers
, actors.name
, actors.lastName
FROM dvdpedia
INNER
JOIN actor2title
on actor2title.titleId = dvdpedia.id
INNER
JOIN actors
on actors.id = actor2title.pId
INNER
JOIN rated
on rated.id = dvdpedia.ratedId
INNER
JOIN ( SELECT writer2dvd.titleId
, GROUP_CONCAT(CONCAT(writers.name
, ' '
, writers.lastName) AS writers
FROM writer2dvd
INNER
JOIN writers
ON writers.id = writer2dvd.pId
GROUP
BY writer2dvd.titleId ) AS w
ON w.titleId = dvdpedia.id
thus the outer query has only one one-to-many relationship, along with the one-to-one relationship with the subquery
if you also wanted to collapse the actors into a concatenated value, you would do that in a subquery as well
there will be as many “repeats” for a single actor as there are movies that this actor played in
they aren’t really repeats, of course, because you would not allow the same actor to play the same role in the same movie more than once
however, this brings up the interesting point that you might want to have the same actor play more than one role in the same movie
for example…
Coming To America - Eddie Murphy - Prince Akeem
Coming To America - Eddie Murphy - Clarence
Coming To America - Eddie Murphy - Randy Watson
Coming To America - Eddie Murphy - Saul
in this case you need to change the table’s primary key
CREATE TABLE actor2title
( pId INTEGER NOT NULL
, titleId INTEGER NOT NULL
, role VARCHAR(99) NOT NULL , PRIMARY KEY ( pId , titleId , role )
);
and of course now you need to specify a role for every actor in every movie, since no portion of a primary key may be null
Yet another question: The above is working beautifully, except that if the movie has a missing entry (say no actors associated yet) the query will not list it. Is there a way to list movies with NULL entries?
But doesn’t that defeat the whole point of normalization of not repeating the same value? Also, it would make searches based on actors a lot slower no?