Hello, I am making a movie database and wanted to ask opinions on how to separate the actors from the title tables.
I was thinking of a title table with duration, comments, etc but then another table for actors and an actorId field in the title table to refer to the actors. However this results in a many to many relationship so I need to build a junction table for that. Pretty standard stuff, but was wondering if anyone had any other ideas on how to better do this.
Ahhhhhhh, gotcha. Thanks a lot for the explanation.
How's the weather in Toronto by the way? I lived there for a year, loved the city, hated the winters
GROUP BY produces one row for every distinct combination of values of the GROUP BY columns
so if you write GROUP BY foo,bar, you get one row for every unique combination of values of the foo and bar columns
if you write GROUP BY titleId, you get one row for every titleId
thus if your subquery accesses a table that has multiple rows per titleId, and you GROUP BY titleId, then the subquery will produce only one row per titleId, which can then be safely joined to the other tables in the outer query
Quite so, thanks! Forgetting the GROUP BY was causing all the languages to show up in the first movie edition only. Not sure why that is, but glad it works.
So does the whole edition and current query seem ok to you?
just looking at the syntax, i think so
of course, i'm not really all that familiar with your data
If I may try your patience for a bit, could you tell me why the GROUP BY clause causes the subqueries to work properly. There's always RTFM, but in the case of subqueries I am a bit confused, trying to make sense of all of this
you forgot the GROUP BY in the third subquery
also, you don't have any edition columns in the SELECT clause, so the results of the query will surely be confusing
Back again to the table structure because I am unsure how to handle something.
I have the main table, with a junction table for director and another table for actor which has one row for each actor and role and a foreign key for the main table to link them.
I am doing a keyword search, where the keyword can be either title (dvdpedia table), director (director table via the junction table), or actor (actor table). I don't know what people will be looking for, could be any of these three things.
So this works for matching title and director:
SELECT DISTINCT dvdpedia.id
LEFT OUTER JOIN ( SELECT director2title.titleId,
GROUP_CONCAT(director SEPARATOR ', ') AS director
INNER JOIN director2title
ON director.id = director2title.pid
GROUP BY director2title.titleId ) AS d
ON d.titleId = dvdpedia.id
WHERE MATCH(title) AGAINST(:title IN BOOLEAN MODE)
OR d.director LIKE :director";
:title and :director are placeholder variables
However, I'm unsure how to match also actors. Since the actor table does not work through a junction table, I can't do the same trick. I tried doing a
WHERE dvdpedia.id IN (SELECT titleId FROM actor WHERE MATCH(title) AGAINST(:actor IN BOOLEAN MODE))
Which works, but takes an obscene amount of time to run. So I'm hoping there is a much better way to do this.
As always, thanks in advance.
Hello, I'm coming back to a fulltext issue I'm having:
If I search the actor table for actor and role (2 fields in the table). When the role field is empty, I'm getting no results with a fulltext search.
Sometimes this column WILL be empty. How can I get results for this combination? It seems odd that if the column is empty, there wouldn't be a match no? I mean, I do get a result using a similar query with LIKE instead of fulltext search. I haven't seen anything about this in the manual.
Ok, I'll try it as soon as I get home.
"if you also wanted to collapse the actors into a concatenated value, you would do that in a subquery as well"
This query IS faster, but produces one row for each actor in a movie, see attached screenshot.
let's use a portion of your database to illustrate
one movie can have multiple actors, and one movie can have multiple writers
try to combine these relationships in a single query, and boom, you get cross join effects
one thing you could do is collapse one of these relationships in a subquery, so that the subquery produces a single row per movie, with GROUP_CONCAT to collapse multiple values to one (concatenated) value
on actor2title.titleId = dvdpedia.id
on actors.id = actor2title.pId
on rated.id = dvdpedia.ratedId
JOIN ( SELECT writer2dvd.titleId
, ' '
, writers.lastName) AS writers
ON writers.id = writer2dvd.pId
BY writer2dvd.titleId ) AS w
ON w.titleId = dvdpedia.id
thus the outer query has only one one-to-many relationship, along with the one-to-one relationship with the subquery
if you also wanted to collapse the actors into a concatenated value, you would do that in a subquery as well
It does, but I don't know how to fix it. Any pointers?
it does not fix the problem, it merely papers over it
your query tries to combine multiple one-to-many relationships, with the result that you are getting cross join effects (multiplied results)
using DISTINCT within the GROUP_CONCAT merely collapses the output, you are still executing a horrendously inefficient query underneath
does this explanation make sense to you?
there will be as many "repeats" for a single actor as there are movies that this actor played in
they aren't really repeats, of course, because you would not allow the same actor to play the same role in the same movie more than once
however, this brings up the interesting point that you might want to have the same actor play more than one role in the same movie
Coming To America - Eddie Murphy - Prince Akeem
Coming To America - Eddie Murphy - Clarence
Coming To America - Eddie Murphy - Randy Watson
Coming To America - Eddie Murphy - Saul
in this case you need to change the table's primary key
CREATE TABLE actor2title
( pId INTEGER NOT NULL
, titleId INTEGER NOT NULL
, role VARCHAR(99) NOT NULL
, PRIMARY KEY ( pId , titleId , role )
and of course now you need to specify a role for every actor in every movie, since no portion of a primary key may be null
Yes, use a LEFT OUTER JOIN instead of an INNER JOIN.
Yet another question: The above is working beautifully, except that if the movie has a missing entry (say no actors associated yet) the query will not list it. Is there a way to list movies with NULL entries?
But doesn't that defeat the whole point of normalization of not repeating the same value? Also, it would make searches based on actors a lot slower no?
But the actor will play different roles in different movies, so I would end up with repeated rows for the same actor inside that table???
next page →