Help understanding why a join works

I have a small database and this question is in regards to just five of the tables, I am going to list them with just the basics and primary keys.

Movie
movieUPC (PK)
movieTitle

movieGenre
movieUPC (PK)
genreName (PK)

movieNight
dateViewed (PK)
movieUPC

movieViewer
dateViewed (PK)
viewer (PK)

viewer
viewerID (PK)
viewerFname
favGenre

  1. The movie table stores the attributes of the movie
  2. A movie can have more than one genre
  3. movieNight is when a movie was watched
  4. viewer is the people that can watch a movie on movieNight
  5. movieViewer is to store the many viewers for a single movieNight relation

I was trying to find all movies that match a particular viewers favGenre by joining tables. I ran into a problem joining movie and movieNight though:

SELECT * FROM movie LEFT JOIN movieNight ON movie.movieUPC = movieNight.movieUPC\\G

The problem I had was that if a movie wasn’t watched yet I obviously got NULL values for dateViewed, so I couldn’t link back to the viewer table.

While messing around I, from a typo, accidentally found a way that works, but was left even more confused, because I don’t see why it should work:

SELECT movieTitle, genreName AS 'Anna''s Favorite Genre' FROM viewer 
RIGHT JOIN movieViewer 
ON viewer.viewerID = movieViewer.viewerID 
RIGHT JOIN movieNight 
ON movieViewer.dateViewed = movieNight.dateViewed 
LEFT JOIN movie 
ON movieNight.movieUPC = movieNight.movieUPC -- What is this line doing?
LEFT JOIN movieGenre 
ON movie.movieUPC = movieGenre.movieUPC 
WHERE viewerFname = 'Anna' 
AND genreName = favGenre;

That line is always true, so in reality you are doing a cross join between those 2 tables: each row of the movienight table is joined with each row of the movie table.
So what you should get from that query is all movies, regardless of the viewer’s favorite genre.

So is this the correct way to do it then, but changing it to:

SELECT movieTitle, genreName AS 'Anna''s Favorite Genre' FROM viewer
RIGHT JOIN movieViewer
ON viewer.viewerID = movieViewer.viewerID
RIGHT JOIN movieNight
ON movieViewer.dateViewed = movieNight.dateViewed
CROSS JOIN movie
LEFT JOIN movieGenre
ON movie.movieUPC = movieGenre.movieUPC
WHERE viewerFname = 'Anna'
AND genreName = favGenre;

Since you can’t join on movieUPC if certain movies haven’t been watched. Seems like the more movies in the table and the more movieNight’s that took place would cause a lot of rows retrieved. I realize the WHERE statement filters that. Just would like to know if there is a better way or if that is the only real way since you can’t retrieve all movies since you can’t join on the movieUPC.

What is the difference between a Cross Join and Full Joins?

Also, thanks for explaining what that part of my Query was actually doing,

omfg, no :nono:

mixing RIGHT and LEFT joins is a recipe for disaster

i’m sort of getting the impression you don’t really know what RIGHT and LEFT are for (no offence intended, we were all new once) …

what happens if you change them all to INNER joins?

700lbGorilla, this page here provides a nice explaination (with diagrams) of the different joins. MySQL doesn’t support FULL JOINS

nice one, spacephoenix, many people like the venn diagram approach (i personally don’t)

here’s another explanation – best… post… ever…

this post eventually made it pretty much intact into chapter 3 of my book explaining the different types of join

Yes, I am still grasping when to use what joins, but to answer your question using inner joins returns an empty set. I’ll also look at that link SpacePhoenix, thanks.

I read those articles, think I understand the joins a little more. I changed them all to Right and all to Left besides the cross just to see the outcome and it helped see the results and what I should expect.

I changed them to using just Left joins except for the Cross. I need the cross to be able to link all movies to movieNight in case some of the movies matching the genre have not been watched during movieNight yet.

SELECT movieTitle, genreName AS 'Anna''s Favorite Genre' FROM viewer
LEFT JOIN movieViewer
ON viewer.viewerID = movieViewer.viewerID
LEFT JOIN movieNight
ON movieViewer.dateViewed = movieNight.dateViewed
CROSS JOIN movie
LEFT JOIN movieGenre
ON movie.movieUPC = movieGenre.movieUPC
WHERE viewerFname = 'Anna'
AND genreName = favGenre;

Hopefully this is the more correct way to do it…

After more reading I found another way to do it that seems more logical and straight forward than the above one I posted.

Original:

SELECT movieTitle
FROM viewer
LEFT JOIN movieViewer
USING (viewerID)
LEFT JOIN movieNight
USING (dateViewed)
CROSS JOIN movie
LEFT JOIN movieGenre
ON movie.movieUPC = movieGenre.movieUPC
WHERE viewerFname = 'Anna'
AND favGenre = genreName
GROUP BY movieTitle;

In the original I had to join back through all tables and cross join the two to link all movies (watched/not) that resulted in duplicate results that need to be grouped into movieTitle to give me a desired answer. The one below requires no group and seems to be better to me. Does one approach have benefits over the other? I would think not needing the cross join would be a huge benefit as the database grows. But I have heard nested SELECT’s are more intensive, just wondering in this case if it is actually better?

New:

SELECT movieTitle
FROM movieGenre
LEFT JOIN movie
USING (movieUPC)
WHERE genreName = 
	(SELECT favGenre
	FROM viewer
	WHERE viewerFname = 'Anna');

Thank you r937 and SpacePhoenix for your earlier helping on better understanding joins.

–Edit–
Just realized I could have used DISTINCT instead of the GROUP BY