Tying 3 tables together with junction table

Hello and Happy New Year to everyone.

I posted this in another thread, but it was 6 months old and the system suggested I create a new thread so I’m doing just that.

I have 3 tables that I need to tie in with a junction table.

dvdpedia contains the tv series and movie titles.
actor contains the actor names
role contains the roles that an actor plays in a particular movie or tv series
actor2role is the junction table that will have an entry for each primary key of those 3 tables.

The problem I am facing is when querying the database to get the proper output without repeated values.

The simplest query works but returns repeated columns:

select dvdpedia.title, actor.actor, role.role FROM dvdpedia, actor, actor2role, role
		WHERE actor.id = actor2role.actorId and dvdpedia.id = actor2role.titleId;

What would be the proper query here?

PS
This is the create tables syntax:

CREATE TABLE `actor` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `titleId` int(11) NOT NULL,
  `actor` varchar(255) NOT NULL DEFAULT '',
  PRIMARY KEY (`id`),
) ENGINE=MyISAM DEFAULT CHARSET=utf8

CREATE TABLE `actor2role` (
  `actorId` int(11) NOT NULL DEFAULT '0',
  `titleId` int(11) NOT NULL DEFAULT '0',
  `roleId` int(11) NOT NULL,
  PRIMARY KEY (`actorId`,`titleId`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8

CREATE TABLE `role` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `actorId` int(11) NOT NULL,
  `role` text,
  PRIMARY KEY (`id`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8


Thanks

how many actors in the same dvdpedia can play a single role? i think only one, yes?

e.g. in hamlet, only one actor plays polonius, right?

so i think the dvdpedia is related to the role, but not the actor2role, and not the actor

see what i’m saying?

Since we included tv series, then you have the same actor playing the same role but for different titles (even though it’s a series, each episode is a distinct title).

So, ok, you’re saying that what matters is the role to match to the title and not the actor. This might be true, but people will often search by actor and want to see which movies and shows they’re in, so I still need to relate the actor to the title.

you’re right, it’s a 3-way junction

however, you have some extraneous columns in your design

i would do it like this –

CREATE TABLE actor 
( id    INTEGER NOT NULL PRIMARY KEY AUTO_INCREMENT
, actor VARCHAR(255) NOT NULL 
);
 
CREATE TABLE role 
( id    INTEGER NOT NULL PRIMARY KEY AUTO_INCREMENT
, role  TEXT
);

CREATE TABLE actor2role 
( actorId INTEGER NOT NULL 
, titleId INTEGER NOT NULL 
, roleId  INTEGER NOT NULL
, PRIMARY KEY (actorId,titleId,roleId)
);

Yes, I realized I probably don’t need the actorId in the role table or the titleId in the actor table (I’ve left them for now because it’s a different relationship at the moment, not a 3-way junction). However, my problem is really getting the relationship between the 3 tables in a query :slight_smile:

I’m trying to think of a way that would return the data properly for display and for searching.

How would I do that?

you want to dump everything, all actors for all roles for all titles?

or will it be a search for a specific actor, say, or a specific title?

also, which columns can the query use? i’m kinda lost at the moment about your actual/proposed table designs

  1. I want to be able to get movie title, and actor with role for a search that matches a title or actor.

  2. I also need other matches that come close to that search.

So for example if I search for “Midnight Run” I will get Midnight Run as the first hit, with actor and role info for that title, but also 10 other options like Midnight Sun, Before Midnight, etc.

I had all this working, I don’t want you to think that I want you to do my job for me :smiley: but with the addition of a third table into this junction it has thrown me off because when you do certain operations on the actor table you cannot use FULL TEXT search for example.

To clarify, full text search is enabled on the actor, role and dvdpedia.title columns

  1. “which columns can the query use?” Let’s assume that the columns are the ones you suggested, and of course dvdpedia.id and dvdpedia.title, this would be the most efficient way to go about it I think.

In order to get the ids from the titles I had this code which worked, except…

SELECT dvdpedia.id, dvdpedia.title, dvdpedia.locale, dvdpedia.localeLanguage, releaseDate FROM dvdpedia WHERE dvdpedia.id IN 
		(SELECT actor2role.titleId FROM actor 
				INNER JOIN actor2role ON actor.id = actor2role.actorId
				WHERE MATCH(actor) AGAINST('chiklis' IN BOOLEAN MODE)
				GROUP BY titleId)

I also need the fulltext match as a relevance column (MATCH(actor) AGAINST(‘chiklis’ IN BOOLEAN MODE) AS relevance) and I can’t get it with this query. Can anyone think of a way to tie in these 3 tables so that I can get the match column as relevance?

SELECT dvdpedia.id
     , dvdpedia.title
     , dvdpedia.locale
     , dvdpedia.localeLanguage
     , dvdpedia.releaseDate 
     , MATCH(actor) AGAINST('chiklis' IN BOOLEAN MODE) 
          AS relevance 
  FROM actor
INNER
  JOIN actor2role 
    ON actor2role.actorId = actor.id
INNER
  JOIN dvdpedia 
    ON dvdpedia.id = actor2role.titleId 
 WHERE MATCH(actor) AGAINST('chiklis' IN BOOLEAN MODE)

As always, Rudy you are a genius. I had come up with this:

SELECT dvdpedia.id, dvdpedia.title, dvdpedia.locale, dvdpedia.localeLanguage, dvdpedia.releaseDate, MATCH(actor) AGAINST(:actor IN BOOLEAN MODE) AS relevance
FROM actor, dvdpedia
	INNER JOIN (SELECT actor2role.titleId 
			FROM actor2role
			INNER JOIN actor ON actor2role.actorId = actor.Id 
			WHERE MATCH(actor) AGAINST(:actor2 IN BOOLEAN MODE) 
			GROUP BY actor2role.titleId) AS a
	ON dvdpedia.id = a.titleid																					
WHERE a.titleId = dvdpedia.id AND MATCH(actor) AGAINST(:actor3 IN BOOLEAN MODE) 
GROUP BY dvdpedia.id

Which gave me something similar but harder to read and took longer. The :actor[n] are just variable placeholders for the value.

Please excuse my idiocy, but now I need to get the actors and roles for a particular title. I’d like them to be in a column for each:

i.e. The Shield, Michael Chiklis, Vic Makey

I had this query, but it gives them to me in the same field.

SELECT dvdpedia.id, dvdpedia.title
	, a.actors
	, r.roles
	FROM dvdpedia
	LEFT OUTER JOIN ( SELECT actor2role.titleId
			              , GROUP_CONCAT(actor SEPARATOR ', ') AS actors
			           FROM actor2role 
			         INNER JOIN actor 
	  	                ON actor.id = actor2role.actorId
			         GROUP BY actor2role.titleId ) AS a
			        ON a.titleId = dvdpedia.id
						    						    						    
				LEFT OUTER JOIN (SELECT actor2role.titleId
				, GROUP_CONCAT(role SEPARATOR ', ') as roles
				FROM actor2role
				INNER JOIN role
				ON role.id = actor2role.roleId
				GROUP BY actor2role.titleId) AS r
				ON r.titleId = dvdpedia.id
WHERE dvdpedia.id = 60

If I get rid of the GROUP_CONCAT and remove the GROUP BYs, then it just mixes actors and roles.

All I’m interested in really is the actor with corresponding role for a particular title, I don’t even need to get the title name.

SELECT dvdpedia.id
     , dvdpedia.title
     , GROUP_CONCAT(
          CONCAT(actor.actor,' as ',role.role)
                   ) AS actor_roles
  FROM dvdpedia
INNER
  JOIN actor2role
    ON actor2role.titleId = dvdpedia.id
INNER
  JOIN role
    ON role.id = actor2role.roleId
INNER
  JOIN actor 
    ON actor.id = actor2role.actorId
GROUP
    BY dvdpedia.id

I came up with this:

SELECT actor2role.titleId
	, a.actor
	, r.role
	, r.position
FROM actor2role

INNER JOIN (SELECT actor.id, actor.actor 
			FROM actor
			INNER JOIN actor2role ON actor.id = actor2role.actorId
			GROUP BY actor.id) AS a 
			ON actor2role.actorId = a.id
INNER JOIN (SELECT role.id, role.role, role.position
			FROM role
			INNER JOIN actor2role ON role.id = actor2role.roleId
			GROUP BY role.id) AS r
			ON actor2role.roleId = r.id			
			
WHERE actor2role.titleId = 60

And I think it works, but I’ve been at it so long maybe I’m seeing visions :blush: . Does this look right?

Rudy I hadn’t seen your post when I entered the one above.

I modified it slightly because I need the actor and role in individual columns and yours is 3 milliseconds faster than mine so I’ll use that :slight_smile:

SELECT dvdpedia.id
     , actor.actor
     , role.role
     , role.position
  FROM dvdpedia
INNER JOIN actor2role
    ON actor2role.titleId = dvdpedia.id
INNER JOIN role
    ON role.id = actor2role.roleId
INNER JOIN actor 
    ON actor.id = actor2role.actorId
    
WHERE dvdpedia.id = 60  
ORDER BY position; 

It seems I’m always complicating the queries, I sometimes have trouble with the joins…

Thanks once again

Does anyone know if it’s possible to do a fulltext match against an empty field?

The situation is that I’m matching actor and role fields but often the role field is empty but I still need to match it, that is, an actor with an empty role is valid and I need to get a hit out of a search for that or I end up inserting the same actor with the empty role.