Join with one to many

Hello,

Trying to create a query that’s a bit over my head. Working in WordPress that stores meta data for each post in a separate table like…

Table 1 - wp_posts

  • Sample Columns: post_id, title, content

Table 2 - wp_postmeta

  • Columns: post_id, meta_key, meta_value.

When you have more than one meta data entry, a straight query on the wp_postmeta by post_id would give you…

Row 1: post_id=x, meta_key=custom_field_1, meta_value=custom_value_1
Row 2: post_id=x, meta_key=custom_field_2, meta_value=custom_value_2

What I’m trying to get, via one query, is a row returned for each post like…

Row 1: post_id=x, custom_field_1=custom_value_1, custom_field_2=custom_value_2.

Can someone help point me in the right direction? I’d sure appreciate it!

Philip

what you’re asking for can be done in sql, but it’s clunky, and it depends on the number of meta fields, as you have to hardcode something for each one, so it’s different if there are two columns versus three columns, etc.

why can’t you just consume multiple meta rows?

Thanks for your help.

Maybe I can do multiple rows, but I’m not sure how. This is for an instant search feature on a table with 10k’s of rows, so it gets executed many times per visitor and obviously needs to be highly optimized, ie I can’t be gathering all the wp_postmeta info for each record in a loop.

Do you mean just let it return multiple rows with the same wp_post info and the meta_key and meta_value would be different for each row? I guess I could just loop through all the results and pack a single array for each post with all the wp_postmeta info. Is something like that what you mean?

Also, I don’t mind clunky as long as it’s efficient, and the number of fields from wp_postmeta will always be the same. I’ve come close with…

SELECT wp_posts.ID, wp_posts.post_date, wp_posts.post_title, wp_posts.post_content,
MAX(CASE WHEN meta_key = ‘ad_city’ THEN meta_value END ) AS ad_city,
MAX(CASE WHEN meta_key = ‘ad_state’ THEN meta_value END ) AS ad_state,
MAX(CASE WHEN meta_key = ‘ad_trade_for’ THEN meta_value END ) AS ad_trade_for,
MAX(CASE WHEN meta_key = ‘ad_trade_category’ THEN meta_value END ) AS ad_trade_category,
MAX(CASE WHEN meta_key = ‘ad_miles_range’ THEN meta_value END ) AS ad_miles_range,
MAX(CASE WHEN meta_key = ‘ad_zip_code’ THEN meta_value END ) AS ad_zip_code,
MAX(CASE WHEN meta_key = ‘ad_price’ THEN meta_value END ) AS ad_price
FROM wp_posts
LEFT JOIN wp_postmeta ON ( wp_posts.ID = wp_postmeta.post_id )
WHERE wp_posts.post_status = ‘publish’
GROUP BY wp_posts.ID
ORDER BY wp_posts.post_date DESC
LIMIT 30

…which works fine, but throws an unknown column error when I try adding “AND ad_miles_range = ‘50’”. Might I be on the right path with this approach? Would be much easier than trying to create arrays and all that.

Thanks again, I really appreciate it!

this is it, exactly :slight_smile:

what you were doing here –

… MAX(CASE WHEN meta_key = ‘ad_city’ THEN meta_value END ) AS ad_city,

this is what i meant when i said “hardcoding”

you have to do it for each different field, and your query will have to be touched if you ever add another different type of field

(“touching” a query might not be a big deal for a one-man shop, but in a large organization, the production department takes a very, very dim view of changes to queries, requiring you to dance through all kinds of hoops before the change goes live…)

…which works fine, but throws an unknown column error when I try adding “AND ad_miles_range = ‘50’”
you’d have to use a HAVING condition, not a WHERE condition

You so rock. The hardcoded version is much preferred, it’s just me building this so having to change a query isn’t nearly as big of a deal as messing with packing arrays and the extra processing that will take. Getting there on the clunky though.

Just a couple little problems and I think it will be there. Code…

SELECT wp_posts.ID, wp_posts.post_date, wp_posts.post_title, wp_posts.post_content,
MAX(CASE WHEN meta_key = ‘ad_city’ THEN meta_value END ) AS ad_city,
MAX(CASE WHEN meta_key = ‘ad_state’ THEN meta_value END ) AS ad_state,
MAX(CASE WHEN meta_key = ‘ad_trade_for’ THEN meta_value END ) AS ad_trade_for,
MAX(CASE WHEN meta_key = ‘ad_trade_category’ THEN meta_value END ) AS ad_trade_category,
MAX(CASE WHEN meta_key = ‘ad_miles_range’ THEN meta_value END ) AS ad_miles_range,
MAX(CASE WHEN meta_key = ‘ad_zip_code’ THEN meta_value END ) AS ad_zip_code,
MAX(CASE WHEN meta_key = ‘ad_price’ THEN meta_value END ) AS ad_price
FROM wp_posts
LEFT JOIN wp_postmeta ON ( wp_posts.ID = wp_postmeta.post_id )
GROUP BY wp_posts.ID
HAVING MAX(CASE WHEN meta_key = ‘ad_miles_range’ THEN meta_value END) = ‘50’
ORDER BY wp_posts.post_date DESC
LIMIT 30

How do I incorporate another HAVING clause, ie ‘ad_state’ = ‘nj’?

Also, I do need some ‘where’ type stuff, ie ‘where wp_posts.post_status=“publish”’. How do I get this in there too?

You are a life saver! Thanks so much.

incorporate additional HAVING conditions the same way you’d incorporate additional WHERE conditions – with ANDs and ORs

WHERE conditions for filter purposes should be placed right before the GROUP BY clause

Soo close, this works…

SELECT wp_posts.ID, wp_posts.post_date, wp_posts.post_title, wp_posts.post_content,
MAX(CASE WHEN meta_key = ‘ad_city’ THEN meta_value END ) AS ad_city,
MAX(CASE WHEN meta_key = ‘ad_state’ THEN meta_value END ) AS ad_state,
MAX(CASE WHEN meta_key = ‘ad_trade_for’ THEN meta_value END ) AS ad_trade_for,
MAX(CASE WHEN meta_key = ‘ad_trade_category’ THEN meta_value END ) AS ad_trade_category,
MAX(CASE WHEN meta_key = ‘ad_miles_range’ THEN meta_value END ) AS ad_miles_range,
MAX(CASE WHEN meta_key = ‘ad_zip_code’ THEN meta_value END ) AS ad_zip_code,
MAX(CASE WHEN meta_key = ‘ad_price’ THEN meta_value END ) AS ad_price
FROM wp_posts
LEFT JOIN wp_postmeta ON ( wp_posts.ID = wp_postmeta.post_id )
WHERE wp_posts.post_status = ‘publish’
GROUP BY wp_posts.ID
HAVING (MAX(CASE WHEN meta_key = ‘ad_miles_range’ THEN meta_value END) = ‘50’
ORDER BY wp_posts.post_date DESC
LIMIT 30

…and returns everything right, including rows with ad_state = “NJ”. When I add and AND under the HAVING, like…

SELECT wp_posts.ID, wp_posts.post_date, wp_posts.post_title, wp_posts.post_content,
MAX(CASE WHEN meta_key = ‘ad_city’ THEN meta_value END ) AS ad_city,
MAX(CASE WHEN meta_key = ‘ad_state’ THEN meta_value END ) AS ad_state,
MAX(CASE WHEN meta_key = ‘ad_trade_for’ THEN meta_value END ) AS ad_trade_for,
MAX(CASE WHEN meta_key = ‘ad_trade_category’ THEN meta_value END ) AS ad_trade_category,
MAX(CASE WHEN meta_key = ‘ad_miles_range’ THEN meta_value END ) AS ad_miles_range,
MAX(CASE WHEN meta_key = ‘ad_zip_code’ THEN meta_value END ) AS ad_zip_code,
MAX(CASE WHEN meta_key = ‘ad_price’ THEN meta_value END ) AS ad_price
FROM wp_posts
LEFT JOIN wp_postmeta ON ( wp_posts.ID = wp_postmeta.post_id )
WHERE wp_posts.post_status = ‘publish’
GROUP BY wp_posts.ID
HAVING MAX(CASE WHEN meta_key = ‘ad_miles_range’ THEN meta_value END) = ‘50’
AND MAX(CASE WHEN meta_key = ‘ad_state’ THEN meta_value END) = ‘NJ’
ORDER BY wp_posts.post_date DESC
LIMIT 30

…it returns no results. Do I need to do it a little different?

maybe you’re getting no results because there aren’t any for those criteria?

if you think that there are, perhaps you could dump the table structure along with a few rows so that i can do some testing…

There are rows that match, ie 50 miles and NJ. I see NJ in there when I just do 50 miles.

Query for a sample data dump attached that should be enough to sort it out.

i tested with your data, and i got back 40967 as expected

Sorry, should have tested that myself. Not the same result in the real data though, so I must have oversimplified the sample.

Is there anyway to get you the full dumps? They are rather large (90k and 10k rows) so not sure they will attach here.

not sure that the full dumps would help any more than looking at specific examples

why don’t you do some digging, find a couple of posts which aren’t working the way you expect, and pull those INSERT statements out of your dump file

Heya,

Thanks so much for your help. Please send me an Amazon link for a book, cd, dvd, or such :slight_smile:

Alrighty, so I figured out how to dump from a query (for those who don’t know, just run a query in phpmyadmin and then click export at the bottom of the results page) and can recreate the problem now on a sample. Attached (tables are called wp_posts_3 and wp_postmeta_3 now).

So, if I run…

SELECT wp_posts_3.ID, wp_posts_3.post_date, wp_posts_3.post_title, wp_posts_3.post_content,
MAX(CASE WHEN meta_key = ‘ad_city’ THEN meta_value END ) AS ad_city,
MAX(CASE WHEN meta_key = ‘ad_state’ THEN meta_value END ) AS ad_state,
MAX(CASE WHEN meta_key = ‘ad_trade_for’ THEN meta_value END ) AS ad_trade_for,
MAX(CASE WHEN meta_key = ‘ad_trade_category’ THEN meta_value END ) AS ad_trade_category,
MAX(CASE WHEN meta_key = ‘ad_miles_range’ THEN meta_value END ) AS ad_miles_range,
MAX(CASE WHEN meta_key = ‘ad_zip_code’ THEN meta_value END ) AS ad_zip_code,
MAX(CASE WHEN meta_key = ‘ad_price’ THEN meta_value END ) AS ad_price
FROM wp_posts_3
LEFT JOIN wp_postmeta_3 ON ( wp_posts_3.ID = wp_postmeta_3.post_id )
WHERE wp_posts_3.post_status = ‘publish’
GROUP BY wp_posts_3.ID
HAVING MAX(CASE WHEN meta_key = ‘ad_miles_range’ THEN meta_value END) = ‘50’
ORDER BY wp_posts_3.post_date DESC
LIMIT 30

…then I get 2 rows, one that has ‘TN’ as the ad_state. If I run now…

SELECT wp_posts_3.ID, wp_posts_3.post_date, wp_posts_3.post_title, wp_posts_3.post_content,
MAX(CASE WHEN meta_key = ‘ad_city’ THEN meta_value END ) AS ad_city,
MAX(CASE WHEN meta_key = ‘ad_state’ THEN meta_value END ) AS ad_state,
MAX(CASE WHEN meta_key = ‘ad_trade_for’ THEN meta_value END ) AS ad_trade_for,
MAX(CASE WHEN meta_key = ‘ad_trade_category’ THEN meta_value END ) AS ad_trade_category,
MAX(CASE WHEN meta_key = ‘ad_miles_range’ THEN meta_value END ) AS ad_miles_range,
MAX(CASE WHEN meta_key = ‘ad_zip_code’ THEN meta_value END ) AS ad_zip_code,
MAX(CASE WHEN meta_key = ‘ad_price’ THEN meta_value END ) AS ad_price
FROM wp_posts_3
LEFT JOIN wp_postmeta_3 ON ( wp_posts_3.ID = wp_postmeta_3.post_id )
WHERE wp_posts_3.post_status = ‘publish’
GROUP BY wp_posts_3.ID
HAVING MAX(CASE WHEN meta_key = ‘ad_miles_range’ THEN meta_value END) = ‘50’
AND MAX(CASE WHEN meta_key = ‘ad_state’ THEN meta_value END) = ‘TN’
ORDER BY wp_posts_3.post_date DESC
LIMIT 30

…just adding the ad_state HAVING AND, then I get an empty result. If I just run…

SELECT wp_posts_3.ID, wp_posts_3.post_date, wp_posts_3.post_title, wp_posts_3.post_content,
MAX(CASE WHEN meta_key = ‘ad_city’ THEN meta_value END ) AS ad_city,
MAX(CASE WHEN meta_key = ‘ad_state’ THEN meta_value END ) AS ad_state,
MAX(CASE WHEN meta_key = ‘ad_trade_for’ THEN meta_value END ) AS ad_trade_for,
MAX(CASE WHEN meta_key = ‘ad_trade_category’ THEN meta_value END ) AS ad_trade_category,
MAX(CASE WHEN meta_key = ‘ad_miles_range’ THEN meta_value END ) AS ad_miles_range,
MAX(CASE WHEN meta_key = ‘ad_zip_code’ THEN meta_value END ) AS ad_zip_code,
MAX(CASE WHEN meta_key = ‘ad_price’ THEN meta_value END ) AS ad_price
FROM wp_posts_3
LEFT JOIN wp_postmeta_3 ON ( wp_posts_3.ID = wp_postmeta_3.post_id )
WHERE wp_posts_3.post_status = ‘publish’
GROUP BY wp_posts_3.ID
HAVING MAX(CASE WHEN meta_key = ‘ad_state’ THEN meta_value END) = ‘TN’
ORDER BY wp_posts_3.post_date DESC
LIMIT 30

…I still get an empty result. If I run it with 2 parameters other than ad_state it works right again, ie…

SELECT wp_posts_3.ID, wp_posts_3.post_date, wp_posts_3.post_title, wp_posts_3.post_content,
MAX(CASE WHEN meta_key = ‘ad_city’ THEN meta_value END ) AS ad_city,
MAX(CASE WHEN meta_key = ‘ad_state’ THEN meta_value END ) AS ad_state,
MAX(CASE WHEN meta_key = ‘ad_trade_for’ THEN meta_value END ) AS ad_trade_for,
MAX(CASE WHEN meta_key = ‘ad_trade_category’ THEN meta_value END ) AS ad_trade_category,
MAX(CASE WHEN meta_key = ‘ad_miles_range’ THEN meta_value END ) AS ad_miles_range,
MAX(CASE WHEN meta_key = ‘ad_zip_code’ THEN meta_value END ) AS ad_zip_code,
MAX(CASE WHEN meta_key = ‘ad_price’ THEN meta_value END ) AS ad_price
FROM wp_posts_3
LEFT JOIN wp_postmeta_3 ON ( wp_posts_3.ID = wp_postmeta_3.post_id )
WHERE wp_posts_3.post_status = ‘publish’
GROUP BY wp_posts_3.ID
HAVING MAX(CASE WHEN meta_key = ‘ad_miles_range’ THEN meta_value END) = ‘50’
AND MAX(CASE WHEN meta_key = ‘ad_city’ THEN meta_value END) = ‘Brighton’
ORDER BY wp_posts_3.post_date DESC
LIMIT 30

Seems to be something specific to the ad_state field. Any ideas?

Hopefully you can find the issue from this, I’ve spent hours and hours since yesterday and still no luck.

Thanks yet again!

Philip

look at postmeta with meta_id = 95786

that state is not TN, it is blankTN

Ahhhhhhhhhhhhhhhhhhh.

Can’t BELIEVE I didn’t find that. Got to clean up my importer (moving from an old system to a new currently) and try 'er again.

Seriously, send an Amazon link ok?

Off to get that resolved!

Alrighty, that was indeed the problem. Purrrrfect. Guess I was staring at it too long…lol.

One more issue to go with this. I also need to select the first image that goes with each post. I’m trying to leave WP as untouched as possible, but I’m not even sure it’s possible with how it stores the images, which is in the same table as the posts, but a different row. It also stores autosaves and revisions to a post as a new row in the same table.

Is this even possible to do all in one query? The query is already beyond me, so incorporating this has been very unfruitful. Any ideas or pointing me in the right direction would be hugely appreciated.

An new dump is attached with 1 post that has an attachment, autosave, and revision) along with it’s associated wp_postmeta rows.

Current queries is like…

SELECT wp_posts_3.ID, wp_posts_3.post_date, wp_posts_3.post_title, wp_posts_3.post_content,
MAX(CASE WHEN meta_key = ‘ad_city’ THEN meta_value END ) AS ad_city,
MAX(CASE WHEN meta_key = ‘ad_state’ THEN meta_value END ) AS ad_state,
MAX(CASE WHEN meta_key = ‘ad_trade_for’ THEN meta_value END ) AS ad_trade_for,
MAX(CASE WHEN meta_key = ‘ad_trade_category’ THEN meta_value END ) AS ad_trade_category,
MAX(CASE WHEN meta_key = ‘ad_miles_range’ THEN meta_value END ) AS ad_miles_range,
MAX(CASE WHEN meta_key = ‘ad_zip_code’ THEN meta_value END ) AS ad_zip_code,
MAX(CASE WHEN meta_key = ‘ad_price’ THEN meta_value END ) AS ad_price
FROM wp_posts_3
LEFT JOIN wp_postmeta_3 ON ( wp_posts_3.ID = wp_postmeta_3.post_id )
WHERE wp_posts_3.post_status = ‘publish’
GROUP BY wp_posts_3.ID
HAVING MAX(CASE WHEN meta_key = ‘ad_miles_range’ THEN meta_value END) = ‘50’
AND MAX(CASE WHEN meta_key = ‘ad_state’ THEN meta_value END) = ‘NJ’
ORDER BY wp_posts_3.post_date DESC
LIMIT 30

Fingers crossed, thanks yet again for all your help…

Philip

could you explain how the rows are related?

Ah man, just realized you replied days ago. Been banging my head with it still.

WP stores posts, pages, revisions, auto-saves, and attachments all in the wp_posts table. Sample data of two lines, one a post and one an

attachment to that post…

Row 1: Post
ID: 21877
post_status: publish
post_parent: 0
post_type: page
post_mime_type:

Row 2: Attachment (image)
ID: 21895
post_status: inherit
post_parent: 21877
post_type: attachment
post_mime_type: image/jpeg

Should be pretty easy I imagine to marry the first image for each post along with the post data in the single query, for the right person at

least. The query is a bit over my head as-is, much less adding a whole new level to it!

Thanks so very much, I can’t tell you how much I appreciate it.

Philip

PS: Rows like that exist in the latest dump.

aha, i see that it is post_parent that identifies the relationship

try this –

SELECT wp_posts_3.ID
     , wp_posts_3.post_date
     , wp_posts_3.post_title
     , wp_posts_3.post_content
     , [COLOR="Blue"]last_image.guid AS last_image_guid[/COLOR]
  FROM wp_posts_3
INNER
  JOIN ( SELECT post_id
              , MAX(CASE WHEN meta_key = 'ad_city' THEN meta_value END ) AS ad_city
              , MAX(CASE WHEN meta_key = 'ad_state' THEN meta_value END ) AS ad_state
              , MAX(CASE WHEN meta_key = 'ad_trade_for' THEN meta_value END ) AS ad_trade_for
              , MAX(CASE WHEN meta_key = 'ad_trade_category' THEN meta_value END ) AS ad_trade_category
              , MAX(CASE WHEN meta_key = 'ad_miles_range' THEN meta_value END ) AS ad_miles_range
              , MAX(CASE WHEN meta_key = 'ad_zip_code' THEN meta_value END ) AS ad_zip_code
              , MAX(CASE WHEN meta_key = 'ad_price' THEN meta_value END ) AS ad_price
           FROM wp_postmeta_3 
         GROUP 
             BY post_id
         HAVING MAX(CASE WHEN meta_key = 'ad_miles_range' THEN meta_value END) = '50'
            AND MAX(CASE WHEN meta_key = 'ad_state' THEN meta_value END) = 'NJ'
       ) AS q1    
    ON q1.post_id = wp_posts_3.ID -- note no parentheses
[COLOR="blue"]LEFT OUTER
  JOIN ( SELECT post_parent
              , MAX(post_date) AS latest
           FROM wp_posts_3
          WHERE post_status = 'inherit'
            AND post_type = 'attachment'
            AND post_mime_type = 'image/jpeg'
         GROUP
             BY post_parent
       ) AS q2    
    ON q2.post_parent = wp_posts_3.ID -- note no parentheses
LEFT OUTER
  JOIN wp_posts_3 AS last_image
    ON last_image.post_parent = wp_posts_3.ID -- note no parentheses
   AND last_image.post_status = 'inherit'
   AND last_image.post_type = 'attachment'
   AND last_image.post_mime_type = 'image/jpeg'
   AND last_image.post_date = q2.latest[/COLOR]
 WHERE wp_posts_3.post_status = 'publish'
ORDER 
    BY wp_posts_3.post_date DESC LIMIT 30

you can see i’ve pushed all that MAX business with the HAVING clause to determine eligibility into a subquery, q1, so that the outer query (which now deals with something else as well) doesn’t have the GROUP BY on it

subquery q2 finds the latest image date for each post (using a LEFT OUTER JOIN in case a post doesn’t have any images), and then an additional join is needed to retrieve the row (and the guid) that corresponds to that latest date