Streamline a Query in MySQL

bostboy · April 14, 2015, 5:35pm

I have a query that works but it seems to me there ought to be a better way to write it. I am trying to get a single record out of a possible set of records that has the MAX effectivedate value. So kind of go look for these records and return to me the one with the most recent effective date. Here is what I have that works.

select * from table1 as t1
inner join table2 as t2 on t2.pricingmasterid = t1.pricingmasterid
where t2.companyid = 102 and t1.pricetype = 'DFS' and t1.expirationdate > now() and t1.effectivedate in
(select max(t1.effectivedate) from table1 as t1
inner join table2 as t2 on t2.pricingmasterid = t1.pricingmasterid
where t2.companyid = 102 and t1.pricetype = 'DFS' and t1.expirationdate > now())

r937 · April 15, 2015, 12:39am

what is the one-to-many relationship here?

pretty hard to figure it out from your table names

bostboy · April 15, 2015, 1:05pm

t2 has companyid and pricingmasterid as the key and t1 has pricingmasterid as its key. I know the companyid and am trying to find the latest effective price list. t1 also has the pricetype and the effective and expiration dates of the price lists. A price list can have an expiration date later than today and would still be used up until its expiration date if it was on an active quote and the quote had not yet been converted to a PO.

So I do 2 searches. The first is to see if there is a list that has an effective date less than today and an expiration date that is null (only one price list of a type can have a null expiration date). So in the case where a new list has been added that is not effective until next week, in the meantime I want to use the most current of the expiring lists. There will always be at least one with an expiration date greater than the new list’s effective date.

If the first search returns no results, I am now looking for the list with an expiration date greater than today and since there can be more than one, I want the one with the most recent effective date. That is what the “in” statement is doing, getting the most recent effective date. I hope that makes sense. Thanks

r937 · April 15, 2015, 1:48pm

[quote=“bostboy, post:3, topic:122216, full:true”]
t2 has companyid and pricingmasterid as the key and t1 has pricingmasterid as its key.[/quote]

so there’s only one row in t1 for each pricingmasterid, and multiple rows in t2 for any given pricingmasterid?

so why are there multiple dates in t1? what do the dates refer to, if there’s only one row per pricingmasterid?

i’m still having trouble understanding the relationships here

bostboy · April 15, 2015, 2:15pm

hahaha this is why I’m not a teacher. There is indeed only one row in t1 for each pricingmaster and a pricingmaster can be assigned to multiple companies in t2. A company can be assigned multiple pricingmasters in t2 and in t1 I indicate the effective date and expiration date of each price list.

The reason for this is it is a quoting system. So a customer can create a quote and and once created the company will honor that pricing until the price list expires even if there is a new list in effect. So if the customer creates a new quote after the new list (null expiration date) is in place, the new quote will use the new list while the old existing quote may still be using the older list.

The other twist is that when a new list is put in place it has no expiration date but they are required to give an expiration date to the list it is replacing for that customer since they can only have one “current” list of each price type. So I may have mulitple lists that have been replaced and that have expiration dates but will only ever have one list with a null expiration date. And that one list with the null expiration date may not be effective until next Monday. So in that case, for a new quote I want to use the latest list (the list with the latest effective date) with an expiration date that is later than today.

List 1 - eff 1/1/2015 exp 4/30/2015
List 2 - eff 4/1/2015 exp 4/30/2015
List 3 - eff 4/20/2015 exp null

So the most current list if I created quote today would be list 2. List 3 is not yet effective.

So my first query is to see if there is a null expiration date with an effective date < today. If not, as in the example above, I want to find the list that expires after today and if there is more than one (as in the example) I want the one with the latest effective date.

In the vast majority of cases the first query will return the list. But you know programming, I have to account for the 1 in 1,000 that looks like the case above. Does that help?

r937 · April 15, 2015, 4:42pm

[quote=“bostboy, post:5, topic:122216”] Does that help?
[/quote]yes

so pricingmasterid is presumably just an auto_increment or something

i think it was your FROM that threw me off… it looks like you’re grabbing a specific t1 row and then seeing if it matches a t2 row for a given company

but instead, i now understand it as grabbing a t2 row for a specific company, and then seeing if it has a matching t1 row for a given pricetype

bostboy · April 15, 2015, 6:13pm

Yes, pricingmasterid is auto-increment. So given all that, am I doing the query correctly or is there a better way to write that?

r937 · April 15, 2015, 6:22pm

my first attempt… this should replace both of your 2 queries

SELECT * FROM table2 AS t2 INNER JOIN table1 AS t1 ON t1.pricingmasterid = t2.pricingmasterid AND t1.pricetype = 'DFS' AND t1.effectivedate <= CURRENT_DATE AND COALESCE(t1.expirationdate,'2937-12-31') > CURRENT_DATE WHERE t2.companyid = 102 ORDER BY t1.effectivedate DESC LIMIT 1

bostboy · April 15, 2015, 6:37pm

Excellent, let me digest that and I will report back. Thanks for the input and sorry the explanation was so vague.

bostboy · April 16, 2015, 1:27pm

It works to pull the non-null result (which is what I am looking for in my second query) but if there is a null I want that one first. So in my previous example if you change entry 3 to have an effective date less than today it should pull the #3, null expiration date. Only in the case where there is no null expiration date will it go after the expiration date greater than today.

So as I understand COALESCE it goes after the first non null value and only if there is no non-null value does it pull the null. I need just the opposite, go after the null first and if there is no null, then get the non-null expiration date.

bostboy · April 16, 2015, 1:35pm

But this works!! The key is the LIMIT 1 and the ORDER BY. Of course, this assumes that the null will have a later effective date than all those with expiration dates but that will always be true. I can’t back date price lists. Sorry, I should have mentioned that. Thanks so much for the help

SELECT * 
  FROM table2 AS t2
INNER 
  JOIN table1 AS t1 
    ON t1.pricingmasterid = t2.pricingmasterid
   AND t1.pricetype = 'DFS'
   AND t1.effectivedate <= CURRENT_DATE 
   AND (t1.expirationdate > CURRENT_DATE or t1.expirationdate IS NULL)
 WHERE t2.companyid = 102  
ORDER
    BY t1.effectivedate DESC LIMIT 1

r937 · April 16, 2015, 1:49pm

i want you to look really hard at the difference between this –

AND (t1.expirationdate > CURRENT_DATE or t1.expirationdate IS NULL)
and this –

AND COALESCE(t1.expirationdate,'2937-12-31') > CURRENT_DATE
and then tell me under what circumstances they will ~not~ evaluate the same true or false result

bostboy · April 16, 2015, 2:04pm

I believe that if the effective date of a record is less than today and it’s expiration date is null, and another record’s expiration date is greater than today, the coalesce will return the record with the expiration date. However, in that case, I want the record with the null expiration date before the expiration date > today.

At least that’s what happened when I ran it. Am I missing something? Maybe I am…

Is the COALESCE returning the expiration date if it is not null and ‘2937-12-31’ if expiration date is null? Let me run that query again and see what I am getting.

bostboy · April 16, 2015, 2:15pm

I stand corrected, it works fine. The first time I ran it I must have had a data issue. Thanks again for pointing that out. And the COALESCE is a better way to accomplish this and I will put it in now. Cheers!!

bostboy · April 16, 2015, 3:02pm

One other question about your query. When I am writing a query I tend to put the INNER JOIN criteria in the ON part with the inner join and then all the other conditions in the where clause. You have all the conditions related to t1 in the INNER JOIN section and the t2 conditions in the where clause.

First, from what I could tell it will work either way, is that right? And second, what is the reasoning behind that? Is it because t1 is a secondary table in this case because all the SELECT arguments are from t2?

If I were to SELECT t1.xx, t2.yy in the query would you write it differently? Thanks

r937 · April 16, 2015, 3:08pm

correct… for inner joins, the optimizer figures it out easily

for outer joins, there is a huge semantic difference

no, the columns being selected have nothing to do with the way i write my FROM clauses

i try to mention the tables in the order that makes sense given which table is being “restricted” the most

your query is a great example – you’re looking for specific types of t1 rows for a single t2 company, rather than specific t2 company rows for a single t1 type

system · July 16, 2015, 10:15pm

This topic was automatically closed 91 days after the last reply. New replies are no longer allowed.