I am new to SQL and started with Simply SQL by Rudy Limeback. I really like the book. I am running a mysql server on a windows machine. My question utilizes the entries table from the content management system example. Actually, the question is much more basic. When we join two tables with a common field (primary key, foreign key), the resulting set (table) contains complete records for both tables that meet the selected criterion. However, when we operate on a single table, the resulting set is not always made up of entire records. For example, in the entries table if I run a query select id, category, title, created from entries where id = 1 or id = 2 or id = 3 or id = 4;
I will get 4 records from the entries which are complete. By complete, I mean that the category, title, time created will all be for the id = 1 or 2 or 3 or 4 in the original table.
mysql> select id, category, title, created from entries where id = 1 or id =2 or id =3 or id = 4;
±—±---------±----------------------------±--------------------+
| id | category | title | created |
±—±---------±----------------------------±--------------------+
| 1 | angst | What If I get Sick and Die? | 2015-04-12 20:57:24 |
| 2 | humor | Uncle Karl and the Gasoline | 2015-04-14 00:08:11 |
| 3 | advice | Be Nice to Me | 2015-04-12 23:09:35 |
| 4 | humor | Hello Statue | 2015-04-14 00:08:39 |
±—±---------±----------------------------±--------------------+
However, if I use the max(created) function to obtain the time at which say, the most recent humor article was created, I do not get the entire record when I enter the following query
mysql> select id, category, title, max(created) from entries;
±—±---------±----------------------------±--------------------+
| id | category | title | max(created) |
±—±---------±----------------------------±--------------------+
| 1 | angst | What If I get Sick and Die? | 2015-04-14 00:54:22 |
±—±---------±----------------------------±--------------------+
1 row in set (0.00 sec)
In this result, the id is 1, the category and the title are from record 1 but the max(created) time is correct but from an entirely different record. In order to get one record, we have to design a query with an inner join of the entries table on itself. Could someone please explain the reasons why sql is designed this way. Clearly, it must useful for more complicated queries and it would be nice to understand the reason.
Thank you very much in advance for your help.
@Ophelie it is not that we don’t know the answer to this question is more like it is good to be answered by the writer himself when you have the chance
[quote=“saleemhasan, post:1, topic:133387, full:true”]Could someone please explain the reasons why sql is designed this way.[/quote]sql is designed this way because logic
the reason you get some columns from the id=1 row, but the max(created) value from a different row, is because the sql that you ran is invalid –
select id, category, title, max(created) from entries
here you have three non-aggregate columns and one aggregate column in the SELECT clause
however, you’re missing the GROUP BY clause, and so the entire query is invalid
yes, i know it runs… mysql will happily try to execute even invalid sql
I will definitely read the article you recommend to find out about the issues with non-aggregate columns being included with the aggregate columns.
I must point out that even if I “include group by category” in the select statement, I get the correct (most recent) timestamp for each category. However, the id and title (probably even the category) are not from the same record as the value of max(created). The values of id and title are from the very first record in that category and Not those that are in the record for the correct max(created).
mysql> select id, category, title, max(created) from entries group by category;
±—±----------±----------------------------±--------------------+
| id | category | title | max(created) |
±—±----------±----------------------------±--------------------+
| 3 | advice | Be Nice to Me | 2015-04-12 23:09:35 |
| 1 | angst | What If I get Sick and Die? | 2015-04-12 20:57:24 |
| 6 | computers | Windows Media Center Rocks | 2015-04-14 00:54:22 |
| 2 | humor | Uncle Karl and the Gasoline | 2015-04-14 00:08:39 |
| 5 | science | The Size of Our Galaxy | 2015-04-12 21:09:21 |
±—±----------±----------------------------±--------------------+
5 rows in set (0.02 sec)
Reminds me of the old complaint about computers – How they always give me what I ask for but never what I want
Thank you very much for recommending that wonderful article. I believe I understood quite a bit of it. However, I do have a question.
The way I understood it, the rule is to add to the group by clause, all the columns listed in the select statement that are not functionally dependent on the existing column in the group by clause. For example, in the entries table, if I have a statement
select category, title from entries group by category;
The rule recommends that I add title to the group by clause in order to make it a correct query.
select category, title from entries group by category, title;
However, the resulting set will not group by category.
Why doesn’t it also ask me to check the list of fields in the select statement as one or more may be incompatible (not functionally dependent on the column in the group by clause) with the column in the group by clause. After all, even though correct, the changed query does not give me the desired result.
[quote=“saleemhasan, post:10, topic:133387, full:true”]Why doesn’t it also ask me to check the list of fields in the select statement …[/quote]i guess because it doesn’t know what you really want
if you want one row per category, but you also want to show a title, it’s really up to you to specify which title you want