What is the difference in the NULL and NOT NULL types in mysql? When I insert a blank record into a NULL row it inserts NULL while inserting into NOT NULL it seems to insert a space: " ".
So…what’s the difference between the two and when would you use each type? My primary key it NOT NULL and that makes sense but is there any other use?
if you insert a blank into a field that allows a NULL, you get a blank, not a NULL, as you suggested
if you insert a blank into a field defined NOT NULL, you get a blank
NULL represents the absence of a value
that’s not the same thing as sticking a default zero into a numeric field or a default zero-length string (or worse, a blank) into a character field
the best way to think about NULL is with the DATE_TERMINATED field in an employee record – what do you stick in there for employees that are still on staff? you can’t insert a real date, especially not a phony date like i’ve seen in some apps (e.g. december 31, 2199) – it is this type of thinking that led to the entire y2k fiasco
no, you stick a NULL in when you do not have a value
i could go on for hours about the goodness of NULL
student grade
tom 75
mary null
john 50
bill 85
fred null
what’s the average grade?
okay, now this –
student grade
tom 75
mary 0
john 50
bill 85
fred 0
what’s the average grade?
okay, now let’s say that you were smart enough that you realized that using a zero as a “default” (i.e. gotta put something in there if you don’t want to use null) is going to give you the wrong average, how do you actually come up with the right calculations for average?
maybe something like
select sum(grade) / count(*)
from yourtable
where grade > 0
how would you do it if you had nulls in there instead?
select avg(grade)
from yourtable
these are simplistic queries and the implication for more complex ones is that you do not want to always have to try to remember when a zero means a real zero and when it means a dummy placeholder absence-of-a-value value
thanks for the great thread. helped me clarify some things.
The only think I’d add is only use NULL when you need to, as you’ll get faster SELECTs on columns defined NOT NULL. All this means is that it’s a good idea to have a clear idea about what will go into the column before you code the DB schema, but that’s true for a million other reasons as well.
Originally posted by r937 … the best way to think about NULL is with the DATE_TERMINATED field in an employee record – what do you stick in there for employees that are still on staff? you can’t insert a real date, especially not a phony date like i’ve seen in some apps (e.g. december 31, 2199) – it is this type of thinking that led to the entire y2k fiasco …
Just wanted to add:
Why have a date_terminated field in an employee table?
I guess (or hope?) that almost everyone in that table will have date_terminated == null (unless your company is very old)
I think it’s better to have a separate table with employee_events (employee_id fk, event_type fk, date…)
The same events table could be used instead of columns for date_employed, date_maternity_leave etc in the employee table
OK, no one will have null in date_employed, but I still think it’s a good idea to have the employee’s “history” in a separate table.
And - I don’t think inserting phony dates was the biggest y2k problem, I think it was using two digits for year representation
jofa, i know what you’re suggesting, but that sort of semantic clarity often introduces needless complexity
i mean, consider your extra table to store employee events
if the president comes in to my cubicle and says “hey rudy, you got an employee database, right? how many employees did we have on january 1?”, i would have to do a left outer join from the employee table to the employee event table, use a filter to look for the “terminated” event, be able to count the correct number of rows for all employees, whether the employee table matched a row in the employee events table or not, and boy oh boy i had better know whether to put the date check for january 1 into the ON conditon of the left outer join or into the WHERE clause, and i guess i also have to know under what circumstances COUNT(*) is going to work correctly or not…
… versus the following:
select count(*)
from employees
where date_terminated is not null
or date_terminated > '2002-01-01'
i fully understand your suggestion, but if i’ve learned anything in over twenty-five years of database design, it’s knowing when to stop
edited? yeah, i had < when i shoulda had >
but at least i caught it within 2 minutes
Why left outer join from the employee table if the president only wants to know how many?
select
sum(case when evt_type = 1 then 1 else 0 end) - sum(case when evt_type = 2 then 1 else 0 end)
from emp_evt
where date <= ‘2002-01-01’ and evt_type between 1 and 2
Originally posted by r937 what if the employee left and was rehired before 2002
now we gotta throw a DISTINCT in there somehow, right?
Hmmm, no.
There will be two rows with the emp_id == X & evt_type == 1,
and one row with emp_id == X & evt_type == 2
2-1 = 1 => one employed person to add to the sum
This discussion reminds me of the story about counting metro passengers in Buenos Aires. The results were very confusing, because the total number of persons entering the metro was higher than the total count at all the exits. Some people entered the metro, and never exited?