Difference between NULL and NOT NULL?

What is the difference in the NULL and NOT NULL types in mysql? When I insert a blank record into a NULL row it inserts NULL while inserting into NOT NULL it seems to insert a space: " ".

So…what’s the difference between the two and when would you use each type? My primary key it NOT NULL and that makes sense but is there any other use?

Thanks,

MAtt

you want the short answer or the long one? :wink:

if you insert a blank into a field that allows a NULL, you get a blank, not a NULL, as you suggested

if you insert a blank into a field defined NOT NULL, you get a blank

NULL represents the absence of a value

that’s not the same thing as sticking a default zero into a numeric field or a default zero-length string (or worse, a blank) into a character field

the best way to think about NULL is with the DATE_TERMINATED field in an employee record – what do you stick in there for employees that are still on staff? you can’t insert a real date, especially not a phony date like i’ve seen in some apps (e.g. december 31, 2199) – it is this type of thinking that led to the entire y2k fiasco

no, you stick a NULL in when you do not have a value

i could go on for hours about the goodness of NULL

want more?

rudy

want more?

sure, if you don’t mind :slight_smile:

one more example

student grade
tom 75
mary null
john 50
bill 85
fred null

what’s the average grade?

okay, now this –

student grade
tom 75
mary 0
john 50
bill 85
fred 0

what’s the average grade?

okay, now let’s say that you were smart enough that you realized that using a zero as a “default” (i.e. gotta put something in there if you don’t want to use null) is going to give you the wrong average, how do you actually come up with the right calculations for average?

maybe something like

select sum(grade) / count(*)
from yourtable
where grade > 0

how would you do it if you had nulls in there instead?

select avg(grade)
from yourtable

these are simplistic queries and the implication for more complex ones is that you do not want to always have to try to remember when a zero means a real zero and when it means a dummy placeholder absence-of-a-value value

rudy

thanks for the info.

so, if i have a default value of NULL then that means i can check for a value with isset()… is that a good thing to do?

whethere a column allows NULL is different from its default value (except that the default cannot be NULL if the column is defined NOT NULL)

create table nullexamples
( mypk integer not null
, mycol1 char(9) null
, mycol2 char(9) null default ‘oops’
, mycol3 char(9) not null
, mycol4 char(9) not null default ‘aargh’
)

as for checking whether a column is null, use IS NULL, which is part of standard sql syntax

select emplname
from employees
where date_terminated is null

rudy

thanks for the great thread. helped me clarify some things.

The only think I’d add is only use NULL when you need to, as you’ll get faster SELECTs on columns defined NOT NULL. All this means is that it’s a good idea to have a clear idea about what will go into the column before you code the DB schema, but that’s true for a million other reasons as well.

Originally posted by r937
… the best way to think about NULL is with the DATE_TERMINATED field in an employee record – what do you stick in there for employees that are still on staff? you can’t insert a real date, especially not a phony date like i’ve seen in some apps (e.g. december 31, 2199) – it is this type of thinking that led to the entire y2k fiasco …

Just wanted to add:
Why have a date_terminated field in an employee table?
I guess (or hope?) that almost everyone in that table will have date_terminated == null (unless your company is very old)
I think it’s better to have a separate table with employee_events (employee_id fk, event_type fk, date…)
The same events table could be used instead of columns for date_employed, date_maternity_leave etc in the employee table
OK, no one will have null in date_employed, but I still think it’s a good idea to have the employee’s “history” in a separate table.

And - I don’t think inserting phony dates was the biggest y2k problem, I think it was using two digits for year representation :wink:

And - I don’t think inserting phony dates was the biggest y2k problem, I think it was using two digits for year representation

True, but they’re all in a similar ‘I don’t want to bother with the important little details which will bite me in the *** later’ mentality… :smiley:

good stuff…

thanks, hooha, that is exactly what i meant

jofa, i know what you’re suggesting, but that sort of semantic clarity often introduces needless complexity

i mean, consider your extra table to store employee events

if the president comes in to my cubicle and says “hey rudy, you got an employee database, right? how many employees did we have on january 1?”, i would have to do a left outer join from the employee table to the employee event table, use a filter to look for the “terminated” event, be able to count the correct number of rows for all employees, whether the employee table matched a row in the employee events table or not, and boy oh boy i had better know whether to put the date check for january 1 into the ON conditon of the left outer join or into the WHERE clause, and i guess i also have to know under what circumstances COUNT(*) is going to work correctly or not…

… versus the following:

select count(*)
  from employees
 where date_terminated is not null
    or date_terminated > '2002-01-01'

i fully understand your suggestion, but if i’ve learned anything in over twenty-five years of database design, it’s knowing when to stop

:wink:

edited? yeah, i had < when i shoulda had >
but at least i caught it within 2 minutes

Why left outer join from the employee table if the president only wants to know how many?

select
sum(case when evt_type = 1 then 1 else 0 end) - sum(case when evt_type = 2 then 1 else 0 end)
from emp_evt
where date <= ‘2002-01-01’ and evt_type between 1 and 2

(event type 1 = employed, 2 = terminated)

nice one

this is fun

what if the employee left and was rehired before 2002

now we gotta throw a DISTINCT in there somehow, right?

you other people must think jofa and me are just fooling around, but we’re not

this is a beautiful example of why database designers should never, ever, be allowed to design in a vacuum

in fact, they should understand not only storage and archive and history requirements, but be prepared for things like counting

“sure,” i can hear all of you say, “counting, that’s trivial”

no it isn’t

(are we still more or less on topic about nulls? i think so)

Originally posted by r937
what if the employee left and was rehired before 2002
now we gotta throw a DISTINCT in there somehow, right?

Hmmm, no.
There will be two rows with the emp_id == X & evt_type == 1,
and one row with emp_id == X & evt_type == 2
2-1 = 1 :slight_smile: => one employed person to add to the sum

This discussion reminds me of the story about counting metro passengers in Buenos Aires. The results were very confusing, because the total number of persons entering the metro was higher than the total count at all the exits. Some people entered the metro, and never exited? :smiley: