Working with large knowledge requires consideration to particulars, testing, retesting, evaluating knowledge returned by the saved procedures, views, desk valued capabilities and many others. with the anticipated outcome. So understanding the core ideas of SQL is a should.
I’m presently working for an insurance coverage firm and the quantity of knowledge they’ve is large. One in all their major performance is having experiences the place they’ll interpret the information as they want.
Whereas testing one thing on one in every of their experiences, I discovered a bug within the saved process that fed the report with knowledge. They have been attempting to JOIN two units of knowledge, the left hand facet contained NULL on a selected column, and the precise facet contained 0 on the identical explicit column.
Whilst you might imagine that representing NULL on the precise hand facet was the repair to go to, this would not have labored. As a substitute, this was the purpose the place I assumed representing NULLs with 0s would work. I am going to exhibit beneath why NULL wasn’t working and the way representing it with 0 fastened this.
Let’s think about the next instance:
LEFT hand facet:
VALUES (1, null), (2, 1), (3, 2)) leftHandSide(Id, fk)
RIGHT hand facet:
VALUES (null, '0'), (1, '1'), (2, '2')) rightHandSide(Id, fk)
I am performing a JOIN between them:
SELECT * FROM (VALUES (1, null), (2, 1), (3, 2)) leftHandSide(Id, fk) INNER JOIN (VALUES (null, '0'), (1, '1'), (2, '2')) rightHandSide(Id, fk) ON rightHandSide.Id = leftHandSide.fk
And I need my outcome set to appear like this, it will not work.
Id fk Id fk 1 NULL NULL NULL 2 1 1 1 3 2 2 2
As a substitute I’ll solely get the final 2 rows
Id fk Id fk 2 1 1 1 3 2 2 2
When you could have NULL values in a column, these won’t ever be matched to different NULL values. It’s because NULL signifies the absence of any worth, and can’t be in contrast as they are going to by no means equal something.
So, what I did was to signify NULL to 0 in order that my outcome set comprises the three rows:
Id fk Id fk 1 0 0 0 2 1 1 1 3 2 2 2
by merely doing
SELECT * FROM (VALUES (1, ISNULL(NULL, 0)), (2, 1), (3, 2)) leftHandSide(Id, fk) INNER JOIN (VALUES (ISNULL(NULL, 0), '0'), (1, '1'), (2, '2')) rightHandSide(Id, fk) ON rightHandSide.Id = leftHandSide.fk
Lesson discovered: each time you must take care of NULL, and you do not perceive why your outcome set is lacking some rows that you simply count on, take into consideration representing NULLs to 0s.