Understanding NULL Semantics in SQL

Last Updated: June 17, 2026

Categories:

Written by:
Tihomir Babic

NULL in SQL is often a source of confusion, leading to bug propagation. Learning how NULL behaves in different scenarios to keep your query outputs trustworthy.

NULL is one of the most misunderstood concepts in SQL. It isn’t a value; it’s the absence of a value.

Having that crucial distinction in mind will save you from wrong filtering, counting, and joining results.

To drive that single fact home, we’ll have a detailed look at NULL: what it is, its behavior in different contexts, and how to stop it from breaking your code logic.

What NULL Actually Means in SQL

NULL means information is unknown or missing. Once again: it isn’t a value, it isn’t zero, it isn’t false, it isn’t an empty string. NULL means we don’t know what the value is.

Understanding the NULL behavior depends on that single piece of information. If the value is unknown, you can’t say it equals or doesn’t equal something.

NULL was introduced in relational database theory to handle real-world scenarios where information is incomplete or has not yet been collected.

You might ask now, how is that different from a blank (empty string)? Isn’t that also “a real-world scenario where information is incomplete”?

No, it isn’t. If, for example, phone = ' ', that’s a value meaning we asked for the phone number and stored it as blank because, for example, the user doesn’t have a phone number. If phone = NULL, you don’t know whether they have a phone or not, and, if they do, what the number is.

In practice, NULLs appear wherever data collection is incomplete or asynchronous: a user who registered but never completed onboarding has a NULL last_login timestamp; an event logged from a mobile client with a dropped connection may arrive with a partial payload and NULL fields; an order created before payment confirmation has a NULL in completed_at. The SQL you write to analyze these systems has to account for that incompleteness deliberately.

Why = NULL Never Works in SQL

Let’s answer that by solving an interview question.

It requires you to simply select the required columns and filter out the “users who haven't provided their phone numbers yet.” This means filtering out the NULLs.

A phone number is a classic optional attribute in a user table – not everyone provides one, so the column is nullable by design.

Users Missing Phone Numbers

Last Updated: April 2025

EasyID 2168

The product team is launching a new WhatsApp notification feature and needs to identify users who haven't provided their phone numbers yet. These users will be shown a prompt to add their contact information.

Find all users who have not provided a phone number. Return the user ID and name.

Go to the Question

It’s easy to slip up here. It’s even expected, after you’ve been writing query after query where you compare numeric, date, or string data in WHERE using the comparison operators (>, <, =, etc.). Then comes filtering by NULL, and many an inexperienced user (even some experienced ones) would automatically write the filtering condition this way.

PostgreSQL

Go to the question on the platformTables: fintech_app_users

If I told you this doesn’t work, what do you think the query would return? A syntax error? No, = NULL is perfectly valid SQL syntax. The query will run smoothly, no problem at all. (Run it yourself if you want the proof.)

However, that’s not the point. The point is that using = NULL is semantically wrong, not syntactically. The database perfectly understands what you’re asking. The thing is, you don’t understand, so the output surprises you.

The output shows zero rows. Why is that? Because SQL uses three-valued logic (3VL), which means conditions can evaluate to TRUE, FALSE, or NULL. Any comparison involving NULL produces NULL. Not FALSE, not an error, just NULL. The WHERE clause only returns rows where the condition evaluates to TRUE. As the result of the comparison in the code above is NULL, there are no rows to show. (This applies to all other comparison operators, including !=.)

user_id	user_name

How to Output the NULL Rows

The only correct syntax for getting the NULL rows is IS NULL.

PostgreSQL

Go to the question on the platformTables: fintech_app_users

Here’s the output.

user_id	user_name
4	Liam Martinez
5	Ava Rodriguez
6	Oliver Lopez
10	Lucas Thomas
11	Mia Taylor
17	Ethan Thompson
18	Evelyn White
20	Abigail Sanchez
22	Emily Ramirez
24	Elizabeth Robinson
27	Alexander Allen
31	Henry Torres
36	Victoria Baker
41	David Roberts
44	Zoey Campbell
47	Jack Edwards
51	Ryan Morris
53	Isaac Rivera
54	Eleanor Cook
55	Gabriel Rogers
56	Hazel Reed
58	Stella Bell
59	Anthony Cooper

Understanding NULL With AND / OR Logic

The existence of NULL and 3VL changes how the boolean (or logical) operators behave. That behavior can feel counterintuitive.

NULL With AND

You’d expect most of those outcomes, except for two.

The first is TRUE AND NULL, which results in NULL. The reason is that we can’t determine the outcome without a known value.

As for FALSE AND NULL, it evaluates as FALSE. It doesn’t matter what the unknown value is; one operand is already FALSE, so the whole expression must be FALSE.

You can look at it this way: NULL is “stronger” than TRUE, but it’s not “stronger” than FALSE.

NULL With OR

The situation with OR is different. TRUE OR NULL returns TRUE because one side is already true, regardless of the unknown value. FALSE OR NULL returns NULL because the result depends on the unknown value.

In other words, in OR, TRUE is “stronger” than NULL, while NULL is “stronger” than FALSE.

NOT With NULL

The NOT operand is a negation. The negation of an unknown is still unknown, so the result is NULL.

How NULL Breaks Filters

If you don’t know how NULL behaves in certain logical and comparison operators, your filters will silently break.

Let’s explore how and when this happens, so you can avoid it when it matters.

Inequality Filters Silently Exclude NULLs

I’ll use this question’s dataset.

Worst Businesses

Last Updated: May 2018

HardID 9739

Identify the business with the most violations each year, based on records that include a violation ID. For each year, output the year, the name of the business with the most violations, and the corresponding number of violations. Note: If multiple businesses are tied for the highest number of violations in a given year, returning any one of them is acceptable.

Go to the Question

However, I’ll change the requirements, so the new question is:

“Find all restaurant inspections that were not classified as low risk. Output the business name, inspection date, and risk category.”

This is a segmentation query – partitioning inspections by risk level to prioritize follow-up actions. In any segmentation workflow, silently dropping a segment (the NULL rows here) means your segments don't add up to the whole, and downstream decisions are made on incomplete data.

Dataset

The risk_category column can have these values: Low Risk, Moderate Risk, High Risk, and some rows with nothing in the column.

Table: sf_restaurant_health_violations

business_id	business_name	business_address	business_city	business_state	business_postal_code	business_latitude	business_longitude	business_location	inspection_id	inspection_date	inspection_score	inspection_type	violation_id	violation_description	risk_category
5800	John Chin Elementary School	350 Broadway St	San Francisco	CA	94133	37.8	-122.4	{'longitude': '-122.403154', 'needs_recoding': False, 'latitude': '37.798358', 'human_address': '{"address":"","city":"","state":"","zip":""}'}	5800_20171017	2017-10-17	98	Routine - Unscheduled	5800_20171017_103149	Wiping cloths not clean or properly stored or inadequate sanitizer	Low Risk
64236	Sutter Pub and Restaurant	700 Sutter St	San Francisco	CA	94102	37.79	-122.41	{'longitude': '-122.41188', 'needs_recoding': False, 'latitude': '37.78881', 'human_address': '{"address":"","city":"","state":"","zip":""}'}	64236_20170725	2017-07-25	88	Routine - Unscheduled	64236_20170725_103133	Foods not protected from contamination	Moderate Risk
1991	SRI THAI CUISINE	4621 LINCOLN Way	San Francisco	CA	94122	37.76	-122.51	{'longitude': '-122.507779', 'needs_recoding': False, 'latitude': '37.764073', 'human_address': '{"address":"","city":"","state":"","zip":""}'}	1991_20171129	2017-11-29	86	Routine - Unscheduled	1991_20171129_103139	Improper food storage	Low Risk
3816	Washington Bakery & Restaurant	733 Washington St	San Francisco	CA	94108	37.8	-122.41	{'longitude': '-122.405845', 'needs_recoding': False, 'latitude': '37.795174', 'human_address': '{"address":"","city":"","state":"","zip":""}'}	3816_20160728	2016-07-28	67	Routine - Unscheduled	3816_20160728_103108	Contaminated or adulterated food	High Risk
39119	Brothers Restaurant	4128 GEARY Blvd	San Francisco	CA	94118	37.78	-122.46	{'longitude': '-122.463762', 'needs_recoding': False, 'latitude': '37.781148', 'human_address': '{"address":"","city":"","state":"","zip":""}'}	39119_20160718	2016-07-18	79	Routine - Unscheduled	39119_20160718_103133	Foods not protected from contamination	Moderate Risk

Loading Dataset

The Empty Cell That Isn’t Empty

If you got this question in the interview, you’d be happy. It’s soooooo easy! Just filter with WHERE risk_category != 'Low Risk', and that’s it.

PostgreSQL

Go to the question on the platformTables: sf_restaurant_health_violations

Let’s run this code and inspect the output.

business_name	inspection_date	risk_category
Sutter Pub and Restaurant	2017-07-25	Moderate Risk
Washington Bakery & Restaurant	2016-07-28	High Risk
Brothers Restaurant	2016-07-18	Moderate Risk
T & L FOOD MARKET	2016-06-09	High Risk
Jiang Ling Cuisine Restaurant	2017-12-12	High Risk
Big Fish Little Fish Poke	2018-02-27	Moderate Risk
Laguna Café	2018-07-11	Moderate Risk
Home Plate	2016-10-17	High Risk
SO	2017-12-14	High Risk
Allstars Cafe Inc	2017-08-29	Moderate Risk
Peet's Coffee & Tea	2017-03-27	Moderate Risk
Veraci Pizza	2017-02-23	Moderate Risk
Sharetea	2016-03-31	Moderate Risk
Let's Be Frank	2016-05-13	Moderate Risk
IL BORGO	2018-03-28	Moderate Risk
Boss Supermarket	2017-10-03	Moderate Risk
SAKANA BUNE RESTAURANT	2018-02-22	High Risk
Maggie Cafe	2015-12-02	Moderate Risk
Golden Wok	2017-07-20	High Risk
China Fun Express	2016-03-03	Moderate Risk
Taqueria Dos Charros	2016-02-29	High Risk
YUMMA'S MED GRILL	2016-06-09	Moderate Risk
West Coast Wine & Cheese	2016-11-08	Moderate Risk
SUBWAY #31419	2016-06-27	Moderate Risk
Carbon Grill	2017-03-28	Moderate Risk
San Francisco Marriott Union Square - Main Kitchen	2016-06-28	Moderate Risk
Chinatown Restaurant	2016-01-07	Moderate Risk
HAMANO SUSHI	2017-08-23	Moderate Risk
Souvla	2017-04-03	Moderate Risk
MARTHA & BROS. COFFEE CO	2017-07-07	Moderate Risk
The Grove - Design District	2018-04-27	Moderate Risk
King of Thai Noodles Cafe	2016-12-12	Moderate Risk
Andersen Bakery	2017-11-14	Moderate Risk
IRVING PIZZA	2016-05-31	Moderate Risk
Bellissimo Pizza	2016-07-14	Moderate Risk
PEKING WOK RESTAURANT	2015-09-28	Moderate Risk
Bursa	2018-02-22	Moderate Risk
City Super	2016-03-16	Moderate Risk
Hot Pot Island	2017-08-18	High Risk
95117 Premium Commissary Room	2018-03-26	Moderate Risk
Soo Fong Restaurant	2017-06-20	High Risk
Seal Rock Inn Restaurant	2018-06-20	Moderate Risk
Yummy Sticks	2016-06-18	Moderate Risk
Red Jade Restaurant	2017-08-16	High Risk
Buckhorn Grill	2017-08-29	Moderate Risk
Man Sung Company	2016-08-01	Moderate Risk
Rotee Express	2016-08-26	High Risk
Project Juice	2017-10-19	Moderate Risk
DONA TERE'S MARKET	2017-11-27	Moderate Risk
Cecilia's Pizza & Restaurant	2018-05-29	Moderate Risk
Cabin	2015-10-01	Moderate Risk
Mi Yucatan	2015-11-17	Moderate Risk
Fresca Gardens, Inc	2018-01-17	Moderate Risk
Modern Thai Inc.	2016-10-24	Moderate Risk
Pectopah LLC	2018-02-09	Moderate Risk
The Salvation Army	2018-07-24	Moderate Risk
Harvest Urban Market	2016-03-16	High Risk
Bubble Cafe	2017-11-08	Moderate Risk
LOS PANCHOS	2015-12-07	Moderate Risk
S. F. Gourmet Hot Dog Cart	2016-10-31	Moderate Risk
Cathead's BBQ	2017-01-05	High Risk
ITALIAN AMERICAN SOCIAL CLUB	2016-09-07	Moderate Risk
Gateway High/Kip Schools	2017-04-19	High Risk
Toy Boat Dessert Cafe	2016-11-14	Moderate Risk
Iza Ramen	2018-05-29	Moderate Risk
King of Thai Noodle House	2018-05-08	Moderate Risk
Urban Putt	2018-05-09	Moderate Risk
The AA Bakery & Cafe	2016-05-24	High Risk
CHA-AM RESTAURANT	2016-09-01	High Risk
Pica Pica	2018-03-20	High Risk
AK SUBS	2017-07-24	Moderate Risk
Heritage	2018-06-05	Moderate Risk
Castagnola's Restaurant	2017-08-07	High Risk
Park Gyros Castro	2017-09-27	High Risk
SF BAGEL CO. (KATZ BAGELS)	2016-11-29	High Risk
Luke's Local Inc.	2017-11-27	Moderate Risk
GOLDEN PRODUCE	2018-06-05	Moderate Risk
PANCHO'S	2018-01-17	Moderate Risk
India Clay Oven Restaurant and Bar	2016-02-05	Moderate Risk
Pabu	2016-12-07	Moderate Risk
Salem Grocery	2017-12-21	High Risk
Sam Rong Cafe	2017-05-23	High Risk
Dim Sum Bistro	2017-11-30	Moderate Risk
WING HING RESTAURANT	2017-11-06	High Risk
Rock Japanese Cuisine	2016-02-02	Moderate Risk
Chez Julien	2016-05-10	High Risk
Rico Pan Bakery	2016-05-02	Moderate Risk
Modern Thai Inc.	2016-10-24	High Risk
CLEMENT BBQ RESTAURANT	2018-05-22	Moderate Risk
Milkbomb Ice Cream	2018-06-15	Moderate Risk
Morning Brew Cafe	2016-03-15	Moderate Risk
MONGKOK DIM SUM & RESTAURANT	2016-05-24	High Risk
Azalina's	2016-06-23	High Risk
Little Vietnam Cafe	2016-07-27	Moderate Risk
7-Eleven, Store 2366-21389F	2016-12-17	High Risk
Quickly	2016-02-18	Moderate Risk
Jackson Fillmore Trattoria	2016-09-19	Moderate Risk
Prospect	2016-09-16	Moderate Risk
TAWAN'S THAI FOOD	2018-08-02	Moderate Risk
Ninki Sushi Bar & Restaurant	2017-08-24	High Risk
Kate O'Brien's	2017-08-10	Moderate Risk
Elephant Sushi	2016-07-27	High Risk
Roma Pizzeria	2016-08-09	Moderate Risk
Lollipot	2018-05-22	Moderate Risk
Da Cafe	2016-09-07	High Risk
Lollipot	2018-05-22	High Risk
Lollipot	2018-05-22	High Risk
Lollipot	2018-05-22	Moderate Risk
Da Cafe	2016-09-07	High Risk
Da Cafe	2016-09-07	High Risk
Da Cafe	2016-09-07	High Risk
Roxanne Cafe	2016-11-28	Moderate Risk
Roxanne Cafe	2015-09-30	Moderate Risk
Roxanne Cafe	2015-09-30	Moderate Risk
Roxanne Cafe	2018-04-03	Moderate Risk
Roxanne Cafe	2016-11-28	High Risk

It seems correct, as there really are no businesses with the Low Risk category. Well, ‘seems’ does a lot of heavy lifting here. The fact that there are no Low Risk businesses doesn’t mean that all the non-low-risk businesses are actually shown in the output. So, no, it’s not correct. The whole code and the output are wrong!

Your naive approach silently excluded several rows. You immediately jumped to the conclusion that empty fields in risk_category are just fields with no data and that != 'Low Risk' will catch everything that isn’t low risk, including blanks. It didn’t because blanks aren't actually blanks; they are NULL.

How to Tell That NULL Is NULL?

OK, if blanks are actually NULL, why are they shown as empty cells? Because NULL is a database concept, not a display value. Different engines display it differently: as NULL, or (null), or as N/A. Our IDE shows it as a blank field.

How can you then know if NULL is really NULL?

Applying Business Logic

One way to tell these are actually NULL is business thinking. You can see that risk_category is always empty when violation_id is empty. That way, you could assume that, since the table shows inspections, an empty field in violation_id means no violations were found, so there’s also no risk category.

However, this is just an assumption about the business logic; no violation_id value could also be categorized as low risk, for example. You should always clarify with the interviewer.

Explicitly Checking for `NULLs`

You could also check for any NULLs in the data before you solve the question.

COUNT(*) and GROUP BYare great little helpers. GROUP BY treats NULL as a distinct group. COUNT(*) then counts every row within a group, regardless of its value.

PostgreSQL

Go to the question on the platformTables: sf_restaurant_health_violations

(If you, for example, used COUNT(risk_category), the count for the NULL group would be 0, as it omits NULLs from the count calculation. But more on that later.)

Run the code to see if there are any NULLs. Surprise, surprise, there are. Namely, 72. Since they appear in the output, you’d know these are NULLs, despite being shown as an empty field.

risk_category	count
	72
Low Risk	109
Moderate Risk	78
High Risk	38

Recommended Approach: Safeguarding From Possible `NULLs`

The first two approaches rely on your being able to see and query actual data in the interview. That won’t happen very often.

The safest approach – both in interviews and on the job – is to simply assume there could be NULLs and embed this assumption in your solution.

PostgreSQL

Tables: sf_restaurant_health_violations

The output is now complete, as it includes NULLs, too.

business_name	inspection_date	risk_category
Sutter Pub and Restaurant	2017-07-25	Moderate Risk
Washington Bakery & Restaurant	2016-07-28	High Risk
Brothers Restaurant	2016-07-18	Moderate Risk
T & L FOOD MARKET	2016-06-09	High Risk
Jiang Ling Cuisine Restaurant	2017-12-12	High Risk
Tenderloin Market & Deli	2017-03-13
Big Fish Little Fish Poke	2018-02-27	Moderate Risk
Laguna Café	2018-07-11	Moderate Risk
SAFEWAY STORE #964	2016-06-24
Home Plate	2016-10-17	High Risk
Cafe Bakery	2016-03-28
MARTIN L. KING MIDDLE SCHOOL	2015-09-23
ROYAL GROUND COFFEE	2017-12-22
Dolores Park Outpost	2017-10-06
SO	2017-12-14	High Risk
Allstars Cafe Inc	2017-08-29	Moderate Risk
Peet's Coffee & Tea	2017-03-27	Moderate Risk
Veraci Pizza	2017-02-23	Moderate Risk
Sharetea	2016-03-31	Moderate Risk
Let's Be Frank	2016-05-13	Moderate Risk
IL BORGO	2018-03-28	Moderate Risk
Boss Supermarket	2017-10-03	Moderate Risk
Dragoneats	2017-07-12
Nabe	2016-10-19
Dragon Beaux	2015-09-18
SAKANA BUNE RESTAURANT	2018-02-22	High Risk
Maggie Cafe	2015-12-02	Moderate Risk
Live Oak School	2016-08-26
Southern Comfort Kitchen	2018-03-05
Castro Street Chevron	2015-12-24
Golden Wok	2017-07-20	High Risk
Pho Express	2018-06-29
China Fun Express	2016-03-03	Moderate Risk
Taqueria Dos Charros	2016-02-29	High Risk
YUMMA'S MED GRILL	2016-06-09	Moderate Risk
Boos Voni	2017-11-16
Duboce Park Cafe	2018-07-20
West Coast Wine & Cheese	2016-11-08	Moderate Risk
SUBWAY #31419	2016-06-27	Moderate Risk
Carbon Grill	2017-03-28	Moderate Risk
The Bindery	2017-10-23
San Francisco Marriott Union Square - Main Kitchen	2016-06-28	Moderate Risk
Chinatown Restaurant	2016-01-07	Moderate Risk
HAMANO SUSHI	2017-08-23	Moderate Risk
PASITA'S BAKERY	2016-08-04
MV Taurus	2016-05-18
Tacos San Buena	2017-01-27
Souvla	2017-04-03	Moderate Risk
Samiramis Imports	2016-11-10
MARTHA & BROS. COFFEE CO	2017-07-07	Moderate Risk
The Grove - Design District	2018-04-27	Moderate Risk
Jay's Cheesesteak	2017-07-21
King of Thai Noodles Cafe	2016-12-12	Moderate Risk
Andersen Bakery	2017-11-14	Moderate Risk
IRVING PIZZA	2016-05-31	Moderate Risk
Contrada	2016-11-10
Bellissimo Pizza	2016-07-14	Moderate Risk
JIM'S RESTAURANT	2018-05-04
PEKING WOK RESTAURANT	2015-09-28	Moderate Risk
Bursa	2018-02-22	Moderate Risk
City Super	2016-03-16	Moderate Risk
Hot Pot Island	2017-08-18	High Risk
95117 Premium Commissary Room	2018-03-26	Moderate Risk
Koja Kitchen CA01	2016-07-06
Soo Fong Restaurant	2017-06-20	High Risk
Split Bread	2018-03-23
Seal Rock Inn Restaurant	2018-06-20	Moderate Risk
Keep It, Inc.	2017-10-26
Yummy Sticks	2016-06-18	Moderate Risk
Red Jade Restaurant	2017-08-16	High Risk
EL POLLO SUPREMO	2018-07-12
Kuma Sushi + Sake	2017-08-03
Buckhorn Grill	2017-08-29	Moderate Risk
La Quinta Restaurant	2016-08-29
SENIORE'S PIZZA	2016-03-24
Hook a Cook	2015-09-15
Man Sung Company	2016-08-01	Moderate Risk
Rotee Express	2016-08-26	High Risk
Project Juice	2017-10-19	Moderate Risk
DONA TERE'S MARKET	2017-11-27	Moderate Risk
Jersey	2017-03-02
Cecilia's Pizza & Restaurant	2018-05-29	Moderate Risk
Crepe and Brioche, Inc.	2016-06-25
Cabin	2015-10-01	Moderate Risk
The Grove - Design District	2017-06-30
Mi Yucatan	2015-11-17	Moderate Risk
Jane the Bakery	2016-12-02
North Point Market	2018-06-15
Wing Lum Cafe	2017-10-25
Blue Bottle Coffee	2016-07-20
Fresca Gardens, Inc	2018-01-17	Moderate Risk
Modern Thai Inc.	2016-10-24	Moderate Risk
Pectopah LLC	2018-02-09	Moderate Risk
The Salvation Army	2018-07-24	Moderate Risk
Harvest Urban Market	2016-03-16	High Risk
Roadside Rosy's	2016-08-01
Tupelo	2016-08-05
Del Popolo LLC	2017-01-24
Bubble Cafe	2017-11-08	Moderate Risk
Cafe Fiore	2016-02-11
LOS PANCHOS	2015-12-07	Moderate Risk
A La Turca	2017-09-05
VIP Coffee & Cake Shop	2018-05-21
S. F. Gourmet Hot Dog Cart	2016-10-31	Moderate Risk
Flores	2016-11-16
Pho Huynh Sang	2017-06-12
Cathead's BBQ	2017-01-05	High Risk
ITALIAN AMERICAN SOCIAL CLUB	2016-09-07	Moderate Risk
Poke Kana	2016-07-22
AT&T Park - Coffee and Ice Cream (5A+5B)	2016-08-14
Gateway High/Kip Schools	2017-04-19	High Risk
Taco Bell Cantina #31685	2016-09-19
House of Bagels	2017-05-03
Toy Boat Dessert Cafe	2016-11-14	Moderate Risk
Iza Ramen	2018-05-29	Moderate Risk
King of Thai Noodle House	2018-05-08	Moderate Risk
Urban Putt	2018-05-09	Moderate Risk
Bebebar Juice & Sandwich	2016-11-09
The AA Bakery & Cafe	2016-05-24	High Risk
Cream	2016-01-06
CHA-AM RESTAURANT	2016-09-01	High Risk
Pica Pica	2018-03-20	High Risk
AK SUBS	2017-07-24	Moderate Risk
Heritage	2018-06-05	Moderate Risk
A Mano	2017-05-05
Castagnola's Restaurant	2017-08-07	High Risk
Park Gyros Castro	2017-09-27	High Risk
SF BAGEL CO. (KATZ BAGELS)	2016-11-29	High Risk
Blue Bottle Coffee	2018-01-17
Thai Cottage Restaurant	2018-07-25
Luke's Local Inc.	2017-11-27	Moderate Risk
Fair Trade Cafe LLC	2017-09-29
GOLDEN PRODUCE	2018-06-05	Moderate Risk
PANCHO'S	2018-01-17	Moderate Risk
Project Juice	2016-02-01
India Clay Oven Restaurant and Bar	2016-02-05	Moderate Risk
Pabu	2016-12-07	Moderate Risk
My Ivy Corp.	2016-09-14
Salem Grocery	2017-12-21	High Risk
Sam Rong Cafe	2017-05-23	High Risk
Dim Sum Bistro	2017-11-30	Moderate Risk
Bayshore Taqueria	2018-05-01
WING HING RESTAURANT	2017-11-06	High Risk
Brendas Meat & Three	2018-04-25
Rock Japanese Cuisine	2016-02-02	Moderate Risk
BLOWFISH SUSHI	2015-09-30
Chez Julien	2016-05-10	High Risk
Rico Pan Bakery	2016-05-02	Moderate Risk
Modern Thai Inc.	2016-10-24	High Risk
CLEMENT BBQ RESTAURANT	2018-05-22	Moderate Risk
SEGAFREDO	2018-02-12
Milkbomb Ice Cream	2018-06-15	Moderate Risk
Morning Brew Cafe	2016-03-15	Moderate Risk
Mixt Greens	2018-05-29
MONGKOK DIM SUM & RESTAURANT	2016-05-24	High Risk
Mizutani Sushi Bar	2016-06-09
Yerba Buena Tea Co (formerly Tea Smiths of SF)	2016-10-09
Azalina's	2016-06-23	High Risk
Tropisueño	2017-10-25
Little Vietnam Cafe	2016-07-27	Moderate Risk
7-Eleven, Store 2366-21389F	2016-12-17	High Risk
Quickly	2016-02-18	Moderate Risk
Jackson Fillmore Trattoria	2016-09-19	Moderate Risk
Peet's Coffee & Tea	2017-10-27
Prospect	2016-09-16	Moderate Risk
TAWAN'S THAI FOOD	2018-08-02	Moderate Risk
Stanford Court Hotel	2017-10-27
Ninki Sushi Bar & Restaurant	2017-08-24	High Risk
Kate O'Brien's	2017-08-10	Moderate Risk
Elephant Sushi	2016-07-27	High Risk
24th and Folsom Eatery	2015-09-30
Hilton Financial District- Restaurant Seven Fifty	2016-06-10
Sushi Hon	2016-06-16
Roma Pizzeria	2016-08-09	Moderate Risk
Hans Coffee Shop	2016-04-13
Lollipot	2018-05-22	Moderate Risk
Da Cafe	2016-09-07	High Risk
Lollipot	2018-05-22	High Risk
Lollipot	2018-05-22	High Risk
Lollipot	2018-05-22	Moderate Risk
Da Cafe	2016-09-07	High Risk
Da Cafe	2016-09-07	High Risk
Da Cafe	2016-09-07	High Risk
Roxanne Cafe	2016-11-28	Moderate Risk
Roxanne Cafe	2015-09-30	Moderate Risk
Roxanne Cafe	2015-09-30	Moderate Risk
Roxanne Cafe	2018-04-03	Moderate Risk
Roxanne Cafe	2016-11-28	High Risk

NOT IN With NULLs Produces Empty Results

Imagine you get the following interview question.

“The security team flags suspicious customer activity in a security_events table. Retrieve all orders that are not linked to a flagged account.”

This is an eligibility check – a pattern common in payments, fraud prevention, and access control, where a flaglist determines which customers are cleared to transact. NULL handling here isn't just a SQL technicality; it determines whether legitimate customers actually receive what they're entitled to.

Dataset

The dataset consists of the orders and security_events tables.

The Filter That Blocks Everyone

NOT IN sounds like a direct SQL translation of “give me everything in A that isn’t in B.”, so you’d likely write this as a solution.

SELECT *
FROM orders
WHERE customer_id NOT IN (SELECT customer_id FROM security_events);

Run it, and you get zero rows. If you look at the security_events table, you see that the customers with the IDs 2 & 4 are flagged. But why doesn’t the query return orders placed by other customers, such as 1, 3, 5, who obviously placed orders, as you can find them in the orders table?

The reason is that the NULL row in the security_events table poisons all other data. Namely, NOT IN (SELECT customer_id FROM security_events) reads as NOT IN (2, 4, NULL), which expands to customer_id != 2 AND customer_id != 4 AND customer_id != NULL.

We already learned that the condition with NULL doesn’t produce FALSE, but NULL. Since the AND operator is used, every row in WHERE becomes NULL.

Using NOT EXISTS Instead of NOT IN (No, They’re Not the Same!)

Though they sound quite similar, NOT IN and NOT EXISTS are not the same thing, which surfaces when the NULL data is involved.

Let me explain the difference. NOT IN looks at the row value and the question it asks is: “Is this value definitely not in the list?” It can’t answer that definitively, since one of the list values is unknown.

On the other hand, NOT EXISTS asks: “Is there a row where this condition is TRUE?” We know that NULL conditions are never TRUE. As a result, the NULL rows are skipped, and the matching rows with the actual value in the customer_id column appear in the output.

In other words, NOT EXISTS handles NULLs because it checks for the existence of a matching row, rather than comparing values directly.

SELECT *
FROM orders o
WHERE NOT EXISTS (
    SELECT 1 FROM security_events se
    WHERE se.customer_id = o.customer_id
);

Here’s the output.

BETWEEN and NULL

Here’s an interesting interview question I’ll use to show you how BETWEEN behaves with NULLs.

Find Olympic Events Based On Height

EasyID 9938

Find events of any Winter Olympics in which there were athletes of height between 180 to 210 centimeters. Output unique events only.

Go to the Question

There’s a simple requirement: find events at any Winter Olympics in which athletes with heights between 180 and 210 centimeters participated, and output the unique events.

Dataset

One of the filtering columns will be height. Just a quick look at the table shows that many rows in that column are empty. Learning from previous experience, we can safely conclude that those are NULLs.

Table: olympics_athletes_events

id	name	sex	age	team	noc	games	year	season	city	sport	event
3520	Guillermo J. Amparan	M		Mexico	MEX	1924 Summer	1924	Summer	Paris	Athletics	Athletics Men's 800 metres
35394	Henry John Finchett	M		Great Britain	GBR	1924 Summer	1924	Summer	Paris	Gymnastics	Gymnastics Men's Rings
21918	Georg Frederik Ahrensborg Clausen	M	28	Denmark	DEN	1924 Summer	1924	Summer	Paris	Cycling	Cycling Men's Road Race Individual
110345	Marinus Cornelis Dick Sigmond	M	26	Netherlands	NED	1924 Summer	1924	Summer	Paris	Football	Football Men's Football
54193	Thodore Tho Jeitz	M	26	Luxembourg	LUX	1924 Summer	1924	Summer	Paris	Gymnastics	Gymnastics Men's Individual All-Around

Loading Dataset

BETWEEN Filters NULLs Out

The question asks us to filter out the NULL rows, as NULL surely is not “between 180 and 210 centimeters”.

In this case, using BETWEEN is the correct approach. It’s a shorthand for height >= 180 AND height <= 210.

PostgreSQL

Go to the question on the platformTables: olympics_athletes_events

Here’s the output.

event
Alpine Skiing Men's Combined
Alpine Skiing Men's Downhill
Alpine Skiing Men's Super G
Speed Skating Women's 1000 metres
Speed Skating Women's 500 metres

Taking Control Over Excluding NULLs

BETWEEN comes in handy when you want to exclude NULLs automatically, like in the above example.

However, if you want to decide for yourself whether you want to exclude them, include them, or surface them separately, you’d want to ditch BETWEEN and check for NULLs before filtering. Check “The COUNT(*) vs COUNT(column) Distinction” section to see how to do it.

How NULL Affects Aggregates

Aggregate functions in SQL ignore NULL values by design. There’s one important exception that you already saw in action: COUNT(*).

The COUNT(*) vs COUNT(column) Distinction

The main difference is how they treat NULLs. COUNT(*) counts rows no matter their value, which includes those with NULLs. COUNT(column) counts non-NULL values in that column.

Let’s solve this interview question to demonstrate that difference.

Contact Information Completeness

Last Updated: February 2025

EasyID 2169

The data quality team is auditing employee records to assess the completeness of contact information. Calculate and return the ratio of employees who have a NULL phone number.

Go to the Question

We need to calculate the ratio of employees who have a NULL phone number.

Dataset

Table: techcorp_workforce

id	first_name	last_name	department	salary	phone_number	joining_date
1	Sarah	Mitchell	HR	95000	555-0101	2021-03-15
2	Michael	Chen	HR	88000	555-0102	2022-06-01
3	Emily	Rodriguez	HR	82500		2021-09-20
4	David	Park	HR	80000	555-0104	2023-01-10
5	Lisa	Thompson	HR	65000		2021-04-05

Loading Dataset

Counting What Isn’t There

The instinctive approach is to explicitly filter out NULLs and count them.

For example, you could do it by using CASE WHEN to flag the NULLs, then sum them.

SELECT CAST(SUM(CASE WHEN phone_number IS NULL THEN 1 ELSE 0 END) AS DECIMAL) 
       / COUNT(*) AS null_phone_ratio
FROM techcorp_workforce;

Another, even more long-winded approach with two subqueries, is to find all the NULL phone numbers, then count them using COUNT(*).

SELECT CAST(
              (SELECT COUNT(*)
               FROM techcorp_workforce
               WHERE phone_number IS NULL) AS DECIMAL) /
  	 (SELECT COUNT(*)
FROM techcorp_workforce) AS null_phone_ratio;

However, if the second approach demonstrates that you know that COUNT(*) counts NULLs, why not use that knowledge to write a much shorter solution?

The Power of the `COUNT(*)` and `COUNT(column)` Alliance

If COUNT(*) counts all the rows, including NULLs, and COUNT(phone_number) counts only the non-NULL values, then their difference is the number of rows/employees with NULL phone numbers.

PostgreSQL

Go to the question on the platformTables: techcorp_workforce

Here’s the output.

null_phone_ratio
0.31

AVG() and the Denominator Problem

AVG() ignores NULLs in the average calculation; it considers only known values. Whether this is the correct behavior depends on the business logic.

You should always ask: should NULLs be excluded from the calculation, or treated as a specific value (e.g., 0)?

Here’s the question we’ll use to show those two scenarios.

Homework Results

Last Updated: November 2021

MediumID 2075

Given the homework results of a group of students, calculate the average grade and the completion rate of each student. A homework is considered not completed if no grade has been assigned. Output first name of a student, their average grade, and completion rate in percentages. Note that it's possible for several students to have the same first name but their results should still be shown separately.

Go to the Question

The task is to calculate the average grade and the competition rate for each student.

The question explicitly says that a homework is considered not completed if no grade has been assigned.

That’s valuable information for the completion ratio calculation, but also for the average grade: it implies that we’re calculating the average of each student’s completed homework.

Dataset

The first table in the dataset is allstate_homework. It has several NULL rows, which means these are not completed/not submitted homework assignments.

Table: allstate_homework

student_id	homework_id	grade
1	1	7
1	2	6
1	3	6
2	1	8
2	2

Loading Dataset

The second table is allstate_students.

Table: allstate_students

student_id	student_firstname	student_lastname
1	Marie	Butters
2	Joan	Reed
3	Marie	Maynard
4	Hugo	Wright

Loading Dataset

Solution

The code consists of a subquery, which is then used in a quite straightforward calculation.

1. Flagging the completed and non-completed homeworks

In the subquery, use CASE WHEN to flag the homeworks. If the grade is NULL, it’s marked as 0. Otherwise, it’s marked as 1, i.e., the homework is completed.

SELECT *, 
       CASE
           WHEN grade IS NULL THEN 0
           ELSE 1
       END is_completed
   FROM allstate_homework

2. Calculating average grades and completion rates by student

The main query calculates the average, with AVG() excluding not completed homeworks (NULLs) from the calculation.

In the following line, we use SUM() to count the number of completed homeworks (marked as 1 in CASE WHEN), then divide that by the total number of homeworks.

PostgreSQL

Go to the question on the platformTables: allstate_homework, allstate_students

Take a good look at the output. We’ll see how it changes as the business logic changes.

student_firstname	avg_grade	completion_rate
Joan	8	33.33
Marie	6.33	100
Hugo		0
Marie	5	66.67

How the Business Logic Makes the Average “Wrong”

If the question changes to “What is each student’s overall grade accounting for missed work?”, the previous (default) average calculation would be wrong.

The revised question suggests that NULLs should be treated as zero, since they should be included in the average calculation. In other words, use COALESCE() to turn NULLs into zeros.

PostgreSQL

Go to the question on the platformTables: allstate_homework, allstate_students

The average grades are now significantly different for Joan (dropped from 8 to 2.67) and Marie (dropped from 5 to 3.33). Also, Hugo’s average grade is 0, not NULL.

student_firstname	avg_grade	completion_rate
Joan	2.67	33.33
Marie	6.33	100
Hugo	0	0
Marie	3.33	66.67

I’d recommend reading more about SQL COALESCE. The MySQL alternative is IFNULL.

How NULL Breaks JOINS

Joining data that contains NULL rows is another source of frustration due to silent data loss. Different types of SQL joins can cause different kinds of problems with NULLs.

NULLs Never Match in JOIN Conditions

To illustrate this problem, we’ll solve a classic task: outputting a list of all employees alongside their department name.

Dataset

I’ve created an example dataset. The first table is employees. Two employees are new and still haven’t been assigned to a department.

The second table is departments.

Silently Losing Data With JOIN

Unassumingly, you would probably write this code, thinking that it will return all employees and their departments, regardless of whether there are NULLs in the data.

SELECT e.name, 
 d.department_name
FROM employees e
JOIN departments d ON e.department_id = d.department_id;

It won’t, because join conditions use equality comparisons under the hood. Since the comparison including NULLs evaluates to NULL, the two employees – Eva and Grace – don’t match any department, so they’re silently dropped from the output.

That behavior is expected, but it doesn’t answer the question; it asks you to find all employees and their departments.

The output would have been correct if the question asked “show me employees and their departments”; employees without departments genuinely don’t belong here.

Use LEFT JOIN to Preserve NULLs

Since we want to show employees, even those with unknown departments, the correct approach is to use LEFT JOIN.

SELECT e.name, d.department_name
FROM employees e
LEFT JOIN departments d ON e.department_id = d.department_id;

Here’s the output.

Joining NULLs With NULLs

The previous example showed how Eva and Grace are silently dropped from a JOIN because their department_id is NULL.

What if you actually wanted to match them?

Imagine the departments table has been updated; it now includes an Unassigned department “name”. This is a placeholder department, with a NULL in department_id.

If you watched closely, you’d know that recycling the above JOIN code as in the previous example would again return the same result: Eva and Grace are still dropped because NULL = NULL is still NULL.

There are two ways to fix this problem.

Approach #1: IS NOT DISTINCT FROM

This is a NULL-safe equality operator in PostgreSQL. Unlike =, it treats NULL as equal to NULL.

SELECT e.name, 
 d.department_name
FROM employees e
JOIN departments d 
ON e.department_id IS NOT DISTINCT FROM d.department_id;

The output now shows all eight employees, with Eva and Grace now matched to the “Unassigned” row.

Approach #2: COALESCE() to Normalize NULLs Before Joining

An alternative approach is to replace NULLs before comparison with a value that can’t exist as a real value in the table. For example, NULLS in department_id are replaced with -1 by using COALESCE().

SELECT e.name, d.department_name
FROM employees e
JOIN departments d 
ON COALESCE(e.department_id, -1) = COALESCE(d.department_id, -1);

Bear in mind that this can be risky. If -1 ever appears as a real department ID – for whatever reason – this will “unfix” your fix. IS NOT DISTINCT FROM is safer and more explicit.

A common mistake with COALESCE(): COALESCE() in SELECT is for display only, COALESCE() in the ON clause is for matching logic. Don’t mix up those two!

When LEFT JOIN Behaves Like JOIN

The question from Instacart and Amazon is perfect to demonstrate this issue.

No Order Customers

Last Updated: June 2020

MediumID 10142

Identify customers who did not place an order between 2019-02-01 and 2019-03-01.

Include:

• Customers who placed orders only outside this date range. • Customers who never placed any orders.

Output the customers' first names.

Go to the Question

The task is to output customers who didn’t place an order between 2019-02-01 and 2019-03-01. In other words, those customers either placed orders only outside this data range or never placed any orders.

In practice, this is a retention query. Identifying customers who haven't ordered in a given window is the first step in any churn analysis or re-engagement campaign. A silent bug here – dropping customers due to a NULL filter – means your retention list is incomplete before you've even started.

Dataset

Table: customers

id	first_name	last_name	city	phone_number
8	John	Joseph	San Francisco	928-386-8164
7	Jill	Michael	Austin	813-297-0692
4	William	Daniel	Denver	813-368-1200
5	Henry	Jackson	Miami	808-601-7513
13	Emma	Isaac	Miami	808-690-5201

Loading Dataset

Table: orders

id	cust_id	order_date	order_details	total_order_cost
1	3	2019-03-04	Coat	100
2	3	2019-03-01	Shoes	80
3	3	2019-03-07	Skirt	30
4	7	2019-02-01	Coat	25
5	7	2019-03-10	Shoes	80

Loading Dataset

Reckless Filtering Turns LEFT JOIN Into JOIN

Your instinct tells you that, to answer the question, you should LEFT JOIN orders onto customers, all customers get preserved, then filter by date.

PostgreSQL

Go to the question on the platformTables: customers, orders

If you run the checker, you’ll see that your instinct and the solution are, unfortunately, incorrect. It seems several rows went missing somehow; the official solution returns ten rows, while this output has only six.

first_name
Jill
William
Henry
Mia
Mia
Farida

What happened here: For customers with no matching order, the LEFT JOIN fills o.order_date with NULL, so the filtering condition becomes NULL BETWEEN '2019-02-01' AND '2019-03-01'. You already know where this is going: the condition evaluates to NULL, and the WHERE filter fails. As a result, every unmatched customer is dropped – exactly the ones we were trying to find.

You accidentally turned LEFT JOIN into JOIN by filtering on a column from the right-side table.

Keep the Date Filter Out of the Where Clause

There are two ways to fix this.

Approach #1: Filter in the ON Clause

That way, filtering becomes a part of the matching logic, rather than a post-join row filter.

PostgreSQL

Go to the question on the platformTables: customers, orders

Approach #2: Filter Data Before Joining

This is an approach that the official solution uses. The orders table is filtered by order dates before it’s joined with customers. After joining, use filtering to keep only NULL customer IDs.

PostgreSQL

Go to the question on the platformTables: customers, orders

Both approaches result in the same, correct output.

first_name
Frank
Eva
Lili
Mona
Emma
John
Jack
Mark
Liam
Justin

NULL and Data Modeling

Your decisions on how to model NULLs at the schema level have significant downstream effects.

Ideal Approach: NOT NULL Constraints

The most direct way to eliminate NULL-related bugs is to not have NULLs in the data at all.

Wherever possible, prevent them from being stored in the first place by using the NOT NULL.

CREATE TABLE users (
    id          INT          NOT NULL,
    email       VARCHAR(255) NOT NULL,
    created_at  TIMESTAMP    NOT NULL DEFAULT CURRENT_TIMESTAMP,
    phone       VARCHAR(20)  -- nullable: phone is optional
);

When NULL Is Unavoidable

We know that we don’t live in an ideal world. Some columns must be allowed to be NULL. For example, a phone number is often non-mandatory data. Another example is that the delivery date is not recorded until the shipment is actually delivered.

If you can’t avoid NULLS, the solution is then to be clear about what NULL means in the column. That way, you’ll at least avoid the ambiguity, because a NULL column can represent three different things.

So, if you can’t avoid nullable columns, use one of these two approaches.

Approach #1: Documenting

If NULL has one clear meaning, write it in a schema comment or data dictionary. For example, “NULL in delivery_date means the order hasn’t been delivered yet”. Everyone querying that column now knows exactly what they’re dealing with, and can decide how to approach it.

Approach #2: Restructuring With `DEFAULT`

There are situations where NULL can mean different things, depending on the row. For example, delivery_date may be NULL because the delivery hasn’t been delivered yet, but also because it was canceled. The fix is to add a separate status column and assign a default value.

ALTER TABLE orders ADD COLUMN delivery_status VARCHAR(20) NOT NULL DEFAULT 'pending';
-- 'pending', 'delivered', 'cancelled'

Now, when a new row/order is added, delivery_date will still be NULL, but we’ll know this means the order is pending because that’s the default value for delivery_status.

If delivery_date is NULL because the order was canceled, the delivery status would change to reflect that.

How to Reason About Missingness

Best Practices for Working With NULL in SQL

Common NULL Mistakes (Quick Reference)

Conclusion

NULL is not a bug in SQL, but a deliberate design choice for representing the unknown. The bugs come from treating it like a value. It isn’t a value.

The common thread that runs through all of the examples we showed: SQL making a choice about NULL that you didn’t notice, the output looks plausible enough, and you move on.

This article focused on internalizing three-valued logic; you will now know where the traps are and be able to catch them before they reach production.

FAQs

1. What is the difference between `NULL` and an empty string in SQL?

NULL is the absence of a value, as the data is unknown or missing. An empty string ('') is a value; a string with zero characters.

As an example, the NULL phone number means you don’t have a customer’s phone number. An empty string as a phone number means a customer doesn’t have a phone number, so you record that collected piece of information as an empty string.

2. How does `NULL` affect SQL `WHERE` clauses?

Comparisons with NULL always evaluate to NULL – not TRUE or FALSE – so the rows with NULLs will be filtered out.

If you want to find NULL rows explicitly, use IS NULL. If you want to include them alongside other results, add OR column IS NULL.

3. Does `COUNT()` include `NULL` values in SQL?

It depends on which COUNT() form you use. COUNT(*) does include NULLs; it counts every row, regardless of what’s in it.

COUNT(column_name) counts only the non-NULL values in that column.

4. Do SQL aggregate functions ignore `NULL` values?

Yes, they do. The only exception is COUNT(*), which counts every row regardless.

All other aggregate functions – COUNT(column), SUM(), AVG(), MIN(), MAX() – skip NULLs. Bear in mind that, as a consequence, AVG() divides by the count of non-NULL values. This can result in misleading calculations, depending on the business logic.

5. How does `NULL` affect SQL joins?

Rows with NULLs in a join key will never match any other rows, including rows that also have NULL.

The reason is that joins use equality comparisons to join the rows. And in equality comparisons, NULL = NULL results in NULL.

6. Why do `LEFT JOINs` sometimes behave like `INNER JOINs`?

This happens when you filter on a column from the right-side table in the WHERE clause. A LEFT JOIN fills right-side columns with NULL for unmatched rows. If you filter on one of those columns, those NULLs fail the filter and get dropped. Which is exactly the JOIN behavior; it drops the unmatched rows.

7. What is SQL three-valued logic?

Standard Boolean logic says that the condition is either TRUE or FALSE. The three-valued logic adds a third outcome: NULL, as in “unknown”.

WHERE treats it the same as FALSE, meaning the row is excluded. In other words, WHERE returns only rows where the condition is TRUE.

This logic underlies the behavior of the logical and comparison operators whenever a NULL value is involved.

8. What does `IS NULL` do in SQL?

IS NULL evaluates the NULL row as TRUE. For the non-NULL rows, it’s FALSE.

This predicate is the only reliable way to test for NULL. Its counterpart is IS NOT NULL.

9. What does `COALESCE()` do in SQL?

It evaluates each argument left to right and returns the first non-NULL value. This feature is useful for replacing NULLs with zeros, thus not excluding them from the output or calculation.

10. Is `NULL` the same across all SQL databases?

The core semantics are defined by the SQL standard, so NULL behaves consistently across databases.

However, there are some other differences. The first is that the way NULL is displayed varies. Some engines show the word NULL, others show (null), N/A, or a blank cell.

Another important difference is that Oracle treats empty strings as NULL. PostgreSQL, MySQL, and SQL Server don’t do that.

PostgreSQL has a NULL-safe equality operator (IS NOT DISTINCT FROM). Despite being defined by the SQL standard, it’s not supported in older versions of MySQL and SQL Server; they use <=> and a workaround with ISNULL() respectively.

11. When should a column allow `NULL` values?

When the absence of a value is a legitimate state in the data model. This happens in three situations:

not applicable -> the value doesn’t exist for this row
not yet known -> the value will exist eventually
not collected -> the value exists in reality but wasn’t captured

Understanding NULL Semantics in SQL

What NULL Actually Means in SQL

Why = NULL Never Works in SQL

Users Missing Phone Numbers

How to Output the NULL Rows

Understanding NULL With AND / OR Logic

NULL With AND

NULL With OR

NOT With NULL

How NULL Breaks Filters

Inequality Filters Silently Exclude NULLs

Worst Businesses

Dataset

The Empty Cell That Isn’t Empty

How to Tell That NULL Is NULL?

Applying Business Logic

Explicitly Checking for NULLs

Recommended Approach: Safeguarding From Possible NULLs

NOT IN With NULLs Produces Empty Results

Dataset

The Filter That Blocks Everyone

Using NOT EXISTS Instead of NOT IN (No, They’re Not the Same!)

BETWEEN and NULL

Find Olympic Events Based On Height

Dataset

BETWEEN Filters NULLs Out

Taking Control Over Excluding NULLs

How NULL Affects Aggregates

The COUNT(*) vs COUNT(column) Distinction

Contact Information Completeness

Dataset

Counting What Isn’t There

The Power of the COUNT(*) and COUNT(column) Alliance

AVG() and the Denominator Problem

Homework Results

Dataset

Solution

How the Business Logic Makes the Average “Wrong”

How NULL Breaks JOINS

NULLs Never Match in JOIN Conditions

Dataset

Silently Losing Data With JOIN

Use LEFT JOIN to Preserve NULLs

Joining NULLs With NULLs

Approach #1: IS NOT DISTINCT FROM

Approach #2: COALESCE() to Normalize NULLs Before Joining

When LEFT JOIN Behaves Like JOIN

No Order Customers

Dataset

Reckless Filtering Turns LEFT JOIN Into JOIN

Keep the Date Filter Out of the Where Clause

Approach #1: Filter in the ON Clause

Approach #2: Filter Data Before Joining

NULL and Data Modeling

Ideal Approach: NOT NULL Constraints

When NULL Is Unavoidable

Approach #1: Documenting

Approach #2: Restructuring With DEFAULT

How to Reason About Missingness

Best Practices for Working With NULL in SQL

Common NULL Mistakes (Quick Reference)

Conclusion

FAQs

1. What is the difference between NULL and an empty string in SQL?

2. How does NULL affect SQL WHERE clauses?

3. Does COUNT() include NULL values in SQL?

4. Do SQL aggregate functions ignore NULL values?

5. How does NULL affect SQL joins?

6. Why do LEFT JOINs sometimes behave like INNER JOINs?

7. What is SQL three-valued logic?

8. What does IS NULL do in SQL?

9. What does COALESCE() do in SQL?

10. Is NULL the same across all SQL databases?

11. When should a column allow NULL values?

Explicitly Checking for `NULLs`

Recommended Approach: Safeguarding From Possible `NULLs`

The Power of the `COUNT(*)` and `COUNT(column)` Alliance

Approach #2: Restructuring With `DEFAULT`

1. What is the difference between `NULL` and an empty string in SQL?

2. How does `NULL` affect SQL `WHERE` clauses?

3. Does `COUNT()` include `NULL` values in SQL?

4. Do SQL aggregate functions ignore `NULL` values?

5. How does `NULL` affect SQL joins?

6. Why do `LEFT JOINs` sometimes behave like `INNER JOINs`?

8. What does `IS NULL` do in SQL?

9. What does `COALESCE()` do in SQL?

10. Is `NULL` the same across all SQL databases?

11. When should a column allow `NULL` values?