Difference Between On Clause And Where Clause When Used With Left Join in SQL Server



Introduction

The purpose of this article is to show the difference between the On clause and Where clause when used with a left join in SQL Server.

Background


I have been working with SQL Server databases for more than 5 years and I was not aware of the difference between the On clause and Where clause when used with left join. When I asked our DBA about this he said that there is a difference but he was not be able to explain the difference. Then I started exploring this topic myself and got some interesting observations of these two clauses when used with left join. So today I decided to share my knowledge so that everyone could benefit from my work.
Using the code

Today I will explain the difference between the On clause and Where clause when used with left join in SQL Server. When the On clause is used in an outer join, the On clause will have no effect on the outer table and all rows from the outer table will be returned and the On clause determines which rows of the subordinate table joins to the outer table. Rows of the outer table that do not meet the condition specified in the On clause in the join are extended with null values for subordinate columns (columns of the subordinate table), whereas the Where clause filters the rows that actually were returned to the final output. It's difficult to understand this from the above definition, so let's try to understand this difference with an example. Suppose we have two tables Departments (deptId, deptName) and Employees (empID, DeptID, EmpName, Salary) and deptiD is the foreign key of the department table. An employee can have only one department whereas a department can have many employees.

We have this sample data in the Department table:

DeptId DeptName
1 HR
2 Payroll
3 Admin
4 Marketing
5 HR & Accounts

And here is the sample data for the Employees table:

EmpId EmpName DeptId EmpSalary
1 John 1 5000.00
2 Albert 1 4500.00
3 Crain 2 6000.00
4 Micheal 2 5000.00
5 David NULL 34.00
6 Kelly NULL 457.00
7 Rudy 1 879.00
8 Smith 2 7878.00
9 Karsen 5 878.00
10 Stringer 5 345.00
11 Cherly NULL NULL

Case 1

Suppose we are asked to show all the employees and their relevant departments, then we would write a query like this:

select * from employees e 
<p>left join departments d on e.deptid = d.deptid

and the result of this query is:

EmpId EmpName DeptId EmpSalary DeptId DeptName
1 John 1 5000.00 1 HR
2 Albert 1 4500.00 1 HR
3 Crain 2 6000.00 2 Payroll
4 Michael 2 5000.00 2 Payroll
5 David NULL 34.00 NULL NULL
6 Kelly NULL 457.00 NULL NULL
7 Rudy 1 879.00 1 HR
8 Smith 2 7878.00 2 Payroll
9 Karsen 5 878.00 5 HR & Accounts
10 Stringer 5 345.00 5 HR & Accounts
11 Cheryl NULL NULL NULL NULL

Here you can see that all rows from the Employees table are listed along with their department, if they have a department; otherwise, deptid and deptname are null.

Case 2


Let's say we are asked to show the list of all employees and the department name of only those employees who have "HR" or "HR & Accounts" department; then we would write a query like this:


select
* from employees e
left
join
departments d on e.deptid = d.deptid
and
( d.deptname = 'HR' or d.deptname = 'HR & Accounts')


and the result of this query is

EmpId EmpName DeptId EmpSalary DeptId DeptName
1 John 1 5000.00 1 HR
2 Albert 1 4500.00 1 HR
3 Crain 2 6000.00 NULL NULL
4 Michael 2 5000.00 NULL NULL
5 David NULL 34.00 NULL NULL
6 Kelly NULL 457.00 NULL NULL
7 Rudy 1 879.00 1 HR
8 Smith 2 7878.00 NULL NULL
9 Karsen 5 878.00 5 HR & Accounts
10 Stringer 5 345.00 5 HR & Accounts
11 Cheryl NULL NULL NULL NULL

You can notice here that only the same number of rows are returned as in query 1, but here the department of only those employees are listed whose department name is "HR" or "HR & Accounts". As you can see, "Crain"," Micheal", and "Smith" have department "Payroll" (see table 1), but it's not listed in the above result set because here we applied the filter in the "On clause"; hence only those rows become part of the join that has department name "HR" or "HR & Employee", and as a result, all other employees who don't have department name "HR" and "HR & Accounts" show their department name as null. This is an example of the On clause in a left join.

Case 3

What if we are asked to show only those employees who have their department name "HR" or "HR & Accounts"? We would write a query like this:

select * from employees e
left join
departments d on e.deptid = d.deptid
where
( d.deptname = 'HR' or d.deptname = 'HR & Accounts')


and the result of this query is:


EmpId EmpName DeptId EmpSalary DeptId DeptName
1 John 1 5000.00 1 HR
2 Albert 1 4500.00 1 HR
3 Rudy 1 879.00 1 HR
4 Karsen 5 878.00 5 HR & Accounts
5 Stringer 5 345.00 5 HR & Accounts

You can see here that only those employees are listed who have their department "HR" or "HR & Accounts", so what we did here is the same query as query 1. You can check the result of query 1 in table 1, and then we just applied a filter in the Where clause to filter the rows and to return only those employees who have their department name as "HR" or "HR & Accounts". This is an example of Where clause in left join.

Now you have seen all three cases and can see how these differ from each other. This was an example of a (1:1) one to one table as one employee can have only one department. Let's execute the above 3 cases with (1:M), a one to many table relationship where one department can have multiple employees (or more than one employee can be in one department).


Case 4

Let's suppose our requirement is to get all departments and the employees listed in these departments, then we would write a query like: 

select * from departments d
left
join
employees e on e.deptId = d.deptId


It will return all rows from the Department table and only those rows from the Employees table that have a department.

The result will look like:

DeptId DeptName EmpId EmpName DeptId EmpSalary
1 HR 1 John 1 5000.00
1 HR 2 Albert 1 4500.00
1 HR 7 Rudy 1 879.00
2 Payroll 3 Crain 2 6000.00
2 Payroll 4 Michael 2 5000.00
2 Payroll 8 Smith 2 7878.00
3 Admin NULL NULL NULL NULL
4 Marketing NULL NULL NULL NULL
5 HR & Accounts 9 Karsen 5 878.00
5 HR & Accounts 10 Stringer 5 345.00

You can see in the result that all departments are listed, even those that don't have any employees, as "Admin" and "Marketing".

Case 5

But what if we are asked to show all departments and only those employees who have department name "HR or "HR & Accounting". Then we would write a query like:

select * from departments d
left
join
employees e on e.deptId = d.deptId
and
( d.deptname = 'HR' or d.deptname = 'HR & Accounts')


It will show the following records:

DeptId DeptName EmpId EmpName DeptId EmpSalary
1 HR 1 John 1 5000.00
1 HR 2 Albert 1 4500.00
1 HR 7 Rudy 1 879.00
2 Payroll NULL NULL NULL NULL
3 admin NULL NULL NULL NULL
4 Marketing NULL NULL NULL NULL
5 HR & Accounts 9 Karsen 5 878.00
5 HR & Accounts 10 Stringer 5 345.00

You can see in the result, this is the On clause in a left join; it just returns all rows from the Department table and only those rows from the Employee table that have department "HR" or "HR & Accounts". We have employees in the Payroll department (Table 4 as a result of query 4) but it's not listed in the above result set as we just put an extra condition in the "On clause" that the department name should be "HR" or "HR & Accounts". That's why only those rows were part of this join which have department name "HR" and "HR & Accounts". Hence all rows from the Department table were returned and only those matching rows were returned from the Employees table who have their department as "HR" or "HR & Accounts".

Case 6

What if we are asked to show only the "HR" and "HR & Accounts" departments along with their relevant employees? We would then write a query like this:

select * from departments d
left
join
employees e on e.deptId = d.deptId
where
( d.deptname = 'HR' or d.deptname  = 'HR & Accounts')


and the result of the above query would be:

DeptId DeptName EmpId EmpName DeptId EmpSalary
1 HR 1 John 1 5000.00
1 HR 2 Albert 1 4500.00
1 HR 7 Rudy 1 879.00
5 HR & Accounts 9 Karsen 5 878.00
5 HR & Accounts 10 Stringer 5 345.00

So what happened here? We just did a left join as we did in query 4, and you can see the result in table 4. And then we apply a filter on the result set to return only those rows that have deptNmae as "HR" or "HR&Accounts". This is a Where clause in left join. Now you can see how it is different from the result of query 5 where we put this condition in the "On clause".

There is no difference between the Where clause and On clause when used with inner join.

I have shown you six different cases to explain the concept. I hope you enjoyed reading the article. I have attached a script that can be used to create the tables and sample data and then you can practice all by yourself. I would be waiting for your feedback.

References

* http://publib.boulder.ibm.com/infocenter/idshelp/v10/index.jsp?topic=/com.ibm.sqls.doc/sqls678.htm

Up Next
    Ebook Download
    View all
    Learn
    View all