In the previous post, I introduced the SELECT statement and demonstrated how to use it to pull data from the database. Now that we are comfortable with the basic usage, let’s build on our knowledge by adding a WHERE clause. If you are unfamiliar with boolean algebra, it is strongly recommended that you read this tutorial before continuing.
A WHERE clause is used to filter data. It helps to control which records are returned when the query is ran. Let’s take a look at a query.
SELECT * FROM Employees WHERE EmployeeID = '1'
This query will return all records where the column matches the constant provided. Since there is only one row that has 1 as the value for the EmployeeID column there is only one record returned.
What if we try something where multiple records would match? Let’s run this query.
SELECT * FROM Employees WHERE Title = 'Sales Representative'
Now we get 6 records back. What if we wanted to narrow down our query even further? For example, what if we wanted to find all of the Sales Reps that lived in a particular city?
Chaining Filters With AND
Multiple filters can be applied to a single SELECT statement by linking them together using the AND keyword as shown:
SELECT EmployeeID, LastName, FirstName, Title, City FROM Employees WHERE Title = 'Sales Representative' AND City = 'London'
As you can see, now were are only pulling London based Sales Reps. What if we wanted to do the opposite and pull all of the Sales Reps that do not work in London?
Reversing Filters with Negation
We can use the ‘not equal’ symbol <> to filter records that match the search string. Take a look at this query:
SELECT EmployeeID, LastName, FirstName, Title, City FROM Employees WHERE Title = 'Sales Representative' AND City <> 'London'
This is the same as our last query except for the City filter. You will notice that instead of using = we use <> to specify that we should not pull records that match the string. If we take a look at the result set we can see this in action.
What if we wanted to pull all of our Sales Reps that live in either Seattle or London? A common beginners mistake would be to try something like this:
SELECT EmployeeID, LastName, FirstName, Title, City FROM Employees WHERE Title = 'Sales Representative' AND City = 'London' AND City = 'Seattle'
But if you were to run this code you would get 0 records back. What gives? The reason that you wouldn’t get the expected results has to do with the way that the logic works for the WHERE clause. If you have no experience with boolean algebra then I strongly suggest that you read my posts covering the subject here. When you are ANDing filters together you can imagine the filter looking at each row and then applying the filters one by one to get a bunch of boolean (true or false) values linked together with the keywords that are separating the filters (in this case, AND). Once the boolean algebra is worked out, the final value of the equation determines weather the row will be included or not. Let’s take a look at an example. I have the record for EmployeeID 1 here side by side with the WHERE clause.
WHERE Title = ‘Sales Representative’ AND City = ‘London’ AND City = ‘Seattle’
Title = ‘Sales Representative’ matches so it gets a ‘true’ value, City = ‘London’ does not match so it gets a ‘false’ value, and City = ‘Seattle’ matches so it gets a ‘true’ value. Let’s look at the boolean equation that this produces when chained with our logical keywords.
true AND false AND true
So this will evaluate to false and the record will not be pulled into the result set. In fact, whenever you have two different values for the same column ANDed together you will end up with a negation (a statement that always evaluates to false). We will need a new way to filter to query this data.
Chaining Filters with OR
In order to create a query that will give us all users that live in either London or Seattle we need to use the OR filter.
SELECT EmployeeID, LastName, FirstName, Title, City FROM Employees WHERE Title = 'Sales Representative' AND (City = 'London' OR City = 'Seattle')
Notice that I have added parentheses to the WHERE clause. When combining ands and ors it is important to do this. Without the parentheses, SQL will try to read the where clause left to right and you will end up with an extra record. Here is how the where clause will evaluate logically without the parentheses:
WHERE Title = ('Sales Representative' AND City = 'London') OR City = 'Seattle'
Now that we have seen the major operators, let’s take a look at some ways that we can apply some more fine grained filters.
Filtering with IN, LIKE, and BETWEEN
For these next few operators, let’s try looking at the Products table.
SELECT * FROM Products
In this table there are fewer columns but much more rows. If you take a look at the data in this table you will see it is of all the products that this company carries as well as information about the products. Let’s pretend that you are the manager of this business and you need to check which of your products are out of stock or almost out of stock that cost between 10 and 40 dollars. On top of this, you are not interested in any products that have been discontinued.
When designing this query, the first thing that we should think about is what columns we need. We need to know what the product name is, we will probably want to know the price of the product for when we are planning our purchase, and we will want to know how many products we have in stock. Lets start with just the SELECT statement:
SELECT ProductName, UnitPrice, UnitsInStock FROM Products
You may notice that I am not including the Discontinued column. Although we do not want to see discontinued products, we also do not necessarily need to see the discontinued field for every record. In this situation it would be best to filter on the Discontinued field but ignore it in the SELECT statement. Let’s add that filter.
SELECT ProductName, UnitPrice, UnitsInStock FROM Products WHERE Discontinued = '0'
Great, now let’s move on to finding the products that cost between 10 and 40 dollars. There are two ways to do this. We can either filter using arithmetic operators or using the BETWEEN keyword. I will show both approaches below.
SELECT ProductName, UnitPrice, UnitsInStock FROM Products WHERE Discontinued = '0' AND UnitPrice >= 10 AND UnitPrice <= 40
SELECT ProductName, UnitPrice, UnitsInStock FROM Products WHERE Discontinued = '0' AND UnitPrice BETWEEN 10 AND 40
It does not matter which way you write this query, as each way will give the same result. I prefer the BETWEEN operator so I will be using it moving forward. Now let’s take care of the final condition. We want to show all products that are either out of stock or nearly out of stock. Let’s consider anything with less than five items in stock to be almost gone.
SELECT ProductName, UnitPrice, UnitsInStock FROM Products WHERE Discontinued = '0' AND UnitPrice BETWEEN 10 AND 40 AND UnitsInStock < 5
This query now meets all of the conditions that we initially laid out and pulls the data that we wanted.
What if we wanted only a few specific products to show up? We could write something like this:
SELECT ProductName, UnitPrice, UnitsInStock FROM Products WHERE ProductName = 'Chai' OR ProductName = 'Tofu' OR ProductName = 'Mascarpone Fabioli'
This would totally work but there is a shorter way to write it using the IN filter:
SELECT ProductName, UnitPrice, UnitsInStock FROM Products WHERE ProductName IN ('Chai','Tofu','Mascarpone Fabioli')
If you run these queries you can see that the results are the same.
In the next post I will cover aggregate functions and the GROUP BY clause.