332 CHAPTER 17: THE SELECT STATEMENT
SELECT F1.fruit, F1.score,
FROM Foobar AS F1
ORDER BY F1.score DESC;
One outcome might have been any of these:
Result #1
('Apples', 5)
('Oranges', 5)
('Apples', 2)
('Banana', 2)
('Apples', 1)
Result #2
('Oranges', 5)
('Apples', 5)
('Apples', 2)
('Banana', 2)
('Apples', 1)
Result #3
('Oranges', 5)
('Apples', 5)
('Banana', 2)
('Apples', 2)
('Apples', 1)
Result #4
('Apples', 5)
('Oranges', 5)
('Banana', 2)
('Apples', 2)
('Apples', 1)
If you use:
SELECT F1.fruit, F1.score,
FROM Foobar AS F1
ORDER BY F1.score DESC, F1.fruit ASC;
Result
('Apples', 5)
('Oranges', 5)
17.1 SELECT and JOINs 333
('Apples', 2)
('Banana', 2)
('Apples', 1)
But this is not what we wanted—the order within fruits has been
destroyed. Likewise:
SELECT F1.fruit, F1.score
FROM Foobar AS F1
ORDER BY F1.fruit ASC, F1.score DESC;
Results
('Apples', 5)
('Apples', 2)
('Apples', 1)
('Banana', 2)
('Oranges', 5)
But this is still not what we wanted—the order within scores has been
destroyed. We need a dummy column to preserve the ordering, thus:
SELECT F1.fruit, F1.score,
(SELECT MAX(score) FROM Foobar AS F2 WHERE F1.fruit =
F2.fruit)
AS score_preserver
FROM Foobar AS F1
ORDER BY score_preserver DESC, F1.fruit ASC, F1.score DESC;
Cursors include an <updatability clause>, which tells you if
the cursor is
FOR READ ONLY or for UPDATE [OF <column name
list>], but this clause in optional. If ORDER BY is specified, or if the
result table is a read-only table, then the
<updatability clause>
defaults to
FOR READ ONLY.
The ORDER BY and CASE Expressions
SQL-99 allows you to use a function in an
ORDER BY clause. While it is
now legal, it is still not a good programming practice. Users should see
the fields that are used for the sort, so they can use them to read and
locate lines of data in reports. The sorting values are usually on the left
side of each line, since we read left to right. The most portable method is
to use a
CASE expression that takes an external parameter of the form:
334 CHAPTER 17: THE SELECT STATEMENT
SELECT first_name, last_name, dept,
CASE :flag
WHEN 'f' THEN first_name
WHEN 'l' THEN last_name
WHEN 'd' THEN dept
ELSE NULL END AS sort_col
FROM Personnel
ORDER BY sort_col;
Obviously, the expression in the THEN clauses must either be of the
same data type or be cast into the same data type. Controlling the
direction of the sort is a little trickier and requires two columns, one of
which is always set to all
NULLs.
SELECT last_name,
CASE :flag
WHEN 'la' THEN last_name ELSE NULL END AS sort_col1,
CASE :flag
WHEN 'ld' THEN last_name ELSE NULL END AS sort_col2
FROM Personnel
ORDER BY sort_col1, sort_col2 DESC;
You can get a bit fancy with this basic idea:
SELECT
CASE :flag_1
WHEN 'a' THEN CAST (a AS CHAR(n))
WHEN 'b' THEN CAST (b AS CHAR(n))
WHEN 'c' THEN CAST (c AS CHAR(n))
ELSE NULL END AS sort_1,
CASE :flag_2
WHEN 'x' THEN CAST (x AS CHAR(n))
WHEN 'y' THEN CAST (y AS CHAR(n))
WHEN 'z' THEN CAST (z AS CHAR(n))
ELSE NULL END AS sort_2,
CASE :flag_n
WHEN 'n1' THEN CAST (n1 AS CHAR(n))
WHEN 'n2' THEN CAST (n2 AS CHAR(n))
WHEN 'n3' THEN CAST (n3 AS CHAR(n))
ELSE NULL END AS sort_2,
17.1 SELECT and JOINs 335
FROM MyTable
WHERE
ORDER BY sort_1, sort_2, ;
If you have more than one sort column and only a limited set of
combinations, use concatenation.
CASE :flag_1
WHEN 'ab'
THEN CAST(a AS CHAR(n)) ||' ' || CAST(b AS CHAR(n))
WHEN 'ba'
THEN CAST(b AS CHAR(n)) ||' ' || CAST(a AS CHAR(n))
ELSE NULL END AS sort_1,
If you need ASC and DESC options, then use a combination of CASE
and
ORDER BY:
CASE :flag_1
WHEN :flag_1 = 'a' AND :flag_1_ad = 'ASC'
THEN CAST (a AS CHAR(n))
WHEN :flag_1 = 'b' AND :flag_1_ad = 'ASC'
THEN CAST (b AS CHAR(n))
WHEN :flag_1 = 'c' AND :flag_1_ad = 'ASC'
THEN CAST (c AS CHAR(n))
ELSE NULL END AS sort_1_a,
CASE :flag_1
WHEN :flag_1 = 'a' AND :flag_1_ad = 'DESC'
THEN CAST (a AS CHAR(n))
WHEN :flag_1 = 'b' AND :flag_1_ad = 'DESC'
THEN CAST (b AS CHAR(n))
WHEN :flag_1 = 'c' AND :flag_1_ad = 'DESC'
THEN CAST (c AS CHAR(n))
ELSE NULL END AS sort_1_d
ORDER BY sort_1_a ASC, sort_1_d DESC
I have shown explicit CAST(<exp> AS CHAR(n)), but if the data
types of the
THEN clause expressions were already the same, there would
be no reason to force the conversions. You change the
ELSE NULL
clause to any constant of the appropriate data type, but it should be
336 CHAPTER 17: THE SELECT STATEMENT
something useful to the reader. A neater way of doing this is to use one
column for each sorting option.
SELECT MyTable.* ,
CASE WHEN :flag = 'a' THEN a ELSE NULL END AS sort1,
CASE WHEN :flag = 'b' THEN b ELSE NULL END AS sort2,
CASE WHEN :flag = 'c' THEN c ELSE NULL END AS sort3
FROM Personnel
WHERE
ORDER BY sort1, sort2, sort3;
This code is easy to read, and you do not have worry about CAST()
operations. The trade-off is a larger result set being sent to the cursor.
17.2 OUTER JOINs
OUTER JOINs used to be done with proprietary vendor syntax. Today,
the use of the Standard
OUTER JOIN is universal. An OUTER JOIN is a
JOIN that preserves all the rows in one or both tables, even when they do
not have matching rows in the second table. Let’s take a real-world
situation: I have a table of orders and a table of suppliers that I wish to
JOIN for a report to tell us how much business we did with each supplier.
With an inner join, the query would be this:
SELECT Suppliers.sup_id, sup_name, order_nbr, order_amt
FROM Suppliers, Orders
WHERE Suppliers.sup_id = Orders.sup_id;
Some suppliers’ totals include credits for returned merchandise, and
our total business with them works out to zero dollars. Other suppliers
never got an order from us at all, so we did zero dollars’ worth of
business with them, too. But the first case will show up in the query
result and be passed on to the report, whereas the second case will
disappear in the
INNER JOIN.
If we had used an
OUTER JOIN, preserving the Suppliers table, we
would have all the suppliers in the results. When a supplier with no
orders was found in the Orders table, the order_nbr and order_amt
columns would be given a
NULL value in the result row.
17.2 OUTER JOINs 337
17.2.1 Syntax for OUTER JOINs
In the old SQL-89 standard, there was no OUTER JOIN syntax, so you
had to construct it by hand with a messy
UNION in products such as
earlier versions of DB2 from IBM:
SELECT sup_id, sup_name, order_amt regular INNER JOIN
FROM Suppliers, Orders
WHERE Suppliers.sup_id = Orders.sup_id
UNION ALL
SELECT sup_id, sup_name, CAST(NULL AS INTEGER) preserved rows
of LEFT JOIN
FROM Suppliers
WHERE NOT EXISTS
(SELECT *
FROM Orders
WHERE Suppliers.sup_id = Orders.sup_id);
You have to use a NULL with the correct data type to make the UNION
work, hence the
CAST() functions. Some products are smart enough
that just
NULL by itself will be given the correct data type, but this
version is portable and safer.
The other alternative is to insert a constant of some sort to give a
more meaningful result. This is easy in the case of a
CHARACTER column,
where a message like
'{{NONE}}' can be quickly understood. It is
much harder in the case of a numeric column, where we could have a
balance with a supplier that is positive, zero, or negative because of
returns and credits. There really is a difference between a vendor we did
not use, and a vendor whose returns canceled out its orders.
The most common vendor extensions were for the
LEFT OUTER
JOIN. These extensions were all different in syntax, or semantics, or
both. Today, they are all gone and replaced by the Standard SQL syntax.
In the second edition of this book, I described the proprietary extensions
in detail; since they are mercifully gone, I am not going to tell you about
them in this edition.
The name
LEFT OUTER JOIN comes from the fact that the preserved
table is on the left side of the operator. Likewise, a
RIGHT OUTER JOIN
would have the preserved table on the right-hand side, and a
FULL
OUTER JOIN would preserve both tables.
Here is how
OUTER JOINs work in Standard SQL. Assume you are
given:
338 CHAPTER 17: THE SELECT STATEMENT
Table1 Table2
a b a c
====== ======
1 w 1 r
2 x 2 s
3 y 3 t
4 z
and the OUTER JOIN expression:
Table1
LEFT OUTER JOIN
Table2
ON Table1.a = Table2.a <== JOIN condition
AND Table2.c = 't'; <== single table condition
We call Table1 the “preserved table” and Table2 the “unpreserved
table” in the query. What I am going to give you is a little different, but
equivalent to the ANSI/ISO standards.
1. Build the
CROSS JOIN of the two tables. Scan each row in the
result set.
2. If the predicate tests
TRUE for that row, keep it. You also
remove all rows derived from it from the
CROSS JOIN.
3. If the predicate tests
FALSE or UNKNOWN for that row, then
keep the columns from the preserved table, convert all the
columns from the unpreserved table to
NULLs, and remove the
duplicates. Let us execute this by hand:
Let @ = passed the first predicate {{typesetter: pick better
tokens, such as bullets}}
Let * = passed the second predicate
Table1 CROSS JOIN Table2
a b a c
=========================
1 w 1 r @
1 w 2 s
1 w 3 t *
2 x 1 r
17.2 OUTER JOINs 339
2 x 2 s @
2 x 3 t *
3 y 1 r
3 y 2 s
3 y 3 t @* <== the TRUE set
4 z 1 r
4 z 2 s
4 z 3 t *
Table1 LEFT OUTER JOIN Table2
a b a c
=========================
3 y 3 t <= only TRUE row
1 w NULL NULL Sets of duplicates
1 w NULL NULL
1 w NULL NULL
2 x NULL NULL
2 x NULL NULL
2 x NULL NULL
3 y NULL NULL <== derived from the TRUE set - Remove
3 y NULL NULL
4 z NULL NULL
4 z NULL NULL
4 z NULL NULL
The final results are:
Table1 LEFT OUTER JOIN Table2
a b a c
=========================
1 w NULL NULL
2 x NULL NULL
3 y 3 t
4 z NULL NULL
The basic rule is that every row in the preserved table is represented
in the results in at least one result row.
340 CHAPTER 17: THE SELECT STATEMENT
There is a myth among ACCESS programmers that the ON clause can
contain only a
JOIN condition, and the WHERE can contain only search
conditions. This is not true, and the differences in the position of the
predicates are important.
Consider the two famous Chris Date tables.
Suppliers SupParts
sup_id sup_id part_nbr qty
========= =================
S1 S1 P1 100
S2 S1 P2 250
S3 S2 P1 100
S2 P2 250
If you write the OUTER JOIN with only the join predicate in the ON
clause, like this:
SELECT Suppliers.sup_id, SupParts.part_nbr, SupParts.qty
FROM Suppliers
LEFT OUTER JOIN
SupParts
ON Supplier.sup_id = SupParts.sup_id
WHERE qty < 200;
You get:
sup_id part_nbr qty
===================
'S1' 'P1' 100
'S2' 'P1' 100
But if you put the search predicate in the ON clause:
SELECT Suppliers.sup_id, SupParts.part_nbr, SupParts.qty
FROM Suppliers
LEFT OUTER JOIN
SupParts
ON Supplier.sup_id = SupParts.sup_id
AND qty < 200;
17.2 OUTER JOINs 341
You get:
sup_id part_nbr qty
===================
'S1' 'P1' 100
'S2' 'P1' 100
'S3' NULL NULL
Another problem is that, in general, the order of execution matters
with a chain of
OUTER JOINs. That is to say, ((T1 OUTER JOIN T2)
OUTER JOIN T3) does not produce the same results as (T1 OUTER
JOIN (T2 OUTER JOIN T3)).
I can use any of the options in the
ON clause of an outer join:
SELECT S1.sup_id, S1.sup_name, O1.order_nbr, O1.order_amt
FROM Suppliers AS S1 LEFT OUTER JOIN Orders AS O1
ON S1.sup_id = O1.sup_id;
or:
SELECT S1.sup_id, S1.sup_name, O1.order_nbr, O1.order_amt
FROM Suppliers AS S1 LEFT OUTER JOIN Orders AS O1
USING (S1.sup_id);
or:
SELECT S1.sup_id, S1.sup_name, O1.order_nbr, O1.order_amt
FROM Suppliers AS S1 NATURAL LEFT OUTER JOIN Orders AS O1;
A SELECT expression that returns a single row with a single value
can be used where a scalar expression can be used. If the result of the
scalar query is empty, it is converted to a
NULL. This will sometimes,
but not always, let you write an
OUTER JOIN as a query within the
SELECT clause; thus, this query will work only if each supplier has one
or zero orders:
SELECT sup_id, sup_name, order_nbr,
(SELECT order_amt
FROM Orders
WHERE Suppliers.sup_id = Orders.sup_id)
AS order_amt
FROM Suppliers;