The GROUP BY Statement

The GROUP BY statement is used in conjunction with the aggregate functions to group the result-set by one or more columns.

SQL GROUP BY Syntax

SELECT column_name, aggregate_function(column_name)
FROM table_name
WHERE column_name operator value
GROUP BY column_name

 

SQL GROUP BY Example

We have the following “Orders” table:

O_Id OrderDate OrderPrice Customer
1 2008/11/12 1000 Hansen
2 2008/10/23 1600 Nilsen
3 2008/09/02 700 Hansen
4 2008/09/03 300 Hansen
5 2008/08/30 2000 Jensen
6 2008/10/04 100 Nilsen

Now we want to find the total sum (total order) of each customer.

We will have to use the GROUP BY statement to group the customers.

We use the following SQL statement:

SELECT Customer,SUM(OrderPrice) FROM Orders
GROUP BY Customer

The result-set will look like this:

Customer SUM(OrderPrice)
Hansen 2000
Nilsen 1700
Jensen 2000

Nice! Isn’t it? ūüôā

Let’s see what happens if we omit the GROUP BY statement:

SELECT Customer,SUM(OrderPrice) FROM Orders

The result-set will look like this:

Customer SUM(OrderPrice)
Hansen 5700
Nilsen 5700
Hansen 5700
Hansen 5700
Jensen 5700
Nilsen 5700

The result-set above is not what we wanted.

Explanation of why the above SELECT statement cannot be used:¬†The SELECT statement above has two columns specified (Customer and SUM(OrderPrice). The “SUM(OrderPrice)” returns a single value (that is the total sum of the “OrderPrice” column), while “Customer” returns 6 values (one value for each row in the “Orders” table). This will therefore not give us the correct result. However, you have seen that the GROUP BY statement solves this problem.

GROUP BY More Than One Column

We can also use the GROUP BY statement on more than one column, like this:

SELECT Customer,OrderDate,SUM(OrderPrice) FROM Orders
GROUP BY Customer,OrderDate

The HAVING Clause

The HAVING clause was added to SQL because the WHERE keyword could not be used with aggregate functions.

SQL HAVING Syntax

SELECT column_name, aggregate_function(column_name)
FROM table_name
WHERE column_name operator value
GROUP BY column_name
HAVING aggregate_function(column_name) operator value

 

SQL HAVING Example

We have the following “Orders” table:

O_Id OrderDate OrderPrice Customer
1 2008/11/12 1000 Hansen
2 2008/10/23 1600 Nilsen
3 2008/09/02 700 Hansen
4 2008/09/03 300 Hansen
5 2008/08/30 2000 Jensen
6 2008/10/04 100 Nilsen

Now we want to find if any of the customers have a total order of less than 2000.

We use the following SQL statement:

SELECT Customer,SUM(OrderPrice) FROM Orders
GROUP BY Customer
HAVING SUM(OrderPrice)<2000

The result-set will look like this:

Customer SUM(OrderPrice)
Nilsen 1700

Now we want to find if the customers “Hansen” or “Jensen” have a total order of more than 1500.

We add an ordinary WHERE clause to the SQL statement:

SELECT Customer,SUM(OrderPrice) FROM Orders
WHERE Customer=’Hansen’ OR Customer=’Jensen’
GROUP BY Customer
HAVING SUM(OrderPrice)>1500

The result-set will look like this:

Customer SUM(OrderPrice)
Hansen 2000
Jensen 2000

A view is a virtual table.

This chapter shows how to create, update, and delete a view.

 

SQL CREATE VIEW Statement

In SQL, a view is a virtual table based on the result-set of an SQL statement.

A view contains rows and columns, just like a real table. The fields in a view are fields from one or more real tables in the database.

You can add SQL functions, WHERE, and JOIN statements to a view and present the data as if the data were coming from one single table.

SQL CREATE VIEW Syntax

CREATE VIEW view_name AS
SELECT column_name(s)
FROM table_name
WHERE condition

Note:¬†A view always shows up-to-date data! The database engine recreates the data, using the view’s SQL statement, every time a user queries a view.

 

SQL CREATE VIEW Examples

If you have the Northwind database you can see that it has several views installed by default.

The view “Current Product List” lists all active products (products that are not discontinued) from the “Products” table. The view is created with the following SQL:

CREATE VIEW
[Current Product List] AS
SELECT ProductID,ProductName
FROM Products
WHERE Discontinued=No

We can query the view above as follows:

SELECT * FROM [Current Product List]

Another view in the Northwind sample database selects every product in the “Products” table with a unit price higher than the average unit price:

CREATE VIEW [Products Above Average Price] AS
SELECT ProductName,UnitPrice
FROM Products
WHERE UnitPrice>(SELECT AVG(UnitPrice) FROM Products)

We can query the view above as follows:

SELECT * FROM [Products Above Average Price]

Another view in the Northwind database calculates the total sale for each category in 1997. Note that this view selects its data from another view called “Product Sales for 1997”:

CREATE VIEW [Category Sales For 1997] AS
SELECT DISTINCT CategoryName,Sum(ProductSales) AS CategorySales
FROM [Product Sales for 1997]
GROUP BY CategoryName

We can query the view above as follows:

SELECT * FROM [Category Sales For 1997]

We can also add a condition to the query. Now we want to see the total sale only for the category “Beverages”:

SELECT * FROM [Category Sales For 1997]
WHERE CategoryName=’Beverages’

 

SQL Updating a View

You can update a view by using the following syntax:

SQL CREATE OR REPLACE VIEW Syntax

CREATE OR REPLACE VIEW view_name AS
SELECT column_name(s)
FROM table_name
WHERE condition

Now we want to add the “Category” column to the “Current Product List” view. We will update the view with the following SQL:

CREATE VIEW [Current Product List] AS
SELECT ProductID,ProductName,Category
FROM Products
WHERE Discontinued=No

 

SQL Dropping a View

You can delete a view with the DROP VIEW command.

SQL DROP VIEW Syntax

DROP VIEW view_name

 

SQL CREATE INDEX Statement

The CREATE INDEX statement is used to create indexes in tables.

Indexes allow the database application to find data fast; without reading the whole table.

Indexes

An index can be created in a table to find data more quickly and efficiently.

The users cannot see the indexes, they are just used to speed up searches/queries.

Note: Updating a table with indexes takes more time than updating a table without (because the indexes also need an update). So you should only create indexes on columns (and tables) that will be frequently searched against.

SQL CREATE INDEX Syntax

Creates an index on a table. Duplicate values are allowed:

CREATE INDEX index_name
ON table_name (column_name)

SQL CREATE UNIQUE INDEX Syntax

Creates a unique index on a table. Duplicate values are not allowed:

CREATE UNIQUE INDEX index_name
ON table_name (column_name)

Note: The syntax for creating indexes varies amongst different databases. Therefore: Check the syntax for creating indexes in your database.

CREATE INDEX Example

The SQL statement below creates an index named “PIndex” on the “LastName” column in the “Persons” table:

CREATE INDEX PIndex
ON Persons (LastName)

If you want to create an index on a combination of columns, you can list the column names within the parentheses, separated by commas:

CREATE INDEX PIndex
ON Persons (LastName, FirstName)

The SQL HAVING keyword provides a search condition for a group or aggregate. The SQL HAVING clause works together with the SQL SELECT clause. The SQL HAVING clause is somewhat similar to the SQL WHERE clause, because it specifies a search condition.
There is one important difference between SQL HAVING and SQL WHERE clauses. The SQL WHERE clause condition is tested against each and every row of data, while the SQL HAVING clause condition is tested against the groups and/or aggregates specified in the SQL GROUP BY clause and/or the SQL SELECT column list.

It is important to understand that if a SQL statement contains both SQL WHERE and SQL HAVING clauses the SQL WHERE clause is applied first, and the SQL HAVING clause is applied later to the groups and/or aggregates.

We will demonstrate how to use SQL HAVING using our Weather table:

City AverageTemperature Date
New York 22 C 10/10/2005
Seattle 21 C 10/10/2005
Washington 20 C 10/10/2005
New York 18 C 10/09/2005
Seattle 20 C 10/09/2005
Washington 17 C 10/09/2005

Our goal is to select all cities, with average temperature greater than 19 C. To accomplish that we are going to use the following SQL HAVING statement:

SELECT City, AVG(AverageTemperature)
FROM Weather
GROUP BY City
HAVING AVG(AverageTemperature) > 19
The result of the above SQL HAVING statement is displayed below:

City AverageTemperature
Seattle 20.5 C
New York 20 C

In the SQL GROUP BY chapter we used almost the same SQL statement but without the HAVING part:

SELECT City, AVG(AverageTemperature)
FROM Weather
GROUP BY City
The result of it was the following:

City AverageTemperature
Washington 18.5 C
Seattle 20.5 C
New York 20 C

By applying the SQL HAVING clause to this result set consisting of three groups, we remove the Washington group simply because its AverageTemperature value is less than 19.

group by and having clauses

Description

Used in select statements to divide a table into groups and to return only groups that match conditions in the havingclause. group by is typically used in conjunction with aggregates to specify how to group the unaggregated columns of a select query. having clauses are applied to these groups.

Syntax

Start of select statement

[group by [all] aggregate_free_expression

[,¬†aggregate_free_expression]…]

[having search_conditions]

End of select statement

Parameters

group by

specifies the groups into which the table is divided, and if aggregate functions are included in the select list, finds a summary value for each group. These summary values appear as columns in the results, one for each group. You can refer to these summary columns in the having clause.

You can use the avg, count, count_big, max, min, and sum aggregate functions in the select list beforegroup by (the expression is usually a column name).

A table can be grouped by any combination of columns‚ÄĒthat is, groups can be nested within each other, as in Example 2.

all

is a Transact-SQL extension that includes all groups in the results, even those excluded by a where clause. For example:

select type, avg(price)

from titles

where advance > 7000

group by all type

 

type

—————–¬†¬†———-

UNDECIDED                NULL

business                 2.99

mod_cook                 2.99

popular_comp            20.00

psychology               NULL

trad_cook               14.99

 

(6 rows affected)

‚ÄúNULL‚ÄĚ in the aggregate column indicates groups that would be excluded by the¬†where¬†clause. A¬†havingclause negates the meaning of¬†all.

aggregate_free_expression

is an expression that includes no aggregates. A Transact-SQL extension allows grouping by an aggregate-free expression as well as by a column name.

You cannot group by column heading or alias. This example is correct:

select Price=avg(price), Pay=avg(advance),

Total=price * $1.15

from titles

group by price * $1.15

having

sets conditions for the group by clause, similar to the way in which where sets conditions for the selectclause.

having search conditions can include aggregate expressions; otherwise, having search conditions are identical to where search conditions. This is an example of a having clause with aggregates:

select pub_id, total = sum(total_sales)

from titles

where total_sales is not null

group by pub_id

having count(*)>5

When Adaptive Server optimizes queries, it evaluates the search conditions in where and having clauses, and determines which conditions are search arguments (SARGs) that can be used to choose the best indexes and query plan. All of the search conditions are used to qualify the rows. For more information on search arguments, see the Performance and Tuning Guide: Optimizer and Abstract Plans.

Examples

Example 1

Calculates the average advance and the sum of the sales for each type of book:

select type, avg(advance), sum(total_sales)

from titles

group by type

Example 2

Groups the results by type, then by pub_id within each type:

select type, pub_id, avg(advance), sum(total_sales)

from titles

group by type, pub_id

Example 3

Calculates results for all groups, but displays only groups whose type begins with ‚Äúp‚ÄĚ:

select type, avg(price)

from titles

group by type

having type like ‘p%’

Example 4

Calculates results for all groups, but displays results for groups matching the multiple conditions in the havingclause:

select pub_id, sum(advance), avg(price)

from titles

group by pub_id

having sum(advance) > $15000

and avg(price) < $10

and pub_id > “0700”

Example 5

Calculates the total sales for each group (publisher) after joining the titles and publishers tables:

select p.pub_id, sum(t.total_sales)

from publishers p, titles t

where p.pub_id = t.pub_id

group by p.pub_id

Example 6

Displays the titles that have an advance of more than $1000 and a price that is more than the average price of all titles:

select title_id, advance, price

from titles

where advance > 1000

having price > avg(price)