SQL provides a variety of aggregate functions. The main idea behind an aggregate function is to take multiple inputs and return a single output.
Most common aggregate functions:
- AVG() - returns average value
- COUNT() - returns number of values
- MAX() - returns maximum value
- MIN() - returns minimum value
- SUM() - returns the sum of all values
Note: Aggregate function calls happen only in the SELECT clause or the HAVING clause.
Table Name: person
| id | first_name | last_name | age | joining_date |
|---|---|---|---|---|
| 101 | John | Smith | 25 | 2017-02-15 |
| 102 | Alice | Johnson | 30 | 2020-03-20 |
| 103 | Bob | Brown | 28 | 2016-04-25 |
| 104 | Alice | Adams | 22 | 2019-05-30 |
| 105 | John | Doe | 27 | 2023-06-04 |
To learn the min and max ages in the person table:
SELECT MIN(age), MAX(age) FROM person;| MIN(age) | MAX(age) |
|---|---|
| 22 | 30 |
To learn the average age in the person table:
SELECT AVG(age) FROM person;| AVG(age) |
|---|
| 26.4 |
Note: AVG() returns a floating point value many decimal places (e.g. 2.342418…). You can use ROUND() to specify precision after the decimal.
SELECT ROUND(AVG(age)) FROM person;| ROUND(AVG(age)) |
|---|
| 26 |
GROUP BY allows us to aggregate columns per some category. We need to choose a categorical column to GROUP BY. Categorical columns are non-continuous.
Note: Keep in mind, they can still be numerical, such as cabin class categories on a ship (e.g. Class 1, Class 2, Class 3).
Table Name: example
| category | value1 | value2 |
|---|---|---|
| A | 1 | 114 |
| B | 0 | 105 |
| A | 1 | 117 |
| C | 0 | 110 |
| C | 1 | 101 |
| B | 1 | 118 |
| A | 0 | 113 |
| B | 1 | 107 |
| C | 0 | 116 |
To select unique categories from the example table:
SELECT category FROM example GROUP BY category;| category |
|---|
| A |
| B |
| C |
To find out the average of the value1s of each category class:
SELECT category , ROUND(AVG(value2),2) FROM example GROUP BY category;| category | AVG(value2) |
|---|---|
| A | 114.67 |
| B | 110 |
| C | 109 |
To find out the average of the value2s of each category class except class A:
SELECT category , AVG(value2) FROM example WHERE category != 'A' GROUP BY category;| category | AVG(value2) |
|---|---|
| B | 110 |
| C | 109 |
Note: The GROUP BY clause must appear right after a FROM or WHERE statement.
Note: In the SELECT statement, columns must either have an aggregate function or be in the GROUP BY call.
Note: WHERE statements should not refer to the aggregation result, later on we will learn to use HAVING to filter on those results.
To sort each class in the category column by their average value2:
SELECT category , ROUND(AVG(value2),2) FROM example GROUP BY category ORDER BY AVG(value2) DESC;| category | AVG(value2) |
|---|---|
| A | 114.67 |
| B | 110 |
| C | 109 |
Note: If you want to sort results based on the aggregate, make sure to reference the entire function.
To group average value2 by category and value1:
SELECT category, value1, ROUND(AVG(value2)) FROM example GROUP BY category, value1;| category | value1 | ROUND(AVG(value2)) |
|---|---|---|
| A | 0 | 113 |
| A | 1 | 116 |
| B | 0 | 105 |
| B | 1 | 113 |
| C | 0 | 113 |
| C | 1 | 101 |
To sort the last table by the AVG value of value2:
SELECT category, value1, ROUND(AVG(value2)) FROM example GROUP BY category, value1 ORDER BY ROUND(AVG(value2)) DESC;| category | value1 | ROUND(AVG(value2)) |
|---|---|---|
| A | 1 | 116 |
| A | 0 | 113 |
| B | 1 | 113 |
| C | 0 | 113 |
| B | 0 | 105 |
| C | 1 | 101 |
The HAVING clause allows us to filter after an aggregation has already taken place.
Note: We’ve already seen we can filter before executing the GROUP BY, but what if we want to filter based on SUM(sales)? We can not use WHERE to filter based off of aggregate results, because those happen after a WHERE is executed.
Note: HAVING allows us to use the aggregate result as a filter along with a GROUP BY.
Table Name: example
| category | value1 | value2 |
|---|---|---|
| A | 1 | 114 |
| B | 0 | 105 |
| A | 1 | 117 |
| C | 0 | 110 |
| C | 1 | 101 |
| B | 1 | 118 |
| A | 0 | 113 |
| B | 1 | 107 |
| C | 0 | 116 |
To learn, retrieve average "value2" excluding 'C', grouped by category, HAVING >10:
SELECT category, AVG(value2) FROM example WHERE category != 'C' GROUP BY category HAVING AVG(value2) > 110;| category | AVG(value2) |
|---|---|
| A | 114.67 |