Skip to content

Latest commit

 

History

History
174 lines (143 loc) · 5.93 KB

File metadata and controls

174 lines (143 loc) · 5.93 KB

GROUP BY Statements

Aggregate Functions

SQL provides a variety of aggregate functions. The main idea behind an aggregate function is to take multiple inputs and return a single output.

Most common aggregate functions:

  • AVG() - returns average value
  • COUNT() - returns number of values
  • MAX() - returns maximum value
  • MIN() - returns minimum value
  • SUM() - returns the sum of all values

Note: Aggregate function calls happen only in the SELECT clause or the HAVING clause.

Application

Table Name: person

id first_name last_name age joining_date
101 John Smith 25 2017-02-15
102 Alice Johnson 30 2020-03-20
103 Bob Brown 28 2016-04-25
104 Alice Adams 22 2019-05-30
105 John Doe 27 2023-06-04

To learn the min and max ages in the person table:

SELECT MIN(age), MAX(age) FROM person;
MIN(age) MAX(age)
22 30

To learn the average age in the person table:

SELECT AVG(age) FROM person;
AVG(age)
26.4

Note: AVG() returns a floating point value many decimal places (e.g. 2.342418…). You can use ROUND() to specify precision after the decimal.

SELECT ROUND(AVG(age)) FROM person;
ROUND(AVG(age))
26

GROUP BY

GROUP BY allows us to aggregate columns per some category. We need to choose a categorical column to GROUP BY. Categorical columns are non-continuous.

Note: Keep in mind, they can still be numerical, such as cabin class categories on a ship (e.g. Class 1, Class 2, Class 3).

Application

Table Name: example

category value1 value2
A 1 114
B 0 105
A 1 117
C 0 110
C 1 101
B 1 118
A 0 113
B 1 107
C 0 116

To select unique categories from the example table:

SELECT category FROM example GROUP BY category;
category
A
B
C

To find out the average of the value1s of each category class:

SELECT category , ROUND(AVG(value2),2) FROM example GROUP BY category;
category AVG(value2)
A 114.67
B 110
C 109

To find out the average of the value2s of each category class except class A:

SELECT category , AVG(value2) FROM example WHERE category != 'A' GROUP BY category;
category AVG(value2)
B 110
C 109

Note: The GROUP BY clause must appear right after a FROM or WHERE statement.

Note: In the SELECT statement, columns must either have an aggregate function or be in the GROUP BY call.

Note: WHERE statements should not refer to the aggregation result, later on we will learn to use HAVING to filter on those results.

To sort each class in the category column by their average value2:

SELECT category , ROUND(AVG(value2),2) FROM example GROUP BY category ORDER BY AVG(value2) DESC;
category AVG(value2)
A 114.67
B 110
C 109

Note: If you want to sort results based on the aggregate, make sure to reference the entire function.

To group average value2 by category and value1:

SELECT category, value1, ROUND(AVG(value2)) FROM example GROUP BY category, value1;
category value1 ROUND(AVG(value2))
A 0 113
A 1 116
B 0 105
B 1 113
C 0 113
C 1 101

To sort the last table by the AVG value of value2:

SELECT category, value1, ROUND(AVG(value2)) FROM example GROUP BY category, value1 ORDER BY ROUND(AVG(value2)) DESC;
category value1 ROUND(AVG(value2))
A 1 116
A 0 113
B 1 113
C 0 113
B 0 105
C 1 101

HAVING

The HAVING clause allows us to filter after an aggregation has already taken place.

Note: We’ve already seen we can filter before executing the GROUP BY, but what if we want to filter based on SUM(sales)? We can not use WHERE to filter based off of aggregate results, because those happen after a WHERE is executed.

Note: HAVING allows us to use the aggregate result as a filter along with a GROUP BY.

Application

Table Name: example

category value1 value2
A 1 114
B 0 105
A 1 117
C 0 110
C 1 101
B 1 118
A 0 113
B 1 107
C 0 116

To learn, retrieve average "value2" excluding 'C', grouped by category, HAVING >10:

SELECT category, AVG(value2) FROM example WHERE category != 'C' GROUP BY category HAVING AVG(value2) > 110;
category AVG(value2)
A 114.67