Skip to content

Click to use (opens in a new tab)

What is GROUP BY

Introduction to GROUP BY

The GROUP BY clause in SQL is used to group rows that have the same values in specified columns into summary rows. This allows you to perform aggregate functions (such as COUNT, SUM, AVG, MAX, MIN) on those groups. The GROUP BY clause is commonly used in conjunction with aggregate functions to summarize data and gain insights from it.

Key Characteristics

  • Aggregation: Groups rows based on one or more columns.
  • Summary Data: Produces summarized data by applying aggregate functions.
  • Distinct Values: Ensures that each group contains distinct values for the grouped columns.

Syntax

SELECT column1, column2, aggregate_function(column3)
FROM table_name
WHERE condition
GROUP BY column1, column2;

Explanation

  • column1, column2: Columns to group by.
  • aggregate_function(column3): Function applied to another column to produce a single value per group.
  • table_name: The table from which to retrieve data.
  • WHERE condition: Optional condition to filter rows before grouping.

Example: Using GROUP BY

Scenario

Consider a table Orders with the following columns:

order_idcustomer_idproduct_idquantityorder_date
110150122024-01-01
210250212024-01-01
310150132024-01-02
410350352024-01-03

Example 1: Grouping by Customer

To find the total quantity of orders placed by each customer:

SELECT 
    customer_id, 
    SUM(quantity) AS total_quantity
FROM 
    Orders
GROUP BY 
    customer_id;

Result

customer_idtotal_quantity
1015
1021
1035

Explanation

  • Groups the rows by customer_id.
  • Uses SUM(quantity) to calculate the total quantity ordered by each customer.

Example 2: Grouping by Product and Date

To find the total quantity sold for each product on each date:

SELECT 
    product_id, 
    order_date, 
    SUM(quantity) AS total_quantity
FROM 
    Orders
GROUP BY 
    product_id, 
    order_date;

Result

product_idorder_datetotal_quantity
5012024-01-012
5012024-01-023
5022024-01-011
5032024-01-035

Explanation

  • Groups the rows by product_id and order_date.
  • Uses SUM(quantity) to calculate the total quantity sold for each product on each date.

Example 3: Combining with WHERE Clause

To find the total quantity of orders placed by each customer only for orders after January 1, 2024:

SELECT 
    customer_id, 
    SUM(quantity) AS total_quantity
FROM 
    Orders
WHERE 
    order_date > '2024-01-01'
GROUP BY 
    customer_id;

Result

customer_idtotal_quantity
1013
1035

Explanation

  • Filters the rows to include only those with order_date > '2024-01-01'.
  • Groups the remaining rows by customer_id.
  • Uses SUM(quantity) to calculate the total quantity ordered by each customer after the specified date.

Benefits of Using GROUP BY

  • Data Summarization: Provides a way to summarize large datasets into meaningful summaries.
  • Insight Generation: Helps in generating insights from data by aggregating and analyzing specific segments.
  • Performance Optimization: Can improve query performance by reducing the amount of data processed and returned.

Conclusion

The GROUP BY clause is a powerful tool in SQL for summarizing and analyzing data. By grouping rows based on common attributes and applying aggregate functions, it enables users to extract valuable insights and make informed decisions from their data.


Chat2DB - AI Text2SQL Tool for Easy Database Management

Click to use (opens in a new tab)

What can Chat2DB do?