[Postgres] - DISTINCT ON in Postgres

Truong Bui

Published on: Last updated:

postgres tips

In PostgreSQL, the DISTINCT ON expression is a very useful feature that allows us to select distinct rows based on a certain column or set of columns.
Let's consider a simple example. Suppose we have a table orders with the following data:
File name: orders.sql
CREATE TABLE orders (
		id SERIAL PRIMARY KEY,
		customer_id INT,
		order_date DATE,
		amount DECIMAL
	);

	INSERT INTO orders (customer_id, order_date, amount)
	VALUES (1, '2023-01-01', 100),
	       (1, '2023-01-02', 200),
				 (2, '2023-01-01', 150),
				 (2, '2023-01-03', 300);
Now, if we want to get the latest order of each customer, we can use the DISTINCT ON expression as follows:
SELECT DISTINCT ON (customer_id) customer_id, order_date, amount
	FROM orders
	ORDER BY customer_id, order_date DESC;
This query will return the latest order of each customer. The DISTINCT ON expression is used in conjunction with the ORDER BY clause to determine which row among the duplicates should be returned.
customer_id
order_date
amount
1
2023-01-02
200
2
2023-01-03
300
If we use DISTINCT only
SELECT customer_id, order_date, amount
	FROM orders
	ORDER BY customer_id, order_date DESC;
This query will return all distinct rows in the table, not just the latest order of each customer.
customer_id
order_date
amount
1
2023-01-02
200
1
2023-01-01
100
2
2023-01-03
300
2
2023-01-01
150
WARNING:
Be aware that the DISTINCT ON expression is a PostgreSQL-specific feature and may not be available in other SQL databases.

Hope you find this article useful. If you have any questions, please let me know in the comment section below. 👇