T-SQL Tutorial - When is DISTINCT evaluated

Подписаться 11 тыс.

50% 1

Another video brought to you by BeardedDev, bringing you tutorials on Data Engineering, Business Intelligence, T-SQL Programming and Data Analysis.
In this video I talk through when DISTINCT is evaluated, sometimes I see people write queries with a unique identifier and then wonder why the results include duplicates (aside from the unique identifier, of course), understanding how data is processed is very important to writing accurate SQL queries.
I introduce Derived Tables to take the results from a query that produces unique rows using DISTINCT then add a unique identifier using ROW_NUMBER.
This video was created on SQL Server 2019 using SQL Server Management Studio but you can following along with Azure SQL Database or Azure Synapse Analytics and using another client tool such as Azure Data Studio.
If you wish to follow along with the video, you can use the SQL Scripts below:
IF OBJECT_ID(N'dbo.Customer', N'U') IS NOT NULL
DROP TABLE dbo.Customer;
CREATE TABLE dbo.Customer
(
FirstName VARCHAR(50),
LastName VARCHAR(50)
);
INSERT INTO dbo.Customer (FirstName, LastName)
VALUES
('Sarah', 'Power'),
('Sarah', 'Power'),
('Nathan', 'Jones'),
('Nathan', 'Jones'),
('Nathan', 'Jones'),
('Nathan', 'Jones'),
('Michelle', 'Arnold'),
('Michelle', 'Arnold'),
('Michelle', 'Arnold');
-- unique customers
SELECT DISTINCT
FirstName,
LastName
FROM dbo.Customer;
-- unique customers with unique identifier
SELECT
ROW_NUMBER() OVER(ORDER BY FirstName, LastName) AS RowNum,
FirstName,
LastName
FROM
(
SELECT DISTINCT
FirstName,
LastName
FROM dbo.Customer
) AS D;

Опубликовано:

15 сен 2024

Ссылка:

Скачать:

Готовим ссылку...

Добавить в:

Мой плейлист

Посмотреть позже

Комментарии : 12

@johnbreslin4521 2 года назад

Great explanation as always. Good to see you putting more topics up again.

@ashkrv123 2 года назад

Back to back 2 videos.. Cheers. And always knowledgeable

@BeardedDevData 2 года назад

Trying to get some consistency now, I'm going to branch out on topics too.

@briandennehy6380 2 года назад

2 vids in 2 days 👍 distinct always melts my brain when theres multiple columns selected with a distinct, good to know how it works now

@BeardedDevData 2 года назад

I'm getting back in to the swing of things now, will be branching out on topics too.

@31redorange08 2 года назад

What would be better performance-wise? Your query or a single SELECT with RANK?

@BeardedDevData 2 года назад

You would have to use DENSE RANK to prevent the gaps. In terms of performance, as a general rule the derived table approach will perform better as you are running a Window Function over less data.

@superfreiheit1 Год назад

Is distinct and Group by the same process?

@BeardedDevData Год назад

Usually they will generate the same execution plan however I did some work a few months outputting some results in JSON format and group by was significantly faster. I seem to remember historically distinct caused a sort operation which is expensive.

@karinshamama5591 2 года назад

How can I add grand total line at the end … I have like 150 col?

@BeardedDevData 2 года назад

You would have to add another query and use UNION ALL, I wouldn't recommend it though, we are just querying a database here so it would be worth using a reporting or presentation tool to compute that based on the data.

@superfreiheit1 Год назад

You could also use grouping set and then ()