Group By, Where, Having, Select Distinct Group By is possibly the least understood commonly used function of SQL. A typical use for group by will come from a query such as this one: SELECT count([ProductID]) ,[Name] ,[ProductNumber] ,[MakeFlag] ,[FinishedGoodsFlag] ,[Color] ,[DaysToManufacture] FROM Production.Product Which returns an error: Msg 8120, Level 16, State 1, Line 1 Column 'Production.Product.Name' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause. So a developer will end up doing something like this: SELECT count([ProductID]) ,[Name] ,[ProductNumber] ,[MakeFlag] ,[FinishedGoodsFlag] ,[Color] ,[DaysToManufacture] FROM Production.Product group by [Name] ,[ProductNumber] ,[MakeFlag] ,[FinishedGoodsFlag] ,[Color] ,[DaysToManufacture] Which doesn’t really provide any useful information. A proper use for the group by clause is to specify on what to base the aggregate function (count, sum, avg, max, etc.). By saying “select count(productid)…. Group by daystomanufacture” the developer is requesting a listing of the count of products FOR EACH value in daystomanufacture. Count(productid) | daystomanufacture | 246 | 0 | 154 | 1 | 7 | 2 | 97 | 4 |
The where and having clauses can change these numbers. Where is used for pre-aggregation filtering, and Having is used for post-aggregation filtering. Consider these two queries SELECT count([ProductID]) ,[DaysToManufacture] FROM Production.Product where daystomanufacture<2 group by [DaysToManufacture] And SELECT count([ProductID]) ,[DaysToManufacture] FROM Production.Product group by [DaysToManufacture] having count(ProductID)>7 The first will return the rowset where DaysToManufacture, a field in the table, is less than 2. The second will return the rowset where the COUNT of productid, a calculated aggregate is >7 Select Distinct returns records that are not identical. THIS DOES NOT FIX PROBLEMS WITH INCORRECT CARTESIANS! Select a, b from tblsomething A | B | 1 | Alpha | 1 | Bravo | 2 | Delta | 2 | Delta |
Select distinct a, b from tblsomething |