Various Ways to Get Distinct Values From a List<T> Using LINQ

This article walks through several ways of filtering distinct values from a List<T>. A practical example: you have a list of products and want only the distinct entries. To make this concrete, consider the following model:

 

    public class Product
    {
        public int ProductID { get; set; }
        public string Make { get; set; }
        public string Model { get; set; }
    }

 

Now let's create a method that creates a list of Products. For example:

 

    private List<Product> GetProducts()
    {
        List<Product> products = new List<Product>();

        products.Add(new Product { ProductID = 1, Make = "Samsung", Model = "Galaxy S3" });
        products.Add(new Product { ProductID = 2, Make = "Samsung", Model = "Galaxy S4" });
        products.Add(new Product { ProductID = 3, Make = "Samsung", Model = "Galaxy S5" });
        products.Add(new Product { ProductID = 4, Make = "Apple", Model = "iPhone 5" });
        products.Add(new Product { ProductID = 5, Make = "Apple", Model = "iPhone 6" });
        products.Add(new Product { ProductID = 6, Make = "Apple", Model = "iPhone 6" });
        products.Add(new Product { ProductID = 7, Make = "HTC", Model = "Sensation" });
        products.Add(new Product { ProductID = 8, Make = "HTC", Model = "Desire" });
        products.Add(new Product { ProductID = 9, Make = "HTC", Model = "Desire" });
        products.Add(new Product { ProductID = 10, Make = "Nokia", Model = "Lumia 735" });
        products.Add(new Product { ProductID = 11, Make = "Nokia", Model = "Lumia 930" });
        products.Add(new Product { ProductID = 12, Make = "Nokia", Model = "Lumia 930" });
        products.Add(new Product { ProductID = 13, Make = "Sony", Model = "Xperia Z3" });

        return products;
    }

 

The method above returns a list of Products built from dummy data, just to keep this demo simple. In a real scenario you would probably query your database and load the result into your model. Now let's bind the Products data to a GridView.

 

    protected void Page_Load(object sender, EventArgs e)
    {
        if (!IsPostBack)
        {
            GridView1.DataSource = GetProducts();
            GridView1.DataBind();
        }
    }

 

Running the code will get you the following output:



If you notice, there are a few items above that contain the same values, commonly called “duplicate” values. Now let's try to get the distinct row values from the list using the LINQ Distinct function. The code now would look like this:

 

    if (!IsPostBack)
    {
        GridView1.DataSource = GetProducts().Distinct();
        GridView1.DataBind();
    }

 

Unfortunately, running the code still gives you the same output. At first glance it looks as if the Distinct LINQ function doesn't work at all. I was surprised and my first reaction was like:



What??? Really???

Yes, it doesn't work as expected! This is because the Distinct method uses the default equality comparer to compare values under the hood. Product is a reference type that doesn't override Equals and GetHashCode, so the default comparer compares object references, and Distinct treats every instance as unique even if the property values are the same.
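
To see this behaviour in isolation, here is a minimal sketch. The Product class is redeclared so the snippet stands on its own, and the DistinctDemo class and its method names are made up for illustration. Both instances carry identical values in every property, yet they are still two different references.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Product
{
    public int ProductID { get; set; }
    public string Make { get; set; }
    public string Model { get; set; }
}

// Hypothetical helper class, only for this illustration.
public static class DistinctDemo
{
    // Default Equals on a class that does not override it is reference equality.
    public static bool SameValuesAreEqual()
    {
        var a = new Product { ProductID = 5, Make = "Apple", Model = "iPhone 6" };
        var b = new Product { ProductID = 5, Make = "Apple", Model = "iPhone 6" };
        return a.Equals(b);
    }

    // Distinct() with no comparer relies on that same default equality,
    // so both instances survive.
    public static int CountAfterDistinct()
    {
        var products = new List<Product>
        {
            new Product { ProductID = 5, Make = "Apple", Model = "iPhone 6" },
            new Product { ProductID = 5, Make = "Apple", Model = "iPhone 6" }
        };
        return products.Distinct().Count();
    }
}
```

SameValuesAreEqual() returns false and CountAfterDistinct() returns 2, which matches what the GridView showed.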

What to do

So how are we going to deal with this?

There are a few possible ways to accomplish this, as follows.

Option 1: Using a combination of LINQ GroupBy and Select operators

 

    if (!IsPostBack)
    {
        // Group rows that share the same Make and Model, then keep one row per group.
        GridView1.DataSource = GetProducts()
            .GroupBy(o => new { o.Make, o.Model })
            .Select(g => g.FirstOrDefault());
        GridView1.DataBind();
    }

 



Option 2: Using a combination of LINQ Select and Distinct operators

 

    if (!IsPostBack)
    {
        // Project each product to an anonymous type, then remove the duplicates.
        GridView1.DataSource = GetProducts()
            .Select(o => new { o.Make, o.Model })
            .Distinct();
        GridView1.DataBind();
    }

 

The approach above projects each item into an anonymous type. The compiler generates Equals and GetHashCode for anonymous types that compare every property, so Distinct removes the duplicates. Note that the result contains only Make and Model, not ProductID.
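
A small sketch of that value-based equality, independent of the GridView code. The AnonymousTypeDemo class and its method names are invented for this illustration.

```csharp
using System;
using System.Linq;

// Hypothetical helper class, only for this illustration.
public static class AnonymousTypeDemo
{
    // Two anonymous objects with the same property names, types, order and
    // values compare as equal and produce the same hash code.
    public static bool SameValuesAreEqual()
    {
        var a = new { Make = "HTC", Model = "Desire" };
        var b = new { Make = "HTC", Model = "Desire" };
        return a.Equals(b) && a.GetHashCode() == b.GetHashCode();
    }

    // Because of that, Distinct() collapses the repeated "Desire" entry.
    public static int CountAfterDistinct()
    {
        var pairs = new[]
        {
            new { Make = "HTC", Model = "Sensation" },
            new { Make = "HTC", Model = "Desire" },
            new { Make = "HTC", Model = "Desire" }
        };
        return pairs.Distinct().Count();
    }
}
```

Here SameValuesAreEqual() returns true and CountAfterDistinct() returns 2.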



Option 3: Using the IEqualityComparer<T> interface

 

    class ProductComparer : IEqualityComparer<Product>
    {
        // Two products are equal when both Make and Model match.
        public bool Equals(Product x, Product y)
        {
            if (Object.ReferenceEquals(x, y)) return true;
            if (Object.ReferenceEquals(x, null) || Object.ReferenceEquals(y, null)) return false;
            return x.Make == y.Make && x.Model == y.Model;
        }

        // Products that are equal must return the same hash code.
        public int GetHashCode(Product product)
        {
            if (Object.ReferenceEquals(product, null)) return 0;
            int hashMake = product.Make == null ? 0 : product.Make.GetHashCode();
            // Guard against a null Model as well, so hashing never throws.
            int hashModel = product.Model == null ? 0 : product.Model.GetHashCode();
            return hashMake ^ hashModel;
        }
    }

 

The Distinct operator has an overload that lets you pass an instance of an IEqualityComparer<T>. So for this approach we created a class “ProductComparer” that implements IEqualityComparer<Product>. Here's the code to use it:

 

    if (!IsPostBack)
    {
        GridView1.DataSource = GetProducts()
            .Distinct(new ProductComparer());
        GridView1.DataBind();
    }

 

This approach is my preferred option because it allows me to implement my own GetHashCode and Equals methods for comparing custom types. I am also getting into the habit of coding against interfaces, which makes the code more reusable and readable.
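
As a rough illustration of the reusability point: a comparer written once for Distinct can also be handed to other APIs that accept an IEqualityComparer<T>, such as HashSet<T> and the Contains extension method. The sketch below redeclares Product and uses a comparer named MakeModelComparer (a stand-in with the same logic as ProductComparer) so that it is self-contained. ComparerReuseDemo and its methods are invented for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Product
{
    public int ProductID { get; set; }
    public string Make { get; set; }
    public string Model { get; set; }
}

// Stand-in for ProductComparer: equality on Make and Model only.
public class MakeModelComparer : IEqualityComparer<Product>
{
    public bool Equals(Product x, Product y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (ReferenceEquals(x, null) || ReferenceEquals(y, null)) return false;
        return x.Make == y.Make && x.Model == y.Model;
    }

    public int GetHashCode(Product p)
    {
        if (ReferenceEquals(p, null)) return 0;
        int hashMake = p.Make == null ? 0 : p.Make.GetHashCode();
        int hashModel = p.Model == null ? 0 : p.Model.GetHashCode();
        return hashMake ^ hashModel;
    }
}

// Hypothetical helper class, only for this illustration.
public static class ComparerReuseDemo
{
    // A HashSet built with the comparer rejects the second "Desire".
    public static int UniqueCount()
    {
        var set = new HashSet<Product>(new MakeModelComparer());
        set.Add(new Product { ProductID = 7, Make = "HTC", Model = "Sensation" });
        set.Add(new Product { ProductID = 8, Make = "HTC", Model = "Desire" });
        set.Add(new Product { ProductID = 9, Make = "HTC", Model = "Desire" });
        return set.Count;
    }

    // Contains can use the same comparer to search by Make and Model.
    public static bool HasDesire()
    {
        var list = new List<Product>
        {
            new Product { ProductID = 8, Make = "HTC", Model = "Desire" }
        };
        var probe = new Product { Make = "HTC", Model = "Desire" };
        return list.Contains(probe, new MakeModelComparer());
    }
}
```

UniqueCount() returns 2 and HasDesire() returns true, even though the probe object has a different ProductID and is a different instance.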



With any of these options, the duplicate values are now gone. Now here's another scenario. What if we want to get the distinct values for a certain field in the list? For example, get the distinct “Make” values such as Samsung, Apple, HTC, Nokia and Sony, and then populate a DropDownList control with the result for filtering purposes. I was hoping that the Distinct function had an overload that compares values based on a property or field, like GetProducts().Distinct(o => o.PropertyToCompare), but it doesn't seem to have one. So I came up with the following workarounds.
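
For reference, the kind of key-based overload wished for here can be sketched as a small extension method. This is only an illustration: the name DistinctByKey and the DistinctExtensions and DistinctByKeyDemo classes are made up for this example, and the workarounds that follow do not depend on it. Newer versions of .NET (6 and later) also ship a built-in Enumerable.DistinctBy that follows the same idea.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical extension method, not part of the framework.
public static class DistinctExtensions
{
    // Yields the first item seen for each key, in source order.
    public static IEnumerable<TSource> DistinctByKey<TSource, TKey>(
        this IEnumerable<TSource> source, Func<TSource, TKey> keySelector)
    {
        var seenKeys = new HashSet<TKey>();
        foreach (var item in source)
        {
            // HashSet.Add returns false when the key is already present.
            if (seenKeys.Add(keySelector(item)))
            {
                yield return item;
            }
        }
    }
}

// Hypothetical helper class, only for this illustration.
public static class DistinctByKeyDemo
{
    public static List<string> DistinctMakes()
    {
        var products = new[]
        {
            new { Make = "Samsung", Model = "Galaxy S3" },
            new { Make = "Samsung", Model = "Galaxy S4" },
            new { Make = "Apple", Model = "iPhone 5" },
            new { Make = "HTC", Model = "Desire" },
            new { Make = "HTC", Model = "Desire" }
        };

        return products
            .DistinctByKey(p => p.Make)
            .Select(p => p.Make)
            .ToList();
    }
}
```

DistinctMakes() returns Samsung, Apple and HTC, in that order, because the method keeps the first item it meets for each key.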

Option 1: Using GroupBy and Select operators

    if (!IsPostBack)
    {
        // One group per Make, then keep one product from each group.
        DropDownList1.DataSource = GetProducts()
            .GroupBy(o => o.Make)
            .Select(g => g.FirstOrDefault());
        DropDownList1.DataTextField = "Make";
        DropDownList1.DataValueField = "Make";
        DropDownList1.DataBind();
    }

 

Option 2: Using Select and Distinct operators

    if (!IsPostBack)
    {
        // Project to an anonymous type that holds only Make, then remove duplicates.
        DropDownList1.DataSource = GetProducts()
            .Select(o => new { Make = o.Make })
            .Distinct();
        DropDownList1.DataTextField = "Make";
        DropDownList1.DataValueField = "Make";
        DropDownList1.DataBind();
    }
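
A slightly shorter variant of the same idea, sketched under the assumption that only the Make text is needed: project the string itself instead of an anonymous type. Strings already compare by value, so Distinct works without any extra help. Product is redeclared here, and MakeListDemo and GetDistinctMakes are hypothetical names used only for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Product
{
    public int ProductID { get; set; }
    public string Make { get; set; }
    public string Model { get; set; }
}

// Hypothetical helper class, only for this illustration.
public static class MakeListDemo
{
    // Select the Make string directly; string equality is value-based.
    public static List<string> GetDistinctMakes(IEnumerable<Product> products)
    {
        return products
            .Select(p => p.Make)
            .Distinct()
            .ToList();
    }

    public static List<string> Sample()
    {
        var products = new List<Product>
        {
            new Product { ProductID = 1, Make = "Samsung", Model = "Galaxy S3" },
            new Product { ProductID = 2, Make = "Samsung", Model = "Galaxy S4" },
            new Product { ProductID = 4, Make = "Apple", Model = "iPhone 5" }
        };
        return GetDistinctMakes(products);
    }
}
```

When binding a plain list of strings like this to a DropDownList, DataTextField and DataValueField would presumably be left unset, since a string has no Make property to look up.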

 

Running the code for either option produces the following output:



That's it! I hope someone finds this article useful.
