ForEach in IEnumerable

Recently a friend asked me whether it is possible to call ForEach on an IEnumerable. I said ‘yes’, based on what I knew about List, but it turned out I was wrong. You cannot call ForEach on an IEnumerable right out of the box. That doesn’t mean you can’t do it; it simply was not implemented. So I asked, can’t you just use foreach? The answer was, sure. It was my friend’s curiosity that led me to look into why it was not implemented to start with. I did a couple of Google searches and found the following link, which explains very well why it was not implemented and how you would implement it yourself if you wanted to.

It is a very good read for anyone interested.

http://blogs.msdn.com/b/ericlippert/archive/2009/05/18/foreach-vs-foreach.aspx

There was also a little debate on Stack Overflow on the same subject.

http://stackoverflow.com/questions/200574/linq-equivalent-of-foreach-for-ienumerablet

http://stackoverflow.com/questions/858978/lambda-expression-using-foreach-clause

Personally, I would stick with ‘foreach’ instead of ‘ForEach’.

If I needed to get fancy and show off, I would write the extension method rather than use ‘.ToList().ForEach’. When you call ToList, it copies the sequence into a new List (holding the same object references) before that list is handed to ForEach.
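To make the point about ToList concrete, here is a minimal sketch. The class and method names (ToListDemo, SnapshotVersusQuery) are made up for illustration and are not part of the test project described in this post. It shows that ToList materializes a separate list at the moment it is called, while the underlying query is still evaluated lazily.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ToListDemo
{
    // Hypothetical helper: returns { size of the ToList() snapshot, size of the re-evaluated query }.
    public static int[] SnapshotVersusQuery()
    {
        var names = new List<string> { "Ann", "Bob" };

        // Deferred query: nothing is enumerated yet.
        IEnumerable<string> query = names.Where(n => n.Length == 3);

        // ToList enumerates the query now and stores the results in a new list.
        List<string> snapshot = query.ToList();

        // Changing the source afterwards does not affect the snapshot.
        names.Add("Cat");

        return new[] { snapshot.Count, query.Count() };
    }

    public static void Main()
    {
        int[] counts = SnapshotVersusQuery();
        Console.WriteLine("{0} {1}", counts[0], counts[1]); // prints "2 3"
    }
}
```

The extra list is the cost of going through ‘.ToList().ForEach’; the elements themselves are not cloned, only the references are copied.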

Out of curiosity, I wrote a small unit test project to compare the performance of the three flavors. It turned out that the traditional ‘foreach’ has a very thin lead. Here is what the program does. I have a Person model, shown below, and what we are trying to do is remove all the children from every parent that has any children. For simplicity, we assume there are only two levels.

public class Person
{
   public string Name { get; set; }
   public int Age { get; set; }
   public List<Person> Children { get; set; }
   public Person()
   {
       Children = new List<Person>();
   }
}

I have a class library that implements all three flavors of for each

public class LinqTesting
{
   public List<Person> SourceList;
   public List<Person> TargetList;
   public void RemoveChildrenWithLinq()
   {            
       SourceList.Where(p => p.Children.Count() > 0).ToList().ForEach(p => p.Children.Clear());
   }
   public void RemoveChildrenWithForEach()
   {
       foreach (var item in SourceList.Where(p=>p.Children.Count() > 0))
       {
           item.Children.Clear();
       }
   }
   public void RemoveChildrenWithLinqExtension()
   {
       SourceList.Where(p => p.Children.Count() > 0).ForEach(p => p.Children.Clear());
   }
}
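RemoveChildrenWithLinqExtension calls ForEach directly on the IEnumerable returned by Where, so it depends on an extension method that is not listed above. The following is a minimal sketch of what such an extension could look like; the class name EnumerableExtensions is an assumption, and the version used in the original tests may differ.

```csharp
using System;
using System.Collections.Generic;

public static class EnumerableExtensions
{
    // Hypothetical helper: runs the action once per element, without copying the sequence first.
    public static void ForEach<T>(this IEnumerable<T> source, Action<T> action)
    {
        if (source == null) throw new ArgumentNullException("source");
        if (action == null) throw new ArgumentNullException("action");

        foreach (T item in source)
        {
            action(item);
        }
    }
}
```

Because it is just a foreach loop wrapped in a method, it avoids the intermediate list that ‘.ToList().ForEach’ creates.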

We load the data as follows

private List<Person> LoadSourceData()
{
     List<Person> ppl = new List<Person>();
     for (int i = 0; i < 1000; i++)
     {
         Person singlePerson = new Person() { Name = string.Format("Parent {0}", i), Age = i + 10 };
         if (i % 3 == 0)
         {
             for (int j = 0; j < 5000; j++)
                 singlePerson.Children.Add(new Person() { Name = string.Format("Child {0}", j), Age = j + 10 });
         }
         ppl.Add(singlePerson);
     }
     return ppl;
 }

The three test methods are

[TestMethod]
public void TestMethodWithLinq()
{
    start = DateTime.Now;
    tester.RemoveChildrenWithLinq();
    end = DateTime.Now;
    PerformCommonAssert();
}

[TestMethod]
public void TestwithForEach()
{
   start = DateTime.Now;
   tester.RemoveChildrenWithForEach();
   end = DateTime.Now;
   PerformCommonAssert();
}

[TestMethod]
public void TestwithExtensions()
{
   start = DateTime.Now;
   tester.RemoveChildrenWithLinqExtension();
   end = DateTime.Now;
   PerformCommonAssert();
}
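The tests above take timestamps with DateTime.Now (the start, end and tester fields and PerformCommonAssert are not shown in this post). For intervals of only a few milliseconds, System.Diagnostics.Stopwatch is generally the more suitable tool. The sketch below is one way the timing could be wrapped; TimingSketch and Measure are hypothetical names, not part of the original test class.

```csharp
using System;
using System.Diagnostics;

public static class TimingSketch
{
    // Hypothetical helper: runs the action once and returns how long it took.
    public static TimeSpan Measure(Action action)
    {
        Stopwatch watch = Stopwatch.StartNew();
        action();
        watch.Stop();
        return watch.Elapsed;
    }

    public static void Main()
    {
        // Stand-in workload; in the tests this would be e.g. tester.RemoveChildrenWithForEach().
        TimeSpan elapsed = Measure(() =>
        {
            long sum = 0;
            for (int i = 0; i < 1000000; i++) sum += i;
        });

        Console.WriteLine("Elapsed: {0} ms", elapsed.TotalMilliseconds);
    }
}
```

Running each flavor several times and comparing the averages would give a steadier picture than a single run.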

And the result is

0:0:4 (m:s:ms)

0:0:3 (m:s:ms)

0:0:4 (m:s:ms)

3 passed, 0 failed, 0 skipped, took 7.44 seconds (MSTest 10.0).

With the above set of input, both the LINQ and the LINQ extension methods took 4 ms, while the traditional foreach took only 3 ms. A 1 ms lead is thin, but the traditional ‘foreach’ still comes out ahead, and on top of that the code is very easy to understand and easy to maintain. That makes it the approach I would choose.

If anyone has a different opinion or suggestions, please let me know.

Order Matters – LINQ

One of my colleagues ran into this problem when using LINQ to join two tables. He has two tables, a and b, and he is joining them on a column called Key that exists in both tables. The following is what he had in the beginning

var list = from firstTable in a
           join secondTable in b
           on firstTable.Key == secondTable.Key
           select firstTable;

This is a very straightforward query. At first glance there is nothing wrong with it. But when he tried to enter the code above, VS 2010 would not let him complete it and gave the error “The name ‘secondTable’ is not in scope on the left side of ‘equals’”. There are a couple of things you have to remember when you are using LINQ to do joins

  • The order of comparison matters. Always put the first table's key on the left side of the comparison and the second (joined) table's key on the right.
  • All LINQ joins are equijoins, so instead of using == you should use the equals keyword.

So the correct LINQ query should look like the following

var list = from firstTable in a
           join secondTable in b
           on firstTable.Key equals secondTable.Key
           select firstTable;
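To try the join outside of a database, here is a small self-contained sketch that uses in-memory lists in place of the two tables. The Row class, its Value property, and the sample data are assumptions made up for this example; only the Key column and the names a, b, firstTable and secondTable come from the query above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical row type standing in for a table row.
public class Row
{
    public int Key { get; set; }
    public string Value { get; set; }
}

public static class JoinDemo
{
    // Joins two in-memory "tables" on Key and returns the matched pairs as text.
    public static List<string> JoinValues()
    {
        var a = new List<Row> { new Row { Key = 1, Value = "a1" }, new Row { Key = 2, Value = "a2" } };
        var b = new List<Row> { new Row { Key = 2, Value = "b2" }, new Row { Key = 3, Value = "b3" } };

        // First table's key on the left of 'equals', joined table's key on the right.
        var list = from firstTable in a
                   join secondTable in b
                   on firstTable.Key equals secondTable.Key
                   select firstTable.Value + " - " + secondTable.Value;

        return list.ToList();
    }

    public static void Main()
    {
        foreach (string pair in JoinValues())
        {
            Console.WriteLine(pair); // prints "a2 - b2"
        }
    }
}
```

Swapping the two sides of equals in this sketch should reproduce the scope error quoted earlier, since secondTable is only in scope on the right-hand side.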
