Enumerable, Enumerator, and Yielding a "Free" State Machine

Enumerator -- A State Machine.

A  .NET 1.0 enumerator is basically a state machine object that implements the following interface:

public interface IEnumerator
{
    object Current { get; }
    bool MoveNext();
    void Reset();
}

The current state of our object is accessed through the "Current" property.  If we want to get the next state in the machine we call "MoveNext()" and if the value was true, we have a "Current" state available.  If we want to start over from the beginning, we call "Reset()".

With .NET 2.0 and the arrival of generics, we have a more type-safe definition for a state machine which is basically the same thing that we had in .NET 1.0, but with a type-safe "Current" property and we have added the IDisposable interface.

public interface IEnumerator<T> : IDisposable, IEnumerator
{
    T Current { get; }
}

What do we mean by a state machine? 

It would be any object that would have multiple "states" that we want to view.  The following class is a state machine that produces 10 (and only 10) random integers.  If we try and get an 11th random integer, our state machine throws an exception.  Every call to "MoveNext()" generates a new number and increments the count of items received.

class RandomEnumerator : IEnumerator<Int32>
{
    public RandomEnumerator()
    {
        m_rand = new Random();
        m_Current = m_Count = 0;
    }

    private readonly Random
        m_rand;

    private Int32
        m_Current,
        m_Count;

    public int Current
    {
        get
        {
            if (m_Count > 10)
                throw new InvalidOperationException("Sorry... out of results");

            return m_Current;
        }
    }

    public void Dispose()
    {
        ; // nothing to do;
    }

    object System.Collections.IEnumerator.Current
    {
        get { return Current; }
    }

    public bool MoveNext()
    {
        if (m_Count < 10)
        {
            m_Current = m_rand.Next();
            m_Count++;
            return true;
        }
        else
            return false;
    }

    public void Reset()
    {
        m_Count = m_Current = 0;
    }
}

Now if we want to consume our random number generating state machine, we could do so with the following code.

IEnumerator<Int32> enumerator = new RandomEnumerator();

while (enumerator.MoveNext())
    Console.WriteLine(enumerator.Current);

Enumerable -- A State Machine Builder

Almost any collection is going to be able to build a state machine that will show us all the items in the collection.  (Such as  List<> Collection<>, Stack<>, Queue<>, etc)  This contract is expressed through the IEnumerable (in .NET 1.0 or a more type-safe IEnumerable<T>  in 2.0) which gives us a method to retrieve a state machine.

public interface IEnumerable
{
    IEnumerator GetEnumerator();
}

public interface IEnumerable<T> : IEnumerable
{
    IEnumerator<T> GetEnumerator();
}

This is such a common pattern, that it is integrated with the C# language such that we can easily iterator over all members of an object that produces a state machine.

For example, let's make a RandomGroup

class RandomGroup : IEnumerable<Int32>
{

    #region IEnumerable<int> Members

    public IEnumerator<int> GetEnumerator()
    {
        return new RandomEnumerator();
    }

    #endregion

    #region IEnumerable Members

    System.Collections.IEnumerator System.Collections.IEnumerable.GetEnumerator()
    {
        return GetEnumerator(); // get the IEnumerable<int> Enumerator
    }

    #endregion
}

Now, we could iterate through our state machine as before:

RandomGroup rGroup = new RandomGroup();

IEnumerator<Int32> enumerator = rGroup.GetEnumerator();

while (enumerator.MoveNext())
    Console.WriteLine(enumerator.Current);

But we can use a shorter syntax using a "for/each" loop with our IEnumerable<> state engine factory and we'll have the exact same behavior:

foreach (Int32 i in new RandomGroup())
    Console.WriteLine(i);

The "Free" State Engine

One of the really cool things in .NET 2.0 is the ability to have the compiler build the state engine for us.  We can take all the following code:

class RandomEnumerator : IEnumerator<Int32>
{
    public RandomEnumerator()
    {
        m_rand = new Random();
        m_Current = m_Count = 0;
    }

    private readonly Random
        m_rand;

    private Int32
        m_Current,
        m_Count;

    public int Current
    {
        get
        {
            if (m_Count > 10)
                throw new InvalidOperationException("Sorry... out of results");

            return m_Current;
        }
    }

    public void Dispose()
    {
        ; // nothing to do;
    }

    object System.Collections.IEnumerator.Current
    {
        get { return Current; }
    }

    public bool MoveNext()
    {
        if (m_Count < 10)
        {
            m_Current = m_rand.Next();
            m_Count++;
            return true;
        }
        else
            return false;
    }

    public void Reset()
    {
        m_Count = m_Current = 0;
    }
}

And the method to build our state engine.

static IEnumerator<Int32> GetRandomEnumerator()
{
    return new RandomEnumerator();
}

And use the "yield" keyword to have the C# compiler build the state engine behind the scenes.  We just specify how it should behave in the builder method.  So now we have much less code and one less class which means easier maintenance.  The following builder has the compiler generate a state engine class just as before.

static IEnumerator<Int32> GetRandomEnumerator2()
{
    Random rand = new Random();

    for (Int32 i = 0; i < 10; i++)
        yield return rand.Next();
}

This is TONS easier to maintain and we can consume the state engine just as before:

IEnumerator<Int32> enumerator = GetRandomEnumerator2();

while (enumerator.MoveNext())
    Console.WriteLine(enumerator.Current);

Evey time we call "MoveNext()" the state of our machine is changed to whatever we yield return from our method.  When we run out of "yields" the "MoveNext()" method returns false.

We can have multiple "yield return" statements to build our state engine as in the following code:

static IEnumerator<Int32> GetFirstSixPrimesEnumerator()
{
    yield return 1;
    yield return 2;
    yield return 3;
    yield return 5;
    yield return 7;
    yield return 11;
}

Which produces the following resultes when consumed:

IEnumerator<Int32> enumerator = GetFirstSixPrimesEnumerator();

while (enumerator.MoveNext())
    Console.WriteLine(enumerator.Current);

1
2
3
5
7
11

Custom Iterators.

We can also use the yield statement to produce custom IEnumerator<> state machine objects as well as IEnumerable<> iterator objects.  If we change our method to return IEnumerable<> instead of IEnumerator<> we get slightly different behavior.

static IEnumerable<Int32> GetFirstSixPrimesEnumerable()
{
    yield return 1;
    yield return 2;
    yield return 3;
    yield return 5;
    yield return 7;
    yield return 11;
}

Now we have the compiler generate not only the state engine class but the state engine builder class as well.  So we can consume our new compiler-generated class with the following code:

foreach (Int32 i in GetFirstSixPrimesEnumerator())
    Console.WriteLine(i);

Which will give us:

1
2
3
5
7
11

Ok, so that's really cool, but how else can I use it?

Besides a simplified way to implement IEnumerable<> and IEnumerator<>, we can use the "yield" statement to generate sets of numbers such as a Fibinocci sequence:

static IEnumerable<Int32> GetFibonacci(Int32 max)
{
    Int32
        x = 0,
        y = 1,
        temp;

    while (y < max)
    {
        yield return y;
        temp = y;
        y = x + y;
        x = temp;
    }
}

Or if we have a tree type structure as follows:

class Node<T>
{
    private Node<T>
        m_left,
        m_right;

    private T
        m_value;

    public T Value
    {
      get { return m_value; }
      set { m_value = value; }
    }

    internal Node<T> Right
    {
        get { return m_right; }
        set { m_right = value; }
    }

    internal Node<T> Left
    {
        get { return m_left; }
        set { m_left = value; }
    }
}

We can use the yield statement to implement recursive traversal through an iterator:

class Tree<T>: IEnumerable<Node<T>>
{
    private Node<T> m_topNode;

    internal Node<T> TopNode
    {
        get { return m_topNode; }
        set { m_topNode = value; }
    }

    private IEnumerable<Node<T>> GetChildren(Node<T> node)
    {
        if (null != node.Left)
            foreach (Node<T> leftNode in GetChildren(node.Left))
                yield return leftNode;

        yield return node;

        if (null != node.Right)
            foreach (Node<T> rightNode in GetChildren(node.Right))
                yield return rightNode;
    }

    #region IEnumerable<Node<T>> Members

    public IEnumerator<Node<T>> GetEnumerator()
    {
        return GetChildren(m_topNode);
    }

    #endregion

    #region IEnumerable Members

    System.Collections.IEnumerator System.Collections.IEnumerable.GetEnumerator()
    {
        return GetEnumerator();
    }

    #endregion
}

And we can build other creative auto-generated state engine structures such as Jeffery Richter's AsyncEnumerator (which is pure genius, by the way: http://msdn2.microsoft.com/en-us/magazine/cc163323.aspx).  I was truly blown ayway by the elegant code presented during his session on threading at the Devscovery event in NYC where he demonstrated his AsyncEnumerator and how it uses a compiler built state engine to help manage the complexities of asynchronous programming. I found out he is going to be writing more on the AsyncEnumerator in an upcoming article which can't wait to read.

Stopping the Yield

One thing to be aware of is that we can stop yielding at any time by issuing a "yield break" statement.  The following code will return a random number of 'x' characters:

static IEnumerable<Char> GetRandomNumberOfCharacters()
{
    Random r = new Random();

    if (r.NextDouble() > .9)
        yield break;
    else
        yield return 'x';
}

Anyways... As you can see, the possiblities of simplifying our code using the "yield" keyword to build cusom iterators and state machines is vast and very powerful.

Until next time,
Happy coding

Up Next
    Ebook Download
    View all
    Learn
    View all