Frank Bakker talks about software development: December 2008

Caching objects based on a key is a very common task in software development. Making such a cache both thread safe and scalable, however, is not quite trivial. That's why I wrote a generic implementation of such a cache; to handle the concurrency issues I used the Parallel Extensions to the .NET Framework (PFX).
[Update: See this post for a cache that works with .NET 3.5 without PFX]
An implementation pattern I have seen quite often uses a dictionary to store cached items by cache key, and looks like the following:

Dictionary<int, Item> _cachedItems = new Dictionary<int, Item>();
object _lock = new object();

public Item GetItem(int id)
{
    Item result;
    // First check, without taking the lock
    if (!_cachedItems.TryGetValue(id, out result))
    {
        lock (_lock)
        {
            // Second check: another thread may have added the item while we waited for the lock
            if (!_cachedItems.TryGetValue(id, out result))
            {
                result = CreateItem(id); // does the actual expensive work of creating the item
                _cachedItems.Add(id, result);
            }
        }
    }
    return result;
}

This implementation uses a pattern known as 'double-checked locking'. Items already in the cache are retrieved without taking a lock. Only when the item is not found does the code acquire the lock; it then checks again, because another thread may have added the item between the first check and acquiring the lock, and finally creates the item and stores it for future use.
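One caveat with the pattern above: Dictionary<TKey, TValue> is only safe for concurrent readers while nobody is modifying it, so the unlocked first TryGetValue can race with an Add on another thread. The following is a minimal sketch of one way around that, assuming the ConcurrentDictionary<TKey, TValue> and Lazy<T> types that ship with .NET 4. It is not the implementation from this post; the KeyedCache name and its members are made up for illustration.

```csharp
using System;
using System.Collections.Concurrent;

// Hypothetical cache type, for illustration only.
public sealed class KeyedCache<TKey, TValue>
{
    private readonly ConcurrentDictionary<TKey, Lazy<TValue>> _items =
        new ConcurrentDictionary<TKey, Lazy<TValue>>();
    private readonly Func<TKey, TValue> _create;

    public KeyedCache(Func<TKey, TValue> create)
    {
        if (create == null) throw new ArgumentNullException("create");
        _create = create;
    }

    public TValue Get(TKey key)
    {
        // GetOrAdd may build more than one Lazy<TValue> when threads race,
        // but only one of them is stored and returned to every caller.
        // Lazy<TValue> (default thread-safety mode) then runs the expensive
        // factory at most once for that stored instance.
        Lazy<TValue> lazy = _items.GetOrAdd(
            key, k => new Lazy<TValue>(() => _create(k)));
        return lazy.Value;
    }
}
```

With this shape, GetItem(id) from the sample above would become a call such as cache.Get(id), with CreateItem passed to the constructor as the factory.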

Last week I posted on how to use an element's sequence number in Linq queries. In the project I just finished working on, we wrote a lot of Linq queries where we needed even more information related to an element's position in the sequence, like comparing it to the element that came directly before or directly after it.
To be concrete: the input sequence contains all (ordered) historical versions of the same insurance policy object. From this sequence we need to filter all items for which a specific property has changed since the previous version. This could be implemented like:

Version previous = null;

foreach (var version in Versions)
{
    // The first version has nothing to compare against
    if (previous != null && previous.Amount != version.Amount)
    {
        // do something with this item
    }
    previous = version;
}

Once you have gotten addicted to the declarative style of Linq, writing code like this just doesn't cut it anymore. I wanted to be able to use the filtered items as a sequence so it can be used in a Linq query.
Using the pattern I described last week, this could be written as:

Versions.Select((item, index) => new { item, index }).Skip(1)
        .Where(v => Versions.ElementAt(v.index - 1).Amount != v.item.Amount)

This query finds each item's index number and uses it to find the preceding item with the ElementAt() method. This makes it possible to compare each item to its previous item in the where clause. The Skip(1) is needed because the first element obviously doesn't have a previous item.
The ElementAt() method, however, is not very efficient: unless the source implements IList<T>, it iterates the sequence from the beginning up to the requested item to find a single element. Finding each item's preceding item by looping over the sequence again can get you into serious performance issues with larger sets.
Because I needed these kinds of queries a lot in this project, I wanted to make them both easy to write and efficient in execution. So I wrote an extension method I called WithContext() that wraps each input item in a container object, along with some information about its context in the sequence. This allows the code above to be written as:
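To make that cost concrete, here is a small sketch that counts how many elements get pulled from a lazily produced sequence when every item looks up its predecessor with ElementAt(). The ElementAtCost class and its Counted helper are hypothetical names used only for this illustration, and a range of integers stands in for the policy versions.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ElementAtCost
{
    // Hypothetical helper: passes items through unchanged and reports
    // every element that a consumer pulls from the source.
    public static IEnumerable<T> Counted<T>(IEnumerable<T> source, Action onPull)
    {
        foreach (T item in source)
        {
            onPull();
            yield return item;
        }
    }

    // Runs the index + ElementAt() query over n items and returns the
    // total number of elements pulled from the source while doing so.
    public static long PullsWithElementAt(int n)
    {
        long pulls = 0;
        IEnumerable<int> versions =
            Counted(Enumerable.Range(0, n), () => pulls++);

        versions
            .Select((item, index) => new { item, index })
            .Skip(1)
            // Each ElementAt() call restarts the iterator from the beginning.
            .Where(v => versions.ElementAt(v.index - 1) != v.item)
            .Count();

        return pulls;
    }
}
```

A single foreach over the same source pulls each element once, so n pulls in total. With ElementAt() the count grows roughly with n squared over two, because item number i re-reads the i items in front of it.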

Versions.WithContext().Where(v => v.Previous != null && v.Previous.Amount != v.Current.Amount)

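To select the duration of each version, based on its own StartDate and the StartDate of the next item, I can now write the following (the last version has no Next item, so it is filtered out first):

Versions.WithContext().Where(v => v.Next != null).Select(v => v.Next.StartDate - v.Current.StartDate)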

WithContext() takes an IEnumerable<T> and returns an IEnumerable<ElementWithContext<T>>. ElementWithContext<T> is a simple class that provides properties to retrieve the Current item as well as the Previous and the Next. While at it I added properties to get all Preceding and all Following items, as well as the sequence number.

public class ElementWithContext<T>
{
    public T Previous { get; private set; }
    public T Current { get; private set; }
    public T Next { get; private set; }
    public int Index { get; private set; }

    public IEnumerable<T> Preceding { get; private set; }
    public IEnumerable<T> Following { get; private set; }

    internal ElementWithContext(T previous, T current, T next,
        int index, IEnumerable<T> preceding, IEnumerable<T> following)
    {
        Current = current;
        Previous = previous;
        Next = next;
        Index = index;
        Preceding = preceding;
        Following = following;
    }
}

The implementation of WithContext() looks a bit like the code in the first sample. It loops over the input sequence and yields a new ElementWithContext<T> for each element. To find the next element I did a kind of 'look ahead' in the foreach loop: I tweaked the input sequence so that it represents all 'Next' items, by first appending an empty element at the end (the last item does not have a Next item) and then skipping the first item (the first 'Next' item is the second in the sequence). This way WithContext() conforms to Linq's 'deferred execution' model by not taking more items from the input than necessary.

// Extension methods must be declared inside a static class
public static IEnumerable<ElementWithContext<T>> WithContext<T>(this IEnumerable<T> source)
{
    // initialize the previous and current item for the first source element
    T previous = default(T);
    T current = source.FirstOrDefault();
    int index = 0;

    // Loop all 'Next' items. Concat appends the empty element at the end;
    // Union would also remove duplicate items from the sequence.
    foreach (T next in source.Concat(new[] { default(T) }).Skip(1))
    {
        yield return new ElementWithContext<T>(previous, current, next,
                index, source.Take(index), source.Skip(index + 1));

        previous = current;
        current = next;
        index++;
    }
}
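As a worked illustration of the look-ahead idea, here is a trimmed-down variant that can be compiled on its own. It is a sketch, not the code from the project: it keeps only Previous, Current and Next, and it advances one enumerator by hand so the source is read a single time. The names Neighbours, WithNeighbours, PolicyVersion and PolicyQueries are hypothetical stand-ins.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical, reduced stand-in for ElementWithContext<T>.
public sealed class Neighbours<T>
{
    public T Previous { get; private set; }
    public T Current { get; private set; }
    public T Next { get; private set; }

    public Neighbours(T previous, T current, T next)
    {
        Previous = previous;
        Current = current;
        Next = next;
    }
}

public static class NeighbourExtensions
{
    public static IEnumerable<Neighbours<T>> WithNeighbours<T>(this IEnumerable<T> source)
    {
        using (IEnumerator<T> e = source.GetEnumerator())
        {
            if (!e.MoveNext()) yield break; // empty input, nothing to yield

            T previous = default(T);
            T current = e.Current;

            // Look ahead: read the next item before yielding the current one.
            while (e.MoveNext())
            {
                T next = e.Current;
                yield return new Neighbours<T>(previous, current, next);
                previous = current;
                current = next;
            }

            // The last item has no Next.
            yield return new Neighbours<T>(previous, current, default(T));
        }
    }
}

// Hypothetical model of one historical version of a policy.
public sealed class PolicyVersion
{
    public DateTime StartDate { get; set; }
    public decimal Amount { get; set; }
}

public static class PolicyQueries
{
    // Amounts of the versions whose Amount differs from the previous version.
    public static List<decimal> ChangedAmounts(IEnumerable<PolicyVersion> versions)
    {
        return versions.WithNeighbours()
            .Where(v => v.Previous != null && v.Previous.Amount != v.Current.Amount)
            .Select(v => v.Current.Amount)
            .ToList();
    }

    // Duration of every version that has a successor.
    public static List<TimeSpan> Durations(IEnumerable<PolicyVersion> versions)
    {
        return versions.WithNeighbours()
            .Where(v => v.Next != null)
            .Select(v => v.Next.StartDate - v.Current.StartDate)
            .ToList();
    }
}
```

The trade-off in this variant is that Preceding and Following are left out, since they need to re-read the source by design.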

One more item that I'll keep handy on my personal utility belt.

Oh, and thanks to Erno for pointing me to this Live Writer Add-In for the code snippets. I hope this helps with reading these way too long code samples (sorry for that).

Wednesday, December 31, 2008

Implementing a Thread Safe cache using the Parallel Extensions

Thursday, December 11, 2008

F# to ship as part of Visual Studio 2010

Tuesday, December 2, 2008

More context information in Linq queries