Introduction
The .NET framework contains a number of 'Dictionary' classes including:
- Dictionary<K, V>
- SortedDictionary<K, V>
- Hashtable
- ListDictionary
- OrderedDictionary
All of these associate a key with a value and
are represented internally by a collection of key/value pairs. However, none of
them permit duplicate keys and so, if you try to add an item whose key is
already present, an exception is thrown.
There is a Lookup<K, V> class which is a collection of keys mapped to one or
more values. However, this is intended for use in LINQ queries and is not really
suitable for general purpose use.
A strategy for a workaround
The absence of a general purpose dictionary class which permits duplicate keys
is a serious shortcoming of the framework, in my opinion.
The most usual way to work around this shortcoming is to associate a list of
values, rather than a single value, with each key in the dictionary.
In fact, the GroupBy extension method which is used in LINQ works by reading the
input elements into a temporary dictionary of lists so that all elements with
the same key end up in the list associated with that key. The lists are then
emitted as a sequence of objects which implement the IGrouping interface.
Although this strategy works, it is tedious to employ in practice because you
need to write code to manipulate the list associated with each key every time an
item is added, removed or checked for containership. Also, when enumerating the
dictionary, you have to enumerate first the keys and then their associated
lists.
The Lexicon class
It would be nice if we had a class which did all the 'housekeeping' of
maintaining the lists for us and which could be used in a natural way. I have
therefore written such a class.
As Dictionary is already a long name, I decided to simply use the synonym
'Lexicon' for the new class rather than append yet another prefix to
'Dictionary'.
The Lexicon<K, V> class contains all the same members as the Dictionary<K, V>
class but has some new ones to deal with the possibility that a key could have
multiple values. The usage of some existing members has had to be modified for
the same reason and some additional overloads have been added.
Although the Lexicon<K, V> class is a wrapper for a Dictionary<K, List<V>>, I
decided that it would not be appropriate to formally implement the generic and
non-generic IDictionary interfaces because these interfaces implicitly assume
that classes which implement them will not support duplicate keys. However, it
does implement all the other interfaces which Dictionary implements.
It also implements a new interface I've created called ILexicon<K, V> which
contains signatures for its essential members.
This table lists its main constructors with a brief description of what they do:
Constructor Signature | Description |
Lexicon() | creates a new empty Lexicon |
Lexicon(dictionary) | creates a new Lexicon from a generic IDictionary |
Lexicon(lexicon) | creates a new Lexicon from an ILexicon |
Lexicon(capacity) | creates a new Lexicon with the specified initial capacity |
Lexicon(comparer) | creates a new Lexicon which uses a specific IEqualityComparer to test for equality of keys |
This table lists its main properties:
Property Name | Description |
Count | gets the total number of items in the Lexicon |
KeyCount | gets the total number of unique keys |
this[key] | gets or sets the list of values with this key |
this[key, index] | gets or sets the value of the item at this index within this key's list of values |
Keys | gets a collection of all unique keys in the Lexicon |
ValueLists | gets the lists of values of all items in the Lexicon in the same order as the Keys property |
Values | gets an enumeration of all values in the Lexicon in the same order as the Keys and Values properties |
And, finally, this table lists its main
methods:
Method Name and Signature | Description |
Add(key, value) | adds an item to the Lexicon with this key and value |
AddList(key, list) | adds items to the Lexicon with this key from this list of values |
AddRange(keyValuePairs) | adds an enumerable collection of KeyValuePairs to the Lexicon |
ChangeValue(key, oldValue, newValue) | changes the value of the first item with this key and this oldValue to this newValue |
ChangeValueAt(key, index, newValue) | changes the value of the item with this index in this key's list of values to newValue |
Clear() | clears the Lexicon of all items |
Contains(key, value) | indicates if an item with this key/value pair is within the Lexicon |
ContainsKey(key) | indicates if an item with this key is within the Lexicon |
ContainsValue(value) | indicates if an item with this value is within the Lexicon |
ContainsValue(value, out key) | indicates if an item with this value is within the Lexicon and, if so, returns the first key found with that value as an output parameter |
CopyTo(array, index) | copies the key /value pairs of the Lexicon to an array starting at the specified index |
FindKeyIndexPairs(value) | finds all key/index pairs in the Lexicon which have this value |
GetValueCount(key) | gets the number of all values in this key's list |
IndexOfValue(key, value) | gets the first index of this value within this key's list |
Remove(key, value) | removes the first item in the Lexicon with this key and value |
RemoveAt(key, index) | removes the item at this index in this key's list of values |
RemoveKey(key) | removes all items in the Lexicon with this key |
TryGetValueList(key, out value) | tries to get the list of values with this key |
TryGetValueAt(key, index, out value) | tries to get the value of the item at this index within this key's list of values |
Notes on members
Lexicon's constructors mirror those of the generic Dictionary class except that
there is an additional constructor to create a new Lexicon from an existing
ILexicon.
The Count property returns the total number of key/value pairs in the Lexicon.
The indexer which takes a sole key argument gets or sets the whole list of
values for that key. If there's already a list for that key, it is replaced (not
merged) by the 'setter'. Otherwise, a new entry for that key is created.
The indexer which takes an additional index parameter gets or sets the
individual value at that index in the key's value list. If the key doesn't exist
or there's no value at that index (and it's the next index to be assigned) then
a new entry is created.
The Keys property returns just the unique keys (however many times they're
duplicated) and the KeyCount property returns how many such keys there are.
The ValueLists property returns the lists of value corresponding to the unique
keys and the Values property returns an enumeration of all values in all
key/value pairs in the Lexicon.
The Add, Contains and Remove methods also have overloads (not shown) which take
a KeyValuePair structure as a parameter.
The AddList method adds a key with a list of values to the Lexicon. If the key
already exists, its existing list of values is merged with the new one, not
replaced by it. This differs from the indexer behaviour.
The FindAllKeyIndexPairs method does what it says on the tin for a given value.
Notice, though, that this method will be slow for a large Lexicon as it needs to
iterate through every key/value pair within it. It has therefore been
implemented using deferred execution (i.e. an iterator).
The 'ChangeValue' and 'Remove' family of methods all return a bool value to
indicate whether the operation was a success or not.
The other members are largely self-explanatory.
When Lexicons are created (or added to) from other objects, they are copied
first so that subsequent changes don't affect the original objects. However,
when lists of values are retrieved by external code, they are not copied and so
may be manipulated directly by that code.
Enumerating the Lexicon
The best way to enumerate a Lexicon object is to use the enumerator returned by
the GetEnumerator method which returns all key/value pairs in the same order as
the keys are enumerated in the underlying Dictionary object
However, it's also possible to enumerate using the Keys property and then use
the indexer to get the list of values for that key and iterate through those.
The ValueLists property enables the lists of values for each key to be
enumerated separately.
As mentioned in the previous section, the Values property and FindKeyIndexPairs
method are implemented as generic IEnumerables and so can also be enumerated
using the foreach statement.
Example of usage
The code for the Lexicon class and ILexicon interface can be downloaded from the
link accompanying this article as it is too long to include in the body of the
article itself. Both these types are included in a namespace called Lexicons and
can be used in .NET 2.0 or later.
The following console application shows how to use some of its principal
members:
using
System;
using
System.Collections.Generic;
using
Lexicons;
class
Program
{
static void
Main(string[] args)
{
// create and populate a Lexicon object
Lexicon<string,
int> lex = new
Lexicon<string,
int>();
lex.Add("Dave", 1);
lex.Add("John", 2);
lex.Add("Dave", 3);
lex.Add("Stan", 4);
lex.Add("Dave", 5);
lex.Add(new
KeyValuePair<string,
int>("Fred",
6));
// iterate through key/value pairs
Console.WriteLine("The
lexicon initially contains the following key/value pairs\n");
foreach (KeyValuePair<string,
int> kvp in lex)
{
Console.WriteLine("{0}
: {1}", kvp.Key, kvp.Value);
}
// add a new entry to the Lexicon
lex["Alan"] =
new List<int>
{ 7 };
// add some more values for the new key
lex.AddList("Alan",
new List<int>
{ 8, 9 });
// add another new entry
lex.Add("Mary", 10);
// iterate the Lexicon again, this time using the Keys
collection
Console.WriteLine("\nFollowing
the addition of new entries, the lexicon now contains\n");
foreach (string
key in lex.Keys)
{
foreach (int
value in lex[key])
{
Console.WriteLine("{0}
: {1}", key, value);
}
}
Console.WriteLine("\nDave
has {0} values", lex.GetValueCount("Dave"));
lex.RemoveKey("Dave");
// remove key and all its values
lex.Remove("Alan", 8);
// remove a single value
lex.ChangeValue("Fred", 6, 5);
// change a value
// iterate the Lexicon again
Console.WriteLine("\nFollowing
some removals and a change, the lexicon now contains\n");
foreach (KeyValuePair<string,
int> kvp in lex)
{
Console.WriteLine("{0}
: {1}", kvp.Key, kvp.Value);
}
if (lex.Contains("Stan",
4))
{
Console.WriteLine("\nStan
has a value of 4");
}
// create an array of key/value pairs and copy the
Lexicon's contents to it
KeyValuePair<string,
int>[] kvpa = new
KeyValuePair<string,
int>[lex.Count];
lex.CopyTo(kvpa, 0);
Console.WriteLine("There
are currently {0} key value pairs in the Lexicon", kvpa.Length);
// try and get the value at index 1 for Alan
int val;
bool b = lex.TryGetValueAt("Alan",
1, out val);
if (b)
Console.WriteLine("Alan has a value of {0} at
index 1", val);
// create a new dictionary
Dictionary<string,
int> dict = new
Dictionary<string,
int>();
dict["Nora"] = 3;
dict["John"] = 4;
// uses a key already in the Lexicon
// create a new Lexicon from the Dictionary
Lexicon<string,
int> lex2 = new
Lexicon<string,
int>(dict);
// add some more members to it
lex2["Zena"]
= new List<int>
{ 11 };
lex2["Myra", 0] = 12;
// merge with existing Lexicon
lex.AddRange(lex2);
lex.Remove(new
KeyValuePair<string,
int>("Stan",
4)); // effectively remove Stan
lex.RemoveAt("Mary", 0);
// effectively remove Mary
// iterate the Lexicon again
Console.WriteLine("\nFollowing
a number of changes, the lexicon now contains\n");
foreach (KeyValuePair<string,
int> kvp in lex)
{
Console.WriteLine("{0}
: {1}", kvp.Key, kvp.Value);
}
Console.WriteLine("\nNora
has a value of 3 at index {0}", lex.IndexOfValue("Nora",
3));
lex["Zena",
1] = 1; // add a new value for Zena
if
(lex.ContainsValue(12))
{
Console.WriteLine("The
lexicon contains a value of 12");
}
string k;
if (lex.ContainsValue(5,
out k)) Console.Write("{0}
had a value of 5 ", k);
lex.ChangeValue(k, 5, 2);
if (lex[k, 0] == 2)
Console.WriteLine("but
now has a value of 2", k);
Console.WriteLine("\nThe
following key/index pairs have a value of 2\n");
foreach (KeyValuePair<string,
int> kip in
lex.FindKeyIndexPairs(2))
{
Console.WriteLine("Key
: {0} Value : 2 Index : {1}", kip.Key, kip.Value);
}
Console.ReadKey();
}
}
Example output
A screenshot of the output is shown below:
Conclusion
I hope you will find the Lexicon class to be a useful adjunct to the other
'Dictionary' classes in the .NET Framework.
It is not intended as a replacement for the generic Dictionary class and should
only be used in situations where keys may be duplicated. As a List object needs
to be assigned to each key, it will clearly use more memory and be slightly
slower than an 'ordinary' Dictionary. This is not ideal in situations where the
duplicated keys are relatively sparse but an inevitable consequence of the
strategy used.
I am currently working on sorted versions of the Lexicon and a 'ToLexicon'
extension method for use in LINQ queries . I am also considering the use of a
different strategy to address the issue of Lexicons with sparse duplicate keys.
The results will be the subject of a future article.