Tuesday, March 31, 2015

Gotcha With Dictionary Copies

Last week I found a little “gotcha” when I had the need to make a copy of an existing Dictionary object. The Dictionary class has a constructor that takes an existing Dictionary object and makes a copy of it to a new object (https://msdn.microsoft.com/en-us/library/et0ke8sz(v=vs.110).aspx), so I thought that was the perfect way to handle this. Unfortunately, I was wrong.

Here’s the scenario. I am listening to a TCP port and storing the incoming data in a Dictionary<string, MyData>, either adding new items or updating existing ones. Since data can arrive at a pretty fast rate, I didn’t want to process it as it came in. Instead, at 1-minute intervals, I copy the Dictionary and process the copy while the TCP Listener goes on grabbing data and putting it into the original Dictionary. We can test this with something like this:

// First, here's the data class
private class MyData
{
    public int MyInt { get; set; }
    public string MyString { get; set; }

    public MyData(int start)
    {
        this.MyInt = start;
        this.MyString = "Initialize ...";
    }

    public override string ToString()
    {
        return string.Format("{0}: {1}", this.MyString, this.MyInt);
    }
}

// And a quick method to show the data:
private void PrintDictionary<T>(string message, Dictionary<string, T> test)
{
    Console.WriteLine(message);
    // Iterate the key/value pairs directly instead of looking up each key again
    foreach (var pair in test)
    {
        Console.WriteLine("{0}: {1}", pair.Key, pair.Value);
    }
}

// Now the testing begins
Dictionary<string, MyData> Orig = new Dictionary<string, MyData>();
Dictionary<string, MyData> Copy;

// put data in original
Orig.Add("a", new MyData(1));
Orig.Add("b", new MyData(2));
Orig.Add("c", new MyData(3));

// now let's make a copy
Copy = new Dictionary<string, MyData>(Orig);

this.PrintDictionary<MyData>("*** Original ***", Orig);
this.PrintDictionary<MyData>("*** Copy is the same ***", Copy);

Looks pretty simple, but unfortunately, that didn’t work as planned, because the Dictionary copy is actually a “shallow” copy instead of a “deep” copy. Nowhere in the documentation (link above) does it say anything about this being a shallow copy. Probably most of you know the difference between a shallow and a deep copy, but let me explain for those who might not.
  • A shallow copy copies each value in the original as-is.
    • If the values are reference types (class instances), only the references are copied, so the original and the copy share the same objects.
    • If the values are value types, such as int, the actual values are copied. A string is technically a reference type, but because strings are immutable, a shared string behaves like a copied value.
  • A deep copy also copies the data that the values refer to.
    • If the values are reference types, each object is copied (a new instance, not just a reference to the existing object).
    • For value types, there’s no difference, since either kind of copy copies the actual value.
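The difference is easy to see in a tiny stand-alone program. This is only a minimal sketch: the Box class and the ShallowCopyDemo method names are made up for illustration and are not part of my real code. It uses the same Dictionary copy constructor, once with a class as the value type and once with int.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical mutable reference type, used only for this illustration.
public class Box
{
    public int Value { get; set; }
}

public static class ShallowCopyDemo
{
    // Returns true when the copy constructor left both dictionaries
    // pointing at the very same Box instance.
    public static bool CopySharesInstances()
    {
        var orig = new Dictionary<string, Box> { { "a", new Box { Value = 1 } } };
        var copy = new Dictionary<string, Box>(orig);
        return object.ReferenceEquals(orig["a"], copy["a"]);
    }

    // Returns the value held by the copy after the original's int entry is overwritten.
    public static int CopiedIntAfterOriginalChanges()
    {
        var orig = new Dictionary<string, int> { { "a", 1 } };
        var copy = new Dictionary<string, int>(orig);
        orig["a"] = 42; // replaces the value stored in orig only
        return copy["a"];
    }

    public static void Main()
    {
        Console.WriteLine(CopySharesInstances());           // True
        Console.WriteLine(CopiedIntAfterOriginalChanges()); // 1
    }
}
```

With a class as the value, both dictionaries end up holding the same instance. With int, the copy keeps its own value.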

So what does this mean in plain English? For class types, such as the MyData class in the Dictionary I’ve defined in the above code snippet, a shallow copy means that Orig and Copy are both pointing to (referencing) the same data! If I make a change to an item in Orig, you’ll see the same change in Copy too. I needed a deep copy, meaning that changes to items in Orig did not affect the items in Copy.

Put this code after the above code snippet to see that the items in Orig do indeed point to the same items in Copy:

// now let's modify the original
MyData oldData = Orig["b"];
oldData.MyString = "Made a change";
oldData.MyInt = 42;

this.PrintDictionary<MyData>("*** Original After Modifying Existing Item ***", Orig);
this.PrintDictionary<MyData>("*** Copy changes ALSO!!! ***", Copy);

// Why did the copy change? Because when the original was copied,
// the MyData objects themselves were not copied; only the
// references (the pointers) to them were copied.
// Orig["b"] and Copy["b"] refer to the same MyData instance, so a
// change made through one is visible through the other.

Once I discovered that the copy was not a deep copy, rather than research ways to change that behavior in the copy itself, for a quick fix I simply created a new MyData object when I needed to update an item in the Orig dictionary:

// I first got around this problem by creating a new class object instead of modifying existing
MyData newData = new MyData(666);
newData.MyString = "Brand New Instance";
Orig["b"] = newData;

this.PrintDictionary<MyData>("*** Original With New Item ***", Orig);
this.PrintDictionary<MyData>("*** Copy will NOT change ***", Copy);

I needed a quick fix so I could get the code deployed, but while writing this blog post I had some time to research this more and found that there are a few other options. Here are some links to decent articles and other discussions:
 
 
I decided to make use of the ICloneable interface for MyData class and add an extension method for Dictionary. Here’s what I did:
 
// First, make MyData implement ICloneable
private class MyData : ICloneable
{
    public int MyInt { get; set; }
    public string MyString { get; set; }

    public MyData(int start)
    {
        this.MyInt = start;
        this.MyString = "Initialize ...";
    }

    // This protected constructor is just for the purpose of cloning/copying
    protected MyData(MyData cloneFrom)
    {
        this.MyInt = cloneFrom.MyInt;
        this.MyString = cloneFrom.MyString;
    }

    public override string ToString()
    {
        return string.Format("{0}: {1}", this.MyString, this.MyInt);
    }

    #region ICloneable Members

    public object Clone()
    {
        // For classes that contain members that are more complex,
        // utilize a protected constructor just for the purpose of cloning/copying
        return new MyData(this);

        // For this particular class, I could also have done this
        // using MemberwiseClone, which does a shallow copy.
        // Since my class has only an int and a string (strings are immutable),
        // a shallow copy is fine.
        //return this.MemberwiseClone();
    }

    #endregion
}

public static class Extensions
{
    // An extension method for Dictionary. Notice that it is limited to TValue
    // types that implement ICloneable.
    // Requires "using System.Linq;" for ToDictionary.
    public static Dictionary<TKey, TValue> Clone<TKey, TValue>(this Dictionary<TKey, TValue> source) where TValue : ICloneable
    {
        // Passing source.Comparer keeps the original's key comparer (for example, a case-insensitive one)
        return source.ToDictionary(item => item.Key, item => (TValue)item.Value.Clone(), source.Comparer);
    }
}
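If the value type can’t implement ICloneable (say, it comes from a library you don’t own), a variation is to let the caller pass in the function that copies one value. This is only a sketch of that idea, not something I deployed: DeepCopy, Reading, and DeepCopyDemo are hypothetical names invented for this example.

```csharp
using System;
using System.Collections.Generic;

public static class DictionaryCopyExtensions
{
    // Hypothetical helper: the caller supplies the function that copies one value,
    // so TValue does not need to implement ICloneable.
    public static Dictionary<TKey, TValue> DeepCopy<TKey, TValue>(
        this Dictionary<TKey, TValue> source, Func<TValue, TValue> copyValue)
    {
        // Keep the same key comparer and pre-size the new dictionary.
        var result = new Dictionary<TKey, TValue>(source.Count, source.Comparer);
        foreach (KeyValuePair<TKey, TValue> pair in source)
        {
            result.Add(pair.Key, copyValue(pair.Value));
        }
        return result;
    }
}

// Hypothetical value type for the demo; it does not implement ICloneable.
public class Reading
{
    public int Count { get; set; }
}

public static class DeepCopyDemo
{
    // Returns the Count seen through the copy after the original item is modified.
    public static int CopiedCountAfterOriginalChanges()
    {
        var orig = new Dictionary<string, Reading> { { "y", new Reading { Count = 2 } } };
        var copy = orig.DeepCopy(r => new Reading { Count = r.Count });
        orig["y"].Count = 42; // modifies the original's instance only
        return copy["y"].Count;
    }

    public static void Main()
    {
        Console.WriteLine(CopiedCountAfterOriginalChanges()); // 2
    }
}
```

The trade-off is that every call site has to know how to copy a value, whereas with ICloneable that knowledge lives in the class itself.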

Now let’s test this using the following code:

Dictionary<string, MyData> Orig = new Dictionary<string, MyData>();
Dictionary<string, MyData> Copy;

// put data in original
Orig.Add("x", new MyData(1));
Orig.Add("y", new MyData(2));
Orig.Add("z", new MyData(3));

// now let's make a cloned copy, using the Dictionary Clone() extension method
Copy = Orig.Clone();

this.PrintDictionary<MyData>("*** Original ***", Orig);
this.PrintDictionary<MyData>("*** Copy created from Clone of Orig ***", Copy);

// now let's modify the original
MyData oldData = Orig["y"];
oldData.MyString = "Made a change";
oldData.MyInt = 42;

this.PrintDictionary<MyData>("*** Original After Modifying Existing Item ***", Orig);
this.PrintDictionary<MyData>("*** Copy will NOT change!!! ***", Copy);

Yay! It works! This is much “cleaner” than my quick-and-dirty original solution to the problem, which I’ve already deployed, so I’m not going to change it now. Next time I need to make a change to that particular bit of code, I think I’ll refactor it to make use of ICloneable. I like it.

Happy coding!