XML is still widely used in applications. My recent use of XML comes mostly from REST services. REST service responses nowadays mostly use JSON, but XML is still used as well. Most services I have dealt with use camel casing for property names, that is, a property is named customerName rather than CustomerName (with a capital C).
This write-up shows how to apply a naming pattern (in this example, camel casing) to XML serialization and deserialization without having to use a ton of attributes.
For my example, I have created a simple, silly class with multiple property types. I will show how to serialize it using a mix of XML attributes and attribute overrides.
Why would I want to do this?
Because, in a real-life scenario, it frees me from writing thousands of attributes next to thousands of object properties. I’d rather push the behavior one level deeper into the logic stack, reducing both the amount of duplicated code and the number of places that need fixing when the code has a bug. This will become clearer by the end of this post.
I have defined a few classes with a variety of member types. Notice that the Allowance property has an XmlElement attribute (with element name WEEKLY_MONEY) attached to it. There are occasions when the serialized tag name has to differ from the property name. For these cases, you will need to add an attribute in code, just like I did with “Allowance”. Here is the code that defines the classes to be serialized:
- using System;
- using System.Collections.Generic;
- using System.Xml.Serialization;
-
- public enum Gender
- {
- Female, Male
- }
-
- public class Book
- {
- public string Title { get; set; }
- public string Color { get; set; }
- }
-
- public class MyCustomer
- {
- public string NameAndLastName { get; set; }
- public int Age { get; set; }
- public Double Height { get; set; }
- public Gender Gender { get; set; }
- public List<Book> ReadingPile { get; set; }
- public Boolean Active { get; set; }
-
- [XmlElement(ElementName = "WEEKLY_MONEY")]
- public Decimal Allowance { get; set; }
-
- public MyCustomer ReferredBy { get; set; }
- }
Notice the simplicity of how these classes are defined: XML attributes are only minimally present. And the property naming in the class definition follows the C# naming convention (PascalCasing), even though the properties will be serialized using camel casing.
In order to serialize this object to XML using camel casing, without polluting my code with attributes that in real life are repeated thousands of times (ad nauseam), I have used XML attribute overrides. These are not simple to use, and I spent quite some time getting them to work. But patience pays off: thousands of lines of code will be saved.
XML attribute overrides, by definition, do just that: they override any attribute present on a class or property. If no attribute is present, the override is applied anyway. But I want the inverse behavior, i.e., the overrides are the normal behavior, and any attributes present in code “override the override”. To accomplish this, I need to check every processed class and property for attributes. This is done using reflection. Wherever I find an attribute in code, I do not apply the override.
A note on the simplicity of my code: my attribute detection is crude. I am just checking for attributes whose type name starts with “Xml”. This is not real-life code; it just illustrates how the feature is used.
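To see why the reflection check is needed, here is a minimal sketch of the default behavior. The Wallet class and the OverrideWinsDemo helper are made up for this illustration; only XmlSerializer, XmlAttributeOverrides, XmlAttributes and XmlElementAttribute come from the framework. When an override is registered for a property, the attribute written in code is ignored:

```csharp
using System;
using System.IO;
using System.Xml.Serialization;

// Hypothetical class: one property carrying an in-code XmlElement attribute.
public class Wallet
{
    [XmlElement(ElementName = "WEEKLY_MONEY")]
    public decimal Allowance { get; set; }
}

public static class OverrideWinsDemo
{
    // Serializes a Wallet to a string using whatever overrides are passed in.
    public static string Serialize(Wallet wallet, XmlAttributeOverrides overrides)
    {
        var serializer = new XmlSerializer(typeof(Wallet), overrides);
        using (var writer = new StringWriter())
        {
            serializer.Serialize(writer, wallet);
            return writer.ToString();
        }
    }

    // Builds an override that renames Allowance to "allowance".
    public static XmlAttributeOverrides CamelOverrideForAllowance()
    {
        var attributes = new XmlAttributes();
        attributes.XmlElements.Add(new XmlElementAttribute("allowance"));

        var overrides = new XmlAttributeOverrides();
        overrides.Add(typeof(Wallet), nameof(Wallet.Allowance), attributes);
        return overrides;
    }

    public static void Main()
    {
        var wallet = new Wallet { Allowance = 21.61m };

        // No override registered: the in-code attribute names the element.
        Console.WriteLine(Serialize(wallet, new XmlAttributeOverrides()));

        // Override registered: it replaces the in-code attribute.
        Console.WriteLine(Serialize(wallet, CamelOverrideForAllowance()));
    }
}
```

The first document should contain a WEEKLY_MONEY element and the second an allowance element, which is exactly the behavior that has to be inverted by skipping properties that already carry an attribute.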
Finally, before moving on to the code, I will give a very brief explanation of how XML attribute overrides work, just the minimum knowledge necessary to get the job done.
There are many classes in .NET related to attribute overrides, but the two most important ones are XmlAttributeOverrides and XmlAttributes (both plural). The first is “the bag” of definitions which I will send to the serializer. It tells the serializer “Hey, this is how I want you to serialize the classes I specify here”. For any class not in “the bag”, the serializer will apply the default behavior. The second, XmlAttributes, is the collection of attributes to apply to one class or to one property. It defines the behavior (whether the member is serialized as an XmlElement, gets an XmlRoot, is an XmlArray with XmlArrayItem, etc.). So the rule is to create one XmlAttributes object for each class, and one for each property in the class. Once these are defined, they are all thrown into the bag (added to XmlAttributeOverrides) to be sent to the serializer.
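Before automating this with reflection, it may help to see the bag filled by hand. The sketch below is only an illustration: the Shelf class, the ManualOverridesDemo helper and the tag names are invented here. It creates one XmlAttributes object for the class (carrying XmlRoot) and one for each property (XmlElement for a simple value, XmlArray plus XmlArrayItem for a list), then adds them all to XmlAttributeOverrides:

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml.Serialization;

// Hypothetical class with no XML attributes at all.
public class Shelf
{
    public string OwnerName { get; set; }
    public List<string> Titles { get; set; }
}

public static class ManualOverridesDemo
{
    public static XmlAttributeOverrides BuildBag()
    {
        var bag = new XmlAttributeOverrides();

        // One XmlAttributes object for the class itself.
        var classAttributes = new XmlAttributes { XmlRoot = new XmlRootAttribute("shelf") };
        bag.Add(typeof(Shelf), classAttributes);

        // One XmlAttributes object for a simple property.
        var ownerAttributes = new XmlAttributes();
        ownerAttributes.XmlElements.Add(new XmlElementAttribute("ownerName"));
        bag.Add(typeof(Shelf), nameof(Shelf.OwnerName), ownerAttributes);

        // One XmlAttributes object for a collection property:
        // XmlArray names the wrapper, XmlArrayItem names each item.
        var titlesAttributes = new XmlAttributes { XmlArray = new XmlArrayAttribute("titles") };
        titlesAttributes.XmlArrayItems.Add(new XmlArrayItemAttribute("title"));
        bag.Add(typeof(Shelf), nameof(Shelf.Titles), titlesAttributes);

        return bag;
    }

    public static string Serialize(Shelf shelf)
    {
        var serializer = new XmlSerializer(typeof(Shelf), BuildBag());
        using (var writer = new StringWriter())
        {
            serializer.Serialize(writer, shelf);
            return writer.ToString();
        }
    }

    public static void Main()
    {
        var shelf = new Shelf { OwnerName = "Ann", Titles = new List<string> { "Dune", "Emma" } };
        Console.WriteLine(Serialize(shelf));
    }
}
```

Doing this by hand for every class and property is the duplication that a reflection-based builder is meant to remove.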
The following code adds the attributes for all the properties in my classes using reflection. This code does not cover every scenario under the sun, so if you use it, most likely you will have to consider a few more cases to match your application needs, and apply more logic for handling exceptions and attribute detection.
This example:
- Adds an XmlRoot attribute to the outermost element.
- Sets a default tag name, which is the camel case representation of the property name.
- Leaves a property's serializing behavior unchanged if the property has an Xml attribute attached to it (no overrides are applied).
- Wraps collection items (List and Array only) in a “COLLECTIONxxx” tag.
- Uses UPPER case in places to highlight things in the serialized output. This is just for ease of finding the important parts; it is not the right formatting for a real-life scenario.
The code uses some small “helper” methods, which I list here first. You may want to skip them and refer to them after you study the main code, if you feel you still need to check them. These helpers include a few simple string extension methods as well.
HELPERS
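- using System;
- using System.Collections.Generic;
- using System.IO;
- using System.Linq;
- using System.Reflection;
- using System.Text;
- using System.Xml.Serialization;
-
- public static class Overrides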
- {
- public static string ToPascal(this string s) { return ChangeCasing(s, Char.ToUpper); }
- public static string ToCamel(this string s) { return ChangeCasing(s, Char.ToLower); }
-
- private static string ChangeCasing(string s, Func<Char, Char> convert)
- {
- return string.IsNullOrWhiteSpace(s) ? s : string.Format("{0}{1}", convert(s[0]), s.Substring(1));
- }
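- }
-
- // The private helpers below live in the same class as CreateAttributeOverrides, which calls them.
- private static HashSet<Type> GetTypesToOverride(Type objectType)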
- {
- var returnValue = new HashSet<Type>();
- returnValue.Add(objectType);
- Type elementType = objectType.GetElementType();
- if (elementType != null)
- returnValue.UnionWith(GetTypesToOverride(elementType));
-
- objectType.GetGenericArguments()
- .Where(t => t != null)
- .ToList()
- .ForEach(t => returnValue.UnionWith(GetTypesToOverride(t)));
-
- returnValue.RemoveWhere(t => t == null || t.FullName.StartsWith("System"));
- return returnValue;
- }
-
- private static bool HasXmlAttributes(MemberInfo minfo)
- {
- List<Attribute> xmlAttributes = minfo.GetCustomAttributes()
- .Where(t => {
- string typeName = t.GetType().Name;
- return typeName.StartsWith("Xml") && !"XmlObjectWrapperAttribute".Equals(typeName);
- })
- .ToList();
-
- return xmlAttributes.Count > 0;
- }
In the above code, HasXmlAttributes takes a MemberInfo parameter. This allows the method to handle arguments of type Type or PropertyInfo, since both derive from MemberInfo.
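The name-prefix check is intentionally crude. One possible tightening, sketched here under my own assumptions rather than taken from the original code, is to compare each attribute's namespace with System.Xml.Serialization instead of looking at its name. The names AttributeDetection, Sample and XmlLookalikeAttribute are hypothetical, and this variant does not reproduce the special case for XmlObjectWrapperAttribute in the helper above:

```csharp
using System;
using System.Linq;
using System.Reflection;
using System.Xml.Serialization;

// Hypothetical attribute whose name starts with "Xml" but has nothing to do with serialization.
[AttributeUsage(AttributeTargets.Property)]
public class XmlLookalikeAttribute : Attribute { }

public class Sample
{
    [XmlElement("TAGGED")]
    public string Tagged { get; set; }

    [XmlLookalike]
    public string Lookalike { get; set; }

    public string Plain { get; set; }
}

public static class AttributeDetection
{
    // Namespace shared by XmlElementAttribute, XmlRootAttribute, XmlArrayAttribute, etc.
    private static readonly string SerializationNamespace =
        typeof(XmlElementAttribute).Namespace;

    // Works for both Type and PropertyInfo, since both derive from MemberInfo.
    public static bool HasSerializationAttributes(MemberInfo member)
    {
        return member.GetCustomAttributes(true)
            .Any(a => a.GetType().Namespace == SerializationNamespace);
    }

    public static void Main()
    {
        foreach (PropertyInfo property in typeof(Sample).GetProperties())
        {
            Console.WriteLine("{0}: {1}", property.Name, HasSerializationAttributes(property));
        }
    }
}
```

With this check, a property decorated only with the lookalike attribute is still treated as having no serialization attributes, so an override would be applied to it.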
CORE FUNCTIONALITY
- public static XmlAttributeOverrides CreateAttributeOverrides(Type objectType)
- {
- Func<Type, bool> IsList = t => t.IsGenericType &&
- t.GetGenericTypeDefinition().IsAssignableFrom(typeof(List<>));
-
- HashSet<Type> workTypes = GetTypesToOverride(objectType);
- XmlAttributeOverrides theBag = new XmlAttributeOverrides();
- var classNames = new HashSet<string>();
- classNames.Add(objectType.FullName);
-
- while (workTypes.Count > 0)
- {
- Type singleType = workTypes.First();
- workTypes.Remove(singleType);
- if (HasXmlAttributes(singleType))
- continue;
-
- if (singleType == objectType)
- theBag.Add(objectType, new XmlAttributes()
- {
- XmlRoot = new XmlRootAttribute("OUTEROBJECT")
- });
-
- PropertyInfo[] allPropsInfo = singleType.GetProperties();
- foreach (PropertyInfo propInfo in allPropsInfo)
- {
- HashSet<Type> overridableTypes = GetTypesToOverride(propInfo.PropertyType);
- bool propHasXmlAttributes = HasXmlAttributes(propInfo);
- if (propHasXmlAttributes)
- overridableTypes.Remove(propInfo.PropertyType);
-
- // Queue only the types not seen before; this also keeps circular references from looping forever.
- overridableTypes.RemoveWhere(t => classNames.Contains(t.FullName));
- classNames.UnionWith(overridableTypes.Select(t => t.FullName));
- workTypes.UnionWith(overridableTypes);
- if (propHasXmlAttributes)
- continue;
-
- string camelName = propInfo.Name.ToCamel();
- var propOverrides = new XmlAttributes();
- Type propType = propInfo.PropertyType;
- if (propType.IsArray || IsList(propType))
- {
- string pascalName = propInfo.Name.ToPascal();
- propOverrides.XmlArray = new XmlArrayAttribute("COLLECTION" + pascalName);
- propOverrides.XmlArrayItems.Add(new XmlArrayItemAttribute(camelName));
- }
- else
- {
- propOverrides.XmlElements.Add(new XmlElementAttribute(camelName));
- }
-
- theBag.Add(singleType, propInfo.Name, propOverrides);
- }
- }
-
- return theBag;
- }
So, this code does what I described. Some notes on the non-obvious parts:
- There is a set called workTypes which holds the classes still to be processed. Classes are added as they are discovered while going through the properties, starting from the outermost class, and each one is removed when its turn comes. Processing ends when the set is empty.
- There is a HashSet<String> called classNames where the names of all classes seen so far are kept. If a class is already in the “bag”, it should not be added again, and this set takes care of that. It also avoids an infinite loop when properties contain circular dependencies. No items are ever removed from this set.
- If a class name starts with “System”, it’s a .NET class and does not need to be processed. If a system class uses non-system classes (e.g. MyClass1[], List<MyClass2>), then those non-system element and generic argument types are included for processing.
- Attribute overrides keep being added until workTypes is empty.
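The traversal described in these notes can be looked at in isolation. The following is a simplified sketch, not the code above: it uses a queue plus a visited set instead of workTypes and classNames, it recognizes framework types by namespace rather than by FullName prefix, and the Node, Leaf and TypeWalker names are invented for the illustration. It only collects the types that would need overrides:

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;

// Hypothetical classes: Node refers back to itself and wraps Leaf in a List.
public class Node
{
    public string Label { get; set; }
    public Node Parent { get; set; }
    public List<Leaf> Leaves { get; set; }
}

public class Leaf
{
    public int Weight { get; set; }
}

public static class TypeWalker
{
    public static HashSet<Type> CollectTypes(Type root)
    {
        var visited = new HashSet<Type>();   // never shrinks; guards against cycles
        var work = new Queue<Type>();        // types still waiting to be inspected
        work.Enqueue(root);

        while (work.Count > 0)
        {
            Type current = work.Dequeue();

            // Arrays are unwrapped to their element type.
            if (current.IsArray)
            {
                work.Enqueue(current.GetElementType());
                continue;
            }

            // Generic arguments (e.g. Leaf in List<Leaf>) are inspected too.
            foreach (Type argument in current.GetGenericArguments())
                work.Enqueue(argument);

            // Framework types are skipped; each remaining type is expanded once.
            if (IsFrameworkType(current) || !visited.Add(current))
                continue;

            foreach (PropertyInfo property in current.GetProperties())
                work.Enqueue(property.PropertyType);
        }

        return visited;
    }

    private static bool IsFrameworkType(Type type)
    {
        string ns = type.Namespace ?? string.Empty;
        return ns == "System" || ns.StartsWith("System.");
    }

    public static void Main()
    {
        foreach (Type type in CollectTypes(typeof(Node)))
            Console.WriteLine(type.Name);
    }
}
```

Because Node is added to the visited set the first time it is seen, the Parent property does not cause a second pass over Node, which is the same role classNames plays in the full code.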
That’s it. The return value is the set of overrides to be sent to the serializer. And now, I will show how to call that serializer. The following code does the job.
I assume my result will be small, so serializing into a string is not too much of a fault. The method below is a static member of a class I named XmlCamelSerializer, which is not shown here (it is just a wrapper for this method and the one above, the one that creates the overrides).
- public static string Serialize(object obj)
- {
- Type objectType = obj.GetType();
- XmlAttributeOverrides xmlOverrides = CreateAttributeOverrides(objectType);
- XmlSerializer serializer = new XmlSerializer(objectType, xmlOverrides);
- using (MemoryStream memStream = new MemoryStream())
- using (StreamWriter writer = new StreamWriter(memStream))
- {
- serializer.Serialize(writer, obj);
- // Make sure everything buffered by the writer has reached the stream before reading it back.
- writer.Flush();
- return Encoding.UTF8.GetString(memStream.ToArray());
- }
- }
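Two things are not covered by the method above: the reverse direction, and serializer reuse. The sketch below is an assumption-laden illustration of both. Ticket, RoundTripDemo and BuildOverrides are hypothetical, and BuildOverrides is only a hard-coded stand-in for CreateAttributeOverrides so that the example is self-contained. Deserialization needs a serializer built with the same overrides that were used for writing. In addition, the .NET Framework documentation notes that serializers created with the (Type, XmlAttributeOverrides) constructor generate a new dynamic assembly every time and are not cached by the framework, so keeping one instance per type is a common precaution:

```csharp
using System;
using System.Collections.Concurrent;
using System.IO;
using System.Xml.Serialization;

// Hypothetical class used only for this round trip.
public class Ticket
{
    public string SeatNumber { get; set; }
}

public static class RoundTripDemo
{
    // One serializer per type, created on first use and then reused.
    private static readonly ConcurrentDictionary<Type, XmlSerializer> Cache =
        new ConcurrentDictionary<Type, XmlSerializer>();

    // Stand-in for CreateAttributeOverrides: hard-coded camel casing for Ticket.
    private static XmlAttributeOverrides BuildOverrides()
    {
        var attributes = new XmlAttributes();
        attributes.XmlElements.Add(new XmlElementAttribute("seatNumber"));

        var bag = new XmlAttributeOverrides();
        bag.Add(typeof(Ticket), nameof(Ticket.SeatNumber), attributes);
        return bag;
    }

    private static XmlSerializer GetSerializer(Type type)
    {
        return Cache.GetOrAdd(type, t => new XmlSerializer(t, BuildOverrides()));
    }

    public static string Serialize(object obj)
    {
        using (var writer = new StringWriter())
        {
            GetSerializer(obj.GetType()).Serialize(writer, obj);
            return writer.ToString();
        }
    }

    // Reads the XML back using the same overrides that produced it.
    public static T Deserialize<T>(string xml)
    {
        using (var reader = new StringReader(xml))
        {
            return (T)GetSerializer(typeof(T)).Deserialize(reader);
        }
    }

    public static void Main()
    {
        string xml = Serialize(new Ticket { SeatNumber = "12A" });
        Console.WriteLine(xml);
        Console.WriteLine(Deserialize<Ticket>(xml).SeatNumber);
    }
}
```

This version writes to a StringWriter, so the XML declaration reports utf-16 rather than utf-8; the MemoryStream approach above is the one to keep if the declared encoding matters.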
And now, how to exercise it? I use NUnit 3 for testing (easy and fast to use), so I wrote a test, although it’s not really a test. It’s just a convenient way to exercise my code.
- [Test]
- public void TestCustomer()
- {
- var cust2 = new MyCustomer()
- {
- Age = 32,
- Gender = Gender.Female,
- NameAndLastName = "Leann Lunn",
- Height = 6.01,
- ReadingPile = new List<Book>(),
- Allowance = 21.61m
- };
-
- var cust = new MyCustomer()
- {
- NameAndLastName = "Lee Leevan",
- Age = 35,
- Gender = Gender.Male,
- Height = 6.01,
- ReferredBy = cust2,
- Active = true,
- Allowance = 21.61m,
- ReadingPile = new List<Book> {
- new Book() { Title = "C# for Dummies", Color = "Yellow" },
- new Book() { Title = "The Black Box", Color = "Orange" },
- new Book() { Title = "The Red Sea", Color = "White" }
- }
- };
-
- string serFoo = XmlCamelSerializer.Serialize(cust);
- Console.WriteLine(serFoo);
- }
Below is the output from the serialization. Notice how properties have camel casing, and I avoided the massive proliferation of attributes in my code.
- <?xml version="1.0" encoding="utf-8"?>
- <OUTEROBJECT xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
- <nameAndLastName>Lee Leevan</nameAndLastName>
- <age>35</age>
- <height>6.01</height>
- <gender>Male</gender>
- <COLLECTIONReadingPile>
- <readingPile>
- <title>C# for Dummies</title>
- <color>Yellow</color>
- </readingPile>
- <readingPile>
- <title>The Black Box</title>
- <color>Orange</color>
- </readingPile>
- <readingPile>
- <title>The Red Sea</title>
- <color>White</color>
- </readingPile>
- </COLLECTIONReadingPile>
- <active>true</active>
- <WEEKLY_MONEY>21.61</WEEKLY_MONEY>
- <referredBy>
- <nameAndLastName>Leann Lunn</nameAndLastName>
- <age>32</age>
- <height>6.01</height>
- <gender>Female</gender>
- <COLLECTIONReadingPile />
- <active>false</active>
- <WEEKLY_MONEY>21.61</WEEKLY_MONEY>
- </referredBy>
- </OUTEROBJECT>
As you can see, the in-code attribute ("WEEKLY_MONEY") was not replaced with a camel case name, because the XML attribute in the class definition was respected. The root attribute was applied (the root node has the tag "OUTEROBJECT"), and all other nodes were serialized using camel casing.
CONCLUSION
When you see lots of duplicate code, find a better way. Seeing a lot of XML and JSON attributes in the code led me to look for a way to avoid the massive code duplication and find simplicity. In this case, XmlAttributeOverrides came to the rescue. It was not easy. Getting this to work may take a while because the documentation is not easy to understand, the class names are probably not the best, and their use is anything but intuitive. But the final result has very few lines of code.
These few lines of code will surely avoid thousands of lines of repetitive code in a system where serialization and deserialization are heavily used (REST service clients, for example). I will soon follow this post with an explanation of how to serialize List responses to produce a List<T> object instead of having to create useless and silly empty wrapper classes when a collection is returned.