This is the first of four parts of this article:
- Introduction - Defines the goals and desired features of the solution.
- Design - Outlines the solution design and explains the critical details that
the solution will have to implement.
- Implementation - Explains the actual implementation of the formatter classes.
- Example - Lists numerous examples of the formatters' use.
The last part of the article will have a compressed file attached, containing
the complete and commented source code of the formatter classes, the source code
of the demonstration project and the compiled library.
Introduction
Converting objects to strings is an everyday programming task which can be
viewed in one of two basic forms: serialization or formatting. Serialization is
a reversible process in which an object is converted to a string (or some other
form) from which it can later be restored to its original state. Formatting is a
less rigid operation, in which a string is built so that it resembles the
contents of the object, but without a requirement that the object can be
restored from that string later.
In this article we will deal with the problem of formatting. We will address the
question of how to build a string which represents an object of unknown
structure and contents. This is in contrast to the typical situation in which
the programmer formats a string representing an object of known structure, so
that the contents of the string can reflect the object's semantics.
For example, if we have an instance of the Rectangle structure, it could be
formatted like this:
// Rectangle(x, y, width, height): Right = 1 + 3 = 4, Bottom = 2 + 4 = 6
Rectangle r = new Rectangle(1, 2, 3, 4);
Console.WriteLine("({0},{1})-({2},{3})", r.Left, r.Top, r.Right, r.Bottom);
This piece of code produces the following output:
(1,2)-(4,6)
This output shows the upper-left and lower-right corners of the rectangle. But
in order to produce such output, we must know that the represented object is a
rectangle, and also that such a string will be informative to the reader. Under
different circumstances, the user might not be satisfied with this format and
might require something different (e.g. a left-top-right-bottom representation
of the rectangle's coordinates). One could think of several other formats
applicable to the Rectangle structure, let alone other data types. The number of
different formats that could be used in a complex software project is seemingly
endless.
This naturally leads to the question of whether there is a format that could be
applied to all objects uniformly. The answer is certainly yes, simply because
formatting is a loosened form of serialization to string: any serializer can
legitimately be used as a formatter as well. But then the reader might become
quite confused by the formatted string, because serializers typically produce
output which is not really user friendly. Their purpose lies elsewhere, and
readability is not something they need to put much effort into.
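To make the point about serializers concrete, here is a minimal sketch that uses
the standard XmlSerializer as if it were a formatter. The PointData class and
the SerializerAsFormatter helper are hypothetical names introduced only for this
illustration; they are not part of the formatter developed in this article.

```csharp
using System.IO;
using System.Xml.Serialization;

// Hypothetical stand-in type; XmlSerializer needs a public type
// with a public parameterless constructor.
public class PointData
{
    public int X { get; set; }
    public int Y { get; set; }
}

// Hypothetical helper: a serializer pressed into service as a formatter.
public static class SerializerAsFormatter
{
    // Serializes the object to XML and returns the text as the "formatted" string.
    public static string Format(object value)
    {
        var serializer = new XmlSerializer(value.GetType());
        using (var writer = new StringWriter())
        {
            serializer.Serialize(writer, value);
            return writer.ToString();
        }
    }
}
```

Calling SerializerAsFormatter.Format(new PointData { X = 3, Y = 4 }) returns a
complete XML document, in which an XML declaration and namespace attributes
surround the two values the reader is actually interested in.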
So our question regarding formatters needs to be refined. We might ask whether
there is a format that could be applied uniformly to all objects in such a way
that the resulting string is readable and sufficiently informative to the
reader. In other words, we are looking for a formatter that converts any object
to a string in which a human reader can visually search for the needed
information (contained in the original object) and find it without much effort.
Answering this question positively means looking for a formatter which finds the
proper balance between the length of the string presented to the reader and its
informative content, which reflects the contents of the object. In this article
we will present a formatter which attempts to find such a balance. The complete
design process will be explained and full source code given to the reader,
along with numerous examples of its use.
Formatting Goals
When trying to build a string which represents an object of unknown structure,
we must first determine what the goal is. The formatter will have to deal with
quite different objects over its lifetime, and all of them, from the simplest to
the most complex, should be presented as human-readable strings in accordance
with the same set of fixed, predefined formatting rules.
So let's start naming the predefined formatting rules. First of all, every
object is either primitive or consists of other objects. Primitive objects, like
integer values, can be simply formatted as:
int Count = 4
A more complex object, like a Point, which has three properties (IsEmpty, X and
Y), can be represented by simply listing its contained objects:
Point Center { bool IsEmpty=false, int X=3, int Y=4 }
In this case, we have printed the three public properties of the Point structure
on one line. Somebody might prefer a multi-line representation, in which
indentation is used to show which object is a child of which other object:
Point Center = {
    bool IsEmpty = false
    int X = 3
    int Y = 4 }
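The single-line layout shown earlier can be produced by enumerating an object's
public properties through reflection. The following is only a minimal sketch
under stated assumptions, not the formatter classes this article goes on to
design: SamplePoint is a hypothetical stand-in for a type like Point,
OneLineSketch is a hypothetical helper, only a handful of primitive types are
recognized, and nested complex properties are not expanded.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;
using System.Reflection;

// Hypothetical stand-in for a type like Point, so the sketch is self-contained.
public struct SamplePoint
{
    public int X { get; set; }
    public int Y { get; set; }
    public bool IsEmpty { get { return X == 0 && Y == 0; } }
}

// Hypothetical helper; not the formatter classes this article develops.
public static class OneLineSketch
{
    // C# keywords for a few primitive types; any other type prints as Type.Name.
    private static readonly Dictionary<Type, string> Keywords =
        new Dictionary<Type, string>
        {
            { typeof(bool), "bool" },
            { typeof(int), "int" },
            { typeof(double), "double" },
            { typeof(string), "string" }
        };

    private static string TypeName(Type type)
    {
        string keyword;
        return Keywords.TryGetValue(type, out keyword) ? keyword : type.Name;
    }

    // bool.ToString() yields "True"/"False", so booleans are lowercased by hand.
    private static string ValueText(object value)
    {
        if (value == null) return "null";
        if (value is bool) return (bool)value ? "true" : "false";
        return Convert.ToString(value, CultureInfo.InvariantCulture);
    }

    public static string Format(string name, object value)
    {
        if (value == null) throw new ArgumentNullException("value");
        Type type = value.GetType();

        // Primitive objects are printed directly.
        if (Keywords.ContainsKey(type))
            return TypeName(type) + " " + name + " = " + ValueText(value);

        // Other objects list their readable public properties, sorted by name.
        IEnumerable<string> parts = type
            .GetProperties(BindingFlags.Public | BindingFlags.Instance)
            .Where(p => p.CanRead && p.GetIndexParameters().Length == 0)
            .OrderBy(p => p.Name, StringComparer.Ordinal)
            .Select(p => TypeName(p.PropertyType) + " " + p.Name + "="
                         + ValueText(p.GetValue(value, null)));

        return TypeName(type) + " " + name + " { " + string.Join(", ", parts) + " }";
    }
}
```

With these definitions, OneLineSketch.Format("Count", 4) returns
"int Count = 4", and OneLineSketch.Format("Center", new SamplePoint { X = 3, Y = 4 })
returns "SamplePoint Center { bool IsEmpty=false, int X=3, int Y=4 }".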
Things may become even more complex, as in the following example of a Rectangle
structure:
Rectangle Window {
|-- int Bottom = 51
|-- int Height = 42
|-- bool IsEmpty = false
|-- int Left = 14
|-- Point Location = {
|   |-- bool IsEmpty = false
|   |-- int X = 14
|   +-- int Y = 9 }
|-- int Right = 31
|-- Size Size = {
|   |-- int Height = 42
|   |-- bool IsEmpty = false
|   +-- int Width = 17 }
|-- int Top = 9
|-- int Width = 17
|-- int X = 14
+-- int Y = 9 }
In this example we have formatted all public properties of the Rectangle
structure as a tree, taking advantage of the fact that the string is printed in
a monospaced font. However, the public properties exposed by the Rectangle type
overlap so much that at least half of the output is redundant. A general
formatter cannot detect such redundancies and has to live with them. What can be
improved in the example above is to compact the representations of the Location
and Size properties into one line each, rather than spreading them over multiple
lines:
Rectangle Window {
|-- int Bottom=51
|-- int Height=42
|-- bool IsEmpty=false
|-- int Left=14
|-- Point Location { bool IsEmpty=false, int X=14, int Y=9 }
|-- int Right=31
|-- Size Size { int Height=42, bool IsEmpty=false, int Width=17 }
|-- int Top=9
|-- int Width=17
|-- int X=14
+-- int Y=9 }
This output is still full of redundancies, but at least it takes less room to
print.
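The compacting rule used in the last example can be sketched in a few lines. The
code below is an illustration under stated assumptions rather than the design
presented in the following parts: it works on a hypothetical Node model that has
already been filled in by hand (no reflection), it prints a node on one line
when all of its children are leaves, and otherwise it draws the node as a tree
using the "|-- " and "+-- " prefixes from the examples above.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical node model: a leaf carries Value, a composite carries Children.
public sealed class Node
{
    public string Type;
    public string Name;
    public string Value;
    public List<Node> Children = new List<Node>();

    public Node(string type, string name, string value, params Node[] children)
    {
        Type = type;
        Name = name;
        Value = value;
        Children.AddRange(children);
    }

    public bool IsLeaf { get { return Children.Count == 0; } }
}

// Hypothetical helper; not the formatter classes this article develops.
public static class TreeSketch
{
    // Renders a leaf, or a composite made only of leaves, on a single line.
    private static string OneLine(Node node)
    {
        if (node.IsLeaf) return node.Type + " " + node.Name + "=" + node.Value;
        return node.Type + " " + node.Name + " { "
            + string.Join(", ", node.Children.Select(c => OneLine(c))) + " }";
    }

    public static string Format(Node node)
    {
        var lines = new List<string>();
        Append(node, "", "", lines);
        return string.Join("\n", lines);
    }

    // head prefixes the node's own line; indent prefixes its children's lines.
    private static void Append(Node node, string head, string indent, List<string> lines)
    {
        if (node.Children.All(c => c.IsLeaf))
        {
            lines.Add(head + OneLine(node));
            return;
        }

        lines.Add(head + node.Type + " " + node.Name + " {");
        for (int i = 0; i < node.Children.Count; i++)
        {
            bool last = i == node.Children.Count - 1;
            Append(node.Children[i],
                   indent + (last ? "+-- " : "|-- "),
                   indent + (last ? "    " : "|   "),
                   lines);
        }
        lines[lines.Count - 1] += " }"; // close the brace on the last child's line
    }
}
```

Building the Rectangle from the example as a Node whose Location and Size
children each hold only leaves, and passing it to TreeSketch.Format, yields the
same layout as the compacted output above: Location and Size are each printed on
one line, while the Rectangle itself is spread over multiple lines.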
Article continues in General Formatter for .NET 2/4: Design.