The source code and a compiled dll are available for download.
In this article I discuss how to take a string of source code, compile it, and then execute it. There are two ways compiled code can be executed:
1. Create an executable and then run it.
2. Compile it into an Assembly object and directly access the objects and methods within (or save it as a dll).
I decided to tackle this with a dll that I can then reference from any program. So I created a new Class Library called RunTimeCompiler, and thus my namespace was also called RunTimeCompiler. Within this dll there are 4 objects:
1. CompiledExe
2. CompiledAssembly
3. ErrorListing
4. Compiler
CompiledAssembly and CompiledExe are both very simple classes; in fact they could have been structs. Below is the implementation for both:
To try and clarify my ideas, below is a pseudo UML diagram:
CompiledExe:
This object allows you to start the executable. It also returns a Process object, which you can use to monitor the running program.
CompiledAssembly:
This object returns an Assembly, which in turn allows you to directly access its Types and their members.
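As a rough sketch of what two such wrapper classes might look like: the member names below (Path, Run, LoadedAssembly) are assumptions chosen for illustration and are not taken from the downloadable source.

```csharp
using System;
using System.Diagnostics;
using System.Reflection;

namespace RunTimeCompiler
{
    // Wraps the path of an executable produced by the compiler.
    public class CompiledExe
    {
        private readonly string path;

        public CompiledExe(string path)
        {
            if (path == null) throw new ArgumentNullException("path");
            this.path = path;
        }

        // Hypothetical property: where the executable lives on disk.
        public string Path
        {
            get { return path; }
        }

        // Starts the executable and hands back the Process so the caller can monitor it.
        public Process Run()
        {
            return Process.Start(path);
        }
    }

    // Wraps an Assembly produced by the compiler.
    public class CompiledAssembly
    {
        private readonly Assembly assembly;

        public CompiledAssembly(Assembly assembly)
        {
            if (assembly == null) throw new ArgumentNullException("assembly");
            this.assembly = assembly;
        }

        // Hypothetical property: gives access to the Types and members inside the assembly.
        public Assembly LoadedAssembly
        {
            get { return assembly; }
        }
    }
}
```

With wrappers like these, a caller could write something along the lines of compiled.LoadedAssembly.GetTypes() to inspect the compiled code, or exe.Run().WaitForExit() to block until the program finishes.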
ErrorListing:
The ErrorListing struct contains the list of errors that occurred during compilation. I wanted my compiler to automatically show a standard error message such as "Opps! There are problems here" and then let the calling program display the list of errors however it wants. So if you request a CompiledExe or CompiledAssembly and get null back, check Compiler.Errors to see what went wrong. That is the point of the object; below is the implementation:
There is also one enumerated type, CodeType, which defines the language we are compiling. Below is the definition:
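One way such a struct could be put together is sketched here. It assumes the errors arrive as the CompilerErrorCollection that CodeDom produces; the Count and Describe members are hypothetical names used only for illustration.

```csharp
using System.CodeDom.Compiler;
using System.Collections.Generic;

public struct ErrorListing
{
    private readonly List<CompilerError> errors;

    // Copies the errors out of the collection returned by the compiler.
    public ErrorListing(CompilerErrorCollection source)
    {
        errors = new List<CompilerError>();
        foreach (CompilerError error in source)
        {
            errors.Add(error);
        }
    }

    // A default(ErrorListing) has no backing list, so treat that as empty.
    public int Count
    {
        get { return errors == null ? 0 : errors.Count; }
    }

    // Formats each error as one line so the caller can display them as it likes.
    public List<string> Describe()
    {
        List<string> lines = new List<string>();
        if (errors == null) return lines;
        foreach (CompilerError error in errors)
        {
            lines.Add(string.Format("Line {0}, column {1}: {2} {3}",
                error.Line, error.Column, error.ErrorNumber, error.ErrorText));
        }
        return lines;
    }
}
```

Keeping the formatting in a separate method leaves the caller free to show the errors in a list box, a log file, or a console window.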
Compiler:
Compiler is what is known as a Class Factory. This means that Compiler instantiates and returns other Classes. Basically, Compiler will return instances of CompiledExe and instances of CompiledAssembly.
Compiler also plays the role of a Singleton. Strictly speaking it is a static-only class rather than a classic Singleton: it has only class (static) methods and fields, no instance methods or instance fields, and a private constructor, so you never instantiate a copy of Compiler; in fact it will not let you. This means there is only ever one copy of its fields in memory, shared by every caller.
Compiler has a few private fields; below is the list:
Apart from the fields errors, cmdLineParameters and compilationResults, these hold all the information that the Compiler needs in order to compile.
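A skeleton along these lines illustrates the idea of a class with a private constructor and only static state. The fields errors, cmdLineParameters, compilationResults and referencedAssemblies are named in the text; every other name here (code, codeType, outputPath, generateInMemory, generateExecutable, Reset, and the CodeType member names) is an assumption, and errors is typed as CompilerErrorCollection only to keep the sketch self-contained.

```csharp
using System.CodeDom.Compiler;

namespace RunTimeCompiler
{
    // Hypothetical member names for the language enumeration.
    public enum CodeType { CSharp, VisualBasic }

    public sealed class Compiler
    {
        // Fields named in the article.
        private static CompilerErrorCollection errors;
        private static string cmdLineParameters;
        private static CompilerResults compilationResults;
        private static string[] referencedAssemblies;

        // Hypothetical fields describing what to compile and how.
        private static string code;
        private static CodeType codeType;
        private static string outputPath;
        private static bool generateInMemory;
        private static bool generateExecutable;

        // Private constructor: nobody outside the class can create an instance.
        private Compiler() { }

        // Errors from the last compilation, or null if there were none.
        public static CompilerErrorCollection Errors
        {
            get { return errors; }
        }

        // Hypothetical helper that puts every field back to its default.
        public static void Reset()
        {
            errors = null;
            cmdLineParameters = null;
            compilationResults = null;
            referencedAssemblies = null;
            code = null;
            codeType = CodeType.CSharp;
            outputPath = null;
            generateInMemory = true;
            generateExecutable = false;
        }
    }
}
```

In current C# the same intent is usually expressed by declaring the class static, which lets the language itself forbid instantiation.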
To save space and typing effort I decided to have one private method that does most of the work, plus a few public methods that provide various ways of getting to it. The method that does all the work is called Compile.
Before I explain the public methods that you use to compile your code, I should explain the possible outcomes for the two main scenarios:
Compiling An Assembly
1. Do you want the assembly kept only in memory, or written to disk as a dll?
2. Do you want the compiler to give the new assembly references to the assemblies that the current application references (this is useful when compiling simple code), or do you want to specify which assemblies your assembly will reference?
Compiling An Executable
1. Do you want the compiler to give the new executable references to the assemblies that the current application references (this is useful when compiling simple code), or do you want to specify which assemblies your executable will reference?
There are, then, four overloads for compiling an Assembly and two overloads for compiling an executable. (Just to clarify: the point of these methods is to set up the compiler so it knows what to do.)
CompileAssembly Overloads
Overload 1 Compile In Memory, Using Existing Assembly References
Overload 2 Compile .dll To Disk, Using Existing Assembly References
Overload 3 Compile In Memory, Specifying Assembly References
Overload 4 Compile .dll To Disk, Specifying Assembly References
CompileExecutable Overloads
Overload 1 Use Existing Assembly References
Overload 2 Specify Assembly References
These six methods should cover most of the jobs we are likely to want the compiler for. (I make no guarantees here!)
I haven't bothered to explain these methods, purely because they contain no logic; they just set up the Compiler prior to compilation. If you do get stuck, please feel free to contact me.
Now we get down to the hardest part: the compiler itself! The method is private and is only called from one of the above six methods. You may notice that each of the six methods returns the value from Compile(), yet two of the six return a CompiledExecutable object and the other four return a CompiledAssembly object. This works because Compile() returns object. In case you don't already know, everything inherits from the base class object. Therefore if I create either a CompiledAssembly or a CompiledExecutable, I can return it as an object to the method that called Compile, and that method can then cast it back to whichever type was requested.
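The return-as-object-then-cast pattern can be sketched in isolation as follows. The two stub classes and the CompileDispatchSketch class are hypothetical stand-ins, and the real Compile method obviously does far more than pick a type.

```csharp
// Stand-ins for the two result types, kept empty for the sketch.
public class CompiledAssemblyStub { }
public class CompiledExecutableStub { }

public static class CompileDispatchSketch
{
    // Set by the public methods before they call Compile.
    private static bool generateExecutable;

    // The worker returns object so it can hand back either result type (or null on failure).
    private static object Compile()
    {
        if (generateExecutable)
        {
            return new CompiledExecutableStub();
        }
        return new CompiledAssemblyStub();
    }

    public static CompiledAssemblyStub CompileAssembly()
    {
        generateExecutable = false;
        // "as" yields null instead of throwing if Compile returned null or the wrong type.
        return Compile() as CompiledAssemblyStub;
    }

    public static CompiledExecutableStub CompileExecutable()
    {
        generateExecutable = true;
        return Compile() as CompiledExecutableStub;
    }
}
```

Using the as operator rather than a direct cast fits the design described here, because a failed compilation is signalled by null rather than by an exception.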
Compilation really isn't too hard. All you need is an object that implements the ICodeCompiler interface (as a side note, most of the objects that follow live in the System.CodeDom.Compiler namespace). This is the actual compiler itself. So what do I need to do to implement the ICodeCompiler interface, you ask? Well, nothing, assuming you are compiling C# or VB. Take a look at the following code:
This assigns to compiler an object that implements the interface. Cool, eh?
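A minimal sketch of choosing a compiler for each language is given here, assuming the .NET Framework and a CodeType enumeration with the hypothetical members CSharp and VisualBasic. Since .NET 2.0 the ICodeCompiler interface and CodeDomProvider.CreateCompiler() have been marked obsolete in favour of calling the compile methods on the provider directly, so the sketch returns the provider itself.

```csharp
using System;
using System.CodeDom.Compiler;
using Microsoft.CSharp;
using Microsoft.VisualBasic;

// Hypothetical member names for the language enumeration.
public enum CodeType { CSharp, VisualBasic }

public static class ProviderSketch
{
    // Returns the CodeDom provider that knows how to compile the requested language.
    public static CodeDomProvider Create(CodeType codeType)
    {
        switch (codeType)
        {
            case CodeType.CSharp:
                return new CSharpCodeProvider();
            case CodeType.VisualBasic:
                return new VBCodeProvider();
            default:
                throw new ArgumentOutOfRangeException("codeType");
        }
    }
}
```

On older framework versions, calling CreateCompiler() on the returned provider gives the ICodeCompiler object the text refers to.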
Now, in theory all you need to do is compile. However, there are two things we need to do first:
1. Set up the parameters, such as: what is this, an Assembly or an Exe?
2. Sort out the Assembly references
Part 1 - Parameters
This is pretty straightforward. Remember the six previous methods? This is really what they are for (follow the code below):
And that's it for the parameters. Not too hard.
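Setting up the parameters might look roughly like this. CompilerParameters and its GenerateExecutable, GenerateInMemory and OutputAssembly properties are part of System.CodeDom.Compiler; the Build helper and its arguments are assumptions made for the sketch.

```csharp
using System.CodeDom.Compiler;

public static class ParameterSketch
{
    // generateExecutable: true for an exe, false for a library (dll or in-memory assembly).
    // outputPath: where to write the result, or null/empty to keep it in memory only.
    public static CompilerParameters Build(bool generateExecutable, string outputPath)
    {
        CompilerParameters parameters = new CompilerParameters();
        parameters.GenerateExecutable = generateExecutable;

        bool writeToDisk = !string.IsNullOrEmpty(outputPath);
        parameters.GenerateInMemory = !writeToDisk;
        if (writeToDisk)
        {
            parameters.OutputAssembly = outputPath;
        }
        return parameters;
    }
}
```

For example, Build(false, null) describes an in-memory assembly, while Build(true, "Hello.exe") describes an executable written to disk.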
Part 2 - Assembly References
I allowed two options here, if you remember: either you specify which assemblies your code references, or you let the compiler assign the ones the current application already references, which is enough for a simple bit of code.
If you didn't specify any, the private field referencedAssemblies should be null; otherwise it should be an array of Assembly names.
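The two options can be sketched as below. The AddReferences helper is hypothetical; the calls it makes (AppDomain.CurrentDomain.GetAssemblies, Assembly.Location, Assembly.IsDynamic, and the ReferencedAssemblies collection) are standard, with IsDynamic requiring .NET Framework 4 or later.

```csharp
using System;
using System.CodeDom.Compiler;
using System.Reflection;

public static class ReferenceSketch
{
    public static void AddReferences(CompilerParameters parameters, string[] referencedAssemblies)
    {
        if (referencedAssemblies != null)
        {
            // The caller specified the references, so use exactly those.
            parameters.ReferencedAssemblies.AddRange(referencedAssemblies);
            return;
        }

        // Otherwise reference whatever is currently loaded into our own AppDomain.
        foreach (Assembly assembly in AppDomain.CurrentDomain.GetAssemblies())
        {
            // Dynamic assemblies have no file on disk, so the compiler cannot reference them.
            if (assembly.IsDynamic || string.IsNullOrEmpty(assembly.Location))
            {
                continue;
            }
            parameters.ReferencedAssemblies.Add(assembly.Location);
        }
    }
}
```

Skipping assemblies without a location matters in practice, because anything previously compiled in memory would otherwise cause the reference list to contain an empty path.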
This bit of code actually warrants some explanation. The logic itself is simple: did you specify the assemblies? If yes, we load those for you; if not, we load the ones currently available to our program. However, you might very well be asking what an AppDomain is. Well (as I understand it), an AppDomain is the highest level of an application; it is where all the assemblies for that program are loaded, and so on. Using your own AppDomain (obtainable from the class property AppDomain.CurrentDomain) you can see which assemblies are loaded into your application (most will have System.dll), and you can load other assemblies into your application from here too. Another nifty feature is that you can instantiate objects from here using CreateInstance.
That takes an assembly name and then the name of the type you wish to instantiate. This feature is available at the Assembly level as well as the AppDomain level. Anyway, enough of the tangents; back to the compiler.
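As an illustration of instantiating objects this way, the sketch below assumes the two-string overload AppDomain.CreateInstance(assemblyName, typeName), which returns an ObjectHandle that must be unwrapped, and the Assembly-level Assembly.CreateInstance(typeName), which returns the object directly. StringBuilder is used purely as a convenient example type.

```csharp
using System;
using System.Reflection;
using System.Runtime.Remoting;
using System.Text;

public static class CreateInstanceSketch
{
    // AppDomain level: needs the assembly's display name and the full type name.
    public static object CreateViaAppDomain(string assemblyName, string typeName)
    {
        ObjectHandle handle = AppDomain.CurrentDomain.CreateInstance(assemblyName, typeName);
        return handle.Unwrap();
    }

    // Assembly level: the assembly is already in hand, so only the type name is needed.
    public static object CreateViaAssembly(Assembly assembly, string typeName)
    {
        return assembly.CreateInstance(typeName);
    }

    // Example: build a StringBuilder both ways using its own assembly and type names.
    public static StringBuilder[] CreateStringBuilders()
    {
        Type type = typeof(StringBuilder);
        return new StringBuilder[]
        {
            (StringBuilder)CreateViaAppDomain(type.Assembly.FullName, type.FullName),
            (StringBuilder)CreateViaAssembly(type.Assembly, type.FullName)
        };
    }
}
```

Both calls rely on the type having a public parameterless constructor; other overloads accept constructor arguments.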
Really, all that's left is to compile. Woohoo!
Now we need to check whether any mistakes occurred, i.e. syntax problems with the supplied code. If it's all OK, then we return the desired object:
The outcome will be one of two things. Either compilation succeeds and an instance of CompiledExecutable or CompiledAssembly is returned, or null is returned, a message box is displayed, and the errors private field is no longer null and instead contains the list of errors that occurred. To access the list of errors, I set up a property:
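Putting the compile step and the error check together, a minimal end-to-end sketch might look like this. It assumes the .NET Framework (on .NET Core and later, CodeDom compilation throws PlatformNotSupportedException), and the Greeter source string and the CompileSketch helper are hypothetical.

```csharp
using System;
using System.CodeDom.Compiler;
using System.Reflection;
using Microsoft.CSharp;

public static class CompileSketch
{
    // Hypothetical source code to compile at run time.
    private const string Source =
        "public static class Greeter { public static string Greet(string name) { return \"Hello, \" + name; } }";

    // Returns the compiled assembly, or null with the errors filled in.
    public static Assembly CompileInMemory(string source, out CompilerErrorCollection errors)
    {
        CompilerParameters parameters = new CompilerParameters();
        parameters.GenerateInMemory = true;
        parameters.ReferencedAssemblies.Add("System.dll");

        using (CSharpCodeProvider provider = new CSharpCodeProvider())
        {
            CompilerResults results = provider.CompileAssemblyFromSource(parameters, source);
            if (results.Errors.HasErrors)
            {
                errors = results.Errors;
                return null;
            }
            errors = null;
            return results.CompiledAssembly;
        }
    }

    // Compiles the sample source and calls Greeter.Greet through reflection.
    public static string RunGreeter()
    {
        CompilerErrorCollection errors;
        Assembly assembly = CompileInMemory(Source, out errors);
        if (assembly == null)
        {
            throw new InvalidOperationException("Compilation failed with " + errors.Count + " error(s).");
        }
        MethodInfo greet = assembly.GetType("Greeter").GetMethod("Greet");
        return (string)greet.Invoke(null, new object[] { "world" });
    }
}
```

Checking HasErrors rather than Errors.Count matters because the collection also holds warnings, which should not prevent the compiled assembly from being returned.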
Well, now all you need to do is compile this code and then add the resulting dll into any project of your choice.