" Sometimes the best way to do things is to do it reverse "
- a well-known book on reverse engineering
I would like to start this post by stating that all closed-source, non-obfuscated code is actually open, just hidden from the naked eye. Sounds cryptic? Well, I think it should :). Let me unpack it. So what on earth is obfuscation?
Some people who have done something "innovative" don't like to share their code, and hence distribute only their executables/DLLs. But some people go one step further and obfuscate them.
Wikipedia says:
"Obfuscation refers to the concept of concealing the meaning of communication by making it more confusing and harder to interpret."
"Obfuscated code is source code that is (usually intentionally) very hard to read and understand."
In practice, the code is transformed in such a way that even if you run it through a disassembler, the output makes little sense! Variables are renamed and statements are rearranged so that they convey the wrong meaning, while the program still behaves exactly as before.
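To make the renaming idea concrete, here is a minimal sketch of my own, written in Python rather than .NET purely to keep it short. It is a toy, not how any real obfuscator works: the Renamer class, the obfuscate helper and the compute_total sample are all made-up names, it assumes Python 3.9+ (for ast.unparse), and it blindly renames every identifier, so it only works on small self-contained code that uses no builtins or imports.

```python
import ast

# Hypothetical sample: readable names that tell you what the code does.
SOURCE = '''
def compute_total(price, quantity):
    subtotal = price * quantity
    return subtotal
'''


class Renamer(ast.NodeTransformer):
    """Toy renamer: replaces every identifier with a meaningless letter."""

    def __init__(self):
        self.mapping = {}

    def _alias(self, name):
        # Hand out a, b, c, ... in order of first appearance (toy: max 26 names).
        if name not in self.mapping:
            self.mapping[name] = chr(ord("a") + len(self.mapping))
        return self.mapping[name]

    def visit_FunctionDef(self, node):
        node.name = self._alias(node.name)
        self.generic_visit(node)  # continue into arguments and body
        return node

    def visit_arg(self, node):
        node.arg = self._alias(node.arg)
        return node

    def visit_Name(self, node):
        node.id = self._alias(node.id)
        return node


def obfuscate(source):
    """Return the same program with all names replaced (assumes Python 3.9+)."""
    tree = Renamer().visit(ast.parse(source))
    return ast.unparse(tree)


if __name__ == "__main__":
    print(obfuscate(SOURCE))
```

Running it should print a function named a that takes b and c: same behaviour, but no hints left in the names. Real obfuscators go much further (control-flow changes, string encryption and so on), which this sketch does not attempt.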
Now coming to actual languages: C and C++ are harder to reverse back to source, since they compile to machine code and the closest readable form (their assembly language) is machine dependent. But newer languages like Java, C# and VB.NET belong to the category of languages which are hardware independent. Their code is compiled to an intermediate language which does not depend on the hardware, and which is much easier to read back. That is why obfuscation matters so much for them.
Now let me get down to the .NET platform. Here the intermediate language is called MSIL (Microsoft Intermediate Language).
A hello world program in MSIL would look like this:
.method public static void Main() cil managed
{
.entrypoint
.maxstack 1
ldstr "Hello, world!"
call void [mscorlib]System.Console::WriteLine(string)
ret
}
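Note: the CLR is stack based. ldstr pushes the string onto the evaluation stack, and the call to WriteLine pops it as its argument.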
As you can see, it looks pretty simple and understandable.
In contrast, look at this 16-bit x86 (DOS) assembly code for the same thing:
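mov ah, 09h             ; DOS function 09h: print a '$'-terminated string
lea dx, msg             ; or mov dx, offset msg
int 21h
mov ax, 4C00h           ; DOS function 4Ch: exit with return code 0
int 21h
msg db 'Hello, world!$' ; data placed after the code so it is never executed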
The difference is clearly visible.
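Going back to the MSIL listing for a moment, the "stack based" note can be sketched with a toy interpreter. This is only an illustration in Python under my own assumptions: run and HELLO are made-up names, only the three instructions from the listing are modelled, and the call to WriteLine is faked by collecting the text in a list instead of printing it. The real CLR does far more than this (type checking, JIT compilation and so on).

```python
def run(program):
    """Execute (opcode, operand) pairs on a toy evaluation stack.

    Returns the lines "written" and whatever is left on the stack.
    """
    stack = []
    output = []
    for opcode, operand in program:
        if opcode == "ldstr":
            # Push the string literal onto the evaluation stack.
            stack.append(operand)
        elif opcode == "call":
            if operand != "WriteLine":
                raise ValueError(f"unknown method: {operand}")
            # The callee takes its single argument from the top of the stack.
            output.append(stack.pop())
        elif opcode == "ret":
            # Return from the method; stop executing.
            break
        else:
            raise ValueError(f"unknown opcode: {opcode}")
    return output, stack


# The hello world method from the MSIL listing, as data for the toy machine.
HELLO = [
    ("ldstr", "Hello, world!"),
    ("call", "WriteLine"),
    ("ret", None),
]

if __name__ == "__main__":
    lines, leftover = run(HELLO)
    print(lines, leftover)
```

The stack never holds more than one value at a time here, which matches the .maxstack 1 directive in the listing.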
Obfuscation is basically done to make the IL produced by disassemblers as hard to decipher as possible.
Want to try it out?
Do this:
1. Write a simple .NET application in, say, C#.
2. Now compile it and locate the executable in /bin.
3. Type this:
ildasm /out:.il
Done!
Now, rather than talking about obfuscation, I will talk about reverse engineering, because the latter is more thrilling.
To "assist" people who don't want to break their heads reading the IL, and as adversaries of the obfuscators, certain decompilers are available.
They actually produce ready-made source code from just the executable! Not just that, they also produce high-level designs from just the code.
Amazing, isn't it?
I would personally recommend the Spices.NET package. It does modeling, code production, high-level documentation (Oh my God!) and other stuff from just the executable. It also produces source code in 5 languages! They claim that it can decompile anything except code obfuscated by their own obfuscator (lol!).
The catch is that it is not free (but a 30-day evaluation is available).
It has been seen that certain very well-known closed-source executables/DLLs (from someone big (you guess!)) can be reversed to source code.
But at the same time, certain companies are producing high-quality obfuscators to destroy the possibility of code recovery. They are also producing decompilers which can decompile everything except code obfuscated by their own software.
They say that in business everything is fair.
What do you think?