Snooping on C#.NET Regular Expressions


Prologue

I present here two micro test utilities that let you experiment with regular expressions. The Regex class (System.Text.RegularExpressions.Regex) is a powerful tool for parsing, splitting, and replacing text.
The most common need is either to split text on a delimiter or to find text that matches a pattern, such as names ending in .com (the shell wildcard *.com, which is written \w+\.com as a regular expression).
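As a rough illustration of those two needs, here is a minimal sketch. The class name SplitOrMatchSketch, its method names, and the sample strings are made up for this example and are not part of the two test programs.

```csharp
using System;
using System.Text.RegularExpressions;

public static class SplitOrMatchSketch
{
    // Split on a literal delimiter (here a comma).
    public static string[] SplitByDelimiter(string text)
    {
        return Regex.Split(text, ",");
    }

    // Find every token that looks like "name.com": one or more word
    // characters, a literal dot, then "com".
    public static string[] FindDotCom(string text)
    {
        MatchCollection found = Regex.Matches(text, @"\w+\.com");
        string[] result = new string[found.Count];
        for (int i = 0; i < found.Count; i++)
            result[i] = found[i].Value;
        return result;
    }

    public static void Main()
    {
        Console.WriteLine(string.Join("|", SplitByDelimiter("a,b,c")));
        // prints a|b|c
        Console.WriteLine(string.Join("|", FindDotCom("edit.com notes.txt format.com")));
        // prints edit.com|format.com
    }
}
```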

Regular expressions are a gift from the Unix environment. They need special treatment and take time to master. I recommend that you refer to Chapter 4, pages 101 to 105, of the famous Unix book by Brian W. Kernighan and Rob Pike, "The Unix Programming Environment" [ISBN-0-87692-499-2].

I present below two micro test programs called RegExpSplit.cs and RegExpMatches.cs.

To compile them, run the following from the Command Window (aka Glorified DOS Prompt):

make all

This should produce two .exe files with the respective names.

Split Tester

Given below is the source of the Split method tester:

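using System;
// Regex is in System.dll on .NET Framework
// (System.Text.RegularExpressions.dll on .NET Core and later)
using System.Text.RegularExpressions;

namespace Ash.Test
{
    public class RegExpSplit
    {
        // Pattern to split on; overwritten by the first command-line argument
        private static string metaExp = "[0-9 a-z A-Z]*";

        public static void Main(string[] args)
        {
            if (args.Length != 2)
            {
                Console.WriteLine("Usage: RegExpSplit <regular exp> <target>");
                Environment.Exit(1);
            }
            metaExp = args[0];
            string[] rets = ParseExtn(args[1]);
            if (rets == null)
            {
                Console.WriteLine("Sorry no match");
            }
            else
            {
                // Print the number of pieces, then one piece per line
                Console.WriteLine(rets.Length);
                foreach (string x in rets)
                    Console.WriteLine(x);
            }
        }

        // Splits ext on metaExp. Returns null when the pattern never matches,
        // because Regex.Split alone would return the whole input as one piece.
        public static string[] ParseExtn(string ext)
        {
            Regex rx = new Regex(metaExp);
            if (!rx.IsMatch(ext))
                return null;
            return rx.Split(ext);
        }
    }
}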

To test the splitting action, give the following command:
 
RegExpSplit ";" "test;me;now"

This should result in the following output:

3
test
me
now

Now I suggest that you play around with it to get the hang of the Split method.
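A few experiments worth trying are sketched below. The class SplitExperiments and the sample inputs are invented for illustration; the behaviour shown (empty strings between adjacent delimiters, and captured delimiters appearing in the result) is how Regex.Split is documented to work.

```csharp
using System;
using System.Text.RegularExpressions;

public static class SplitExperiments
{
    // Same call the tester makes: compile the pattern, split the target.
    public static string[] Split(string pattern, string target)
    {
        return new Regex(pattern).Split(target);
    }

    // Prints the number of pieces followed by the pieces in brackets.
    private static void Show(string pattern, string target)
    {
        string[] parts = Split(pattern, target);
        Console.WriteLine(parts.Length + ": [" + string.Join("][", parts) + "]");
    }

    public static void Main()
    {
        // Two delimiters in a row leave an empty string between them.
        Show(";", "a;;b");                      // prints 3: [a][][b]

        // A capturing group puts the delimiter itself into the result.
        Show("(;)", "test;me;now");             // prints 5: [test][;][me][;][now]

        // Optional whitespace around the delimiter is swallowed by the pattern.
        Show(@"\s*;\s*", "test ; me;  now");    // prints 3: [test][me][now]
    }
}
```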

Matches Tester

The next program, RegExpMatches.cs, is much like the above splitter, except for the static method ParseExtn, which I reproduce below:

// Returns every match of metaExp in ext, or null when there is none
public static string[] ParseExtn(string ext)
{
    // RegexOptions.IgnoreCase requests a case-insensitive match
    Regex rx = new Regex(metaExp, RegexOptions.IgnoreCase);
    MatchCollection rez = rx.Matches(ext);
    string[] ret = null;
    if (rez.Count > 0)
    {
        ret = new string[rez.Count];
        for (int i = 0; i < rez.Count; i++)
        {
            ret[i] = rez[i].Value;
        }
    }
    return ret;
}

Note that the Regex constructor is instructed to match in a case-insensitive manner.
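To see what the case-insensitive option changes, the following sketch counts matches with and without it. The class CaseOptionSketch and its helper are hypothetical names; RegexOptions.IgnoreCase and the inline (?i) flag are standard .NET regular expression features.

```csharp
using System;
using System.Text.RegularExpressions;

public static class CaseOptionSketch
{
    // Counts how many times pattern matches in target under the given options.
    public static int CountMatches(string pattern, string target, RegexOptions options)
    {
        return new Regex(pattern, options).Matches(target).Count;
    }

    public static void Main()
    {
        string pattern = @"\w+\.(COM|DLL|EXE)";
        string target = "ash.com Anu.EXE";

        // Case-sensitive: only the upper-case extension matches.
        Console.WriteLine(CountMatches(pattern, target, RegexOptions.None));        // prints 1

        // Case-insensitive via the constructor option.
        Console.WriteLine(CountMatches(pattern, target, RegexOptions.IgnoreCase));  // prints 2

        // Case-insensitive via the inline (?i) flag inside the pattern itself.
        Console.WriteLine(CountMatches("(?i)" + pattern, target, RegexOptions.None)); // prints 2
    }
}
```

The inline form can be handy with a command-line tester, since the option then travels with the pattern argument.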

To run RegExpMatches, type the following at the DOS prompt (or Windows Console if you so desire):

RegExpMatches "\w+\.(COM|DLL|EXE)" "ash.com Anu.EXE"

You should get the result:

2
ash.com
Anu.EXE
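The parentheses in that pattern also capture the extension on its own. A small sketch of reading it back through Match.Groups follows; the class name GroupSketch and the method Extensions are assumptions made for this example, not part of RegExpMatches.cs.

```csharp
using System;
using System.Text.RegularExpressions;

public static class GroupSketch
{
    // Returns the text captured by the first parenthesised group of each match.
    public static string[] Extensions(string pattern, string target)
    {
        MatchCollection found = Regex.Matches(target, pattern, RegexOptions.IgnoreCase);
        string[] result = new string[found.Count];
        for (int i = 0; i < found.Count; i++)
            result[i] = found[i].Groups[1].Value; // Groups[0] is the whole match
        return result;
    }

    public static void Main()
    {
        foreach (string ext in Extensions(@"\w+\.(COM|DLL|EXE)", "ash.com Anu.EXE"))
            Console.WriteLine(ext);
        // prints com, then EXE
    }
}
```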

Epilogue

We need to test regular expressions with rigor before deploying them in actual programs. These two micro test utilities might help you out. They sure do for me! I used them to debug my previous utility, VirChk.cs.
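One way to make such testing repeatable is to keep a handful of pattern checks in code. The sketch below is only a suggestion: the class PatternChecks, the method Check, and the sample cases are hypothetical and not taken from the utilities above.

```csharp
using System;
using System.Text.RegularExpressions;

public static class PatternChecks
{
    // Returns true when pattern yields exactly expectedCount matches in input
    // (case-insensitive). An invalid pattern counts as a failed check instead
    // of crashing the run.
    public static bool Check(string pattern, string input, int expectedCount)
    {
        try
        {
            return Regex.Matches(input, pattern, RegexOptions.IgnoreCase).Count == expectedCount;
        }
        catch (ArgumentException)
        {
            // Thrown by the regex parser for a malformed pattern.
            return false;
        }
    }

    public static void Main()
    {
        Console.WriteLine(Check(@"\w+\.(COM|DLL|EXE)", "ash.com Anu.EXE", 2)); // prints True
        Console.WriteLine(Check(@"\w+\.(COM|DLL|EXE)", "readme.txt", 0));      // prints True
        Console.WriteLine(Check(@"\w+\.(COM", "ash.com", 1));                  // unbalanced parenthesis, prints False
    }
}
```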
