Introduction
This article demonstrates URL rewriting in ASP.NET using regular expressions:
how to set up predefined rules and how to handle the postback issue that arises
when a virtual path of a website is requested.
Why does URL rewriting matter?
URL rewriting is one of the most common techniques for providing search engine friendly URLs. As developers, we want to write code that is flexible and makes search engine friendly URLs easy to implement.
It also helps when you restructure the pages within your website: old URLs that users have kept as bookmarks
should not break when the pages are relocated.
URL rewriting can also improve the search relevancy of the pages on your site
in the major search engines such as Google, Yahoo and Bing. Specifically, URL
rewriting can often make it easier to embed common keywords into the URLs of
the pages on your site, which can often increase the chance of someone clicking
your link. Moving from querystring arguments to fully qualified URLs can also,
in some cases, increase your priority in search engine results. Using
techniques that force referring links to use the same case and URL entry point
(for example: weblogs.asp.net/scottgu instead of
weblogs.asp.net/scottgu/default.aspx) can also avoid diluting your PageRank
across multiple URLs and improve your position in search results.
Native URL mapping
Old URLs can be mapped to new URLs without writing a single line of code by
using the URL mapping feature in ASP.NET. To use it, create a urlMappings
section under the system.web section of your web.config file, as shown below.
<urlMappings enabled="true">
  <add url="~/Info/Copywrite.aspx" mappedUrl="~/Help/CopyWrite.aspx" />
  <add url="~/Info/Contact.aspx" mappedUrl="~/Help/Contact.aspx" />
</urlMappings>
Note: "~/" represents the root directory of the application.
This solution works and serves the page from its new location. However, the
user may be surprised that during postback the URL changes from
"http://www.metlab.com/Info/Copywrite.aspx" to "http://www.metlab.com/Help/CopyWrite.aspx".
This happens because the ASP.NET engine fills the action attribute of the form
with the new URL.
<form action="http://www.metlab.com/Help/CopyWrite.aspx" method="post" name="form1">
</form>
This approach works well if we relocate only a few pages. If you have many files and URLs, this technique is not recommended.
URL Rewriting
The best way to implement a URL rewriting solution is to create reusable and
easily configurable modules, so the obvious decision is to create an HTTP Module
(for details on HTTP Modules see MSDN Magazine) and implement it as an
individual assembly. To make this assembly as easy to use as possible, we need
to implement the ability to configure the rewrite engine and specify rules in the web.config file.
During the development process we need to be able to turn the rewriting module
on or off (for example if you have a bug that is difficult to catch, and which
may have been caused by incorrect rewriting rules). There should, therefore, be
an option in the rewriting module configuration section in web.config to turn
the module on or off. A sample configuration section within web.config might
look like this:
<rewriteModule>
  <rewriteOn>true</rewriteOn>
  <rewriteRules>
    <rule
      source="(\d+)/(\d+)/(\d+)/"
      destination="Post.aspx?Year=$1&amp;Month=$2&amp;Day=$3"/>
    <rule
      source="(.*)/Default.aspx"
      destination="Default.aspx?Folder=$1"/>
    <rule
      source="Directory/(.*)/(.*)/(.*)/(.*).aspx"
      destination="Directory/Item.aspx?Source=$1&amp;Year=$2&amp;ValidTill=$3&amp;Sales=$4"/>
    <rule
      source="Directory/(.*)/(.*)/(.*).aspx"
      destination="Directory/Items.aspx?Source=$1&amp;Year=$2&amp;ValidTill=$3"/>
    <rule
      source="Directory/(.*)/(.*).aspx"
      destination="Directory/SourceYear.aspx?Source=$1&amp;Year=$2"/>
    <rule
      source="Directory/(.*).aspx"
      destination="Directory/Source.aspx?Source=$1"/>
  </rewriteRules>
</rewriteModule>
This means that any request such as "http://localhost/Web/2006/12/10/" is
rewritten internally to the page Post.aspx, with the year, month and day passed
as query string parameters.
For the above section to be recognized in the web.config file, the developer
must register a section name and a section handler for it. To do this, add a
configSections section to web.config.
<configSections>
  <sectionGroup name="modulesSection">
    <section
      name="rewriteModule"
      type="UrlRewriteModule.UrlRewrittingModuleHandler"/>
  </sectionGroup>
</configSections>
This means you can place the following section below the configSections
section:
<modulesSection>
  <rewriteModule>
    <rewriteOn>true</rewriteOn>
    <rewriteRules>
      <rule
        source="(\d+)/(\d+)/(\d+)/"
        destination="Post.aspx?Year=$1&amp;Month=$2&amp;Day=$3"/>
      <rule
        source="(.*)/Default.aspx"
        destination="Default.aspx?Folder=$1"/>
      <rule
        source="Directory/(.*)/(.*)/(.*)/(.*).aspx"
        destination="Directory/Item.aspx?Source=$1&amp;Year=$2&amp;ValidTill=$3&amp;Sales=$4"/>
      <rule
        source="Directory/(.*)/(.*)/(.*).aspx"
        destination="Directory/Items.aspx?Source=$1&amp;Year=$2&amp;ValidTill=$3"/>
      <rule
        source="Directory/(.*)/(.*).aspx"
        destination="Directory/SourceYear.aspx?Source=$1&amp;Year=$2"/>
      <rule
        source="Directory/(.*).aspx"
        destination="Directory/Source.aspx?Source=$1"/>
    </rewriteRules>
  </rewriteModule>
</modulesSection>
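To see what a single rule does, here is a minimal sketch that applies the first rule to a sample path with Regex.Replace, outside ASP.NET. The class name RuleDemo, the method ApplyRule and the base path "/Web/" are assumptions made for illustration only; they are not part of the module. Note that in a C# string the ampersand is written literally, whereas inside web.config it has to be escaped as &amp;.

```csharp
using System;
using System.Text.RegularExpressions;

public static class RuleDemo
{
    // Returns the rewritten path, or null when the rule does not match.
    // basePath is prepended to the source pattern; the matched text (base
    // path included) is replaced by the destination, and the base path is
    // then put back in front of the result.
    public static string ApplyRule(string basePath, string source,
                                   string destination, string path)
    {
        Regex re = new Regex(basePath + source, RegexOptions.IgnoreCase);
        if (!re.IsMatch(path))
            return null;
        return basePath + re.Replace(path, destination);
    }

    public static void Main()
    {
        string result = ApplyRule(
            "/Web/",
            @"(\d+)/(\d+)/(\d+)/",
            "Post.aspx?Year=$1&Month=$2&Day=$3",
            "/Web/2006/12/10/");
        // Prints: /Web/Post.aspx?Year=2006&Month=12&Day=10
        Console.WriteLine(result);
    }
}
```

The captured groups $1, $2 and $3 become the Year, Month and Day parameters of the real page.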
Another thing to bear in mind is that, while developing the rewriting module,
it should be possible to use a virtual URL together with query string
parameters, as in the following:
"http://www.someblog.com/2006/12/10/?Sort=Dec&SortBy=Date".
Thus we have to develop a solution that can detect the parameters passed via
the query string as well as those passed via the virtual URL in the web
application.
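One way to keep the visitor's own query string is to append it to the rewritten path, choosing "?" or "&" depending on whether the destination already carries parameters. The sketch below shows only that step; QueryMergeDemo and AppendQuery are hypothetical names used for illustration.

```csharp
using System;

public static class QueryMergeDemo
{
    // Appends the original query string (given without the leading '?')
    // to a path that a rewrite rule has already produced.
    public static string AppendQuery(string rewrittenPath, string originalQuery)
    {
        if (string.IsNullOrEmpty(originalQuery))
            return rewrittenPath;
        // Use '?' if the destination has no parameters yet, '&' otherwise.
        string sign = rewrittenPath.IndexOf('?') == -1 ? "?" : "&";
        return rewrittenPath + sign + originalQuery;
    }

    public static void Main()
    {
        // Prints: Post.aspx?Year=2006&Month=12&Day=10&Sort=Dec&SortBy=Date
        Console.WriteLine(AppendQuery(
            "Post.aspx?Year=2006&Month=12&Day=10", "Sort=Dec&SortBy=Date"));
    }
}
```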
So, let's start by building a new Class Library. We need to add a reference to
the System.Web assembly, as we want this library to be used within an ASP.NET
application and we also want to implement some web-specific functions at the
same time. If we want our module to be able to read web.config, we need to add a
reference to the "System.Configuration" assembly.
Handling the Configuration Section
To read the configuration section in the web.config file, we have to create a
class that implements the IConfigurationSectionHandler interface, as shown
below.
namespace UrlRewriteModule
{
    #region Directives
    using System;
    using System.Configuration;
    using System.Web;
    using System.Xml;
    #endregion Directives

    /// <summary>
    /// Implements IConfigurationSectionHandler to read the rewriteModule
    /// configuration section and return it as an object.
    /// </summary>
    public class UrlRewrittingModuleHandler : IConfigurationSectionHandler
    {
        #region Private variables
        private XmlNode _xmlSection;
        private string _rewriteBase;
        private bool _rewriteOn;
        #endregion Private variables

        #region Properties
        /// <summary>
        /// Gets or sets the XML node of the configuration section.
        /// </summary>
        public XmlNode XmlSection
        {
            get { return _xmlSection; }
            set { _xmlSection = value; }
        }

        /// <summary>
        /// Gets or sets the base path used for rewriting.
        /// </summary>
        public string RewriteBase
        {
            get { return _rewriteBase; }
            set { _rewriteBase = value; }
        }

        /// <summary>
        /// Gets or sets a value indicating whether rewriting is turned on.
        /// </summary>
        public bool RewriteOn
        {
            get { return _rewriteOn; }
            set { _rewriteOn = value; }
        }
        #endregion Properties

        /// <summary>
        /// Called by ASP.NET to create the section handler from web.config.
        /// </summary>
        /// <param name="parent">The parent configuration object.</param>
        /// <param name="configContext">The configuration context.</param>
        /// <param name="section">The XML node of the section.</param>
        /// <returns>This handler, populated from the section.</returns>
        public object Create(object parent, object configContext, XmlNode section)
        {
            // Set the base path for the rewriting module to the application
            // root. ApplicationPath is "/" for a root application and has no
            // trailing slash for a virtual directory (for example "/Web"),
            // so append the slash only when it is missing.
            string appPath = HttpContext.Current.Request.ApplicationPath;
            _rewriteBase = appPath.EndsWith("/") ? appPath : appPath + "/";

            // Process the configuration section from web.config.
            try
            {
                XmlSection = section;
                RewriteOn = Convert.ToBoolean(section.SelectSingleNode("rewriteOn").InnerText);
            }
            catch (Exception ex)
            {
                throw new Exception("Error while processing RewriteModule configuration section.", ex);
            }
            return this;
        }
    }
}
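The XML handling in the section handler can be tried outside ASP.NET. The following is a minimal sketch, assuming a section shaped like the one shown earlier; SectionReadDemo and its helper methods are hypothetical names, and the XML is loaded from a string rather than from web.config.

```csharp
using System;
using System.Xml;

public static class SectionReadDemo
{
    // Parses an XML fragment and returns its root element.
    public static XmlNode LoadSection(string xml)
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(xml);
        return doc.DocumentElement;
    }

    // Reads the on/off switch the same way the handler does.
    public static bool ReadRewriteOn(XmlNode section)
    {
        return Convert.ToBoolean(section.SelectSingleNode("rewriteOn").InnerText);
    }

    // Counts the rule elements under rewriteRules.
    public static int CountRules(XmlNode section)
    {
        return section.SelectNodes("rewriteRules/rule").Count;
    }

    public static void Main()
    {
        XmlNode section = LoadSection(
            "<rewriteModule>" +
            "<rewriteOn>true</rewriteOn>" +
            "<rewriteRules>" +
            "<rule source=\"(.*)/Default.aspx\" destination=\"Default.aspx?Folder=$1\"/>" +
            "</rewriteRules>" +
            "</rewriteModule>");

        Console.WriteLine(ReadRewriteOn(section));
        foreach (XmlNode rule in section.SelectNodes("rewriteRules/rule"))
        {
            Console.WriteLine(rule.Attributes["source"].Value + " -> " +
                              rule.Attributes["destination"].Value);
        }
    }
}
```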
Maintain Original URL
When handling virtual URLs such as http://www.somebloghost.com/Blogs/gaidar/?Sort=Asc (that
is, a virtual URL with query string parameters), it is important that you
clearly distinguish parameters that were passed via a query string from
parameters that were passed as virtual directories. With the rewriting rule
specified below:
<rule source="(.*)/Default.aspx" destination="Default.aspx?Folder=$1"/>
you can use the following URL:
http://www.somebloghost.com/gaidar/?Folder=Blogs
and the result will be the same as if you used this URL:
http://www.somebloghost.com/Blogs/gaidar/
To resolve this issue, we have to create some kind of wrapper for 'virtual path
parameters'. This could be a collection with a static property to access the
current parameter set:
namespace UrlRewriteModule
{
    #region Directives
    using System;
    using System.Collections.Specialized;
    using System.Web;
    #endregion Directives

    /// <summary>
    /// Wrapper for the virtual path parameters. Provides a static
    /// property to access the parameter set of the current request.
    /// </summary>
    public class ReWriteContext
    {
        #region Private fields
        private NameValueCollection _Params;
        private string _initialUrl;
        #endregion Private fields

        #region Public members
        /// <summary>
        /// Gets the ReWriteContext of the current request, or an empty
        /// context if none has been stored.
        /// </summary>
        public static ReWriteContext Current
        {
            get
            {
                if (HttpContext.Current.Items.Contains("RewriteContextInfo"))
                {
                    return HttpContext.Current.Items["RewriteContextInfo"] as ReWriteContext;
                }
                else
                    return new ReWriteContext();
            }
        }

        /// <summary>
        /// Initializes the object with the given parameters.
        /// </summary>
        /// <param name="param">The parameter collection to copy.</param>
        /// <param name="baseUrl">The URL that was originally requested.</param>
        public ReWriteContext(NameValueCollection param, string baseUrl)
        {
            Params = new NameValueCollection(param);
            InitialUrl = baseUrl;
        }

        /// <summary>
        /// Initializes the object with the default settings.
        /// </summary>
        public ReWriteContext()
        {
            _Params = new NameValueCollection();
            _initialUrl = string.Empty;
        }

        /// <summary>
        /// Gets or sets the parameters captured for the request.
        /// </summary>
        public NameValueCollection Params
        {
            get { return _Params; }
            set { _Params = value; }
        }

        /// <summary>
        /// Gets or sets the URL that was originally requested.
        /// </summary>
        public string InitialUrl
        {
            get { return _initialUrl; }
            set { _initialUrl = value; }
        }
        #endregion Public members
    }
}
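As an illustration of why a separate parameter wrapper helps, the sketch below compares two collections: the full parameter set produced by a rewrite rule, and the query string the visitor actually typed. It assumes the wrapper holds the former and the request's query string holds the latter; ParamSplitDemo and PathOnlyKeys are hypothetical names, and the collections are filled by hand rather than taken from a real request.

```csharp
using System;
using System.Collections.Generic;
using System.Collections.Specialized;

public static class ParamSplitDemo
{
    // Returns the keys that exist in the rewritten parameter set but not
    // in the original query string, i.e. the ones that came from the
    // virtual path.
    public static List<string> PathOnlyKeys(NameValueCollection rewritten,
                                            NameValueCollection originalQuery)
    {
        List<string> keys = new List<string>();
        foreach (string key in rewritten.AllKeys)
        {
            if (key != null && originalQuery[key] == null)
                keys.Add(key);
        }
        return keys;
    }

    public static void Main()
    {
        // Parameters after rewriting: one from the path, one from the query.
        NameValueCollection rewritten = new NameValueCollection();
        rewritten.Add("Folder", "Blogs");
        rewritten.Add("Sort", "Asc");

        // Parameters the visitor typed after the '?'.
        NameValueCollection originalQuery = new NameValueCollection();
        originalQuery.Add("Sort", "Asc");

        foreach (string key in PathOnlyKeys(rewritten, originalQuery))
            Console.WriteLine(key);   // prints: Folder
    }
}
```

A key supplied both in the path and in the query string would be reported as a query parameter by this sketch, so a real page would still have to decide which of the two values takes precedence.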
Rewriting URL
To rewrite the URL we have to implement the module, which takes care of all
the URL rewriting. The code for it is provided below.
namespace UrlRewriteModule
{
    #region Directives
    using System;
    using System.Configuration;
    using System.Text.RegularExpressions;
    using System.Web;
    using System.Xml;
    #endregion Directives

    public class ReWriteModule : IHttpModule
    {
        public void Dispose()
        {
            // Nothing to release. ASP.NET calls Dispose when the module is
            // unloaded, so this method must not throw.
        }

        public void Init(HttpApplication context)
        {
            // Rewrite the requested virtual URL at the start of the request.
            context.BeginRequest += new EventHandler(RewriteModule_BeginRequest);
            // Hook into the page lifecycle just before the handler executes.
            context.PreRequestHandlerExecute += new EventHandler(RewriteModule_PreRequestHandlerExecute);
        }

        /// <summary>
        /// Runs just before the request is passed to its handler.
        /// </summary>
        /// <remarks>
        /// Checks whether the user requested a normal ASP.NET page and, if
        /// so, adds a handler for the PreInit event of the page lifecycle.
        /// </remarks>
        /// <param name="sender">The application that raised the event.</param>
        /// <param name="e">The event arguments.</param>
        void RewriteModule_PreRequestHandlerExecute(object sender, EventArgs e)
        {
            HttpApplication application = (HttpApplication)sender;
            System.Web.UI.Page page = application.Context.CurrentHandler as System.Web.UI.Page;
            if (page != null)
            {
                page.PreInit += new EventHandler(page_PreInit);
            }
        }

        /// <summary>
        /// Runs during the pre-initialization stage of the page.
        /// </summary>
        /// <remarks>
        /// This is where ReWriteContext is populated with the actual
        /// parameters and a second URL rewriting is performed. The second
        /// rewriting is necessary to make ASP.NET use the virtual path in
        /// the action attribute of the HTML form.
        /// </remarks>
        /// <param name="sender">The page object.</param>
        /// <param name="e">The event arguments.</param>
        void page_PreInit(object sender, EventArgs e)
        {
            if (HttpContext.Current.Items["OriginalUrl"] != null)
            {
                string path = (string)HttpContext.Current.Items["OriginalUrl"];
                // Save the rewritten parameters so that ReWriteContext.Current
                // can return them later in the request.
                ReWriteContext reWriteContext =
                    new ReWriteContext(HttpContext.Current.Request.QueryString, path);
                HttpContext.Current.Items["RewriteContextInfo"] = reWriteContext;
                if (path.IndexOf("?") == -1)
                    path = path + "?";
                // Rewrite back to the original (virtual) URL.
                HttpContext.Current.RewritePath(path);
            }
        }

        /// <summary>
        /// Runs at the beginning of every request and applies the first
        /// matching rewrite rule.
        /// </summary>
        /// <param name="sender">The object that raised the event.</param>
        /// <param name="e">The event arguments.</param>
        void RewriteModule_BeginRequest(object sender, EventArgs e)
        {
            // Read the configuration section.
            UrlRewrittingModuleHandler rewriteModuleSection =
                (UrlRewrittingModuleHandler)ConfigurationManager.GetSection("modulesSection/rewriteModule");
            // Do nothing when the section is missing or rewriting is off.
            if (rewriteModuleSection == null || !rewriteModuleSection.RewriteOn)
                return;
            // Keep the current request path.
            string path = HttpContext.Current.Request.Path;
            if (path.Length == 0)
                return;
            // Get the rules from the configuration section.
            XmlNode rules = rewriteModuleSection.XmlSection;
            foreach (XmlNode node in rules.SelectNodes("rewriteRules/rule"))
            {
                try
                {
                    // Create the regular expression for this rule.
                    Regex re = new Regex(
                        rewriteModuleSection.RewriteBase + node.Attributes["source"].InnerText,
                        RegexOptions.IgnoreCase);
                    Match match = re.Match(path);
                    if (match.Success)
                    {
                        // Replace the requested path with the destination.
                        path = re.Replace(path, node.Attributes["destination"].InnerText);
                        if (path.Length != 0)
                        {
                            // Append the original query string, if any.
                            if (HttpContext.Current.Request.QueryString.Count != 0)
                            {
                                string sign = path.IndexOf("?") == -1 ? "?" : "&";
                                path = path + sign + HttpContext.Current.Request.QueryString;
                            }
                            string newUrl = rewriteModuleSection.RewriteBase + path;
                            // Keep the original URL for future reference.
                            HttpContext.Current.Items.Add("OriginalUrl",
                                HttpContext.Current.Request.RawUrl);
                            // Rewrite the URL.
                            HttpContext.Current.RewritePath(newUrl);
                        }
                        return;
                    }
                }
                catch (Exception ex)
                {
                    throw new Exception("Incorrect rule.", ex);
                }
            }
        }
    }
}
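Because the module stops at the first rule that matches, the order of the rules in web.config matters. The sketch below reproduces that first-match loop for two of the Directory rules, outside ASP.NET; RuleOrderDemo, the Rules array and the base path "/" are assumptions made for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class RuleOrderDemo
{
    // Each entry is { source, destination }, tried from top to bottom.
    // The rule with more path segments comes first.
    static readonly string[][] Rules =
    {
        new[] { "Directory/(.*)/(.*).aspx", "Directory/SourceYear.aspx?Source=$1&Year=$2" },
        new[] { "Directory/(.*).aspx",      "Directory/Source.aspx?Source=$1" },
    };

    // Applies the first matching rule; returns the path unchanged when
    // no rule matches.
    public static string Rewrite(string basePath, string path)
    {
        foreach (string[] rule in Rules)
        {
            Regex re = new Regex(basePath + rule[0], RegexOptions.IgnoreCase);
            if (re.IsMatch(path))
                return basePath + re.Replace(path, rule[1]);
        }
        return path;
    }

    public static void Main()
    {
        // Prints: /Directory/SourceYear.aspx?Source=Books&Year=2006
        Console.WriteLine(Rewrite("/", "/Directory/Books/2006.aspx"));
        // Prints: /Directory/Source.aspx?Source=Books
        Console.WriteLine(Rewrite("/", "/Directory/Books.aspx"));
    }
}
```

If the two rules were swapped, the shorter pattern would also match "/Directory/Books/2006.aspx" and capture "Books/2006" as the Source value, which is why the more specific rules are listed first.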
Summary
In this article, we covered the URL rewriting concept in ASP.NET and how to implement it using regular expressions.