Email address validation is a common programming task. This article reviews the basic rules for email validation and provides .NET code to perform the validation.
Email Address Validation Rules
The basic email address format is LOCAL @ DOMAIN. That is, there is a LOCAL section followed by the "at" sign (@) followed by a DOMAIN section. In the email address my.name@gmail.com, the LOCAL section is "my.name" and the DOMAIN is "gmail.com."
A valid email address has one and only one @ sign. A valid email has no spaces.
INVALID:
my.name [No @ sign. No DOMAIN.]
myname@ [No DOMAIN.]
@gmail.com [No LOCAL section.]
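The split into LOCAL and DOMAIN can be sketched in a few lines of C#. This is a minimal illustration of the rules above, not the full validator; the names AddressSplitSketch and TrySplit are hypothetical and chosen only for this example.

```csharp
using System;

public static class AddressSplitSketch
{
    // Splits an address into its LOCAL and DOMAIN sections.
    // Returns false when the address contains a space, does not have
    // exactly one @ sign, or has an empty LOCAL or DOMAIN section.
    public static bool TrySplit(string email, out string local, out string domain)
    {
        local = null;
        domain = null;

        if (email == null || email.Contains(" "))
            return false;

        // Exactly one @: the first and last occurrence must coincide.
        int at = email.IndexOf('@');
        if (at == -1 || at != email.LastIndexOf('@'))
            return false;

        string left = email.Substring(0, at);
        string right = email.Substring(at + 1);
        if (left.Length == 0 || right.Length == 0)
            return false;

        local = left;
        domain = right;
        return true;
    }
}
```

Comparing the first and last position of the @ sign is a cheap way to confirm that there is exactly one.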
The LOCAL section must be 1 to 64 characters long. The DOMAIN section must be 1 to 63 characters long. Neither section can begin or end with a period (.) or have two periods (..) in a row. Each character must be a letter, number, or one of the permitted special characters: ! $ % & * - = . ? ^ _ { } ~ '
INVALID:
myname@gmail+com [Invalid character in the DOMAIN section.]
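The period and character rules apply to both sections, so they can be sketched as two small checks. This is an illustrative sketch only: the names SectionRulesSketch, HasValidPeriods and HasOnlyAllowedCharacters are hypothetical, and the list of special characters is passed in by the caller rather than fixed here.

```csharp
using System;
using System.Linq;

public static class SectionRulesSketch
{
    // True when the section does not begin or end with a period
    // and never has two periods in a row.
    public static bool HasValidPeriods(string section)
    {
        if (string.IsNullOrEmpty(section))
            return false;

        return section[0] != '.'
            && section[section.Length - 1] != '.'
            && !section.Contains("..");
    }

    // True when every character is a letter, a digit, or one of the
    // caller-supplied special characters.
    public static bool HasOnlyAllowedCharacters(string section, string allowedSpecials)
    {
        if (string.IsNullOrEmpty(section))
            return false;

        return section.All(c => char.IsLetterOrDigit(c) || allowedSpecials.IndexOf(c) >= 0);
    }
}
```

With an allowed list that leaves out the plus sign, HasOnlyAllowedCharacters rejects "gmail+com", which matches the INVALID example above.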
The DOMAIN section contains one or more LABEL sections separated by periods: LABEL.LABEL.LABEL. For example "gmail.com" has two LABEL sections. It's in the format LABEL.LABEL.
Each LABEL section must be 1 to 62 characters long. It cannot begin or end with a hyphen (-) or have two hyphens (--) in a row.
INVALID:
myname@-gmail [LABEL begins with a hyphen.]
myname@gmail--com [LABEL contains two or more hyphens in a row.]
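The hyphen rules for LABEL sections can be sketched by splitting the DOMAIN on periods and testing each piece. This is a minimal illustration; LabelRulesSketch and HasValidHyphens are hypothetical names, and the length limits are deliberately left out of this check.

```csharp
using System;

public static class LabelRulesSketch
{
    // True when every LABEL in the domain is non-empty, does not begin
    // or end with a hyphen, and has no two hyphens in a row.
    // Length limits are not checked here.
    public static bool HasValidHyphens(string domain)
    {
        if (string.IsNullOrEmpty(domain))
            return false;

        foreach (string label in domain.Split('.'))
        {
            if (label.Length == 0)
                return false; // empty label, e.g. from "gmail..com"

            if (label[0] == '-' || label[label.Length - 1] == '-')
                return false;

            if (label.Contains("--"))
                return false;
        }
        return true;
    }
}
```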
The last label must be entirely letters. It cannot contain any numbers or special characters.
INVALID:
Email Validation Code
There are many articles out there offering up different regular expressions to validate email. I have not been satisfied with these efforts. Many contain bugs, allow invalid emails or reject valid ones. Most do not attempt to enforce the complete set of validation rules.
In addition, I simply don't like regular expressions. They're cryptic. It's hard to write a regular expression to validate something complicated. It's hard to understand someone else's regular expression. It's hard to modify a regular expression when the rules change.
It's better to write code. It's much easier to understand. It's easy to debug. It's easy to modify. This is the code I use:
using System;

public class EmailHelper
{
    /// <summary>
    /// Makes sure email address is in the proper format.
    /// </summary>
    public static bool IsValidEmail(string email)
    {
        // A null address is never valid.
        if (email == null)
            return false;

        // No spaces allowed
        email = email.Trim();
        int space = email.IndexOf(' ');
        if (space != -1)
            return false;

        // -------------------------
        // EMAIL MUST BE IN FORMAT: LOCAL @ DOMAIN
        // -------------------------
        // There must be exactly one @
        int firstAtSign = email.IndexOf('@');
        if (firstAtSign == -1)
            return false;
        int lastAtSign = email.LastIndexOf('@');
        if (lastAtSign != firstAtSign)
            return false;

        // There must be a LOCAL and a DOMAIN
        string local = email.Substring(0, firstAtSign);
        string domain = email.Substring(firstAtSign + 1);
        if ((local.Length < 1) || (local.Length > 64)) // max length of 64.
            return false;
        if ((domain.Length < 1) || (domain.Length > 63)) // max length of 63.
            return false;

        // -------------------------
        // TEST LOCAL PIECE
        // -------------------------
        // Can't begin or end with . or have two .. in a row.
        if (!ValidatePeriods(local))
            return false;
        // All characters must be a letter, number or allowed special character.
        if (!ValidateCharacters(local))
            return false;

        // -------------------------
        // TEST DOMAIN PIECE
        // -------------------------
        // Can't begin or end with . or have two .. in a row.
        if (!ValidatePeriods(domain))
            return false;

        // Domain is in format label.label.label
        string[] labels = domain.Split('.');

        // Test each label
        foreach (string label in labels)
        {
            if (label.Length < 1 || label.Length > 62)
                return false;
            if (label[0] == '-' || label[label.Length - 1] == '-')
                return false;
            if (label.Contains("--"))
                return false;
            if (!ValidateCharacters(label))
                return false;
        }

        // Last label must be all alphabetic
        string lastLabel = labels[labels.Length - 1];
        foreach (char c in lastLabel)
        {
            if (!Char.IsLetter(c))
                return false;
        }
        return true;
    }

    private static bool ValidatePeriods(string label)
    {
        if (string.IsNullOrEmpty(label))
            return false;
        // Can't have two periods in a row.
        if (label.Contains(".."))
            return false;
        // Can't begin or end with a period.
        if (label[0] == '.')
            return false;
        if (label[label.Length - 1] == '.')
            return false;
        return true;
    }

    private static bool ValidateCharacters(string label)
    {
        if (string.IsNullOrEmpty(label))
            return false;

        // Permitted special characters; '\'' is the apostrophe.
        char[] allowed = { '!', '$', '%', '&', '*', '-', '=', '?', '^', '_', '{', '}', '~', '\'', '.' };

        foreach (char c in label)
        {
            if (Char.IsLetterOrDigit(c))
                continue;
            // Anything that is not a letter or digit must be in the allowed list.
            if (Array.IndexOf(allowed, c) == -1)
                return false;
        }
        return true;
    }
}
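One way to exercise a validator like this is a small table-driven check that compares each result with the expected one. The sketch below is an assumption about how such a check could look, not part of the code above; ValidatorCheckSketch and FindMismatches are hypothetical names, and the validator is passed in as a delegate so the sketch stands on its own.

```csharp
using System;
using System.Collections.Generic;

public static class ValidatorCheckSketch
{
    // Runs every address through the supplied validator and returns the
    // addresses whose result differs from the expected value.
    public static List<string> FindMismatches(
        Func<string, bool> isValid,
        IDictionary<string, bool> expected)
    {
        var mismatches = new List<string>();
        foreach (KeyValuePair<string, bool> pair in expected)
        {
            if (isValid(pair.Key) != pair.Value)
                mismatches.Add(pair.Key);
        }
        return mismatches;
    }
}
```

Passing EmailHelper.IsValidEmail as the first argument, together with a dictionary built from the INVALID examples earlier in the article, should return an empty list if the validator follows the rules as described.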
The sample code includes an email address tester: