Sure, it matches joe.blogs@example.com, or joe-blogs@example.com… But how about joe.blogs+work@example.com, or joe.blogs+support@example.com, or e=mc^2@example.com, or many other valid (albeit uncommon) email address varieties?
Email address validation and storage is a rarely considered but often encountered concept on the web. Many developers thoughtlessly implement validations that prevent legitimate email addresses from being entered into a membership or support form, or from being subscribed to a newsletter.
The most common barriers I’ve encountered are length limitations, domain black lists, or imitation standards compliance.
Length Limitations
One of the implementations of this blunder that I’ve come across is database field lengths. Arbitrary lengths are painfully common. Next time you’re designing a database and think to yourself “100 characters seems long enough”, please stop and ask yourself why you’re imposing a length limitation. And why 100?
- By the SMTP limits (RFC 2821) an email address can be up to 320 characters long. That’s 64 (local part) + 1 (“@”) + 255 (domain).
- And by the message format’s line limit (RFC 2822) up to 992. That’s 1000 – 6 (“From: ”) – 2 (\r\n).
The other reason email address lengths are often limited is more forgivable as it is conceivably the result of migrating from a specifically chosen login username to using an email address as a login.
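The arithmetic behind those two limits can be written down as a minimal Python sketch, so that a column length is derived from the limits quoted above and not picked out of the air. The constant names and the `fits_column` helper are my own illustration, not part of any standard or library.

```python
# Limits quoted in the list above; all names here are illustrative assumptions.
MAX_LOCAL_PART = 64   # characters before the "@"
MAX_DOMAIN = 255      # characters after the "@"
MAX_ADDRESS_SMTP = MAX_LOCAL_PART + len("@") + MAX_DOMAIN  # 64 + 1 + 255

MAX_LINE_WITH_CRLF = 1000  # longest header line, including the trailing \r\n
MAX_ADDRESS_HEADER = MAX_LINE_WITH_CRLF - len("From: ") - len("\r\n")  # 1000 - 6 - 2


def fits_column(address: str, column_length: int = MAX_ADDRESS_SMTP) -> bool:
    """Return True if the address would fit a text column of the given length."""
    return len(address) <= column_length


if __name__ == "__main__":
    print(MAX_ADDRESS_SMTP, MAX_ADDRESS_HEADER)
    # A hypothetical 100-character column rejects an address the limits allow.
    long_address = "a" * 64 + "@" + "b" * 60 + ".example.com"
    print(len(long_address), fits_column(long_address, column_length=100))
```

If you must pick a field length, sizing it from a derived constant like this at least documents where the number came from.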
Domain Black Lists
Spam is now an accepted inconvenience. But blocking @hotmail.com or @gmail.com email addresses from using your support form is a misguided defense mechanism. Amongst the proportionally few spam email addresses with each provider are many legitimate users. Don’t punish them for your site’s popularity in a spam network. There are a variety of more effective anti-spam techniques you can implement.
Standards Compliance
This is my second most hated web development faux pas. One that even MSDN .NET documentation has screwed up for 7 years. Disallowing characters like +, %, *, $. At first glance a developer might see these as harmful, but such a developer would be amateur in thinking so. In any syntax where a character might be considered something other than its literal meaning, an escape sequence is available. For example + and % are special characters in a URL, but a thorough developer should encode them to %2B and %25 respectively.
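As a small illustration of the escaping point, here is a sketch in Python using the standard library’s `urllib.parse`. The `encode_for_url` wrapper is a hypothetical helper of mine; the addresses are the ones used earlier in this post.

```python
from urllib.parse import quote, unquote, urlencode


def encode_for_url(address: str) -> str:
    """Percent-encode an address so it can be placed safely in a URL component."""
    # safe="" means nothing is exempt except the always-safe unreserved characters.
    return quote(address, safe="")


if __name__ == "__main__":
    address = "joe.blogs+work@example.com"
    encoded = encode_for_url(address)
    print(encoded)  # joe.blogs%2Bwork%40example.com

    # Decoding gives back the exact address the user typed.
    print(unquote(encoded) == address)  # True

    # urlencode applies the same kind of escaping when building a query string.
    print(urlencode({"email": "e=mc^2@example.com"}))  # email=e%3Dmc%5E2%40example.com
```

The address is stored and displayed as typed; it is only escaped at the point where it enters a syntax that gives those characters special meaning.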
I recommend you limit your email validation to checking for the “@” character, and testing the domain part for a “.” character. This will ensure your users have at least attempted to enter an email address. The rest will be resolved in what experienced developers say is a significant part of any email validation: sending an email. Here you may find that an address such as e=mc^2@example.com is valid, while john@example.com may not exist and thus be invalid.
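A minimal sketch of that check in Python might look like the following. The function name `looks_like_email` is hypothetical, and requiring a non-empty part before the “@” is my own small assumption on top of the rule above. It splits on the last “@” because a quoted local part may itself contain one.

```python
def looks_like_email(address: str) -> bool:
    """Loose sanity check: an "@" is present and the domain part contains a "."."""
    # rpartition splits on the LAST "@"; if none is found, `at` is an empty string.
    local, at, domain = address.rpartition("@")
    if not at or not local:  # non-empty local part is an added assumption
        return False
    return "." in domain


if __name__ == "__main__":
    for candidate in (
        "joe.blogs+work@example.com",
        "e=mc^2@example.com",
        "not an email",
        "joe@localhost",
    ):
        print(candidate, looks_like_email(candidate))
```

This deliberately says nothing about whether the mailbox exists; as above, only sending a message can settle that.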





June 9th, 2009 at 11:27 am
validation can definitely be a two-edged sword, ultimately it should be designed to make life easier for the user (preventing invalid submissions) but at the same time it can frustrate the benny huha out of ya when you can’t submit legitimate details.
if your site is having issues with spam i recommend you include a captcha field (just make sure it’s human readable!)