Validating the format of a URI is one of those problems that comes up regularly when you validate model attributes in Rails.
There are tons of solutions available on the web, but about 90% of them rely on complex regular expressions and often make custom (and perhaps too restrictive) assumptions. Here is a short list of the most common "mistakes":
- Some validators don't support custom domain names such as http://simone.weppos, which is perfectly legal if my application is working behind a custom DNS service
- Some validators don't support hostnames such as http://localhost
- Some validators focus on specific URL patterns instead of supporting a common validation mechanism
- Some validators don't understand that http://www.google.co.uk exists
- Some validators rely on a TLD whitelist that often becomes outdated
I'm working on a project where I need to validate URLs quite often, and I decided to approach the problem from another point of view. I don't like to reinvent the wheel, so I decided to take advantage of Ruby's URI library.
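To see why the standard library is a reasonable starting point, here is a minimal sketch (not part of the original validator) that feeds a few of the URLs mentioned in this post to URI.parse and prints the scheme and host it extracts. The sample list and the output format are only for illustration.

```ruby
require 'uri'

samples = [
  'http://simone.weppos',     # custom domain name
  'http://localhost',         # bare hostname
  'http://www.google.co.uk',  # multi-label TLD
  'www.godaddy.com',          # no scheme at all
  'http:/godaddy.com',        # single slash, so no host is recognized
]

samples.each do |sample|
  uri = URI.parse(sample)
  puts "#{sample.ljust(26)} scheme=#{uri.scheme.inspect} host=#{uri.host.inspect}"
end

# Malformed input raises an exception instead of returning a URI object.
begin
  URI.parse('http://foo bar')
rescue URI::InvalidURIError => e
  puts "rejected: #{e.message}"
end
```

The first three samples come back with both a scheme and a host, while the last two are missing one or the other. That is exactly the information a validator needs, with no regular expression or TLD list involved.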
This is a super simple validator I wrote. It is based on URI.parse.
# Validates whether the value of the specified attribute matches the format of a URL,
# as defined by RFC 2396. See URI.parse for more information on URI decomposition and parsing.
#
# This method doesn't validate the existence of the domain, nor does it validate the domain itself.
#
# Allowed values include http://foo.bar, http://www.foo.bar and even http://foo.
# Please note that http://foo is a valid URL, as is http://localhost.
# It's up to you to extend the validation with additional constraints.
#
#   class Site < ActiveRecord::Base
#     validates_format_of_url :url, :on => :create
#     validates_format_of_url :ftp, :schemes => [:ftp, :http, :https]
#   end
#
# ==== Configurations
#
# * <tt>:schemes</tt> - An array of allowed schemes to match against (default is <tt>[:http, :https]</tt>)
# * <tt>:message</tt> - A custom error message (default is: "is invalid").
# * <tt>:allow_nil</tt> - If set to true, skips this validation if the attribute is +nil+ (default is +false+).
# * <tt>:allow_blank</tt> - If set to true, skips this validation if the attribute is blank (default is +false+).
# * <tt>:on</tt> - Specifies when this validation is active (default is <tt>:save</tt>, other options <tt>:create</tt>, <tt>:update</tt>).
# * <tt>:if</tt> - Specifies a method, proc or string to call to determine if the validation should
# occur (e.g. <tt>:if => :allow_validation</tt>, or <tt>:if => Proc.new { |user| user.signup_step > 2 }</tt>). The
# method, proc or string should return or evaluate to a true or false value.
# * <tt>:unless</tt> - Specifies a method, proc or string to call to determine if the validation should
# not occur (e.g. <tt>:unless => :skip_validation</tt>, or <tt>:unless => Proc.new { |user| user.signup_step <= 2 }</tt>). The
# method, proc or string should return or evaluate to a true or false value.
#
def validates_format_of_url(*attr_names)
  require 'uri/http'

  configuration = { :on => :save, :schemes => %w(http https) }
  configuration.update(attr_names.extract_options!)

  allowed_schemes = [*configuration[:schemes]].map(&:to_s)

  validates_each(attr_names, configuration) do |record, attr_name, value|
    begin
      uri = URI.parse(value)

      if !allowed_schemes.include?(uri.scheme)
        raise(URI::InvalidURIError)
      end

      if [:scheme, :host].any? { |i| uri.send(i).blank? }
        raise(URI::InvalidURIError)
      end
    rescue URI::InvalidURIError => e
      record.errors.add(attr_name, :invalid, :default => configuration[:message], :value => value)
      next
    end
  end
end
The code is also available as a Gist.
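If you want to try the same checks outside of a model, for example in irb, the logic can be sketched in plain Ruby. The helper name valid_url? and its signature are hypothetical, not part of the validator above, and it uses String#empty? in place of ActiveSupport's blank?.

```ruby
require 'uri'

# Hypothetical helper: the same two checks as the validator, without ActiveRecord.
def valid_url?(value, schemes = %w(http https))
  uri = URI.parse(value.to_s)
  allowed = Array(schemes).map(&:to_s)

  # A scheme outside the allowed list (or no scheme at all) fails.
  return false unless allowed.include?(uri.scheme)

  # A URI without a host, such as http:/godaddy.com, fails too.
  !uri.host.to_s.empty?
rescue URI::InvalidURIError
  # URI.parse could not make sense of the value.
  false
end

puts valid_url?('http://localhost')
puts valid_url?('ftp://godaddy.com')
puts valid_url?('ftp://godaddy.com', [:ftp, :http, :https])
```

As with the :schemes option, the list of accepted schemes can be given as strings or symbols.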
I do have some unit tests, but they are specific to my application and I can't post them here. I encourage you to build your own.
I found shoulda to be particularly helpful in this situation.
class ModelTest < ActiveSupport::TestCase
  VALID_URLS = [
    'http://godaddy.com', 'http://www.godaddy.com',
    'https://godaddy.com', 'https://www.godaddy.com',
    'http://godaddy.host', 'https://godaddy.host',
  ]

  INVALID_URLS = [
    'ftp://godaddy.com',
    'www.godaddy.com', 'godaddy.com',
    'http:/godaddy.com',
  ]

  should_allow_values_for :myattr, *VALID_URLS
  should_not_allow_values_for :myattr, *INVALID_URLS
end
Please note that this validator does not claim to be perfect. As I explained in the documentation, it does not validate some requirements that might be mandatory for your specific application.
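As one example of such an extra requirement, an application that only accepts public-looking hostnames could add a second check on top of the format validation. The sketch below is an assumption about what that might look like: dotted_host? is a hypothetical helper, and "the host contains a dot" is only a rough rule that deliberately rejects http://localhost and http://foo.

```ruby
require 'uri'

# Hypothetical stricter check: the host must contain at least one dot.
def dotted_host?(value)
  host = URI.parse(value.to_s).host
  !host.nil? && host.include?('.')
rescue URI::InvalidURIError
  false
end

puts dotted_host?('http://foo.bar')
puts dotted_host?('http://localhost')
```

Whether a rule like this belongs in your application depends entirely on where your URLs come from, which is why it is left out of the generic validator.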