Public Suffix is a Ruby domain name parser based on the Public Suffix List, the cross-vendor initiative to provide an accurate list of domain name suffixes.
Parsing a domain name is not as simple as you might suppose. There is no algorithmic method for finding the highest level at which a domain may be registered under a particular top-level domain, because the policies differ with each registry. The only reliable method is to maintain a list of all top-level domains and the levels at which domains can be registered under them. This is the aim of the effective TLD list, as the Public Suffix List is also known.
The Public Suffix Service Ruby library parses domain names using the definitions stored in the Public Suffix List.
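To make the problem concrete, here is a minimal sketch of list-based parsing. It does not use the library's API: the `TOY_SUFFIXES` set and the `registrable_domain` helper are hypothetical names invented for this illustration, and the four suffixes are a hand-picked subset. The real list has thousands of entries, plus wildcard and exception rules that this sketch ignores.

```ruby
require "set"

# Hypothetical, hand-picked subset of suffixes. The real Public Suffix List
# is far larger and also contains wildcard and exception rules.
TOY_SUFFIXES = Set.new(%w[com uk co.uk it]).freeze

# Returns the registrable domain (the longest matching suffix plus one more
# label), or nil when the name is itself a suffix or has no known suffix.
def registrable_domain(name)
  labels = name.downcase.split(".")
  return nil if TOY_SUFFIXES.include?(labels.join("."))

  # Try candidate suffixes from longest to shortest so that "co.uk"
  # takes precedence over "uk".
  (1...labels.size).each do |i|
    suffix = labels[i..-1].join(".")
    return labels[(i - 1)..-1].join(".") if TOY_SUFFIXES.include?(suffix)
  end
  nil
end

registrable_domain("www.example.com")   # => "example.com"
registrable_domain("www.example.co.uk") # => "example.co.uk"
registrable_domain("co.uk")             # => nil
```

Without the list, nothing in the string itself tells you that `co.uk` is a suffix while `example.com` is a registered domain, which is why a naive "take the last two labels" rule fails.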
This week I released Public Suffix Service 0.5.0. I normally don't create a post for every minor release of my libraries, but this version is really worth a mention.
Compared to the previous version, there are no public API changes. The only change, and a very significant one, is an internal refactoring of the main definition lookup method by Camilo Lopez, which improves performance by orders of magnitude.
Here's a very simple benchmark comparing Public Suffix Service 0.4.0 with 0.5.0.
$ ruby bm_pss_gem.rb 0.4.0
Rehearsal -------------------------------------------------
Version 0.4.0   5.540000   0.170000   5.710000 (  6.682396)
---------------------------------------- total: 5.710000sec

                    user     system      total        real
Version 0.4.0   5.320000   0.130000   5.450000 (  5.614912)

$ ruby bm_pss_gem.rb 0.5.0
Rehearsal -------------------------------------------------
Version 0.5.0   0.140000   0.020000   0.160000 (  0.155435)
---------------------------------------- total: 0.160000sec

                    user     system      total        real
Version 0.5.0   0.020000   0.000000   0.020000 (  0.019537)
No, that's not a typo. Thanks to Camilo, the library completed the same run in about 0.02 seconds instead of 5.61 seconds.
Public Suffix Service 0.4.0 uses the standard algorithm described on the Public Suffix List website, which is known not to be the most efficient. The new 0.5.0 release adds an internal index that dramatically speeds up definition lookup.
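The general idea can be sketched as follows. This is not the code from Camilo's patch: `RULES`, `RULE_INDEX` and both `find_rule_*` methods are hypothetical names, and grouping rules by their rightmost label is just one plausible way to build such an index. The sketch only shows why an index helps: the linear version tests every rule on every lookup, while the indexed version tests only the few rules that share the name's rightmost label.

```ruby
# Hypothetical rule set standing in for the parsed definition file.
RULES = %w[com net org uk co.uk org.uk it jp co.jp].freeze

# Linear scan: test every rule against the name and keep the longest match.
def find_rule_linear(name)
  RULES.select { |rule| name == rule || name.end_with?(".#{rule}") }
       .max_by { |rule| rule.count(".") }
end

# Index built once up front: rules grouped by their rightmost label,
# e.g. "uk" => ["uk", "co.uk", "org.uk"].
RULE_INDEX = RULES.group_by { |rule| rule.split(".").last }.freeze

# Indexed lookup: only rules sharing the name's rightmost label are tested.
def find_rule_indexed(name)
  candidates = RULE_INDEX.fetch(name.split(".").last, [])
  candidates.select { |rule| name == rule || name.end_with?(".#{rule}") }
            .max_by { |rule| rule.count(".") }
end

find_rule_linear("www.example.co.uk")  # => "co.uk"
find_rule_indexed("www.example.co.uk") # => "co.uk"
```

With nine rules the difference is negligible, but with a definition file containing thousands of rules, narrowing each lookup to a handful of candidates is the kind of change that can plausibly account for a large speedup.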
Many thanks to Camilo Lopez for his high-quality contribution. This patch is probably one of the best demonstrations of the benefits of releasing your libraries as open source projects. RoboDomain can now take advantage of a much more efficient domain parsing library.
Public Suffix Service 0.5.1, released today, also includes an update to the definition file.