I'm working very hard to include some of the most important features in the new version of the Ruby Whois library.
Today, I'm very happy to report that this week I closed issue #18, which introduces a completely new caching system for Whois::Answer::Parser.
The way Whois parsers currently work is to extract a property only the very first time it is requested.
r = Whois.query "weppos.it"
# the property has never been requested before
# the value is computed, cached and returned
r.status
# => :ok
# the property has been requested before
# the cached value is returned without further computation
r.status
# => :ok
So far, so good.
Under the hood, the system creates an instance variable on the parser for every requested property.
r = Whois.query "weppos.it"
# get the first parser, because r.status
# relies on r.parser.parsers.first.status
p = r.parser.parsers.first
# value is not cached
p.instance_variable_get("@status")
# => nil
p.status
# => :ok
# value is cached
p.instance_variable_get("@status")
# => :ok
However, this approach has a couple of drawbacks.
First, it creates an instance variable for every single property. Because of the large (and increasing) number of properties, each parser object ends up holding a large number of instance variables. This makes it hard, for instance, to sweep the cache, because you have to loop through all the instance variables and remove each one.
Second, there's a small inefficiency. Ruby doesn't require you to declare instance variables, so an undefined instance variable evaluates to nil. But nil is also a legitimate value, and properties can have a nil value. With this implementation, there is no way to distinguish a value that is nil from one that hasn't been computed yet, so nil properties never hit the cache.
def created_on
  @created_on ||= if very_expensive_scan(/created_on/)
    # ... set the value
  end
end
# calling #created_on several times will continue to perform
# the very_expensive_scan as long as the value is nil.
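To see the problem concretely, here is a minimal, self-contained sketch. The MemoizedParser class and its scan counter are hypothetical and exist only for this illustration; they are not part of the Whois library.

```ruby
# Hypothetical parser, used only to illustrate the ||= problem.
class MemoizedParser
  attr_reader :scans

  def initialize
    @scans = 0
  end

  # Stand-in for an expensive scan that finds nothing.
  def very_expensive_scan
    @scans += 1
    nil
  end

  # Same memoization pattern as above: ||= re-evaluates the
  # right-hand side whenever @created_on is nil (or false).
  def created_on
    @created_on ||= very_expensive_scan
  end
end

parser = MemoizedParser.new
3.times { parser.created_on }

# the scan ran on every call, because nil was never "cached"
parser.scans
# => 3
```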
The new approach uses a single instance variable called @cached_properties as the cache. The variable contains a Hash<:key => value>, where the key is the property name and the value is the cached result.
If the cache doesn't contain a key for a given property, then the method hasn't been executed yet. Cached nil properties return nil without further computation.
The method #cached_properties_fetch takes care of everything.
def created_on
  cached_properties_fetch(:created_on) do
    nil
  end
end
# value is cached and returned
created_on
# => nil
# the request hits the cache
created_on
# => nil
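For readers curious how such a method could work, here is one possible sketch. It is an assumption for illustration, not necessarily the library's actual implementation: the SketchParser class and the scan counter are hypothetical, and the key idea is simply to check for the presence of the key rather than the truthiness of the value.

```ruby
# Hypothetical sketch, not the library's actual code.
class SketchParser
  attr_reader :scans

  def initialize
    @cached_properties = {}
    @scans = 0
  end

  def created_on
    cached_properties_fetch(:created_on) do
      @scans += 1 # stand-in for an expensive scan
      nil
    end
  end

  private

  # Runs the block only if the key is missing from the cache.
  # Hash#key? is true even when the stored value is nil,
  # so a nil result is cached like any other value.
  def cached_properties_fetch(key)
    return @cached_properties[key] if @cached_properties.key?(key)
    @cached_properties[key] = yield
  end
end

parser = SketchParser.new
3.times { parser.created_on }

# the block ran only once, even though the value is nil
parser.scans
# => 1
```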
If you need to sweep the cache, reset @cached_properties to an empty Hash.
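The difference in sweeping can be sketched as follows. Both classes, the @content variable and the sweep_cache method name are hypothetical and chosen only for this comparison; the old-style sweep has to walk the instance variables and take care not to remove real state, while the new-style sweep is a single assignment.

```ruby
# Hypothetical old-style parser: one instance variable per property.
class OldStyleParser
  def initialize(content)
    @content = content
  end

  def status
    @status ||= :ok
  end

  # Must loop over every instance variable and skip real state.
  def sweep_cache
    (instance_variables - [:@content]).each do |name|
      remove_instance_variable(name)
    end
  end
end

# Hypothetical new-style parser: a single Hash holds the cache.
class NewStyleParser
  def initialize(content)
    @content = content
    @cached_properties = {}
  end

  def status
    @cached_properties.fetch(:status) { @cached_properties[:status] = :ok }
  end

  def sweep_cache
    @cached_properties = {}
  end
end

old_parser = OldStyleParser.new("raw response")
old_parser.status
old_parser.sweep_cache
old_parser.instance_variables
# => [:@content]

new_parser = NewStyleParser.new("raw response")
new_parser.status
new_parser.sweep_cache
new_parser.instance_variable_get(:@cached_properties)
# => {}
```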