#345 Hstore pro
- Download:
- source codeProject Files in Zip (85.5 KB)
- mp4Full Size H.264 Video (21.3 MB)
- m4vSmaller H.264 Video (11.5 MB)
- webmFull Size VP8 Video (13.6 MB)
- ogvFull Size Theora Video (25.4 MB)
Sometimes when we’re building an application we run into a situation that would benefit from a schema-less database. Below is an app that has different kinds of product records including books and videos.
We need to store different information based on the type of product. For a book we want to store the author while for videos we want to store a rating and a runtime. This information is currently stored in a description
text field in the database which isn’t very efficient and gives us no way to extract these attributes and act upon them individually. We could make each of these attributes a separate column in the database but this can quickly get out of hand as we introduce more types of products to our application. How do we best deal with this situation?
Hstore
At this point we might consider switching to a NoSQL database such as MongoDB but as our application uses Postgres we can try hstore instead. This allows us to store a hash of attributes in a single database column. Hstore isn’t just a serialized hash in a text field, it provides a multitude of operators and functions that we can use to query and interact with this data. We’ll add it to our application so store data related to each product type.
First a couple of notes about our application. We’ve already configured it to use Postgres as its database as we showed in episode 342. In our application config file we’ve uncommented the line that sets the schema_format to sql as the schema won’t be simple enough to represent in Ruby code.
# Use SQL instead of Active Record's schema dumper when creating the database. # This is necessary if your schema can't be completely dumped by the schema dumper, # like if you have constraints or database-specific column types config.active_record.schema_format = :sql
To add hstore to our app we’ll use the activerecord-postgres-hstore gem. When Rails 4 is released this will have built-in support for hstore which will make this gem unnecessary. Installing the gem is done in the usual way, by adding it to the gemfile and running bundle.
gem 'activerecord-postgres-hstore'
Next we’ll need to run a generator to set up hstore.
$ rails g hstore:setup create db/migrate/20120504000000_setup_hstore.rb
This generates a migration that adds the hstore
extension to our database.
class SetupHstore < ActiveRecord::Migration def self.up execute "CREATE EXTENSION hstore" end def self.down execute "DROP EXTENSION hstore" end end
Before we migrate the database we’ll create another migration that adds a properties
column to our products
table. This will be used to store the dynamic attributes that are different for each type of product and will be of type hstore
, which is will be a valid datatype once we’ve run the other migration.
$ rails g migration add_properties_to_products properties:hstore
There’s one more migration that we’ll create now which adds an index to our new column as it’s generally a good idea to index these columns.
class IndexProductsProperties < ActiveRecord::Migration def up execute "CREATE INDEX products_properties ON products USING GIN(properties)" end def down execute "DROP INDEX products_properties" end end
This migration adds a migration to the properties column using the GIN
function. We could use GiST
here instead and there’s more information about the differences between these in the documentation.
With all these migrations in place we can now migrate that database by running rake db:migrate
.
Using Hstore
We’ll start by demonstrating hstore in the console. First we’ll fetch a product. This now has a properties
attribute which will be nil
be default.
1.9.3-p125 :001 > p = Product.first 1.9.3-p125 :002 > p.properties => nil
We can set this attribute to a hash of values. The product we’ve fetched is a video so we’ll give it rating
and runtime
values then save it. This will save this information into the properties
column in the database.
1.9.3-p125 :003 > p.properties = { rating: "PG-13", runtime: 107 } => {:rating=>"PG-13", :runtime=>107} 1.9.3-p125 :004 > p.save
When we reload the product and check its properties we’ll get that hash of values.
1.9.3-p125 :005 > p.reload 1.9.3-p125 :006 > p.properties => {"rating"=>"PG-13", "runtime"=>"107"}
One important difference about the hash that’s retuned from the database is that the keys and values are all strings, even though we used a symbol and an integer when we set it. Hstore only stores string values so if we want to store a boolean, date or integer value we’ll need to convert it manually afterwards. Another thing to be aware of is that the properties
object will be a different object each time we fetch it. If we try to set a specific property through this hash it won’t work as the old hash will be used each time. We always have to set the full hash each time.
# This won’t work 1.9.3-p125 :007 > p.properties["runtime"] = 123 => 123 1.9.3-p125 :008 > p.properties => {"rating"=>"PG-13", "runtime"=>"107"} # We have to set the full hash like this 1.9.3-p125 :009 > p.properties = { :rating => "PG-13", :runtime => 123 } => {:rating=>"PG-13", :runtime=>123} 1.9.3-p125 :010 > p.properties => {:rating=>"PG-13", :runtime=>123}
Querying Hstore Columns
With these problems out of the way we can start to look at the interesting stuff: querying our new hstore column. The documentation shows us the different operators that we can use here and one of them is -> which is used to extract a value from a hstore, given a key. Let’s say that we want to find all the products with a rating of “PG-13”, we can do so like this:
1.9.3-p125 :015 > Product.where("properties -> 'rating' = 'PG-13'") Product Load (0.8ms) SELECT "products".* FROM "products" WHERE (properties -> 'rating' = 'PG-13') => [#<Product id: 1, name: "The Sixth Sense", category: "Videos", price: #<BigDecimal:7fdb01f85c30,'0.999E1',18(18)>, description: "Rated: PG-13\n\nRuntime: 107 minutes\n\nA boy who commu...", created_at: "2012-05-04 17:46:25", updated_at: "2012-05-04 20:24:50", properties: {"rating"=>"PG-13", "runtime"=>"107"}>]
We can use LIKE
here, too, and find all the products with a rating that contains the letter G like this:
1.9.3-p125 :016 > Product.where("properties -> 'rating' LIKE '%G%'")
If we want to compare a number, say finding all the products with a runtime
greater than 100 we’ll have to cast the value to an integer which we can do like this:
1.9.3-p125 :017 > Product.where("(properties -> 'runtime')::int > 100")
Typecasting in Postgres can be a little tricky as it will raise an exception if any of the data doesn’t convert to the target datatype. It can be necessary to add another WHERE
clause to ensure that we only fetch the records that have a value that can be converted. If all we’re going is a simple string comparison then it’s more efficient to use the @>
operator instead. This will check to see if the hstore on the left contains the one on the right.
1.9.3-p125 :021 > Product.where("properties @> ('rating' => 'PG-13')")
The advantage of this is that it will use the index if there is one.
Using Hstore in Our Application
Now that we know how hstore works let’s apply it to our application’s product form. Instead of having each book’s author’s name in the description
field we’ll have a separate field for the author. We’ll start by adding a new field to the form partial.
<div class="field"> <%= f.label :author %><br /> <%= f.text_field :author %> </div>
We’ll need accessor methods on Product
for this new field. It would be nice if we could take advantage of the new store features that Rails 3.2 provides. There’s a store_accessor
method that will do exactly what we want. We pass it the name of a column, in this case properties
, and a list of attributes that we want to make accessors for and this will create a getter and setter method. Unfortunately this isn’t compatible with the hstore gem. Presumably it will work with Rails 4 but for now we’ll have to this manually and write getter and setter methods.
class Product < ActiveRecord::Base attr_accessible :name, :category, :price, :description, :author # store_accessor :properties, :author def author properties && properties["author"] end def author=(value) self.properties = (properties || {}).merge("author" => value) end end
The getter method checks that the properties
hash exists and if so returns the value of its author
key. The setter method sets the full hash to the value of the existing hash (or an empty hash if doesn’t exist) and merges the new value for the author
key. The final change we’ve made is to add the new attribute to the list in attr_accessible
to make it accessible through mass assignment.
Reloading the page now shows the new field and if we fill in this field, submit the form then go back to edit it again we’ll see the value we entered in the author field.
Generating The Other Accessor Methods
We could write similar getter and setter methods for the other hstore attributes but instead we’ll use meta-programming to generate them.
class Product < ActiveRecord::Base attr_accessible :name, :category, :price, :description # store_accessor :properties, :author %w[author rating runtime].each do |key| attr_accessible key define_method(key) do properties && properties[key] end define_method("#{key}=") do |value| self.properties = (properties || {}).merge(key => value) end end end
Now we have a an array of the attributes we store in our properties
hash and we loop through them and call define_method
on each one to create each getter and setter. We also call attr_accessor
to add the attribute to the list of accessible attributes. Once we’ve added rating
and runtime
fields to the edit and show forms in the same way we added author
we can try this out. If we edit one of our films then visit its page we’ll see the new fields listed.
One thing we should do is make the edit form smarter so that it only shows the author field if we’re editing a book and the rating
and runtime
fields if we’re editing a video. Doing this is a out of the scope of this episode, however. An alternative approach to dealing with hstore attributes in a form is demonstrated in this example app. This allows us to add and remove attributes dynamically and store them in hstore. This is a less strict approach that we have in our application but it might work well in others.
One benefit to having accessor methods to hstore attributes is that it makes validations easy. If we want to make sure that any entered runtime value is valid we can do so in the usual way.
validates_numericality_of :runtime, allow_blank: true
There are many different styles of form that we can make with hstore. We can treat attributes like database columns as we have in our application or make a more free-flowing hash of data that lets us enter anything we want. We’ll finish off this episode with a quick bonus tip. If we want to perform queries on our attributes we can do so by adding a scope call, like this:
class Product < ActiveRecord::Base attr_accessible :name, :category, :price, :description validates_numericality_of :runtime, allow_blank: true # store_accessor :properties, :author %w[author rating runtime].each do |key| attr_accessible key scope "has_#{key}", lambda { |value| where("properties @> (? => ?)", key, value) } define_method(key) do properties && properties[key] end define_method("#{key}=") do |value| self.properties = (properties || {}).merge(key => value) end end end
We can now use this scope to find, say, all the products with a rating of “R”.
1.9.3-p125 :001 > Product.has_rating("R") Product Load (3.2ms) SELECT "products".* FROM "products" WHERE (properties @> ('rating' => 'R')) => []
A good refactoring to do on our Product
model would be to move the code that creates the accessors for each attribute into some kind of method call to clean the code up, but we’ll leave this as an exercise.