Creating Custom Sorts In Apache Solr

I attended Drupalcon for the first time this year in San Francisco, and of all the sessions I attended, the one that really stood out for me was Apache Solr Search Mastery, put on by Robert Douglass and Peter Wolanin of Acquia and James McKinney of Evolving Web.  It was like the first time I read "Pro Drupal Development", in that it really opened my eyes to what is possible with the Solr module.  On top of that, I was almost immediately able to take what I learned back to a project I was working on and start using it.

One of the first use cases I had was the need to add a custom sort to the search results.  Out of the box, the module has the following five sorts available:

  •     Relevancy
  •     Title
  •     Type
  •     Author
  •     Date

However, the client also wanted to sort by the number of times a node had been viewed, as provided by the core Statistics module.  To make this data available in Solr as a sortable field, we follow a two-step process:

  1.     Add the field to the Solr index
  2.     Tell Solr that the field is sortable

Adding the field to the index

The apachesolr module has multiple available hooks that can be used to alter the search data.  In this case, the one that we want is hook_apachesolr_update_index.  This hook is run for each node, and is passed the $document object by reference.  $document is the XML document that is passed to Solr for indexing.  The source for our data is the totalcount field in the node_counter table.

function mymodule_apachesolr_update_index(&$document, $node) {
  // Get count from node_counter table
  $count = db_result(db_query("SELECT totalcount FROM {node_counter} WHERE nid = %d", $node->nid));

  // Add field to index, but only if a node_counter record exists
  if ($count !== FALSE) {
    $document->tis_hit_count = (int) $count;
  }
}

So all we do is run a query to get our count value, and add it to the Solr index in the tis_hit_count field, but only if the query returns a value.  If you try to add an empty value, such as the case where there is no record in node_counter for a node yet, Solr will throw an error.

One important thing to notice is the field name.  The 'tis_' at the beginning of the name tells Solr what kind of data is stored in the field.  If you look in the schema.xml file that comes with the module (and that you use to replace the default Solr file when installing the module), there is a section that lists all of the dynamic field naming patterns.

<!-- Dynamic field definitions.  If a field name is not found, dynamicFields
     will be used if the name matches any of the patterns.
     RESTRICTION: the glob-like pattern in the name attribute must have
     a "*" only at the start or the end.
     EXAMPLE:  name="*_i" will match any field ending in _i (like myid_i, z_i)
     Longer patterns will be matched first.  if equal size patterns
     both match, the first appearing in the schema will be used.  -->

   <dynamicField name="is_*"  type="integer" indexed="true"  stored="true" multiValued="false"/>
   <dynamicField name="im_*"  type="integer" indexed="true"  stored="true" multiValued="true"/>
   <dynamicField name="sis_*" type="sint"    indexed="true"  stored="true" multiValued="false"/>
   <dynamicField name="sim_*" type="sint"    indexed="true"  stored="true" multiValued="true"/>
   <dynamicField name="sm_*"  type="string"  indexed="true"  stored="true" multiValued="true"/>
   <dynamicField name="tm_*"  type="text"    indexed="true"  stored="true" multiValued="true" termVectors="true"/>
   <dynamicField name="ss_*"  type="string"  indexed="true"  stored="true" multiValued="false"/>
   <dynamicField name="ts_*"  type="text"    indexed="true"  stored="true" multiValued="false" termVectors="true"/>
   <dynamicField name="tsen2k_*" type="edge_n2_kw_text" indexed="true" stored="true" multiValued="false" omitNorms="true" omitTermFreqAndPositions="true" />
   <dynamicField name="ds_*"  type="date"    indexed="true"  stored="true" multiValued="false"/>
   <dynamicField name="dm_*"  type="date"    indexed="true"  stored="true" multiValued="true"/>
   <dynamicField name="tds_*" type="tdate"   indexed="true"  stored="true" multiValued="false"/>
   <dynamicField name="tdm_*" type="tdate"   indexed="true"  stored="true" multiValued="true"/>
   <dynamicField name="bm_*"  type="boolean" indexed="true"  stored="true" multiValued="true"/>
   <dynamicField name="bs_*"  type="boolean" indexed="true"  stored="true" multiValued="false"/>
   <dynamicField name="fs_*"  type="sfloat"  indexed="true"  stored="true" multiValued="false"/>
   <dynamicField name="fm_*"  type="sfloat"  indexed="true"  stored="true" multiValued="true"/>
   <dynamicField name="ps_*"  type="sdouble" indexed="true"  stored="true" multiValued="false"/>
   <dynamicField name="pm_*"  type="sdouble" indexed="true"  stored="true" multiValued="true"/>

   <dynamicField name="tis_*" type="tint"    indexed="true"  stored="true" multiValued="false"/>
   <dynamicField name="tim_*" type="tint"    indexed="true"  stored="true" multiValued="true"/>
   <dynamicField name="tls_*" type="tlong"   indexed="true"  stored="true" multiValued="false"/>
   <dynamicField name="tlm_*" type="tlong"   indexed="true"  stored="true" multiValued="true"/>
   <dynamicField name="tfs_*" type="tfloat"  indexed="true"  stored="true" multiValued="false"/>
   <dynamicField name="tfm_*" type="tfloat"  indexed="true"  stored="true" multiValued="true"/>
   <dynamicField name="tps_*" type="tdouble" indexed="true"  stored="true" multiValued="false"/>
   <dynamicField name="tpm_*" type="tdouble" indexed="true"  stored="true" multiValued="true"/>
   <!-- Sortable version of the dynamic string field -->
   <dynamicField name="sort_ss_*" type="sortString" indexed="true" stored="false"/>
   <copyField source="ss_*" dest="sort_ss_*"/>

 

In this case, the 'tint' field type is most appropriate for this data.  For more information on naming fields you can read the schema.xml document, or watch the video from the Drupalcon session (Peter talks about dynamic fields starting at 15:50 into the video).
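
To make the prefix convention concrete, here is a minimal sketch in plain PHP that looks up which dynamic field pattern a field name falls under.  It is not part of the apachesolr module: the function names example_dynamic_fields() and example_match_dynamic_field() are made up for illustration, and the table holds only a handful of the patterns from the schema.xml listing above.

```php
<?php
/**
 * Hypothetical helper: a subset of the dynamicField patterns from schema.xml,
 * keyed by prefix, each mapping to array(Solr field type, multiValued).
 */
function example_dynamic_fields() {
  return array(
    'is_'  => array('integer', FALSE),
    'im_'  => array('integer', TRUE),
    'ss_'  => array('string', FALSE),
    'sm_'  => array('string', TRUE),
    'ds_'  => array('date', FALSE),
    'tis_' => array('tint', FALSE),
    'tim_' => array('tint', TRUE),
    'tls_' => array('tlong', FALSE),
  );
}

/**
 * Hypothetical helper: find the pattern that a field name matches.
 * Returns array(prefix, type, multiValued), or NULL when nothing matches.
 */
function example_match_dynamic_field($name, array $patterns) {
  $best = NULL;
  foreach ($patterns as $prefix => $info) {
    // A pattern such as "tis_*" matches any name that starts with "tis_".
    if (strpos($name, $prefix) === 0) {
      // Prefer the longest matching pattern, as the schema comment describes.
      if ($best === NULL || strlen($prefix) > strlen($best[0])) {
        $best = array($prefix, $info[0], $info[1]);
      }
    }
  }
  return $best;
}

$match = example_match_dynamic_field('tis_hit_count', example_dynamic_fields());
```

With this table, 'tis_hit_count' resolves to the 'tis_' pattern, meaning a single-valued 'tint' field, while a name with no recognised prefix returns NULL.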

Once this has been saved, you will need to re-index the content at admin/settings/apachesolr/index.  You can then go to admin/reports/apachesolr and verify that your field has been indexed.

Telling Solr that the field is sortable

Now that the data has been added to the index, we need to tell Solr that the field is sortable.  To do this, we use another hook, hook_apachesolr_prepare_query().  This hook is run after the module has built the query, and it allows you to make any modifications before the query is sent to Solr.

 

function mymodule_apachesolr_prepare_query(&$query, &$params) {
  $query->set_available_sort('tis_hit_count', array(
    'title' => t('Number of Views'),
    'default' => 'asc',
  ));
}

As you can see in the code above, the $query object is passed by reference, and we use the set_available_sort method to add this as a sort field that will be displayed in the Apache Solr Sorting: Core block.  The first parameter is the field name that was defined in hook_apachesolr_update_index(), and the second is an array that contains the string that will be displayed in the Sorting block and the default sort direction (ascending or descending).  In addition, if you do not want to use any of the default sorts, they can be removed at the same time with the remove_available_sort method, passing the field name:

$query->remove_available_sort('solr_field_name');
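
Putting the two calls together, a hook that adds one sort and removes another might look roughly like the sketch below.  So that it can be run outside Drupal, it uses a made-up stand-in class (ExampleSortQuery) in place of the module's real query object, a fallback t() function, and a hook named example_apachesolr_prepare_query(); all three are assumptions for illustration, and 'solr_field_name' is still a placeholder rather than a real field.

```php
<?php
/**
 * Hypothetical stand-in for the module's query object.  It only records the
 * available sorts, so the hook body below can run without Drupal or Solr.
 */
class ExampleSortQuery {
  public $sorts = array();

  public function __construct(array $sorts) {
    $this->sorts = $sorts;
  }

  public function set_available_sort($field, array $sort) {
    $this->sorts[$field] = $sort;
  }

  public function remove_available_sort($field) {
    unset($this->sorts[$field]);
  }
}

// Drupal provides t(); this fallback exists only so the sketch runs standalone.
if (!function_exists('t')) {
  function t($string) {
    return $string;
  }
}

function example_apachesolr_prepare_query(&$query, &$params) {
  // Add the custom sort on the field indexed earlier.
  $query->set_available_sort('tis_hit_count', array(
    'title' => t('Number of Views'),
    'default' => 'asc',
  ));

  // Drop a sort we do not want to offer ('solr_field_name' is a placeholder).
  $query->remove_available_sort('solr_field_name');
}

// Usage: start with one placeholder sort, then let the hook adjust the list.
$query = new ExampleSortQuery(array(
  'solr_field_name' => array('title' => 'Placeholder', 'default' => 'asc'),
));
$params = array();
example_apachesolr_prepare_query($query, $params);
```

After the hook runs, the stand-in object lists only the 'tis_hit_count' sort.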

And voila!  You should now have a Number of Views link in your Sort block.

One additional thing to note is that there are actually two different hooks for modifying the $query object: hook_apachesolr_prepare_query and hook_apachesolr_modify_query.  The one you want to use is determined by whether or not you want the user to see the change.  Anything modified in prepare_query will be visible to your users, and anything modified in modify_query will not be visible to your users (a more detailed explanation is given by James McKinney in the DCSF session video from 37:20-39:15).  In this case, since we want the change to be visible to our users, we use hook_apachesolr_prepare_query.

There are many more customizations that can be done in Solr with the available hooks.  In subsequent posts, I will show how to add a field so that it is displayed with the search results, and a more complicated use case of creating a custom search path that automatically applies a filter to the results.
