Solr

  • Latest: 0.2
  • Last Updated: 30 April 2010
  • Grails version: 1.1 > *
  • Authors: Mike Brevoort
Dependency:
compile ":solr:0.2"

Installation

grails install-plugin solr

Description

The documentation for 0.1 is still very much a work in progress

Overview

The Solr Grails plugin integrates a Grails domain model with the Apache Solr search engine through the SolrJ API. It is inspired by the Searchable plugin, though in its initial state it lacks many of Searchable's features, including the ability to return a list of domain objects as a result set. With Solr, however, this plugin overcomes a key limitation of the Searchable plugin: it scales easily, supporting multiple or clustered application servers, and Solr itself is highly scalable. It also supports facets "out of the box". Wow, that's a phrase that needs to be updated… how about "out of the bits"? Anyway...

This plugin seeks to do the following:

  • Make it easy to index a Grails domain model
  • Make it easy to query both Grails domain data and any other documents in the Solr index
  • Provide conveniences without ever masking access to the excellent SolrJ API
  • Be flexible enough to work with existing Solr deployments and indexes
  • Become the new de facto Grails search plugin

The plugin ships with Solr bundled in the plugin install to make it very easy to get started. SolrJ can also run Solr in an embedded mode (inside the application itself, with no separate deployment and no communication over HTTP), but this plugin intentionally does not support that mode because it has some limitations and is much less common, and therefore less proven.

New to Solr? Check out this 5 minute primer or other resources at the end of the page

The initial work for this plugin was generously sponsored by Patheos.com and Avalon Consulting LLC. It is being actively maintained by Mike Brevoort and many others have volunteered to help moving forward.

Installation

The plugin is published to the main Grails plugin repo and can be installed in the usual way:

grails install-plugin solr

The source for the plugin is at GitHub.

Configuration

After the plugin is installed, there will be a new solr directory under grails-app/conf. It contains all of the Solr config files for the underlying instance. Each time Solr is started, these config files are copied into the Solr Home directory. Though the plugin tries to let you get by knowing as little as possible about Solr, you have full control over the schema and the rest of the configuration. You will also need these files for your Solr installation when you're ready to deploy your application.

mike$ ls grails-app/conf/solr/
admin-extra.html		schema.xml			stopwords.txt
elevate.xml			scripts.conf			synonyms.txt
mapping-ISOLatin1Accent.txt	solrconfig.xml			xslt
protwords.txt			spellings.txt

The two primary configuration files are:

  • schema.xml - field types, fields, etc.
  • solrconfig.xml - indexing parameters, request handlers, plugins, cache settings, etc.

The plugin relies heavily on dynamic fields, which allow you to add fields to the index on the fly while specifying the data type by convention through the field name's suffix. When the plugin indexes your Grails domain class, it tries to resolve each attribute's type and appends the matching suffix to the Solr field name. For example:

int age       // will be indexed as age_i
Date birthday // will be indexed as birthday_dt

For more detail, here is a snippet from the schema.xml file:

<!-- Dynamic field definitions.  If a field name is not found, dynamicFields
        will be used if the name matches any of the patterns.
        RESTRICTION: the glob-like pattern in the name attribute must have
        a "*" only at the start or the end.
        EXAMPLE:  name="*_i" will match any field ending in _i (like myid_i, z_i)
        Longer patterns will be matched first.  if equal size patterns
        both match, the first appearing in the schema will be used.  -->
   <dynamicField name="*_i"  type="int"    indexed="true"  stored="true"/>
   <dynamicField name="*_s"  type="string"  indexed="true"  stored="true"/>
   <dynamicField name="*_l"  type="long"   indexed="true"  stored="true"/>
   <dynamicField name="*_t"  type="text"    indexed="true"  stored="true"/>
   <dynamicField name="*_b"  type="boolean" indexed="true"  stored="true"/>
   <dynamicField name="*_f"  type="float"  indexed="true"  stored="true"/>
   <dynamicField name="*_d"  type="double" indexed="true"  stored="true"/>
   <dynamicField name="*_dt" type="date"    indexed="true"  stored="true"/>

   <!-- some trie-coded dynamic fields for faster range queries -->
   <dynamicField name="*_ti" type="tint" indexed="true" stored="true"/>
   <dynamicField name="*_tl" type="tlong" indexed="true" stored="true"/>
   <dynamicField name="*_tf" type="tfloat" indexed="true" stored="true"/>
   <dynamicField name="*_td" type="tdouble" indexed="true" stored="true"/>
   <dynamicField name="*_tdt" type="tdate" indexed="true" stored="true"/>

   <dynamicField name="*_pi" type="pint" indexed="true" stored="true"/>

   <dynamicField name="ignored_*" type="ignored" multiValued="true"/>
   <dynamicField name="attr_*" type="textgen" indexed="true" stored="true" multiValued="true"/>

<dynamicField name="random_*" type="random" />
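The suffix convention can be sketched in a few lines of plain Java. This is only an illustration and not the plugin's actual code: the class name DynamicFieldNames, the fieldName helper and the choice to reject unmapped types are hypothetical. Only the int to _i and Date to _dt pairs come from the example above; the other pairs are assumed from the dynamicField patterns in the schema snippet.

```java
import java.util.Date;
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper: derives a Solr dynamic field name from a property name and type.
final class DynamicFieldNames {
    private static final Map<Class<?>, String> SUFFIXES = new HashMap<>();
    static {
        // Stated in the documentation: int -> _i, Date -> _dt
        SUFFIXES.put(int.class, "_i");
        SUFFIXES.put(Integer.class, "_i");
        SUFFIXES.put(Date.class, "_dt");
        // Assumed from the dynamicField patterns in schema.xml
        SUFFIXES.put(String.class, "_s");
        SUFFIXES.put(Long.class, "_l");
        SUFFIXES.put(Boolean.class, "_b");
        SUFFIXES.put(Float.class, "_f");
        SUFFIXES.put(Double.class, "_d");
    }

    private DynamicFieldNames() {}

    static String fieldName(String property, Class<?> type) {
        String suffix = SUFFIXES.get(type);
        if (suffix == null) {
            // Assumption: unknown types are rejected here; the plugin may behave differently.
            throw new IllegalArgumentException("No suffix mapped for " + type.getName());
        }
        return property + suffix;
    }

    public static void main(String[] args) {
        System.out.println(fieldName("age", int.class));       // prints age_i
        System.out.println(fieldName("birthday", Date.class)); // prints birthday_dt
    }
}
```

In a real application you would normally ask the plugin for the name through solrFieldName instead of computing it yourself.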

Starting Solr and Solr Config

Once installed, there are three scripts to control the underlying Solr instance.

Use start-solr to start the Jetty instance on port 8983. The script prints details on how to access Solr, including where the Solr Home and Solr log files are located and the URL for accessing Solr directly. Currently only one running instance is supported. If you are working on multiple Grails projects at the same time, you'll need to stop one and start the other, or else project A might use the index of project B.

grails start-solr
.......
-----------
Solr logs can be found here: /Users/mike/.grails/1.2.0/projects/SolrDemo/solr-home/logs
Console access: http://localhost:8983/solr/
-----------

To stop Solr, use stop-solr. This script stops any running Solr instance that has a stop port of 8079, which is the default stop port for this plugin.

grails stop-solr

To clear out the index you can use delete-solr-index which will stop Solr if it's started and delete the physical index files.

grails delete-solr-index   # stops Solr and deletes the index files

Indexing Domain Classes

To enable indexing of a domain class, include this static declaration. Doing so enables the dynamic methods on the domain class.

static enableSolrSearch = true

To have the index updated on inserts, updates and deletes of your domain instances, declare this property in your domain class:

static solrAutoIndex = true

To manually call index on a domain instance you may call indexSolr() on it. For example:

def song = new Song(title: "Rockaway Beach", genre: "Punk", artist: "Ramones").save()
song.indexSolr()

To manually delete the Solr index document:

song.deleteSolr()

Advanced Domain Class indexing

@Solr(field="yourfieldname_s")
yourfieldname_s is the name of the field in the Solr index (configured in schema.xml and possibly in data-config.xml).

@Solr(asText=true)

@Solr(asTextAlso=true)

Indexes this field as a text type in addition to however else it is indexed. The best way to handle this need would be to properly configure the schema.xml file, but for those not familiar with Solr this is an easy way to make sure the field is processed as text, which should be the default search field, and is processed with a WordDelimiterFilter.

@Solr(ignore=false)

Searching

There are currently two ways to execute a query: through dynamic methods on a specific domain class, or by calling search on the SolrService. Calling these methods on a domain class implicitly filters results by that domain.

The simplest way to query a domain is to pass in a query string. This searches the default Solr field (typically text).

Song.searchSolr("Rockaway")

The query uses the Lucene query parser syntax. There is one caveat with the current plugin implementation: you either need to know the names of the fields in the index (because you specified the field with an annotation, for example) or call solrFieldName("yourfield") to look the name up.

def result = Song.searchSolr("${Song.solrFieldName('title')}:\"Rockaway Beach\"")

A future version of the plugin will include a query builder that will hopefully make this more straightforward.
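Until then, one thing to watch for when building field queries by hand is that the Lucene syntax treats whitespace as a term separator, so a multi-word value generally needs to be quoted as a phrase to stay attached to its field. Here is a minimal sketch of a helper that does this in plain Java. The class FieldQueries and its phrase method are hypothetical and not part of the plugin or SolrJ, and title_s is simply the field name used for the title property in the Results examples.

```java
// Hypothetical helper: builds a field:"phrase" clause in Lucene query syntax.
final class FieldQueries {
    private FieldQueries() {}

    static String phrase(String field, String value) {
        // Inside a quoted phrase, backslashes and double quotes must be escaped.
        String escaped = value.replace("\\", "\\\\").replace("\"", "\\\"");
        return field + ":\"" + escaped + "\"";
    }

    public static void main(String[] args) {
        System.out.println(phrase("title_s", "Rockaway Beach")); // prints title_s:"Rockaway Beach"
    }
}
```

The resulting string can be passed to searchSolr as is. Without the quotes, only the first word would be matched against the named field and the remaining words would be searched in the default field.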

For more advanced queries you can set up the SolrJ SolrQuery yourself. In a controller action, that might look something like this:

def search = {
    List fq = []
    def query = new SolrQuery("${params.q}")

    if (params.fq) {
        query.addFilterQuery(params.fq)
        if (params.fq instanceof String)
            fq << params.fq
        else
            fq = params.fq
    }
    if (params.offset) query.setStart(params.offset as int)
    if (params.max) query.setRows(params.max as int)

    query.facet = true
    query.setFacetMinCount(1)
    query.setFacetLimit(10)
    ["genre", "artist", "year", "name"].each {
        query.addFacetField(Song.solrFieldName(it))
    }

    [result: Song.searchSolr(query), q: params.q, fq: fq, solrQueryUrl: query.toString()]
}

You can also query by calling search on the SolrService. Search can either take a string query or a SolrQuery object.

def solrService
solrService.search("Ramones")

Results

Results are returned as an org.grails.solr.SearchResults object. This object contains both the SolrJ QueryResponse object and a list of maps, one per result. Each map has entries keyed by the name of the field within the index as well as by the name of the field on your domain class. For example, each of the following would print the same output for a given result:

result.resultList.each { 
  println "${it.artist}, ${it.title}, ${it.genre}"
}

result.resultList.each {
  println "${it.artist_s}, ${it.title_s}, ${it.genre_s}"
}

result.resultList.each {
  println it."${Song.solrFieldName('artist')}" + ", " + it."${Song.solrFieldName('title')}" + ", " + it."${Song.solrFieldName('genre')}"
}

result.queryResponse.results.each {
  println "${it.getFieldValue(Song.solrFieldName('artist'))}, ${it.getFieldValue(Song.solrFieldName('title'))}, ${it.getFieldValue(Song.solrFieldName('genre'))}"
}

Given the controller action example above, the view to render the results might look something like this:

<div id="results">
    <g:each in="${result.resultList}" var="item">
        <p class="result">
            <solr:resultLink result="${item}">${item.name}</solr:resultLink><br/>	                
            ${item.artist} ${item.year}  - ${item.genre}
        </p>
    </g:each>
    <br/>
    <span class="paging">
        <g:paginate total="${result.total}" max="15" params="[q:q, fq:fq]"/>
    </span>
    <g:if test="${result.total == 0}">
    No Results found
    </g:if>
</div>

More details coming soon.

Teaser: the plugin also includes the ability to do a haversine-based spatial query. Here's what the signature of the SolrService method looks like:

/**
  * Constitute SolrQuery for a haversine based spatial search. This method returns 
  * the SolrQuery object in case you need to manipulate it further (add facets, etc)
  *
  * @param query  a lucene formatted query to execute in addition to the location range query
  * @param lat    latitude in degrees
  * @param lng    longitude in degrees
  * @param range  the proximity range to filter results by (the smaller the range, the better the performance). unit is miles unless the radius param is passed in, in which case it's whatever the unit of the radius is
  * @param start  result number of the first returned result - used in paging (optional, default: 0)
  * @param rows   number of results to include - aka page size (optional, default: 10)
  * @param sort   sort direction asc or desc (optional, default: asc)
  * @param funcQuery provide a function query to be summed with the hsin function (optional)
  * @param radius sphere radius for haversine algorithm (optional, default: 3963.205 [earth radius in miles])
  * @param lat_field SOLR index field for latitude in radians (optional, default: latitude_rad_d)
  * @param lng_field SOLR index field for longitude in radians (optional, default: longitude_rad_d)
  * @return SolrQuery object representing this spatial query
  */  
  SolrQuery getSpatialQuery(query, lat, lng, range, start=0, rows=10, sort="asc", funcQuery="", radius=3963.205, lat_field="latitude_rad_d", lng_field="longitude_rad_d") {
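For readers unfamiliar with the haversine formula, here is a minimal standalone sketch of the distance calculation that such a range filter is based on. It is the textbook formula written in plain Java, not the plugin's code and not Solr's hsin function. The class and method names are hypothetical; only the default radius of 3963.205 miles is taken from the signature above.

```java
// Hypothetical illustration of the haversine great-circle distance.
final class Haversine {
    // Earth radius in miles, matching the default radius in the signature above.
    static final double EARTH_RADIUS_MILES = 3963.205;

    private Haversine() {}

    // Distance between two points given in degrees; the result is in the unit of radius.
    static double distance(double lat1Deg, double lng1Deg,
                           double lat2Deg, double lng2Deg, double radius) {
        double lat1 = Math.toRadians(lat1Deg);
        double lat2 = Math.toRadians(lat2Deg);
        double dLat = lat2 - lat1;
        double dLng = Math.toRadians(lng2Deg - lng1Deg);
        double a = Math.pow(Math.sin(dLat / 2), 2)
                 + Math.cos(lat1) * Math.cos(lat2) * Math.pow(Math.sin(dLng / 2), 2);
        // Clamp to 1.0 to guard against rounding slightly above the valid asin domain.
        return 2 * radius * Math.asin(Math.min(1.0, Math.sqrt(a)));
    }

    // True when the second point lies within range of the first.
    static boolean withinRange(double lat1Deg, double lng1Deg,
                               double lat2Deg, double lng2Deg,
                               double range, double radius) {
        return distance(lat1Deg, lng1Deg, lat2Deg, lng2Deg, radius) <= range;
    }

    public static void main(String[] args) {
        // One degree of longitude along the equator, in miles.
        System.out.println(distance(0, 0, 0, 1, EARTH_RADIUS_MILES));
    }
}
```

Note that the sketch converts degrees to radians itself, whereas the doc comment above describes the indexed latitude and longitude fields as already being stored in radians.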

Taglibs

There are currently two tags in the solr namespace. More details coming soon.

The facet tag example:

<solr:facet field="${Song.solrFieldName('genre')}" result="${result}" fq="${fq}" q="${q}" min="2">
    <h3>Filter by Genre</h3>
</solr:facet>

The result link example:

<solr:resultLink result="${item}">${item.name}</solr:resultLink>

Roadmap

Resources