Using Solr to search database content

Contributor, Jun 16, 2015

I'm working with a database that stores links to various pieces of government regulation.  Each regulation has a discipline, subdiscipline, regulation type, and responsible office.  Each one also has a list of assigned keywords.

Previously, we were just using an HTML form to do a <cfquery> to search these items.  At one point, someone tied our Google appliance into the keyword search to improve results, but now we're dumping the appliance due to cost issues.  I've been asked to come up with a replacement for the appliance, so my first thought was Solr.

I've created a collection, then used cfindex to index it.  If I use cfsearch to pull keyword results, it works quite well.  Here's my cfindex tag:

<cfindex
    collection="docsearch"
    action="update"
    type="custom"
    category="category_name"
    body="keyword"
    custom1="discipline"
    custom2="subdiscipline"
    custom3="fulldate_issued"
    custom4="office_name"
    query="docdata"
    key="id"
    title="document_title"
    urlpath="link">

My problem is that I'm being asked to combine the keyword (Solr) search with the HTML form (select drop-downs for category, subcategory, regulation type, etc.), and I'm having trouble making that work.  If someone enters a keyword, no problem: I can use a query of queries (QoQ) to filter the cfsearch results.  But it's not so easy if the keyword is left blank, because then the cfsearch takes ages to run.  I wish I could require a keyword, but going by our stats, at least half of users leave it blank, so I don't want to change how people use the form.

Has anyone here had any experience using Solr to search database content like this?
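
For reference, the keyword-plus-QoQ approach described above might look roughly like the sketch below. The form field names (form.keyword, form.discipline) and the result names are made up for illustration; only the collection name and the custom1 mapping come from the cfindex tag above.

```cfml
<!--- Hypothetical form fields; adjust to match the real search form --->
<cfparam name="form.keyword" default="">
<cfparam name="form.discipline" default="">

<!--- Keyword search against the Solr collection --->
<cfsearch
    collection="docsearch"
    name="keywordHits"
    criteria="#form.keyword#">

<!--- custom1 was indexed from the discipline column, so filter on it with a QoQ --->
<cfif len(trim(form.discipline))>
    <cfquery name="filteredHits" dbtype="query">
        SELECT *
        FROM keywordHits
        WHERE custom1 = <cfqueryparam value="#form.discipline#" cfsqltype="cf_sql_varchar">
    </cfquery>
<cfelse>
    <cfset filteredHits = keywordHits>
</cfif>
```

This only covers the case where a keyword is supplied; it does nothing about the slow blank-keyword search.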

Contributor, Jun 17, 2015

Success.  I put every field I want to search on into the body attribute, then used custom1 through custom4 for the ID numbers so I can filter the results further.

<cfindex
    collection="pgcsearch"
    action="update"
    type="custom"
    category="cat_id"
    body="keyword,discipline,subdiscipline,office_name,category_name"
    custom1="disc_id"
    custom2="sub_id"
    custom3="yearissued"
    custom4="office_id"
    query="pgcdata"
    key="id"
    title="document_title"
    urlpath="link">

I just wish I had more than 4 custom placeholders, because I'd like to pass two more items through.  Anyway, at least this takes care of my blank keyword problem.
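
One way to read this setup, sketched under some assumptions: since the discipline, office, and category names are now part of the indexed body, the search criteria can be built from whichever form fields are filled in, so cfsearch always has something to match on. The form field names below (form.keyword, form.discipline_name, form.office_name) are hypothetical, and this is not necessarily how the actual search page is wired.

```cfml
<!--- Hypothetical form fields holding the keyword and the selected names --->
<cfparam name="form.keyword" default="">
<cfparam name="form.discipline_name" default="">
<cfparam name="form.office_name" default="">

<!--- Collect only the terms the user actually supplied --->
<cfset terms = []>
<cfloop list="keyword,discipline_name,office_name" index="fieldName">
    <cfif len(trim(form[fieldName]))>
        <cfset arrayAppend(terms, trim(form[fieldName]))>
    </cfif>
</cfloop>

<!--- Search only when there is at least one term to look for --->
<cfif arrayLen(terms)>
    <cfsearch
        collection="pgcsearch"
        name="hits"
        criteria="#arrayToList(terms, ' ')#">
</cfif>
```

The ID values in custom1 through custom4 are then available in the result set for narrowing the hits further, for example with a query of queries.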

Contributor, Jun 17, 2015

Aha! Turns out I can have all the custom fields I want, using the name_datatype naming convention.

So in the <cfindex> tag I can have:

flavor_s="chocolate"

The sky's the limit!
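
Applied to the index from the earlier post, that might look like the sketch below. The two extra field names (regtype_s, issued_s) and the query columns they point at (regulation_type, fulldate_issued) are made up for illustration. I'm also assuming that, as with custom1 through custom4, the value is treated as a query column name when a query is supplied, which is worth verifying.

```cfml
<!--- regtype_s and issued_s are hypothetical dynamic fields using the _s (string) suffix --->
<cfindex
    collection="pgcsearch"
    action="update"
    type="custom"
    category="cat_id"
    body="keyword,discipline,subdiscipline,office_name,category_name"
    custom1="disc_id"
    custom2="sub_id"
    custom3="yearissued"
    custom4="office_id"
    regtype_s="regulation_type"
    issued_s="fulldate_issued"
    query="pgcdata"
    key="id"
    title="document_title"
    urlpath="link">
```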

Contributor, Jun 18, 2015

Slight bug with the custom fields.  The documentation states that the syntax is fieldname_datatype (e.g. field1_s for string, field1_i for integer, etc.).  But I get a datatype mismatch error (There is an invalid attributname-attributevalue combination) whenever I use any suffix other than _s, even with numeric fields.

Looks like it's also been reported here: Bug#3935959 - Solr on ColdFusion 10 Does not Support More than 4 Custom Fields

Fortunately, the _s works, so I can use that as a workaround.
