Solr Parser in CF 2023 Question

Participant, Feb 17, 2024

Does anyone know if CF 2023 uses Solr's default query parser, also known as the "lucene" parser?

 

I have been testing searches using the term modifiers suggested in the Solr Reference Guide 8.2.

A simple one-word keyword search seems to function properly. However, I am getting results with more complex keyword strings that are out of whack.

For example, if I search for "medical device", including the quotes as indicated by the guide, the result is 55868 candidate hits. That is roughly 500 hits short of every candidate in the system! On our production machine, running Verity against the same data, the actual result is 1618 candidate hits.

As another example, "medical device"~4 again results in 55868 candidate hits. The Verity equivalent, medical<near/4>device, actually returns 8649. Just for the heck of it, I tried "medical device"~4 without the quotes on the development machine and got 16698 hits. Again, nowhere near the correct number of 8649.

 

Should I be using different parsing terms for Solr running under CF 2023?

 

Please advise.  Thanks.

Alex Craig, General Manager
"Avid Saltwater Fly Fisherman"

Community Expert, Feb 17, 2024

Solr is a superset of Lucene, basically just a web application that works with Lucene. So, my guess is that you're using the Lucene query parser.

 

You can add query="debug" to find out more information. Your query should have worked the way you wrote it, though. Here's Adobe's guide to Solr queries in CF:

 

https://helpx.adobe.com/coldfusion/developing-applications/accessing-and-using-data/solr-search-supp...

 

Dave Watts, Eidolon LLC
Participant, Feb 17, 2024

Much appreciate the prompt reply, Dave. Thanks for the confirmation on the terminology.

Well, for some reason which I do not recall, we were not able to use double quotes in Verity and I had to comment out validation edits to that effect. Perhaps the form data is being modified after it is entered and before it is placed into the keywords field. Guess I'll check that out next. Don't know what else it might be.

Alex Craig, General Manager
"Avid Saltwater Fly Fisherman"

Community Expert, Feb 17, 2024

In addition to doing such debugging of exactly what's in your form field values, another option for debugging solr specifically is to use the solr admin ui that is available, where you can enter search criteria in a form, as well as see the results and more.

 

But this UI is NOT made available via the CF Admin. Instead, just try localhost:8993, or use whatever port is indicated on the Solr server page of the CF Admin.

 

Note also that the tool shows you the REST url that would provide that search result, and you can use that in a cfhttp call from cf--which means you can do  things that solr allows, even though cfsearch itself may not. 
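As a rough illustration of the REST URL idea, here is a minimal sketch that builds a select URL for a phrase or proximity query. It is written in Python so it runs standalone; the resulting URL is the kind of thing that could be handed to cfhttp. The host, port 8993, the "Resumes" collection name, the /solr/<collection>/select path, and the helper name build_solr_select_url are assumptions for illustration, so use whatever URL the Solr Admin UI actually shows for your server.

```python
from urllib.parse import urlencode

def build_solr_select_url(base_url: str, collection: str, query: str, rows: int = 10) -> str:
    """Return a URL for Solr's select handler with the query string URL-encoded.

    build_solr_select_url is a hypothetical helper, not part of Solr or ColdFusion.
    """
    params = {"q": query, "rows": rows, "wt": "json"}
    # urlencode percent-encodes the double quotes and spaces in the query.
    return f"{base_url.rstrip('/')}/solr/{collection}/select?{urlencode(params)}"

# Assumed host, port, and collection name; adjust to match your Solr server.
url = build_solr_select_url("http://localhost:8993", "Resumes", '"medical device"~4')
print(url)
```

Encoding the query this way keeps the double quotes intact on the way to Solr, which makes it easier to tell whether odd hit counts come from the query itself or from something altering it earlier in the page flow.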


/Charlie (troubleshooter, carehart.org)

Participant, Feb 17, 2024

Very cool stuff Charlie!

Tried "medical device"~4 using Solr Admin and it finds 1412 hits which sounds about right.  About 400 less than the production machine, but that is due to a difference that I'll get into in a new thread later tonight.

Must be a form related issue that I'll need to solve.  There does not appear to be any manipulation of the keywords that I can find thus far.

Thanks man!

Alex Craig, General Manager
"Avid Saltwater Fly Fisherman"

Participant, Feb 18, 2024

Hey Charlie,

To begin with, the Solr Admin tool, which enabled me to query our "Resumes" collection, was very helpful.

It produced accurate results in every test instance. Thus, it appears our collection is "good".

 

It took a while, but I believe I have uncovered the source of the problem.

Once again, it would appear that using the "hidden" Input Type is creating a problem.

 

I used an output tag, as indicated in the code below, in the page that starts the search and also in the page that does the Solr search AND displays the results at the conclusion of the search (see the code between the asterisks below).

 

The keyword criteria at the start of the search was precisely what I input on the web page. However, if I used the double quote symbol, which is required by Solr, the output for the keyword criteria was blank and the search returned whacked results.

It appears that using the "hidden" Input Type automatically inserts the double quotes around the entire string, because entering just a single keyword without double quotes would produce an accurate result and the output tag would accurately display the keyword used at the conclusion of the search.

*****************************************

<INPUT TYPE="hidden" NAME="keywords" VALUE="#Form.keywords#">
<CENTER> <output> Key Words: #Form.keywords# </output> </CENTER>

*****************************************
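One possible explanation for the blank output, offered as an assumption rather than something confirmed in this thread: when the keyword string itself starts with a double quote, that quote closes the VALUE="..." attribute early, so the browser submits an empty value. The sketch below uses Python's standard HTML parser to mimic that (it is not ColdFusion code), then shows the same markup with the value HTML-escaped. In CFML, the built-in EncodeForHTMLAttribute() function is the likely counterpart to the escape call here. ValueGrabber and parsed_value are hypothetical names for illustration.

```python
from html import escape
from html.parser import HTMLParser

class ValueGrabber(HTMLParser):
    """Records the value attribute of the first <input> tag it sees."""

    def __init__(self):
        super().__init__()
        self.value = None

    def handle_starttag(self, tag, attrs):
        if tag == "input" and self.value is None:
            for name, val in attrs:
                if name == "value":
                    self.value = val
                    break

def parsed_value(markup: str):
    """Return the value attribute as an HTML parser would read it."""
    grabber = ValueGrabber()
    grabber.feed(markup)
    grabber.close()
    return grabber.value

keywords = '"medical device"~4'

# Unescaped: the leading quote in the keywords ends the attribute immediately.
raw = f'<input type="hidden" name="keywords" value="{keywords}">'

# Escaped: double quotes become &quot; and survive the round trip.
safe = f'<input type="hidden" name="keywords" value="{escape(keywords, quote=True)}">'

print(repr(parsed_value(raw)))
print(repr(parsed_value(safe)))
```

If this is what is happening, the first line printed should be an empty string and the second the original keyword string, which would match the symptom of a blank keyword field whenever the search starts with a double quote.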

The other 24 "hidden" form values appear to work as they should.  As long as I do not use a keyword entry, searches produce accurate results as they only involve db queries.

 

So ....  can someone please suggest another method of passing keyword strings through a variable in a form in the background to use in a Solr collection search, please?   Thanks!

Alex Craig, General Manager
"Avid Saltwater Fly Fisherman"

Participant, Feb 18, 2024

Well, my theory about the double quotes being automatically applied in the form code appears to be correct.

I solved it by stripping the double quotes out of the field and then repopulating the field with Solr-style keyword strings that include the required double quotes on each side of each keyword element.

Thus, for example, instead of ""medical device" AND "orthopedic"", the system was processing "medical device" AND "orthopedic" and it started producing results that mirrored those on the production machine. There is one remaining wrinkle. The results were correct as long as I chose a search that limited it to database-defined Job Types. When I had it process ALL candidates in the database, the results were still far greater than they should have been. But that problem is another matter which I'll need to tackle next.

I suppose it is all about "eating the elephant one bite at a time"!  😉

*************************************

<INPUT TYPE="hidden" NAME="keywords" VALUE="#Replace( Form.keywords, '"', '', 'ALL' )#">
<INPUT TYPE="hidden" NAME="keywords" VALUE="#Form.keywords#">
<CENTER> <output> Key Words: #Form.keywords# </output> </CENTER>
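
The strip-and-requote step described above could look roughly like the following. This is a minimal sketch in Python rather than the actual form code, and it assumes keyword elements are separated only by the uppercase operators AND, OR, and NOT. The function name requote_keywords is hypothetical, and modifiers such as ~4 are not handled.

```python
import re

# Assumption: elements are separated by uppercase boolean operators only.
_OPERATORS = re.compile(r"\s+(AND|OR|NOT)\s+")

def requote_keywords(raw: str) -> str:
    """Strip every double quote, then wrap each keyword element in quotes."""
    stripped = raw.replace('"', "")
    # The capturing group makes split() return elements and operators alternately.
    parts = _OPERATORS.split(stripped)
    rebuilt = []
    for index, part in enumerate(parts):
        part = part.strip()
        if not part:
            continue
        if index % 2 == 1:
            rebuilt.append(part)          # operator, kept unquoted
        else:
            rebuilt.append(f'"{part}"')   # keyword element, quoted once
    return " ".join(rebuilt)

print(requote_keywords('""medical device" AND "orthopedic""'))
```

Stripping first and quoting afterwards means the result does not depend on how many stray quotes the incoming string picked up along the way.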

Alex Craig, General Manager
"Avid Saltwater Fly Fisherman"
