Running RH 2020.1, generating Responsive HTML5 with Azure_Blue.
In my output settings, Enable substring search is disabled (i.e., NOT selected).
Please advise.
I have flagged this thread with Adobe so hopefully someone will come along with some answers. These are not questions that forum users can answer.
They will be looking at another thread so you might want to add a link there to this thread.
Thank you!
Circling back on this. I've since upgraded to 2020.3 and it's still an issue - even with "Auto correct search query" and "Enable substring search" both disabled. Here's an example - note how "tc" is bold in the search preview/context of the result (btw, this topic doesn't have an instance of "tcs" in it - anywhere), even though "tcs" was the search term.
I have flagged this thread with Adobe and they will be responding.
________________________________________________________
See www.grainge.org for free Authoring and RoboHelp Information
Hi,
This is happening because of a preprocessing step (stemming) that is applied to topics. In this step we convert every word to its root word before indexing it. For example, Development, Developing and Develop all have the same root word, Develop, so instead of indexing the three words separately, we index only one. When you search using any of these words, you get results for all three.
The same thing is happening with tc and tcs: both are converted to tc, so when you search using either word, you always get results for both. Currently there is no workaround for this issue.
In future we will evaluate whether certain words can be excluded from this preprocessing step, which would help in scenarios like this.
For the other issue you mentioned, tcs being broken into tc and s, please try this with Update 3 and let us know if you still see it.
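To make the stemming explanation above concrete, here is a minimal Python sketch of how a stemmed index can make tc and tcs collide. It is an illustration only, not RoboHelp's actual code: `toy_stem`, the suffix list, and the topic names are all hypothetical, and a real stemmer uses far more rules.

```python
from collections import defaultdict

# Toy rule set, checked in order. An assumption for illustration only.
SUFFIXES = ("ment", "ing", "s")

def toy_stem(word):
    """Reduce a word to a crude root by stripping one known suffix."""
    word = word.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 2:
            return word[: -len(suffix)]
    return word

def build_index(topics):
    """Map each stemmed word to the set of topics that contain it."""
    index = defaultdict(set)
    for name, text in topics.items():
        for word in text.split():
            index[toy_stem(word)].add(name)
    return index

def search(index, term):
    """Stem the query the same way the topics were stemmed, then look it up."""
    return sorted(index.get(toy_stem(term), set()))

# Hypothetical topics: one mentions only "tc", the other only "tcs".
topics = {
    "topic_a": "configure the tc module",
    "topic_b": "tcs settings overview",
}
index = build_index(topics)
print(search(index, "tcs"))  # both topics, because "tcs" and "tc" share a stem
```

Because the index stores only the stem, the original spelling is lost at index time, which is why no output setting applied at search time can separate the two terms in this sketch.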
Thank you for the explanation. I understand. It is a little confusing considering that output presets include a substring search option.
I would expect/hope - as my users have reported that they do - that when "Enable substring search" is disabled, a search would only return identical matches. So for example, if I search on "developing", "development" and "develop" are ignored.
"tc" and "tcs" are two very different things in our technical documentation. Returning results for both is creating a high level of frustration for our end users (not to mention, "tcs2" is similar but different from "tcs" for our knowledge base). More specifically, in this particular instance, the issue is that users are searching on "tcs" and getting back results with "tc" - and what makes it worse, is that the topics with "tc" are weighted/ranked higher than the "tcs" topics.
I tried update 3 - no difference in result.
Maybe this could be handled in the code so that when an output end user encloses a search term in quotes, the stemming process for that term is disabled. Then only identical matches for that term would be returned. Ideally, this could be supported for various search permutations: e.g., multi-term searches (including a mix of stemmed and non-stemmed terms), when "Include all search terms" is selected, etc.
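As a rough sketch of the quoted-term idea, the Python below keeps two hypothetical indexes, one exact and one stemmed, and bypasses stemming for any term wrapped in quotes. The names (`toy_stem`, `build_indexes`, `require_all`) are assumptions for illustration and say nothing about how RoboHelp is implemented. For simplicity a quoted term is treated as a single word, not a phrase.

```python
import re
from collections import defaultdict

def toy_stem(word):
    # Hypothetical stand-in for a real stemmer: strips a trailing "s" only.
    return word[:-1] if word.endswith("s") and len(word) > 2 else word

def build_indexes(topics):
    """Build an exact-word index and a stemmed index side by side."""
    exact, stemmed = defaultdict(set), defaultdict(set)
    for name, text in topics.items():
        for word in text.lower().split():
            exact[word].add(name)
            stemmed[toy_stem(word)].add(name)
    return exact, stemmed

def search(query, exact, stemmed, require_all=False):
    """Quoted terms use the exact index; bare terms use the stemmed index."""
    hits = []
    # Each match is either a "quoted" term (group 1) or a bare term (group 2).
    for quoted, bare in re.findall(r'"([^"]+)"|(\S+)', query.lower()):
        if quoted:
            hits.append(exact.get(quoted, set()))
        else:
            hits.append(stemmed.get(toy_stem(bare), set()))
    if not hits:
        return []
    # require_all mimics an "Include all search terms" style option.
    combine = set.intersection if require_all else set.union
    return sorted(combine(*hits))

# Hypothetical topics for illustration.
topics = {
    "topic_a": "configure the tc module",
    "topic_b": "tcs settings overview",
    "topic_c": "tcs2 migration notes",
}
exact, stemmed = build_indexes(topics)
print(search('tcs', exact, stemmed))      # stemmed lookup: tc and tcs topics
print(search('"tcs"', exact, stemmed))    # exact lookup: only the tcs topic
print(search('"tcs" settings', exact, stemmed, require_all=True))
```

The trade-off in this design is index size: keeping the exact words alongside the stems roughly doubles the term list, but it lets stemmed and non-stemmed terms be mixed in one query.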
I'd echo @RoboFan - I get that stemming makes sense, but when you've explicitly said to ignore any string match that's a sub-string of what you're searching for, that just looks like a logic bug. In your example of Development, Developing and Develop, I would expect it to index just Develop, but when that flag is off, the "matching" logic should check what the search term is against the match that contains Develop to ensure that it doesn't show Develop when asked for Development. IIRC, this used to work in prior versions of RH.
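One way to picture the check described above: index by stem, but keep the original word in each posting so the match can be verified against the search term when the flag is off. This is a hypothetical Python sketch, not RoboHelp's code; `toy_stem`, `exact_only`, and the topic names are invented for illustration.

```python
from collections import defaultdict

def toy_stem(word):
    # Hypothetical stemmer: strips "ment" or "ing" from sufficiently long words.
    for suffix in ("ment", "ing"):
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)]
    return word

def build_index(topics):
    """Index by stem, but remember the original word in each posting."""
    index = defaultdict(set)
    for name, text in topics.items():
        for word in text.lower().split():
            index[toy_stem(word)].add((name, word))
    return index

def search(index, term, exact_only):
    """Look up by stem, then optionally keep only identical words."""
    term = term.lower()
    postings = index.get(toy_stem(term), set())
    if exact_only:
        # Post-filter: compare the stored original word with the search term.
        postings = {(name, word) for name, word in postings if word == term}
    return sorted({name for name, _ in postings})

# Hypothetical topics for illustration.
topics = {
    "t1": "develop a plugin",
    "t2": "development guide",
    "t3": "developing locally",
}
index = build_index(topics)
print(search(index, "development", exact_only=False))  # all three topics
print(search(index, "development", exact_only=True))   # only the exact match
```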
5 January 2025. An update for anyone finding this thread.
In 2022.5 I created a new project with just two topics. I entered pos in one and positive in the other.
I found that with Enable Substring Search selected, a search for pos (no quotes) found both topics as it should. With the option deselected, the search only found one topic.
Somewhere along the line, the issue seems to have been fixed.
________________________________________________________
My site www.grainge.org includes many free Authoring and RoboHelp resources that may be of help.
@vikchand I am glad that a fix is available in this new update, which resolves one of my concerns. However, I am encountering additional issues, such as search randomly returning no results. Clients have escalated this because our online documentation is based on specific codes and descriptions. I am using RH 2022.47 and HTML 5 output. I have isolated this as a content-based issue... For example, 28088 returns results but 14040 does not, even though both are in the same topic?! I had to disable "Enable substring search", which was not functioning as expected and was also causing the search to freeze.
Thank you
Christele