Full text search in PostgreSQL with search term including unrelated hyphen and ending

by organizer   Last Updated July 31, 2018 16:06 PM

I've a problem in my DB (PGSQL 9.6) that I've not been able to resolve and I don't really even understand what options I might have (if any). Simplified I have a DB of multi-language news articles where I've created full text search index (tsvector) on headlines and text. I need to search through the DB to find articles with a certain game name.

Example of an index that would match what I do:

SELECT to_tsvector('simple', 'Yakumo Shionome featured in new Code Vein-trailer');

This results in the following full text search index:

'code':6 'features':3 'in':4 'new':5 'shionome':2 'trailer':9 'vein':8 'vein-trailer':7 'yakumo':1

The challenge come when I want to search for "Code Vein" as game title. In this case I would normally use "phraseto_tsquery". A raw test query could be similar to this:

SELECT to_tsvector('simple','Yakumo Shionome featured in new Code Vein-trailer') 
  @@ phraseto_tsquery('simple','Code Vein');

This would result in FALSE in this case. I would have to search for "Code Vein-Trailer" to get a TRUE match. So the "hyphen" followed by "some word" in essence is causing me trouble and I don't really get what options I might have as far as creating a way around this (there can be many different ending words, many I also might not know ahead of time).

Normally the fact that my Boolean searches are word boundary based is good as it avoid potential false positives, but this seems to be a challenging drawback. I use "simple" due to multiple languages and that game names and elements I sometime search for included stop words - which I felt ended up being problematic or more prune to issues if not taking this approach. I also do require exact matches and not ranked results.

Any thoughts and help welcome.



Related Questions


Get partial match from GIN indexed TSVECTOR column

Updated October 23, 2017 17:06 PM

How is LIKE implemented?

Updated June 11, 2015 23:02 PM

How to match similar records using multiple columns

Updated April 04, 2017 02:06 AM