Ticket #9 (closed defect: released)

Opened 6 years ago

Last modified 4 years ago

$highlight{} doesn't handle accented characters correctly

Reported by: arjen
Owned by: olly
Priority: high
Milestone:
Component: Omega
Version: 0.7.4
Severity: minor
Keywords:
Cc:
Blocked By:
Operating System: All
Blocking:

Description

I was just notified that words containing é (and probably other accented characters as well) are split into multiple words. As far as I know, that shouldn't happen. A workaround is of course to do a phrase search, but afaik Xapian should either replace the é with an e or treat it as a normal character.

Change History

Changed 6 years ago by olly

  • owner changed from james to olly
  • severity changed from minor to normal

Bug is in omindex.cc / scriptindex.cc. The similar code in those two files should be pulled out into a shared file and fixed to handle accented characters. Probably transliterate most to unaccented characters to normalise accent representation...
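
For illustration only, here is a minimal sketch of the kind of shared word splitter described above. It is not the code from omindex.cc or scriptindex.cc. The names `fold_latin1` and `split_terms` are hypothetical, the input is assumed to be UTF-8, and only the Latin-1 Supplement letters (U+00C0 to U+00FF) are folded; any other non-ASCII byte is simply treated as a word break.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: fold a code point in U+00C0..U+00FF to a lowercase
// unaccented ASCII letter, or return 0 if this sketch has no mapping for it.
static char fold_latin1(unsigned cp) {
    // One entry per code point; '.' marks characters left unmapped here
    // (for example the multiplication sign and the ae ligature).
    static const char table[] =
        "AAAAAA.CEEEEIIII"
        ".NOOOOO.OUUUUY.."
        "aaaaaa.ceeeeiiii"
        ".nooooo.ouuuuy.y";
    static_assert(sizeof(table) == 64 + 1, "one entry per code point plus NUL");
    assert(cp >= 0xC0 && cp <= 0xFF);
    char c = table[cp - 0xC0];
    if (c == '.') return 0;
    return static_cast<char>(std::tolower(static_cast<unsigned char>(c)));
}

// Split UTF-8 text into lowercase, accent-folded terms. An accented letter
// continues the current word instead of ending it.
std::vector<std::string> split_terms(const std::string& text) {
    std::vector<std::string> terms;
    std::string current;
    for (std::size_t i = 0; i < text.size(); ++i) {
        unsigned char ch = static_cast<unsigned char>(text[i]);
        char letter = 0;
        if (ch < 0x80) {
            if (std::isalnum(ch)) letter = static_cast<char>(std::tolower(ch));
        } else if (ch == 0xC3 && i + 1 < text.size()) {
            // In UTF-8, lead byte 0xC3 plus one continuation byte encodes
            // U+00C0..U+00FF.
            unsigned char next = static_cast<unsigned char>(text[i + 1]);
            if ((next & 0xC0) == 0x80) {
                letter = fold_latin1(0xC0u + (next & 0x3Fu));
                ++i;  // consume the continuation byte
            }
        }
        if (letter != 0) {
            current += letter;
        } else if (!current.empty()) {
            terms.push_back(current);  // a non-word character ends the term
            current.clear();
        }
    }
    if (!current.empty()) terms.push_back(current);
    return terms;
}
```

With this sketch, `split_terms("bézier curve")` (in a UTF-8 source file) yields the two terms `bezier` and `curve` rather than `b`, `zier` and `curve`. Folding to unaccented letters is only one of the two options raised in the description; keeping é as an ordinary word character would work with the same loop.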

Changed 6 years ago by olly

  • status changed from new to assigned

Changed 6 years ago by james

Presumably this will also affect query construction? Or does that already transliterate characters? Whatever, $highlight{} will also be affected, as it will need to do the transliteration while it's looking for words to highlight. I have a feeling I didn't pull out code when writing $highlight{} - it's probably duplicating code in query.cc or similar, and so will need to be fixed twice.
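
To make the point about $highlight{} concrete, the following is a rough sketch, not the query.cc implementation. It walks the text word by word, folds each word the same way the indexer would, and wraps matching words while keeping their original accented spelling. The names `fold_at` and `highlight`, the `<b>` markup, and the deliberately tiny folding rule (only accented a and e, UTF-8 input) are all assumptions made for the example.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <set>
#include <string>

// Hypothetical folding rule: read one character at text[pos], store the
// number of bytes it occupies in len, and return its folded form, or 0 if
// it is not a word character. Only a few accented letters are handled.
static char fold_at(const std::string& text, std::size_t pos, std::size_t& len) {
    assert(pos < text.size());
    unsigned char ch = static_cast<unsigned char>(text[pos]);
    len = 1;
    if (ch < 0x80)
        return std::isalnum(ch) ? static_cast<char>(std::tolower(ch)) : 0;
    if (ch == 0xC3 && pos + 1 < text.size()) {
        switch (static_cast<unsigned char>(text[pos + 1])) {
            case 0xA0: case 0xA1: case 0xA2:             // à á â
                len = 2; return 'a';
            case 0xA8: case 0xA9: case 0xAA: case 0xAB:  // è é ê ë
                len = 2; return 'e';
        }
    }
    return 0;
}

// Wrap every word whose folded form is in `terms` in <b>...</b>, leaving the
// original bytes of the text (including accents) untouched.
std::string highlight(const std::string& text, const std::set<std::string>& terms) {
    std::string out;
    std::size_t pos = 0;
    while (pos < text.size()) {
        std::size_t len = 1;
        if (fold_at(text, pos, len) == 0) {
            out.append(text, pos, len);  // copy separators through unchanged
            pos += len;
            continue;
        }
        // Scan one whole word, building its folded form as we go.
        std::size_t start = pos;
        std::string folded;
        while (pos < text.size()) {
            char c = fold_at(text, pos, len);
            if (c == 0) break;
            folded += c;
            pos += len;
        }
        std::string word = text.substr(start, pos - start);
        if (terms.count(folded)) out += "<b>" + word + "</b>";
        else out += word;
    }
    return out;
}
```

The design point is that the highlighter and the indexer must agree on both word boundaries and folding. If the highlighter used its own copy of the scanning loop without the folding step, a query term `bezier` would never match the text `bézier`, which is why sharing one scanner between the two is attractive.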

Changed 5 years ago by olly

Now fixed up everywhere apart from $highlight{} in query.cc which should share code with indextext.cc.

Changed 5 years ago by olly

  • summary changed from "A word like bézier is split into b and zier" to "$highlight{} doesn't handle accented characters correctly"

Changed 5 years ago by olly

  • rep_platform changed from PC to All
  • version changed from 0.6.4 to 0.7.4
  • op_sys changed from Linux to All
  • severity changed from normal to minor

Changed 4 years ago by olly

  • status changed from assigned to closed
  • resolution set to fixed

Fixed in CVS

Changed 4 years ago by olly

  • resolution changed from fixed to released

Fixed in 0.8.2

Changed 4 years ago by trac

  • platform set to All