Opened 22 years ago

Closed 20 years ago

Last modified 20 years ago

#9 closed defect (released)

$highlight{} doesn't handle accented characters correctly

Reported by: Arjen
Owned by: Olly Betts
Priority: high
Milestone:
Component: Omega
Version: 0.7.4
Severity: minor
Keywords:
Cc:
Blocked By:
Blocking:
Operating System: All

Description

I was just notified that words containing é (and probably other accented characters as well) are split into multiple words. As far as I know, that shouldn't happen. A workaround is of course to do a phrase search, but as far as I know Xapian should either replace the é with an e or treat it as a normal character.

Change History (8)

comment:1 by Olly Betts, 22 years ago

Owner: changed from James Aylett to Olly Betts
Severity: minor → normal

Bug is in omindex.cc / scriptindex.cc. The similar code in the two should be pulled out into a shared file and fixed to handle accented characters, probably by transliterating most of them to unaccented characters to normalise accent representation.
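A minimal sketch of the kind of shared helper this comment describes, not the code that was actually committed. It assumes the text is ISO-8859-1 (the ticket does not state the encoding), and the names `fold_latin1` and `split_terms` are hypothetical. Accented letters count as word characters and are folded to their unaccented lowercase form, so a word is no longer split at the accent.

```cpp
#include <cctype>
#include <string>
#include <vector>

// Hypothetical helper, assuming ISO-8859-1 input: return the unaccented
// lowercase form of a word byte, or 0 if the byte separates words.
// Letters without an obvious single-letter form (e.g. 0xDF, 0xE6) are
// treated as separators here purely to keep the sketch short.
inline char fold_latin1(unsigned char ch) {
    if (ch < 0x80) {
        return std::isalnum(ch) ? static_cast<char>(std::tolower(ch)) : 0;
    }
    // Map upper-case accented letters onto their lower-case code points
    // (0xD7 is the multiplication sign, not a letter).
    if (ch >= 0xC0 && ch <= 0xDE && ch != 0xD7) ch += 0x20;
    if (ch >= 0xE0 && ch <= 0xE5) return 'a';
    if (ch == 0xE7) return 'c';
    if (ch >= 0xE8 && ch <= 0xEB) return 'e';
    if (ch >= 0xEC && ch <= 0xEF) return 'i';
    if (ch == 0xF1) return 'n';
    if ((ch >= 0xF2 && ch <= 0xF6) || ch == 0xF8) return 'o';
    if (ch >= 0xF9 && ch <= 0xFC) return 'u';
    if (ch == 0xFD || ch == 0xFF) return 'y';
    return 0;
}

// Split text into normalised terms using the folding rule above.
std::vector<std::string> split_terms(const std::string& text) {
    std::vector<std::string> terms;
    std::string current;
    for (unsigned char ch : text) {
        char folded = fold_latin1(ch);
        if (folded) {
            current += folded;
        } else if (!current.empty()) {
            terms.push_back(current);
            current.clear();
        }
    }
    if (!current.empty()) terms.push_back(current);
    return terms;
}
```

With this rule, `split_terms("B\xe9zier curve")` yields the two terms `bezier` and `curve` rather than `b`, `zier` and `curve`.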

comment:2 by Olly Betts, 22 years ago

Status: new → assigned

comment:3 by James Aylett, 22 years ago

Presumably this will also affect query construction? Or does that already transliterate characters? Whatever, $highlight{} will also be affected, as it will need to do the transliteration while it's looking for words to highlight. I have a feeling I didn't pull out code when writing $highlight{} - it's probably duplicating code in query.cc or similar, and so will need to be fixed twice.
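To illustrate the point about doing the transliteration while looking for words to highlight, here is a rough sketch under stated assumptions. It is not Omega's actual $highlight{} code: the `highlight` function, the `FoldFn` type and the plain `<b>` markup are all hypothetical. The folding rule is passed in as a parameter, so the highlighter finds word boundaries the same way as whatever code produced the terms, and matching is done on the folded form while the original bytes are copied to the output.

```cpp
#include <functional>
#include <set>
#include <string>

// Hypothetical signature: returns the folded character for a word byte,
// or 0 for a byte that separates words.
using FoldFn = std::function<char(unsigned char)>;

// Wrap each word of `text` whose folded form is in `terms` in <b>...</b>.
// Only the comparison uses the folded form; the output keeps the original
// accented bytes. HTML escaping is left out to keep the sketch short.
std::string highlight(const std::string& text,
                      const std::set<std::string>& terms,
                      const FoldFn& fold) {
    std::string out;
    std::string::size_type i = 0;
    while (i < text.size()) {
        if (!fold(static_cast<unsigned char>(text[i]))) {
            out += text[i++];  // separator: copy through unchanged
            continue;
        }
        // Scan to the end of the word, building its folded form.
        std::string::size_type start = i;
        std::string folded;
        while (i < text.size()) {
            char f = fold(static_cast<unsigned char>(text[i]));
            if (!f) break;
            folded += f;
            ++i;
        }
        std::string word = text.substr(start, i - start);
        if (terms.count(folded)) {
            out += "<b>" + word + "</b>";
        } else {
            out += word;
        }
    }
    return out;
}
```

Passing the same folding function to both the term splitter and the highlighter is one way to avoid the "fixed twice" problem: there is a single definition of what counts as a word character.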

comment:4 by Olly Betts, 22 years ago

Now fixed up everywhere apart from $highlight{} in query.cc which should share code with indextext.cc.

comment:5 by Olly Betts, 21 years ago

Summary: A word like bézier is split into b and zier → $highlight{} doesn't handle accented characters correctly

comment:6 by Olly Betts, 21 years ago

op_sys: Linux → All
rep_platform: PC → All
Severity: normal → minor
Version: 0.6.4 → 0.7.4

comment:7 by Olly Betts, 20 years ago

Resolution: fixed
Status: assigned → closed

Fixed in CVS

comment:8 by Olly Betts, 20 years ago

Operating System: All
Resolution: fixed → released

Fixed in 0.8.2
