Ticket #225 (assigned defect)

Opened 12 months ago

Last modified 4 days ago

Spelling algorithm should consider frequency and not just edit-distance

Reported by: philipn
Owned by: olly
Priority: normal
Milestone: 1.1.1
Component: Library API
Version: SVN trunk
Severity: normal
Keywords:
Cc:
Blocked By:
Operating System: All
Blocking:

Description (last modified by olly)

As described here: http://thread.gmane.org/gmane.comp.search.xapian.general/5740/focus=5743

If the spelling correction algorithm considered both term frequency and edit distance (using some reasonable heuristic), we would see dramatically better results. The current algorithm only corrects words that never appear in the spelling index, so a misspelling that has itself been added to the index is never corrected.
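
To make the idea concrete, here is a minimal standalone sketch of one possible heuristic. It is not the attached patch and does not use the Xapian API: the names `edit_distance` and `suggest`, the scoring rule (frequency divided by edit distance) and the `min_ratio` threshold are all assumptions chosen for illustration. A word that is already in the index can still be corrected if a nearby word is far more frequent.

```python
def edit_distance(a, b):
    """Plain Levenshtein distance via dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def suggest(word, freqs, max_distance=2, min_ratio=10):
    """Return a suggested correction for `word`, or None.

    `freqs` maps each indexed word to its frequency (hypothetical
    stand-in for the spelling index). `min_ratio` is an assumed
    threshold: a candidate must outscore the word's own frequency
    by this factor before it replaces a word that is indexed.
    """
    own = freqs.get(word, 0)
    best, best_score = None, 0.0
    for cand, freq in freqs.items():
        if cand == word:
            continue
        d = edit_distance(word, cand)
        if d > max_distance:
            continue
        # d >= 1 here because cand != word; closer and more
        # frequent candidates score higher.
        score = freq / d
        if score > best_score:
            best, best_score = cand, score
    if best is not None and best_score > own * min_ratio:
        return best
    return None


# Hypothetical index in which the misspelling "xapain" was indexed once.
freqs = {"xapian": 500, "xapain": 1, "search": 300}
print(suggest("xapain", freqs))
print(suggest("xapian", freqs))
```

With these made-up frequencies, the rare indexed misspelling is corrected to its frequent neighbour, while the frequent word itself is left alone. The threshold and the scoring rule would need tuning against real data.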

Attachments

spelling_frequency.diff (2.6 kB) - added by philipn 12 months ago.
An example implementation

Change History

Changed 12 months ago by philipn

An example implementation

Changed 12 months ago by olly

  • status changed from new to assigned
  • rep_platform changed from PC to All
  • component changed from Backend-Flint to Library API
  • op_sys changed from Linux to All

Changed 12 months ago by trac

  • platform set to All

Changed 9 months ago by olly

  • owner changed from newbugs to olly
  • status changed from assigned to new
  • description modified (diff)

Changed 4 days ago by olly

  • status changed from new to assigned
  • description modified (diff)
  • milestone set to 1.1.1

It would be good to improve this during the 1.1.x series.

[Changed link to list discussion to point to gmane for easier browsing of the thread.]
