wiki:GSoC2011/ChineseSegmentationAnalysis/ProjectPlan

The aim of this project is to use a Chinese segmentation algorithm to break Chinese text into words, in order to improve Xapian's performance on Chinese text.

23 – 29 May

Finish the code for the dictionary-based segmentation algorithm
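
The plan does not say which dictionary-based method is intended. One common choice is forward maximum matching, which takes the longest dictionary word at each position. The following is a minimal Python sketch of that idea, not the project's code: the `segment` function, the `max_word_len` parameter and the toy dictionary are all assumptions made for illustration.

```python
def segment(text, dictionary, max_word_len=4):
    """Forward maximum matching: at each position, take the longest
    dictionary word; fall back to a single character if none matches."""
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first and shrink until a word is found.
        for length in range(min(max_word_len, len(text) - i), 0, -1):
            candidate = text[i:i + length]
            # A single character is always accepted as a fallback token.
            if length == 1 or candidate in dictionary:
                tokens.append(candidate)
                i += length
                break
    return tokens


# Hypothetical toy dictionary, for illustration only.
toy_dictionary = {"研究", "研究生", "生命", "起源"}
print(segment("研究生命起源", toy_dictionary))
# → ['研究生', '命', '起源']
```

The greedy match picks 研究生 and leaves 命 stranded, although 研究 / 生命 / 起源 is the intended reading. Ambiguities of this kind are one reason a plain dictionary pass usually needs further handling on top.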

30 May – 5 June

Prepare to write the code for handling names

6 – 12 June

Test the ability to recognize numbers and locations

13 – 19 June

Test the ability to recognize numbers and locations
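
As a rough illustration of what a number-recognition check might look like, here is a small Python sketch. It assumes, purely for the example, that a number is a run of ASCII digits, full-width digits or common Chinese numeral characters; `NUMBER_RE` and `find_numbers` are hypothetical names and not part of the project.

```python
import re

# Assumption: a "number" is a run of ASCII digits, full-width digits,
# or common Chinese numeral characters.
NUMBER_RE = re.compile(r"[0-9０-９零〇一二三四五六七八九十百千万亿两]+")


def find_numbers(text):
    """Return every maximal run of numeral characters found in text."""
    return [m.group(0) for m in NUMBER_RE.finditer(text)]


print(find_numbers("我有三百五十元和20本书"))
# → ['三百五十', '20']
```

A pattern this simple also matches numeral characters inside ordinary words (for example 一 in 一起), so a real test set would need cases that check for such false positives.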

20 – 26 June

Finish the code for handling names

27 June – 3 July

Test the ability to recognize names

4 – 10 July

Start coding the recognition of high-frequency words

11 – 17 July

Finish coding the recognition of high-frequency out-of-vocabulary words

18 – 24 July

Test the efficiency of the analysis

25 – 31 July

Integrate the analysis into Xapian

1 – 7 August

Test Xapian as a whole

8 – 14 August

Further refine tests and finish documentation for the whole project

Last modified on 23/05/11 14:56:08