GSoC2014/Learning to Rank Jiarong Wei/Journal – Xapian


Community Bonding Week 2: April 28-May 4

April 29 - April 30

Learned to use Makefiles and the GNU autotools (autoscan, autoheader, autoconf, automake) to configure the build environment.

Community Bonding Week 3: May 5-May 11

Community Bonding Week 4: May 12-May 19

Coding Week 1: May 20-May 25

May 21

Reviewed the code. Discussed API design and collaboration details with Hanxiao Sun.

May 22

Summarized the content of the discussion with Hanxiao Sun and sent an e-mail to the community to ask for help.

May 23

Reviewed the code. Fixed bugs in questletor.cc in the Letor module.

May 24

Read about design patterns and tried to find useful ones, especially for calculating features for the documents in a query result. Wrote the design document.

May 25

Wrote the design document. Discussed refactoring with Richard Boulton. Refactored the code.

Coding Week 2: May 26-June 1

May 26

Come up with a draft for modifying MSet::Internal. Discuss the modification of MSet::Internal with Olly.

May 27

Add a new structure letor_item into MSet. Revise the design document.

May 28

Discuss APIs with Hanxiao Sun. Revise the letor_item design. Add details of scoring and training. Add Ranker, Letor and Letor::Internal.

Coding Week 3: June 2-June 8

June 8

Implement new feature class and feature manager class (partially completed).

Coding Week 4: June 9-June 15

June 9

Revise feature class (completed) and feature manager class (partially completed).

June 11

Modify MSet to support letor module. Revise feature class and feature manager class.

June 13

Modify Letor class and Letor::Internal class.

Coding Week 5: June 16-June 22

June 19

Update documents and fix errors in documents.

June 20

Discuss the new design with the community on the IRC channel and confirm what to do next.

The process of updating MSet and training model in pseudo-code.

The design of attaching letor information into mset.

Coding Week 6: June 23-June 29 (Midterm deadline June 27)

June 23

Refactor SVMRanker to fit the new APIs, which don't use the RankList class.

June 24

Try to merge Ranker and RankList into the new design. Look for some reference materials.

Coding Week 7: June 30-July 6

June 30

Work with my mentor to plan a new timeline.

July 1 - July 6

Off because my computer crashed and I couldn't find a replacement.

Coding Week 8: July 7-July 13

July 7

I just got my laptop back :D

Add new FeatureVector. The new FeatureVector keeps all the functions of the previous one, plus new functions that output its feature information easily. The output conforms to the standard letor feature representation. See on GitHub
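To illustrate what such an output function might produce, here is a minimal sketch. It assumes the line layout used by the public LETOR datasets (label, `qid:`, then `index:value` pairs with 1-based feature indices, then a trailing `#docid` comment). `FeatureVectorSketch`, its fields, and `to_letor_line` are hypothetical names for illustration only and are not the real class from the project.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in for FeatureVector; the field names are assumptions.
struct FeatureVectorSketch {
    double label;                // relevance label from the qrel file
    std::string qid;             // query id
    std::string did;             // document id
    std::vector<double> values;  // feature number i+1 is stored at values[i]
};

// Format one feature vector as a single LETOR-style line:
//   <label> qid:<qid> 1:<v1> 2:<v2> ... #docid = <did>
std::string to_letor_line(const FeatureVectorSketch& fv) {
    assert(!fv.qid.empty());  // a line without a query id cannot be grouped
    std::ostringstream out;
    out << fv.label << " qid:" << fv.qid;
    for (std::size_t i = 0; i < fv.values.size(); ++i)
        out << ' ' << (i + 1) << ':' << fv.values[i];  // indices are 1-based
    out << " #docid = " << fv.did;
    return out.str();
}
```

Writing one such line per document, grouped by query id, would give a training file in that layout.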

Add new RankList. The new RankList keeps all the functions of the previous one. I add boundary checking where needed, as well as new output functions that print the RankList's information easily. See on GitHub

Add a new Normalizer and DefaultNormlizer. Normalizer is the base class, used for normalizing the features in a RankList. DefaultNormlizer uses the basic normalization method. See on GitHub
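The base-class arrangement described here can be sketched roughly as follows. The journal does not say which method the default normalizer uses, so this sketch assumes per-feature min-max scaling to [0, 1] across one query's rank list. `NormalizerSketch`, `MinMaxNormalizerSketch`, `Row` and `RankListSketch` are hypothetical names, not the project's real types.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

using Row = std::vector<double>;          // one document's feature values
using RankListSketch = std::vector<Row>;  // all documents for one query

// Hypothetical base class: concrete normalizers override normalize().
class NormalizerSketch {
public:
    virtual ~NormalizerSketch() = default;
    virtual void normalize(RankListSketch& rl) const = 0;
};

// Assumed default behaviour: scale each feature column to [0, 1].
class MinMaxNormalizerSketch : public NormalizerSketch {
public:
    void normalize(RankListSketch& rl) const override {
        if (rl.empty()) return;
        const std::size_t n = rl.front().size();
        for (std::size_t f = 0; f < n; ++f) {
            double lo = rl.front()[f], hi = lo;
            for (const Row& row : rl) {
                assert(row.size() == n);  // every document has the same features
                lo = std::min(lo, row[f]);
                hi = std::max(hi, row[f]);
            }
            const double range = hi - lo;
            // A constant column carries no ranking signal; map it to 0.
            for (Row& row : rl)
                row[f] = range > 0 ? (row[f] - lo) / range : 0.0;
        }
    }
};
```

Keeping the interface in a base class means a different scaling scheme could be swapped in without touching the code that builds the rank list.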

Update Feature to be compatible with FeatureVector. It now generates FeatureVector instances. See on GitHub

July 8 - July 9

Add a get_did function to Feature, which returns the letor-compatible document ID. See on GitHub

Add new functions to FeatureManager: functions that generate RankList and FeatureVector in the training and ranking stages, a function that loads the qrel file, and a function for normalization. See on GitHub

Refactor functions in Letor::Internal. Add the function prepare_training_file, which creates the files used for training, and the function train, which performs the training. Also add functions for reading and writing training data. See on GitHub

Modify Letor to be compatible with Letor::Internal and add necessary comments. See on GitHub

Coding Week 9: July 14-July 20

July 14

Keep track of each document's original index in the MSet: the original index is now stored in FeatureVector. See on GitHub
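One reason to carry the original index along is that, after re-ranking by a learned score, each item can still be mapped back to its entry in the MSet. The sketch below shows that idea in isolation. `ScoredItem` and `rerank_order` are hypothetical names, and a plain vector of scores stands in for the real MSet and FeatureVector types.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical pairing of a ranker score with the item's position in the MSet.
struct ScoredItem {
    std::size_t mset_index;  // original position in the MSet
    double score;            // score assigned by the ranker
};

// Given one score per MSet entry (in MSet order), return the original
// indices sorted by descending score. Ties keep their original MSet order.
std::vector<std::size_t> rerank_order(const std::vector<double>& scores) {
    std::vector<ScoredItem> items;
    items.reserve(scores.size());
    for (std::size_t i = 0; i < scores.size(); ++i)
        items.push_back({i, scores[i]});

    std::stable_sort(items.begin(), items.end(),
                     [](const ScoredItem& a, const ScoredItem& b) {
                         return a.score > b.score;
                     });

    std::vector<std::size_t> order;
    order.reserve(items.size());
    for (const ScoredItem& item : items)
        order.push_back(item.mset_index);
    assert(order.size() == scores.size());  // every entry appears exactly once
    return order;
}
```

For scores `{0.2, 0.9, 0.5}` this yields the order 1, 2, 0, so the second MSet entry would be shown first.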

Add a function that attaches letor items to the MSet. See on GitHub

Learn about the details of Git commits (Thank you, James Aylett :D). Add necessary comments and update copyright. There are also some whitespace changes. See on GitHub

July 15

Add the functions get_cwd and calc. get_cwd returns the current working directory. calc returns a RankList with scores added. See on GitHub

Update SVMRanker to be compatible with the new APIs. Remove the get_cwd function. Implement the calc function. Leave the rank function in a TO-DO state. Clean up include files and update the copyright statement. See on GitHub

Add a create_ranklist function that takes no arguments. See on GitHub

Use the new APIs of FeatureManager and Ranker in Letor::Internal. See on GitHub

Clean up code. See on GitHub

July 16

Review the paper on feature selection.

Coding Week 10: July 21-July 27

Read the program implemented by Parth. Design the workflow of the feature selector to be implemented. Implement the FeatureSelector class for selecting features.

See on GitHub See on GitHub

Coding Week 11: July 28-August 3

Implement all sample programs:

letor-prepare: Prepare training data from qrel file and query file.
letor-training: Use the training data to train the model.
letor-request: Feed the query and use the model trained from letor-training to re-rank the resulting MSet.
letor-select: Feed the training data to select effective features.

See on GitHub

Coding Week 12: August 4-August 10

Bug fixes for all programs: letor-prepare, letor-training, letor-request and letor-select. Write documentation for the letor module.

Coding Week 13: August 11-August 18 (Final evaluation based on work up to August 18)

Clean up code and finish the documentation. Prepare the pull request.

Last modified on 18/08/14 07:15:48
