GSoC2014/Learning to Rank Hanxiao Sun/Journal – Xapian

Community Bonding Week 2: April 28-May 4

Community Bonding Week 3: May 5-May 11

Community Bonding Week 4: May 12-May 19

May 19

Write an NDCG scorer (input: a labeled ranklist; output: the NDCG score).
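A minimal sketch of what such a scorer computes, assuming the labeled ranklist is reduced to a vector of relevance labels in ranked order. The names `dcg` and `ndcg_score` are illustrative and not taken from scorer.cc, and the `rel / log2(position + 1)` discount is only one of the common DCG variants; the journal does not say which one the project uses.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <functional>
#include <vector>

// DCG of relevance labels given in ranked order, using the
// rel_i / log2(i + 1) discount with 1-based positions i.
double dcg(const std::vector<double>& labels) {
    double sum = 0.0;
    for (std::size_t i = 0; i < labels.size(); ++i) {
        // i is 0-based here, so the 1-based position is i + 1.
        sum += labels[i] / std::log2(static_cast<double>(i) + 2.0);
    }
    return sum;
}

// NDCG = DCG of the given order / DCG of the ideal order
// (labels sorted in descending order). Returns 0 when no
// document is relevant, to avoid dividing by zero.
double ndcg_score(const std::vector<double>& labels) {
    std::vector<double> ideal(labels);
    std::sort(ideal.begin(), ideal.end(), std::greater<double>());
    const double ideal_dcg = dcg(ideal);
    return ideal_dcg > 0.0 ? dcg(labels) / ideal_dcg : 0.0;
}
```

A list that is already in the ideal order scores 1.0, and any other order scores lower.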

Coding Week 1: May 20-May 25

Coding Week 2: May 26-June 1

May 26

In the previous version, the ranker returned a vector of document scores, and questletor.cc then sorted the documents by score. To make the metric module usable, I changed the return type of the ranker::rank function. Now svmranker returns a sorted ranklist directly.

May 27

Add a sort function to ranklist that sorts the feature vectors by score. During training, the ranker only updates the score in each feature vector. After training, the ranker can return the final ranked ranklist by calling the sort function in the ranklist.
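The idea can be sketched as follows. `FeatureVector` and `RankList` here are simplified, hypothetical stand-ins for the letor classes (only the fields needed for sorting are modelled), and the choice of a stable sort is an assumption rather than something the journal states.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical stand-in for the letor feature vector.
struct FeatureVector {
    int did;       // document identifier (illustrative)
    double score;  // score assigned by the ranker during training/ranking
};

// Hypothetical stand-in for the letor ranklist.
struct RankList {
    std::vector<FeatureVector> docs;

    // Order documents by descending score. stable_sort keeps the
    // original relative order of documents whose scores tie.
    void sort_by_score() {
        std::stable_sort(docs.begin(), docs.end(),
                         [](const FeatureVector& a, const FeatureVector& b) {
                             return a.score > b.score;
                         });
    }
};
```

With this split, a ranker only writes `score` fields and leaves the ordering to the ranklist, which is what lets every ranker share the same sorting code.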

May 28

Discussed the new structure of the MSet with Jiarong. I will continue my work on the ranklist and finish ListNet as soon as possible. We will then have a meeting before the midterm to discuss the specific APIs of the letor module. Jiarong will refactor the existing code around the new MSet (which takes the place of ranklist), and I will adjust my code afterwards.

Update some functions in ranklist and svmrank. Remove the current listnet and listmle from the makefile.

May 29

fix some bugs

add the ERR scorer to scorer.cc (needs to be tested tomorrow)

push recent progress to GitHub (branch: gsoc2014)

May 30

Add two test cases to verify ndcg_scorer() and err_scorer(). The NDCG scorer test uses the example from Wikipedia (http://en.wikipedia.org/wiki/Discounted_cumulative_gain) and the ERR scorer test uses the one from the LingPipe blog (http://lingpipe-blog.com/2010/03/09/chapelle-metzler-zhang-grinspan-2009-expected-reciprocal-rank-for-graded-relevance/). Both scorers return the same scores as these two sources.
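For reference, a minimal sketch of the ERR computation as defined by Chapelle et al.: a document with grade g satisfies the user with probability (2^g - 1) / 2^g_max, and ERR is the expected reciprocal rank at which the user stops. The function name `err_score` and the explicit `max_grade` parameter are assumptions made for illustration; the actual signature of err_scorer() is not shown in this journal.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Expected Reciprocal Rank for graded relevance labels given in
// ranked order. max_grade is the highest grade on the label scale.
double err_score(const std::vector<int>& grades, int max_grade) {
    double err = 0.0;
    double p_reach = 1.0;  // probability that the user reaches this rank
    for (std::size_t r = 0; r < grades.size(); ++r) {
        // Probability that the document at this rank satisfies the user.
        const double satisfied =
            (std::pow(2.0, grades[r]) - 1.0) / std::pow(2.0, max_grade);
        // The user stops here with probability p_reach * satisfied,
        // contributing 1 / (1-based rank) to the expectation.
        err += p_reach * satisfied / static_cast<double>(r + 1);
        p_reach *= 1.0 - satisfied;
    }
    return err;
}
```

For example, two documents of grade 1 on a 0 to 1 scale give 0.5 + 0.5 * 0.5 / 2 = 0.625.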

Coding Week 3: June 2-June 8

June 2

Following James's comments, clean up scorer.cc: delete all the test code, add some comments, and fix the "dangling else" bug.

June 3

Jiarong will design the APIs for Letor, so I will rewrite ListNet and ListMLE first.
Read the ListNet paper again.
Create a new file for listnetranker (not pushed to GitHub yet).

June 4

fix the bug in ranklist::sort_by_score() and move the rank function from the specific ranker (svmranker) into ranklist
found some points in the letor model that need to be discussed; I will write an e-mail to the community about them tomorrow.

June 5

think through the workflow of the letor module and write an e-mail to discuss my ideas with the community.
Read the ListMLE paper.

Coding Week 4: June 9-June 15

June 9

rename svmranker to svm_ranker
delete existing listmle and listnet
add a listnet_ranker and finish its framework

June 10

rename some functions so they are easier to identify
add a get_fcount() function to get the number of features from featurevector
finish the full implementation of rank() and a partial implementation of train_model() in listnet_ranker.cc

June 11

add the crossEntropy() function to compute the cross entropy
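As a rough illustration of the quantity involved: in the ListNet paper, both the relevance labels and the model scores of one query are turned into top-one probabilities with a softmax, and the loss is the cross entropy between the two distributions. The sketch below assumes plain `std::vector<double>` inputs of equal length; the names `top_one_probability` and `cross_entropy` are illustrative and do not reflect the actual signature of crossEntropy().

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Top-one probabilities: a softmax over the values. The maximum is
// subtracted before exponentiating to avoid overflow.
std::vector<double> top_one_probability(const std::vector<double>& values) {
    std::vector<double> probs(values.size());
    if (values.empty()) return probs;
    const double max_value = *std::max_element(values.begin(), values.end());
    double total = 0.0;
    for (std::size_t i = 0; i < values.size(); ++i) {
        probs[i] = std::exp(values[i] - max_value);
        total += probs[i];
    }
    for (double& p : probs) p /= total;
    return probs;
}

// Cross entropy between the label distribution and the score
// distribution for one query. Assumes labels.size() == scores.size().
// Extremely negative scores can underflow to probability 0, which
// would make the logarithm diverge; that case is not handled here.
double cross_entropy(const std::vector<double>& labels,
                     const std::vector<double>& scores) {
    const std::vector<double> target = top_one_probability(labels);
    const std::vector<double> predicted = top_one_probability(scores);
    double loss = 0.0;
    for (std::size_t i = 0; i < target.size(); ++i) {
        loss -= target[i] * std::log(predicted[i]);
    }
    return loss;
}
```

The loss is smallest when the scores induce the same distribution as the labels, so scores that agree with the label order give a lower value than scores that reverse it.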

June 12

implement the train_model() according to the paper

Coding Week 5: June 16-June 22

Coding Week 6: June 23-June 29 (Midterm deadline June 27)

June 24

move get_cwd() into ranker
finish the functions for loading and saving the model in listnet

June 26

add two options to questletor for selecting a specific ranker and metric

Coding Week 7: June 30-July 6

June 30

design and implement a new scorer structure so that the metric can be assigned when the ranker is created

July 2

compare ListNet with RankLib and optimize some code

July 3

write an e-mail asking about the details of ListNet
read the ListMLE paper

Coding Week 8: July 7-July 13

July 8

finish the framework of ListMLE

July 9

do some optimizations on the ListNet

July 10

find and read three open source implementations of ListMLE

July 11

finish the main training process of ListMLE
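For context, the quantity ListMLE training minimizes is the negative log-likelihood of the ground-truth permutation under the Plackett-Luce model. A minimal sketch of that loss for one query follows; it assumes the scores are already arranged in ground-truth order (most relevant first), and the name `listmle_loss` is illustrative rather than taken from the project code.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// ListMLE loss for one query:
//   sum over positions i of [ log(sum_{k >= i} exp(s_k)) - s_i ]
// where s is the score vector in ground-truth order.
// Walking from the last position to the first lets the tail sum be
// accumulated in a single pass. Large scores could overflow exp();
// this sketch does not guard against that.
double listmle_loss(const std::vector<double>& scores_in_true_order) {
    double loss = 0.0;
    double tail_sum = 0.0;  // sum of exp(score) over positions i..n-1
    for (std::size_t i = scores_in_true_order.size(); i-- > 0;) {
        tail_sum += std::exp(scores_in_true_order[i]);
        loss += std::log(tail_sum) - scores_in_true_order[i];
    }
    return loss;
}
```

With all scores equal, every permutation of n documents is equally likely, so the loss is log(n!); scores that decrease along the ground-truth order give a smaller loss than scores that increase.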

Coding Week 9: July 14-July 20

July 14

finish the remaining parts of ListMLE

July 15

add some comments and push recent work to GitHub
todo: remove the hard-coded values in featuremanager (the length of the feature list; features are currently stored starting from index 1, not 0)

July 17

read the AdaRank paper
todo: Jiarong seems to have finished the new MSet; I need to have a look at his work

Coding Week 10: July 21-July 27

July 21

read two open source implementations of AdaRank and compare their code to the paper
finish the framework of AdaRank

July 22

finish the main training process of Adarank
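The two formulas at the core of each AdaRank boosting round, as given in the AdaRank paper, can be sketched as below. The per-query performance values are assumed to come from a metric in [-1, 1] (for example NDCG), and the function names are hypothetical; nothing here reflects the actual structure of the project's AdaRank code.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Weight alpha_t of the weak ranker chosen in round t:
//   alpha_t = 0.5 * ln( sum_i P(i) * (1 + E_i) / sum_i P(i) * (1 - E_i) )
// query_weights is the current distribution P over training queries and
// performance[i] is the weak ranker's metric value E_i on query i.
// If every E_i equals 1 the denominator is zero; not handled here.
double weak_ranker_weight(const std::vector<double>& query_weights,
                          const std::vector<double>& performance) {
    double numerator = 0.0;
    double denominator = 0.0;
    for (std::size_t i = 0; i < query_weights.size(); ++i) {
        numerator += query_weights[i] * (1.0 + performance[i]);
        denominator += query_weights[i] * (1.0 - performance[i]);
    }
    return 0.5 * std::log(numerator / denominator);
}

// New query distribution: P(i) is proportional to exp(-E_i), where E_i
// is the metric value of the combined model on query i. Queries the
// current model handles badly get more weight in the next round.
std::vector<double> reweight_queries(
        const std::vector<double>& combined_performance) {
    std::vector<double> weights(combined_performance.size());
    double total = 0.0;
    for (std::size_t i = 0; i < weights.size(); ++i) {
        weights[i] = std::exp(-combined_performance[i]);
        total += weights[i];
    }
    for (double& w : weights) w /= total;
    return weights;
}
```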

July 25

finish the remaining parts of AdaRank
todo: optimize how the feature length of the training set is obtained

Coding Week 11: July 28-August 3

August 2

read the Borda Fuse paper for the ranking aggregation module

August 3

finish the ranking aggregation module (for now, implemented as a temporary function in rank.cc)
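A simplified sketch of Borda-count aggregation, the idea behind Borda Fuse: each input ranking awards points to documents by position, and the fused ranking orders documents by total points. It assumes documents are identified by integer ids and that a document missing from a ranking simply receives no points from it, which is a simplification of the published method. The function name `borda_fuse` is hypothetical and unrelated to the temporary function mentioned above.

```cpp
#include <algorithm>
#include <cstddef>
#include <map>
#include <utility>
#include <vector>

// Fuse several rankings (each a list of document ids, best first).
// In a ranking of n documents the top one earns n points, the next
// n - 1, and so on. Ties in total points are broken by ascending id.
std::vector<int> borda_fuse(const std::vector<std::vector<int>>& rankings) {
    std::map<int, double> points;
    for (const std::vector<int>& ranking : rankings) {
        const std::size_t n = ranking.size();
        for (std::size_t pos = 0; pos < n; ++pos) {
            points[ranking[pos]] += static_cast<double>(n - pos);
        }
    }

    // The map iterates in ascending id order, so a stable sort on the
    // point totals keeps that order among tied documents.
    std::vector<std::pair<int, double>> totals(points.begin(), points.end());
    std::stable_sort(totals.begin(), totals.end(),
                     [](const std::pair<int, double>& a,
                        const std::pair<int, double>& b) {
                         return a.second > b.second;
                     });

    std::vector<int> fused;
    fused.reserve(totals.size());
    for (const std::pair<int, double>& entry : totals) {
        fused.push_back(entry.first);
    }
    return fused;
}
```

For the three rankings {1, 2, 3}, {2, 1, 3} and {2, 3, 1}, document 2 collects 8 points, document 1 collects 6 and document 3 collects 4, so the fused order is 2, 1, 3.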

Coding Week 12: August 4-August 10

August 4

read the code of the new MSet designed by Jiarong and learn how to use "rebase"

August 8

read some material about the LETOR 4.0 dataset and think about the evaluation framework

August 10

begin to write the framework of the evaluation, on a new branch gsoc2014-evaluation.

Coding Week 13: August 11-August 18 (Final evaluation based on work up to August 18)

August 12

finish the evaluation framework.

August 13

make some modifications to accommodate the LETOR 4.0 dataset and finish a basic evaluation.

August 15

finish the evaluation work.
write a report (Report).

August 17

clean up the code and add some copyright information.
add more details to the report (Report).

Last modified on 17/08/14 17:20:45
