wiki:GSoC2018/LTR/ProjectPlan

April 23 – May 14 2018: Community Bonding Period:

Expected Deliverables: All working data downloaded and testing started, with preliminary test examples written and any bugs found reported and fixed.

  1. Writing tests.
  1. Setting up tools required.

Also, I will not be able to contribute very actively in the last week because of my end-of-semester exams from May 7 to May 14.

Week 1: May 14 - May 21

Testing and benchmarking period.

Expected Deliverables: Benchmark reports for each of the INEX2009 and FIRE datasets, and a report on run-time performance.

  1. Run benchmarking tests on the datasets (INEX and FIRE).
  1. Report the evaluation to the mentors.
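
The plan does not name the evaluation metric for these benchmarks. As one possibility, a minimal NDCG@k sketch is shown here; the choice of metric and the function names `dcg_at_k` and `ndcg_at_k` are assumptions for illustration, not part of xapian-letor.

```python
import math

def dcg_at_k(relevances, k):
    """Discounted cumulative gain over the top k results, in ranked order."""
    # rank is 0-based, so the discount for the first result is log2(2) = 1.
    return sum((2 ** rel - 1) / math.log2(rank + 2)
               for rank, rel in enumerate(relevances[:k]))

def ndcg_at_k(relevances, k):
    """DCG normalised by the DCG of the ideal (descending) ordering."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0

# Hypothetical ranking: the relevant document was placed second.
ranked_labels = [0, 1]
score = ndcg_at_k(ranked_labels, k=2)
```

A per-query score like this can be averaged over all queries of a dataset to give one number per ranker in the report.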
  1. Make sure xapian-letor is stable by reporting and fixing bugs if any.

Week 2: May 21 - May 28

Testing and benchmarking period.

Expected Deliverables: Stability of xapian-letor stress tested with various example datasets.

  1. Writing examples to test proper execution.
  1. Make sure xapian-letor is stable by reporting and fixing bugs if any.
  1. Cleaning up any previous work.

Week 3-4: May 28 - June 8:

The next goal is to add a regression that combines multiple rankers.

Expected deliverables: Most of the regression implemented, with most of it working.

  1. Implement the feedforward backpropagation algorithm to combine various rankers: assign random initial weights, then let the algorithm adjust them according to the learning rate.
  1. Add regression to xapian-letor, along with the tests.
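
As a rough illustration of the first item, the sketch below trains a single linear unit that combines the scores of several rankers. It is a deliberately simplified stand-in (one layer, squared-error loss, plain Python) rather than the planned xapian-letor code; `train_combiner`, its parameters, and the toy data are all assumptions.

```python
import random

def train_combiner(ranker_scores, targets, learning_rate=0.1, epochs=500, seed=0):
    """Learn one weight per ranker so the weighted sum of scores fits the targets.

    ranker_scores: one row per document, one score per ranker.
    targets: one relevance label per document.
    """
    rng = random.Random(seed)
    n_rankers = len(ranker_scores[0])
    # Start from small random weights, as the plan describes.
    weights = [rng.uniform(-0.5, 0.5) for _ in range(n_rankers)]
    for _ in range(epochs):
        for row, target in zip(ranker_scores, targets):
            # Forward pass: weighted sum of the individual ranker scores.
            prediction = sum(w * s for w, s in zip(weights, row))
            error = prediction - target
            # Backward pass: gradient of 0.5 * error**2 with respect to each weight.
            for j, s in enumerate(row):
                weights[j] -= learning_rate * error * s
    return weights

# Toy data (hypothetical): ranker 0 matches the labels, ranker 1 does not.
scores = [[1.0, 0.2], [0.0, 0.9], [1.0, 0.7], [0.0, 0.1]]
labels = [1.0, 0.0, 1.0, 0.0]
weights = train_combiner(scores, labels)
```

On this toy data the weight for ranker 0 moves towards 1 and the weight for ranker 1 towards 0. A network with hidden layers would propagate the same error signal back through each layer in turn.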

Week 5: June 8 - June 15:

Expected deliverables: All of the regression code cleaned up, and ready to merge. This period will be kept as a buffer for any pending work.

  1. Complete and clean up the code.
  1. Use the remaining time as a buffer for any pending work.
  1. Prepare the Phase 1 evaluation report.

Week 6: June 15 - June 22:

Expected deliverables: Principal component analysis (PCA) implemented, maintaining basic input vector dimensions and giving a FeatureVector space output.

  1. Implementing independent principal component analysis.
  1. Check that it functions correctly.
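
To make the PCA step concrete, here is a small standalone sketch in plain Python. It finds the leading components by power iteration with deflation, which is only one of several ways to compute PCA; the function names and the toy vectors are assumptions and do not reflect the xapian-letor FeatureVector API.

```python
import math
import random

def center(rows):
    """Subtract the per-dimension mean from every feature vector."""
    n, dims = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(dims)]
    return [[r[j] - means[j] for j in range(dims)] for r in rows]

def covariance(centered):
    """Sample covariance matrix of already-centered rows."""
    n, dims = len(centered), len(centered[0])
    return [[sum(r[a] * r[b] for r in centered) / (n - 1) for b in range(dims)]
            for a in range(dims)]

def leading_eigenpair(matrix, iterations=200, seed=0):
    """Power iteration: dominant eigenvector and eigenvalue of a symmetric matrix."""
    rng = random.Random(seed)
    dims = len(matrix)
    v = [rng.uniform(-1.0, 1.0) for _ in range(dims)]
    norm = math.sqrt(sum(x * x for x in v))
    v = [x / norm for x in v]
    for _ in range(iterations):
        w = [sum(matrix[a][b] * v[b] for b in range(dims)) for a in range(dims)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    mv = [sum(matrix[a][b] * v[b] for b in range(dims)) for a in range(dims)]
    return v, sum(x * y for x, y in zip(v, mv))

def pca(rows, k):
    """Return the top k components and the rows projected onto them."""
    centered = center(rows)
    cov = covariance(centered)
    components = []
    for _ in range(k):
        v, lam = leading_eigenpair(cov)
        components.append(v)
        # Deflation: remove the found component so the next one can emerge.
        cov = [[cov[a][b] - lam * v[a] * v[b] for b in range(len(v))]
               for a in range(len(v))]
    projected = [[sum(x * c for x, c in zip(row, comp)) for comp in components]
                 for row in centered]
    return components, projected

# Hypothetical 2-D feature vectors lying on the line y = 2x.
vectors = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0], [4.0, 8.0]]
components, reduced = pca(vectors, k=1)
```

Here the single component points along the direction (1, 2), up to sign, and each vector is reduced to one coordinate along it.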

Week 7: June 22 - June 29:

Expected deliverables: PCA implementation merged into xapian-letor.

  1. Implementing principal component analysis in the Xapian module.
  1. Writing tests for the same.

Week 8-9: June 29 - July 13:

Expected deliverables: Any previous work not delivered.

  1. Get evaluation for the PCA implementation and get it merged into the main module.
  1. Clean code and get done with documentation.

Week 10-11: July 13 - July 27:

Expected deliverables: AdaRank added to the xapian-letor rankers.

  1. Work through the implementation details of the work done by Vhasu and integrate it into Xapian.
  1. Clean up and document the code, then make sure it still works.
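
For orientation, the sketch below follows the general shape of the AdaRank boosting loop: pick the single feature that performs best under the current query weights, give it a weight, then shift the query weights towards queries the combined ranker still handles badly. It is not the existing implementation mentioned above; the toy metric `precision_at_1`, the data layout, and the guard against a zero denominator are assumptions.

```python
import math

def precision_at_1(scores, labels):
    """Toy metric in [0, 1]: 1.0 if the top-scored document is relevant."""
    best = max(range(len(scores)), key=lambda i: scores[i])
    return 1.0 if labels[best] > 0 else 0.0

def adarank(queries, rounds=2, metric=precision_at_1):
    """queries: list of (docs, labels); each doc is a list of feature values."""
    m = len(queries)
    n_features = len(queries[0][0][0])
    dist = [1.0 / m] * m          # weight of each training query
    alphas = {}                   # feature index -> accumulated weight
    for _ in range(rounds):
        def weighted_perf(k):
            return sum(p * metric([d[k] for d in docs], labels)
                       for p, (docs, labels) in zip(dist, queries))
        # Weak ranker: the single feature with the best weighted metric.
        best_k = max(range(n_features), key=weighted_perf)
        perf = [metric([d[best_k] for d in docs], labels)
                for docs, labels in queries]
        num = sum(p * (1 + e) for p, e in zip(dist, perf))
        den = sum(p * (1 - e) for p, e in zip(dist, perf))
        den = max(den, 1e-12)     # assumed guard for a perfect weak ranker
        alphas[best_k] = alphas.get(best_k, 0.0) + 0.5 * math.log(num / den)
        # Re-weight queries by how badly the combined ranker does on them.
        combined = [metric([sum(a * d[k] for k, a in alphas.items()) for d in docs],
                           labels)
                    for docs, labels in queries]
        exps = [math.exp(-e) for e in combined]
        total = sum(exps)
        dist = [x / total for x in exps]
    return alphas, dist

# Hypothetical data: feature 0 is right on queries 0 and 1, feature 1 on query 2.
queries = [
    ([[0.9, 0.1], [0.2, 0.8]], [1, 0]),
    ([[0.8, 0.2], [0.1, 0.3]], [1, 0]),
    ([[0.7, 0.2], [0.3, 0.9]], [0, 1]),
]
alphas, dist = adarank(queries, rounds=2)
```

After the first round the query that feature 0 gets wrong carries the most weight, which is why the second round selects feature 1.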

Week 12-13: July 27 - Aug 10:

Expected deliverables: All code completed so far cleaned up and documented, with proper tests, and one of the stretch goals pursued to the point where it is mergeable into the main project.

Work on the stretch goals, clean up existing code, and write good tests for it.

  1. Add backend support for tracking the length of fields, to allow implementation of weighting schemes like BM25F.
  1. Add OpenCL and OpenMP parallelization support for training models, to improve overall performance.
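
To show why BM25F needs per-field lengths, here is a hedged sketch of the weight of a single term under the commonly described BM25F formulation. The function name, parameter names, and sample values are assumptions; nothing here is an existing Xapian API.

```python
def bm25f_term_weight(field_tfs, field_lengths, avg_field_lengths,
                      field_boosts, field_b, idf, k1=1.2):
    """Weight of one term in one document, combining its fields.

    All dict arguments are keyed by field name (hypothetical layout).
    """
    pseudo_tf = 0.0
    for field, tf in field_tfs.items():
        # Length normalisation uses this field's length and the field average,
        # which is the statistic the backend would have to track.
        norm = (1.0 - field_b[field]
                + field_b[field] * field_lengths[field] / avg_field_lengths[field])
        pseudo_tf += field_boosts[field] * tf / norm
    # Saturate the combined frequency once, after merging the fields.
    return idf * pseudo_tf / (k1 + pseudo_tf)

# Hypothetical document with a title and a body, both of average length.
weight = bm25f_term_weight(
    field_tfs={"title": 1, "body": 2},
    field_lengths={"title": 5, "body": 100},
    avg_field_lengths={"title": 5, "body": 100},
    field_boosts={"title": 2.0, "body": 1.0},
    field_b={"title": 0.75, "body": 0.75},
    idf=1.0,
)

# The same term counts in a body twice as long as average score lower.
longer = bm25f_term_weight(
    field_tfs={"title": 1, "body": 2},
    field_lengths={"title": 5, "body": 200},
    avg_field_lengths={"title": 5, "body": 100},
    field_boosts={"title": 2.0, "body": 1.0},
    field_b={"title": 0.75, "body": 0.75},
    idf=1.0,
)
```

Without stored field lengths the normalisation term cannot be computed per field, so the scheme would fall back to treating every field as if it had average length.
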
Last modified on 27/05/18 23:51:59