**April 23 - May 14, 2018: Community Bonding Period**

Expected Deliverables: All working data downloaded and testing started, with preliminary test examples and reports, as well as fixes for any bugs found.

1. Writing tests.

2. Setting up the required tools.

Also, I will not be able to contribute very actively in the last week of this period due to my end-semester exams from May 7 to May 14.


**Week 1: May 14 - May 21**

Testing and benchmarking period.

Expected Deliverables: Benchmark reports for the INEX 2009 and FIRE datasets, along with a report on run-time performance.

1. Run benchmarking tests on the datasets (INEX and FIRE); a sketch of the kind of rank-quality metric such reports rely on follows this list.

2. Report the evaluation results to the mentors.

3. Make sure xapian-letor is stable by reporting and fixing any bugs found.
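
For illustration, here is a minimal, self-contained sketch of NDCG@k, a standard rank-quality metric that learning-to-rank benchmark reports are typically based on. The function names are hypothetical and not part of the xapian-letor API.

```cpp
#include <algorithm>
#include <cmath>
#include <functional>
#include <iostream>
#include <vector>

// Discounted cumulative gain over the first k relevance labels.
static double dcg_at_k(const std::vector<int>& labels, size_t k) {
    double dcg = 0.0;
    for (size_t i = 0; i < labels.size() && i < k; ++i) {
        // Standard gain/discount: (2^rel - 1) / log2(rank + 2).
        dcg += (std::pow(2.0, labels[i]) - 1.0) / std::log2(i + 2.0);
    }
    return dcg;
}

// NDCG@k: DCG of the ranking divided by DCG of the ideal ordering.
static double ndcg_at_k(std::vector<int> labels, size_t k) {
    double dcg = dcg_at_k(labels, k);
    std::sort(labels.begin(), labels.end(), std::greater<int>());
    double ideal = dcg_at_k(labels, k);
    return ideal > 0.0 ? dcg / ideal : 0.0;
}

int main() {
    // Relevance labels of documents in the order the ranker returned them.
    std::vector<int> ranked = {2, 0, 1, 2, 0};
    std::cout << "NDCG@5 = " << ndcg_at_k(ranked, 5) << '\n';
}
```
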
**Week 2: May 21 - May 28**

Testing and benchmarking period.

Expected Deliverables: Stability of xapian-letor stress-tested with various example datasets.

1. Writing examples to test proper execution.

2. Make sure xapian-letor is stable by reporting and fixing any bugs found.

3. Cleaning up any previous work.


**Week 3-4: May 28 - June 8**

The next goal is to add a regression model that combines multiple rankers.

Expected deliverables: Most of the regression implemented and working.

1. Implement the feedforward backpropagation algorithm to combine various rankers: assign random weights, then let them adjust according to the algorithm and the learning rate (a minimal sketch of this idea follows this list).

2. Add the regression to xapian-letor, along with tests.
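
A minimal sketch of that combination step, assuming a single linear layer trained by gradient descent: each training example is the vector of scores the individual rankers give a document, the target is its relevance label, and the weights start random and are adjusted using the learning rate. All names and data here are illustrative, not xapian-letor code.

```cpp
#include <cstdlib>
#include <iostream>
#include <vector>

int main() {
    // Per-document scores from three hypothetical rankers, with targets.
    std::vector<std::vector<double>> scores = {
        {0.9, 0.4, 0.7}, {0.1, 0.2, 0.3}, {0.8, 0.9, 0.6}};
    std::vector<double> target = {1.0, 0.0, 1.0};

    // Start from small random weights, as described above.
    std::vector<double> w(3);
    for (double& wi : w) wi = (std::rand() / double(RAND_MAX)) * 0.1;

    const double learning_rate = 0.1;
    for (int epoch = 0; epoch < 100; ++epoch) {
        for (size_t i = 0; i < scores.size(); ++i) {
            // Forward pass: the combined score is a weighted sum.
            double out = 0.0;
            for (size_t j = 0; j < w.size(); ++j)
                out += w[j] * scores[i][j];
            // Backward pass: the squared-error gradient drives the update.
            double err = out - target[i];
            for (size_t j = 0; j < w.size(); ++j)
                w[j] -= learning_rate * err * scores[i][j];
        }
    }
    for (size_t j = 0; j < w.size(); ++j)
        std::cout << "w" << j << " = " << w[j] << '\n';
}
```
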
**Week 5: June 8 - June 15**

Expected deliverables: All of the regression code cleaned up and ready to merge. This period will be kept as a buffer for any pending work.

1. Complete and clean up the code.

2. Use the period as a buffer for any pending work.

3. Submit the Phase 1 evaluation report.


**Week 6: June 15 - June 22**

Expected deliverables: Principal component analysis implemented, taking input vectors of the basic dimensions and producing output in the FeatureVector space.

1. Implement principal component analysis as a standalone component (a sketch of the core computation follows this list).

2. Check that it functions correctly.
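
A standalone sketch of the core PCA computation, using power iteration on the covariance matrix and recovering only the top principal component for brevity. The approach and all names here are illustrative assumptions; the real implementation would extract all the components and work with the FeatureVector types.

```cpp
#include <cmath>
#include <iostream>
#include <vector>

int main() {
    // Each row is one (toy) feature vector.
    std::vector<std::vector<double>> data = {
        {2.5, 2.4}, {0.5, 0.7}, {2.2, 2.9}, {1.9, 2.2}, {3.1, 3.0}};
    size_t n = data.size(), d = data[0].size();

    // Center each dimension on its mean.
    std::vector<double> mean(d, 0.0);
    for (const auto& row : data)
        for (size_t j = 0; j < d; ++j) mean[j] += row[j] / n;
    for (auto& row : data)
        for (size_t j = 0; j < d; ++j) row[j] -= mean[j];

    // Covariance matrix C = X^T X / (n - 1).
    std::vector<std::vector<double>> cov(d, std::vector<double>(d, 0.0));
    for (const auto& row : data)
        for (size_t j = 0; j < d; ++j)
            for (size_t k = 0; k < d; ++k)
                cov[j][k] += row[j] * row[k] / (n - 1);

    // Power iteration: repeatedly apply C and renormalise.
    std::vector<double> v(d, 1.0);
    for (int it = 0; it < 100; ++it) {
        std::vector<double> cv(d, 0.0);
        for (size_t j = 0; j < d; ++j)
            for (size_t k = 0; k < d; ++k) cv[j] += cov[j][k] * v[k];
        double norm = 0.0;
        for (double x : cv) norm += x * x;
        norm = std::sqrt(norm);
        for (size_t j = 0; j < d; ++j) v[j] = cv[j] / norm;
    }
    std::cout << "Top principal component: ";
    for (double x : v) std::cout << x << ' ';
    std::cout << '\n';
}
```
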
**Week 7: June 22 - June 29**

Expected deliverables: The PCA implementation merged into xapian-letor.

1. Implementing principal component analysis in the Xapian module.

2. Writing tests for the same.


**Week 8-9: June 29 - July 13**

Expected deliverables: Completion of any previously undelivered work.

1. Get the PCA implementation evaluated and merged into the main module.

2. Clean up the code and finish the documentation.


**Week 10-11: July 13 - July 27**

Expected deliverables: AdaRank added to the xapian-letor rankers.

1. Chalk out the implementation details of Vhasu's work and integrate it into xapian (a rough sketch of one AdaRank boosting round follows this list).

2. Ensure everything works after cleaning up and documenting the code.
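
For reference, a rough and simplified sketch of one AdaRank boosting round: each query carries a weight, the weak ranker with the best weighted performance is picked, its coefficient alpha is computed, and the query weights are re-focused on the queries it handles badly. This is a generic illustration of the published algorithm (simplified here to re-weight by the chosen weak ranker alone), not the interface of Vhasu's work.

```cpp
#include <cmath>
#include <iostream>
#include <vector>

int main() {
    // E[t][q]: evaluation (e.g. NDCG mapped into [-1, 1]) of weak ranker
    // t on query q. Two weak rankers and three queries, toy values.
    std::vector<std::vector<double>> E = {{0.8, -0.2, 0.5},
                                          {0.1, 0.6, -0.3}};
    size_t nq = E[0].size();
    std::vector<double> P(nq, 1.0 / nq);  // query weights, start uniform

    // Pick the weak ranker with the best weighted performance.
    size_t best = 0;
    double best_score = -1e9;
    for (size_t t = 0; t < E.size(); ++t) {
        double s = 0.0;
        for (size_t q = 0; q < nq; ++q) s += P[q] * E[t][q];
        if (s > best_score) { best_score = s; best = t; }
    }

    // AdaRank's ranker weight alpha, then re-weight the queries so the
    // next round focuses on queries this ranker handles badly.
    double num = 0.0, den = 0.0;
    for (size_t q = 0; q < nq; ++q) {
        num += P[q] * (1.0 + E[best][q]);
        den += P[q] * (1.0 - E[best][q]);
    }
    double alpha = 0.5 * std::log(num / den);
    double z = 0.0;
    for (size_t q = 0; q < nq; ++q) {
        P[q] = std::exp(-alpha * E[best][q]);
        z += P[q];
    }
    for (double& p : P) p /= z;

    std::cout << "picked ranker " << best << ", alpha = " << alpha << '\n';
}
```
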
**Week 12-13: July 27 - Aug 10**

Expected deliverables: Clean, documented code for the work completed so far, with proper tests, mergeable into the main project, plus progress on one of the stretch goals.

Work on the stretch goals, clean up the existing code, and write good tests for it.

1. Add backend support for tracking field lengths, to allow the implementation of weighting schemes like BM25F (a sketch of the field-length normalisation follows this list).

2. As a further stretch goal, add OpenCL and OpenMP parallelization support to model training, improving overall performance (an OpenMP sketch follows as well).
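
To show why per-field lengths need backend support, here is a sketch of the BM25F scoring idea: each field's term frequency is soft-normalised by that field's length before the frequencies are combined and fed through the usual BM25 saturation. All parameter values and struct names are illustrative assumptions, not Xapian API.

```cpp
#include <iostream>
#include <vector>

struct Field {
    double tf;       // term frequency of the query term in this field
    double len;      // length of this field in the document
    double avg_len;  // average length of this field over the collection
    double weight;   // per-field boost (e.g. title > body)
    double b;        // per-field length-normalisation parameter
};

int main() {
    std::vector<Field> fields = {
        {2, 8, 10, 3.0, 0.75},    // title
        {5, 300, 250, 1.0, 0.75}  // body
    };
    // BM25F: soft-normalise each field's tf by its length, then combine.
    // This per-field length is exactly what the backend must track.
    double tf_combined = 0.0;
    for (const auto& f : fields)
        tf_combined += f.weight * f.tf /
                       (1.0 - f.b + f.b * f.len / f.avg_len);
    // Feed the combined tf through the usual BM25 saturation.
    const double k1 = 1.2, idf = 2.0;  // idf would come from the index
    double score = idf * tf_combined / (k1 + tf_combined);
    std::cout << "BM25F contribution = " << score << '\n';
}
```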
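
And a minimal sketch of the kind of OpenMP parallelization intended for training: the gradient contributions of independent training examples are accumulated across threads with a reduction. Compile with -fopenmp; the one-weight regressor here is purely hypothetical, for illustration.

```cpp
#include <iostream>
#include <vector>

int main() {
    const size_t n = 1000000;
    std::vector<double> feature(n, 0.5), target(n, 1.0);
    double w = 0.0, grad = 0.0;

    // Each iteration is independent, so the gradient sum can be
    // accumulated in parallel with an OpenMP reduction.
    #pragma omp parallel for reduction(+ : grad)
    for (long i = 0; i < (long)n; ++i) {
        double err = w * feature[i] - target[i];
        grad += err * feature[i];
    }
    w -= 0.1 * grad / n;  // one gradient-descent step
    std::cout << "w after one step = " << w << '\n';
}
```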