wiki:GSoC2017/Clustering/Timeline

Timeline:

Coding Week 9: July 15–July 21

Work on the dimensionality reduction PRs #167 (stopword removal) and #168 (stemming) and get them merged. Current work includes adding appropriate tests for StemStopper, fixing the tests that broke when stemming was added, and applying the changes suggested during PR review. Also profile the code to find any other small optimizations that are possible.
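As a rough illustration of why stopword removal and stemming count as dimensionality reduction, the sketch below measures how many distinct terms survive each step. It is not the StemStopper implementation: `STOPWORDS`, `toy_stem` and `vocabulary` are hypothetical names, and the suffix stripper is a deliberately crude stand-in for a real stemming algorithm.

```python
# Hypothetical, tiny stopword list used only for this illustration.
STOPWORDS = {"the", "a", "an", "and", "of", "to", "in", "is", "are"}


def toy_stem(word):
    """Strip one common English suffix. A crude stand-in for a real stemmer."""
    for suffix in ("ing", "ed", "s"):
        # Keep at least three characters so short words are left alone.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def vocabulary(docs, stop=False, stem=False):
    """Return the set of distinct terms, i.e. the dimensions of the term space."""
    terms = set()
    for doc in docs:
        for word in doc.lower().split():
            if stop and word in STOPWORDS:
                continue  # stopwords add dimensions but little signal
            terms.add(toy_stem(word) if stem else word)
    return terms


docs = [
    "the cluster is clustering documents",
    "clusters of documents and a document",
]
raw_terms = vocabulary(docs)
stopped_terms = vocabulary(docs, stop=True)
reduced_terms = vocabulary(docs, stop=True, stem=True)
print(len(raw_terms), len(stopped_terms), len(reduced_terms))
```

Fewer distinct terms means shorter document vectors, so every similarity computation during clustering touches less data.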

Coding Week 10: July 22–July 28 (evaluations: July 24–28)

Move the Round Robin clusterer to the test suite and implement the triangle inequality optimization for KMeans. Since an API is already in place, this should be easier to get working. Open a PR at the start of the week and get it merged by the end of the week.
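The idea behind the triangle inequality optimization can be sketched as follows. If the distance between two centroids is at least twice the distance from a point to the first centroid, the second centroid cannot be closer, so that distance need not be computed. This is a minimal sketch of that pruning rule for a single assignment pass over Euclidean points, not the planned implementation; `assign_with_pruning` is a hypothetical name, and the full algorithm of Elkan also keeps per-point bounds across iterations, which is omitted here. It assumes Python 3.8+ for `math.dist`.

```python
import math


def assign_with_pruning(points, centroids):
    """Assign each point to its nearest centroid.

    Returns (labels, computed), where computed counts the point-to-centroid
    distances that were actually evaluated.
    """
    k = len(centroids)
    # Centroid-to-centroid distances, computed once per assignment pass.
    cc = [[math.dist(centroids[i], centroids[j]) for j in range(k)]
          for i in range(k)]
    labels = []
    computed = 0
    for p in points:
        best = 0
        best_d = math.dist(p, centroids[0])
        computed += 1
        for j in range(1, k):
            # Triangle inequality: d(c_best, c_j) <= d(p, c_best) + d(p, c_j),
            # so d(c_best, c_j) >= 2 * d(p, c_best) implies
            # d(p, c_j) >= d(p, c_best) and centroid j can be skipped.
            if cc[best][j] >= 2 * best_d:
                continue
            d = math.dist(p, centroids[j])
            computed += 1
            if d < best_d:
                best, best_d = j, d
        labels.append(best)
    return labels, computed


points = [(0.0, 0.0), (0.1, 0.0), (10.0, 10.0), (10.1, 10.0)]
centroids = [(0.0, 0.0), (10.0, 10.0)]
labels, computed = assign_with_pruning(points, centroids)
print(labels, computed, "of", len(points) * len(centroids))
```

The labels match a brute-force nearest-centroid search; only the number of distance computations changes, and the saving grows as centroids settle far apart relative to the points they own.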

Coding Week 11: July 29–August 4

Buffer time for getting the triangle inequality PR merged and for writing tests and documentation for the implemented functionality. Merging the triangle inequality PR is the priority if it was not done in the previous week. Alongside this, start work on the Cluster Evaluation class halfway through the week, and implement, test and document one of the cluster evaluation methods in this period.

Coding Week 12: August 5–August 11

Continue work on the Cluster Evaluation class and implement, test and document two of the evaluation methods this week. Open a separate PR for each new evaluation method and work through the review comments.

Coding Week 13: August 12–August 18

Finish the Cluster Evaluation class by implementing the two remaining methods. This week also includes buffer time for catching up if behind schedule, and for testing and documenting how the two clusterers (KMeans and ElkansKMeans) compare in running time and in the quality of the clusters they generate on an already clustered dataset (something like the BBC newsgroup datasets).
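The timeline does not name the evaluation methods, so the following is only an assumed example of what measuring cluster quality against an already clustered dataset can look like. It computes purity, one common external measure: each cluster is credited with its most frequent true label, and the credited documents are divided by the total. The function name `purity` and the sample labels are hypothetical.

```python
from collections import Counter


def purity(predicted, truth):
    """Fraction of items that carry the majority true label of their cluster."""
    if len(predicted) != len(truth):
        raise ValueError("predicted and truth must have the same length")
    if not predicted:
        return 0.0
    # Count the true labels that appear inside each predicted cluster.
    by_cluster = {}
    for cluster_id, label in zip(predicted, truth):
        by_cluster.setdefault(cluster_id, Counter())[label] += 1
    # Each cluster contributes the size of its largest true-label group.
    majority = sum(c.most_common(1)[0][1] for c in by_cluster.values())
    return majority / len(predicted)


predicted = [0, 0, 0, 1, 1, 1]
truth = ["sport", "sport", "tech", "tech", "tech", "tech"]
score = purity(predicted, truth)
print(score)
```

Purity rewards many small clusters, so it is usually reported alongside other measures rather than on its own.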

Coding Week 14: August 19–August 25 (evaluations: August 21–29)

Start work on the Agglomerative clusterer. As with the triangle inequality optimization, an API is already in place, so this should be easier to get done now. Open a PR at the start of the week and get it merged by the end of the week.
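For orientation, agglomerative clustering starts with every item in its own cluster and repeatedly merges the two closest clusters until the desired number remains. The sketch below is a naive version under stated assumptions: it uses single linkage (cluster distance is the distance between the closest pair of members) and Euclidean points, neither of which the timeline specifies, and `single_linkage` is a hypothetical name. It assumes Python 3.8+ for `math.dist`.

```python
import math


def single_linkage(points, num_clusters):
    """Naive agglomerative clustering; returns clusters as lists of point indices."""
    if num_clusters < 1:
        raise ValueError("num_clusters must be at least 1")
    # Start with one singleton cluster per point.
    clusters = [[i] for i in range(len(points))]

    def linkage(a, b):
        # Single linkage: distance between the closest pair of members.
        return min(math.dist(points[i], points[j]) for i in a for j in b)

    while len(clusters) > num_clusters:
        best = None
        # Find the closest pair of clusters by exhaustive search.
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = linkage(clusters[i], clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        # Merge cluster j into cluster i (j > i, so index i stays valid).
        clusters[i].extend(clusters[j])
        del clusters[j]
    return [sorted(c) for c in clusters]


points = [(0, 0), (0, 1), (5, 5), (5, 6), (20, 20)]
three = single_linkage(points, 3)
two = single_linkage(points, 2)
print(three, two)
```

The exhaustive search makes this far too slow for real document collections; it is meant only to show the merge loop.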

Final Evaluations: August 26–August 29

Buffer time for merging the PR from the previous week, and for finishing any pending documentation work, including additions to the user guide.

Last modified on 07/16/17 20:10:24