wiki:GSoC2018/Maths/ProjectTimeline

Community Bonding Period: April 23–May 13

  • Get to know the community, interact with the people.
  • Read and understand the Xapian code base: understand the underlying design principles and get to know all the relevant classes.
  • Submit patches for existing issues and go through the code review process.
  • Acquire the background knowledge needed to implement the project parts: writing a parser, adding a weighting scheme, and studying how wildcard expansion is performed.
  • Have a clear blueprint of the project.

Coding Week 1: May 14–May 20

  • Implement extraction of a list of Presentation MathML expressions from the input document.
  • Write the symbol layout tree class: add the necessary attributes and implement the member functions. This requires representing math symbols as different types of nodes and spatial relationships as edge types, and writing helper functions for traversing the tree, adding children, updating the tree, etc.
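As a rough illustration of the symbol layout tree described above, here is a minimal Python sketch (the project itself targets Xapian's C++ code base). The class name `SLTNode`, the node kinds and the single-letter edge labels are all assumptions made for this example, not the final design.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Tuple

# Hypothetical edge labels for spatial relationships between symbols.
NEXT, ABOVE, BELOW = "n", "a", "b"


@dataclass
class SLTNode:
    symbol: str                      # the math symbol, e.g. "x", "2", "+"
    kind: str = "var"                # hypothetical node type: "var", "num", "op"
    children: List[Tuple[str, "SLTNode"]] = field(default_factory=list)

    def add_child(self, edge: str, node: "SLTNode") -> "SLTNode":
        """Attach node below this one via the given edge type and return it."""
        self.children.append((edge, node))
        return node

    def traverse(self, path: str = "") -> Iterator[Tuple[str, "SLTNode"]]:
        """Pre-order traversal yielding (edge path from the root, node)."""
        yield path, self
        for edge, child in self.children:
            yield from child.traverse(path + edge)


# Build the tree for x^2 + y by hand.
x = SLTNode("x")
x.add_child(ABOVE, SLTNode("2", "num"))
plus = x.add_child(NEXT, SLTNode("+", "op"))
plus.add_child(NEXT, SLTNode("y"))

visited = [(path, node.symbol) for path, node in x.traverse()]
```

Storing the edge label next to each child keeps the spatial relationship on the edge rather than on the node, which is what the traversal helper relies on to report paths.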

Coding Week 2: May 21–May 27

  • Construct the symbol layout tree from a Presentation MathML expression. This involves parsing the MathML expression and adding the extracted tokens to the tree structure.
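The parsing step could look roughly like the following Python sketch, which uses the standard library's `xml.etree.ElementTree`. It handles only a small subset of Presentation MathML (`math`, `mrow`, `mi`, `mn`, `mo`, `msup`, `msub`); the `Node` class, the `build` function and the edge names `next`, `above` and `below` are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET


class Node:
    """Hypothetical tree node: a symbol plus labelled edges to other nodes."""

    def __init__(self, symbol):
        self.symbol = symbol
        self.edges = {}  # edge label -> child Node


def local(tag):
    """Strip an XML namespace prefix such as '{http://...}mi' -> 'mi'."""
    return tag.rsplit("}", 1)[-1]


def build(elem):
    """Return (head, tail) of the symbol chain built from a MathML element."""
    tag = local(elem.tag)
    if tag in ("mi", "mn", "mo"):
        # Token elements become a single node.
        node = Node((elem.text or "").strip())
        return node, node
    if tag in ("math", "mrow"):
        # Children of a row are chained left to right with 'next' edges.
        head = tail = None
        for child in elem:
            h, t = build(child)
            if head is None:
                head = h
            else:
                tail.edges["next"] = h
            tail = t
        if head is None:
            raise ValueError("empty row")
        return head, tail
    if tag in ("msup", "msub"):
        # The script hangs off the last symbol of the base.
        base, script = list(elem)
        base_head, base_tail = build(base)
        script_head, _ = build(script)
        base_tail.edges["above" if tag == "msup" else "below"] = script_head
        return base_head, base_tail
    raise ValueError(f"unsupported element: {tag}")


mathml = (
    '<math xmlns="http://www.w3.org/1998/Math/MathML"><mrow>'
    "<msup><mi>x</mi><mn>2</mn></msup><mo>+</mo><mi>y</mi>"
    "</mrow></math>"
)
root, _ = build(ET.fromstring(mathml))
```

Returning both the head and the tail of each chain lets the caller attach the following symbol without walking the chain again.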

Coding Week 3: May 28–June 3

  • Make sure test cases exist for the code written so far. Write documentation.
  • Buffer to catch up on any lagging work.
  • Create a symbol pair tuple class and make it indexable.
  • Generate symbol pair tuples from the symbol layout tree with a given window size parameter.
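Tuple generation with a window size might be sketched as follows in Python. This assumes that a symbol pair tuple is `(ancestor symbol, descendant symbol, edge path)` and that the window size bounds the number of edges between the two symbols; both are assumptions for this example, as are the names `Node`, `symbol_pairs` and `as_term`.

```python
class Node:
    """Hypothetical tree node with labelled edges to its children."""

    def __init__(self, symbol):
        self.symbol = symbol
        self.children = []  # list of (edge_label, Node)

    def add(self, edge, node):
        self.children.append((edge, node))
        return node


def descendants(node, window, path=""):
    """Yield (path, descendant) for descendants at most `window` edges away."""
    if window == 0:
        return
    for edge, child in node.children:
        yield path + edge, child
        yield from descendants(child, window - 1, path + edge)


def symbol_pairs(root, window):
    """Collect (ancestor, descendant, path) tuples for every node in the tree."""
    pairs = []
    stack = [root]
    while stack:
        node = stack.pop()
        for path, desc in descendants(node, window):
            pairs.append((node.symbol, desc.symbol, path))
        # Push children in reverse so they are visited left to right.
        stack.extend(child for _, child in reversed(node.children))
    return pairs


def as_term(pair):
    """Hypothetical flat string form of a tuple, usable as an index term."""
    return "|".join(pair)


# Tree for x^2 + y: 'a' = above, 'n' = next.
x = Node("x")
x.add("a", Node("2"))
x.add("n", Node("+")).add("n", Node("y"))

pairs_w1 = symbol_pairs(x, window=1)
pairs_w2 = symbol_pairs(x, window=2)
```

With a window of 1 only adjacent symbols are paired; widening it to 2 adds the pair `("x", "y", "nn")`, so a larger window trades index size for tolerance to intervening symbols.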

Coding Week 4: June 4–June 10

  • Make sure test cases exist for the code written so far. Write documentation.
  • Integrate the work done so far. Rework the class design and refactor the code if needed.

Coding Week 5: June 11–June 17 (evaluations: June 11-15)

  • Buffer time.

Coding Week 6: June 18–June 24

  • Work on indexing the math terms available at the end of block 1. Implement a posting list for math terms.
  • Test indexing of documents with multiple test data files. Fix any issues found.

Coding Week 7: June 25–July 1

  • Implement Dice's coefficient as the similarity weight metric.
  • Test the weight metric with multiple test data sets. Fix any issues found.
  • Add documentation.
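For reference, Dice's coefficient of two sets A and B is 2|A ∩ B| / (|A| + |B|). The Python sketch below applies it to sets of symbol pair tuples; treating the tuples as plain sets (ignoring repeated tuples) and returning 0 for two empty inputs are assumptions made for this example.

```python
def dice(query_pairs, doc_pairs):
    """Dice's coefficient between two collections of symbol pair tuples."""
    q, d = set(query_pairs), set(doc_pairs)
    if not q and not d:
        return 0.0  # assumed convention for two empty inputs
    return 2 * len(q & d) / (len(q) + len(d))


query = [("x", "+", "n"), ("+", "y", "n")]
doc = [("x", "+", "n"), ("+", "z", "n")]

score = dice(query, doc)        # one shared tuple out of 2 + 2
identical = dice(query, query)  # every tuple shared
```

The score ranges from 0 (no tuples in common) to 1 (identical tuple sets), so it can be compared across documents of different sizes.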

Coding Week 8: July 2–July 8

  • Construct the symbol layout tree from the query input. This reuses most of the code from block 1. Handle the query-specific changes needed.
  • Buffer time. Work on anything lagging; otherwise take a long break.

Coding Week 9: July 9–July 15 (evaluations: July 9-13)

  • Implement document retrieval from the given query. This involves generating symbol pair tuples from the symbol layout tree for the query and fetching postings from the database index.
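The retrieval step could be sketched as below, with a plain Python dictionary standing in for the database index. This is not Xapian's API; `build_index` and `retrieve` are hypothetical helpers, and scoring candidates with Dice's coefficient over sets of tuples is an assumption carried over from the weighting scheme planned for week 7.

```python
from collections import defaultdict


def build_index(docs):
    """Map each tuple term to the set of documents that contain it."""
    postings = defaultdict(set)
    sizes = {}
    for doc_id, terms in docs.items():
        unique = set(terms)
        sizes[doc_id] = len(unique)
        for term in unique:
            postings[term].add(doc_id)
    return postings, sizes


def retrieve(query_terms, postings, sizes):
    """Fetch postings for each query tuple and rank candidates by Dice score."""
    q = set(query_terms)
    matches = defaultdict(int)
    for term in q:
        for doc_id in postings.get(term, ()):
            matches[doc_id] += 1
    scored = [(2 * m / (len(q) + sizes[d]), d) for d, m in matches.items()]
    # Highest score first; ties broken by document id.
    return [(d, s) for s, d in sorted(scored, key=lambda p: (-p[0], p[1]))]


docs = {
    1: [("x", "2", "a"), ("x", "+", "n"), ("+", "y", "n")],  # x^2 + y
    2: [("x", "+", "n"), ("+", "y", "n")],                   # x + y
    3: [("a", "b", "n")],                                    # a b
}
postings, sizes = build_index(docs)
results = retrieve([("x", "+", "n"), ("+", "y", "n")], postings, sizes)
```

Only documents that share at least one tuple with the query are ever scored, which is the point of going through the posting lists instead of comparing the query against every document.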

Coding Week 10: July 16–July 22

  • Integrate the code and perform testing. Refactor the code if needed. Document the code. Profile the code and evaluate its performance.

Coding Week 11: July 23–July 29

  • Do housekeeping work in this period: address any pending change requests, fix issues, etc.
  • Add matrix support, which requires updating the parser code and the tuple generation code. Write tests and make sure no existing functionality breaks.

Coding Week 12: July 30–August 5

  • Implement support for wildcard queries. This requires further extending the tuple generation module for queries.

Coding Week 13: August 6–August 12 (evaluations: August 6-14)

  • Complete any pending review modifications. Finalize the documentation. Do any remaining clean-up work.
Last modified on 24/05/18 19:44:50