wiki:GSoC2019/TextExtraction/Journal

Coding Week 1: May 28-June 3

● Implement code to add the libraries and isolate errors.
● Get familiar with Poppler for handling PDF files.

Coding Week 2: June 4-June 10

● Add Poppler to omindex.
● Modify the build system for conditional compilation.

Coding Week 3: June 11-June 17

● Solve build system problems.
● Test different versions of Poppler.
● Look for a new library.

Coding Week 4: June 18-June 24

● Add Libe-book to omindex.
● Modify the build system and solve the pkg.m4 problem.

Coding Week 5: June 25-July 1 (first evaluation June 24-28)

● Solve the MIME type problem with Libe-book.
● Write documentation for adding support for a new format.

Coding Week 6: July 2-July 8

● Add Libetonyek to omindex.
● Modify the communication protocol to keep the assistant alive in case of non-fatal errors.

Coding Week 7: July 9-July 15

● Improve communication protocol between worker and assistant.
● Fix stderr problem.
● Add Tesseract to omindex.

Coding Week 8: July 16-July 22

● Add Libmimetic to omindex.
● Get some handlers merged.

Coding Week 9: July 23-July 29 (second evaluation July 22-26)

● Solve problems with Libmimetic (related to header encoding).
● Close some PRs.

Coding Week 10: July 30-August 5

● Add Gmime to omindex.
● Try to solve the Libetonyek problem and report it to brew.

Coding Week 11: August 6-August 12

● Get Libetonyek and Tesseract merged.
● Update documentation if necessary.
● Solve the Gmime problem.

Submit code and evaluations: August 13-August 21 (final evaluation August 19-26)

● Add Libmarkdown2 to omindex.
● Prepare code for final evaluation.
● Add some automated testing to omindex.
● Close some PRs.

Last modified on 22/08/19 12:59:14