Table of Contents
- Community Bonding Week 1: May 4–May 10
- Community Bonding Week 2: May 11-May 17
- Community Bonding Week 3: May 18-May 24
- Community Bonding Week 4: May 25-May 31
- Coding Week 1: June 1–June 7
- Coding Week 2: June 8-June 14
- Coding Week 3: June 15-June 21
- Coding Week 4: June 22-June 28
- Coding Week 5: June 29-July 5 (first evaluation due July 3)
- Coding Week 6: July 6-July 12
- Coding Week 7: July 13-July 19
- Coding Week 8: July 20-July 26
- Coding Week 9: July 27-August 2 (second evaluation due July 31)
- Coding Week 10: August 3-August 9
- Coding Week 11: August 10-August 16
- Coding Week 12: August 17-August 23
- Submit code and evaluations: August 24-August 31
Community Bonding Week 1: May 4–May 10
6 May 2020
Researched Poppler's classes and how Poppler support was added to Omega.
7 May 2020
Read the poppler::document API reference (https://poppler.freedesktop.org/api/cpp/classpoppler_1_1document.html#ad9cc5b66e864e4f0f15024c4a56c1861) and studied the functions used in handler_poppler.cc.
8 May 2020
Studied the changes made to configure.ac and Makefile.am for Poppler, read https://developer.gnome.org/anjuta-build-tutorial/stable/library-autotools.html.en , and set up Omega CGI following https://trac.xapian.org/wiki/OmegaExample
Community Bonding Week 2: May 11-May 17
11 May 2020
Started reading about libarchive: https://www.libarchive.org/ and https://github.com/libarchive/libarchive/wiki/Examples
14-15 May 2020
Compiled and installed libarchive 3.4.2; read further about ODF formats, how to use libarchive, and its documentation.
16-17 May 2020
Started reading libarchive's internal documentation
Community Bonding Week 3: May 18-May 24
20 May 2020
Read
- https://en.wikipedia.org/wiki/OpenDocument_technical_specification#Format_internals
- https://www.linuxjournal.com/article/9347
- index_file.cc(950)
- OpenDocparse.cc
21 May 2020
Read
- https://github.com/libarchive/libarchive/wiki/FormatZip
- https://www.gemboxsoftware.com/spreadsheet/articles/read-write-ods-odf-net
- https://www.codeproject.com/Articles/38425/How-to-Read-and-Write-ODF-ODS-Files-OpenDocument-2
22 May 2020
Read the following files in libarchive's internal documentation:
- archive_read.3.html
- archive_read_data.3.html
- archive_entry.3.html
- archive_read_filter.3.html
- archive_read_format.3.html
- archive_read_new.3.html
- archive_read_open.3.html
- archive_read_set_options.3.html
Community Bonding Week 4: May 25-May 31
25 May 2020
Set up the coding environment, went through the coding conventions, and looked into how to extract data from content.xml using libarchive.
26 May 2020
Read and understand minitar.c (libarchive)
27 May 2020
Started writing code to extract data from OpenDocument files using libarchive.
28 May 2020
Read
- https://stackoverflow.com/questions/12516162/usr-bin-ld-error-cannot-find-lecl
- https://manpages.debian.org/testing/libarchive-dev/archive_read_open.3.en.html archive_read_open()
Extracted data from content.xml and styles.xml.
29 May 2020
Read about the read() system call
- https://manpages.debian.org/testing/libarchive-dev/archive_read_open.3.en.html
- https://stackoverflow.com/questions/2883165/libarchive-reads-too-many-chars-when-extracting-a-file
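The issue discussed in the reading above is that read()-style calls may return fewer bytes than requested and never NUL-terminate the buffer, so only the returned byte count is valid. A minimal sketch of a read loop that respects this follows; it is an illustration rather than code from the project, and read_all is a hypothetical helper name.

```cpp
#include <cassert>
#include <cerrno>
#include <string>
#include <unistd.h>

// Hypothetical helper: append everything readable from fd to `out`.
// Returns true on end-of-file, false on a read error.
bool read_all(int fd, std::string& out)
{
    assert(fd >= 0);
    char buf[4096];
    while (true) {
        ssize_t n = read(fd, buf, sizeof(buf));
        if (n > 0) {
            // Only the first n bytes of buf are valid; buf is not
            // NUL-terminated, so never treat it as a C string.
            out.append(buf, static_cast<size_t>(n));
        } else if (n == 0) {
            return true; // End of file.
        } else if (errno != EINTR) {
            return false; // Genuine error; EINTR just means "retry".
        }
    }
}
```

The same pattern applies to any call that reports a byte count: append exactly that many bytes instead of relying on a terminator.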
Coding Week 1: June 1–June 7
1 June 2020
Completed handler_libarchive and made the required additions to Makefile.am and configure.ac.
2-3 June 2020
Resolved errors involving libarchive_sources and libpcre; pushed the changes to the remote: https://github.com/Exter-dg/xapian/commit/9f0fdaf4ef600d4840f68b111534bc699d242644
4-5 June 2020
Tried to optimise the code and checked for errors. Read:
- https://fle.github.io/git-tip-keep-your-branch-clean-with-fixup-and-autosquash.html
- https://dev.to/koffeinfrei/the-git-fixup-workflow-386d
- https://opensource.com/article/19/7/introduction-gnu-autotools
Coding Week 2: June 8-June 14
8-9 June 2020
- Try to optimize the handler
- Fix xapian-check-patch errors
- Test omindex_libarchive on different systems.
- Started working on AbiWord (.zabw / .abw.gz)
10-11 June 2020
- Read https://docs.travis-ci.com/user/tutorial/
- Learn about the zabw and abw.gz formats.
- Added OpenOffice/StarOffice documents to index_file.cc along with OpenDocument format documents.
- Test all the newly added formats.
12-14 June 2020
- Try to resolve errors in the handler and index_file.cc
Coding Week 3: June 15-June 21
15 - 19 June 2020
- Work on the socketpair error in worker.cc, fix coding convention errors, and update the PR.
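For context on the socketpair work above, here is a minimal sketch of the general pattern: a parent process talking to a forked child over a socketpair. It is not the code in worker.cc; round_trip_via_child is a hypothetical name, and the child here simply echoes its input back.

```cpp
#include <cassert>
#include <string>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper: send msg to a forked child over a socketpair and
// return whatever the child echoes back. Returns "" on failure.
std::string round_trip_via_child(const std::string& msg)
{
    // Keep the message small so it fits in the socket buffer.
    assert(!msg.empty() && msg.size() <= 4096);
    int fds[2];
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, fds) < 0)
        return std::string();

    pid_t child = fork();
    if (child < 0) {
        close(fds[0]);
        close(fds[1]);
        return std::string();
    }

    if (child == 0) {
        // Child: use fds[1] only and echo data back until EOF.
        close(fds[0]);
        char buf[4096];
        ssize_t n;
        while ((n = read(fds[1], buf, sizeof(buf))) > 0) {
            if (write(fds[1], buf, static_cast<size_t>(n)) != n)
                _exit(1);
        }
        _exit(0);
    }

    // Parent: use fds[0] only.
    close(fds[1]);
    std::string reply;
    ssize_t sent = write(fds[0], msg.data(), msg.size());
    if (sent == static_cast<ssize_t>(msg.size())) {
        // Half-close so the child sees EOF after our message.
        shutdown(fds[0], SHUT_WR);
        char buf[4096];
        ssize_t n;
        while ((n = read(fds[0], buf, sizeof(buf))) > 0)
            reply.append(buf, static_cast<size_t>(n));
    }
    close(fds[0]);
    int status = 0;
    waitpid(child, &status, 0);
    return reply;
}
```

Each side closes the end it does not use, so that EOF is delivered once the peer is finished; forgetting this is a common source of hangs with socketpair.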
20-21 June 2020
Test with Apache OpenOffice documents.
Coding Week 4: June 22-June 28
22-24 June 2020
Used code from the omindextest PR #280 to create sample tests for the formats added using libarchive.
25-26 June 2020
Create the omindexcheck class and add its functions.
Coding Week 5: June 29-July 5 (first evaluation due July 3)
29 - 30 June 2020
Read about IPC and fixed errors in omindexcheck and the handler.
1 - 2 July 2020
- Add tests to PR #300
- Work on omindexcheck
- Search for any suitable library for LaTeX documents
3 - 4 July 2020
- Work on mime-type modifications
- Read about libspectre (PostScript).
Coding Week 6: July 6-July 12
6 - 9 July 2020
- Complete and refactor mime-type modifications
- Create MS 2007 files for testing
- Work on completing and getting PR 300 merged
- Read about the PostScript format
10 - 12 July 2020
- Get PR 300 merged
- Open and work on PR 303 - adding OOXML formats to the libarchive handler
Coding Week 7: July 13-July 19
13 - 14 July 2020
- Open PR 304 (Add mimetype to handlers)
- Read about libextractor as a potential library for extracting metadata from audio and video files.
- Discuss and update the project plan with the proposed libraries (libextractor)
15 - 19 July 2020
- Make changes to PR 303 and PR 304 (closed)
- Read libextractor's documentation
- Resolve issues with libextractor
- Create handler_libextractor
Coding Week 8: July 20-July 26
- Update handler_libextractor
- Update PR 303
- Find files for testing libextractor
- Write tests for libextractor
- Research new libraries/formats to be added.
Coding Week 9: July 27-August 2 (second evaluation due July 31)
27 July 2020
- Read the libabw and librevenge documentation
- Work on handler_libabw
July 28 - July 30 2020
- Try to resolve libextractor's test issue.
- Complete handler_libabw
July 31 - August 2 2020
- Fix the problem with extracting the title using libabw.
- Research the libraries for next phase
Coding Week 10: August 3-August 9
August 3 2020
- Research WordPerfect and libwpd and understand how it works.
August 4 - August 7 2020
- Research CorelDRAW and the structure of the CDR format
- Update PR 306 and PR 307
- Work on handler_libcdr
August 9 2020
- Create sample CDR and CMX files
Coding Week 11: August 10-August 16
August 10 - August 12 2020
- Resolve issues with libcdr and the CDR format
August 13 - August 14 2020
- Update PR 311
- Research Zoner DRAW and libzmf
Coding Week 12: August 17-August 23
August 17 - August 18 2020
- Update PR 311
- Read about libmwaw and work on handler_libmwaw
August 19 - August 23 2020
- Resolve issues and update PR 306 (libextractor)
- Make changes to allow tests to be skipped in omindexcheck
- Open PR 315 - libmwaw
- Research libzmf
Submit code and evaluations: August 24-August 31
August 24 - August 26 2020
- Update PR 306 (change the testcase implementation to allow storing per-testcase flags)
- Update PR 315
- Draft the project report
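
The per-testcase flags mentioned for PR 306 could look roughly like the sketch below, where a flag lets a test be reported as skipped instead of failed. All names here (testcase_flag, FLAG_SKIP_IF_NO_TERMS, test_result, check) are hypothetical and chosen for illustration; the actual omindexcheck implementation may differ.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical flag values; the real names may differ.
enum testcase_flag : unsigned {
    FLAG_NONE = 0,
    // Report "skipped" rather than "failed" when nothing was indexed,
    // e.g. because a library plugin is unavailable on this system.
    FLAG_SKIP_IF_NO_TERMS = 1u << 0
};

enum class test_result { PASS, FAIL, SKIP };

// One test case: the terms expected in the index plus its own flags.
struct testcase {
    std::vector<std::string> expected_terms;
    unsigned flags;
};

// Compare the terms actually indexed against the expectations.
test_result check(const testcase& t,
                  const std::vector<std::string>& indexed)
{
    assert(!t.expected_terms.empty());
    if (indexed.empty() && (t.flags & FLAG_SKIP_IF_NO_TERMS))
        return test_result::SKIP;
    for (const std::string& term : t.expected_terms) {
        if (std::find(indexed.begin(), indexed.end(), term) == indexed.end())
            return test_result::FAIL;
    }
    return test_result::PASS;
}
```

Storing the flags in each test case, rather than in one global setting, means only the tests that depend on optional components are allowed to skip; every other test still fails when nothing is indexed.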