**Project Structure**

A math formula is generally represented in `LaTeX` or `MathML` format. Currently, we support only the `Presentation MathML` format.

Example: `x = 2 + b / c` in Presentation MathML format:

`<math> <mi> x </mi> <mo> = </mo> <mn> 2 </mn> <mo> + </mo> <mfrac> <mi> b </mi> <mi> c </mi> </mfrac> </math>`

We parse the math equation and create a symbol layout structure, which is a visual representation of the MathML input. The structure connects the symbols of the equation with edges, where each edge represents the spatial relationship between the two symbols it connects. The spatial relationship can be above, below, adjacent, within, etc.
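To make the idea concrete, here is a minimal sketch of how such a structure might be held in memory for the example `x = 2 + b / c`. The `Symbol`, `Edge` and `Relation` names, the relation letters, and the labels `N!2` and `O!frac` are assumptions made for illustration; only `V!x` and `O!=` appear in this page, and the project's actual data model may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Spatial relation carried by an edge (letters are illustrative assumptions).
enum class Relation : char { Next = 'N', Above = 'A', Below = 'B', Within = 'W' };

struct Edge {
    Relation rel;       // how the child is placed relative to the parent
    std::size_t child;  // index of the connected symbol in the node vector
};

struct Symbol {
    std::string label;        // e.g. "V!x" for a variable, "O!=" for an operator
    std::vector<Edge> edges;  // outgoing edges to spatially related symbols
};

// The whole tree lives in one vector; node 0 is the leftmost symbol.
using LayoutTree = std::vector<Symbol>;

// Append a symbol and return its index.
std::size_t add_symbol(LayoutTree& tree, const std::string& label) {
    tree.push_back(Symbol{label, {}});
    return tree.size() - 1;
}

// Connect parent -> child with the given spatial relation.
void connect(LayoutTree& tree, std::size_t parent, Relation rel, std::size_t child) {
    assert(parent < tree.size() && child < tree.size());
    tree[parent].edges.push_back(Edge{rel, child});
}

// Hand-built layout tree for x = 2 + b / c (no MathML parsing here).
LayoutTree build_example() {
    LayoutTree t;
    std::size_t x = add_symbol(t, "V!x");
    std::size_t eq = add_symbol(t, "O!=");
    std::size_t two = add_symbol(t, "N!2");
    std::size_t plus = add_symbol(t, "O!+");
    std::size_t frac = add_symbol(t, "O!frac");
    std::size_t b = add_symbol(t, "V!b");
    std::size_t c = add_symbol(t, "V!c");
    connect(t, x, Relation::Next, eq);
    connect(t, eq, Relation::Next, two);
    connect(t, two, Relation::Next, plus);
    connect(t, plus, Relation::Next, frac);
    connect(t, frac, Relation::Above, b);  // numerator sits above the bar
    connect(t, frac, Relation::Below, c);  // denominator sits below the bar
    return t;
}
```

In this sketch the fraction is modelled as a single symbol with an above edge to the numerator and a below edge to the denominator, which is one plausible reading of the spatial relations listed above.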

Symbol layout structure of the above equation:

Symbol pair tuples are generated from the layout tree by taking combinations of symbol pairs that lie within a certain path distance of each other. Symbol pair tuple format: [S1, S2, path with spatial relation]. For example, [V!xO!=N], where N stands for next.
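The pair generation step can be sketched as a bounded walk from every symbol to its descendants. The code below is an illustrative assumption rather than the project's implementation: the `Node` struct, the function names, the label `N!2`, and the choice to encode a longer path by concatenating relation letters (for example `NN`) are all hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

struct Node {
    std::string label;                                 // e.g. "V!x"
    std::vector<std::pair<char, std::size_t>> edges;   // (relation letter, child index)
};

// Walk from `current` towards its descendants, emitting one tuple string
// "S1 S2 path" (concatenated) for each descendant reached within max_distance edges.
void collect(const std::vector<Node>& tree, std::size_t start, std::size_t current,
             std::string& path, std::size_t max_distance,
             std::vector<std::string>& out) {
    assert(current < tree.size());
    if (path.size() >= max_distance) return;  // path length == edges walked so far
    for (const auto& e : tree[current].edges) {
        path.push_back(e.first);
        out.push_back(tree[start].label + tree[e.second].label + path);
        collect(tree, start, e.second, path, max_distance, out);
        path.pop_back();
    }
}

// Generate tuples for every symbol paired with each descendant within max_distance.
std::vector<std::string> symbol_pairs(const std::vector<Node>& tree,
                                      std::size_t max_distance) {
    std::vector<std::string> out;
    std::string path;
    for (std::size_t i = 0; i < tree.size(); ++i) {
        collect(tree, i, i, path, max_distance, out);
    }
    return out;
}

// Layout tree for the short equation x = 2: V!x -N-> O!= -N-> N!2.
std::vector<Node> example_tree() {
    std::vector<Node> tree(3);
    tree[0].label = "V!x";
    tree[0].edges.emplace_back('N', 1u);
    tree[1].label = "O!=";
    tree[1].edges.emplace_back('N', 2u);
    tree[2].label = "N!2";
    return tree;
}
```

With a maximum distance of 1, `symbol_pairs(example_tree(), 1)` yields `V!xO!=N` and `O!=N!2N`; raising the distance to 2 adds `V!xN!2NN`.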

**Key points about implementation in Xapian**

- The math term structure (symbol pair tuple) is different from the terms generated from free text, so we can't use the existing `TermGenerator` class. We decided to add a new API class, `MathTermGenerator`, to handle equations in MathML format.

- I planned to store the tree structure in a `std::vector`. This avoids frequent heap allocations and hence gives better performance. I use the equation size as a heuristic to estimate the tree structure size and the number of symbol pair tuples. These estimates are used to preallocate capacity for each `std::vector` to avoid frequent reallocations. Once the symbol pair tuples have been generated from the layout tree, the memory for the tree is released.
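
The preallocation and release pattern described in the last point could look roughly like the following sketch. Everything here is an assumption for illustration: the one-symbol-per-8-characters heuristic, the function names, and the toy tokenizer (which treats every whitespace-separated non-tag token as a symbol and pairs only adjacent symbols) stand in for the real parser and are not the project's code.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Rough size guess from the equation length (assumed heuristic: one symbol
// per 8 characters of MathML, never less than one).
std::size_t estimate_symbols(const std::string& mathml) {
    return mathml.size() / 8 + 1;
}

// Toy stand-in for the parser: every whitespace-separated token that is not
// a tag becomes a symbol. Real MathML parsing is more involved.
std::vector<std::string> parse_symbols(const std::string& mathml) {
    std::vector<std::string> tree;
    tree.reserve(estimate_symbols(mathml));  // allocate once, up front
    std::istringstream in(mathml);
    std::string token;
    while (in >> token) {
        if (token[0] != '<') tree.push_back(token);
    }
    return tree;
}

// Build adjacent-pair terms, then release the tree's memory.
std::vector<std::string> generate_terms(const std::string& mathml) {
    std::vector<std::string> tree = parse_symbols(mathml);
    std::vector<std::string> terms;
    if (!tree.empty()) terms.reserve(tree.size() - 1);  // exact for adjacent pairs
    for (std::size_t i = 0; i + 1 < tree.size(); ++i) {
        terms.push_back(tree[i] + tree[i + 1] + "N");
    }
    // clear() would keep the capacity; swapping with an empty temporary hands
    // the buffer to the temporary, which frees it when it is destroyed.
    std::vector<std::string>().swap(tree);
    assert(tree.empty());
    return terms;
}
```

The explicit swap matters most when the tree is a member of a longer-lived generator object; for a local variable, as here, the memory would also be freed when the function returns.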
