Feature request: Simple latex content
Requirements
In our application, users need to input math formulas quite a lot; these are then parsed into semantic ASTs and used in further analysis and checking. This process needs to be highly robust and adaptable, and we need to be able to customize it for different users. We need both the ability to enter symbols and entire formulas via the virtual keyboard (mixed with the normal keyboard) and LaTeX commands for advanced users.
Current approach
Right now, we’re using the $latex() output of the mathfield and parsing it with a family of EBNF grammars, which works rather well in general:
- it makes it easy for us to extend the parser by simply supporting more latex
- the parser’s grammar can be altered for different clients/users
- users can write latex in the mathfields and/or copy+paste from one field to another rather easily and intuitively
I was trying out the MASTON-output initially, but found it too limited.
Problem
In a lot of cases, the mathfields behave a little too “smart” and their contents are not what one would expect:
- there are certain formatting operators added to it, e.g. \mathop or \mathrm
- sometimes the order of inputs is changed, e.g. a_b^c becomes a^c_b. That can be compensated for in the parser, but it is annoying and this order feels less “semantic”.
- text areas get combined, so \mathbb{R}\mathbb{R} becomes \mathbb{RR}.
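As an illustration of how the reordering could be compensated for before parsing, here is a minimal sketch of a pre-processing pass that rewrites a superscript followed directly by a subscript into subscript-first order. The names `readArg` and `subscriptFirst` are hypothetical and not part of MathLive; the sketch assumes a script argument is a single character, a command such as \alpha, or a braced group, and it does not handle whitespace between the two scripts.

```typescript
// Hypothetical helper: returns the index just past one script argument
// starting at i (a braced group, a command like \alpha, or one character).
function readArg(s: string, i: number): number {
  if (i >= s.length) return i;
  if (s.charAt(i) === "{") {
    let depth = 0;
    for (let j = i; j < s.length; j++) {
      const c = s.charAt(j);
      if (c === "\\") { j++; continue; } // skip the character after a backslash
      if (c === "{") depth++;
      else if (c === "}") { depth--; if (depth === 0) return j + 1; }
    }
    return s.length; // unbalanced braces: consume the rest
  }
  if (s.charAt(i) === "\\") {
    let j = i + 1;
    while (j < s.length && /[a-zA-Z]/.test(s.charAt(j))) j++;
    return j === i + 1 ? Math.min(i + 2, s.length) : j; // \{ style escapes
  }
  return i + 1;
}

// Hypothetical normalizer: turns x^A_B into x_B^A, recursing into the arguments.
function subscriptFirst(latex: string): string {
  let out = "";
  let i = 0;
  while (i < latex.length) {
    const ch = latex.charAt(i);
    if (ch === "\\") {
      // Copy commands and escapes verbatim so \^ and \_ are not seen as scripts.
      const end = readArg(latex, i);
      out += latex.slice(i, end);
      i = end;
      continue;
    }
    if (ch === "^") {
      const supEnd = readArg(latex, i + 1);
      if (latex.charAt(supEnd) === "_") {
        const subEnd = readArg(latex, supEnd + 1);
        const sup = subscriptFirst(latex.slice(i + 1, supEnd));
        const sub = subscriptFirst(latex.slice(supEnd + 1, subEnd));
        out += "_" + sub + "^" + sup;
        i = subEnd;
        continue;
      }
    }
    out += ch;
    i++;
  }
  return out;
}
```

With this sketch, subscriptFirst("a^c_b") yields a_b^c, while input that is already subscript-first passes through unchanged, so the grammar only has to accept one order.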
I see how these optimizations increase the overall quality of the LaTeX and make it better visually, but they make it harder to work with in a parser (or other automation). Users enter a=b and the field converts it to a\mathop{=}b, which looks nicer because of the improved spacing. But the parser needs to understand both a=b and a\mathop{=}b.
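One way to keep the grammar small is to strip purely presentational wrappers before parsing. The following is a minimal sketch under that assumption; `matchBrace` and `stripWrappers` are hypothetical names, not MathLive APIs, and the default wrapper list contains only \mathop. Other commands such as \mathrm could be passed in, but they may carry meaning in some grammars, so that choice is left to the caller.

```typescript
// Hypothetical helper: index just past the "}" matching the "{" at `open`,
// or -1 if the braces are unbalanced.
function matchBrace(s: string, open: number): number {
  let depth = 0;
  for (let j = open; j < s.length; j++) {
    const c = s.charAt(j);
    if (c === "\\") { j++; continue; } // skip the character after a backslash
    if (c === "{") depth++;
    else if (c === "}") { depth--; if (depth === 0) return j + 1; }
  }
  return -1;
}

// Hypothetical normalizer: replaces \wrapper{content} with content for every
// command name listed in `wrappers`, including nested occurrences.
function stripWrappers(latex: string, wrappers: string[] = ["mathop"]): string {
  let out = "";
  let i = 0;
  while (i < latex.length) {
    if (latex.charAt(i) !== "\\") { out += latex.charAt(i); i++; continue; }
    // Read the command name that follows the backslash.
    let end = i + 1;
    while (end < latex.length && /[a-zA-Z]/.test(latex.charAt(end))) end++;
    const name = latex.slice(i + 1, end);
    if (name === "") { out += latex.slice(i, i + 2); i += 2; continue; } // \{ etc.
    const close = latex.charAt(end) === "{" ? matchBrace(latex, end) : -1;
    if (wrappers.indexOf(name) !== -1 && close !== -1) {
      const inner = stripWrappers(latex.slice(end + 1, close - 1), wrappers);
      // Avoid gluing a trailing command onto a following letter (\sin + x).
      const needsSpace =
        /\\[a-zA-Z]+$/.test(inner) && /[a-zA-Z]/.test(latex.charAt(close));
      out += inner + (needsSpace ? " " : "");
      i = close;
    } else {
      out += latex.slice(i, end);
      i = end;
    }
  }
  return out;
}
```

Applied to a\mathop{=}b, this sketch returns a=b, so both forms of the input reach the parser as the same string.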
It also seems these improvements are added over time, so with every new version of MathLive we need to re-check our application rather thoroughly.
Suggestion
It would be great if there were a way to get the content of the mathfield in a simplified format that is optimized for parsing. This could be another accessor like $simpleLatex(), an option for $latex(), or something entirely different that lets us work with the content more easily in an automated way. This doesn’t need to solve all the problems I listed above (there’s no reason to write \mathbb{R}\mathbb{R} anyway), but it would be nice if this could provide a more stable and reliable interface.
I am also very open to suggestions on how we could change our approach on our side.
Issue Analytics
- State:
- Created 4 years ago
- Comments:10 (8 by maintainers)
Top GitHub Comments
I like that idea best. It can be very useful to build in heuristics for semantics, but there are times when it will be wrong, particularly for specialized areas. Having a way to get at the expression before the semantics are inferred seems like a good idea.
On Wed, Oct 30, 2019 at 2:41 PM Arno Gourdol notifications@github.com wrote:
Is there a fix for this yet? In my use case, we’re parsing the output from the mathfield into a function for analysis, and the drawback is that the syntax of the LaTeX is changed.