P2X
P2X is a universal parser with XML output. It can be configured for arbitrary binary and unary operators and parentheses, which lets it parse a wide range of structured text. The primary output format is XML; JSON and MATLAB output are also available, with some small limitations. Output for R is available via R2X.
In most cases it is advisable to devise a small pipeline of XSLT stylesheets that transforms the raw P2X output into something closer to what the input text is meant to represent, for example by using XML elements with appropriate names. You may also wish to use your own namespace. We believe these transformations are usually easy to devise and implement. If you run into particular challenges, you are always welcome to consult us.
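To give a rough idea of what such a renaming step does, here is a minimal sketch written in Python with the standard library's ElementTree instead of XSLT. The raw tree shown (generic `node` elements with `kind` and `text` attributes) is invented for illustration and is not P2X's actual output schema; the namespace URI and the operator-to-name mapping are likewise assumptions.

```python
import xml.etree.ElementTree as ET

# Hypothetical raw parse tree for the input "a = 1 + 2".
# Element and attribute names are invented here; they are NOT P2X's real schema.
RAW = """
<node kind="op" text="=">
  <node kind="id" text="a"/>
  <node kind="op" text="+">
    <node kind="num" text="1"/>
    <node kind="num" text="2"/>
  </node>
</node>
"""

# Assumed target namespace and operator names (both hypothetical).
NS = "http://example.org/expr"
OP_NAMES = {"=": "assign", "+": "add"}


def rename(node):
    """Recursively map a generic <node> to a descriptively named element."""
    kind, text = node.get("kind"), node.get("text")
    if kind == "op":
        # Operators become elements named after the operation; operands follow.
        out = ET.Element(f"{{{NS}}}{OP_NAMES[text]}")
        out.extend([rename(child) for child in node])
    else:
        # Leaves become <name> or <number> elements carrying the token text.
        tag = "name" if kind == "id" else "number"
        out = ET.Element(f"{{{NS}}}{tag}")
        out.text = text
    return out


ET.register_namespace("e", NS)
tree = rename(ET.fromstring(RAW.strip()))
result = ET.tostring(tree, encoding="unicode")
print(result)
```

In an actual XSLT pipeline the same idea would typically be expressed as one template per operator, with each stage of the pipeline doing one small rewrite.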
The repository is at https://github.com/rainac/p2x.