## A Top Down Operator Precedence parser for algebraic expressions

In my previous post I wrote about a polar curve grapher and its use of Kevin Mehall's `tdop_math.js`

to parse a user input algebraic expression into a
javascript eval(uable) string. I ended up spending some time with `tdop_math.js`

- you can find the source here -
and I fell into the magical rabbit hole of Top Down Operator Precedence (TDOP) parsers. This post describes how I ended up writing a
TDOP parser for algebraic expressions in Ruby.

Some facts: - Vaughan Pratt first wrote about Top Down Operator Precedence parsers in 1973 - Douglas Crockford wrote about them in Beautiful Code, and in an article that includes an implementation of a parser for a Simplified Javascript - he went on to use this in JSlint

#### Overview of TDOP parsers

In a nutshell, Top Down Operator Precedence parsers are similar to Recursive Descent parsers, but, as claimed by Vaughan Pratt, simpler to understand, implement, and providing of better performance. Borrowing from this excellent article by Eli Bendersky, the main features of a TDOP parser are:

- A
*binding power*mechanism to handle precedence levels - e.g. the binding power of multiplication`'*'`

is higher than that of addition`'+'`

in arithmetic - A means of implementing different functionality of tokens depending on their position relative to their neighbors:
*prefix*or*infix.*For example, the minus`'-'`

operator is behaving as a prefix operator in`-2`

but in its infix form, it is subtraction`4 - 2.`

- As opposed to classic Recursive Descent, where semantic actions are associated with grammar rules (BNF), TDOP associates them with tokens. Tokens have a binding power, and know how to handle themselves in infix and/or in prefix situations. Expressions are then evaluated recursively by evaluating them at each 'precedence' level and returning the result to its caller.

Compilers are difficult to understand at the best of times and most articles I have come across on TDOP parsers are by very bright people finishing (or pausing) on their journey to get to know Pratt's work. At my current stage of comfort with TDOP parsers, I cannot come close to explaining the parser any better than Eli Bendersky.

Eli's article, elaborates on the features described above
and explains the parser by setting one up to handle simple arithmetic. He then works through a simple example that illustrates it in
action as it parses the expression `3 + 1 * 2 * 4 + 5.`

For further detail on TDOP parsers, I refer the interested reader to this and Douglas Crockford's article referenced above.

### TDOP and Ruby

As I was working on the polar curve grapher I got curious about implementing `tdop_math.js`

in Ruby. I studied the
source of `tdop_math.js,`

and its tokeniser `tokens.js,`

and felt that I had a decent understanding of its structure and operation (improving my Javascript skills
along the way!). However, a like-for-like implementation in Ruby seemed out of the question. Well, it seemed like a hard proposition that would involve
first developing new metaprogramming skills in Ruby!

And then I ran into Glen Vandenburg's project `radish,`

which is now called `smithereen`

.

#### Smithereen

Smithereen is a result of Glen Vandenburg's meditation on TDOP parsers, during which he decided to write an implementation in Ruby.
He has implemented a *library for building Top Down Operator Precedence parsers,* using Ruby idioms that were completely unfamiliar to me, but
that provided an excellent opportunity to learn. As Glen says, he has *tried to provide a well-factored set of tools to help with
the core TDOP algorithm and several related problems,* and so I decided to fork Smithereen with the intention of helping with its development, and,
using it to implement `tdop_math`

in Ruby.

Smithereen comes with two samples, an arithmetic parser, and the Simplified Javascript parser. I found it quite difficult to get started with Smithereen
and asked Glen a question or two. To my delight, he helped me get started and I found this to be so inspiring that I made rapid progress towards
implementing a simple scientific calculator. This is available on my fork of Smithereen
as `equation_parser.rb`

`equation_parser.rb`

will, for instance, take strings such as `sin(cos(2pi))`

and return `0.84147`

By way of explanation, it has hardcoded hashes that declare functions and constants, and these are tokenised as `names`

and given a high
binding power.

```
fns = {
'sin' => Proc.new { |x| Math.sin(x) },
'cos' => Proc.new { |x| Math.cos(x) },
'tan' => Proc.new { |x| Math.tan(x) },
'sqrt' => Proc.new { |x| Math.sqrt(x) },
'exp' => Proc.new { |x| Math.exp(x) }
}
CONSTANTS = {
'pi' => Math::PI,
'e' => Math::E
}
```

The `infix`

method on the token `(`

checks for the presence of a function or a constant (essentially a key in the above hashes),
and looks up the corresponding value for the definition of the function. Smithereen takes care of the rest. Well, that is, after tokens for
the usual arithmetic operators are defined, with infix and prefix methods as described above.

#### tdop_math.rb

Completing my goal to create a Ruby equivalent to `tdop_math.js,`

I've taken the work in the example `equation_parser.rb`

and used it to create a Ruby gem called `tdop_math.`

This takes an algebraic expression, parses it, and returns a string of Ruby that represents
the expression.

Because the `tdop_math`

gem depends on `smithereen`

(which has yet to be released), it is not available as a built gem.

As far as I know, if you want to install this, you are going to need to pull down my repo and
`include '<path-to-tdop_math.rb>'`

manually. Once you have, you use it as follows:

```
u = "sin(cos(2x))" ## or some user input algebraic expression
t = TDOPMath::Parser.new(u)
f = t.parse ## f == "Math.sin(Math.cos(2 * x))"
```

`tdop_math`

reserves symbols `x`

,`y`

and `t`

as variables. It also
provides a basic list of functions, namely, `sin`

, `cos`

, `tan`

, `sqrt`

and `exp`

. If you need to provide an alternate list of variables and
functions, you can do so as follows:

```
u = "baz(foo(2a))" ## or some user input algrebraic expression
t = TDOPMath::Parser.new(u,
:vars => ['a'],
:functions => {
'foo' => 'Foo::foo',
'baz' => 'Baz::baz'
})
f = t.parse ## f == "Baz::baz(Foo::foo(2 * a))"
```

You can use `tdop_math`

wherever you need a user to input an algebraic expression, comprising of a known set of functions and variables. This
could find use in apps in maths education or science/data analytics. I hope the community finds it useful, although my motivation was mainly
to build it as an exercise, along my path to becoming a better programmer. I've made a gem!