Suggestion for new, less brittle, analysis system.

A little while ago, I posted an issue about supporting macros, and the response made sense. Based on the goals of chevrotain, you weren’t concerned with this idea because other options were available.

But, the discussion got me thinking about the current func.toString() analysis system, and whether it could be improved on. This was just an idea I had based on a cursory look over the code, and I won’t be surprised if I’ve missed something important that would make this incompatible with chevrotain.

It seems to me that all the brittle magic of chevrotain happens in Parser.performSelfAnalysis(this), where the GAST is produced from the string representations of the rule functions. That is a way of understanding what the user is trying to do with the parser so that analysis can be performed. I’ve come up with another way to shine a light on the user’s intentions without relying on function.toString().

This code is ready to be run (I’m using node v9.8.0). Note that it was only meant as a proof of concept, and intentionally doesn’t do a lot of things.

Essentially, the static analyze function replaces all the low-level parsing methods (“monkey patching” them) with ones that record what they would do instead of actually consuming tokens. All of these replacement functions close over a shared scope variable, which each of them uses to “catch” the invocations that happen beneath it. As a result, every method knows which parsing methods were called beneath it, no matter how deep in the call tree they are.

// a convenience logging function that's more clear
const util = require('util')
function log(obj) {
  console.log(util.inspect(obj, { depth: null }))
}

// this will be returned by all the "monkey patched" low level parsing methods
// it's a dummy value that will stand in for real parsing results
const INSPECT = Symbol()

// the variable that will be used to "catch" child parsing calls
let scope = null

// since we're faking real parsing results during the analysis phase,
// and since the user might be performing embedded actions
// we need to give them an easy way to act on something that could be fake
function actOnPossibleInspect(possibleInspect, actionFunction) {
  if (possibleInspect !== INSPECT) return actionFunction(possibleInspect)
  else return INSPECT
}
// this system could be replaced with just an enclosed boolean
// that indicates whether we are in inspection mode or not

// a token matching function that uses the action system
// token types are plain strings in this proof of concept (e.g. 'c')
function matchToken(token, testTokenType) {
  return actOnPossibleInspect(token, (tok) => tok.tokenType === testTokenType)
}

// all the monkey patch functions
function monkeyPatchLook(amount) {
  scope.push(`look:${amount}`)
  return INSPECT
}

function monkeyPatchConsume(tokenType) {
  scope.push(`consume:${tokenType}`)
  return INSPECT
}

// subrule doesn't invoke the subrule, because it might not be analyzed or fulfilled yet
// whatever chevrotain does to resolve the
// inherent recursiveness of rule calls would happen here
function monkeyPatchSubrule(ruleName) {
  scope.push(`subrule:${ruleName}`)
  return INSPECT
}

// I haven't included OR, OPTION, AT_LEAST_ONE, etc.,
// because they would be very similar to this and redundant
function monkeyPatchMany(options) {
  let gate, def
  if (typeof options == 'function') {
    gate = () => {}
    def = options
  }
  else ({ gate, def } = options)

  const oldScope = scope
  scope = []
  gate()
  const gateScope = scope

  scope = []
  def()
  const defScope = scope
  scope = oldScope

  scope.push({ type: 'many', defScope, gateScope })

  return INSPECT
}


class Parser {
  constructor() {
    this._rules = {}
    this._dumbGast = {}
  }

  rule(ruleName, ruleFunction) {
    this._rules[ruleName] = ruleFunction
    this[ruleName] = ruleName
  }

  look(amount) {
    throw new Error("This isn't real, not needed for this proof of concept.")
  }

  consume(tokenType) {
    throw new Error("This isn't real, not needed for this proof of concept.")
  }

  subrule() {
    throw new Error("This isn't real, not needed for this proof of concept.")
  }

  many() {
    throw new Error("This isn't real, not needed for this proof of concept.")
  }


  static analyze(parserInstance) {
    const realLook = parserInstance.look
    const realConsume = parserInstance.consume
    const realSubrule = parserInstance.subrule
    const realMany = parserInstance.many

    // this method swaps out the real methods with the monkey patches
    parserInstance.look = monkeyPatchLook
    parserInstance.consume = monkeyPatchConsume
    parserInstance.subrule = monkeyPatchSubrule
    parserInstance.many = monkeyPatchMany

    for (const [ruleName, rule] of Object.entries(parserInstance._rules)) {
      // it sets up the scope for the first time
      // all the child invocations will end up here
      scope = parserInstance._dumbGast[ruleName] = []
      // it calls all the rules, which will use the monkey patched methods
      rule()
      // then it sets the scope back
      scope = null
    }

    // here they are!
    // this isn't a real or useful data structure,
    // but it demonstrates that you can build up something from
    // the invocations down the call stack
    log(parserInstance._dumbGast)

    // then we put all the real methods back
    parserInstance.look = realLook
    parserInstance.consume = realConsume
    parserInstance.subrule = realSubrule
    parserInstance.many = realMany
  }
}


// here's a parser using this
// the actual grammar here is complete nonsense
// but again a proof of concept
class ConceptParser extends Parser {
  constructor() {
    super()

    // a shorter name for this function
    const act = actOnPossibleInspect

    // since we aren't doing function string analysis anymore,
    // we can just use basic functions to call parser methods
    // this acts as a macro that does the same thing with different arguments
    const macroAlternating = (oneArg, otherArg) => {
      const a = this.consume('a')

      this.subrule(this.manyC)

      const oneAlternation = this.many(() => {
        const one = this.consume(oneArg)
        const other = this.consume(otherArg)
        return [one, other]
      })

      this.subrule(this.manyC)

      const otherAlternation = this.many(() => {
        const other = this.consume(otherArg)
        const one = this.consume(oneArg)
        return [other, one]
      })

      this.subrule(this.manyD)

      const b = this.consume('b')

      // we have to use the act function,
      // since the analysis phase will produce fake INSPECT's
      // that don't have methods like .map
      return act(a, () => {
        return {
          a, b,
          c: otherAlternation,
          // parentheses are needed so the arrow function returns an object literal
          d: oneAlternation.map(([ind, dep]) => ({ ind, dep }))
        }
      })
    }

    this.rule('topLevel', () => {
      // here we are using a plain function that calls parser methods
      const alternating = macroAlternating('e', 'f')

      const cs = this.subrule(this.manyC)
      const ds = this.subrule(this.manyD)
      return act(cs, () => ({ alternating, cs, ds }))
    })

    this.rule('manyC', () => this.many({
      gate: () => matchToken(this.look(1), 'c'),
      def: () => {
        return act(this.consume('c'), tok => tok.tokenValue)
      }
    }))

    this.rule('manyD', () => this.many(() => {
      return act(this.consume('d'), tok => tok.tokenValue)
    }))

    Parser.analyze(this)
  }
}

new ConceptParser()

And the output (on my machine):

{ topLevel: 
   [ 'consume:a',
     'subrule:manyC',
     { type: 'many',
       defScope: [ 'consume:e', 'consume:f' ],
       gateScope: [] },
     'subrule:manyC',
     { type: 'many',
       defScope: [ 'consume:f', 'consume:e' ],
       gateScope: [] },
     'subrule:manyD',
     'consume:b',
     'subrule:manyC',
     'subrule:manyD' ],
  manyC: 
   [ { type: 'many',
       defScope: [ 'consume:c' ],
       gateScope: [ 'look:1' ] } ],
  manyD: [ { type: 'many', defScope: [ 'consume:d' ], gateScope: [] } ] }
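The proof of concept leaves out OR, OPTION, AT_LEAST_ONE and friends because they would look much like the many patch. As a rough sketch of that claim, here is how an OR patch might record each alternative in its own scope. The names `monkeyPatchOr` and `recordRule` are hypothetical and exist only for this illustration; they are not chevrotain APIs, and the recorded shape is just as throwaway as the one above.

```javascript
// Hypothetical sketch: an OR patch in the style of monkeyPatchMany.
// `scope` and `INSPECT` mirror the proof of concept above;
// `monkeyPatchOr` and `recordRule` are illustrative names only.
const INSPECT = Symbol('inspect')
let scope = null

function monkeyPatchConsume(tokenType) {
  scope.push(`consume:${tokenType}`)
  return INSPECT
}

// Each alternative runs against a fresh scope, so the calls made
// inside it are recorded separately from its siblings.
function monkeyPatchOr(alternatives) {
  const oldScope = scope
  const altScopes = alternatives.map((alt) => {
    scope = []
    alt()
    return scope
  })
  scope = oldScope
  scope.push({ type: 'or', altScopes })
  return INSPECT
}

// Runs one rule body in recording mode and returns what it recorded.
function recordRule(ruleFunction) {
  const recorded = (scope = [])
  try {
    ruleFunction()
  } finally {
    scope = null
  }
  return recorded
}

const recorded = recordRule(() => {
  monkeyPatchConsume('a')
  monkeyPatchOr([
    () => monkeyPatchConsume('b'),
    () => {
      monkeyPatchConsume('c')
      monkeyPatchConsume('d')
    },
  ])
})

console.log(JSON.stringify(recorded))
```

Unlike many, every alternative gets its own recorded list, which is presumably what a lookahead computation would need to tell the branches apart.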

The possible objection I can see to this method?

It’s gross

I can see the objection that swapping out methods, using an enclosed pointer variable to catch data, and producing fake output that the user could see is sort of hacky. But is it considerably worse than analyzing the string representations of functions? Especially given that it opens the door to the massive convenience of using plain functions that call parser methods? I think the trade-off is well worth it.

The biggest inconvenience introduced is in making embedded actions more cluttered with the introduction of INSPECT results, but again I think it’s a small price to pay.
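One way the clutter might be reduced is the enclosed boolean mentioned in the code comments: instead of threading INSPECT values through every action, a single flag says whether we are analyzing, and embedded actions are skipped while it is set. The sketch below is only an illustration of that idea. The names `analyzing`, `action`, and `analyze`, and the fabricated token returned by `consume`, are all assumptions made so the example runs on its own.

```javascript
// Hypothetical sketch of the "enclosed boolean" variant.
// None of these names come from chevrotain.
let analyzing = false
let scope = null

// Embedded actions are wrapped in `action`; they are skipped during analysis,
// so user code never has to check for a fake INSPECT value.
function action(actionFunction) {
  return analyzing ? undefined : actionFunction()
}

function consume(tokenType) {
  if (analyzing) {
    scope.push(`consume:${tokenType}`)
    return undefined
  }
  // Stand-in for real parsing: fabricate a token so the example is runnable.
  return { tokenType, tokenValue: tokenType.toUpperCase() }
}

// Runs a rule in analysis mode and returns the recorded calls.
// The flag and scope are reset even if the rule throws.
function analyze(ruleFunction) {
  analyzing = true
  scope = []
  try {
    ruleFunction()
    return scope
  } finally {
    analyzing = false
    scope = null
  }
}

const rule = () => {
  const c = consume('c')
  return action(() => c.tokenValue)
}

const recorded = analyze(rule) // recording pass
const parsed = rule()          // "real" pass using the fabricated token
console.log(recorded, parsed)
```

The cost is that every embedded action has to go through the wrapper; anything touching a parse result outside of it would see `undefined` during the analysis pass.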

Would this work well as a separate API?

If the chevrotain maintainers aren’t interested in pursuing this direction, I could imagine taking this on as a new project using the chevrotain engine (or at least the underlying “auto lookahead” algorithm).

Thoughts?

Issue Analytics

  • State: closed
  • Created 6 years ago
  • Comments: 5 (4 by maintainers)

Top GitHub Comments

1 reaction
blainehansen commented, Mar 25, 2018

The macros would be expanded into an equivalent GAST representation, yes. And it only appears that I wasn’t using MANY/OR/etc. because I was quickly typing up a proof of concept and named them differently.

I’m going to go down the path of seeing how this would work, and reopen this issue if I need any help.

0 reactions
bd82 commented, Aug 14, 2019

Hello @blainehansen

A similar approach was once again suggested in #992, and I’ve started implementing it as part of a major version change in #998, while hopefully limiting the number of breaking changes somewhat…

I guess I have suffered enough from the brittleness of Function.toString and it’s time to move on 😄. This will also enable implementing new capabilities/features (e.g. macros).

Cheers.
