New feature: node graph comparison

I’m developing a library that does annotation processing and generates code using KotlinPoet. Previously I tested this by comparing the raw generated text against a golden file; I’m now using kastree to parse both, so that trivial differences like comments and whitespace are ignored.

To improve my ability to spot where the generated and expected code diverge in case the test fails, I made a SyntaxTreeComparer object that traverses two node graphs in tandem and returns the first divergence found between them, if any. I can then pretty-print this in my failure messages.

For example, given this golden file to compare against:

// Package statement and various imports...

class MinAndMax<T: Comparable<T>>(id: String) : ElementaryNode(id) {
    private val `_values`: SinglePortInputBoundary<Iterable<T>> = inputBoundary("values")

    val values: InputPort<Iterable<T>> = `_values`.exposed

    private val `_min`: SinglePortOutputBoundary<T?> = outputBoundary("min")

    val min: OutputPort<T?> = `_min`.exposed

    private val `_max`: SinglePortOutputBoundary<T?> = outputBoundary("max")

    val max: OutputPort<T?> = `_max`.exposed

    override suspend fun executeOnce() {
        val foo = "Bar" // Shouldn't really be in generated code
        coroutineScope {
            val values = async {
                `_values`.inner.yield()
            }
            val output = nodeBody(values.await())
            `_min`.inner.forward(output.min)
            `_max`.inner.forward(output.max)
        }
    }

    private fun nodeBody(values: Iterable<T>): ReturnValue<T> = ReturnValue(values.min(), values.max())
    private class ReturnValue<T>(val min: T?, val max: T?)
}

And a test case that involves this custom kotlintest matcher, where diff passes both node graphs to SyntaxTreeComparer:

fun matchTree(expectedTree: Node) = object : Matcher<Node> {
    override fun test(value: Node): Result {
        val divergence = value diff expectedTree
        val ifShouldMatch = "Trees should match, but diverge:\n${divergence?.prettyPrint()}"
        val ifShouldNotMatch = "Trees should not match, but they do"
        return Result(divergence == null, ifShouldMatch, ifShouldNotMatch)
    }
}

I get this failure message:

Trees should match, but diverge:
EXPECTED
{
    val foo = "Bar"
    coroutineScope {
        val values = async {
            `_values`.inner.yield()
        }
        val output = nodeBody(values.await())
        `_min`.inner.forward(output.min)
        `_max`.inner.forward(output.max)
    }
}
ACTUAL
{
    coroutineScope {
        val values = async {
            `_values`.inner.yield()
        }
        val output = nodeBody(values.await())
        `_min`.inner.forward(output.min)
        `_max`.inner.forward(output.max)
    }
}
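
The tandem traversal described above can be sketched roughly as follows. This is not the SyntaxTreeComparer from the issue (its code is not shown here) and it does not use kastree's Node types: `Tree`, `Divergence` and `firstDivergence` are hypothetical names on a toy tree, chosen only to show the idea of walking both trees in lockstep and stopping at the first mismatch.

```kotlin
// Hypothetical stand-in for an AST node; kastree's real Node hierarchy is much richer.
data class Tree(val label: String, val children: List<Tree> = emptyList())

// The first point at which two trees differ, plus the labels of the ancestors leading to it.
data class Divergence(val path: List<String>, val expected: Tree?, val actual: Tree?)

// Walk both trees depth-first in lockstep and return the first mismatch, or null if they match.
fun firstDivergence(expected: Tree?, actual: Tree?, path: List<String> = emptyList()): Divergence? {
    if (expected == null && actual == null) return null
    // A node missing on one side, or a different label, is a divergence at this position.
    if (expected == null || actual == null || expected.label != actual.label) {
        return Divergence(path, expected, actual)
    }
    val here = path + expected.label
    // Iterate up to the longer child list so extra or missing children are also detected.
    val width = maxOf(expected.children.size, actual.children.size)
    for (i in 0 until width) {
        val found = firstDivergence(expected.children.getOrNull(i), actual.children.getOrNull(i), here)
        if (found != null) return found
    }
    return null
}

fun main() {
    val expected = Tree("block", listOf(Tree("property foo"), Tree("call coroutineScope")))
    val actual = Tree("block", listOf(Tree("call coroutineScope")))
    println(firstDivergence(expected, actual))
}
```

Returning the ancestor labels alongside the divergent pair is one way to let a failure message print the enclosing block, as the EXPECTED/ACTUAL output above does, rather than only the mismatching leaf.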

Anyway, point is - would you be interested in incorporating something like this into your library? If so, I can open a PR with the code for the syntax tree comparer and add some unit tests later on.

Issue Analytics

  • State: closed
  • Created 5 years ago
  • Comments: 6 (6 by maintainers)

Top GitHub Comments

cretz commented, Nov 8, 2018

Here’s something I hacked together that shows the idea. The differ:

import kastree.ast.Node
import kastree.ast.Visitor
import kotlin.reflect.KClass

open class NodeDiffer {
    fun diff(expected: Node, actual: Node): List<Diff> {
        // Index every node of each tree by its path (named so they don't shadow the parameters)
        val expectedPaths = PathVisitor().apply { visit(expected) }
        val actualPaths = PathVisitor().apply { visit(actual) }

        // Find the longest paths that are not the same
        val pathDiffs = expectedPaths.children.mapNotNull { (expectedPath, expectedChild) ->
            // If the path isn't even in the actual, not a diff (the diff is one higher).
            val actualChild = actualPaths.children[expectedPath] ?: return@mapNotNull null
            // If the children are the same, not a diff
            if (actualChild == expectedChild) return@mapNotNull null
            // Also, if the children are the same type and there is a child path for it, not a diff
            if (actualChild::class == expectedChild::class) {
                val childPath = ParentPath(actualChild::class, expectedPath.parents + expectedPath)
                if (actualPaths.children.containsKey(childPath) && expectedPaths.children.containsKey(childPath))
                    return@mapNotNull null
            }

            // Otherwise, it's a diff
            expectedPath to Diff(
                expectedParents = expectedPath.parents.map { expectedPaths.parents[it]!! },
                expected = expectedChild,
                actualParents = expectedPath.parents.map { actualPaths.parents[it]!! },
                actual = actualChild
            )
        }.toMap()

        // Remove all lower paths that have higher path diffs
        return pathDiffs.filterKeys { it.parents.none(pathDiffs::containsKey) }.values.toList()
    }

    class PathVisitor : Visitor() {
        var stack = emptyList<ParentPath>()
        val children = mutableMapOf<ParentPath, Node>()
        val parents = mutableMapOf<ParentPath, Node>()

        override fun visit(v: Node?, parent: Node) {
            // Node must be present
            v ?: return
            // Create unique path
            var path = ParentPath(parent::class, stack)
            while (children.containsKey(path)) path = path.copy(index = path.index + 1)
            // Add this node to the known children
            children[path] = v
            // Add the parent to the known parents
            parents[path] = parent
            // Go deeper
            stack += path
            super.visit(v, parent)
            stack = stack.dropLast(1)
        }
    }

    data class ParentPath(val cls: KClass<*>, val parents: List<ParentPath>, val index: Int = 0)
    data class Diff(
        val expectedParents: List<Node>,
        val expected: Node,
        val actualParents: List<Node>,
        val actual: Node
    )

    companion object : NodeDiffer()
}
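
The final step of diff, dropping any difference that sits underneath another reported difference, can be looked at in isolation. A minimal sketch, assuming a simplified path representation: here a path is just the list of segment names from the root, whereas ParentPath above carries a class, its ancestor paths and an index. `Path`, `isAncestor` and `topMost` are hypothetical names used for illustration only.

```kotlin
// Hypothetical simplification: a path is the list of segment names from the root.
typealias Path = List<String>

// True when `ancestor` is a strict prefix of `path`.
fun isAncestor(ancestor: Path, path: Path): Boolean =
    ancestor.size < path.size && path.subList(0, ancestor.size) == ancestor

// Keep only the paths that have no ancestor among the reported differences.
fun topMost(diffs: Set<Path>): Set<Path> =
    diffs.filter { candidate -> diffs.none { isAncestor(it, candidate) } }.toSet()

fun main() {
    val diffs = setOf(
        listOf("File", "Class", "Func"),
        listOf("File", "Class", "Func", "Block"),
        listOf("File", "Class", "Property")
    )
    // The Block path is dropped because its ancestor Func is already reported.
    println(topMost(diffs)) // prints [[File, Class, Func], [File, Class, Property]]
}
```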

A main test:

package kastree.ast.psi.temp

import kastree.ast.Writer
import kastree.ast.psi.Parser


fun main(args: Array<String>) {
    val expectedCode = Parser.parseFile("""// Package statement and various imports...

class MinAndMax<T: Comparable<T>>(id: String) : ElementaryNode(id) {
    private val `_values`: SinglePortInputBoundary<Iterable<T>> = inputBoundary("values")

    val values: InputPort<Iterable<T>> = `_values`.exposed

    private val `_min`: SinglePortOutputBoundary<T?> = outputBoundary("min")

    val min: OutputPort<T?> = `_min`.exposed

    private val `_max`: SinglePortOutputBoundary<T?> = outputBoundary("max")

    val max: OutputPort<T?> = `_max`.exposed

    override suspend fun executeOnce() {
        coroutineScope {
            val values = async {
                `_values`.inner.yield()
            }
            val output = nodeBody(values.await())
            `_min`.inner.forward(output.min)
            `_max`.inner.forward(output.max)
        }
    }

    private fun nodeBody(values: Iterable<T>): ReturnValue<T> = ReturnValue(values.min(), values.max())
    private class ReturnValue<T>(val min: T?, val max: T?)
}""")
    val actualCode = Parser.parseFile("""// Package statement and various imports...

class MinAndMax<T: Comparable<T>>(id: String) : ElementaryNode(id) {
    private val `_values`: SinglePortInputBoundary<Iterable<T>> = inputBoundary("values")

    val values: InputPort<Iterable<T>> = `_values`.exposed

    private val `_min`: SinglePortOutputBoundary<T?> = outputBoundary("min")

    val min: OutputPort<T?> = `_min`.exposed

    private val `_max`: SinglePortOutputBoundary<T?> = outputBoundary("max")

    val max: OutputPort<T?> = `_max`.exposed

    override suspend fun executeOnce() {
        val foo = "Bar" // Shouldn't really be in generated code
        coroutineScope {
            val values = async {
                `_values`.inner.yield()
            }
            val output = nodeBody(values.await())
            `_min`.inner.forward(output.min)
            `_max`.inner.forward(output.max)
        }
    }

    private fun nodeBody(values: Iterable<T>): ReturnValue<T> = ReturnValue(values.min(), values.max())
    private class ReturnValue<T>(val min: T?, val max: T?)
}""")
    val diffs = NodeDiffer.diff(expectedCode, actualCode)
    diffs.forEach {
        println("EXPECTED:\n------\n" + Writer.write(it.expectedParents.last()) + "\n------")
        println("ACTUAL:\n------\n" + Writer.write(it.actualParents.last()) + "\n------")
    }
}

Returns:

EXPECTED:
------
{
    coroutineScope {
        val values = async {
            `_values`.inner.yield()
        }
        val output = nodeBody(values.await())
        `_min`.inner.forward(output.min)
        `_max`.inner.forward(output.max)
    }
}
------
ACTUAL:
------
{
    val foo = "Bar"
    coroutineScope {
        val values = async {
            `_values`.inner.yield()
        }
        val output = nodeBody(values.await())
        `_min`.inner.forward(output.min)
        `_max`.inner.forward(output.max)
    }
}
------

Now, there are little bugs and whatnot, and if I were developing a full solution I’d change it up and add tests. But in general, the idea of keeping paths of classes and then comparing at the end is reasonable, and it is future-proof when I add more AST classes as more language features get developed.
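
To make the "keep paths of classes, then compare at the end" idea concrete, here is a small self-contained sketch on a toy tree. It is an illustration under assumptions, not the kastree code above: `Ast`, `collectPaths` and `changedPaths` are hypothetical, node kinds are plain strings instead of classes, and unlike NodeDiffer it also reports paths that exist on only one side.

```kotlin
// Hypothetical toy AST: `kind` plays the role of the node class, `text` is the node's own content.
data class Ast(val kind: String, val text: String = "", val children: List<Ast> = emptyList())

// Record every node under a key built from the kinds (and same-kind sibling positions) above it.
fun collectPaths(
    node: Ast,
    prefix: String = "",
    out: MutableMap<String, Ast> = linkedMapOf()
): Map<String, Ast> {
    var index = 0
    var key = "$prefix/${node.kind}[$index]"
    // Siblings of the same kind would collide, so bump the index until the key is free.
    while (out.containsKey(key)) key = "$prefix/${node.kind}[${++index}]"
    out[key] = node
    node.children.forEach { collectPaths(it, key, out) }
    return out
}

// Compare the two path maps only after both traversals are complete.
fun changedPaths(expected: Ast, actual: Ast): List<String> {
    val e = collectPaths(expected)
    val a = collectPaths(actual)
    // A path counts as changed when it is missing on one side or its own text differs.
    return (e.keys + a.keys).filter { e[it]?.text != a[it]?.text }
}

fun main() {
    val expected = Ast("Block", children = listOf(Ast("Call", "coroutineScope")))
    val actual = Ast("Block", children = listOf(Ast("Property", "foo"), Ast("Call", "coroutineScope")))
    println(changedPaths(expected, actual)) // prints [/Block[0]/Property[0]]
}
```

Because the key is built from node kinds rather than raw child positions, the extra property does not shift the key of the call that follows it, so only the added node is reported. Nothing in the comparison depends on which kinds exist, which is the sense in which the approach keeps working as new node types are added.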

cretz commented, Nov 8, 2018

@TAGC - Note, it may not work on top-level changes and you need to add some test cases. Basically, it just keeps a path of the classes it has traversed and then compares paths at the end.
