New feature: node graph comparison

I’m developing a library that does annotation processing and generates code using KotlinPoet. Previously I tested it by comparing the raw generated text against a golden file; I now parse both with kastree so that trivial differences such as comments and whitespace are ignored.

To make it easier to spot where the generated and expected code diverge when a test fails, I wrote a SyntaxTreeComparer object that traverses two node graphs in tandem and returns the first divergence found between them, if any. I can then pretty-print that divergence in my failure messages.
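The SyntaxTreeComparer itself is not shown in this issue, so the following is only a minimal sketch of what a tandem traversal could look like. It assumes a hypothetical toy `Tree` type rather than kastree's `Node`, and the names `Divergence` and `firstDivergence` are made up for illustration.

```kotlin
// Hypothetical stand-in for a parsed syntax tree node; this is NOT kastree's Node.
data class Tree(val label: String, val children: List<Tree> = emptyList())

// The first point at which two trees differ, plus the labels of the nodes leading to it.
data class Divergence(val path: List<String>, val expected: Tree?, val actual: Tree?)

// Walks both trees in lockstep, depth-first, and returns the first divergence,
// or null when the trees match.
fun firstDivergence(expected: Tree?, actual: Tree?, path: List<String> = emptyList()): Divergence? {
    if (expected == null && actual == null) return null
    if (expected == null || actual == null || expected.label != actual.label) {
        return Divergence(path, expected, actual)
    }
    val here = path + expected.label
    // Compare children pairwise; a child missing on either side shows up as null.
    val count = maxOf(expected.children.size, actual.children.size)
    for (i in 0 until count) {
        val found = firstDivergence(expected.children.getOrNull(i), actual.children.getOrNull(i), here)
        if (found != null) return found
    }
    return null
}

// Infix form, mirroring the `value diff expectedTree` call style used in the matcher.
infix fun Tree.diff(expected: Tree): Divergence? = firstDivergence(expected, this)
```

In this sketch the `path` of a divergence ends at the enclosing node, which is what a failure message would want to pretty-print so the reader sees the differing statement in context.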

For example, given this golden file to compare against:

// Package statement and various imports...

class MinAndMax<T: Comparable<T>>(id: String) : ElementaryNode(id) {
    private val `_values`: SinglePortInputBoundary<Iterable<T>> = inputBoundary("values")

    val values: InputPort<Iterable<T>> = `_values`.exposed

    private val `_min`: SinglePortOutputBoundary<T?> = outputBoundary("min")

    val min: OutputPort<T?> = `_min`.exposed

    private val `_max`: SinglePortOutputBoundary<T?> = outputBoundary("max")

    val max: OutputPort<T?> = `_max`.exposed

    override suspend fun executeOnce() {
        val foo = "Bar" // Shouldn't really be in generated code
        coroutineScope {
            val values = async {
                `_values`.inner.yield()
            }
            val output = nodeBody(values.await())
            `_min`.inner.forward(output.min)
            `_max`.inner.forward(output.max)
        }
    }

    private fun nodeBody(values: Iterable<T>): ReturnValue<T> = ReturnValue(values.min(), values.max())
    private class ReturnValue<T>(val min: T?, val max: T?)
}

And a test case that involves this custom kotlintest matcher, where diff passes both node graphs to SyntaxTreeComparer:

fun matchTree(expectedTree: Node) = object : Matcher<Node> {
    override fun test(value: Node): Result {
        val divergence = value diff expectedTree
        val ifShouldMatch = "Trees should match, but diverge:\n${divergence?.prettyPrint()}"
        val ifShouldNotMatch = "Trees should not match, but they do"
        return Result(divergence == null, ifShouldMatch, ifShouldNotMatch)
    }
}

I get this failure message:

Trees should match, but diverge:
EXPECTED
{
    val foo = "Bar"
    coroutineScope {
        val values = async {
            `_values`.inner.yield()
        }
        val output = nodeBody(values.await())
        `_min`.inner.forward(output.min)
        `_max`.inner.forward(output.max)
    }
}
ACTUAL
{
    coroutineScope {
        val values = async {
            `_values`.inner.yield()
        }
        val output = nodeBody(values.await())
        `_min`.inner.forward(output.min)
        `_max`.inner.forward(output.max)
    }
}

Anyway, the point is: would you be interested in incorporating something like this into your library? If so, I can open a PR with the code for the syntax tree comparer and add some unit tests later on.

Top GitHub Comments

cretz commented, Nov 8, 2018

Here’s something I hacked together that shows the idea. The differ:

import kastree.ast.Node
import kastree.ast.Visitor
import kotlin.reflect.KClass

open class NodeDiffer {
    fun diff(expected: Node, actual: Node): List<Diff> {
        // Index every node in each tree by its path of parent classes.
        // (Named *Paths so they don't shadow the `expected`/`actual` parameters.)
        val expectedPaths = PathVisitor().apply { visit(expected) }
        val actualPaths = PathVisitor().apply { visit(actual) }

        // Find the longest paths that are not the same
        val pathDiffs = expectedPaths.children.mapNotNull { (expectedPath, expectedChild) ->
            // If the path isn't even in the actual, not a diff (the diff is one higher).
            val actualChild = actualPaths.children[expectedPath] ?: return@mapNotNull null
            // If the children are the same, not a diff
            if (actualChild == expectedChild) return@mapNotNull null
            // Also, if the children are the same type and there is a child path for it, not a diff
            if (actualChild::class == expectedChild::class) {
                val childPath = ParentPath(actualChild::class, expectedPath.parents + expectedPath)
                if (actualPaths.children.containsKey(childPath) && expectedPaths.children.containsKey(childPath))
                    return@mapNotNull null
            }

            // Otherwise, it's a diff
            expectedPath to Diff(
                expectedParents = expectedPath.parents.map { expectedPaths.parents[it]!! },
                expected = expectedChild,
                actualParents = expectedPath.parents.map { actualPaths.parents[it]!! },
                actual = actualChild
            )
        }.toMap()

        // Remove all lower paths that have higher path diffs
        return pathDiffs.filterKeys { it.parents.none(pathDiffs::containsKey) }.values.toList()
    }

    class PathVisitor : Visitor() {
        var stack = emptyList<ParentPath>()
        val children = mutableMapOf<ParentPath, Node>()
        val parents = mutableMapOf<ParentPath, Node>()

        override fun visit(v: Node?, parent: Node) {
            // Node must be present
            v ?: return
            // Create unique path
            var path = ParentPath(parent::class, stack)
            while (children.containsKey(path)) path = path.copy(index = path.index + 1)
            // Add this node to the known children
            children[path] = v
            // Add the parent to the known parents
            parents[path] = parent
            // Go deeper
            stack += path
            super.visit(v, parent)
            stack = stack.dropLast(1)
        }
    }

    data class ParentPath(val cls: KClass<*>, val parents: List<ParentPath>, val index: Int = 0)
    data class Diff(
        val expectedParents: List<Node>,
        val expected: Node,
        val actualParents: List<Node>,
        val actual: Node
    )

    companion object : NodeDiffer()
}

A main test:

package kastree.ast.psi.temp

import kastree.ast.Writer
import kastree.ast.psi.Parser


fun main(args: Array<String>) {
    val expectedCode = Parser.parseFile("""// Package statement and various imports...

class MinAndMax<T: Comparable<T>>(id: String) : ElementaryNode(id) {
    private val `_values`: SinglePortInputBoundary<Iterable<T>> = inputBoundary("values")

    val values: InputPort<Iterable<T>> = `_values`.exposed

    private val `_min`: SinglePortOutputBoundary<T?> = outputBoundary("min")

    val min: OutputPort<T?> = `_min`.exposed

    private val `_max`: SinglePortOutputBoundary<T?> = outputBoundary("max")

    val max: OutputPort<T?> = `_max`.exposed

    override suspend fun executeOnce() {
        coroutineScope {
            val values = async {
                `_values`.inner.yield()
            }
            val output = nodeBody(values.await())
            `_min`.inner.forward(output.min)
            `_max`.inner.forward(output.max)
        }
    }

    private fun nodeBody(values: Iterable<T>): ReturnValue<T> = ReturnValue(values.min(), values.max())
    private class ReturnValue<T>(val min: T?, val max: T?)
}""")
    val actualCode = Parser.parseFile("""// Package statement and various imports...

class MinAndMax<T: Comparable<T>>(id: String) : ElementaryNode(id) {
    private val `_values`: SinglePortInputBoundary<Iterable<T>> = inputBoundary("values")

    val values: InputPort<Iterable<T>> = `_values`.exposed

    private val `_min`: SinglePortOutputBoundary<T?> = outputBoundary("min")

    val min: OutputPort<T?> = `_min`.exposed

    private val `_max`: SinglePortOutputBoundary<T?> = outputBoundary("max")

    val max: OutputPort<T?> = `_max`.exposed

    override suspend fun executeOnce() {
        val foo = "Bar" // Shouldn't really be in generated code
        coroutineScope {
            val values = async {
                `_values`.inner.yield()
            }
            val output = nodeBody(values.await())
            `_min`.inner.forward(output.min)
            `_max`.inner.forward(output.max)
        }
    }

    private fun nodeBody(values: Iterable<T>): ReturnValue<T> = ReturnValue(values.min(), values.max())
    private class ReturnValue<T>(val min: T?, val max: T?)
}""")
    val diffs = NodeDiffer.diff(expectedCode, actualCode)
    diffs.forEach {
        println("EXPECTED:\n------\n" + Writer.write(it.expectedParents.last()) + "\n------")
        println("ACTUAL:\n------\n" + Writer.write(it.actualParents.last()) + "\n------")
    }
}

Returns:

EXPECTED:
------
{
    coroutineScope {
        val values = async {
            `_values`.inner.yield()
        }
        val output = nodeBody(values.await())
        `_min`.inner.forward(output.min)
        `_max`.inner.forward(output.max)
    }
}
------
ACTUAL:
------
{
    val foo = "Bar"
    coroutineScope {
        val values = async {
            `_values`.inner.yield()
        }
        val output = nodeBody(values.await())
        `_min`.inner.forward(output.min)
        `_max`.inner.forward(output.max)
    }
}
------

Now, there are little bugs and whatnot, and if I were developing a full solution I’d restructure it and add tests. But in general, the idea of keeping paths of classes and then comparing them at the end is reasonable, and it stays future-proof as I add more AST classes when more language features get developed.
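To illustrate the "record paths, compare at the end" idea in isolation, here is a simplified sketch on a toy tree. It is not the differ above: `ToyNode`, `Step`, `collectPaths` and `differingPaths` are hypothetical names, `kind` stands in for the AST node class, and the comparison rule (node kind plus child count) is an assumption chosen to keep the example short.

```kotlin
// Hypothetical toy tree, not kastree's Node: `kind` plays the role of the node class.
data class ToyNode(val kind: String, val children: List<ToyNode> = emptyList())

// One step of a path: the parent's kind plus the child's position under that parent.
data class Step(val parentKind: String, val index: Int)

// Walks the tree once and records every node under the path of steps leading to it.
fun collectPaths(root: ToyNode): Map<List<Step>, ToyNode> {
    val nodes = LinkedHashMap<List<Step>, ToyNode>()
    fun walk(node: ToyNode, path: List<Step>) {
        nodes[path] = node
        node.children.forEachIndexed { i, child -> walk(child, path + Step(node.kind, i)) }
    }
    walk(root, emptyList())
    return nodes
}

// Compares the two path maps only after both walks have finished.
fun differingPaths(expected: ToyNode, actual: ToyNode): List<List<Step>> {
    val expectedNodes = collectPaths(expected)
    val actualNodes = collectPaths(actual)
    val differing = expectedNodes.filter { (path, expectedNode) ->
        // A path missing on one side is caught one level up, via the parent's child count.
        val actualNode = actualNodes[path] ?: return@filter false
        expectedNode.kind != actualNode.kind ||
            expectedNode.children.size != actualNode.children.size
    }.keys
    // Keep only the highest differing paths; a path with a differing ancestor is dropped.
    return differing.filter { path -> (0 until path.size).none { path.take(it) in differing } }
}
```

Because the walk never names a concrete node type, a new node kind needs no change to the comparison, which is the future-proofing argument made above. For a block that gains an extra leading statement, this sketch reports the path of the enclosing block rather than of each shifted statement.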

cretz commented, Nov 8, 2018

@TAGC - Note, it may not work on top-level changes and you need to add some test cases. Basically, it just keeps a path of the classes it has traversed and then compares paths at the end.