Class-based dependency tracking in name hashing algorithm

This is a meta-issue that provides a quick overview of what needs to be done in order to change dependency tracking from source file-based to class name-based.

Tasks

Class-based dependency tracking can be implemented in two phases. The third phase is dedicated to testing on large projects.

Invalidate by inheritance based on class dependencies (phase 1)

Tasks marked as completed are completed only in a prototype implementation. No changes have been merged to sbt yet.

  • add tracking of declared classes in source files; we want to invalidate classes but at the end we need className -> srcFile relation because we can recompile source files only
  • add dependency tracking of top-level (not owned by any other class) imports; we’ll assign dependencies introduced by those imports to the first top level class/object/trait declared in the source file (see notes below for details)
  • track dependencies at class level (and use declaredClasses to immediately map them back to files until the rest of the algorithm is implemented)
  • add tracking of children of sealed parents; we should track this in the API object corresponding to the sealed parent; this will enable us to perform proper invalidation when a new case (class) is introduced to a sealed hierarchy
  • add per (top level) class tracking of api hashes; tracking of api hashes per each (even nested) class separately will be done in phase 2
  • handle dependencies coming from local and anonymous classes (see discussion below)
  • switch invalidation of dependencies by inheritance to invalidate at class level instead of at source level (would fix #2320)
  • distinguish between source and binary class names. Introduce a new relation binaryClassName: Relation[String, String] and use it whenever the conversion is needed.
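
The bookkeeping described in the bullets above can be sketched roughly as follows. This is a minimal illustration in Python, not sbt's implementation: the `Relation` class here is a hypothetical stand-in for the `Relation[String, String]` type mentioned above, and all file and class names are invented. It shows why the className -> srcFile direction is needed: invalidation happens on classes, but only source files can be recompiled.

```python
from collections import defaultdict


class Relation:
    """Hypothetical many-to-many relation with forward and reverse lookups."""

    def __init__(self):
        self._forward = defaultdict(set)
        self._reverse = defaultdict(set)

    def add(self, a, b):
        self._forward[a].add(b)
        self._reverse[b].add(a)

    def forward(self, a):
        return set(self._forward.get(a, ()))

    def reverse(self, b):
        return set(self._reverse.get(b, ()))


# Invented example data: source file -> classes declared in it.
declared_classes = Relation()
declared_classes.add("A.scala", "pkg.A")
declared_classes.add("A.scala", "pkg.A.Inner")
declared_classes.add("B.scala", "pkg.B")

# Invented example data: source class name -> binary (JVM) class name.
binary_class_name = Relation()
binary_class_name.add("pkg.A", "pkg.A")
binary_class_name.add("pkg.A.Inner", "pkg.A$Inner")
binary_class_name.add("pkg.B", "pkg.B")


def sources_to_recompile(invalidated_classes):
    """Map invalidated class names back to the files that declare them."""
    files = set()
    for class_name in invalidated_classes:
        files |= declared_classes.reverse(class_name)
    return files


def source_names_of(binary_name):
    """Look up the source class name(s) for a binary class name."""
    return binary_class_name.reverse(binary_name)
```

For example, `sources_to_recompile({"pkg.A.Inner"})` yields only `A.scala`, and `source_names_of("pkg.A$Inner")` maps the binary name back to `pkg.A.Inner`.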

Track name hashes per class (phase 2)

  • refactor tracking of APIs to be done at class level instead of at source level
  • extract APIs of all classes (including inner ones) separately
  • track name hashes at class level instead of being at source level (see also: #2319)
  • implement the rest of invalidation logic (member reference, name hashing) based on class names (get rid of mapping through declaredClasses in almost entire algorithm) (would fix #2320)

Only once the last bullet is merged will we see improvements to incremental compilation when a large number of nested classes is involved (as in the Scala compiler’s case).
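
To make the difference between per-source and per-class name hashes concrete, here is a simplified sketch in Python. It is not the algorithm as implemented in sbt: the API representation (member name mapped to a signature string), the use of SHA-1, and all class names are assumptions chosen for illustration. A dependent is invalidated by member reference only if it uses a name whose hash changed in a class it actually depends on.

```python
import hashlib


def name_hashes(api):
    """api: member name -> signature string. Returns member name -> hash."""
    return {
        name: hashlib.sha1(signature.encode("utf-8")).hexdigest()
        for name, signature in api.items()
    }


def changed_names(old_hashes, new_hashes):
    """Names that were added, removed, or whose hash differs."""
    names = set(old_hashes) | set(new_hashes)
    return {n for n in names if old_hashes.get(n) != new_hashes.get(n)}


def invalidated_by_member_ref(changed_by_class, member_ref_deps, used_names):
    """Invalidate a dependent only if it uses a changed name of a class it references."""
    invalidated = set()
    for dependent, targets in member_ref_deps.items():
        used = used_names.get(dependent, set())
        if any(used & changed_by_class.get(target, set()) for target in targets):
            invalidated.add(dependent)
    return invalidated


# Invented scenario: a new class Bar (with member baz) is added to the file
# that already declares Foo. Client depends on Foo and happens to use the
# name "baz" coming from somewhere else.
old_api = {"pkg.Foo": {"run": "def run(): Unit"}}
new_api = {
    "pkg.Foo": {"run": "def run(): Unit"},
    "pkg.Bar": {"baz": "def baz(): Int"},
}
changed_by_class = {
    cls: changed_names(name_hashes(old_api.get(cls, {})), name_hashes(new_api.get(cls, {})))
    for cls in set(old_api) | set(new_api)
}
member_ref_deps = {"pkg.Client": {"pkg.Foo"}}
used_names = {"pkg.Client": {"run", "baz"}}

# Per-class hashes: Foo itself did not change, so Client is left alone.
per_class = invalidated_by_member_ref(changed_by_class, member_ref_deps, used_names)

# Per-source hashes: changes of all classes in the file are merged together,
# so the new name "baz" invalidates Client.
merged = set().union(*changed_by_class.values())
per_source = {dep for dep, used in used_names.items() if used & merged}
```

In this invented scenario `per_class` is empty while `per_source` contains `pkg.Client`, which is the kind of over-invalidation that per-class name hashes are meant to avoid.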

Testing, bug fixing and benchmarking (phase 3)

  • test with the Scalatest repo
  • test with the scala repo using the sbt build
  • test with the specs2 repo
  • fix handling of anonymous and local classes defined in Java (similar logic will have to be implemented as for handling local Scala classes) (https://github.com/sbt/zinc/issues/192)
  • index classes only declared (not inherited) in source files (https://github.com/sbt/zinc/issues/174)
  • benchmark ;clean;compile performance
  • simplify classpath and Analysis instance lookup in the incremental compiler (I think the number of classpath lookups can be reduced now)

Merge changes upstream, prepare for shipping (phase 4)

  • determine the location where this work will be merged into

Targeted sbt version

This most likely is going to be shipped with sbt 1.0-M1.

Benefits

The improved dependency tracking delivers up to 40x speedups of incremental compilation in the scenarios tested. Check the benchmarking results here: https://github.com/sbt/sbt/issues/1104#issuecomment-193002190

The speedups are caused by fixing two main issues:

  1. Issue #2319 would be fixed once name hashes are tracked per class. This way, introducing a new class and the members that come with it would not affect source files that depend (by member reference) on already existing classes.
  2. Effects of adding members (like methods) to a class would affect only classes that inherit from that class. At the moment, adding a member to a class that nobody inherits from can trigger invalidation of all descendants of all classes defined in the same source file (see #2320 for a specific scenario).

Issue 1 is likely to be triggered by any code base that uses more than one class defined in a single source file. Issue 2 affects code bases with a large number of nested classes that are inherited. One example is the Scala compiler itself. Even with name hashing, we invalidate too much, and the code edit cycle becomes long whenever a new member is introduced.
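
The second issue can be illustrated with a small sketch in Python. The file name, class names, and data layout are invented for the example and do not come from sbt or the Scala compiler. It compares which classes get invalidated by inheritance when a member is added to a class that nobody inherits from, under source-level and class-level tracking.

```python
def descendants(cls, children):
    """All transitive subclasses of cls, given parent -> direct children."""
    found, stack = set(), [cls]
    while stack:
        for child in children.get(stack.pop(), ()):
            if child not in found:
                found.add(child)
                stack.append(child)
    return found


# Invented layout: one source file declares both Tree and Helper.
declared_in = {"Trees.scala": ["Tree", "Helper"]}
# Invented hierarchy: only Tree has descendants; nobody inherits from Helper.
children = {"Tree": ["Apply", "Select"], "Apply": ["ApplyDynamic"]}


def invalidated_source_level(changed_class):
    """Source-level tracking: descendants of every class in the changed file."""
    src = next(s for s, classes in declared_in.items() if changed_class in classes)
    out = set()
    for cls in declared_in[src]:
        out |= descendants(cls, children)
    return out


def invalidated_class_level(changed_class):
    """Class-level tracking: only descendants of the changed class itself."""
    return descendants(changed_class, children)
```

Adding a method to `Helper` invalidates `Apply`, `Select` and `ApplyDynamic` under source-level tracking, but nothing under class-level tracking.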


The work described in this issue is funded by Lightbend. I’m working on it as a contractor.

Issue Analytics

  • State: closed
  • Created 10 years ago
  • Reactions: 27
  • Comments: 48 (43 by maintainers)

Top GitHub Comments

gkossakowski commented, Feb 25, 2016 (2 reactions)

I’ve done some preliminary testing of class-based dependency tracking to see how it fares.

Performance improvements

Scalatest (master)

Add a method to AndHaveWord in Matchers.scala.

Run compile with name hashing:

[info] Compiling 1 Scala source to /Users/grek/scala/scalatest/scalatest/target/scala-2.11/classes...
[warn] there were four deprecation warnings; re-run with -deprecation for details
[warn] one warning found
[info] Compiling 45 Scala sources to /Users/grek/scala/scalatest/scalatest/target/scala-2.11/classes...
[success] Total time: 49 s, completed Feb 24, 2016 12:15:01 AM

Run compile with name hashing + class-based dependency tracking:

[info] Compiling 1 Scala source to /Users/grek/scala/scalatest/scalatest/target/scala-2.11/classes...
[warn] there were four deprecation warnings; re-run with -deprecation for details
[warn] one warning found
[success] Total time: 4 s, completed Feb 24, 2016 12:21:12 AM

Specs2 (master)

Add a method to OptionResultMatcher class in OptionMatchers.scala.

Run compile with name hashing:

[info] Compiling 1 Scala source to /Users/grek/tmp/specs2/matcher/target/scala-2.11/classes...
[info] Compiling 5 Scala sources to /Users/grek/tmp/specs2/matcher/target/scala-2.11/classes...
[info] Compiling 6 Scala sources to /Users/grek/tmp/specs2/core/target/scala-2.11/classes...
[info] Compiling 3 Scala sources to /Users/grek/tmp/specs2/junit/target/scala-2.11/classes...
[info] Compiling 99 Scala sources to /Users/grek/tmp/specs2/core/target/scala-2.11/test-classes...
[info] Compiling 1 Scala source to /Users/grek/tmp/specs2/form/target/scala-2.11/classes...
[success] Total time: 48 s, completed Feb 25, 2016 12:48:38 AM

Run compile with name hashing + class-based dependency tracking:

[info] Compiling 1 Scala source to /Users/grek/tmp/specs2/matcher/target/scala-2.11/classes...
[success] Total time: 1 s, completed Feb 25, 2016 12:58:27 AM
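
As a quick check on the numbers above, the speedups implied by the reported total times can be computed as below. This is only arithmetic on the figures printed in the logs; sbt reports whole seconds here, so the ratios are rough.

```python
# (seconds with name hashing, seconds with class-based dependency tracking)
timings = {"Scalatest": (49, 4), "Specs2": (48, 1)}

speedups = {
    project: before / after
    for project, (before, after) in timings.items()
}

for project, factor in speedups.items():
    print(f"{project}: roughly {factor:.1f}x faster")
```

That works out to roughly 12x for Scalatest and 48x for Specs2 in these two runs.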
