Identifying backend compatibility versions

We are currently working on identifying the backend versions with which we are compatible and with which we want to be compatible. These backends are PyTorch and TensorFlow. We will be considering Flax at a later point in time.

The first step, identifying the number of test failures for each PyTorch/TensorFlow version, was done in https://github.com/huggingface/transformers/issues/18181.

Total number of tests: 38,991.

Framework         No. failures   Release date   Older than 2 years
----------------  -------------  -------------  ------------------
PyTorch 1.10      50             Mar 10 2021    No
PyTorch 1.9       710            Jun 15 2021    No
PyTorch 1.8       1301           Mar 4 2021     No
PyTorch 1.7       1567           Oct 27 2020    No
PyTorch 1.6       2342           Jul 28 2020    Yes
PyTorch 1.5       3315           Apr 21 2020    Yes
PyTorch 1.4       3949           Jan 16 2020    Yes
TensorFlow 2.8    118            Feb 2 2022     No
TensorFlow 2.7    122            Nov 4 2021     No
TensorFlow 2.6    122            Aug 11 2021    No
TensorFlow 2.5    128            May 13 2021    No
TensorFlow 2.4    167            Dec 14 2020    No

We propose dropping versions that are more than two years old and working toward full support (support = 0 tests failing) for the versions we aim to support. We will drop support for a version once it reaches its two-year mark.
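As a rough illustration of the policy above, here is a minimal sketch that classifies a version as dropped, supported, or still in progress. The release dates and failure counts are copied from the table; the function names, the status labels, and the reference date of Aug 1 2022 are assumptions made for the example, not part of the project's tooling.

```python
from datetime import date

# Release dates and failure counts copied from the table above (a subset).
RELEASES = {
    ("PyTorch", "1.7"): (date(2020, 10, 27), 1567),
    ("PyTorch", "1.6"): (date(2020, 7, 28), 2342),
    ("TensorFlow", "2.8"): (date(2022, 2, 2), 118),
    ("TensorFlow", "2.4"): (date(2020, 12, 14), 167),
}

# Assumed reference date for the example; not stated in the issue.
REFERENCE_DATE = date(2022, 8, 1)


def two_years_after(released: date) -> date:
    """Return the date on which a release turns two years old."""
    try:
        return released.replace(year=released.year + 2)
    except ValueError:
        # Feb 29 release whose two-year date falls in a non-leap year.
        return released.replace(year=released.year + 2, day=28)


def status(released: date, failures: int, today: date) -> str:
    """Classify a version under the proposed policy (labels are hypothetical)."""
    if today >= two_years_after(released):
        return "dropped"
    return "supported" if failures == 0 else "in progress"


if __name__ == "__main__":
    for (framework, version), (released, failures) in RELEASES.items():
        print(framework, version, status(released, failures, REFERENCE_DATE))
```

With the assumed reference date, PyTorch 1.6 comes out as dropped and PyTorch 1.7 as in progress, which agrees with the "Older than 2 years" column in the table.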

Here is the proposed plan moving forward:

  • Have a detailed breakdown of failures for the following versions:
    • Torch 1.7
    • Torch 1.8
    • Torch 1.9
    • Torch 1.10
    • Torch 1.11
    • Torch 1.12
    • TensorFlow 2.4
    • TensorFlow 2.5
    • TensorFlow 2.6
    • TensorFlow 2.7
    • TensorFlow 2.8
    • TensorFlow 2.9
  • Start with an initial compatibility document that lists which models are supported in which versions.
  • Open good first issues to improve compatibility for models not compatible with all versions, starting from the latest one and moving back in time.
  • As versions become supported, run tests on older versions to ensure no regression.

Work by @ydshieh and @LysandreJik


Some context and tips when working on Past CI

  1. The Past CI runs against a specific commit/tag:
    • Motivation: running the tests against the same commit lets us see whether a set of fixes improves overall backward compatibility without introducing new issues.
    • The chosen commit may be moved to a more recent one over time, but it should never be main.
    • When working on a fix for Past CI, keep in mind that we should check the source code at the commit chosen for that particular Past CI run. The commit is given at the beginning of each report provided in the following comments.
  2. For each report, there is an attached errors.txt where you can find more information to ease the fix process:
    • The file contains a list whose elements have the following content:
      • The line where an error occurs
      • The error message
      • The complete name of the failed test
      • The link to the job that ran that failed test
    • The errors in the reports sometimes don’t contain enough information to decide what to do. You can use the corresponding links provided in errors.txt to see the full traceback on the job run pages.
  3. One possible fix process:
    • For a framework and a particular version, go to the corresponding reporting table provided in the following comments.
    • Make sure you have a preferred way to navigate the source code in a specific commit.
    • Download/Open the corresponding errors.txt.
    • From the General table, take a row whose status is empty. Ideally, prefer rows with a higher value in the no. column.
    • Search in errors.txt for the error in the picked row. You get information about the failed line, failed test, and the job link.
    • Navigate to the failed line or failed test in your workspace (or in a browser), checked out at the specific commit for the run.
    • Use the job link to go to the job run page if you need more information about the error.
    • Then you might come up with a solution 😃, or decide, with good reason, that a fix is not necessary.
    • Update the status column with a comment once a fix or a decision is made.
  4. Some guides/hints for the fix:
    • 🔥 To install a specific framework version, utils/past_ci_versions.py can help!
    • ⚠️ The tests are run against a chosen commit, which may not contain some fixes from the main branch. (This is particularly confusing if you try to run the failed test without checking out that commit.)
      • If the test passes when you run a failed test (in the report) against the main branch, with the target framework version, it’s very likely a fix exists on main that applies to the target framework version too.
      • In this case,
        • either update the status with fixed in #XXXXX (if you know for certain which PR fixes that error)
        • or with works for commits since b487096, giving a commit SHA (it’s not always trivial to find out which PR fixed a particular error, especially when working with Past CI)
    • We decided to focus on the PyTorch and TensorFlow versions and not to consider other third-party libraries. Therefore, some packages, like kenlm or detectron2, are not installed. For these, we can simply update the status column with XXX not installed.
    • When an error comes from a C/C++ exception, and the same code and inputs work with newer framework versions, we can skip that failed test with @unittest.skipIf and update the status with something like torch._C issue -> works with PT >= 11 Fixed in #19122.
      • PR #19122 is one such example.
    • If an error occurs in several framework versions, say, PT 11 and PT 10, and a status is updated for the newer version (here PT 11), we can simply put see PT 11 in the report status column for older versions.
    • Some old framework versions lack attributes or arguments introduced in newer versions. See #19201 and #19203 for what a fix looks like in such cases. If a warning similar to the one in #19203 already exists, we can update the status with, for example, Vilt needs PT >= 1.10.
      • Adding such a warning is not a fix in the strict sense, but it at least provides some information. Together with the updated status, it keeps the information tracked.
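
The hint about skipping a test on old framework versions with @unittest.skipIf can be sketched roughly as follows. This is a minimal, self-contained illustration and not the code from #19122: FRAMEWORK_VERSION is a stand-in for the installed version (in a real test it would come from the framework, for example torch.__version__), and version_tuple, is_older_than, and ExampleModelTest are hypothetical names introduced for the example.

```python
import re
import unittest

# Stand-in for the installed framework version (assumption for the example).
FRAMEWORK_VERSION = "1.10.2+cu113"


def version_tuple(version: str) -> tuple:
    """Parse the leading 'major.minor' of a version string into integers."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def is_older_than(installed: str, minimum: str) -> bool:
    """Compare versions numerically, so that 1.9 sorts before 1.10."""
    return version_tuple(installed) < version_tuple(minimum)


class ExampleModelTest(unittest.TestCase):
    @unittest.skipIf(
        is_older_than(FRAMEWORK_VERSION, "1.11"),
        "C/C++ exception in older framework versions; works with PT >= 1.11",
    )
    def test_needs_recent_framework(self):
        # Placeholder body; a real test would exercise the failing code path.
        self.assertTrue(True)

    def test_runs_everywhere(self):
        self.assertEqual(version_tuple("2.4.0"), (2, 4))


if __name__ == "__main__":
    unittest.main()
```

Comparing parsed integer tuples rather than raw strings matters here, because a plain string comparison would place "1.10" before "1.9".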

Issue Analytics

  • State: open
  • Created a year ago
  • Comments: 12 (2 by maintainers)

Top GitHub Comments

LysandreJik commented, Oct 25, 2022

I think we can add it, we’ve had it in the main file for 8 months so it’s unlikely to cause an issue. Looking forward to the next launch!

ydshieh commented, Oct 25, 2022

I was trying to fix the kenlm issue, but I see it’s correctly installed here and has been for a while.

I guess it is an image issue?

Hi @LysandreJik. In fact, Past CI uses transformers-past-gpu/Dockerfile: https://github.com/huggingface/transformers/blame/main/docker/transformers-past-gpu/Dockerfile

It’s arguable whether we should include kenlm. I don’t remember well whether I ran into issues when installing it. Maybe I did for the older versions, so I decided not to install it for any version (to avoid confusion).

We can try with it in the next launch.
