daml-ghc-shake-test-ci is > 10x slower on Windows

The daml-ghc-shake-test-ci tests (we really need to change that name) are significantly slower on Windows than on Linux. I get runtimes of ~40s on Linux, but on Windows they sometimes take longer than 900s and therefore time out. This happens on master (7ee793140760cf082e16cce7580a775f7ec64a24). My current suspicion is that it might be caused by killThread blocking because we’re stuck in some syscall.

Issue Analytics

  • State: closed
  • Created: 4 years ago
  • Comments: 8 (8 by maintainers)

Top GitHub Comments

2 reactions
majcherm-da commented, Jun 25, 2019

We’re getting stuck in this for loop in gRPC https://github.com/grpc/grpc/blob/master/src/core/lib/channel/channel_args.cc#L81 because the src->num_args value is corrupted or read from the wrong part of memory (it’s an argument passed from Haskell to a C function).

When it worked, for my sample app derived from the above example, src->num_args had the proper value of 0. When it hung, it held seemingly random numbers that caused a very long loop, for example src->num_args: 8103229619571785728.

It’s caused by the .chs bindings, which are architecture dependent and should be generated during the build, but are currently checked into the repo (generated on Linux). The problem is that C’s size_t type is mapped to the CULong type, which is 8 bytes long on 64-bit Linux but only 4 bytes long on mingw64, where size_t is still 8 bytes long.

gRPC’s grpc_channel_args.num_args is of type size_t: https://github.com/grpc/grpc/blob/v1.19.x/include/grpc/impl/codegen/grpc_types.h#L133. That type is 8 bytes long on both platforms, which causes a binding type mismatch on Windows / mingw64.

Windows:

sizeof(int) = 4
sizeof(long) = 4             <------
sizeof(long long) = 8
sizeof(size_t) = 8

Linux:

sizeof(int) = 4
sizeof(long) = 8             <------
sizeof(long long) = 8
sizeof(size_t) = 8
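
The same comparison can be made from the Haskell side of the binding. The following is only a minimal sketch: it uses the standard Foreign.C.Types and Foreign.Storable modules from base, and the name bindingSizes is a hypothetical helper introduced for illustration, not something from the repo. It prints the size in bytes of each candidate binding type on whatever platform it is compiled for, so the output will differ between Linux and mingw64.

```haskell
module Main (main) where

import Foreign.C.Types (CSize, CULLong, CULong)
import Foreign.Storable (sizeOf)

-- | Hypothetical helper: sizes in bytes of the candidate binding types
-- on the platform this program is compiled for.
bindingSizes :: [(String, Int)]
bindingSizes =
  [ ("CULong", sizeOf (undefined :: CULong))   -- follows C's unsigned long
  , ("CULLong", sizeOf (undefined :: CULLong)) -- follows C's unsigned long long
  , ("CSize", sizeOf (undefined :: CSize))     -- follows C's size_t
  ]

main :: IO ()
main = mapM_ report bindingSizes
  where
    report (name, size) = putStrLn ("sizeOf " ++ name ++ " = " ++ show size)
```

A binding for a size_t field is only safe when its Haskell type has the same size as CSize on every target platform, which is why CULong is a poor match here.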

After I changed the binding type for grpc_channel_args.num_args from CULong to CULLong or CSize (Word64), it works correctly: no hangs 🎉
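
One plausible way a 4-byte binding could yield values like the one reported above is sketched below. This is an assumption for illustration, not a trace of the actual generated binding code: it supposes that the Haskell side stores only 4 bytes into the 8-byte num_args field and that the remaining bytes keep whatever was in memory before. The name narrowWrite and the stale pattern 0xDEADBEEFDEADBEEF are hypothetical.

```haskell
module Main (main) where

import Data.Word (Word32, Word64)
import Foreign.Marshal.Alloc (alloca)
import Foreign.Ptr (castPtr)
import Foreign.Storable (peek, poke)
import Numeric (showHex)

-- | Hypothetical helper: store a 4-byte value into an 8-byte slot that
-- already holds stale data, then read all 8 bytes back.
narrowWrite :: Word64 -> Word32 -> IO Word64
narrowWrite stale value = alloca $ \slot -> do
  poke slot stale            -- simulate leftover memory contents
  poke (castPtr slot) value  -- 4-byte store, like poking a 4-byte CULong
  peek slot                  -- 8-byte load, like C reading a size_t

main :: IO ()
main = do
  -- Intended value is 0, but only 4 of the 8 bytes get overwritten.
  seen <- narrowWrite 0xDEADBEEFDEADBEEF 0
  putStrLn ("8-byte read sees 0x" ++ showHex seen "")
```

Under this assumption the 8-byte read returns a large nonzero number even though 0 was written, because four of the eight bytes still hold stale data. A binding type as wide as size_t overwrites the whole field.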

I’ll create a PR after cleaning all the tracing printfs 😃 (cc @neil-da @cocreature @aherrmann-da )

0 reactions
aherrmann-da commented, May 31, 2019

Initially I could reproduce this issue on a Windows VM fairly reliably. After purging all Bazel caches and reinstalling the Windows dev-env I can no longer reproduce this particular issue.

I’ve also tried to create a minimal reproduction.

Main.hs

{-# LANGUAGE OverloadedStrings #-}

module Main (main) where

import Control.Monad.Managed (with)
import DA.Service.Daml.Compiler.Impl.Scenario as SS
import qualified DA.Service.Logger.Impl.Pure as Logger

main :: IO ()
main = do
  putStrLn "starting main"
  with
    (SS.startScenarioService (\_ -> pure ()) Logger.makeNopHandle)
    $ \_scenarioService -> do
      putStrLn "Running scenario service"
  putStrLn "finishing main"

BUILD.bazel

load("//bazel_tools:haskell.bzl", "da_haskell_test")

da_haskell_test(
    name = "test",
    srcs = ["Main.hs"],
    hazel_deps = [
        "base",
        "managed",
    ],
    deps = [
        "//daml-foundations/daml-ghc/daml-compiler",
        "//libs-haskell/da-hs-base",
    ],
    data = [
        "//compiler/scenario-service/server:scenario_service_jar",
        "//daml-foundations/daml-ghc/package-database:package-db",
    ],
)

This passes bazel test //:test and prints the expected output, at least in all the attempts I ran.

When executed with bazel run //:test, it sometimes hangs after printing “starting main”.


Surprisingly, if the target is instead declared as da_haskell_binary, it fails to execute with bazel run due to missing DLLs. I’m not sure if that is related to this issue, or just a problem with my VM setup:

> bazel run //test
C:/users/aj/_bazel_aj/4vtoktyf/execroot/com_github_digital_asset_daml/bazel-out/x64_windows-fastbuild/bin/test/test.exe: error while loading shared libraries: ?: cannot open shared object file: No such file or directory

Dependency Walker lists many system libraries as missing:

API-MS-WIN-EVENTING-CONTROLLER-L1-1-0.DLL
API-MS-WIN-EVENTING-PROVIDER-L1-1-0.DLL
EXT-MS-WIN-ADVAPI32-REGISTRY-L1-1-0.DLL
...

Executing the .exe directly produces no output but returns with $LASTEXITCODE -1073741515 (0xC0000135, the Windows STATUS_DLL_NOT_FOUND code).
