[Megathread] Infrastructure Issues
We have a wide range of infrastructure issues, and it has gotten to the point that we often have to rerun CI on PRs several times before they succeed because of transient failures. This has become a significant pain point for the team (evident here: https://github.com/dotnet/sdk/pull/28317#event-7522977008), and it is likely worth having the domestic-cat rotation individual look into one of these while on rotation. I encourage all of you to edit this post and add any issues you encounter. The issues are separated into those that are actionable for our team or for other contributors who flow their code into our repo, and those that are actionable for dnc-eng.
Closed issues
#29008 #29006 #28802 #28801 #27766 #27658 #29289
Open issues:
Global property flow race conditions: https://github.com/dotnet/sdk/pull/29168/checks?check_run_id=9644380039
Issue:
[error]src\Layout\redist\targets\sdks\sdks.csproj(0,0): error NU1301: (NETCORE_ENGINEERING_TELEMETRY=Build) Failed to retrieve information about ‘Microsoft.NET.Sdk.WindowsDesktop’ from remote source ‘https://pkgs.dev.azure.com/dnceng/9ee6d478-d288-47f7-aacc-f6e6d082ae6d/_packaging/a65e5cb4-26c0-410f-9457-06db3c5254be/nuget/v3/flat2/microsoft.net.sdk.windowsdesktop/index.json’.
Child Issue: N/A
Example Run PR + Pipeline: https://github.com/dotnet/sdk/runs/8712332231
Estimated Impact: 1
Issue:
[error]ShellJSInternalError: ENOSPC: no space left on device, write
Child Issue: N/A
Example Run PR + Pipeline: [Pipelines - Run 20220921.69 logs (azure.com)](https://dev.azure.com/dnceng-public/public/_build/results?buildId=25573&view=logs&j=92885c4a-db2e-5086-f9ba-51524576e2ac&t=437333b1-f621-5ac8-6b31-945800dcd511&l=51)
Estimated Impact: 4
Issue:
‘D:\a\1\s\artifacts\bin\Release\Sdks\Microsoft.NET.Sdk.Razor\targets\Microsoft.NET.Sdk.Razor.StaticWebAssets.Pack.CrossTargeting.targets’ because it is being used by another process. [TargetFramework=net472]
Child Issue: Does not yet exist.
Example Run PR + Pipeline: https://github.com/dotnet/sdk/pull/28327, https://github.com/dotnet/sdk/runs/8756003552
Estimated Impact: 4
Issue:
Child Issue: Does not yet exist.
Example Run PR + Pipeline: https://dev.azure.com/dnceng-public/public/_build/results?buildId=272072&view=ms.vss-test-web.build-test-results-tab&runId=5373286&resultId=100222&paneView=debug
Expected command to exit with 0 but it did not.
File Name: D:\a\_work\1\s\artifacts\bin\redist\Release\dotnet\dotnet.exe
Arguments: new console --no-restore --force --debug:custom-hive D:\a\_work\1\s\artifacts\tmp\Release\dotnet-new.IntegrationTests\SharedHomeDirectory\20230512164846363
Exit Code: 100
StdOut:
StdErr:
Template "Console App" could not be created.
Failed to create template.
Details: Error while processing file /content/ConsoleApplication-CSharp/Company.ConsoleApplication1.csproj
The process cannot access the file 'D:\a\_work\1\s\artifacts\tmp\Release\dotnet-new.IntegrationTests\CanOverwriteFilesWithForce\20230512165119230\20230512165119230.csproj' because it is being used by another process.
For details on the exit code, refer to https://aka.ms/templating-exit-codes#100
Issue:
Child Issue: Does not yet exist.
Example Run PR + Pipeline: https://dev.azure.com/dnceng-public/public/_build/results?buildId=276879&view=results
SkipsLocalizationOnInstantiate_WhenInvalidFormat
SkipsLocalizationOnInstantiate_WhenLocalizationValidationFails
Estimated Impact: unknown
Issue:
Child Issue: Does not yet exist.
Example Run PR + Pipeline: https://dev.azure.com/dnceng-public/public/_build/results?buildId=298136&view=ms.vss-test-web.build-test-results-tab&runId=6027854&resultId=102677&paneView=debug
Issue:
Child Issue: https://github.com/dotnet/sdk/issues/33956
Example Run PR + Pipeline: https://dev.azure.com/dnceng-public/cbb18261-c48f-4abb-8651-8cdcb5474649/_build/results?buildId=336802
Estimated Impact: ?
Get "http://localhost:5010/v2/": read tcp 127.0.0.1:40054->127.0.0.1:5010: read: connection reset by peer
Issue:
Child Issue: CannotDisplayUnknownPackageDetails failing with unexpected error
Example Run PR + Pipeline: https://dev.azure.com/dnceng/internal/_build/results?buildId=2249124&view=ms.vss-test-web.build-test-results-tab&runId=51948431&resultId=100087&paneView=debug
Estimated Impact: failing every internal PR
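When triaging a failed run against the list above, it can help to check the log for these error signatures before deciding whether to rerun. The snippet below is only a sketch of that idea: the patterns are copied from the error messages in this post, while the labels, `KNOWN_TRANSIENT`, and `classify` are hypothetical names made up for illustration and are not part of any existing tooling.

```python
import re
from typing import List

# Signatures copied from the error messages listed above.
# The labels are hypothetical names chosen for this example.
KNOWN_TRANSIENT = [
    ("nuget-feed", re.compile(r"error NU1301")),
    ("disk-full", re.compile(r"ENOSPC: no space left on device")),
    ("file-locked", re.compile(r"being used by another process")),
    ("registry-reset", re.compile(r"localhost:5010/v2/.*connection reset by peer")),
]


def classify(log_text: str) -> List[str]:
    """Return the labels of every known signature found in the log text."""
    return [label for label, pattern in KNOWN_TRANSIENT if pattern.search(log_text)]
```

An empty result means the failure matches none of the issues recorded here, which is a hint that it may be a new issue worth adding to this post.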
DNC-Eng Related Issues:
Fixed:
- 429 Errors: https://github.com/dotnet/arcade/issues/10885
- 503 Errors: https://github.com/dotnet/arcade/issues/10943
Open:
Issue Template
Issue:
Child Issue: Does not yet exist.
Example Run PR + Pipeline: |
Estimated Impact: [0-5]
Top GitHub Comments
Another instance of the flakiness that @Forgind recorded: https://dev.azure.com/dnceng-public/public/_build/results?buildId=276879&view=ms.vss-test-web.build-test-results-tab&runId=5490452&resultId=100005&paneView=debug
Reminder that we should update the description above rather than having to search replies.
Issue: Transient failures in the container build tests
Cause: The container tests on Ubuntu push to a local container registry to mimic a real remote registry. This local registry runs in Docker on the Ubuntu host, and sometimes it does not start up as expected.
Presentation: Generally a connection problem to localhost:5010/v2, such as the "connection reset by peer" error listed in the description above.
Mitigation: Restart the tests. These failures are usually very transient and only impact the Ubuntu leg.
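One possible way to make this less flaky would be to wait until the registry answers before the tests push to it. The following is a minimal sketch of that idea, not the repo's actual test setup: `http_probe`, `wait_for_registry`, and the attempt and delay values are hypothetical. Only the `http://localhost:5010/v2/` endpoint comes from the error reported in this thread.

```python
import http.client
import time
import urllib.error
import urllib.request


def http_probe(url: str, timeout: float = 2.0) -> bool:
    """Return True if the endpoint answers at all, whatever the HTTP status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server sent a response (for example 401), so it is up.
        return True
    except (OSError, http.client.HTTPException):
        # Connection refused or reset, or a timeout: not ready yet.
        return False


def wait_for_registry(url="http://localhost:5010/v2/", attempts=10, delay=1.0,
                      probe=http_probe, sleep=time.sleep) -> bool:
    """Poll the registry until it responds or the attempts run out."""
    for attempt in range(attempts):
        if probe(url):
            return True
        if attempt < attempts - 1:
            sleep(delay)  # fixed pause between attempts (assumed value)
    return False
```

A test fixture could call `wait_for_registry()` after starting the Docker container and fail fast with a clear message when it returns `False`, instead of surfacing a connection reset partway through a push.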