Optimize the build process
Context
This issue is linked to a side discussion in #1109 on the very long build times. Here are some bits and pieces of the discussion in #1109, starting with my comment on a statement made by @bart-degreed:
Running the cibuild already takes a while today (and occasionally times out, the VMs are quite slow)
That is definitely true. One set of activities that takes relatively long is this:
RunInspectCode RunCleanupCode
On Linux (Ubuntu image), RunInspectCode takes a whopping 16 to 18 minutes (yes, minutes) whereas it “only” takes 4 to 5 minutes on Windows (Visual Studio 2019 image). RunCleanupCode takes 2 to 4 minutes on Linux and 1 to 2 minutes on Windows. In total, we are talking about roughly 18 to 22 minutes on Linux and 5 to 7 minutes on Windows. Overall RunInspectCode accounts for roughly half of the total time on Linux.
The question is why you need to run:
- both steps as part of the cibuild and
- those two steps in both images (Ubuntu and Visual Studio 2019).
If possible, you should only run this on Windows.
@bart-degreed responded as follows:
We need to run Linux before Windows, because the Windows build publishes the NuGet package and the documentation website. We’d like to fail fast on formatting issues, not wait for the whole Linux build to complete first. During regular work, formatting issues are usually the cause for failure. Instead of running cleanup locally, we queue up a build and work on something else meanwhile.
I know how frustrating it can be to have to wait on cibuild when working on that, so by all means disable steps temporarily to be productive.
Problem
The two steps in question take 18 to 22 minutes on Linux and 5 to 7 minutes on Windows, i.e., ca. 23 to 29 minutes in total!
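The totals quoted above are sums of the per-step ranges. As a quick sanity check, here is a minimal Python sketch that adds them up. The dictionary layout and the helper name `add_ranges` are illustrative only and not part of the build scripts.

```python
# Per-step duration ranges in minutes (low, high), taken from the figures above.
STEP_MINUTES = {
    "Linux": {"RunInspectCode": (16, 18), "RunCleanupCode": (2, 4)},
    "Windows": {"RunInspectCode": (4, 5), "RunCleanupCode": (1, 2)},
}


def add_ranges(ranges):
    """Sum (low, high) ranges component-wise."""
    ranges = list(ranges)
    return (sum(low for low, _ in ranges), sum(high for _, high in ranges))


# Total per image, then the grand total across both images.
per_image = {image: add_ranges(steps.values()) for image, steps in STEP_MINUTES.items()}
overall = add_ranges(per_image.values())

for image, (low, high) in per_image.items():
    print(f"{image}: {low} to {high} min")
print(f"Both images: {overall[0]} to {overall[1]} min")
```

This prints 18 to 22 minutes for Linux, 5 to 7 minutes for Windows, and 23 to 29 minutes for both images, matching the numbers stated above.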
The following example highlights how frustrating this long time can be. The build started for my commit e65facee related to PR #1103 led to a formatting-related failure after 22 min 8 sec. Here are relevant parts of the output:
Build succeeded.
0 Warning(s)
0 Error(s)
Time Elapsed 00:03:53.77
JetBrains Inspect Code 2021.2.2
Running on AMD 64 in 64-bit mode, .NET runtime 3.1.18 under Unix 5.4.0.1056
Running code cleanup on changed files in pull request
JetBrains Cleanup Code 2021.2.2
Running on AMD 64 in 64-bit mode, .NET runtime 3.1.18 under Unix 5.4.0.1056
Version: 2021.2.2
dotnet tool run jb cleanupcode "JsonApiDotNetCore.sln" --include=";JsonApiDotNetCore.sln;src/Examples/JsonApiDotNetCoreExample.Cosmos/Controllers/NonJsonApiController.cs;src/Examples/JsonApiDotNetCoreExample.Cosmos/Controllers/OperationsController.cs;src/Examples/JsonApiDotNetCoreExample.Cosmos/Controllers/PeopleController.cs;src/Examples/JsonApiDotNetCoreExample.Cosmos/Controllers/TodoItemsController.cs;src/Examples/JsonApiDotNetCoreExample.Cosmos/Data/AppDbContext.cs;src/Examples/JsonApiDotNetCoreExample.Cosmos/Definitions/TodoItemDefinition.cs;src/Examples/JsonApiDotNetCoreExample.Cosmos/JsonApiDotNetCoreExample.Cosmos.csproj;src/Examples/JsonApiDotNetCoreExample.Cosmos/Models/Person.cs;src/Examples/JsonApiDotNetCoreExample.Cosmos/Models/Tag.cs;src/Examples/JsonApiDotNetCoreExample.Cosmos/Models/TodoItem.cs;src/Examples/JsonApiDotNetCoreExample.Cosmos/Models/TodoItemPriority.cs;src/Examples/JsonApiDotNetCoreExample.Cosmos/Program.cs;src/Examples/JsonApiDotNetCoreExample.Cosmos/Properties/launchSettings.json;src/Examples/JsonApiDotNetCoreExample.Cosmos/Startup.cs;src/Examples/JsonApiDotNetCoreExample.Cosmos/appsettings.json;src/JsonApiDotNetCore/Configuration/NoSqlServiceCollectionExtensions.cs;src/JsonApiDotNetCore/Queries/INoSqlQueryLayerComposer.cs;src/JsonApiDotNetCore/Queries/NoSqlQueryLayerComposer.cs;src/JsonApiDotNetCore/Resources/Annotations/NoSqlHasForeignKeyAttribute.cs;src/JsonApiDotNetCore/Resources/Annotations/NoSqlOwnsManyAttribute.cs;src/JsonApiDotNetCore/Resources/Annotations/NoSqlResourceAttribute.cs;src/JsonApiDotNetCore/Services/NoSqlResourceService.cs" --profile --profile="JADNC Full Cleanup" --properties:Configuration=Release --verbosity=WARN -dsl=GlobalAll -dsl=SolutionPersonal -dsl=ProjectPersonal
JetBrains Cleanup Code 2021.2.2
Running on AMD 64 in 64-bit mode, .NET runtime 3.1.18 under Unix 5.4.0.1056
!!!! Process Aborted !!!!
The following files do not match .editorconfig:
* src/Examples/JsonApiDotNetCoreExample.Cosmos/appsettings.json
Run the following command to fix style issues:
dotnet regitlint -s JsonApiDotNetCore.sln --print-command --jb --profile --jb --profile="JADNC Full Cleanup" --jb --properties:Configuration=Release --jb --verbosity=WARN -f commits -a ffe5d62ec78d70b5dcf9f7c7ba080f04f53c908f -b b7930dee25fd506fd4beb55f5b3f4280f08f3244 --fail-on-diff --print-diff
diff --git a/src/Examples/JsonApiDotNetCoreExample.Cosmos/appsettings.json b/src/Examples/JsonApiDotNetCoreExample.Cosmos/appsettings.json
index 38ca8947..8fc2aef8 100644
--- a/src/Examples/JsonApiDotNetCoreExample.Cosmos/appsettings.json
+++ b/src/Examples/JsonApiDotNetCoreExample.Cosmos/appsettings.json
@@ -1,6 +1,7 @@
{
"Data": {
- "DefaultConnection": "AccountEndpoint=https://localhost:8081/;AccountKey=C2y6yDjf5/R+ob0N8A7Cgv30VRDJIWEHLM+4QDU5DE2nQ9nDuVTqobD4b8mGGyPMbIZnqyMsEcaGQy67XIw/Jw=="
+ "DefaultConnection":
+ "AccountEndpoint=https://localhost:8081/;AccountKey=C2y6yDjf5/R+ob0N8A7Cgv30VRDJIWEHLM+4QDU5DE2nQ9nDuVTqobD4b8mGGyPMbIZnqyMsEcaGQy67XIw/Jw=="
},
"Logging": {
"LogLevel": {
The situation was that the relevant part of appsettings.json looked like this:
{
  "Data": {
    "DefaultConnection": "AccountEndpoint=https://localhost:8081/;AccountKey=C2y6yDjf5/R+ob0N8A7Cgv30VRDJIWEHLM+4QDU5DE2nQ9nDuVTqobD4b8mGGyPMbIZnqyMsEcaGQy67XIw/Jw=="
  },
}
However, it was supposed to look like that:
{
  "Data": {
    "DefaultConnection":
      "AccountEndpoint=https://localhost:8081/;AccountKey=C2y6yDjf5/R+ob0N8A7Cgv30VRDJIWEHLM+4QDU5DE2nQ9nDuVTqobD4b8mGGyPMbIZnqyMsEcaGQy67XIw/Jw=="
  },
}
In this case, the error was reported by RunCleanupCode (which is relatively fast) rather than RunInspectCode (which is super-slow at least on Linux).
Solution Options
To kick this off, I see several possible solutions.
Run on Windows Only and Possibly Fail Just a Little Later While Creating a Much Better Developer Experience
In PR #1103, I changed the relevant part of Build.ps as follows to only run the two checks on Windows:
if ($isWindows) {
    RunInspectCode
    RunCleanupCode
}
The Linux build took 16 min 13 sec while the Windows build took 14 min 21 sec. Roughly 8 min 20 sec into the Windows build, RunInspectCode was done. 9 min 30 sec into the build, RunCleanupCode finished. Thus, it would have taken approximately 24 min 30 sec instead of 22 min 8 sec to fail the build, e.g., in case of the above formatting issue. However, this way would have led to a much better developer experience. The build on one platform would have confirmed some more material facts (“does it work?”) while also pointing out issues that are relatively less material (“does the code look consistent?”). In that case, I would have been more than happy to also fix that formatting issue. However, having to wait 22 minutes to only learn that a JSON file was not laid out as expected was super-frustrating.
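The time-to-failure estimate can be reproduced with a few lines of Python. This is only a sketch, and it assumes that the Windows build starts immediately after the Linux build finishes, with no queueing delay (the helper names `mmss` and `fmt` are illustrative).

```python
from datetime import timedelta


def mmss(minutes, seconds):
    """Build a duration from a 'min sec' pair."""
    return timedelta(minutes=minutes, seconds=seconds)


def fmt(duration):
    """Format a duration as 'M min S sec'."""
    total = int(duration.total_seconds())
    return f"{total // 60} min {total % 60} sec"


linux_build = mmss(16, 13)   # full Linux build, checks disabled
inspect_done = mmss(8, 20)   # RunInspectCode finished, measured from Windows build start
cleanup_done = mmss(9, 30)   # RunCleanupCode finished, measured from Windows build start

# Assumption: the Windows build starts right after the Linux build ends.
fail_after_inspect = linux_build + inspect_done
fail_after_cleanup = linux_build + cleanup_done

print(fmt(fail_after_inspect))  # 24 min 33 sec
print(fmt(fail_after_cleanup))  # 25 min 43 sec
```

Under that assumption, the figure of roughly 24 min 30 sec corresponds to the point at which RunInspectCode finished. A failure reported by RunCleanupCode, as in the formatting example, would surface about a minute later, at around 25 min 43 sec.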
Consider Changing the Order of the Images
You said you need to run Linux before Windows, because the Windows build publishes the NuGet package and the documentation website while also wanting to fail fast on formatting issues.
Looking at appveyor.yml, my question would be whether the publishing steps must run on Windows. If it were possible to publish the NuGet package and the documentation website on Linux, we could:
- run the Windows build first, including the RunInspectCode and RunCleanupCode steps, where those two steps only take 5 to 7 minutes in total as opposed to 18 to 22 minutes on Linux; and
- run the Linux build second, without the RunInspectCode and RunCleanupCode steps but with the NuGet package and documentation website publishing steps.
We would fail much faster on formatting issues, i.e., after 5 to 7 rather than 18 to 22 minutes, while at the same time reducing the total build time by 18 to 22 minutes (as we also would in the first solution option).
Top GitHub Comments
My latest response was based on new insights. I’ve come to realize that we should focus our efforts on improving the local development experience, instead of adapting the cibuild to optimize the workflow for individuals. Tomorrow someone may ask us to run the tests first because they don’t have docker running locally. Or to build the documentation website first, because it takes so long to build it locally. This is solving the wrong problem, so I don’t think it’s a good idea to change the cibuild for such purposes.
In the exceptional case that someone is debugging the cibuild itself (such as installing the Cosmos Emulator for Linux), I already mentioned that turning off steps temporarily in a PR to get faster feedback is fine, so that’s not the discussion here.
The cibuild partly validates that the code complies with our principles and quality norms, which typically vary between organizations, teams, projects, and products. The sole fact that “it works” is not good enough, as pointed out in our guidelines:
So instead of changing the cibuild, I’d like to focus our efforts on improving the local tools. Things we can do:
cleanupcode.ps1. This makes it complete quickly. It would resolve the specified branch name to a commit hash and pass it to regitlint. We’ll need to decide what happens when there are staged/unstaged changes.
@bart-degreed, sorry for the very late response. I did not see this.
I think the answer is “yes”.
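The local-tooling idea from the maintainer comment above (resolve a branch name to a commit hash and pass it to regitlint) could look roughly like the following. This is a hedged sketch in Python of the logic only, not the project's actual cleanupcode.ps1. The command line is copied from the CI log earlier in this issue. The function names are hypothetical, and treating `-a` as the head commit and `-b` as the base commit is an assumption.

```python
import subprocess


def resolve_commit(ref):
    """Resolve a branch name (or any git revision) to a full commit hash."""
    result = subprocess.run(
        ["git", "rev-parse", "--verify", f"{ref}^{{commit}}"],
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout.strip()


def regitlint_command(commit_a, commit_b):
    """Return the command line shown in the CI log with the two hashes filled in."""
    return (
        "dotnet regitlint -s JsonApiDotNetCore.sln --print-command"
        ' --jb --profile --jb --profile="JADNC Full Cleanup"'
        " --jb --properties:Configuration=Release --jb --verbosity=WARN"
        f" -f commits -a {commit_a} -b {commit_b}"
        " --fail-on-diff --print-diff"
    )


def command_for_branch(base_branch):
    """Hypothetical helper: compare HEAD against the given base branch.

    Assumption: -a takes the head commit and -b takes the base commit.
    """
    return regitlint_command(resolve_commit("HEAD"), resolve_commit(base_branch))
```

Calling `command_for_branch` with a branch name inside a clone of the repository would return a command string to run in a shell. How staged or unstaged changes should be handled is left open here, as it is in the comment.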