Paket pack is unbearably slow when there are about a hundred projects to pack.
Paket version 4.8.7
We have a solution with a little more than 100 C# projects in it. Some projects are class libraries and a few are “leaf” projects - EXE and WebSites, which use those class libraries.
I want to use “paket pack” to package the build output into many NuGet packages with inter-dependencies. Class libraries are usually consumed by multiple “leaf” projects, so I want the NuGet packages for those “leaf” projects to reference the class library packages instead of duplicating binaries in the build drop. For that purpose I placed a tiny autogenerated paket.template file in each project folder.
When I run paket.exe pack output nugetFolder include-referenced-projects lock-dependencies
it scans the folder tree in seconds and I see the message “No description was provided for package” for each project. But then it becomes absolutely quiet (even in verbose mode) and runs for at least half an hour (actually, I never waited for it to finish). Using Process Monitor I can see that Paket continuously reads the same project and template files over and over again. One CPU core is 100% loaded. Nothing is written to the output folder.
What exactly is it doing for so long? Is there anything I can do to make it faster? This sort of kills the whole idea.
Thanks beforehand! Konstantin
Issue Analytics
- Created 6 years ago
- Comments:20 (20 by maintainers)
Top GitHub Comments
I am in a sticky situation here, because I badly need “paket pack” to work, but I don’t know F# at all and cannot publish the original big project as a repro. So I profiled it myself, and once I had some idea I debugged the code and more or less understood what is going on, but I still need somebody with an F# background to fix it.
Here is what I think is happening:
Imagine a solution with 100 projects. The first project is independent, the second depends on the first, the third depends on the first two, etc. Each next project depends on all previous ones.
Problematic code is in
Paket.Core\PaketConfigFiles\ProjectFile.fs, member this.GetAllReferencedProjects()
When it gets invoked for project N, it reads in all N-1 dependencies and then recursively calls itself on each dependency. Then the process repeats itself for N-2, etc. This means processing time is proportional to the factorial of the number of projects! For 100 projects it is 1.0e158! No surprise I cannot get it to run to completion. Instead, each scanned project should cache the results of its scan in a hash table with the project path as the key. Then each new project would need only 1 (one) project scan, because all previous projects are already scanned and their results are available. So when it is done right the algorithm should be linear, instead of N! Who can make it happen, please?
Besides, I also noticed that GetAllReferencedProjects is always invoked TWICE for each project; I don’t know why. That’s on top of N! It would be nice to remove that duplication too.
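The caching idea can be sketched outside of Paket. The following is a minimal illustration in Python, not Paket's actual F# code: the names `build_chain`, `naive_closure` and `cached_closure` are hypothetical, the project graph is an in-memory dictionary standing in for real project files, and the graph is assumed to be acyclic. A counter records how many times a project would be "read". In this simplified model the uncached version performs 2^(N-1) reads to resolve the last project of an N-project chain, which is exponential rather than literally factorial but just as impractical at N = 100.

```python
from typing import Dict, FrozenSet, List


def build_chain(n: int) -> Dict[str, List[str]]:
    """Worst case from the description: project i references every earlier project."""
    return {f"p{i}": [f"p{j}" for j in range(1, i)] for i in range(1, n + 1)}


def naive_closure(graph: Dict[str, List[str]], project: str, reads: List[int]) -> FrozenSet[str]:
    """Recompute the transitive references from scratch on every call."""
    reads[0] += 1  # stands in for reading and parsing one project file
    result = set()
    for dep in graph[project]:
        result.add(dep)
        result |= naive_closure(graph, dep, reads)
    return frozenset(result)


def cached_closure(
    graph: Dict[str, List[str]],
    project: str,
    cache: Dict[str, FrozenSet[str]],
    reads: List[int],
) -> FrozenSet[str]:
    """Same result, but each project is read once and remembered by its path."""
    if project in cache:
        return cache[project]
    reads[0] += 1
    result = set()
    for dep in graph[project]:
        result.add(dep)
        result |= cached_closure(graph, dep, cache, reads)
    cache[project] = frozenset(result)
    return cache[project]


if __name__ == "__main__":
    graph = build_chain(15)
    naive_reads, cached_reads, cache = [0], [0], {}
    naive_closure(graph, "p15", naive_reads)
    for name in graph:  # resolve every project, as a pack run would
        cached_closure(graph, name, cache, cached_reads)
    print("naive reads for the last project:", naive_reads[0])
    print("cached reads for all projects:", cached_reads[0])
```

With the cache shared across the whole run, resolving all projects costs one read per project. Merging the reference sets still takes time proportional to their size, so the total work in this sketch is not strictly linear, but the repeated file reads disappear.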
Thank you for your attention!
Congratulations @gerardtoconnor! With the new implementation, execution time dropped from eternity to 11 seconds, which is remarkable and definitely more usable! I’m looking forward to seeing it released.
One other thing about the project scanning process is still not clear. It seems Paket scans the folder tree for all files named like “*.*proj” and tries to parse them as projects. In most cases they are indeed projects, but some of them may be irrelevant to the solution being packed and not referenced from anywhere. In one case it picked up a .sqlproj file and stopped because the corresponding DLL was not found. But that project does not produce a DLL (it produces a .dacpac). I wonder how hard it would be to make Paket a bit smarter so that it does not read projects which do not belong to the closure of references.
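To make "closure of references" concrete, here is a rough sketch in Python of walking only the projects reachable from a set of root projects, instead of every “*.*proj” file under the folder. It is an illustration under assumptions, not a description of how Paket works: `direct_references` and `reference_closure` are hypothetical names, file contents come from a caller-supplied `read_text` function, paths are treated as case-sensitive with forward slashes, and only the standard MSBuild `ProjectReference` item with an `Include` attribute is followed.

```python
import posixpath
import xml.etree.ElementTree as ET
from typing import Callable, Iterable, List, Set


def direct_references(project_path: str, xml_text: str) -> List[str]:
    """Return the normalized paths of the ProjectReference items in one project file."""
    root = ET.fromstring(xml_text)
    base_dir = posixpath.dirname(project_path)
    refs = []
    for elem in root.iter():
        # Drop the XML namespace if present: old-style project files declare one,
        # SDK-style project files do not.
        tag = elem.tag.rsplit("}", 1)[-1]
        include = elem.get("Include")
        if tag == "ProjectReference" and include:
            relative = include.replace("\\", "/")
            refs.append(posixpath.normpath(posixpath.join(base_dir, relative)))
    return refs


def reference_closure(roots: Iterable[str], read_text: Callable[[str], str]) -> Set[str]:
    """Collect the roots plus everything they reference, directly or transitively."""
    seen: Set[str] = set()
    stack = list(roots)
    while stack:
        path = stack.pop()
        if path in seen:
            continue  # each project file is read at most once
        seen.add(path)
        stack.extend(direct_references(path, read_text(path)))
    return seen
```

Called with the “leaf” projects as roots, a walk like this never opens an unrelated .sqlproj that happens to sit in the same folder tree, because nothing references it.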