Possible `Set/Add-Content` performance improvement

Summary of the new feature / enhancement

As appears from #8270 (Suggestion: Add a chunking (partitioning, batching) mechanism to Select-Object, analogous to Get-Content -ReadCount), the concerned cmdlets run faster when fed with batches. <strike>Presumably because the output file is (re)opened and closed for each individual item (or batch)</strike>. This suggests that a -WriteCount parameter, as a counterpart to the -ReadCount parameter of Get-Content (which keeps the process/file open for a longer period), might improve the performance of the concerned cmdlets.
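
A minimal sketch of the kind of comparison behind this observation, assuming a temp directory is writable. The file names and the line count are illustrative and not taken from the issue; timings will vary by machine and disk, so none are shown here.

```powershell
# Hypothetical temp files; the names are illustrative only.
$oneByOne = Join-Path ([System.IO.Path]::GetTempPath()) 'setcontent-onebyone.txt'
$batched  = Join-Path ([System.IO.Path]::GetTempPath()) 'setcontent-batched.txt'

$lines = 1..20000 | ForEach-Object { "line $_" }

# One-by-one: each string travels through the pipeline individually.
$t1 = Measure-Command { $lines | Set-Content -LiteralPath $oneByOne }

# Batched: the whole array is handed over as a single -Value argument.
$t2 = Measure-Command { Set-Content -LiteralPath $batched -Value $lines }

'{0,-12} {1,8:N0} ms' -f 'one-by-one', $t1.TotalMilliseconds
'{0,-12} {1,8:N0} ms' -f 'batched', $t2.TotalMilliseconds

# Both approaches should produce identical files.
$same = (Get-FileHash -LiteralPath $oneByOne).Hash -eq (Get-FileHash -LiteralPath $batched).Hash
"Files identical: $same"
```

Passing the array via -Value is only one way to form a batch; it stands in here for the chunked pipeline input discussed in #8270.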

<strike> If the assumption¹ is correct that the output file is (re)opened and closed for each item in the pipeline, a simple "`-KeepOpen`" switch (or even a fixed change) where the file is closed at the end of the pipeline (output timeout) or when the cmdlet is finalized might also be a direction to improve the performance of these cmdlets.

1) I lack the C# knowledge to reverse engineer the concerned cmdlets and therefore can only base my assumptions on PowerShell knowledge and (performance) testing. </strike>

Name                           Value
----                           -----
PSVersion                      7.2.6
PSEdition                      Core
GitCommitId                    7.2.6
OS                             Microsoft Windows 10.0.22000
Platform                       Win32NT
PSCompatibleVersions           {1.0, 2.0, 3.0, 4.0…}
PSRemotingProtocolVersion      2.3
SerializationVersion           1.1.0.1
WSManStackVersion              3.0

Issue Analytics

  • State: closed
  • Created a year ago
  • Comments: 20 (6 by maintainers)

Top GitHub Comments

mklement0 commented, Sep 17, 2022 (2 reactions)

Re 1: Yes, I think that ConvertTo-Csv and Export-Csv - along with all the other ones in Category C (from #4242) - could be “batch-enhanced” without negatively impacting one-by-one processing (in a meaningful way).

Re 2: Among the Category B cmdlets - which are “batch-enhanced” already - Set-Content / Add-Content and Join-String could be improved to benefit the one-by-one case, by defining their -Value / -InputObject as object rather than object[], which the others in that category, including Out-File, already do.
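
A small sketch of the binding difference this refers to. The two functions below are hypothetical stand-ins, not the real cmdlets; they only show what the parameter binder hands to the process block when a pipeline-bound parameter is typed as object[] versus object.

```powershell
# Hypothetical functions, for illustration only; they are not the real cmdlets.
function Test-ArrayParam {
    param([Parameter(ValueFromPipeline)] [object[]] $Value)
    process { $Value.GetType().Name }   # type seen for each pipeline item
}

function Test-ScalarParam {
    param([Parameter(ValueFromPipeline)] [object] $Value)
    process { $Value.GetType().Name }
}

# With object[], each piped string is wrapped in a one-element array.
$fromArray  = 'a', 'b' | Test-ArrayParam

# With object, each piped string is bound as-is.
$fromScalar = 'a', 'b' | Test-ScalarParam

"object[] parameter sees: $($fromArray -join ', ')"
"object   parameter sees: $($fromScalar -join ', ')"
```

The per-item array wrapping in the object[] case appears to be the overhead the comment has in mind for one-by-one input; how much it matters in the actual cmdlets is not measured here.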

jhoneill commented, Sep 12, 2022 (2 reactions)

@iRon7 that’s not what your comment shows, I’ve answered it there, but your examples don’t output the correct CSV.

On my machine 1..100000 |ForEach-Object { [pscustomobject]@{ id = $_; name = "name$_" } } |Export-Csv .\test2.csv
takes about 20% longer than
1..100000 |ForEach-Object { [pscustomobject]@{ id = $_; name = "name$_" } } | convertto-csv | out-null

(the times are 1.2 vs 1.0 secs) and 1..100000 |ForEach-Object { [pscustomobject]@{ id = $_; name = "name$_" } } | out-null takes 0.75 secs. So the file operations aren’t much of the total.

Looking at export-csv it doesn’t open and close the file in the process block, it uses a stream-writer object and it flushes that as the last line of the process block. The end block calls code to flush and dispose of the stream-writer and the file-stream it uses. Possibly flushing the stream-writer less often would give a small perf boost (a few large writes should be faster than many small ones - I have a fast SSD in this machine, a slower disk would show bigger gains). So I wouldn’t rule this out completely, but it may not be the huge improvement you thought.
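
To get a feel for the flushing point, here is a rough sketch that uses System.IO.StreamWriter directly. It is not the Export-Csv implementation; the Write-Lines helper, its -FlushEachLine switch, and the file names are made up for the comparison.

```powershell
# Hypothetical helper: writes lines through a StreamWriter, optionally
# flushing after every line (mimicking a flush at the end of each process block).
function Write-Lines {
    param(
        [string]   $Path,
        [string[]] $Lines,
        [switch]   $FlushEachLine
    )
    $writer = [System.IO.StreamWriter]::new($Path)
    try {
        foreach ($line in $Lines) {
            $writer.WriteLine($line)
            if ($FlushEachLine) { $writer.Flush() }
        }
    }
    finally {
        $writer.Dispose()   # flushes any remaining buffered data and closes the file
    }
}

$tmp     = [System.IO.Path]::GetTempPath()
$perLine = Join-Path $tmp 'flush-per-line.csv'
$atEnd   = Join-Path $tmp 'flush-at-end.csv'
$rows    = 1..100000 | ForEach-Object { "$_,name$_" }

$tPerLine = Measure-Command { Write-Lines -Path $perLine -Lines $rows -FlushEachLine }
$tAtEnd   = Measure-Command { Write-Lines -Path $atEnd -Lines $rows }

'flush per line: {0:N0} ms' -f $tPerLine.TotalMilliseconds
'flush at end:   {0:N0} ms' -f $tAtEnd.TotalMilliseconds

# The flush strategy should not change what ends up in the file.
$identical = (Get-FileHash -LiteralPath $perLine).Hash -eq (Get-FileHash -LiteralPath $atEnd).Hash
```

Any difference between the two timings depends on the disk and OS caching, which is in line with the caveat above that a fast SSD may show only a small gain.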
