Possible `Set/Add-Content` performance improvement
Summary of the new feature / enhancement
As appears from #8270 (Suggestion: Add a chunking (partitioning, batching) mechanism to Select-Object, analogous to Get-Content -ReadCount), the cmdlets concerned run faster when fed with batches. <strike>Presumably because the output file is (re)opened and closed for each individual item (or batch)</strike>.
This suggested that a `-WriteCount` parameter, as opposed to the `-ReadCount` parameter in `Get-Content` (which keeps the process/file open for a longer period), might improve the performance of the cmdlets concerned.
<strike>1) I lack the C# knowledge to reverse engineer the cmdlets concerned and therefore can only base my assumptions on PowerShell knowledge and (performance) testing.</strike>
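The batching effect mentioned above can be checked with a rough timing sketch like the one below. It assumes nothing beyond the existing `Get-Content -ReadCount` parameter; the file names and the batch size of 1000 are arbitrary choices for illustration, and the timings will differ per machine and disk.

```powershell
# Illustrative paths in the temp directory (hypothetical file names).
$tmp     = [System.IO.Path]::GetTempPath()
$source  = Join-Path $tmp 'writecount-source.txt'
$single  = Join-Path $tmp 'writecount-single.txt'
$batched = Join-Path $tmp 'writecount-batched.txt'

# Build a sample input file; passing the array via -Value writes it in one call.
$lines = foreach ($i in 1..20000) { "line $i" }
Set-Content -Path $source -Value $lines

# One line per pipeline step (the default -ReadCount 1 behavior).
$tSingle = Measure-Command {
    Get-Content -Path $source | Set-Content -Path $single
}

# Arrays of 1000 lines per pipeline step; Set-Content binds each array to -Value.
$tBatched = Measure-Command {
    Get-Content -Path $source -ReadCount 1000 | Set-Content -Path $batched
}

'{0,-10} {1,8:N0} ms' -f 'one-by-one', $tSingle.TotalMilliseconds
'{0,-10} {1,8:N0} ms' -f 'batched', $tBatched.TotalMilliseconds
```

Both runs should produce identical files, so any difference in elapsed time comes from how the items travel through the pipeline rather than from what is written.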
Name                       Value
----                       -----
PSVersion                  7.2.6
PSEdition                  Core
GitCommitId                7.2.6
OS                         Microsoft Windows 10.0.22000
Platform                   Win32NT
PSCompatibleVersions       {1.0, 2.0, 3.0, 4.0…}
PSRemotingProtocolVersion  2.3
SerializationVersion       1.1.0.1
WSManStackVersion          3.0
Re 1: Yes, I think that `ConvertTo-Csv` and `Export-Csv` - along with all the other ones in Category C (from #4242) - could be “batch-enhanced” without negatively impacting one-by-one processing (in a meaningful way).

Re 2: Among the Category B cmdlets - which are “batch-enhanced” already - `Set-Content` / `Add-Content` and `Join-Content` could be improved to benefit the one-by-one case, by defining their `-Value` / `-InputObject` as `object` rather than `object[]`, which the others in that category, including `Out-File`, already do.

@iRon7 that’s not what your comment shows. I’ve answered it there, but your examples don’t output the correct CSV.
On my machine
`1..100000 | ForEach-Object { [pscustomobject]@{ id = $_; name = "name$_" } } | Export-Csv .\test2.csv`
takes about 20% longer than
`1..100000 | ForEach-Object { [pscustomobject]@{ id = $_; name = "name$_" } } | ConvertTo-Csv | Out-Null`
(the times are 1.2 vs 1.0 secs) and
`1..100000 | ForEach-Object { [pscustomobject]@{ id = $_; name = "name$_" } } | Out-Null`
takes 0.75 secs. So the file operations aren’t much of the total.

Looking at `Export-Csv`, it doesn’t open and close the file in the `process` block; it uses a stream-writer object and flushes it as the last line of the `process` block. The `end` block calls code to flush and dispose of the stream-writer and the file-stream it uses. Possibly flushing the stream-writer less often would give a small perf boost (a few large writes should be faster than many small ones; I have a fast SSD in this machine, and a slower disk would show bigger gains). So I wouldn’t rule this out completely, but it may not be the huge improvement you thought.