Make Select-String more intelligent for non-primitive types


Summary of the new feature/enhancement

As a user, I want to be able to easily use Select-String to find string data in formatted output, so that I can find and process data in PowerShell more quickly.

By default, when you pass non-string data into the Select-String cmdlet, it will find matches based on the object’s ToString() method results. This is great when the ToString output actually represents the data you want to match, but very often ToString results do not represent the data that you see in PowerShell, and this can result in incorrect or failed matches.

For example, consider this script:

# First, capture the date:
$date = Get-Date

# Now, get the current month in string format:
$month = $date.ToString("MMMM")

# Now, look at how the date renders in PowerShell, showing the month as a string
$date

# Now, try to match the month using Select-String. This returns nothing.
Get-Date | Select-String -Pattern $month

That script displays the current date with the month spelled out, but piping the same date to Select-String and searching for the month name finds no matches. Why does this happen? Because ToString() on DateTime objects returns the date and time with a numeric month, not the month name.
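To see the mismatch directly, you can compare the ToString() result with the lines the formatter renders. This is a minimal sketch that assumes a culture (such as en-US) whose default date display spells out the month; the variable names are illustrative only.

```powershell
$date  = Get-Date
$month = $date.ToString('MMMM')

# The ToString() result, which is what the default matching is based on.
$toStringText = $date.ToString()

# What the console shows: the formatted output, one string per line.
$formattedLines = $date | Out-String -Stream

# Under the assumed culture, only the formatted lines contain the month name.
$toStringText   | Select-String -Pattern $month -SimpleMatch
$formattedLines | Select-String -Pattern $month -SimpleMatch
```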

Now let’s look at a more realistic example:

# Get some services, including the Windows Update service, and filter output on the
# string "Update"
Get-Service wuauserv,bits | Select-String Update

That returns nothing. Why? Because the ToString method on service objects returns the name of the service, so you can’t filter output based on a partial match of a display name string this way.

Here is one more example:

Get-Process -Id $PID | Format-List * | Select-String Memory

This also returns nothing, because the ToString method on the objects that Format cmdlets output returns their type name, and none of those type names match Memory.

It is reasonable for a user to expect to be able to easily and consistently parse/filter output that is rendered in the PowerShell console, but this is not possible unless they pipe to Out-String -Stream before they then pipe that streamed result to Select-String.
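The workaround described above can be sketched as follows. It uses only existing cmdlets; the choice of the current process as sample data is just for illustration.

```powershell
# Render the objects as the console would, one string per line,
# then search those lines instead of the ToString() results.
Get-Process -Id $PID | Format-List * | Out-String -Stream | Select-String -Pattern 'Memory'

# The same idea without an explicit Format cmdlet:
# Out-String applies the default formatting for the object type.
Get-Process -Id $PID | Out-String -Stream | Select-String -Pattern "$PID"
```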

Proposed technical implementation details (optional)

[EDIT]: Updated to reflect the parameter name we decided to move forward in the discussion below.

I want to make Select-String better by adding a new -FromFormattedOutput switch parameter. The switch indicates that you want to select strings based on the console output of the data that is piped into Select-String. Select-String would then automatically take care of formatting and outputting non-value and non-string types (and non-MatchInfo, but that’s a special case internally for Select-String), and would select string matches based on that output rather than on the ToString method output of individual objects.
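Until such a switch exists, the proposed behavior can be approximated with a small wrapper. The function below is a hypothetical helper, not part of PowerShell and not the proposed implementation; its name and parameters are assumptions made for this sketch.

```powershell
# Hypothetical helper (not a built-in): emulates the proposed
# -FromFormattedOutput behavior using today's cmdlets.
function Select-FormattedString {
    [CmdletBinding()]
    param(
        [Parameter(Mandatory, Position = 0)]
        [string[]] $Pattern,

        [Parameter(ValueFromPipeline)]
        [psobject] $InputObject,

        [switch] $SimpleMatch
    )
    begin {
        $items = [System.Collections.Generic.List[object]]::new()
    }
    process {
        # Collect the whole stream first so the formatter sees all objects
        # together, as it would when rendering to the console.
        $items.Add($InputObject)
    }
    end {
        # Format as text, one string per line, then match against those lines.
        $items | Out-String -Stream |
            Select-String -Pattern $Pattern -SimpleMatch:$SimpleMatch
    }
}

# Usage: search the rendered table for the current process id.
Get-Process -Id $PID | Select-FormattedString -Pattern "$PID"
```

Buffering the input keeps table headers and column widths consistent, at the cost of delaying output until the pipeline finishes.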

Additional details

Personally I would prefer if Select-String worked this way by default for non-value and non-string types, but that would be a breaking change at this point, so that’s not an option; however, users who want it to work this way by default can use $PSDefaultParameterValues['Select-String:FromFormattedOutput'] = $true, and that will have the same result.

Issue Analytics

  • State: open
  • Created 4 years ago
  • Reactions: 7
  • Comments: 33 (21 by maintainers)

Top GitHub Comments

KirkMunro commented, Nov 6, 2019 (6 reactions)

I was just watching a 30-minute demo of Docker streamed from Ignite, when the presenter ran this command:

Get-Process | findstr smss

This is exactly what Select-String should be able to do by default, rather than switching to findstr or grep. In fact, the Select-String documentation even describes it as working similarly to both grep and findstr. Yet the difference in output between gps | findstr smss and gps | sls smss demonstrates that Select-String does not function like findstr or grep, which work against the textual output of a command.

Given that’s the case, I’m hopeful the PowerShell Committee votes in favor of not having an additional parameter to get this behavior.

mklement0 commented, Feb 4, 2020 (5 reactions)

In addition to the points made by @vexx32 and @KirkMunro:

  • Select-String’s sole purpose is to search through strings (and being able to do that “quick and dirty”, as @KirkMunro explains, is an invaluable interactive tool).
  • Now, if the input isn’t composed of strings, searching what string representation of the input makes more sense?
    • What you see in the console (host), i.e. the formatted representations?
    • Or the result from a .ToString() call which produces a near-useless and hard-to-predict stringification you don’t typically get to see elsewhere (which answers the “on what basis” question).

> And if you really want to grep against formatted output, it’s simple:

Piping to | Out-String -Stream isn’t simple: it’s an obscure, cumbersome workaround for something the cmdlet should have done automatically to begin with.

> Oh and BTW, Select-String works against files too so that has to be taken into account.

Yes, the -LiteralPath binding via Get-Item / Get-ChildItem output for file-content searching would have to be retained, which makes for a (preexisting) inconsistency - but an easily explained one.

That is, if you really wanted to search a directory listing as printed to the screen, then - and only then - would you need | Out-String -Stream.

Apparently we have an oss function wrapper for that now - that such a function was created speaks to how often you currently have to resort to that workaround.
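For reference, a minimal sketch of the workaround using the oss wrapper mentioned above, assuming a PowerShell version that ships the oss function:

```powershell
# oss is shorthand for piping through Out-String -Stream.
Get-Process -Id $PID | oss | Select-String -Pattern "$PID"
```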

> It would certainly break my daily workflow.

Assuming the -LiteralPath binding is retained, what would break?

Other than someone inappropriately using Select-String with non-string input in a script (the scenario that’s @joeyaiello’s concern) - which is where you should definitely check object properties instead - nothing should break, and much is gained.

To me, that makes it a bucket 3 change, which spares us the confusion of introducing another cmdlet.
