Suggestion to deprecate and later remove `Particle.find`
Let me start again with my nitpick by saying that I like particle very much; I use it all the time in my code.
I suggest deprecating Particle.find and removing it in a future version. It is fully superseded by Particle.findall, which is the correct interface for a query (well, actually #312 is). Particle.find can only return one result, but that is a very special case when the query fundamentally can yield none or several entries. Having such a restricted return type is not a good design and should be a non-starter. The API works around this by raising exceptions whenever the API contract cannot be fulfilled. This works correctly, but it is duct tape over the more fundamental issue.
Scott Meyers says: “Make interfaces easy to use correctly and hard (or impossible) to use incorrectly”. Particle.find is very difficult to use correctly (in the sense that you do not get an exception), because a priori, the user cannot know whether the query will yield exactly one result. Frequently, the results will contain zero or several entries, but these states cannot be represented by the method.
If you are not convinced yet, there are three other closely-related design principles which Particle.find violates.
- Each function/method should do the absolute minimum amount of work that is necessary to respond to the user request.
find always first finds all the matches and stores them in a temporary list. But it could stop as soon as the query has found two matches, because then the result is clear: an exception has to be raised. Storing everything in a list only to discard that list afterwards also wastes resources. The implementation only needs to store the first match, and once a second match is found, it can abort by raising an exception. This could be fixed by implementing finditer as suggested in #312 and using that internally instead of findall, but it would be better to just remove this method.
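To illustrate the early-exit idea, here is a minimal sketch on a toy list of names. It does not use the real particle package: `finditer` is a hypothetical stand-in for the iterator proposed in #312, and `find_exactly_one` and `toy_table` are made-up names for this example.

```python
from itertools import islice
from typing import Callable, Iterable, Iterator


def finditer(names: Iterable[str], predicate: Callable[[str], bool]) -> Iterator[str]:
    """Lazily yield matching names (hypothetical stand-in for the finditer of #312)."""
    return (name for name in names if predicate(name))


def find_exactly_one(names: Iterable[str], predicate: Callable[[str], bool]) -> str:
    """Return the single match, raising as soon as the outcome is decided."""
    # At most two matches are pulled from the lazy iterator:
    # a second match already means the query is ambiguous.
    matches = list(islice(finditer(names, predicate), 2))
    if len(matches) != 1:
        raise LookupError("no match" if not matches else "more than one match")
    return matches[0]


examined = []  # records which entries the scan actually touched


def toy_table() -> Iterator[str]:
    """Toy data source (an assumption for this sketch, not the real particle table)."""
    for name in ["pi+", "pi-", "pi0", "K+", "K-", "p"]:
        examined.append(name)
        yield name


try:
    find_exactly_one(toy_table(), lambda n: n.startswith("pi"))
except LookupError as err:
    print(err, examined)
```

With this shape, the scan stops right after the second match instead of walking the whole table, and no full list of matches is ever built.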
- Whenever feasible, fully return the result of some work. In other words, minimize the amount of temporary data that is generated and then thrown away internally in the implementation to respond to some user request.
Functions and methods should be designed to return the maximum amount of information for work that has already been done. The idea here is that the work has been performed already, so it would be a waste to automatically discard some of it. Let the user decide what to discard. Even if the user gets more than they intended to get, they may actually find that extra information useful, and if not, it is no problem for them to discard it. If the disposal happens automatically, the extra work is simply wasted.
According to this principle and since find first finds all the matches, it should also return all the matches. It already did the work to find all of them, the resources have been used. Designing the API in such a way that then only one result is returned means to automatically discard the rest of the work, without giving the user a choice to look at this extra information. Again, this could be fixed with #312, but findall is already the right kind of implementation which returns all results of the query.
- Only raise exceptions in exceptional situations.
This is an important principle in C++, because exceptions in C++ cost nothing when they are not triggered, but are very costly when they are triggered. This was a deliberate design trade-off for the intended use case of exceptions, namely to signal situations in which the program cannot continue because of some unrecoverable error.
Python is slow whether exceptions are raised or not 😉, so the performance argument is not valid, but it is still a good design to use exceptions only in exceptional cases.
How can you avoid raising exceptions in an API? By using return types that can reflect common “error” states and reserving exceptions for truly exceptional circumstances. In C++, one can return a std::optional in those cases where it is not clear whether some query can be fulfilled or not. In Python, we return None instead of some object in a similar situation.
find can easily fail, since it is rather common that a query does not return exactly one match. So the right design is to choose a return type that can reflect the fact that there can be zero, one, two, or many results, instead of raising exceptions.
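As a rough sketch of what such return types look like in Python (again on toy data, not the real particle API; `findall_names` and `first_or_none` are hypothetical helpers invented for this example):

```python
from typing import Callable, Iterable, List, Optional

# Toy data, an assumption for this sketch.
NAMES = ["pi+", "pi-", "pi0", "K+", "K-", "p"]


def findall_names(names: Iterable[str], predicate: Callable[[str], bool]) -> List[str]:
    """Return every match; an empty list is an ordinary result, not an error."""
    return [name for name in names if predicate(name)]


def first_or_none(names: Iterable[str], predicate: Callable[[str], bool]) -> Optional[str]:
    """Return the first match or None, the Python analogue of std::optional."""
    return next((name for name in names if predicate(name)), None)


# The caller decides how to treat zero, one, or many results.
matches = findall_names(NAMES, lambda n: n.startswith("K"))
if not matches:
    print("nothing found")
elif len(matches) == 1:
    print("unique match:", matches[0])
else:
    print("several matches:", matches)
```

Neither helper raises for an empty or ambiguous query, so exceptions remain available for genuinely unexpected failures.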
Issue Analytics
- Created 2 years ago
- Comments:15 (15 by maintainers)

OK, so I deprecated `Particle.find` in https://github.com/scikit-hep/particle/pull/318 and tagged you all. That closes this one “issue” that is very scoped/well-defined. The rest of the discussions should be discussed on https://github.com/scikit-hep/particle/issues/312 to have it all in one place. I would be tempted to release 0.16 with this one new PR and then work on the breaking changes for a future 0.17.

Thanks for the constructive discussions towards a better package, BTW.
Yes, that the return value of `findall` has a nice string representation. That would be useful in interactive searches in ipython and co. Programmatically, a generator seems the Python way to go, but for visualization in interactive searches it could be useful to see all the particles at once directly.
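One possible way to combine both wishes, sketched under assumptions: a small list subclass with a readable repr that can be filled from any generator. `MatchList` is a hypothetical name and not part of the particle package.

```python
class MatchList(list):
    """List of query results with a one-entry-per-line repr (hypothetical helper)."""

    def __repr__(self) -> str:
        if not self:
            return "<no matches>"
        # One numbered line per entry, which reads well at an interactive prompt.
        return "\n".join(f"{i}: {item!r}" for i, item in enumerate(self))


# A generator drives the search; the wrapper is only used for display.
lazy_matches = (name for name in ["pi+", "pi-", "K+"] if name.startswith("pi"))
shown = MatchList(lazy_matches)
print(repr(shown))
```

Programmatic callers could keep consuming the generator directly, while interactive users would wrap it only when they want to see everything at once.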