Performance questions?

I’m keen to “run onto the spike” and find any big potential performance improvements in pydantic-core while the API can be changed easily.

I’d therefore love anyone with experience of rust and/or pyo3 to have a look through the code and see if I’m doing anything dumb.

Particular concerns:

  • The “cast_as vs. extract” issue described in https://github.com/PyO3/pyo3/issues/2278 was a bit scary, as I only found the solution by chance. Are there any other similar issues with pyo3?
  • Generally copying and cloning values: am I copying stuff where I don’t have to? In particular, is input or parts of input (in the case of a dict/list/tuple/set etc.) copied when it doesn’t need to be?
  • Similarly, could we use a PyObject instead of PyAny or vice versa and improve performance?
  • here and in a number of other implementations of ListInput and DictInput we do a totally unnecessary map. Is this avoidable? Is this having a performance impact? Is there another way to give a general interface to the underlying datatypes that’s more performant?
  • The code for generating models here seems to be pretty slow compared to other validators, can anything be done?
  • Recursive models are slower than I had expected. I thought it might be the use of RwLock that was causing the performance problems, but I managed to remove that (albeit in a slightly unsafe way) in #32 and it didn’t make a difference. Is something else the problem? Could we remove Arc completely?
  • Lifetimes get pretty complicated. I haven’t even checked if we get a memory leak from running repeated validation. Should we/can we change any lifetimes?

I’ll add to this list if anything else comes to me.

More generally I wonder if there are performance improvements that I’m not even aware of? “What you don’t know, you can’t optimise”

@pauleveritt @robcxyz

Issue Analytics

  • State: closed
  • Created a year ago
  • Reactions: 22
  • Comments: 24 (14 by maintainers)

Top GitHub Comments

4 reactions
samuelcolvin commented, Apr 30, 2022

Amazing yes please.

Are stylistic changes welcome too (things like iterators vs loops etc)?

Yes, but my third priority (after safety and speed) is readability, particularly for Python developers, e.g. me. So if changes make it technically more correct Rust but make it harder to read for novices, and have no other impact, I might be inclined to refuse them.

Best to create some small PRs and I’ll comment.

4 reactions
Stranger6667 commented, Apr 30, 2022

Hi 😃 First of all, thank you for Pydantic!

Here are some ideas about allocations:

truncate: it currently allocates in format!, but that could be avoided:

use std::fmt::Write;

macro_rules! truncate {
    ($out: expr, $value: expr) => {
        if $value.len() > 50 {
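            // Writing into a String cannot fail, so the Result is ignored.
            // The ranges are byte offsets and must fall on char boundaries.
            let _ = write!($out, "{}...{}", &$value[0..25], &$value[$value.len() - 24..]);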
        } else {
            $out.push_str(&$value);
        }
    };
}

Similarly, some push_str calls together with &format! could be replaced by write! (clippy should warn about it from Rust 1.61 btw):

output.push_str(&format!(", input_type={}", type_));

to

write!(output, ", input_type={}", type_);
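
A minimal, std-only sketch of the same idea as a function, writing straight into the output buffer instead of building a temporary String with format!. The name `truncate_into` is hypothetical, and counting the 50/25/24 limits in characters rather than bytes is an assumption made here so that multi-byte input cannot panic on a slice boundary:

```rust
use std::fmt::Write;

/// Append `value` to `out`, shortening long values to "head...tail"
/// without allocating an intermediate String.
/// Hypothetical helper; limits are counted in characters (an assumption).
fn truncate_into(out: &mut String, value: &str) {
    let char_count = value.chars().count();
    if char_count > 50 {
        // Byte offset where the first 25 characters end.
        let head_end = value
            .char_indices()
            .nth(25)
            .map_or(value.len(), |(i, _)| i);
        // Byte offset where the last 24 characters begin.
        let tail_start = value
            .char_indices()
            .nth(char_count - 24)
            .map_or(value.len(), |(i, _)| i);
        // Writing into a String cannot fail, so the Result is discarded.
        let _ = write!(out, "{}...{}", &value[..head_end], &value[tail_start..]);
    } else {
        out.push_str(value);
    }
}
```

Short values are appended unchanged; longer ones keep their first 25 and last 24 characters around a literal "...".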

There are a few more to_string calls that could be avoided:

let loc = self
    .location
    .iter()
    .map(|i| i.to_string())
    .collect::<Vec<String>>()
    .join(" -> ");

to:

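let mut first = true;
for item in &self.location {
    // Emit the separator before every item except the first.
    if !first {
        output.push_str(" -> ");
    }
    first = false;
    match item {
        LocItem::S(s) => output.push_str(s),
        LocItem::I(i) => {
            // itoa formats the integer into a stack buffer, no heap allocation.
            output.push_str(itoa::Buffer::new().format(*i))
        },
    }
}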
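
For comparison, a std-only sketch of the same loop that needs no extra crate. The `LocItem` enum here is a simplified stand-in (the `usize` payload is an assumption), and `write_location` is a hypothetical name; `write!` replaces the `itoa` call and still avoids the per-item `to_string`:

```rust
use std::fmt::Write;

/// Simplified stand-in for the location item type (assumed shape).
enum LocItem {
    S(String),
    I(usize),
}

/// Append the location to `output` as "a -> 0 -> b" without creating
/// one temporary String per item. Hypothetical helper name.
fn write_location(output: &mut String, location: &[LocItem]) {
    for (idx, item) in location.iter().enumerate() {
        // Separator goes before every item except the first.
        if idx > 0 {
            output.push_str(" -> ");
        }
        match item {
            LocItem::S(s) => output.push_str(s),
            LocItem::I(i) => {
                // Writing into a String cannot fail, so the Result is discarded.
                let _ = write!(output, "{}", i);
            }
        }
    }
}
```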

This requires the itoa crate, though it would be helpful in a few more places (ryu would help with floats as well).

StrConstrainedValidator::_validation_logic: if both strip_whitespace and to_lower are true, it allocates twice, but only the latter actually requires allocating a new string.

Also, the signature implies an allocation, but it is not needed if e.g. min_length / max_length validation fails, so it might be better to use Cow instead (maybe also use Cow for Validator::get_name?)
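
A rough sketch of what a Cow-based version might look like. The function name, its parameters, the error type and the order of the checks are all assumptions for illustration, not the actual pydantic-core code. Trimming only narrows the borrowed slice, so a new String is allocated only when lowercasing is requested, and never when the length check fails:

```rust
use std::borrow::Cow;

/// Hypothetical stand-in for the string validation step.
/// Returns a borrowed slice whenever no new string is needed.
fn validate_str<'a>(
    input: &'a str,
    strip_whitespace: bool,
    to_lower: bool,
    max_length: Option<usize>,
) -> Result<Cow<'a, str>, String> {
    // trim() returns a sub-slice of the input: no allocation.
    let s = if strip_whitespace { input.trim() } else { input };

    // Fail before doing any allocating work (check order is an assumption).
    if let Some(max) = max_length {
        if s.chars().count() > max {
            return Err(format!("string longer than {} characters", max));
        }
    }

    if to_lower {
        // Lowercasing is the only step that has to build a new String.
        Ok(Cow::Owned(s.to_lowercase()))
    } else {
        Ok(Cow::Borrowed(s))
    }
}
```

Callers that only read the result can treat the `Cow<str>` like a `&str`; those that need ownership can call `into_owned()`.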

py_error! uses format!, so I assume that calling to_string on its argument should not be necessary (like this one).

Are you ok with me submitting a PR for such things?

Static vs dynamic dispatch. Is there any particular reason for using &'data dyn Input in the Validator trait? I’d assume it is possible to use &impl Input instead and avoid vtable indirection at the cost of some compilation time.
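
To make the difference concrete, here is a toy sketch. The `Input` trait below is a made-up, one-method stand-in (not the real pydantic-core trait), and both function names are hypothetical:

```rust
/// Toy stand-in for the real Input trait (assumed shape).
trait Input {
    fn is_truthy(&self) -> bool;
}

impl Input for i64 {
    fn is_truthy(&self) -> bool {
        *self != 0
    }
}

impl Input for String {
    fn is_truthy(&self) -> bool {
        !self.is_empty()
    }
}

/// Dynamic dispatch: one compiled function, method looked up via a vtable.
fn validate_dyn(input: &dyn Input) -> bool {
    input.is_truthy()
}

/// Static dispatch: monomorphised per concrete type, so the call can be
/// inlined, at the cost of more generated code and compile time.
fn validate_static(input: &impl Input) -> bool {
    input.is_truthy()
}
```

One trade-off to keep in mind: a trait method that takes `&impl Input` is generic, and a trait with generic methods cannot be used as `dyn Trait` unless those methods are bounded by `Self: Sized`. If validators are stored as trait objects, that storage would need a different design.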

The code for generating models here seems to be pretty slow compared to other validators, can anything be done?

inside with_prefix_location:

self.location = [location.clone(), self.location].concat();

One allocation could be avoided:

let mut new_location = location.clone();
new_location.extend_from_slice(&self.location);
self.location = new_location;
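
A self-contained sketch that goes one step further, under assumed types: `LocItem` and `LineError` below are simplified stand-ins, and the method signature is a guess. Reserving the full capacity up front gives a single allocation, and `append` moves the existing items instead of cloning them:

```rust
/// Simplified stand-ins for the real types (assumed shapes).
#[derive(Clone, Debug, PartialEq)]
enum LocItem {
    S(String),
    I(usize),
}

struct LineError {
    location: Vec<LocItem>,
}

impl LineError {
    /// Put `prefix` in front of the existing location.
    fn with_prefix_location(mut self, prefix: &[LocItem]) -> Self {
        // One allocation, sized for the final length.
        let mut new_location = Vec::with_capacity(prefix.len() + self.location.len());
        // The prefix is borrowed, so its items have to be cloned.
        new_location.extend_from_slice(prefix);
        // The existing items are moved, not cloned.
        new_location.append(&mut self.location);
        self.location = new_location;
        self
    }
}
```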

I’ll take a look at other places during the weekend 😃
