Performance questions?

I’m keen to “run onto the spike” and find any big potential performance improvements in pydantic-core while the API can be changed easily.

I’d therefore love anyone with experience of rust and/or pyo3 to have a look through the code and see if I’m doing anything dumb.

Particular concerns:

The “cast_as vs. extract” issue described in https://github.com/PyO3/pyo3/issues/2278 was a bit scary, as I only found the solution by chance. Are there any other similar issues with pyo3?
Generally copying and cloning values: am I copying stuff where I don’t have to? In particular, is input or parts of input (in the case of a dict/list/tuple/set etc.) copied when it doesn’t need to be?
Similarly, could we use a PyObject instead of PyAny or vice versa and improve performance?
Here and in a number of other implementations of ListInput and DictInput we do a totally unnecessary map. Is this avoidable? Is this having a performance impact? Is there another way to give a general interface to the underlying datatypes that’s more performant?
The code for generating models here seems to be pretty slow compared to other validators, can anything be done?
Recursive models are slower than I had expected. I thought it might be the use of RwLock that was causing the performance problems, but I managed to remove that (albeit in a slightly unsafe way) in #32 and it didn’t make a difference. Is something else the problem? Could we remove Arc completely?
Lifetimes get pretty complicated, and I haven’t even checked whether we get a memory leak from running repeated validation. Should we, or can we, change any lifetimes?

I’ll add to this list if anything else comes to me.

More generally I wonder if there are performance improvements that I’m not even aware of? “What you don’t know, you can’t optimise”

@pauleveritt @robcxyz

Issue Analytics

State:
Created a year ago
Reactions:22
Comments:24 (14 by maintainers)

Top GitHub Comments

4 reactions

samuelcolvin commented, Apr 30, 2022

Amazing yes please.

Are stylistic changes welcome too (things like iterators vs loops etc)?

Yes, but my third priority (after safety and speed) is readability, particularly for Python developers, e.g. me. So if changes make it technically more correct Rust but harder to read for novices, and have no other impact, I might be inclined to refuse them.

Best to create some small PRs and I’ll comment.

4 reactions

Stranger6667 commented, Apr 30, 2022

Hi 😃 First of all, thank you for Pydantic!

Here are some ideas about allocations:

truncate allocates in format!, which could be avoided by writing straight into the output:

use std::fmt::Write;

macro_rules! truncate {
    ($out: expr, $value: expr) => {
        if $value.len() > 50 {
            write!($out, "{}...{}", &$value[0..25], &$value[$value.len() - 24..]);
        } else {
            $out.push_str(&$value);
        }
    };
}
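As a side note on the macro above: it slices by byte offsets, which can panic if an offset falls inside a multi-byte character. A minimal function-style sketch of the same idea, assuming the output is a `String` and the value is a `&str` (the name `truncate_into` is hypothetical, not from pydantic-core), could nudge the cut points onto character boundaries:

```rust
use std::fmt::Write;

/// Hypothetical helper: appends `value` to `out`, shortening long values to
/// roughly the first 25 and last 24 bytes without an intermediate String.
fn truncate_into(out: &mut String, value: &str) {
    if value.len() > 50 {
        // Move the cut points onto char boundaries so slicing cannot panic.
        let mut head = 25;
        while !value.is_char_boundary(head) {
            head -= 1;
        }
        let mut tail = value.len() - 24;
        while !value.is_char_boundary(tail) {
            tail += 1;
        }
        // Writing into a String never fails, so the Result is discarded.
        let _ = write!(out, "{}...{}", &value[..head], &value[tail..]);
    } else {
        out.push_str(value);
    }
}
```

Calling `truncate_into(&mut out, "short")` appends the value unchanged, while a 60-byte ASCII value is appended as its first 25 bytes, `...`, and its last 24 bytes.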

Similarly, some push_str calls together with &format! could be replaced by write! (clippy should warn about it from Rust 1.61 btw):

output.push_str(&format!(", input_type={}", type_));

to:

write!(output, ", input_type={}", type_);
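To make the difference concrete, here is a small self-contained sketch of both forms. The function names are made up for illustration; only the two statements inside them come from the comment above. Both produce the same text, but the `format!` version builds a temporary `String` first.

```rust
use std::fmt::Write;

/// Formats into a temporary String, then copies it into `output`.
fn append_with_format(output: &mut String, type_: &str) {
    output.push_str(&format!(", input_type={}", type_));
}

/// Formats directly into `output`, skipping the temporary String.
fn append_with_write(output: &mut String, type_: &str) {
    // fmt::Write for String never returns Err, so the Result is discarded.
    let _ = write!(output, ", input_type={}", type_);
}
```

Note that `write!` on a `String` needs `std::fmt::Write` in scope, and that it returns a `Result` the compiler will warn about if it is left unused.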

There are a few more to_string calls that could be avoided:

let loc = self
    .location
    .iter()
    .map(|i| i.to_string())
    .collect::<Vec<String>>()
    .join(" -> ");

to:

let mut first = true;
for item in &self.location {
    // Write the separator before every item except the first.
    if !first {
        output.push_str(" -> ");
    }
    first = false;
    match item {
        LocItem::S(s) => output.push_str(s),
        LocItem::I(i) => {
            output.push_str(itoa::Buffer::new().format(*i))
        },
    }
}

This requires the itoa crate, though. It would be helpful in a few more places (ryu would help with floats as well).
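For anyone who wants to try the loop without adding a dependency, here is a std-only sketch. It assumes a reduced `LocItem` enum with just the two variants used above (the real type may differ) and uses `write!` for the integer instead of itoa, which is likely a little slower but still avoids the intermediate `Vec<String>`.

```rust
use std::fmt::Write;

/// Assumed, reduced stand-in for the `LocItem` enum used above.
enum LocItem {
    S(String),
    I(usize),
}

/// Hypothetical helper: joins the location items with " -> " directly
/// into `output`, without collecting per-item Strings first.
fn write_location(output: &mut String, location: &[LocItem]) {
    for (index, item) in location.iter().enumerate() {
        // Separator goes before every item except the first.
        if index > 0 {
            output.push_str(" -> ");
        }
        match item {
            LocItem::S(s) => output.push_str(s),
            // std-only alternative to itoa: format the integer in place.
            LocItem::I(i) => {
                let _ = write!(output, "{}", i);
            }
        }
    }
}
```

Using `enumerate()` instead of a `first` flag removes the mutable state that is easy to get wrong.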

In StrConstrainedValidator::_validation_logic, if both strip_whitespace and to_lower are true, it allocates twice, but only the latter actually requires allocating a new string.

Also, the signature implies an allocation, but it is not needed if e.g. min_length / max_length validation fails, so it might be better to use Cow instead (maybe also use Cow for Validator::get_name?)
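A rough sketch of the Cow idea, under assumptions: the function name, its parameters, the error type, and the order of the checks are all made up here and do not mirror the real validator. Stripping whitespace only needs a sub-slice, a failed length check returns before anything is allocated, and only lowercasing produces an owned string.

```rust
use std::borrow::Cow;

/// Hypothetical, simplified string validation returning Cow so that
/// callers only pay for an allocation when the text actually changes.
fn normalise<'a>(
    input: &'a str,
    strip_whitespace: bool,
    to_lower: bool,
    max_length: Option<usize>,
) -> Result<Cow<'a, str>, String> {
    // trim() returns a sub-slice of `input`, so no allocation happens here.
    let stripped: &'a str = if strip_whitespace { input.trim() } else { input };

    // Fail before allocating anything if the length constraint is violated.
    if let Some(max) = max_length {
        if stripped.chars().count() > max {
            return Err(format!("string longer than {} characters", max));
        }
    }

    if to_lower {
        // Lowercasing is the only step that needs a new String.
        Ok(Cow::Owned(stripped.to_lowercase()))
    } else {
        Ok(Cow::Borrowed(stripped))
    }
}
```

With both flags set this allocates once instead of twice, and with only strip_whitespace set it does not allocate at all.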

py_error! uses format!, so I assume that calling to_string on its argument should not be necessary (like this one).

Are you ok with me submitting a PR for such things?

Static vs dynamic dispatch. Is there any particular reason for using &'data dyn Input in the Validator trait? I’d assume it is possible to use &impl Input instead and avoid vtable indirection at the cost of some compilation time.
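To illustrate the two options, here is a toy sketch. The `Input` trait below is a hypothetical one-method stand-in, not the real pydantic-core trait, and the function names are invented. The `dyn` version is compiled once and calls through a vtable; the `impl` version is compiled separately for each concrete input type, which lets the compiler inline the call.

```rust
/// Hypothetical, much-reduced stand-in for the `Input` trait.
trait Input {
    fn as_int(&self) -> Option<i64>;
}

impl Input for i64 {
    fn as_int(&self) -> Option<i64> {
        Some(*self)
    }
}

impl Input for &str {
    fn as_int(&self) -> Option<i64> {
        self.parse().ok()
    }
}

/// Dynamic dispatch: one compiled copy, method looked up via the vtable.
fn validate_dyn(input: &dyn Input) -> Option<i64> {
    input.as_int()
}

/// Static dispatch: monomorphised once per concrete input type.
fn validate_impl(input: &impl Input) -> Option<i64> {
    input.as_int()
}
```

One trade-off worth checking before switching: a trait method that takes `&impl Input` is generic, and generic methods cannot be called through a trait object, so this matters if validators themselves are stored behind `dyn`.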

The code for generating models here seems to be pretty slow compared to other validators, can anything be done?

inside with_prefix_location:

self.location = [location.clone(), self.location].concat();

One allocation could be avoided:

let mut new_location = location.clone();
new_location.extend_from_slice(&self.location);
self.location = new_location;
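The same idea can be sketched as standalone functions. This assumes the location is a `Vec<String>` (the real element type is presumably `LocItem`), and the function names are hypothetical. The second version goes one step further than the snippet above by reserving the final capacity up front, so the buffer is allocated exactly once.

```rust
/// Baseline: clones the prefix into a Vec, then `concat` allocates a second
/// Vec and clones every element into it.
fn with_prefix_concat(prefix: &[String], current: Vec<String>) -> Vec<String> {
    [prefix.to_vec(), current].concat()
}

/// Sketch: reserve the final size once, then copy both parts in.
fn with_prefix_extend(prefix: &[String], current: &[String]) -> Vec<String> {
    let mut new_location = Vec::with_capacity(prefix.len() + current.len());
    new_location.extend_from_slice(prefix);
    new_location.extend_from_slice(current);
    new_location
}
```

Both return the prefix followed by the existing location. The elements themselves are still cloned in either version.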

I’ll take a look at other places during the weekend 😃
