Cannot do parallelized "X is one of [A,B,C]" queries

The old DB and NDB libraries offered a way to query a field as a member of a list:

CityTable.query(CityTable.city.IN(city_names)).fetch(100)

I think it should be possible to do this parallelized query in google.cloud.datastore as well.

google.cloud.datastore offers no async API, so manually stitching this query together from separate per-value queries can be an order of magnitude slower. That makes it unusable for my purposes as a replacement for NDB. (I am getting off NDB because that is currently a requirement for moving from python-compat to a generic Python runtime.)

The async API bug mentions that an async API is not needed “because NDB exists”, which doesn’t make sense to me because NDB is more than an async API: it’s also an ORM, an event loop library, and more.

And even the NDB bug talks about how NDB is positioned as a complex ORM layer, versus the simple key-value store of google.cloud.datastore.

I would like the simple key-value approach combined with a barebones async API that one could use to implement a “where field is in list” query like the ones the old DB and NDB libraries supported.
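As a rough illustration of the request above, here is a minimal sketch of how an "X is one of [A, B, C]" lookup might be emulated without an async API: one equality query per value, fanned out over a thread pool and merged. The helper `fetch_in` and its parameters (`run_query`, `key_of`, `limit`, `max_workers`) are hypothetical names invented for this example and are not part of google.cloud.datastore. It also assumes the values are hashable and that the underlying client can safely be shared across threads, which should be checked against the library's documentation.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Callable, Hashable, Iterable, List


def fetch_in(
    run_query: Callable[[Any], Iterable[Any]],
    values: Iterable[Any],
    key_of: Callable[[Any], Hashable],
    limit: int = 100,
    max_workers: int = 8,
) -> List[Any]:
    """Emulate `field IN values` with one equality query per value.

    `run_query(value)` is a caller-supplied (hypothetical) function that
    runs a blocking `field == value` query and returns the matching rows.
    `key_of(row)` returns a hashable identity used to drop duplicate rows.
    """
    # Drop duplicate values while keeping their original order.
    unique_values = list(dict.fromkeys(values))
    if not unique_values:
        return []

    merged: List[Any] = []
    seen = set()
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map runs the queries concurrently and yields results
        # in the same order as `unique_values`.
        for rows in pool.map(lambda v: list(run_query(v)), unique_values):
            for row in rows:
                identity = key_of(row)
                if identity not in seen:
                    seen.add(identity)
                    merged.append(row)
    return merged[:limit]


# Possible wiring with google-cloud-datastore (an assumption, not tested
# here; the kind "CityTable" and property "city" are taken from the NDB
# snippet in the issue):
#
#   from google.cloud import datastore
#   client = datastore.Client()
#
#   def cities_named(name):
#       query = client.query(kind="CityTable")
#       query.add_filter("city", "=", name)
#       return query.fetch(limit=100)
#
#   rows = fetch_in(cities_named, city_names, key_of=lambda e: e.key)
```

Note that every per-value query still runs to completion before the limit is applied, so this only reduces wall-clock latency, not the number of reads.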

Issue Analytics

  • State: closed
  • Created: 6 years ago
  • Comments: 7 (4 by maintainers)

Top GitHub Comments

1 reaction
dannymilsom commented, Jan 10, 2020

+1 for adding an async API interface to the google cloud datastore library. I agree with @mikelambert that it should not be a requirement to use NDB to gain this async ability (as NDB comes with a number of other things which might be surplus to requirements, e.g. the ORM).

I think this is particularly important as async Python web frameworks are growing in popularity, and other well established frameworks are actively working to support ASGI (biggest example being Django 3). It should not be a pre-requisite for existing apps to migrate to NDB IMO to unlock this functionality.
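For readers on an async framework, one stopgap in the absence of a native async API is to push the blocking queries onto worker threads from the event loop. The sketch below is only that, a workaround under stated assumptions: `fetch_in_async` and `run_query` are hypothetical names made up for this example, `asyncio.to_thread` requires Python 3.9 or later, and the blocking client is assumed to be safe to call from multiple threads.

```python
import asyncio
from typing import Any, Callable, Iterable, List


async def fetch_in_async(
    run_query: Callable[[Any], Iterable[Any]],
    values: Iterable[Any],
) -> List[List[Any]]:
    """Run one blocking query per value without blocking the event loop.

    `run_query(value)` is a caller-supplied (hypothetical) blocking function.
    Returns one list of rows per input value, in input order.
    """

    def run_blocking(value: Any) -> List[Any]:
        # Materialize the result inside the worker thread so that any
        # lazy iteration also happens off the event loop.
        return list(run_query(value))

    # Each call is scheduled on the default thread pool executor;
    # gather preserves the order of the awaitables it was given.
    tasks = [asyncio.to_thread(run_blocking, value) for value in values]
    return await asyncio.gather(*tasks)
```

This keeps the event loop responsive, but the work is still done by threads, so it is not a substitute for the barebones async API the issue asks for.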

0 reactions
chemelnucfin commented, Jan 22, 2018

Hello, feature requests will now be tracked in the project Feature Requests. I will close this issue now; please feel free to continue to address any issues or concerns here.
