Is there a way to perform batch operations across databases and containers (Cosmos DB)?
Query/Question
I am looking to perform operations across databases and containers to process a large data dump. Here is the situation:
- I receive a data dump (large, with millions of records) that I import into a database/container (say "a") owned by me.
- I need to read the records one by one, and for each record in the feed I need to:
  - Check for a value from the record in another container (say "b") in another database.
  - If a match is found, read that matching record from container "b".
  - Create a new document in a new container in database "a" with those values.

As you can see, the whole flow above is one unit of work per record. Since we have a huge data dump, I am looking for the most efficient way of handling this.
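The flow above can be sketched with the Azure Cosmos DB Java SDK v4 (azure-cosmos). This is a minimal per-record sketch, not the most efficient approach for millions of records; the endpoint, key, database/container names, and the "lookupKey"/"matched"/"payload" field names are placeholders I am assuming for illustration.

```java
import com.azure.cosmos.CosmosAsyncClient;
import com.azure.cosmos.CosmosAsyncContainer;
import com.azure.cosmos.CosmosClientBuilder;
import com.azure.cosmos.models.CosmosQueryRequestOptions;
import com.azure.cosmos.models.SqlParameter;
import com.azure.cosmos.models.SqlQuerySpec;
import com.fasterxml.jackson.databind.node.ObjectNode;

public class CrossContainerFlow {
    public static void main(String[] args) {
        CosmosAsyncClient client = new CosmosClientBuilder()
                .endpoint("<account-endpoint>")   // placeholder
                .key("<account-key>")             // placeholder
                .buildAsyncClient();

        CosmosAsyncContainer source = client.getDatabase("dbA").getContainer("a");
        CosmosAsyncContainer lookup = client.getDatabase("dbB").getContainer("b");
        CosmosAsyncContainer target = client.getDatabase("dbA").getContainer("aResults");

        // 1) Stream every record from the source container.
        source.queryItems("SELECT * FROM c", new CosmosQueryRequestOptions(), ObjectNode.class)
              // 2) For each record, look up a matching document in container "b".
              .flatMap(record -> lookup.queryItems(
                          new SqlQuerySpec("SELECT * FROM c WHERE c.key = @k",
                                  new SqlParameter("@k", record.get("lookupKey").asText())),
                          new CosmosQueryRequestOptions(), ObjectNode.class)
                      .next()                      // first match, if any
                      // 3) Merge and write a new document to the target container.
                      .flatMap(match -> {
                          record.set("matched", match.get("payload"));
                          return target.createItem(record);
                      }))
              .blockLast();

        client.close();
    }
}
```

For the volumes described here, the per-record point queries above would become the bottleneck; batching lookups and using the SDK's bulk write support (or the change feed processor discussed in the comments on the original issue) scales much better.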
Why is this not a Bug or a feature Request?
I am not sure if this is feasible, and/or whether other methods exist within the SDK.
Setup (please complete the following information if applicable):
- OS: PCF deployment
- IDE: IntelliJ
- Library/Libraries: Any Java library, preferably spring-data-cosmos
Information Checklist
Kindly make sure that you have added all the following information above and checked off the required fields; otherwise we will treat the issue as an incomplete report.
- Query Added
- Setup information Added
Issue Analytics
- State:
- Created a year ago
- Comments:6 (3 by maintainers)
Top Results From Across the Web

Transactional batch operations in Azure Cosmos DB using the ...
Learn how to use TransactionalBatch in the Azure Cosmos DB .NET or Java SDK to perform a group of point operations that either...

How to do bulk and transactional batch operations ... - YouTube
Matías Quaranta shows Donovan Brown how to do bulk operations with the Azure Cosmos DB .NET SDK to maximize throughput, and how to...

Move multiple documents in bulk with the Azure Cosmos DB ...
The easiest way to learn how to perform a bulk operation is to attempt to push many documents to an Azure Cosmos DB...

Azure Cosmos DB service quotas - GitHub
Azure Cosmos DB supports CRUD and query operations against resources like containers, items, and databases. It also supports transactional batch requests ...

Uses of Package com.azure.cosmos - NET
Represents a batch of operations against items with the same PartitionKey in a container that will be performed in a transactional manner at...
Read more >Top Related Medium Post
No results found
Top Related StackOverflow Question
No results found
Troubleshoot Live Code
Lightrun enables developers to add logs, metrics and snapshots to live code - no restarts or redeploys required.
Start FreeTop Related Reddit Thread
No results found
Top Related Hackernoon Post
No results found
Top Related Tweet
No results found
Top Related Dev.to Post
No results found
Top Related Hashnode Post
No results found
Top GitHub Comments
@bhattacharyyasom change feed processor is definitely a good approach for this. The Spark Connector is a great approach, as Kushagra mentioned, but if you find that working with DataFrames does not give you the level of programmability you need for the "unit of work" you outlined above (or you prefer plain Java), then I recommend using the change feed processor with multiple delegates to process the change feed from container "a" in parallel, custom code in each delegate to handle the matching logic against container "b", and the bulk API to saturate throughput when writing back to database "a". Hope it helps.
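A hedged sketch of the architecture suggested above, assuming azure-cosmos 4.x: a change feed processor reading container "a", per-batch matching against container "b", and the bulk API writing merged results to an output container. The lease container, host name, and the "lookupKey"/"pk" field names are illustrative assumptions, not part of the original issue.

```java
import com.azure.cosmos.ChangeFeedProcessor;
import com.azure.cosmos.ChangeFeedProcessorBuilder;
import com.azure.cosmos.CosmosAsyncContainer;
import com.azure.cosmos.models.CosmosBulkOperations;
import com.azure.cosmos.models.CosmosItemOperation;
import com.azure.cosmos.models.PartitionKey;
import com.fasterxml.jackson.databind.JsonNode;
import reactor.core.publisher.Flux;

public class ChangeFeedPipeline {
    static ChangeFeedProcessor build(CosmosAsyncContainer feed,      // container "a"
                                     CosmosAsyncContainer leases,    // lease container
                                     CosmosAsyncContainer lookup,    // container "b"
                                     CosmosAsyncContainer target) {  // output container in db "a"
        return new ChangeFeedProcessorBuilder()
                .hostName("worker-1")      // unique per instance; add hosts to scale out
                .feedContainer(feed)
                .leaseContainer(leases)
                .handleChanges(docs -> {
                    // Match each changed document against container "b", then
                    // write the merged results back with a single bulk call.
                    Flux<CosmosItemOperation> ops = Flux.fromIterable(docs)
                        .flatMap(doc -> lookup
                            .readItem(doc.get("lookupKey").asText(),
                                      new PartitionKey(doc.get("lookupKey").asText()),
                                      JsonNode.class)
                            .map(resp -> merge(doc, resp.getItem())))
                        .map(merged -> CosmosBulkOperations.getCreateItemOperation(
                                merged, new PartitionKey(merged.get("pk").asText())));
                    target.executeBulkOperations(ops).blockLast();
                })
                .buildChangeFeedProcessor();
    }

    static JsonNode merge(JsonNode source, JsonNode match) {
        // Placeholder for the user's matching/merge logic.
        return source;
    }
}
```

Start it with `processor.start().subscribe()`; running additional instances with distinct `hostName` values against the same lease container distributes the feed partitions across workers, which is how the "multiple delegates in parallel" part of the suggestion scales out.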
@bhattacharyyasom - change feed processor support is not present in spring-data-cosmos. It is worth looking into our Spark connector for Cosmos DB, which supports heavy data loading, computation, and processing. Our Spark connector supports change feed as well. You can find information on it here: https://docs.microsoft.com/en-us/azure/cosmos-db/sql/sql-api-sdk-java-spark-v3