Automatic failover with multi region in case of region failure

We are using Cosmos DB with multi-region writes, and the SDK is configured to connect in Direct mode (the default).

Will the SDK automatically fail over to another region if a region fails? EnableTcpConnectionEndpointRediscovery is set to false (the default value).

Today I noticed that after I configured a private endpoint, my application failed to connect to Cosmos DB until I restarted it. It looks like the SDK didn't refresh its endpoints, and public access was disabled once the private endpoint was configured.

I think a similar situation might arise in case of a region failure.

From the documentation:

Multiple IP addresses are created per private endpoint:

  • One for the global (region-agnostic) endpoint of the Azure Cosmos account
  • One for each region where the Azure Cosmos account is deployed

In fact, I can see 3 private IP addresses assigned to the private endpoint:

  • region-agnostic
  • IP for westeurope
  • IP for northeurope

Using nslookup, I can see that the Cosmos DB domain name resolves to the region-agnostic IP address. What will the behavior be if, for example, westeurope fails (the SDK's preferred region is set to westeurope)?

Will it fail over automatically? Should I set EnableTcpConnectionEndpointRediscovery to true?

Issue Analytics

  • State: closed
  • Created 3 years ago
  • Comments: 5 (3 by maintainers)

Top GitHub Comments

1 reaction
ealsur commented, Feb 26, 2021

This is not a normal failover scenario. A 403 with substatus 0 means the request is being blocked by the backend directly; there is no action the client can take, and failover is not expected to work. Either the keys you are using are incorrect or, as the error says, there is a firewall/VPN configuration blocking the request. It could be related to private endpoints being enabled while the client was already initialized, which might block the addresses the client was already using.

I’ll make sure this is covered in the 403 documentation, but it is not related to failover scenarios.

1 reaction
ealsur commented, Feb 25, 2021

@kamilzzz Are you setting the ApplicationRegion or ApplicationPreferredRegions? See https://docs.microsoft.com/en-us/azure/cosmos-db/troubleshoot-sdk-availability for the details on how the SDK behaves.

EnableTcpConnectionEndpointRediscovery is unrelated to regional failover.
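
For reference, a minimal configuration sketch of what setting ApplicationPreferredRegions might look like with the .NET SDK v3 (Microsoft.Azure.Cosmos). The endpoint and key are placeholders, and the region order simply mirrors the westeurope/northeurope setup from the question; this is an assumption about the asker's setup, not their actual code. In the v3 SDK, ApplicationRegion and ApplicationPreferredRegions are alternatives, so only one of them should be set.

```csharp
using System.Collections.Generic;
using Microsoft.Azure.Cosmos;

// Placeholder values (assumptions): substitute your own account endpoint and key.
const string accountEndpoint = "https://<your-account>.documents.azure.com:443/";
const string accountKey = "<your-account-key>";

var options = new CosmosClientOptions
{
    // Direct mode, as described in the question.
    ConnectionMode = ConnectionMode.Direct,

    // Regions in priority order: the first entry is tried first, and later
    // entries are the candidates the SDK can fall back to.
    ApplicationPreferredRegions = new List<string>
    {
        Regions.WestEurope,
        Regions.NorthEurope,
    },

    // EnableTcpConnectionEndpointRediscovery is intentionally left untouched:
    // per the maintainer's comment, it is unrelated to regional failover.
};

using var client = new CosmosClient(accountEndpoint, accountKey, options);
```

The linked troubleshooting page remains the authoritative description of how the SDK picks and retries regions; the sketch only shows where the preference list is supplied.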
