zone tags documentation misleading?

I think there is a typo in this document:

This means that CrateDB will try to allocate shards and their replicas according to the zone tags, so that a shard and its replica are not on a node with the same zone value

Shouldn’t that “not” be removed?

Issue Analytics

  • State: open
  • Created 3 years ago
  • Comments: 6 (4 by maintainers)

Top GitHub Comments

2 reactions
seut commented, May 27, 2021

@nkev

Would we have to change the shard allocation of the example table from 6 to 10 for CrateDB to start allocating to the 2 new nodes in the primary us-east-1 zone?

No, you wouldn’t. CrateDB takes care of rebalancing the shard distribution across all nodes, including new ones. Shard allocation awareness just tries to ensure that no two copies of the same shard are allocated on nodes with the same attribute value. So in the context of multiple zones, it will try to allocate a copy of a shard (a replica) to a different zone (a node with a different attribute value). “Try”, because if no nodes in a different zone are available, it will still allocate the copies in the same zone. To prevent that, the cluster.routing.allocation.awareness.force.zone.values setting must be set.

@norosa

Almost correct, but as mentioned above, CrateDB will try to prevent having multiple copies of the same shard inside the same zone. So it would look like this (CrateDB will try to allocate replica 3 in zone 2):

zone 1:

  • node 1: primary shard 1, replica shard 5
  • node 2: primary shard 2, replica shard 6
  • node 3: primary shard 3, replica shard 4

zone 2:

  • node 4: primary shard 4, replica shard 2, replica shard 3
  • node 5: primary shard 5, primary shard 6, replica shard 1

In general, we should overhaul the multi-zone guide, especially the description of the force awareness setting, because it is slightly wrong.

The 3rd requirement for a Multi-Zone Setup is already achieved by configuring awareness attributes; these settings are also taken into account on queries. So the node one is connected to will prefer shards inside the same zone over shards in different zones.

Force awareness enforces the awareness: the cluster will not merely try to avoid allocating multiple copies of a shard inside the same zone, it will strictly prevent it. If it is configured and one zone drops out, no new shards will be allocated, so the tables contain unassigned shards and go into a YELLOW health state. This is useful, for example, when one zone cannot handle the amount of data of another zone. If force awareness isn’t set in such a case, the zone that is still alive would run out of (storage) resources.
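
The difference between plain awareness and force awareness can be sketched as a toy model in Python. This is not CrateDB's allocator, only the decision rule as described in this comment; the function name place_replica and its parameters are made up for illustration.

```python
from typing import Optional


def place_replica(primary_zone: str, live_zones: set[str], force: bool) -> Optional[str]:
    """Toy model: return the zone chosen for a replica, or None if it stays unassigned."""
    other_zones = sorted(live_zones - {primary_zone})
    if other_zones:
        # Awareness: prefer a zone that does not already hold a copy of the shard.
        return other_zones[0]
    if force:
        # Force awareness: never place a second copy in the same zone,
        # so the replica stays unassigned (table health would be YELLOW).
        return None
    # Plain awareness only "tries": fall back to the primary's own zone.
    return primary_zone


print(place_replica("zone1", {"zone1", "zone2"}, force=False))  # zone2
print(place_replica("zone1", {"zone1"}, force=False))           # zone1
print(place_replica("zone1", {"zone1"}, force=True))            # None
```

The last two calls model the case where zone 2 has dropped out: without force awareness the surviving zone takes on the extra copy, with force awareness the copy is left unassigned.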

1 reaction
nomirose commented, Mar 22, 2021

@nkev, in the modified example you provide, I believe this would happen, if you didn’t change the sharding configuration of the table:

  • the six shards would be distributed across the five nodes, with one node having two primary shards, and all other nodes (four of them) having one primary shard
  • the six replica shards would be distributed across the five nodes, with one node having two replica shards, and all other nodes having one replica shard. no replica shards would be allocated to the node that holds the primary shard. with one exception, no zones would have a duplicate shard (primary or replica)

so, for example:

zone 1:

  • node 1: primary shard 1, replica shard 5, replica shard 6
  • node 2: primary shard 2, replica shard 3
  • node 3: primary shard 3, replica shard 4

zone 2:

  • node 4: primary shard 4, replica shard 2
  • node 5: primary shard 5, primary shard 6, replica shard 1

I believe that CrateDB will try its best to allocate all primary shards and replica shards across the whole cluster, even if it means an uneven distribution, like above

notice that in my example above, zone 1 has as much of an even distribution of shards as possible. however, it is unavoidable that shard 3, primary and replica, are both allocated in zone 1. zone 2 satisfies the requirement to not have any duplicated shards (primary or replica)
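
a quick way to check a proposed layout is to count, per zone, the shards that appear more than once. the Python sketch below is only a bookkeeping aid, not CrateDB code: the helper name duplicated_shards is hypothetical, and the two dictionaries are transcribed by hand from this example and from the layout in the May 27, 2021 comment (primaries and replicas are both recorded by shard number).

```python
from collections import Counter


def duplicated_shards(layout):
    """Map each zone to the sorted shard numbers that have more than one copy in it."""
    result = {}
    for zone, nodes in layout.items():
        counts = Counter(shard for shards in nodes.values() for shard in shards)
        result[zone] = sorted(s for s, n in counts.items() if n > 1)
    return result


# Layout from this comment: zone -> node -> shard numbers (primary or replica).
layout_mar_22 = {
    "zone 1": {"node 1": [1, 5, 6], "node 2": [2, 3], "node 3": [3, 4]},
    "zone 2": {"node 4": [4, 2], "node 5": [5, 6, 1]},
}

# Layout from the May 27, 2021 comment, with replica 3 moved to zone 2.
layout_may_27 = {
    "zone 1": {"node 1": [1, 5], "node 2": [2, 6], "node 3": [3, 4]},
    "zone 2": {"node 4": [4, 2, 3], "node 5": [5, 6, 1]},
}

print(duplicated_shards(layout_mar_22))  # {'zone 1': [3], 'zone 2': []}
print(duplicated_shards(layout_may_27))  # {'zone 1': [], 'zone 2': []}
```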

I would be happy if @seut could confirm

additionally, I agree that some diagrams would help
