Question-: You have node cluster, initially you kept the replication factor as and then after discussing with the architect you added more nodes in the cluster and changed the replication factor to , then which of the following would happen? A. A single node in the cluster would now be able to handle a larger number of tokens. B. The same token would be handled by more than one node. C. The overall storage requirement would increase. D. None of the above
Answer: A, B, C Exp: When the replication factor is greater than one, each piece of data must be copied to more than one node in the cluster. Hence, the same token is handled by more than one node. In this case the replication factor is 3, so 3 nodes in the cluster hold the same token, giving 3 copies of the data across the cluster.
Because the replication factor is 3, the overall storage requirement also increases: every piece of data is stored three times.
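The replica placement described above can be sketched in a few lines. This is only an illustration, assuming a simplified ring with one token per node (no virtual nodes) and SimpleStrategy-style placement, where the replicas are the owning node plus the next nodes clockwise. The node names, token values, and helper names are hypothetical.

```python
from bisect import bisect_left

def replicas_for(token, ring, rf):
    """Return the nodes that store `token`.

    `ring` is a list of (node_token, node_name) pairs sorted by token.
    Assumption: each node owns the range ending at its own token, and the
    remaining replicas are the next nodes clockwise on the ring.
    """
    tokens = [t for t, _ in ring]
    # First node whose token is >= the data token, wrapping around the ring.
    start = bisect_left(tokens, token) % len(ring)
    count = min(rf, len(ring))
    return [ring[(start + i) % len(ring)][1] for i in range(count)]

def total_storage(raw_bytes, rf):
    """Every piece of data is stored rf times across the cluster."""
    return raw_bytes * rf

# Hypothetical four-node ring.
ring = [(0, "node1"), (25, "node2"), (50, "node3"), (75, "node4")]

print(replicas_for(30, ring, rf=1))  # one node handles the token
print(replicas_for(30, ring, rf=3))  # three nodes handle the same token
print(total_storage(100, rf=3))      # storage grows with the replication factor
```

With a replication factor of 1 the token lands on a single node; with 3 the same token is held by three nodes, and the total storage is three times the raw data size.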
Admin and Dev both
Question-: Which of the following combinations from the CAP theorem best fits the Cassandra database? A. Consistency, Availability B. Availability, Partition tolerance C. Partition tolerance, Consistency D. Consistency, Availability, and Partition tolerance.
Answer: B Exp: Cassandra has tunable consistency. However, it is primarily a highly available and partition-tolerant database.
Admin and Dev both
Question-: You have node cluster setups within datacenters. Each datacenter has nodes and dynamic snitch is configured. Also, you have RF= and consistency level is set as LOCAL_QUORUM. What does that mean?
A. A read or write request will be acknowledged to the client once it has achieved quorum from each data center. B. A read or write request will be acknowledged to the client once it has achieved quorum from the data center it is talking to. C. A read or write request will be complete once it has achieved quorum across all the data centers. D. It will check all the copies of data in the cluster before the read replies to the client, and the latest copy of the data will be returned.
Answer: B
Explanation: When you specify the consistency level ALL, a read checks all copies and returns the latest one, and a write is acknowledged as successful only when all copies have been updated. If the program is write heavy, specify write ONE and read ALL; if the program is read heavy, specify read ONE and write ALL. In practice you should avoid ALL, because if any single replica for a query is down, no read or write at consistency level ALL that targets that replica can complete.
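The pairing of write ONE with read ALL (or the reverse) works because the replicas contacted by the read and by the write must overlap. A minimal sketch of that rule, assuming a single data center and the commonly cited condition R + W > RF; the function names are hypothetical and this is not driver code.

```python
def required_replicas(level, rf):
    """Number of replicas that must respond for a consistency level."""
    levels = {"ONE": 1, "QUORUM": rf // 2 + 1, "ALL": rf}
    return levels[level]

def reads_see_latest_write(write_level, read_level, rf):
    """True when the read set and the write set always share a replica."""
    w = required_replicas(write_level, rf)
    r = required_replicas(read_level, rf)
    return r + w > rf

def tolerated_failures(level, rf):
    """How many replicas may be down while the request still succeeds."""
    return rf - required_replicas(level, rf)

rf = 3
print(reads_see_latest_write("ONE", "ALL", rf))  # write-heavy pairing
print(reads_see_latest_write("ALL", "ONE", rf))  # read-heavy pairing
print(reads_see_latest_write("ONE", "ONE", rf))  # no guaranteed overlap
print(tolerated_failures("ALL", rf))             # ALL tolerates no failures
```

The last call shows why ALL is risky in practice: with RF=3 it tolerates zero unavailable replicas, whereas QUORUM tolerates one.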
LOCAL_QUORUM: In this case, a read or write request is acknowledged to the client once it has achieved a quorum (a majority of the replicas, that is, more than 50%; in this case 2, where RF=3) within the data center it is talking to. In this setup, it is crucial that any read following a write queries the same data center. If the read request queries a different data center, that data center may not yet be up to date with the latest data.
EACH_QUORUM: In this case, a read or write request completes only once it has achieved a quorum in each data center. This frees the calling program from the restriction on which data center can be queried for a read following a write. However, latency is higher.
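The difference between the two levels comes down to which data centers must contribute a quorum of acknowledgements. The sketch below is illustrative only: it assumes a quorum is a majority of the replicas in a data center, computed as floor(RF / 2) + 1, and the data center names and the `required_acks` helper are hypothetical.

```python
def quorum(rf):
    """Majority of rf replicas, e.g. 2 when rf is 3."""
    return rf // 2 + 1

def required_acks(level, rf_per_dc, local_dc):
    """Map each data center that must respond to its required ack count.

    `rf_per_dc` maps data center name to its replication factor;
    `local_dc` is the data center the client is talking to.
    """
    if level == "LOCAL_QUORUM":
        # Only the local data center has to reach quorum.
        return {local_dc: quorum(rf_per_dc[local_dc])}
    if level == "EACH_QUORUM":
        # Every data center has to reach its own quorum.
        return {dc: quorum(rf) for dc, rf in rf_per_dc.items()}
    raise ValueError(f"unsupported level in this sketch: {level}")

# Hypothetical two-data-center setup with RF=3 in each.
rf_per_dc = {"dc1": 3, "dc2": 3}
print(required_acks("LOCAL_QUORUM", rf_per_dc, local_dc="dc1"))
print(required_acks("EACH_QUORUM", rf_per_dc, local_dc="dc1"))
```

Under LOCAL_QUORUM only the local data center appears in the result, which is why the request does not wait on a remote site; under EACH_QUORUM every data center appears, which explains the higher latency.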
Hence, we should configure the client code so that it always hits one specific data center to meet quorum, which is the more efficient option. The request then never has to wait for a faraway data center to replicate, which improves overall latency.