Question-: Arrange the following steps in the order in which they occur when a node is bootstrapped in a Cassandra cluster. A. The bootstrapping node contacts a seed node. B. The seed node transfers cluster information, such as token ranges, to the joining node. C. SSTables are streamed from existing cluster nodes to the new node. D. The joining node's state changes to normal. E. The new node starts handling read/write requests.
Answer: A,B,C,D,E
Explanation: Bootstrapping means joining a new node to the cluster. While the new node joins, the following happens, in this order: 1. The bootstrapping node contacts a seed node (option A). 2. The seed node transfers cluster information, such as token ranges, to the joining node (option B). 3. Every node that has to hand over SSTables prepares them. 4. The existing nodes stream those SSTables to the new node, and they keep serving read/write requests while the transfer runs (option C). 5. Once all SSTable streaming is done, the joining node's state changes to normal (option D). 6. The new node starts handling read/write requests (option E).
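The ordering above can be sketched as a small state machine. This is only a toy model under stated assumptions: the class, method, and state names are hypothetical and are not Cassandra's internal API. It shows that the node cannot serve requests until it has reached the normal state.

```python
from enum import Enum


class NodeState(Enum):
    # Hypothetical state names used only for this illustration.
    STARTING = "starting"
    JOINING = "joining"
    NORMAL = "normal"


class JoiningNode:
    """Toy model of a bootstrapping node; not Cassandra's implementation."""

    def __init__(self):
        self.state = NodeState.STARTING
        self.log = []

    def contact_seed(self):
        self.log.append("A: contact seed node")

    def receive_cluster_info(self):
        # The node now knows the token ranges it will own.
        self.state = NodeState.JOINING
        self.log.append("B: receive token ranges and cluster info")

    def receive_streams(self):
        self.log.append("C: receive streamed SSTables")

    def mark_normal(self):
        self.state = NodeState.NORMAL
        self.log.append("D: state changes to normal")

    def serve_requests(self):
        # Serving is only allowed once the node is in the normal state.
        if self.state is not NodeState.NORMAL:
            raise RuntimeError("node is not in the normal state yet")
        self.log.append("E: handle read/write requests")


def bootstrap():
    """Run the five steps in the order given by the answer."""
    node = JoiningNode()
    node.contact_seed()
    node.receive_cluster_info()
    node.receive_streams()
    node.mark_normal()
    node.serve_requests()
    return node


node = bootstrap()
print("\n".join(node.log))
```

Calling `serve_requests()` on a fresh `JoiningNode()` raises `RuntimeError`, which mirrors why E must come after D.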
Question-: Is a seed node the same as a coordinator node? A. True B. False
Answer: B
Explanation: In a Cassandra cluster, seed nodes and coordinator nodes are different things. 1. Coordinator node: When you send a read or write request to the cluster, the node you contacted becomes the coordinator for that request. It does not matter whether that node holds the requested data, because the coordinator is responsible for completing the request. If it does not hold the data, it fetches it from a node that does and returns it to the client. So any node you connect to becomes the coordinator for that particular request. 2. Seed nodes: These are specially designated nodes, but they are not a single point of failure. They are useful when you add a new node to the cluster: the new node has to learn the current state of the cluster, and the seed nodes provide that information. Once a seed node has given the current cluster information to the bootstrapping node, its job is done. Hence, seed nodes exist only to provide cluster information to nodes joining the cluster.
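The difference can be illustrated with a toy cluster. This is a minimal sketch, not how Cassandra locates data (a real cluster uses the partitioner and token ranges rather than asking peers in turn); the `Node` class, the node names, and the `SEEDS` list are all hypothetical.

```python
class Node:
    """Toy node that can act as coordinator for any request it receives."""

    def __init__(self, name, data=None):
        self.name = name
        self.data = dict(data or {})
        self.peers = []

    def read(self, key):
        # Coordinator role: answer locally if possible, otherwise
        # fetch the value from a peer that holds it.
        if key in self.data:
            return self.data[key], self.name
        for peer in self.peers:
            if key in peer.data:
                return peer.data[key], peer.name
        raise KeyError(key)


def build_cluster():
    a = Node("node-a", {"k1": "v1"})
    b = Node("node-b", {"k2": "v2"})
    c = Node("node-c")  # holds no data at all
    nodes = [a, b, c]
    for n in nodes:
        n.peers = [p for p in nodes if p is not n]
    return nodes


# Hypothetical seed list: only a joining node consults it to learn the
# cluster state. It plays no part in serving the read below.
SEEDS = ["node-a"]

nodes = build_cluster()
coordinator = nodes[2]  # the client happened to connect to node-c
value, served_by = coordinator.read("k1")
print(coordinator.name, "coordinated; data came from", served_by)
```

Here `node-c` coordinates the request although it stores nothing, while the seed designation of `node-a` is irrelevant to the read path.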
Question-: When you add a new node to a (vnode-enabled) cluster, the token ranges are rearranged. Which of the following removes from the existing nodes the data that is now owned by the new node? A. Run the “nodetool compact” command on the source node and on neighboring nodes that shared the same subrange after the new node is up and running. B. Run the “nodetool clearsnapshot” command on the source node and on neighboring nodes that shared the same subrange after the new node is up and running. C. Run the “nodetool cleanup” command on the source node and on neighboring nodes that shared the same subrange after the new node is up and running. D. Run the “nodetool repair” command on the source node and on neighboring nodes that shared the same subrange after the new node is up and running. E. You don’t have to do anything. Cassandra automatically manages and reshuffles the data.
Answer: C
Explanation: When you add a new node to the cluster, the token ranges are rearranged and the new node starts handling read/write requests for its own token ranges. Once bootstrapping is complete (the new node has joined and is up and running), the old nodes still hold data for ranges they no longer own, and Cassandra does not remove that data automatically. It can be removed by running “nodetool cleanup” on the source node and on the neighboring nodes. If you skip this step, the old data stays on those nodes and keeps counting toward their load. Running the command temporarily increases disk space usage and causes additional disk I/O. Remember that you don’t have to run nodetool cleanup on the new node, only on the source node and the neighboring nodes.
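What cleanup removes can be shown on a toy token ring. The sketch below makes simplifying assumptions that are not in the original: one token per node (no vnodes), a replication factor of 1, and integer tokens. The `owner` and `cleanup` helpers are hypothetical and only imitate the idea behind “nodetool cleanup”.

```python
import bisect


def owner(ring, token):
    """Return the node owning `token`.

    `ring` maps each node's token to its name. A token belongs to the
    first node whose token is >= it, wrapping around at the end.
    """
    tokens = sorted(ring)
    idx = bisect.bisect_left(tokens, token) % len(tokens)
    return ring[tokens[idx]]


def cleanup(node, stored_tokens, ring):
    """Keep only the tokens that `node` still owns under `ring`."""
    return {t for t in stored_tokens if owner(ring, t) == node}


# Before the new node joins: n2 owns the range (100, 200].
ring = {100: "n1", 200: "n2"}
n2_data = {110, 150, 190}
assert all(owner(ring, t) == "n2" for t in n2_data)

# n3 joins with token 160 and takes over (100, 160] from n2.
ring[160] = "n3"

# n2 still stores 110 and 150 on disk until cleanup is run.
stale = {t for t in n2_data if owner(ring, t) != "n2"}
n2_data = cleanup("n2", n2_data, ring)
print("removed:", sorted(stale), "kept:", sorted(n2_data))
```

In this model the new node `n3` has nothing to clean up, which matches the remark that the command is run on the source and neighboring nodes rather than on the new node.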