
DataStax Cassandra Administrator Certification Questions and Answers (Practice Questions and Dumps)



Question-: You have a Cassandra cluster spanning two data centers. One of the seed nodes in this cluster is down and you have to replace it. What would you do?
A. You would update the cassandra.yaml file on each node and remove the IP of the dead node from the seed list.
B. You would update the cassandra.yaml file on each node and add the IP of the new node to the seed list.
C. You would perform a rolling restart on all nodes so that they become aware of the change to the seed list.
D. You would update the jvm.options file on the new node and set the IP address of the dead node in the replace_address property.

Answer: A,C,D

Explanation: You should not add the new node as a seed node while replacing a dead seed node. All the other options are correct.
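The changes described above can be sketched as follows; the IP addresses and file paths here are hypothetical, so adjust them for your installation:

```shell
# On EVERY node: remove the dead seed's IP (e.g. 10.0.1.10) from the seed
# list in cassandra.yaml -- do NOT add the replacement node as a seed:
#   seed_provider:
#     - class_name: org.apache.cassandra.locator.SimpleSeedProvider
#       parameters:
#         - seeds: "10.0.1.11,10.0.2.10"    # dead node 10.0.1.10 removed
#
# On the NEW node only: point replace_address at the dead node in jvm.options:
echo "-Dcassandra.replace_address=10.0.1.10" >> /etc/cassandra/jvm.options

# Then perform a rolling restart, one node at a time, so the cluster
# picks up the updated seed list:
sudo systemctl restart cassandra
nodetool status    # wait for UN (Up/Normal) before restarting the next node
```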



Question-: What are the functions of the seed nodes?
A. They are contacted during bootstrapping to get the gossip info.
B. Any node can contact a seed node at any time to get the gossip info.
C. Seed nodes are always used when data is read from the cluster.
D. They are also known as coordinator nodes.

Answer: A,B

Explanation: Seed nodes are used for obtaining gossip (cluster) information whenever a node needs it, including during bootstrapping. Any node in the Cassandra cluster can act as a coordinator node: when you read data from the cluster, you contact one of the nodes, and whether or not that node holds the data, it serves the read request by fetching the data from the nodes that have it, thereby becoming the coordinator. Seed nodes are not master nodes or coordinator nodes; there is no concept of a master node in Cassandra.
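The gossip state that seed nodes help distribute can be inspected on any running node with nodetool (a sketch, assuming nodetool is on the PATH of a live node):

```shell
# Dump the gossip state this node currently holds: generation, status,
# schema version, etc., for every peer it has learned about via gossip.
nodetool gossipinfo

# Any node shown here as UN (Up/Normal) can act as the coordinator for a
# client request -- Cassandra has no master node.
nodetool status
```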



Question-: You want to replace a currently running node in the cluster in order to apply a software patch to that node. Which of the following is correct?
A. You would first add a new node and then remove the old node on which the patch should be applied.
B. You would replace the node by using the replace_address property in the jvm.options file.
C. You must make sure that consistency level ONE is used on the old node.
D. You must make sure that consistency level ONE is not used on the old node.

Answer: A,D
Explanation: Replacing a running node may be required for a hardware upgrade, a software patch, etc. There are two main approaches:
A. Add a new node to the cluster, then decommission the old node. Once the new node is up and running, do the following:
a. Note the Host ID of the original node.
b. Using that Host ID, decommission the original node from the cluster with the nodetool decommission command.
c. Run the nodetool cleanup command on all the other nodes in the same datacenter (there is no need to run cleanup in the other datacenter).
B. Replace the currently running node directly and avoid streaming the data twice:
a. If data was written at consistency level ONE, you risk losing it because the node might contain the only copy of a record. Hence, make sure no application uses consistency level ONE.
b. Stop the node that needs to be replaced.
c. Set the replace_address property on the new node to the old node's IP address.
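A command-level sketch of the two approaches (the IP address below is hypothetical):

```shell
# Approach A: add a new node, then remove the old one.
nodetool status          # note the Host ID of the original node
nodetool decommission    # run ON the node being removed; it streams its data away
nodetool cleanup         # then run on every other node in the SAME datacenter

# Approach B: direct replacement, avoiding double streaming.
# 1. Confirm no application writes at consistency level ONE.
# 2. Stop the node being replaced.
# 3. On the new node, set in jvm.options:
#      -Dcassandra.replace_address=10.0.1.10    # dead node's IP
# 4. Start the new node; it takes over the old node's token ranges.
```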

Related Questions


Question-: You are working with a read-heavy database requirement and you decide to use Cassandra's caching mechanism. Which of the following are correct for Cassandra's built-in caching?
A. You can only cache the partition key.
B. You can cache both the partition key and the entire row.
C. When a read happens, it first checks for the key in the partition key cache and then in the row cache.
D. When a read happens, it first checks for the key in the row cache and then in the partition key cache.



Question-: Please map the following

A. SizeTieredCompactionStrategy
B. DateTieredCompactionStrategy
C. LeveledCompactionStrategy

1. Triggers a minor compaction when there are a number of similarly sized SSTables on disk.
2. Stores the data written within a certain period of time in the same SSTable.



Question-: You are working with a Cassandra database, writing around MB of data. While writing, your client application makes sure an acknowledgement is received for each write. Just after the write, the particular node that acknowledged the write request goes down. You query another node and do not find the written data. How is that possible, given that Cassandra already acknowledged the write request?

A. Cassandra cluster is not configured correctly
B. There is a bug in the Cassandra storage engine
C. Data is only written to the SSTable and Memtables of that node
D. Data is only written to Memtable and Commit log of that node
E. Data is only written to Commit log and SSTables


Question-: You know that when you write data to a Cassandra cluster there are various possible places where the data may be written, and when reading the data back Cassandra checks all of these storage locations to retrieve the latest possible data. However, for efficiency it needs to store the data sorted by clustering columns. Which of the following storage locations holds data sorted by clustering column?

A. MemTable
B. SSTable
C. Partition Key Cache
D. Row Cache
E. Commit log


Question-: Please map the following

A. Row Cache
B. Bloom Filter
C. Partition Key Cache
D. Partition Summary
E. Partition Index
F. Compression offset map

1. A subset of the partition data stored on disk in the SSTables is kept in memory.
2. Helps in finding which SSTables may have the requested data.
4. Stores a sampling of the partition index.
5. Stores an index of all partition keys mapped to their offsets.
6. Stores pointers to the exact location on disk where the desired partition data will be found.
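These read-path structures can be observed on a live node; nodetool exposes cache and bloom-filter statistics (a sketch, assuming a running node; the keyspace and table names are hypothetical):

```shell
# Key cache and row cache sizes, capacities, and hit rates for this node:
nodetool info

# Per-table statistics, including bloom filter false positives and
# space used, for a hypothetical table:
nodetool tablestats my_keyspace.my_table
```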



Question-: Please map the following

A. Partition Summary
B. Key Cache
C. SSTables
D. Partition Index

1. Stores the byte offset into the partition index.
2. Stores the byte offset of the most recently accessed records.
4. Stores an index of all partition keys mapped to their offsets.