DataStax Cassandra Administrator Certification Questions and Answers (Practice Questions and Dumps)

Question-: In your Cassandra cluster you have nodes, and you want to keep replication factor as . You will be adding one additional node to the cluster after a month. Is keeping this replication factor fine in this case?
A. Yes
B. No

Answer: A
Exp: You can set the replication factor higher than the number of nodes if you plan to add more nodes to the cluster. Otherwise, you should not set it higher than the number of nodes.

Admin only

Question-: Which of the following is the best replication strategy for a production Cassandra cluster setup, assuming you have a Cassandra cluster with nodes spread across datacenters in the Europe, North America, and Asia regions?

A. SimpleNetworkStrategy
B. SimpleStrategy
C. DatacenterAwareTopology
D. DatacenterAwareNetworkTopology
E. NetworkTopologyStrategy

Answer: E

Explanation: SimpleStrategy: With this strategy, the first replica is placed on the node chosen by the partitioner, and additional replicas are placed on the next nodes clockwise in the ring, without considering topology such as rack or datacenter location.
NetworkTopologyStrategy: This strategy is recommended for production deployments because it lets you expand your cluster to multiple datacenters and define how many replicas are needed in each datacenter.

NetworkTopologyStrategy also places replicas on different racks within a datacenter. A rack can fail as a unit, so if all the replicas were kept on the same rack, they would all be lost at the same time.
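
The clockwise placement described for SimpleStrategy can be illustrated with a minimal sketch. It is not Cassandra's implementation: the ring layout, the node names, and the function name `simple_strategy_replicas` are all hypothetical, and each node is assumed to own a single token.

```python
from bisect import bisect_left


def simple_strategy_replicas(ring, key_token, replication_factor):
    """Pick replica nodes by walking the ring clockwise from the key's token.

    ring is a list of (token, node_name) pairs, one token per node
    (a hypothetical layout used only for illustration).
    """
    ordered = sorted(ring)
    tokens = [token for token, _ in ordered]
    # First replica: the node owning the first token >= key_token,
    # wrapping around to the start of the ring if there is none.
    start = bisect_left(tokens, key_token) % len(ordered)
    replicas = []
    for step in range(len(ordered)):
        node = ordered[(start + step) % len(ordered)][1]
        if node not in replicas:
            replicas.append(node)
        if len(replicas) == replication_factor:
            break
    return replicas


ring = [(-100, "node_a"), (0, "node_b"), (100, "node_c"), (200, "node_d")]
print(simple_strategy_replicas(ring, key_token=50, replication_factor=3))
```

Note that the walk never looks at racks or datacenters, which is why this strategy is not suited to a multi-datacenter cluster.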

Replication strategy is defined per keyspace, and is set during keyspace creation.
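
As an illustration of setting the strategy at keyspace creation, the sketch below builds a CREATE KEYSPACE statement for NetworkTopologyStrategy. The helper name `create_keyspace_cql`, the keyspace name, the datacenter names, and the replica counts are assumptions chosen for the example; in a real cluster the datacenter names must match the names reported by the snitch.

```python
def create_keyspace_cql(keyspace, replicas_per_dc):
    """Build a CREATE KEYSPACE statement using NetworkTopologyStrategy.

    replicas_per_dc maps a datacenter name to its replica count.
    """
    parts = ["'class': 'NetworkTopologyStrategy'"]
    for datacenter, replica_count in replicas_per_dc.items():
        parts.append("'" + datacenter + "': " + str(replica_count))
    body = ", ".join(parts)
    return "CREATE KEYSPACE " + keyspace + " WITH replication = {" + body + "};"


# Hypothetical datacenter names and replica counts.
statement = create_keyspace_cql(
    "shop", {"europe": 3, "north_america": 3, "asia": 2}
)
print(statement)
```

The printed statement can then be run from cqlsh or passed to a driver session.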

Admin Only

Question-: Please match the following.

A. Virtual Nodes
B. Single Token Architecture
C. Murmur3Partitioner
D. RandomPartitioner
E. Snitch

1. The possible range of hash value is from 0 to 2^127 -1
2. The range of partition key tokens is from -2^63 to +2^63-1.
3. It uses either the allocation algorithm or the random selection algorithm to specify the number of tokens.
4. You must enter values in the initial_token parameter in the cassandra.yaml file.
5. It can be used to find which datacenters and racks the nodes belong to.

Answer: A-3, B-4, C-2, D-1, E-5
Exp: Token assignment depends on the type of architecture you choose, as described below.
- Virtual nodes: This uses either the allocation algorithm or the random selection algorithm to specify the number of tokens distributed to nodes within the datacenter. All the nodes in a datacenter must use the same algorithm.
- Single token architecture: To ensure data is evenly divided across the nodes in the cluster, you must enter values in the initial_token parameter in the cassandra.yaml file for each node.
Data partitioned with one partitioner cannot be converted to the other partitioner.

About partitioner:
- Murmur3Partitioner: This is the default partitioner. It uses a hashing function to create a 64-bit hash value of the partition key, with a possible range from -2^63 to +2^63-1. It should be used for new clusters because it is more performant than the other partitioners.
- RandomPartitioner: This is still available for backward compatibility. It distributes data evenly across the nodes using an MD5 hash value of the row key. The possible range of hash values is 0 to 2^127-1. It is less performant than Murmur3Partitioner.
Snitch: The snitch determines which datacenters and racks the nodes belong to. It is the snitch's responsibility to inform the database about the network topology so that requests are routed efficiently. The replication strategy then places the replicas of the data based on the information provided by the snitch. All nodes in the cluster must use the same snitch, and you should avoid placing more than one replica on the same rack.
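
For the single token architecture, evenly spaced initial_token values can be derived from the partitioner's hash range. The sketch below is a minimal illustration of that arithmetic; the function name `initial_tokens` is hypothetical, and the node count in the example is an assumption.

```python
def initial_tokens(node_count, partitioner="Murmur3Partitioner"):
    """Return evenly spaced initial_token values, one per node."""
    if partitioner == "Murmur3Partitioner":
        # Tokens range from -2^63 to +2^63-1, so the ring has 2^64 positions.
        ring_size, lowest = 2**64, -(2**63)
    elif partitioner == "RandomPartitioner":
        # Tokens range from 0 to 2^127-1, so the ring has 2^127 positions.
        ring_size, lowest = 2**127, 0
    else:
        raise ValueError("unsupported partitioner: " + partitioner)
    step = ring_size // node_count
    return [lowest + step * i for i in range(node_count)]


# Example: a hypothetical four-node cluster using Murmur3Partitioner.
for token in initial_tokens(4):
    print(token)
```

Each printed value would go into the initial_token parameter of one node's cassandra.yaml file.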

Admin Only

Related Questions

Question-: You are facing a performance issue with your Cassandra cluster. Specifically, you observe that a particular node is not performing as required and the load appears to be unbalanced. Which of the following nodetool commands would be helpful in this case?
A. gossipinfo
B. info
C. gcstat
D. ring
E. assassinate
Answer: D
Exp: "nodetool ring" provides the status of the nodes and information about the ring. In particular, it gives an idea of the load balance and shows whether any nodes are down. If your cluster is not correctly configured, different nodes may show a different ring. You can get the following information from this command:
- It shows information about the tokens in the ring.
- It reports cluster-level and table-level information from the perspective of the queried node.
- It can be used to determine the balance of tokens in the cluster.
- Check the value in the Load column to determine the balance of load.
- If you suspect a hotspot in your ring, you can use the details in this command's output to investigate it.
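
To make "balance of tokens" concrete, the sketch below computes the fraction of the token ring each node owns from a list of token assignments. It is an illustration only, not how nodetool calculates ownership; it assumes Murmur3Partitioner, and the function name `token_ownership`, the node names, and the token values are hypothetical.

```python
RING_SIZE = 2**64  # size of the Murmur3Partitioner token space


def token_ownership(ring):
    """Return {node: fraction of the ring owned}.

    ring is a list of (token, node_name) pairs. Each token owns the
    range from the previous token (exclusive) up to itself (inclusive).
    """
    ordered = sorted(ring)
    owned = {}
    for index, (token, node) in enumerate(ordered):
        previous = ordered[index - 1][0]  # index 0 wraps to the last token
        # A lone token owns the whole ring, hence the "or RING_SIZE".
        span = (token - previous) % RING_SIZE or RING_SIZE
        owned[node] = owned.get(node, 0) + span
    return {node: span / RING_SIZE for node, span in owned.items()}


# Hypothetical unbalanced ring: node_b owns twice as much as the others.
ring = [(-(2**63), "node_a"), (0, "node_b"), (2**62, "node_c")]
print(token_ownership(ring))
```

A node whose fraction is well above the others is a candidate hotspot, which is the kind of imbalance the command's output helps reveal.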

Admin only
Question-: You run the command "nodetool tablestats -H -- keyspace_name". Does the generated output cover all the tables in the provided keyspace across the entire cluster?
A. True
B. False

Question-: Which of the following information is available from the "tablestats" command?
A. It tells us whether the table has been flushed to disk or not.
B. It gives details about the resources consumed by a table.
C. It gives details about the overall size of the database.
D. It tells us the overall uptime of the Cassandra cluster.

Question-: You are working at Acmeshell Inc., where they have node cluster setup. One of the data scientists has reported that queries against a table have been slow for the last couple of hours. Which of the following commands would you use to check this?
A. nodetool tablestats
B. nodetool ring
C. nodetool tablehistograms
D. nodetool gcstats
E. nodetool failuredetector

Question-: You run the following nodetool command:

nodetool tpstats

It generates the initial output shown below. What can you infer from this output?

A. There is an issue with the Compaction process.
B. JVM is unable to flush the memtable.
C. JVM is unable to reclaim the memory space.
D. There are 674 rows which are yet to be processed.
E. There are 674 columns which are yet to be processed.

Question-: Which of the following statements are true with regard to garbage collection (GC) in the Cassandra database?
A. You should have as little GC as possible.
B. You should have as much GC as possible.
C. GC information only tells you how much data is being ignored, and it is not that critical.
D. Using the "nodetool gcstats" command, you get the GC statistics accumulated since the last time the command was executed.

Question-: If you want to confirm that all the nodes have the same schema information, which of the following commands would you use?
A. nodetool ring
B. nodetool nodestats
C. nodetool failuredetect
D. nodetool gossipinfo