
DataStax Cassandra Administrator Certification Questions and Answers (Practice Questions and Dumps)



Question-: You have created a Cassandra cluster using the “GossipingPropertyFileSnitch”. You have found that one of the nodes in the cluster is placed in the wrong rack. What would you do to fix that?
A. Decommission node and re-add it to the correct rack and datacenter
B. Update the node’s topology and start the node.
C. Update the cassandra.yaml file and restart the node
D. Bring down the cluster and then place the node in correct rack

Answer: A,B

Explanation: If you have placed a node in the wrong rack, you can use one of the following methods to move it to the correct rack.
1. The preferred method is to decommission the node and re-add it to the correct rack and datacenter. This method takes longer than the alternative below because data is first removed from the decommissioned node and the node then receives new data during bootstrapping, whereas the alternative method does both concurrently.
2. The alternative approach is to update the node’s topology and restart the node. Once the node is up, run a full repair on the cluster. However, this method is risky because until the repair is completed, the node may blindly serve requests for data it does not yet have. To mitigate the request-handling problem, start the node with -Dcassandra.join_ring=false, repair it once, then fully join the node to the cluster using the JMX method org.apache.cassandra.db.StorageService.joinRing(). The node will then be less likely to be out of sync with other nodes before it serves any requests. After joining the node to the cluster, repair it again, so that any writes missed during the first repair are captured.
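The preferred method above can be sketched as a runbook. This is a dry-run sketch, not an official procedure: the rack names, config path, and service name are assumptions for a typical packaged install using GossipingPropertyFileSnitch, and the run() helper only echoes each step. Remove the echo to execute the commands for real.

```shell
# Dry-run helper: prints each step instead of executing it.
run() { echo "+ $*"; }

# Step 1: stream this node's data to the rest of the cluster.
run nodetool decommission
# Step 2: stop the drained node (service name is an assumption).
run sudo systemctl stop cassandra
# Step 3: fix the rack in the snitch config (rack names are hypothetical).
run sudo sed -i 's/^rack=rack1/rack=rack2/' /etc/cassandra/cassandra-rackdc.properties
# Step 4: clear old data so the node bootstraps fresh into the correct rack.
run sudo rm -rf /var/lib/cassandra/data
# Step 5: restart; the node re-joins and bootstraps with the corrected topology.
run sudo systemctl start cassandra
```

Decommissioning first means the node never serves reads for data it does not own, which is why this is the safer of the two methods.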



Question-: Your Cassandra cluster is set up across three datacenters, and you want to remove one of the datacenters from the cluster. Which of the following steps, at a minimum, do you have to perform to remove the datacenter from the cluster?
A. No client should write to the nodes in the datacenter being removed.
B. We need to run “nodetool repair --full”.
C. Update the keyspaces so that they no longer replicate to the datacenter being removed.
D. Shut down all the nodes in the datacenter being removed.
E. Run the “nodetool assassinate” command on every node in the datacenter being removed.
F. Restart all the nodes in the remaining two datacenters.

Answer: A,B,C,D,E

Explanation: Removing an entire datacenter involves more steps than removing a single node, so you have to be careful while doing it. If your design keeps the data in only one datacenter (which is a poor design), you should be even more careful. To remove a datacenter you have to perform at least the following steps:
A. Ensure no client writes to the nodes in the datacenter being removed.
B. Run “nodetool repair --full” so the remaining datacenters hold a consistent copy of the data.
C. Update the keyspaces so that they no longer replicate to the datacenter being removed.
D. Shut down all the nodes in the datacenter being removed.
E. Run the “nodetool assassinate” command on every node in the datacenter being removed.

There should be no need to restart the entire cluster. Doing so would affect availability, and Cassandra is designed to operate without downtime. If you have to restart the whole cluster, something is wrong with your architecture or cluster design.
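The steps above can be sketched as a dry-run runbook. The datacenter name (DC2), keyspace name, node IP, and replication factors are all hypothetical; the run() helper only echoes each step, so remove the echo to execute for real.

```shell
# Dry-run helper: prints each step instead of executing it.
run() { echo "+ $*"; }

# A: stop client writes to DC2 first (done at the driver/load-balancer level).
# B: make sure the surviving datacenters hold all the data.
run nodetool repair --full
# C: drop DC2 from the keyspace replication (names and RF are hypothetical).
run cqlsh -e "ALTER KEYSPACE he_keyspace WITH replication = {'class': 'NetworkTopologyStrategy', 'DC1': 3}"
# D: shut down every node in DC2 (run on each DC2 node).
run sudo systemctl stop cassandra
# E: from a surviving node, remove each dead DC2 node's gossip state.
run nodetool assassinate 10.0.2.11
```

Note the ordering: repair and the keyspace change happen while DC2 is still up, so no replica of any row is lost before the datacenter stops serving it.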



Question-: Which of the following statements are true with regard to the “nodetool drain” command?
A. It flushes all the SSTables to the disk
B. It flushes all the memtables to SSTables on disk.
C. It replays data from commit log
D. Cassandra stops listening for connections from clients and other nodes.
E. You should use this command before upgrading a node to a newer version of Cassandra

Answer: B, D, E
Explanation: The “nodetool drain” command flushes all memtables on the node to SSTables on disk, and Cassandra stops listening for connections from clients and other nodes. You need to restart Cassandra on the node after running this command. The command is typically used before upgrading a node to a newer version of Cassandra.
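A typical per-node upgrade sequence using drain can be sketched as follows. This is a dry-run sketch: the package manager, package name, and service name are assumptions, and the run() helper only echoes each step. Remove the echo to execute for real.

```shell
# Dry-run helper: prints each step instead of executing it.
run() { echo "+ $*"; }

# Flush memtables to SSTables and stop accepting connections.
run nodetool drain
# Safe to stop now: the commit log has been flushed by drain.
run sudo systemctl stop cassandra
# Upgrade the package (package manager and package name are assumptions).
run sudo apt-get install -y cassandra
# Restart on the new version.
run sudo systemctl start cassandra
# Rewrite SSTables into the new on-disk format where needed.
run nodetool upgradesstables
```

Draining first is what makes the stop safe: with the memtables flushed and the commit log empty, the node restarts without needing commit-log replay on the new version.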

Related Questions


Question-: You are working as an administrator for a Cassandra database whose data model is for time series data, and you have decided to use the “DateTieredCompactionStrategy”. What are the benefits of compaction with this strategy?

A. It helps in compacting SSTables based on time period.
B. It helps in compacting SSTables based on size.
C. With this you can have better disk usage
D. Your read performance will also increase after compaction
E. There is lesser RAM needed.
F. You can have your data in only one data center


Question-: Whenever compaction happens
A. It always deletes the tombstone data
B. It keeps the tombstone data for up to 3 consecutive compactions, so that read repair can happen.
C. It would delete the tombstone data if the gc grace period has expired.
D. It deletes the tombstone data if it is older than 1hr


Question-: When compaction happens, it picks the partition from both of the old SSTables and merges them. Is it always the case that the new partition segment in the new SSTable is bigger than both of the older partition segments?
A. True
B. False


Question-: When compaction is done, in which of the below cases is the new SSTable partition segment smaller than the older ones?

A. When there are a lot of delete operations on both of the partition segments.
B. When there is a lot of tombstone-marked data in both of the partition segments.
C. When there are a lot of insert operations on both of the partition segments.
D. When there are a lot of UPDATE operations.


Question-: You have a big Cassandra table with an overall size of around M records. You run the following command.

COPY HE_KEYSPACE.TBL_HADOOPEXAM_COURSES TO 'home/hadoopexam/he_courses_data.csv' WITH HEADER=true AND PAGETIMEOUT=40 AND PAGESIZE=20 AND DELIMITER='~';

However, while doing this exercise, you get the below error.

./dump_cassandra.sh: xmalloc: ../../.././lib/sh/strtrans.c:63: cannot allocate XXXXXXXXXX bytes (YYYYYYYY bytes allocated)", "stdout_lines": ["[Sat Jul 13 11:12:24 UTC 2019] Executing the following query:", "COPY HE_KEYSPACE.TBL_HADOOPEXAM_COURSES TO ‘home/hadoopexam/he_courses_data.csv’ with HEADER=true and PAGETIMEOUT=40 and PAGESIZE=20 AND DELIMITER=’~’;"

What is the cause and how can you correct the same?

A. You have to remove PAGETIMEOUT parameter
B. You have to increase the PAGESIZE parameter from 20 to more
C. You have to add BEGINTOKEN and ENDTOKEN parameters
D. You have to add MAXOUTPUTSIZE parameters


Question-: Which of the following helps keep all the data together based on the partition key?
A. Row Cache
B. Key Cache
C. Partition
D. Bloom filter
E. Clustering key