
Datastax Cassandra Administrator Certification Questions and Answers (Practice Questions and Dumps)



Question-: You are planning to decommission a node from the Cassandra cluster. Which of the following is correct in this case?
A. Token ranges would be assigned to the remaining nodes after the node is decommissioned.
B. All the data would be copied from the decommissioned node to other nodes.
C. All the data would be copied from the other replicas in the cluster.
D. Data would be automatically cleaned from the decommissioned node, and you can bring it back into service if required.

Answer: A, B
Exp: This question concerns the approaches for removing a particular node from the cluster. Suppose you are removing a node; there are two approaches you can follow:
1. nodetool decommission
2. nodetool removenode
In either case, the departing node's token ranges are taken over by the other nodes in the Cassandra cluster, and the data is re-replicated so that it remains properly replicated.
However, the two approaches copy the data differently:
1. With the decommission approach, the data is streamed from the decommissioned node itself.
2. With the removenode command, the data is streamed from the existing replicas in the cluster.
Also remember that when you decommission a node from the Cassandra cluster, its data is not automatically cleaned. If you want to reuse the same node in the cluster, you first have to clean its data yourself; you can then put it back with different token ranges.
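The two removal paths described above correspond to the following commands (a sketch only; the host ID placeholder and data paths are illustrative and depend on your installation, and all commands require a running cluster):

```shell
# Run ON the node that is leaving the cluster: it streams its own
# data to the nodes taking over its token ranges.
nodetool decommission

# Run FROM a live node when the node to remove is already down:
# the surviving replicas stream the data instead.
nodetool removenode <host-id-from-nodetool-status>

# Before reusing a decommissioned machine, wipe its old state
# (default package paths shown):
rm -rf /var/lib/cassandra/data \
       /var/lib/cassandra/commitlog \
       /var/lib/cassandra/saved_caches
```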



Question-: You want to replace an existing node in the cluster with a new node. Hence, you add the new node while keeping the older node in the cluster, and once the data is fully copied to the new node, you remove the old node, which is no longer needed.
A. True
B. False

Answer: B
Exp: In Cassandra, to replace a node you need to follow certain steps. Just adding a new node and removing the old node from the cluster would not work; how the tokens are handled also matters.
Hence, to replace a node in a Cassandra cluster, you start the replacement node with the JVM startup flag -Dcassandra.replace_address_first_boot=<address_of_node_being_replaced>. Once this property is set, the node starts in a hibernate state, during which all the other nodes see it as DOWN (DN), although the node sees itself as UP (UN).
Now the replacing node (the new node) starts and bootstraps the data from the rest of the nodes in the cluster. A replacing node will only receive writes during the bootstrapping phase if it has a different IP address from the node that is being replaced. Once the bootstrapping completes, the node will be marked as UP (UN).
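The replacement flow above can be sketched as follows (the address and the file that holds JVM options are illustrative and depend on the install):

```shell
# On the NEW node, before its very first start, point it at the
# dead node's address (e.g. in cassandra-env.sh):
JVM_OPTS="$JVM_OPTS -Dcassandra.replace_address_first_boot=10.0.0.12"

# Start Cassandra: the node hibernates (other nodes see it as DN),
# bootstraps the dead node's token ranges from the surviving
# replicas, and is marked UP once the bootstrap completes.
sudo service cassandra start

# Monitor the streaming progress from another node:
nodetool netstats
```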



Question-: Please map the below
A. Repair Process
B. Incremental Repair
C. Full Repair
D. Operate/repair all the token ranges replicated by the node on which repair started

1. Synchronizes the data between nodes by comparing their respective datasets for their common token ranges.
2. Repairs the data that has been written since the last repair.
3. Repairs all of the data in the repaired token ranges, regardless of whether it has been repaired before.
4. nodetool repair

Answer: A-1,B-2,C-3, D-4

Explanation: The repair process in Cassandra synchronizes the data between nodes by comparing their respective datasets for their common token ranges, and streams any differences to bring out-of-sync data back in line. There are two types of repair you can run:
1. Incremental
2. Full
Incremental repair, which is the default repair mechanism, only repairs the data that has been written since the previous incremental repair. Once data has been repaired incrementally, it is not included in the next incremental repair again. This makes it a poor option if you have disk corruption, data lost through operator error, or bugs in Cassandra. Hence, you are also required to run a full repair regularly.
By default, repair will operate on all token ranges replicated by the node you are running repair on, which will cause duplicate work if you run it on every node. The -pr flag will only repair the "primary" ranges on a node. Hence you should use the "nodetool repair -pr" command.
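The repair variants discussed above map to these commands (the keyspace name is illustrative, and all of them require a running cluster):

```shell
# Incremental repair (the default): only repairs data written
# since the previous incremental repair.
nodetool repair

# Full repair of only this node's primary token ranges; run it on
# every node so that each range is repaired exactly once.
nodetool repair --full -pr

# Full repair restricted to a single keyspace:
nodetool repair --full hadoopexam
```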

Related Questions


Question-: Which of the following statements are true?
A. A Memtable is maintained on a per-table basis.
B. An SSTable is maintained on a per-table basis.
C. The commit log is shared among tables.
D. The Memtable is shared among tables.
E. The SSTable is shared among tables.


Question-: There is a process called compaction for merging SSTables. Which of the following statements is true?
A. During inserts and updates, the Cassandra engine overwrites existing rows.
B. The engine does not perform deletes by removing the deleted data; instead, the database marks deleted data with a tombstone.
C. During compaction there is a temporary spike in disk space usage as well as disk I/O.
D. The database can read data from the new SSTables even before the compaction process finishes.
E. During compaction there would be a high cache-miss rate.
F. Out-of-date versions of a row may exist on other nodes even after compaction has happened on one node.



Question-: Consider you have set up a Cassandra cluster with a replication factor of 3 to prevent data loss. Which of the following statements is true when you delete a row/data?

A. If one node has a record with the tombstone marker and another node has more recent changes, then a read would not return the data.
B. If one node has a record with the tombstone marker and another node has an older value for the record, then a read will return the data/record.
C. If a client writes a new update to an existing tombstoned record within the grace period, then the existing tombstone record is overwritten.
D. The storage engine uses hinted handoffs to replay the database mutations that a node missed while it was down.



Question-: Select the correct statements with regard to tombstone markers.
A. Inserting or updating data with null values can cause tombstone records to be generated.
B. Tombstones go through the read path.
C. Tombstones go through the write path.
D. Having an excessive number of tombstones can improve the overall performance of the DB.


Question-: You have been given the below database design:

CREATE KEYSPACE hadoopexam WITH replication =
{'class': 'SimpleStrategy', 'replication_factor': '1'} AND durable_writes = true;

CREATE TABLE hadoopexam.price_by_year_and_name (
purchase_year int,
course_name text,
price int,
username text,
PRIMARY KEY ((purchase_year , course_name), price)
) WITH CLUSTERING ORDER BY (price ASC);

Which of the following delete statement will create partition level tombstones?

A. DELETE from hadoopexam.price_by_year_and_name WHERE purchase_year = 2019 AND course_name = 'Apache Spark Scala Training' AND price= 2000;
B. DELETE from hadoopexam.price_by_year_and_name WHERE purchase_year = 2019 AND course_name = 'Apache Spark Scala Training';
C. DELETE from hadoopexam.price_by_year_and_name WHERE purchase_year = 2019 AND course_name = 'Apache Spark Scala Training' AND price> 1999;
D. Partition-level tombstones cannot be created.



Question-: You have a table in Cassandra as below:

hadoopexam.price_by_year_and_name
(
purchase_year int,
course_name text,
price int,
username text
)

Here purchase_year and course_name form the partition key, and a secondary index is created on price. Which of the following statements is applicable here?

A. Selecting all records having price > 1000 causes a single-partition read.
B. The price column will be used for ordering the data at the storage level.
C. Secondary indexes on the price column are stored locally on each node.
D. The price column will not be used for ordering the data at the storage level.