Question : You are designing a Cassandra cluster that must span three data centers: Geneva, Nevada, and Hyderabad. You can configure a consistency level of LOCAL_QUORUM or ONE. The two primary considerations are (1) being able to satisfy reads locally, without incurring cross-data-center latency, and (2) failure scenarios. Which is the best way to configure replication?
1. Two replicas in each data center, e.g. 2 copies in each of Geneva, Nevada and Hyderabad
2. Three replicas in each data center, e.g. 3 copies in each of Geneva, Nevada and Hyderabad
3. One replica in each data center, e.g. 1 copy in each of Geneva, Nevada and Hyderabad
4. Three replicas in one data center, e.g. Geneva, and a single replica in each of Nevada and Hyderabad
Correct Answer : 2 Explanation: Use NetworkTopologyStrategy when you have (or plan to have) your cluster deployed across multiple data centers. This strategy lets you specify how many replicas you want in each data center. NetworkTopologyStrategy places replicas in the same data center by walking the ring clockwise until reaching the first node in another rack. NetworkTopologyStrategy attempts to place replicas on distinct racks because nodes in the same rack (or similar physical grouping) often fail at the same time due to power, cooling, or network issues.
When deciding how many replicas to configure in each data center, the two primary considerations are (1) being able to satisfy reads locally, without incurring cross-data-center latency, and (2) failure scenarios. The two most common ways to configure multiple-data-center clusters are:
1. Two replicas in each data center: this configuration tolerates the failure of a single node per replication group and still allows local reads at a consistency level of ONE.
2. Three replicas in each data center: this configuration tolerates either the failure of one node per replication group at a strong consistency level of LOCAL_QUORUM, or multiple node failures per data center using consistency level ONE.
Asymmetrical replication groupings are also possible. For example, you can have three replicas in one data center to serve real-time application requests and use a single replica elsewhere for running analytics.
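The failure-tolerance argument above can be checked with a short sketch. It uses the standard quorum formula (quorum = floor(RF/2) + 1) to show how many replicas each option leaves spare for LOCAL_QUORUM reads in one data center:

```python
# Sketch: compare replication factors per data center against LOCAL_QUORUM.
# Uses the standard quorum formula: quorum = floor(RF / 2) + 1.

def local_quorum(replication_factor: int) -> int:
    """Replicas that must respond for a LOCAL_QUORUM read in one data center."""
    return replication_factor // 2 + 1

def tolerated_failures(replication_factor: int) -> int:
    """Node failures per data center that still allow LOCAL_QUORUM reads."""
    return replication_factor - local_quorum(replication_factor)

for rf in (1, 2, 3):
    print(f"RF={rf}: LOCAL_QUORUM needs {local_quorum(rf)}, "
          f"tolerates {tolerated_failures(rf)} failure(s) per DC")
# RF=1 and RF=2 tolerate zero failures at LOCAL_QUORUM;
# only RF=3 survives a node failure while still reading locally.
```

This is why option 2 (three replicas per data center) satisfies both considerations: local reads at LOCAL_QUORUM continue even with one node down in each data center.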
Question : Replication strategy is set during
1. Keyspace creation
2. Cluster creation
3. Table Creation
4. ColumnFamily Creation
Correct Answer : 1 Explanation: Replication strategy is defined per keyspace, and is set during keyspace creation. The Cassandra keyspace is a namespace that defines how data is replicated on nodes. Typically, a cluster has one keyspace per application. Replication is controlled on a per-keyspace basis, so data that has different replication requirements typically resides in different keyspaces. Keyspaces are not designed to be used as a significant map layer within the data model. Keyspaces are designed to control data replication for a set of tables.
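Since replication is declared per keyspace at creation time, data with different replication requirements goes into different keyspaces. A minimal sketch of the corresponding CQL statements, built as strings (the keyspace and data center names are assumptions taken from the earlier example):

```python
# Sketch: replication strategy is set when the keyspace is created,
# one replication map per keyspace.

def create_keyspace_cql(name: str, replicas_per_dc: dict) -> str:
    """Build a CREATE KEYSPACE statement using NetworkTopologyStrategy."""
    opts = {"class": "NetworkTopologyStrategy", **replicas_per_dc}
    pairs = ", ".join(f"'{k}': {v!r}" for k, v in opts.items())
    return f"CREATE KEYSPACE {name} WITH replication = {{{pairs}}};"

# Real-time keyspace: three replicas in every data center.
realtime = create_keyspace_cql(
    "app_data", {"Geneva": 3, "Nevada": 3, "Hyderabad": 3})
# Analytics keyspace: asymmetrical replication, as described earlier.
analytics = create_keyspace_cql(
    "analytics", {"Geneva": 3, "Nevada": 1, "Hyderabad": 1})
print(realtime)
print(analytics)
```

Because each keyspace carries its own replication map, changing replication for one set of tables never affects the other.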
Question : Which of the following components of Cassandra help you to determine which data centers and racks nodes belong to.
1. Network topology
2. Replication Strategy
3. Snitch
4. Partitioner
Correct Answer : 3 Explanation: A snitch defines groups of machines into data centers and racks (the topology) that the replication strategy uses to place replicas. A snitch determines which data centers and racks nodes belong to. Snitches inform Cassandra about the network topology so that requests are routed efficiently, and allow Cassandra to distribute replicas by grouping machines into data centers and racks. Specifically, the replication strategy places the replicas based on the information provided by the snitch. All nodes in a cluster must use the same snitch. Cassandra does its best not to have more than one replica on the same rack (which is not necessarily a physical location).
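As an illustration, with GossipingPropertyFileSnitch each node advertises its own data center and rack from its local cassandra-rackdc.properties file. A node in the Geneva data center from the earlier example might be configured like this (the rack name is an assumption):

```
# cassandra-rackdc.properties (one file per node)
dc=Geneva
rack=rack1
```

Every node gossips these values to the rest of the cluster, which is how the replication strategy learns the topology.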
1. It cannot be done; you have to create a new cluster altogether
2. You can configure another data center with vnodes already enabled and let Cassandra's automatic mechanisms distribute the existing data onto the new nodes.
3. You can configure another Cassandra cluster with vnodes already enabled and let Cassandra's automatic mechanisms distribute the existing data onto the new cluster.