
DataStax Cassandra Developer Certification Questions and Answers (Dumps and Practice Questions)



Question : In your Cassandra cluster, you see that most of the data is going to one particular node while the others
sit idle. This is known as...



1. vnode

2. Bad Partitioning

3. Hot Spot

4. Dump spot


Correct Answer : 3
Explanation: When tokens are not evenly assigned across a Cassandra cluster, most writes go to a few specific nodes.
A single node then becomes heavily loaded, and the performance of the whole cluster degrades. This is known as a hot
spot.
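
As a rough illustration of the explanation above, the following Python sketch models a toy ring of 100 tokens with one token per node, and compares an evenly spaced assignment with a skewed one. The ring size, node names, and helper functions are assumptions made for this example; they are not Cassandra APIs.

```python
from bisect import bisect_left

RING_SIZE = 100  # toy token space 0..99 (assumption; the real token range is far larger)

def owner(token, ring):
    """Return the node owning `token`: the first node token >= it, wrapping around the ring."""
    tokens = sorted(ring)
    idx = bisect_left(tokens, token) % len(tokens)
    return ring[tokens[idx]]

def load_share(ring):
    """Fraction of the toy token space that each node owns."""
    counts = {}
    for token in range(RING_SIZE):
        node = owner(token, ring)
        counts[node] = counts.get(node, 0) + 1
    return {node: count / RING_SIZE for node, count in counts.items()}

# Hypothetical token assignments: node token -> node name.
even_ring = {24: "node1", 49: "node2", 74: "node3", 99: "node4"}
skewed_ring = {5: "node1", 10: "node2", 15: "node3", 99: "node4"}

even_shares = load_share(even_ring)
skewed_shares = load_share(skewed_ring)
print(even_shares)
print(skewed_shares)
```

With the skewed assignment, node4 owns 84% of the toy token space, so it would receive most of the writes while the other three nodes stay nearly idle.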




Question : You see that your Cassandra cluster has evenly distributed tokens, but there is still a hot spot problem. How
can it be solved?
1. By adding a new node to the cluster.

2. By rearranging the tokens in the cluster.

3. With the help of virtual nodes.

4. It cannot be solved


Correct Answer : 3
Explanation: Adding a node or reshuffling the token ranges can relieve a hot spot, but neither fixes the problem
permanently, and both can create other issues. The best solution to this problem is to use vnodes (virtual
nodes).




Question : What is true with regard to vnodes (virtual nodes)?

A. Virtual nodes allow us to create individual smaller token ranges per node, and they break up these ranges across the cluster
B. Before Cassandra 3.0, vnodes have 256 ranges per node by default
C. After Cassandra 3.0, there are many more token ranges per node (more than 256)
D. After Cassandra 3.0, there are far fewer token ranges per node (fewer than 256), and the number is configurable by the user

1. A,B,C
2. B,C,D
3. A,C,D
4. A,B,D
5. A,B,C,D

Correct Answer : 4
Explanation: Virtual nodes allow us to create individual smaller ranges per node and break up these ranges
across the cluster, so no single node holds all the data. By default, before 3.0, each node has 256 ranges. After
3.0 the number is much lower, and it is configurable by the user through the num_tokens setting in cassandra.yaml.

Prior to Cassandra 1.2, you had to calculate and assign a single token to each node in a cluster. Each token determined the
node's position in the ring and its portion of data according to its hash value. In Cassandra 1.2 and later, each node is
allowed many tokens. The new paradigm is called virtual nodes (vnodes). Vnodes allow each node to own a large number of small
partition ranges distributed throughout the cluster. Vnodes also use consistent hashing to distribute data but using them
doesn't require token generation and assignment.
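
To make the effect of many tokens per node concrete, here is a minimal Python sketch that gives each node several pseudo-random tokens and measures how much of the ring each node ends up owning. It uses MD5 from the standard library as a stand-in hash, and the node names, token labels, and function names are assumptions for illustration only; Cassandra's own token allocation works differently in detail.

```python
import hashlib

TOKEN_SPACE = 2 ** 128  # size of the MD5 output space, used here as the toy ring

def token_for(label):
    """Derive a deterministic toy token from a text label."""
    return int(hashlib.md5(label.encode("utf-8")).hexdigest(), 16)

def build_ring(nodes, num_tokens):
    """Give every node `num_tokens` tokens scattered around the ring."""
    ring = {}
    for node in nodes:
        for i in range(num_tokens):
            ring[token_for(f"{node}-{i}")] = node
    return ring

def ownership(ring):
    """Fraction of the ring owned by each node (each token owns the range ending at it)."""
    tokens = sorted(ring)
    owned = {}
    for i, tok in enumerate(tokens):
        prev = tokens[i - 1]  # for i == 0 this wraps to the last token
        span = (tok - prev) % TOKEN_SPACE or TOKEN_SPACE  # a lone token owns the whole ring
        owned[ring[tok]] = owned.get(ring[tok], 0) + span
    return {node: span / TOKEN_SPACE for node, span in owned.items()}

nodes = ["node1", "node2", "node3", "node4"]
single_token_shares = ownership(build_ring(nodes, 1))
vnode_shares = ownership(build_ring(nodes, 256))
print(single_token_shares)
print(vnode_shares)
```

With 256 tokens per node, each node's share is the sum of many small ranges, so the shares tend to cluster near one quarter each, without anyone calculating and assigning tokens by hand.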



Related Questions


Question : When you are designing a table in Cassandra, which is most important?
1. How data is structured.

2. How many columns in a row

3. [option text not available]

4. It depends on the query, what query we are going to execute against the data.



Question : You have defined the following table

CREATE TABLE TRINING_COURSE (
id uuid,
course_sequence int,
course_id uuid,
title text,
category text,
trainer text,
PRIMARY KEY (id, course_sequence ) );

Which of the following is a valid query on this table?
1. SELECT * FROM playlists;

2. SELECT album, title FROM TRINING_COURSE WHERE trainer = 'HadoopExam';

3. [option text not available]

4. 1 and 3
5. 1,2 and 3



Question : Select the correct statement?
1. The partition key determines which node stores the data.

2. The partition key is responsible for data distribution across the nodes.

3. [option text not available]

4. 1 and 2

5. 1,2 and 3



Question : Cassandra stores an entire row of data on a node by partition key. If you have too much data in a partition and
want to spread the data over multiple nodes, use a ________________

1. Secondary Index

2. Composite partition key

3. [option text not available]

4. Single Column Primary key
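
The idea behind this question, that rows sharing one partition key all hash to the same place while a wider partition key spreads them out, can be sketched in Python. The sensor data, column names, and the use of MD5 as the hash are hypothetical choices for this example and not what Cassandra does internally.

```python
import hashlib

def token_for(partition_key):
    """Hash a partition key (a tuple of column values) to a toy token."""
    raw = "|".join(str(part) for part in partition_key)
    return int(hashlib.md5(raw.encode("utf-8")).hexdigest(), 16)

# Hypothetical readings for one sensor over seven days: (sensor_id, day, reading).
rows = [("sensor-1", day, 20 + day) for day in range(1, 8)]

# Partition key = sensor_id only: every row hashes to the same token.
single_key_tokens = {token_for((sensor,)) for sensor, day, _ in rows}

# Partition key = (sensor_id, day): each day hashes to its own token.
composite_key_tokens = {token_for((sensor, day)) for sensor, day, _ in rows}

print(len(single_key_tokens), len(composite_key_tokens))
```

In CQL, a partition key made of several columns is declared with an extra pair of parentheses, for example PRIMARY KEY ((sensor_id, day), reading_time), where the table and column names are again hypothetical.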



Question : You have already defined a table named HADOOPEXAM. Now you want to add a unique collection of email addresses for
each primary key. Which of the following is the correct statement?
1. ALTER TABLE HADOOPEXAM ADD emails set<text>;

2. ALTER TABLE HADOOPEXAM ADD emails list<text>;

3. [option text not available]

4. ALTER TABLE HADOOPEXAM ADD email1 text, email2 text, email3 text;



Question : You have defined a table as below

CREATE TABLE TRINING_COURSE (
id uuid,
course_sequence int,
course_id uuid,
title text,
category text,
trainer text,
PRIMARY KEY (id, course_sequence ) );

ALTER TABLE TRINING_COURSE ADD emails set<text>;

Which of the following is the correct CQL to add a new email address for a training course?
1. ADD VALUE TO TRINING_COURSE IN emails = emails + {'hadoopexam@gmail.com'}
WHERE id = 62c36092-82a1-3a00-93d1-46196ee77204 AND course_sequence = 2;


2. UPDATE TRINING_COURSE SET emails = emails + {'hadoopexam@gmail.com'}
WHERE id = 62c36092-82a1-3a00-93d1-46196ee77204 AND course_sequence = 2;


3. [option text not available]
WHERE id = 62c36092-82a1-3a00-93d1-46196ee77204 AND course_sequence = 2;


4. UPDATE TRINING_COURSE SET emails = {emails, {'hadoopexam@gmail.com'}}
WHERE id = 62c36092-82a1-3a00-93d1-46196ee77204 AND course_sequence = 2;