Datastax Cassandra Administrator Certification Questions and Answer (Pratice Questions and Dumps)

Question-: You have been given below table design

CREATE TABLE he_food_chain(
country_code text,
state text,
city text,
cafe_name text,
PRIMARY KEY (country_code, state, city, cafe_name )
);

Which of the following statement is/are correct?

A. All café within a country, state and city reside in the same partition.
B. All café within a country reside in the same partition.
C. Within a country, state and city you would get result order by café name.
D. Within a country you would get result order by state, city and café name.

Answer: B,D

Explanation: As we defined Primary Key as (country_code, state, city, cafe_name) then it means only first column is considered as Partition Key which country_code in this case. Hence, all the data will be partitioned based on the country_code. All the records with the same country will go in the same partition. And remaining three columns state, city and café_name are considered as a clustering key. Hence, they would store data under one partition order by these three columns. If you want to fetch all the café within a city, you have to iterate the result returned based on the country_code. Clustering keys by default consider the ascending order. In this case results are first order by state ascending and then city ascending and finally name of café.

If you want different sorting order than you can design table as below for example.

CREATE TABLE he_food_chain(
country_code text,
state text,
city text,
cafe_name text,
PRIMARY KEY (country_code, state, city, cafe_name )
) WITH CLUSTERING ORDER BY (state desc, city asc, café_name asc) ;

Partition key is not part of the ORDER BY statement because its values are hashed and therefore would not be close to each other in the cluster.

When you run the CQL query, you must include all partition key columns, at a minimum. And then you can apply additional filter by adding each clustering key in the order in which the clustering key appear.

Dev Only

Question-: You have defined below table definition

CREATE TABLE he_food_chain(
country_code text,
state text,
city text,
cafe_name text,
opening_date timestamp,
PRIMARY KEY ((country_code, state, city) , opening_date, cafe_name )
) WITH CLUSTERING ORDER BY (opening_date ASC, cafe_name ASC ) ;

Which of the following is valid query?

A. SELECT * FROM he_food_chain WHERE country_code = 'IND' and state = 'RAJ' and city = 'UDAIPUR'
B. SELECT * FROM he_food_chain WHERE country_code = 'IND' and state = 'RAJ' and city = 'UDAIPUR' and opening_date < '2019-01-01 00:00:00+0200'
C. SELECT * FROM he_food_chain WHERE country_code = 'IND' and state = 'RAJ'
D. SELECT * FROM he_food_chain WHERE country_code = 'IND' and state = 'RAJ' and city = 'UDAIPUR' and cafe_name = 'PRATAP DHABA'

Answer: A,B

Explanation: In Option-3 we are not having city column in where clause which is a partition key. Hence, this is not correct. In option-4 it is using clustering column café_name without considering the preceding clustering key column opening_date. Hence, this is also invalid.

Here, clustering keys column order matters because the clustering keys provide the sort order of the result set. Because of the clustering key’s responsibility for sorting, we know all data matching the first clustering key will be adjacent to all other data matching that clustering key. For example all the café with the same opening date will be grouped together in alphabetical order. Café with different dates will appear in temporal order.

Because it know the order, CQL can truncate section of the partition that don’t match the qury criteria. As clustering key café_name is secondary to clustering ket opening_date, café_name will appear in alphabetical order only for café opened on the same day. Hence, it is not correct to define café_name in CQL without specifying an opening date.

Admin/Dev Both

Question-: Which of the following are correct limitation for the Cassandra database?
A. You should set the ordering of data while table creation only.
B. All data for a single partition must fit on disk in a single node in the cluster.
C. A single column value is limited to 1 MB only.
D. No join or subquery support for aggregation.

Answer: A,B,D

Explanation: Cassandra database is good for BigData storage but it comes with its own limitations which you should always keep in mind as below.

- Cassandra database recommend that you keep number of rows within a partition below 100,000 as well as disk size under 100MB (Please check documentation if this has changed)
- A single column value is limited to 2 GB (And 1 MB is recommended)
- All data for a single partition must fit on disk in a single node in the cluster.
- Ordering is set at table creation time on a per-partition basis. It helps in avoiding clients attempt to sort billions of stored rows on a table.
- Denormalization is encouraged and join or subquery are not supported for aggregations.

Admin and Dev both

Related Questions

Question-: In your Cassandra cluster you have nodes, and you want to keep replication factor as . And you will be adding one additional node to the cluster after a month. In this case keeping replication factor is fine?
A. Yes
B. No

Question-: Which of the following best replication strategy for production Cassandra Cluster setup. Assuming you have Cassandra cluster with nodes, which are across datacenters in Europe, North America and Asia region?

A. SimpleNetworkStrategy
B. SimpleStrategy
C. DatacenterAwareTopology
D. DatacenterAwareNetworkTopology
E. NetworkTopologyStrategy

Question-: Please match the below

A. Virtual Nodes
B. Single Token Architecture
C. Murmur3Partitioner
D. RandomPartitioner
E. Snitch

1. The possible range of hash value is from 0 to 2^127 -1
2. Range of partition key token between -2^63 tp +2^63-1
3. It uses either allocation algorithm or random selection algorithm to specify the number of tokens.
4. You must have to enter the values in the initial_token parameter in the Cassandra.yaml file.
5. It can be used to find which datacenters and racks nodes below to.

Question-: Hinted handoff is repair mechanism when
A. Read fails on a node
B. Write fails on a node
C. Both read and write fails on a node
D. In case of new node is added to cluster. It will be SyncUP using Hinted Handoff

Question-: You have a Cassandra cluster with the two nodes (X and Y) with replication factor is and consistency level is also one. You are writing one row named row key K. Here node X is coordinator node before writing the data node-x goes down, which of the following statement is correct?
A. Node X will not be able to store hint.
B. Node Y has not received data as coordinator node, hence it can not write.
C. Coordinator node will return an UnavailableException error.
D. None of the above

Question-: Read repair does not propagate expired tombstones, nor does it consider expired tombstones when actually repairing data. And during read repair if there is tombstone data that has not been propagated to all replica nodes before gc grace expired, this data may continue to be returned as live data?
A. True
B. False