Question-: After compaction completes, in which of the below cases will the new SSTable partition segment be smaller than the older one?
A. When there are a lot of delete operations on both of the partition segments. B. When there is a lot of tombstone-marked data in both the partition segments. C. When there are a lot of insert operations on both the partition segments. D. When there are a lot of UPDATE operations.
Answer: A, B Exp: When there are a lot of deletes, it is likely that there is a lot of tombstone data whose gc_grace_seconds period has expired. Compaction evicts such tombstones along with the data they shadow, which can make the new SSTable partition segment smaller than the older one. In the case of a lot of insert operations, the result would be a bigger partition segment. And in the case of UPDATE operations, there is no such impact on size.
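For illustration, here is a minimal sketch of how a DELETE creates a tombstone that compaction can purge only after gc_grace_seconds has elapsed (the table he_keyspace.tbl_hadoopexam_courses, the course_id key, and the value 101 are assumptions for this example; 864000 seconds, i.e. 10 days, is the default grace period):
-- gc_grace_seconds controls how long tombstones are retained before
-- compaction is allowed to drop them
ALTER TABLE he_keyspace.tbl_hadoopexam_courses WITH gc_grace_seconds = 864000;
-- This DELETE writes a tombstone; once 864000 seconds have passed,
-- compaction can remove both the tombstone and the shadowed data,
-- shrinking the newly written SSTable partition segment
DELETE FROM he_keyspace.tbl_hadoopexam_courses WHERE course_id = 101;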
Admin Only
Question-: You have a big Cassandra table with an overall size of around M records. You run the following command.
COPY HE_KEYSPACE.TBL_HADOOPEXAM_COURSES TO 'home/hadoopexam/he_courses_data.csv' WITH HEADER=true AND PAGETIMEOUT=40 AND PAGESIZE=20 AND DELIMITER='~';
However, while doing this exercise, you get the below error.
./dump_cassandra.sh: xmalloc: ../../.././lib/sh/strtrans.c:63: cannot allocate XXXXXXXXXX bytes (YYYYYYYY bytes allocated)", "stdout_lines": ["[Sat Jul 13 11:12:24 UTC 2019] Executing the following query:", "COPY HE_KEYSPACE.TBL_HADOOPEXAM_COURSES TO 'home/hadoopexam/he_courses_data.csv' with HEADER=true and PAGETIMEOUT=40 and PAGESIZE=20 AND DELIMITER='~';"
What is the cause and how can you correct the same?
A. You have to remove the PAGETIMEOUT parameter B. You have to increase the PAGESIZE parameter from 20 to a higher value C. You have to add the BEGINTOKEN and ENDTOKEN parameters D. You have to add the MAXOUTPUTSIZE parameter
Answer: D
Explanation: Yes, you can use the COPY TO command to copy data from a Cassandra table to a CSV file. However, you also need to know the use of each parameter for almost all the basic commands. The relevant command options are listed below. In the real exam they would not ask about every command, but you should know the frequently used ones; one example is the COPY TO and COPY FROM commands.
PAGESIZE: The page size used while fetching the data. If your PAGESIZE is higher, then PAGETIMEOUT should also be higher.
PAGETIMEOUT: The timeout for fetching each page. If your partition size is large, then you should use a large PAGETIMEOUT. If there is a timeout error, consider increasing this value, which is not the case in the given example.
BEGINTOKEN: The minimum token from which the data export should start.
ENDTOKEN: The maximum token up to which the data should be exported.
MAXREQUESTS: The maximum number of requests processed concurrently.
MAXOUTPUTSIZE: The error shows that the process is not able to allocate enough memory for the data being exported, so you need to tune this parameter. It sets the maximum size of the output file, measured in number of lines; beyond this value, the output file is split into segments, as shown in the corrected command below.
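For example, a corrected version of the command might look as follows (the MAXOUTPUTSIZE value of 1000000 lines is only an illustrative assumption; choose a value that keeps each segment comfortably within available memory):
COPY HE_KEYSPACE.TBL_HADOOPEXAM_COURSES TO 'home/hadoopexam/he_courses_data.csv' WITH HEADER=true AND PAGETIMEOUT=40 AND PAGESIZE=20 AND DELIMITER='~' AND MAXOUTPUTSIZE=1000000;
With this option, cqlsh writes the export as multiple numbered CSV segments instead of one unbounded file, avoiding the memory allocation failure.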
Admin and Dev both
Question-: Which of the following helps keep all the data together based on the partition key? A. Row Cache B. Key Cache C. Partition D. Bloom filter E. Clustering key
Answer: C Exp: Keeping all data with the same partition key together is achieved by the concept of a partition. A partition is created based on the partition key; on a single node, data with the same partition key goes into the same partition. The clustering key only determines how the data within a partition is sorted, based on the clustering columns. The others, Row Cache, Key Cache, and Bloom filter, are in-memory data structures used for locating and retrieving data quickly.
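To make the distinction concrete, here is a minimal hypothetical table definition (the he_keyspace.course_reviews table and its columns are assumptions, not taken from the question):
-- course_id is the partition key: all reviews for one course are stored
-- together in a single partition on a node
-- review_date is the clustering column: rows within that partition are
-- kept sorted by it
CREATE TABLE he_keyspace.course_reviews (
    course_id int,
    review_date timestamp,
    review_text text,
    PRIMARY KEY (course_id, review_date)
);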