Question-: Which of the following statements are correct about Cassandra's underlying storage engine? A. Cassandra follows a read-before-write strategy. B. In most cases, the Cassandra storage engine groups inserts and updates in memory and, at intervals, writes the data to disk in append mode. C. Cassandra sequentially writes immutable files. D. A,B E. B,C
Answer: E Exp: Cassandra avoids reading before writing. Read-before-write can cause large latencies in read performance, among other problems. To avoid it, the storage engine groups inserts and updates in memory and, at intervals, sequentially writes the data to disk in append mode. Once written to disk, the data is immutable and is never overwritten.
Because the Cassandra storage engine writes immutable files sequentially, which avoids write amplification and disk failure, the database accommodates inexpensive consumer SSDs extremely well.
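The idea of writing without reading first can be illustrated with a small toy model. This is only a sketch under simplifying assumptions: the class ToyStore, its methods, and the counter used as a timestamp are all hypothetical and are not part of Cassandra. It shows writes being grouped in memory, flushed as immutable files, and reconciled at read time by taking the newest version.

```python
# Toy model only: append-only writes with "newest version wins" reads.
# ToyStore and its methods are hypothetical names, not Cassandra APIs.
from itertools import count


class ToyStore:
    def __init__(self):
        self._clock = count(1)   # stand-in for write timestamps
        self.memtable = {}       # key -> (timestamp, value), grouped in memory
        self.sstables = []       # immutable tuples of (key, (timestamp, value))

    def write(self, key, value):
        # No lookup of existing data: the new version is simply recorded.
        self.memtable[key] = (next(self._clock), value)

    def flush(self):
        # Write the in-memory batch out as one sorted, immutable "file".
        if self.memtable:
            self.sstables.append(tuple(sorted(self.memtable.items())))
            self.memtable = {}

    def read(self, key):
        # Collect every stored version of the key and return the newest one.
        versions = [v for table in self.sstables for k, v in table if k == key]
        if key in self.memtable:
            versions.append(self.memtable[key])
        if not versions:
            return None
        return max(versions, key=lambda version: version[0])[1]


store = ToyStore()
store.write("user:1", "alice")
store.flush()
store.write("user:1", "alice-updated")   # an update is just another append
store.flush()
latest = store.read("user:1")
```

In this sketch the older file still holds the original value after the update. Nothing is overwritten in place, which is the property the explanation above relies on.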
Admin and Dev both
Question-: Arrange the following steps in the order in which the Cassandra storage engine writes data.
A. Logging data in the commit log B. Writing data to the memtable C. Flushing data from the memtable D. Storing data on disk in SSTables
Answer: A,B,C,D Exp: When a write happens, it is recorded in the commit log and written to the memtable. The commit log is durable: its contents survive even if power fails on the node. The memtable keeps writes in sorted order until it reaches a configurable limit, and is then flushed to an SSTable.
When the memtable is flushed, the database writes the data to disk and also creates a partition index on disk that maps tokens to locations on disk.
We can also flush data manually with the nodetool flush or nodetool drain command. It is recommended to flush the memtable before restarting a node, because this reduces commit log replay time.
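The four steps above can be sketched as a toy write path. This is a simplified illustration, not Cassandra's implementation: ToyNode, memtable_limit, and the dictionary layout of an SSTable are hypothetical, and the memtable here is sorted at flush time rather than kept sorted continuously.

```python
# Toy model only: commit log -> memtable -> flush -> SSTable with an index.
# ToyNode and memtable_limit are hypothetical names, not Cassandra APIs.
class ToyNode:
    def __init__(self, memtable_limit=3):
        self.memtable_limit = memtable_limit   # assumed flush threshold (row count)
        self.commit_log = []                   # append-only log used for replay
        self.memtable = {}                     # token -> row, held in memory
        self.sstables = []                     # flushed, immutable tables

    def write(self, token, row):
        self.commit_log.append((token, row))   # step A: log the write
        self.memtable[token] = row             # step B: write to the memtable
        if len(self.memtable) >= self.memtable_limit:
            self.flush()                       # step C: flush at the limit

    def flush(self):
        if not self.memtable:
            return
        data = tuple(sorted(self.memtable.items()))
        # Partition index: maps each token to its position in the data.
        index = {token: offset for offset, (token, _) in enumerate(data)}
        self.sstables.append({"data": data, "index": index})   # step D
        self.memtable.clear()
        # Flushed writes no longer need replay, so the log can be emptied.
        self.commit_log.clear()


node = ToyNode(memtable_limit=2)
node.write(30, "row-c")
pending_before_flush = len(node.commit_log)
node.write(10, "row-a")                        # reaches the limit, triggers flush
```

After the flush, the commit log in this sketch is empty, which mirrors why flushing before a restart shortens commit log replay.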
Admin only
Question-: There are two tables, Table_A and Table_B, with the following throughput. - Table_A has extremely high throughput - Table_B has very low throughput Which of the following statements are correct with regard to memtables and commit log segments?
A. Commit logs are divided into segments. B. New writes go to a new segment only when the previous segment is filled. C. When the commit log reaches its threshold, it forces Table_B's memtable to be flushed as well. D. A,B E. A,B,C
Answer: E (A,B,C)
Explanation: Commit logs are made up of segments. All writes are recorded in order, and a new segment is created when the existing segment is filled. The engine purges a commit log segment only after all the data in that segment has been flushed to disk from the memtables.
Every commit log segment contains writes from all tables (in this case, from both Table_A and Table_B) as well as from system tables. Because Table_A has high throughput, its memtable fills faster than Table_B's, so Table_B's memtable is flushed less often than Table_A's. When the commit log reaches its threshold, it forces Table_B's memtable to flush and then purges the segments.
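The interaction between the two tables can be modelled with a toy commit log. This is a sketch under stated assumptions: ToyCommitLog, segment_size, max_segments, and forced_flushes are hypothetical names, and sizes are counted in writes rather than bytes. Each segment tracks which tables still have unflushed data in it, and a segment is purged only once that set is empty.

```python
# Toy model only: shared commit log segments and threshold-forced flushes.
# ToyCommitLog and its parameters are hypothetical names, not Cassandra APIs.
class ToyCommitLog:
    def __init__(self, segment_size=4, max_segments=2):
        self.segment_size = segment_size   # assumed writes per segment
        self.max_segments = max_segments   # assumed commit log threshold
        self.segments = [self._new_segment()]
        self.forced_flushes = []           # tables flushed because of the threshold

    @staticmethod
    def _new_segment():
        return {"writes": 0, "dirty": set()}

    def append(self, table):
        active = self.segments[-1]
        if active["writes"] >= self.segment_size:
            # A new segment is opened only when the previous one is full.
            active = self._new_segment()
            self.segments.append(active)
        active["writes"] += 1
        active["dirty"].add(table)
        if len(self.segments) > self.max_segments:
            # Threshold reached: flush every table with data in the oldest segment.
            for name in sorted(self.segments[0]["dirty"]):
                self.forced_flushes.append(name)
                self.flush(name)

    def flush(self, table):
        # A memtable flush persists all of the table's data logged so far.
        for segment in self.segments:
            segment["dirty"].discard(table)
        # Purge closed segments with nothing left to flush; keep the active one.
        closed = [s for s in self.segments[:-1] if s["dirty"]]
        self.segments = closed + [self.segments[-1]]


log = ToyCommitLog(segment_size=4, max_segments=2)
log.append("Table_B")                  # one rare write from the quiet table
for _ in range(3):
    log.append("Table_A")
log.flush("Table_A")                   # Table_A's memtable fills and flushes
for _ in range(4):
    log.append("Table_A")
log.flush("Table_A")
segments_held_by_b = len(log.segments)
log.append("Table_A")                  # opens a third segment, over the threshold
```

In this run, the single Table_B write keeps the oldest segment alive even though Table_A flushes regularly. Only when the log exceeds its threshold is Table_B flushed, after which the old segments are purged.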