Question-: Please map each of the following terms to its description: A. SSTable B. Compaction C. Tombstone
1. Data written on Disk 2. Marker for column deletion in a Row 3. Used for consolidating data on the disk
A. A-1, B-3, C-2 B. A-1, B-2, C-3 C. A-2, B-3, C-1 D. A-2, B-1, C-3 E. A-3, B-2, C-1
Answer: A Exp: Each time the in-memory structure (the memtable) fills up, its data is written to disk as an SSTable data file. All writes are automatically partitioned and replicated throughout the cluster. Cassandra then periodically consolidates the SSTables using compaction, which discards obsolete data that has been marked for deletion with a tombstone. A tombstone is a marker in a row which indicates that a column is to be deleted.
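The write path described above can be illustrated with a small sketch. This is a toy in-memory model, not Cassandra's implementation: the class name `ToyStore`, the `memtable_limit` parameter, and the `TOMBSTONE` sentinel are all hypothetical names chosen for illustration. It also simplifies compaction by dropping tombstones immediately, whereas a real cluster keeps them for a grace period first.

```python
TOMBSTONE = object()  # hypothetical sentinel standing in for a deletion marker


class ToyStore:
    """Toy model: memtable -> SSTable flush -> compaction."""

    def __init__(self, memtable_limit=2):
        self.memtable_limit = memtable_limit
        self.memtable = {}   # in-memory structure receiving writes
        self.sstables = []   # flushed snapshots, oldest first

    def write(self, key, value):
        self.memtable[key] = value
        # When the in-memory structure is full, write it out as an "SSTable".
        if len(self.memtable) >= self.memtable_limit:
            self.flush()

    def delete(self, key):
        # A delete is just a write of a tombstone marker.
        self.write(key, TOMBSTONE)

    def flush(self):
        if self.memtable:
            self.sstables.append(dict(self.memtable))
            self.memtable = {}

    def read(self, key):
        # Newest data wins: check the memtable, then SSTables newest first.
        for table in [self.memtable] + self.sstables[::-1]:
            if key in table:
                value = table[key]
                return None if value is TOMBSTONE else value
        return None

    def compact(self):
        # Merge SSTables oldest first so newer values overwrite older ones,
        # then discard every entry that is marked with a tombstone.
        merged = {}
        for table in self.sstables:
            merged.update(table)
        live = {k: v for k, v in merged.items() if v is not TOMBSTONE}
        self.sstables = [live] if live else []
```

In this sketch, deleting a key never touches the older SSTable that still holds its value; the value only disappears from disk when `compact()` merges the files and drops the tombstoned entry.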
Question-: Which of the following statements are correct with regard to Cassandra Data Modeling and Architecture? A. In Cassandra, the primary key of a row is optional. B. You can connect to any node in a Cassandra cluster to access the data. C. Typically, a cluster can have only one keyspace. D. Typically, a cluster should have only one keyspace for each application.
Answer: B,D Exp: Cassandra is a partitioned row store in which rows are organized into tables with a mandatory primary key (it is not optional). You can connect to any node in any data center and access the data using CQL. A cluster usually has one keyspace per application, composed of different tables. A client read or write request can be sent to any node in the cluster. When a client connects to a node with a request, that node serves as the coordinator node for that particular client operation. The coordinator node acts as a proxy between the client application and the nodes that own the requested data. It is the coordinator node's responsibility to determine which nodes in the ring should get the request, based on how the cluster is configured.
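The coordinator behaviour described above can be sketched as a toy token ring. This is a simplified illustration under stated assumptions: `ToyRing`, `token`, and `coordinate` are hypothetical names, MD5 is used only as a convenient stand-in hash (it is not a claim about the partitioner a given cluster uses), and replicas are simply the next nodes clockwise on the ring.

```python
import hashlib
from bisect import bisect_right


def token(value: str) -> int:
    """Hash a string to a position on the ring (MD5 is a stand-in here)."""
    return int(hashlib.md5(value.encode()).hexdigest(), 16)


class ToyRing:
    """Toy ring: any node can coordinate; replicas depend only on the key."""

    def __init__(self, nodes, replication_factor=2):
        self.replication_factor = replication_factor
        self.ring = sorted((token(name), name) for name in nodes)
        self.tokens = [t for t, _ in self.ring]

    def replicas(self, partition_key):
        # Find the first node clockwise from the key's token, then walk on
        # until enough distinct replicas have been collected.
        start = bisect_right(self.tokens, token(partition_key)) % len(self.ring)
        count = min(self.replication_factor, len(self.ring))
        return [self.ring[(start + i) % len(self.ring)][1] for i in range(count)]

    def coordinate(self, contact_node, partition_key):
        # The contacted node acts as a proxy: it looks up the owning nodes
        # and would forward the request to them.
        return {
            "coordinator": contact_node,
            "replicas": self.replicas(partition_key),
        }
```

Whichever node the client contacts, the same partition key resolves to the same replica set; only the `coordinator` field changes.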
Question-: Which of the following is/are correct for a Datacenter? A. A Datacenter can be physical or virtual. B. Data cannot be replicated across Datacenters. C. Datacenters must be spread across physical locations. D. A Datacenter must have at least two clusters in it.
Answer: A Exp: Let's first understand a few terms related to Cassandra database setup. Node: This is the basic unit where your data is stored. You can think of it as one computer, laptop, or server. Cluster: A group of nodes which are distributed and connected; a cluster can have one or more nodes, and it can contain a single Datacenter or multiple Datacenters. Datacenter: A group of related nodes in a cluster, kept together for replication purposes. A Datacenter can be either physical or virtual. Using separate Datacenters prevents transactions from being impacted by other workloads and helps lower latency. Depending on the replication factor setting, data can be written to multiple Datacenters. A Datacenter must never span physical locations. A cluster can have multiple Datacenters in it, but a Datacenter cannot have multiple clusters in it. It is, however, possible for more than one Cassandra setup to exist in a single datacenter.
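The node, cluster, and Datacenter relationship above can be modelled in a few lines. This is only a sketch: the `Datacenter` and `Cluster` classes and the `place_replicas` method are hypothetical, and replica selection is reduced to "take the first N nodes of each Datacenter" so that the effect of a per-Datacenter replication factor is easy to see.

```python
from dataclasses import dataclass, field


@dataclass
class Datacenter:
    name: str
    nodes: list  # node names belonging to this datacenter


@dataclass
class Cluster:
    name: str
    datacenters: dict = field(default_factory=dict)

    def add_datacenter(self, dc: Datacenter) -> None:
        # A cluster may hold one or many datacenters.
        self.datacenters[dc.name] = dc

    def place_replicas(self, replication: dict) -> dict:
        """Pick nodes per datacenter from a {datacenter: factor} mapping."""
        placement = {}
        for dc_name, factor in replication.items():
            dc = self.datacenters[dc_name]
            if factor > len(dc.nodes):
                raise ValueError(
                    f"replication factor {factor} exceeds nodes in {dc_name}"
                )
            # Simplification: take the first `factor` nodes of the datacenter.
            placement[dc_name] = dc.nodes[:factor]
        return placement
```

For example, a cluster with Datacenters `dc1` (three nodes) and `dc2` (two nodes) and a replication setting of `{"dc1": 2, "dc2": 1}` ends up with copies of the data in both Datacenters, which is the cross-Datacenter replication that option B wrongly rules out.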