Cassandra – Column Family
A column family is a NoSQL structure that contains columns as key-value pairs, where each key is mapped to a value. In relational databases, we generally work with tables, in which related values are stored together as a row.
In Cassandra, each column is a tuple (triplet) consisting of a column name, a value, and a timestamp, whereas in an RDBMS this data is stored in the form of a table.
Row
A row is the smallest unit that stores related data in Cassandra, and individual rows together constitute a column family.
Row key: uniquely identifies a row in a column family.
Row: stores pairs of column keys and column values.
Column key: uniquely identifies a column value in a row.
Column value: stores one value or a collection of values.
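The terms above can be pictured as nested mappings. The following is a minimal sketch in plain Python, not Cassandra's actual storage format or API; the names Column, put, and users, the sample values, and the use of plain integers as timestamps are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Any, Dict


@dataclass(frozen=True)
class Column:
    """One column: the (name, value, timestamp) triplet."""
    name: str       # column key, unique within a row
    value: Any      # a single value or a collection of values
    timestamp: int  # write time (a plain integer here, by assumption)


# A row maps column keys to columns; a column family maps row keys to rows.
Row = Dict[str, Column]
ColumnFamily = Dict[str, Row]


def put(cf: ColumnFamily, row_key: str, name: str, value: Any, timestamp: int) -> None:
    """Store one column under the given row key, creating the row if needed."""
    cf.setdefault(row_key, {})[name] = Column(name, value, timestamp)


users: ColumnFamily = {}
put(users, "user1", "name", "Alice", 1)
put(users, "user1", "emails", ["a@example.com", "alice@example.com"], 2)
put(users, "user2", "name", "Bob", 3)

print(users["user1"]["name"].value)  # Alice
print(sorted(users["user2"]))        # ['name']
```

Note that the two rows do not hold the same set of columns: "user1" has an "emails" column and "user2" does not, which a fixed relational schema would not allow without a NULL placeholder.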
Let's start with a basic understanding of the benefit of NoSQL over an RDBMS.
Let's say the data is distributed across nodes: accessing it would be time-consuming if it were saved in table format. It is also inefficient to read all the columns that make up a row in a relational table.
Example:
RDBMS: a table with the columns ID, Name, Age, Gender, City.
NoSQL: the column family is split into two chunks, "ID, Name, Age" and "Gender, City".
Now, if a query needs only the count of males in a particular city, a relational database has to read the entire table.
A NoSQL or distributed data store accesses only the second chunk of the column family ("Gender, City"), as the rest of the information is not required.
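The example can be sketched in a few lines of Python. This is only a toy model of the idea, not how Cassandra executes queries; the sample records, the variable names identity and demographics, and the helper count_males are hypothetical.

```python
# Hypothetical sample data, keyed by ID and split into the two chunks
# from the example: "ID, Name, Age" and "Gender, City".
identity = {
    1: {"Name": "Asha", "Age": 31},
    2: {"Name": "Ravi", "Age": 27},
    3: {"Name": "John", "Age": 45},
    4: {"Name": "Arjun", "Age": 38},
}
demographics = {
    1: {"Gender": "F", "City": "Pune"},
    2: {"Gender": "M", "City": "Pune"},
    3: {"Gender": "M", "City": "Delhi"},
    4: {"Gender": "M", "City": "Pune"},
}


def count_males(chunk: dict, city: str) -> int:
    """Count males in a city by reading only the Gender/City chunk."""
    return sum(
        1
        for columns in chunk.values()
        if columns["Gender"] == "M" and columns["City"] == city
    )


# The identity chunk (Name, Age) is never touched by this query.
print(count_males(demographics, "Pune"))  # 2
```

In the relational layout, the same count would require reading rows that also carry Name and Age, even though the query never uses them.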