cassandra

{"iq":[{"id":"1",
    "q":" Explain what is Cassandra?",
    "answer":"Cassandra is an open source data storage system developed at Facebook for inbox search and designed for storing and managing large amounts of data across commodity servers. It can server as both\n\nReal time data store system for online applications\n\nAlso as a read intensive database for business intelligence system"
  },
    {"id":"2",
      "q":"What is the use of Cassandra and why to use Cassandra?",
      "answer":"Cassandra was designed to handle big data workloads across multiple nodes without any single point of failure.  The various factors responsible for using Cassandra are\n\nIt is fault tolerant and consistent\n\nGigabytes to petabytes scalabilities\n\nIt is a column-oriented database\n\nNo single point of failure\n\nNo need for separate caching layer\n\nFlexible schema design\n\nIt has flexible data storage, easy data distribution, and fast writes\n\nIt supports ACID (Atomicity, Consistency, Isolation, and Durability)properties\n\nMulti-data center and cloud capable\n\nData compression"
    },
    {"id":"3",
      "q":"What was the design goal of Cassandra?",
      "answer":"The design goal of Cassandra is to handle big data workloads across multiple nodes without any single point of failure"

    },

    {"id":"4",
      "q":"What is NoSQL Database?",

      "answer":"NoSQL database (sometimes called as Not Only SQL) is a database that provides a mechanism to store and retrieve data other than the tabular relations used in relational databases. These databases are schema-free, support easy replication, have simple API, eventually consistent, and can handle huge amounts of data."
    },
    {"id":"5",
      "q":"Cassandra is written in which language?",
      "answer":"Java"
    },
    {"id":"6",
      "q":"How many types of NoSQL databases?",
      "answer":"– Document Stores (MongoDB, Couchbase)\n\n– Key-Value Stores (Redis, Volgemort)\n\n– Column Stores (Cassandra)\n\n– Graph Stores (Neo4j, Graph)"
    },

    {"id":"7",
      "q":"What do you understand by composite type?",
      "answer":"Composite Type is a cool feature of Hector and Cassandra.\n\nIt allow to define a key or a column name with a concatenation of data of different type.\n\nWith Cassandra Unit, you can use Composite Type in 2 places :\n\n– row key\n\n– column name"
    },

    {"id":"8",
      "q":"Mention what are the main components of Cassandra Data Model? ",
      "answer":"The main components of Cassandra Data Model are\n\n– Cluster\n\n– Keyspace\n\n– Column\n\n– Column & Family."
    },
    {"id":"9",
      "q":"What is the relationship between Apache Hadoop, HBase, Hive and Cassandra?",

      "answer":"Apache Hadoop, File Storage, Grid Compute processing via Map Reduce.\n\nApache Hive, SQL like interface on top of hadoop.\n\nApache Hbase, Column Family Storage built like BigTable\n\nApache Cassandra, Column Family Storage build like BigTable with Dynamo topology and consistency."
    },

    {"id":"10",
      "q":"What do you understand by Node in Cassandra?",
      "answer":"Node is the place where data is stored."
    },

    {"id":"11",
      "q":"Data center in Cassandra?",

      "answer":"Data center is a collection of related nodes."
    },

    {"id":"12",
      "q":" What is  Bloom filter in Cassandra?",
      "answer":"Bloom filter are nothing but quick, nondeterministic, algorithms for testing whether an element is a member of a set. It is a special kind of cache. Bloom filters are accessed after every query."
    },
    {"id":"13",
      "q":"What do you understand by CQL?",
      "answer":"User can access Cassandra through its nodes using Cassandra Query Language (CQL). CQL treats the database (Keyspace) as a container of tables. Programmers use cqlsh: a prompt to work with CQL or separate application language drivers.."
    },

    {"id":"14",
      "q":"What is the use of “void close()” method?",
      "answer":"This method is used to close the current session instance.."
    },

    {"id":"15",
      "q":" What are the collection data types provided by CQL?",
      "answer":"List : A list is a collection of one or more ordered elements.\n\n\n\nMap : A map is a collection of key-value pairs.\n\nSet : A set is a collection of one or more elements.."
    },

    {"id":"16",
      "q":"What do you understand by High availability?",
      "answer":"A high availability system is the one that is ready to serve any request at any time. High availability is usually achieved by adding redundancies. So, if one part fails, the other part of the system can serve the request. To a client, it seems as if everything worked fine."
    },

    {"id":"17",
      "q":"What ports does Cassandra use? ",
      "answer":"By default, Cassandra uses 7000 for cluster communication, 9160 for clients (Thrift), and 8080 for JMX. These are all editable in the configuration file or bin/cassandra.in.sh (for JVM options). All ports are TCP."
    },
    {"id":"18",
      "q":"  What are “Seed Nodes” in Cassandra?",
      "answer":"A seed node in Cassandra is a node that is contacted by other nodes when they first start up and join the cluster. A cluster can have multiple seed nodes. Seed node helps the process of bootstrapping for a new node joining a cluster. Its recommended to use the 2 seed node per data center.."
    },

    {"id":"19",
      "q":"  What happens to existing data in my cluster when I add new nodes?",
      "answer":" When a new nodes joins a cluster, it will automatically contact the other nodes in the cluster and copy the right data to itself."
    },

    {"id":"20",
      "q":" I have a row or key cache hit rate of 0.XX123456789 reported by JMX. Is that XX% or 0.XX% ? ",
      "answer":"XX%"
    },
    {
      "id":"21",
      "q":"When to avoid secondary indexes?",
      "answer":" Try not using secondary indexes on columns contain a high count of unique values and that will produce few results.."
    },
    {"id":"22",
      "q":" When to use secondary indexes? ",
      "answer":"You want to query on a column that isn’t the primary key and isn’t part of a composite key. The column you want to be querying on has few unique values (what I mean by this is, say you have a column Town, that is a good choice for secondary indexing because lots of people will be form the same town, date of birth however will not be such a good choice)."
    },

    {"id":"23",
      "q":"Why don’t we use strong for enum property in Objective-C ?",
      "answer":"Because enums aren’t objects, so we don’t specify strong or weak here."
    },{"id":"24",
      "q":"What allows you to combine your commits ?",
      "answer":"git squash."
    },
    {"id":"25",
      "q":" What are secondary indexes?",
      "answer":"Secondary indexes are indexes built over column values. In other words, let’s say you have a user table, which contains a user’s email. The primary index would be the user ID, so if you wanted to access a particular user’s email, you could look them up by their ID. However, to solve the inverse query given an email, fetch the user ID requires a secondary index...."
    },

    {"id":"26",
      "q":"What is bounding box?",
      "answer":"Bounding box is a term used in geometry; it refers to the smallest measure (area or volume) within which a given set of points..."
    },
    {"id":"27",
      "q":"When should you not use Cassandra? OR When to use RDBMS instead of Cassandra?",
      "answer":"Cassandra is based on NoSQL database and does not provide ACID and relational data property. If you have strong requirement of ACID property (for example Financial data), Cassandra would not be a fit in that case. Obviously, you can make work out of it, however you will end up writing lots of application code to handle ACID property and will loose on time to market badly. Also managing that kind of system with Cassandra would be complex and tedious for you.."
    },

    {"id":"28",
      "q":"What does JMX stands for? ",
      "answer":"JMX stands for Java Management Extension"
    },{"id":"29",
      "q":"What do you understand by Thrift?",
      "answer":"Thrift is the name of the RPC client used to communicate with the Cassandra server."

    },
    {"id":"30",
      "q":"What’s Code Coverage ? ",
      "answer":"Code coverage is a metric that helps us to measure the value of our unit tests.."
    },
    {"id":"31",
      "q":" What’s Completion Handler ? ",
      "answer":"Completion handlers are super convenient when our app is making an API call, and we need to do something when that task is done, like updating the UI to show the data from the API call. We’ll see completion handlers in Apple’s APIs like dataTaskWithRequest and they can be pretty handy in your own code."
    },
    {"id":"32",
      "q":"Does Cassandra works on Windows?",
      "answer":"Yes, Cassandra works pretty well on windows.."
    },{"id":"33",
      "q":"What is a keyspace in Cassandra?",
      "answer":"In Cassandra, a keyspace is a namespace that determines data replication on nodes. A cluster consist of one keyspace per node."
    },
    {"id":"34",
      "q":"What is Cassandra Data Model?",
      "answer":" Cassandra Data Model consists of four main components:\n\nCluster: Made up of multiple nodes and keyspaces\n\nKeyspace: a namespace to group multiple column families, especially one per partition\n\nColumn: consists of a column name, value and timestamp\n\nColumnFamily: multiple columns with row key reference.."
    },

    {"id":"35",
      "q":"Explain what is SStable consist of?",
      "answer":"SStable consist of mainly 2 files\n\nIndex file ( Bloom filter & Key offset pairs)\n\nData file (Actual column data)."
    },

    {"id":"36",
      "q":" Explain what is Bloom Filter is used for in Cassandra? ",
      "answer":"A bloom filter is a space efficient data structure that is used to test whether an element is a member of a set. In other words, it is used to determine whether an SSTable has data for a particular row. In Cassandra it is used to save IO when performing a KEY LOOKUP. ."
    },

    {"id":"37",
      "q":"explain how Cassandra writes changed data into commitlog? ",
      "answer":"Cassandra concatenate changed data to commitlog Commitlog acts as a crash recovery log for data Until the changed data is concatenated to commitlog write operation will be never considered successful Data will not be lost once commitlog is flushed out to file.."
    },

    {"id":"38",
      "q":" Explain what is Cassandra-Cqlsh?",
      "answer":"Cassandra-Cqlsh is a query language that enables users to communicate with its database. By using Cassandra cqlsh, you can do following things\n\nDefine a schema\n\nInsert a data and\n\nExecute a query"
    },

    {"id":"39",
      "q":"How Cassandra delete Data? ",
      "answer":"The “SSTables” are immutable. So we can’t remove a row from SSTables.When a row needs to be deleted, the “Cassandra” assigns the column value with a special value called “Tombstone” and when the data is read, “Tombstone” value is considered as “deleted”.So we can say that cannot delete data from the Cassandra database.."
    },

    {"id":"40",
      "q":"What is the use of “void close ()” method? ",
      "answer":"This “void close()” method is used to close the current instance of the session."
    },

    {"id":"41",
      "q":" How Cassandra writes data?",
      "answer":"The Cassandra writes the data in 3 components that is,\n\n1.      Commit-log Write\n\n2.      Memtable Write\\n\n3.      SStable Write"
    },

    {"id": "42",
      "q":" When you can use Alter keyspace?",
      "answer":"The “ALTER KEYSPACE” is used to change the properties like “number of replicas” and “durable write” of a keyspace..."
    },

    {"id":"43",
      "q":" What do you mean by “Data Centre” in Cassandra?",
      "answer":"The Cassandra “Data centre” is a collection of nodes and these nodes have a replica which used to handling the data in case of failure."
    },
    {"id":"44",
      "q":"What is the purpose of UIWindow object?",
      "answer" :"The presentation of one or more views on a screen is coordinated by UIWindow object."
    },
    {"id":"45",
      "q":"What do you understand by Cluster in Cassandra?",
      "answer":"“A Cluster is a container that contains one or more data centres.”A “Cluster” is a container for the key-spaces and the “Cassandra” database is segmented over multiple machines that are work together."
    },{"id":"46",
      "q":"What is a column family in Cassandra?",
      "answer":"The Column family in Cassandra is referred for a collection of rows."
    },
    {"id":"47",
      "q":"What is Thrift?",
      "answer":"The Thrift is the name of the Remote Procedure Call (RPC) client and it used to communicate with the Cassandra server."
    },
    {"id":"48",
      "q":"Explain the concept of Bloom Filter.",
      "answer":"Associated with SSTable, Bloom filter is an off-heap (off the Java heap to native memory) data structure to check whether there is any data available in the SSTable before performing any I/O disk operation.Learn more about Apache Cassandra- A Brief Intro  in this insightful blog now!"
    },
    {"id":"49",
      "q":"Define memtable.",
      "answer":"Similar to table, memtable is in-memory/write-back cache space consisting of content in key and column format. The data in memtable is sorted by key, and each ColumnFamily consist of a distinct memtable that retrieves column data via key. It stores the writes until it is full, and then flushed out."
    },
    {"id":"50",
      "q":" Define the management tools in Cassandra.",
      "answer":"DataStaxOpsCenter: internet-based management and monitoring solution for Cassandra cluster and DataStax. It is free to download and includes an additional Edition of OpsCenterSPM primarily administers Cassandra metrics and various OS and JVM metrics. Besides Cassandra, SPM also monitors Hadoop, Spark, Solr, Storm, zookeeper and other Big Data platforms. The main features of SPM include correlation of events and metrics, distributed transaction tracing, creating real-time graphs with zooming, anomaly detection and heartbeat alerting.."
    }





  ]

}