Your company decides to start the business of selling toys online and shipping globally. The E-Commerce system that supports the new business will be developed in-house. The development team is evaluating the data persistence solution for transactions. The solution shall support huge volume of transactions and the single point of failure shall be addressed. Which of the following will best address the requirements?
A. Key-value stores that support multi-node parallel computing
B. Snowflake tables in data marts supported by an active-passive server cluster
C. Data warehouse as big data storage with an active-active cluster
D. Attributes and tuples in relations with one primary and one standby server
Kindly be reminded that the suggested answer is for your reference only. It doesn’t matter whether you have the right or wrong answer. What really matters is your reasoning process and justifications.
My suggested answer is D. Attributes and tuples in relations with one primary and one standby server.
It implies NoSQL that key-value stores that support multi-node parallel computing. Key-value, document, graph stores are some of the most well-known NoSQL stores.
Not only SQL
A NoSQL (originally referring to “non SQL” or “non relational”) database provides a mechanism for storage and retrieval of data that is modeled in means other than the tabular relations used in relational databases.
NoSQL databases are increasingly used in big data and real-time web applications. NoSQL systems are also sometimes called “Not only SQL” to emphasize that they may support SQL-like query languages, or sit alongside SQL databases in polyglot persistent architectures.
Lack of True ACID Transaction
Most NoSQL stores lack true ACID transactions, although a few databases have made them central to their designs. Instead, most NoSQL databases offer a concept of “eventual consistency” in which database changes are propagated to all nodes “eventually” (typically within milliseconds) so queries for data might not return updated data immediately or might result in reading data that is not accurate, a problem known as stale reads. Additionally, some NoSQL systems may exhibit lost writes and other forms of data loss. Some NoSQL systems provide concepts such as write-ahead logging to avoid data loss. For distributed transaction processing across multiple databases, data consistency is an even bigger challenge that is difficult for both NoSQL and relational databases. Relational databases “do not allow referential integrity constraints to span databases”.
Business intelligence (BI) comprise the strategies and technologies used by enterprises for the data analysis of business information. BI technologies provide historical, current and predictive views of business operations. Common functions of business intelligence technologies include reporting, online analytical processing, analytics, data mining, process mining, complex event processing, business performance management, benchmarking, text mining, predictive analytics and prescriptive analytics.
In computing, a data warehouse (DW or DWH), also known as an enterprise data warehouse (EDW), is a system used for reporting and data analysis, and is considered a core component of business intelligence. DWs are central repositories of integrated data from one or more disparate sources. They store current and historical data in one single place that are used for creating analytical reports for workers throughout the enterprise.
A data mart is a simple form of a data warehouse that is focused on a single subject (or functional area), hence they draw data from a limited number of sources such as sales, finance or marketing.
OLAP and OLTP
Online analytical processing (OLAP) is characterized by a relatively low volume of transactions. Queries are often very complex and involve aggregations. OLAP databases store aggregated, historical data in multi-dimensional schemas (usually star schemas).
Online transaction processing (OLTP) is characterized by a large number of short on-line transactions (INSERT, UPDATE, DELETE). OLTP systems emphasize very fast query processing and maintaining data integrity in multi-access environments.
A relational database is a digital database based on the relational model of data, as proposed by E. F. Codd in 1970. A software system used to maintain relational databases is a relational database management system (RDBMS). Many relational database systems have an option of using the standard SQL (Structured Query Language) for querying and maintaining the database.
In order for a database management system (DBMS) to operate efficiently and accurately, it must use ACID transactions.
- The main idea of the question is about “huge volume of transactions.” The most important factor of a transaction is integrity. Transactions must meet the requirements of the ACID principle.
- If non-ACID transaction solutions are adopted, the customer’s orders are not guaranteed reliable. It may be missing and not fulfilled. Is it acceptable? I don’t think so.
- The data in the data warehouse or data mart is used for analysis and typically read-only.
- A relational database supports ACID transactions. It is typically classified as the online transaction processing (OLTP) system.
- All four options provided fault tolerance to avoid a single point of failure.