
Data Warehouse - An Introduction


A data warehouse is defined as a subject-oriented, integrated, nonvolatile, time-variant collection of data in support of management's decisions. More generally, data warehousing is a collection of decision-support technologies aimed at enabling knowledge workers, such as executives, managers, and analysts, to arrive at better and faster decisions. Data warehouses provide access to data for complex analysis, knowledge discovery, and decision-making. They support high-performance demands on an organization's data and information, and they provide an enormous amount of historical and static data from three tiers:
1. Relational databases
2. Multidimensional OLAP applications
3. Client analysis tools
Data warehouses support several types of applications, such as online analytical processing (OLAP), decision-support systems (DSS), and data mining. OLAP is a term used to describe the analysis of complex data from the data warehouse.
OLAP is a software technology that allows users to easily and quickly analyze and view data from multiple points-of-view. OLAP provides dynamic and multi-dimensional support to executives and managers who need to understand different aspects of the data. Activities that are supported include:
• Analyzing financial trends
• Creating slices of data
• Finding new relationships among the data
• Drilling down into sales statistics
• Doing calculations across different dimensions, where each category of data (such as product, location, sales numbers, or time period) is considered a dimension.
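The slicing, drill-down, and cross-dimension calculations listed above can be illustrated with a small sketch. It is not how any particular OLAP product works internally; the fact rows, the `DIMENSIONS` tuple, and the helpers `slice_facts` and `aggregate` are hypothetical names invented for this example.

```python
from collections import defaultdict

# Hypothetical fact rows: (product, location, period, sales).
# The first three fields are dimensions; the last is the measure.
FACTS = [
    ("laptop", "east", "2023-Q1", 120),
    ("laptop", "west", "2023-Q1", 80),
    ("laptop", "east", "2023-Q2", 150),
    ("phone", "east", "2023-Q1", 200),
    ("phone", "west", "2023-Q2", 170),
]
DIMENSIONS = ("product", "location", "period")


def slice_facts(facts, dimension, value):
    """Slice: keep only the rows where one dimension has a fixed value."""
    idx = DIMENSIONS.index(dimension)
    return [row for row in facts if row[idx] == value]


def aggregate(facts, *dimensions):
    """Sum the sales measure, grouped by the chosen dimensions."""
    idxs = [DIMENSIONS.index(d) for d in dimensions]
    totals = defaultdict(int)
    for row in facts:
        key = tuple(row[i] for i in idxs)
        totals[key] += row[3]
    return dict(totals)


# Summary view: total sales per product.
by_product = aggregate(FACTS, "product")

# Drill-down: the same totals broken out by period.
by_product_and_period = aggregate(FACTS, "product", "period")

# Slice on one location, then summarize the slice by product.
east_by_product = aggregate(slice_facts(FACTS, "location", "east"), "product")

print(by_product)
print(by_product_and_period)
print(east_by_product)
```

Grouping by more dimensions gives a more detailed view (drill-down), while grouping by fewer gives a summary. A real OLAP tool would typically precompute or index such aggregates rather than scan the rows each time.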
There are OLAP tools that use distributed computing capabilities for analyses that require more storage and processing power than can be economically and efficiently located on an individual desktop.
Decision-support systems give an organization's leading decision makers higher-level data for complex and critical decisions. A DSS queries a data warehouse or an OLAP database for relevant information that can be compared in order to make a business decision and predict the impact of that decision.
Finally, data mining is used for knowledge discovery, the process of searching data for unanticipated new knowledge.
Knowledge workers and decision makers use tools ranging from parametric queries to ad hoc queries to data mining. Thus, the access component of the data warehouse must provide support for structured queries (both parametric and ad hoc). Together, these make up a managed query environment.
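The difference between a parametric query and an ad hoc query can be shown with a minimal sketch that uses Python's built-in `sqlite3` module as a stand-in for a warehouse. The `sales` table, its columns, and the function `sales_for_region` are assumptions made for illustration only.

```python
import sqlite3

# In-memory database standing in for the warehouse (illustrative only).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (product TEXT, region TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [
        ("laptop", "east", 120),
        ("laptop", "west", 80),
        ("phone", "east", 200),
    ],
)


def sales_for_region(connection, region):
    """Parametric query: the SQL text is fixed in advance and only the
    parameter value (the region) changes from one run to the next."""
    row = connection.execute(
        "SELECT SUM(amount) FROM sales WHERE region = ?", (region,)
    ).fetchone()
    return row[0]


east_total = sales_for_region(conn, "east")

# Ad hoc query: written on the spot by an analyst for a one-off question.
ad_hoc_rows = conn.execute(
    "SELECT product, SUM(amount) FROM sales GROUP BY product ORDER BY product"
).fetchall()

print(east_total)
print(ad_hoc_rows)
```

A parametric query suits recurring reports because its shape is known ahead of time, whereas an ad hoc query lets the user ask a question nobody planned for.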
