|
| |
| |
|
|
|
Da Notes -
70-019 - O
Personal notes on the MCDBA tests. These notes are from the various resources I used to pass the MCDBA exams including the MCDBA sample exams from Transcender. |
Passing Score: 620, Questions: 50, Time: 150 Min.
|
Da Notes Index
|
|
|
OLAP
|
Provides a multidimensional presentation of data warehouse data.
|
OLAP Vs Data Warehousing
|
OLAP is the technology to summarize data into multidimensional structures that allow you to retrieve data quickly. Data warehousing is the technology of moving enterprise data into one database, often with some summarization.
|
OLAP Vs OLTP
|
OLAP
Analytical - Designed for read-only analysis of historical data
Stores the data in dimensions
Built to be queried
Denormalized format
Heavily indexed
Summarized data
Attribute OLAP OLTP
Consolidated X
Consistent X X
Read-write X
Subject orient X
Historical X
Read-only X
Detailed X
Granular X
Aggregate schemas not always needed. Some tools such as Microsoft OLAP Services build aggregations using an intelligent algorithm.
OLTP
Operational - Designed for real-time inserting, updating, and deleting of current data
Stores data in relational tables
Built for transaction processing
Normalized format
Limited indexing
Detailed data
Disadvantages of OLTP system to do OLAP - It is difficult to design analytical reports; indexes reduce performance, and it is a complex data model.
|
OLAP Cube Architecture
|
· multidimensional representation of detail and summary data
· consists of a data source, dimensions, measures, and partitions
· data source identifies and connects to the database containing the data warehouse data that is the source of data for the cube
· dimensions map data warehouse dimension table information into a hierarchy of levels, and can be created for use in an individual cube or multiple cubes
· virtual dimension maps the properties of members of another dimension into a dimension that can then be used in cubes
· measures which are displayed in rows and columns are to be used to identify the numerical values from the fact table that are summarized for analysis, and form the core of cube information presented to users
· fact table columns can be additive or non-additive, and OLAP Services can use both types as measures
· partitions are the multidimensional storage containers that hold cube data
· user-defined partitions are available only in SQL Server™ OLAP Services Enterprise Edition
· design is based on the analytical requirements of users
· a virtual cube is a logical view of portions of one or more cubes that can be used to join relatively unlike cubes that share a common dimension
· cubes require substantial storage to contain the data and pre-calculated summary information in multidimensional structures - aggregations are pre-calculated summaries of data that provide the mechanism rapid query
· pre-calculation of all possible aggregations in a cube results in the fastest possible response time for all queries at the expense of storage and processing time. If no aggregations are pre-calculated (0%), little storage space is required beyond that necessary to store the base data
· another factor that affects storage requirements is sparsity, which is the amount of empty cells in a cube. SQL Server uses the following techniques to minimize the storage requirement: Storage is not allocated for empty cells, data compression is employed, a sophisticated algorithm designs efficient summary aggregations
Fact table and partition arrangement
· with different fact tables for a partition, all fact tables and dimensions for a cube's partitions must have the same structure as the cube's fact tables and dimensions
· you should ensure that no data is duplicated among the fact tables by using filter, although duplicated data is technically allowed
· with same fact table for multiple partitions, data items should not be used in more than one partition - use filters for all partitions in a cube to extract mutually exclusive data sets from the fact table
Incrementally updating a cube
· appropriate when new data is to be added to a cube
· does not require that the cube be processed.
· does not affect the existing data that has already been processed
· can be performed while users continue to query the cube
· after update, users have access to the additional data without having to disconnect and reconnect
· When would you process, refresh, or incrementally update your cubes?
· Process a cube when the cube is new or when the structure of a cube has changed.
· Update incrementally when new data has been added to the warehouse but existing data did not change.
· Refresh data if the underlying base data has changed.
Merging partitions
· partitions can be merged only if they have the same structure, they are stored in the same mode (MOLAP, HOLAP, or ROLAP), and contain identical aggregation designs
· you can copy the aggregation design from another of the cube's partitions when creating the partition in the Partition wizard to ensure that these partitions have the same aggregation design
· fact tables are not merged automatically when you merge partitions. You must do it manually
· filters of both partitions are ORed together to create a filter for the resulting partition which specifies the set of facts used in the resulting partition
· when merging partitions that use different fact tables, the resulting partition will refer only to the target partition's fact table. Facts from the source partition's fact table must be merged manually
· a merged MOLAP (multidimensional OLAP) partition with an incomplete fact table contains an internally merged copy of fact table data and will operate correctly until it is processed merged; a HOLAP (hybrid OLAP) or ROLAP (relational OLAP) partition with an incomplete fact table contains accurate aggregations, but incomplete facts, which leads to incorrect returned data
· absence of unavailable facts might not be noticed unless a user attempts to drill down to a fact in the unavailable table, or executes a query that requires a fact from the unavailable table
· when merging partitions that were created by specifying data slices in the Partition wizard, the merged partition can contain unexpected incorrect data unless you create a filter that specifies the data in the resultant partition.
· data Slice is specified when you create a partition using the Partition wizard. The wizard attempts to create a filter on the fact table to specify the data to be included in the partition if the level's MemberKeyColumn and MemberNameColumn properties point to the same column. This is the default if the dimension or cube has not been edited
Partitions:
· Partitions are a cube storage unit. Each partition can be stored in a different mode and different servers. Invisible to user. Clustered approach to cube storage
· OLAP Services provides a Partition wizard to assist in creating partitions
· cube may return incorrect results for some queries if a portion of the cube's data is included in more than one of its partitions
· partitions of a cube can be merged
· User defined partitions only in Enterprise edition
Processing Cube
· process - a complete load of the cube, i.e. all dimension and fact table data is read and all specified aggregations are calculated
· changes in the data warehouse schema that affect the structure of cubes require those cubes to have their structure changed and then be processed, while changes in (or additions to) data in the data warehouse do not require cubes to be completely processed
· OLAP Services has Decision Support Objects that allow programmatic access to set up cubes if you are a programmer.
· Designing storage selects aggregations and storage types. Processing a cube creates the selected aggregations on the selected storage type.
· Best way to decide on aggregations for your cube - Let the wizard decide. After the cube is used for a while, the aggregations can be further optimized.
Refreshing a cube's Data
· causes a cube's data to be cleared and reloaded and its aggregations recalculated
· appropriate when the underlying data in the data warehouse has changed but the cube's structure remains the same
· faster, as aggregation tables do not have to be redesigned
· can be performed while users continue to query the cube and users will have access to the updated data without having to disconnect and reconnect
|
OLAP Manager
|
· console application that provides a robust user interface for accessing OLAP servers and the metadata repositories that define multidimensional database structures
· Key functions of OLAP Manager
· Define databases and data sources
· Build and process cubes
· Specify storage options and optimize query performance
· Manage server security
· Browse data sources, shared dimensions, and security roles
|
OLAP Roles Vs NT Groups
|
OLAP Roles are maintained by OLAP Services.
OLAP Roles receive access rights to cubes. NT Groups are placed in OLAP Roles.
NT Groups are maintained by the operating system.
|
OLAP Services
|
· Purpose of OLAP Services
· Store data in multidimensional cubes with pre-calculated values. This speeds up the data queries.
· OLAP Services recognize SQL and MDX dialect
· OLAP Services internally uses the NT computer name for licensing information. Services may fail to start after you rename the Microsoft Windows NT computer name:
· Re-run the SQL Server OLAP Services setup so that it uses the latest Windows NT computer name. Remember to reinstall any service pack that was in use
· Change the Windows NT computer name back to the original computer name
· When using OLAP Services to pull data from SQL Server using trusted or Windows NT authentication security, the MSSQLServerOLAPService service must be configured to run under a domain or local user account; otherwise, OLAP may fail to process dimensions or cubes. By default MSSQLServerOLAPService service runs under a local system account
· Apart from being a member of the OLAP Administrators group, the MSSQLServerOLAPService account must have the appropriate permissions on SQL Server: permissions required will vary depending on the type of storage structure selected:
· When using MOLAP storage, the MSSQLServerOLAPService account must have at least SELECT permissions on source database.
· If you use ROLAP or HOLAP storage, the MSSQLServerOLAPService account must have at least SELECT and CREATE TABLE permissions on source database
· When setting up linked services to OLAP Server, keep in mind that:
· If login is made to SQL Server using SQL Server security login name and password, setup SQL Server services to run under either a local or domain user account rather than using a SYSTEM account. Otherwise, if login is made to SQL Server using Microsoft Windows NT authentication, then SQL Server passes the credentials of this Microsoft Windows NT account to OLAP services
· Either the SQL Server services startup account or the Microsoft Windows NT account should have access to OLAP services. Make this account part of the 'OLAP Administrator' local group on OLAP server computer, or create a database role within OLAP Manager and provide this role with read/write access to the cubes
· SQL Server 7.0 can perform queries against OLE DB providers by using the OPENQUERY or OPENROWSET Transact-SQL functions or by using a query with four-part names including a linked-server name. Query is limited to the abbreviated SELECT syntax supported by OLAP Services, but can include MDX syntax to return "flattened rowsets"
|
OLE DB
|
OLE DB providers make data available by exposing standard COM interfaces that any programming language environment with COM support can use.
OLE DB is a low-level, COM API that is used for accessing data. OLE DB is recommended for developing tools, utilities, or low-level components that need high performance. The OLE DB Provider for SQL Server (SQLOLEDB) is a native, high performance provider that accesses the SQL Server TDS protocol directly.
|
|
|
|
|