| |
|
|
|
Da Notes -
70-019 - D
Personal notes on the MCDBA tests. These notes are from the various resources I used to pass the MCDBA exams including the MCDBA sample exams from Transcender.
|
Passing Score: 620, Questions: 50, Time: 150 Min.
|
Da Notes Index
|
|
|
DAO
|
SQL Server supports applications written to the Remote Data Objects (RDO) and Data Access Objects (DAO) APIs. These are object APIs that encapsulate ODBC.
|
Data Characteristics
|
· Data is stored centrally. Converts to the same format in order to be consistent. Includes subject oriented key information. Historical. Read only, fast query and retrieval is the key - need appropriate level of summarization.
· fact table contains only number and key to reference textual data
· dimension table contains descriptions of data in facts table to provide a hierarchy structure
|
Data Structural Changes
|
· data additions can be managed by carefully defining partition filters, and by designing a strategy
to synchronize OLAP and data warehouse data
· Data warehouse errors can be minimized by intensive testing of the data transformation, validation, and scrubbing operations.
· Change the source database and then migrate new data to the data warehouse in a controlled manner
· Fact table changes can affect query accuracy until the cube is processed.
· The refresh data processing method can be used to reload the cube's data and recalculate the
aggregations.
Visibility to clients during changes:
· If a cube is processed online, it remains online until the processing has completed. The online cube is replaced by the new cube version.
· Full process method - online clients will be disconnected from the cube when the switch is made to the new version of the cube, and the clients must individually reconnect to access the new version
· Incremental update or the refresh data method - online clients will not be disconnected from the cube when the processing completes. The new version of the cube will be immediately visible with no break in service. This method uses partitions and affects the cube minimally.
· Rebuild the dimension structure method - If a shared dimension is processed, all cubes that use the dimension will immediately become unavailable to clients and must be processed before they can be used again.
· Incremental update method - If a shared dimension is processed, all cubes that use the dimension remain available to clients and any new members added to the dimension automatically become available to clients when the dimension processing is complete.
|
Data Mart
|
Smaller subset (department or business process) of a data warehouse.
|
Data Mining
|
Finding patterns among the data in a data
warehouse or data mart.
|
Data Storage & Aggregation Wizard
|
Set options for optimal cube performance and data storage.
|
Data Warehousing
|
Historical, consistent data store, including the tools used to query and analyze that data. Many years of information must be queried quickly. Data is organized to support analysis rather than to process real-time transactions.
Benefits provided by data warehousing
· Ability to access enterprise wide data
· Consistent data
· Perform analysis quickly
· Recognition of redundancy of effort
· Discover gaps in business knowledge and processes
· Decrease administration costs
· Empower everyone with information needed to perform effectively
|
Data Warehousing Framework
|
OLAP Services, DTS, English Query, and Microsoft Repository
|
Data Warehousing Vs OLAP
|
OLAP is used to summarize data into multidimensional structures that allow you to
retrieve data quickly.
|
DBCC
|
Database consistency checker SQL Server
|
DBCC - CHECKDB
|
· NOINDEX - Do not check nonclustered indexes for non-system tables (runs faster).
· REPAIR_ALLOW_DATA_LOSS - Performs all repairs. These repairs can result in some data loss.
· REPAIR_FAST - Performs minor repairs that can be done quickly and without risk of data loss.
· REPAIR_REBUILD - Performs all repairs, even long running repairs such as rebuilding indexes.
|
DBCC - REINDEX
|
· DBCC REINDEX (dbname, indexname, fillfactor) - rebuilds one or all indexes on a table.
· By allowing an index to be rebuilt dynamically, indexes enforcing either PRIMARY KEY or UNIQUE constraints can be rebuilt without having to drop and re-create those constraints.
· An index can be rebuilt without knowing the table's structure or constraints, which could occur after a bulk copy of data into the table.
|
Design Considerations
|
Designing a data warehouse
· top down approach - design the big warehouse first, then divide it into smaller marts
· bottom up approach - design the smaller units first, then integrate them into one big warehouse
|
Defining the Technical Architecture for a Solution
|
· Identify which technologies are appropriate for implementation of a given business solution. Technologies include design tools, data transformation tools, storage tools, presentation access tools, management tools, and scheduling tools.
· Choose a data storage architecture.
|
Developing the Logical Design
|
· Identify the sources of data from the operational databases.
· Identify the encoding structure and key structure for integrating all data.
· Identify the filtering requirements for operational data.
· Assess whether a data mart schema should be integrated within the enterprise data warehouse schema.
· Assess the level of detail required for data.
|
Deriving the Physical Design
|
· Assess how a given logical design impacts performance, maintainability, extensibility, scalability, availability, and security.
· Assess whether data should be queried from a relational database or a multidimensional database.
· Choose a schema design for a relational database. Design options include normalized, star, or snowflake.
· Group data into fact tables and dimension tables by applying de-normalization rules.
human factors: use default character plus Unicode data types for all languages, or use same code page for client and server unless they use only the first 128 char
· file group: use file groups to create database across drives to improve performance
· fault tolerant: disk striping with parity, disk mirror
|
Dimension
|
Dimension tables
· 2 table types: Dimension and fact
· contain the detail and describe the fact table
· has fewer rows than fact table
· columns are mainly character type
· conventional dimensions: e.g., time dimension such as month, date.... location dimension base don country, city...etc
· shared dimensions - data mart sharing each other's dimension table
· degenerate dimensions: represent a business event but not a numeric fact, has no association to any dimension table
· junk dimensions: attribute not related to business objective, but is still important
· fact table is much larger then dimension table, so judge the required size based on fact table
Defining Dimensions
· each column in a dimension contributes a level to the dimension, which are ordered by specificity and organized in a hierarchy that allows logical avenues for drill down
· each level contains members that are the values within the column that defines the level
· in tabular browsers, members provide the column headings, row headings, and subheadings by which measures are separated and displayed to cube users
· in graphical browsers, members provide other types of descriptive labels that serve the same function as in tabular browsers
· each dimension table's primary key must join to a foreign key in a cube's fact table or another dimension table, although key columns are not required in the dimension definition
· multiple-hierarchy dimensions provide similar yet alternate views of cube data, as two or more dimensions with names that share the same prefix, followed by a period, with different suffixes
Calculated Members
· dimension member whose value is calculated at run time using an expression that you specify when you define the calculated member
· enables you to add members and measures to a cube without increasing its size
· only the definitions for calculated members are stored, and values are calculated in memory
Member Properties
· attribute of a dimension member
· Used in queries. Additional options when analyzing cube data
· Is the basis of virtual dimensions.
· Values for member properties must be read from a column in the same dimension table as the associated members
· Created in the Dimension editor by associating the column that contains values for the member property with the level that contains the members
Rebuilding Structure
· Track changes to products over time (impact on sales of changing six-pack cola can size from 12 ounces to 11.5 ounces) - Write additional dimension records each time that a change occurs to an attribute for a product. Using this approach, you will be able to track sales over time for each variation of the product. Other approaches would not be appropriate. Using the option to overwrite the dimension record would not keep history of the product attributes, and storing additional attribute values would limit the number of times that product attributes could change. You will need to modify the design of any OLAP cubes.
· For slowly changing dimensions, write another dimension record.
· Required after the structure of the dimension is changed, or that relationships between members in the dimension hierarchy are changed
· when a shared dimension is processed with the rebuild the dimension structure option, all cubes that incorporate the shared dimension immediately become unavailable to users and must be processed before they can be used again
· when a shared dimension's structure is edited and saved but not processed, it will automatically be processed when any cube incorporating the dimension is processed. Any other cubes that incorporate the dimension immediately become unavailable to users and must be processed before they can be used again
· Incrementally update a dimension - updates when new members added but no structural changes are made. Shared dimension remains available to users while the dimension is incrementally updated. Added dimension members are in the cube after the update is complete.
Write-Enabling
· client applications can record changes to the cube's data
· allows end users to explore scenarios by changing cell values and analyzing the effects of the changes on cube data
· end user's change is stored in the write-back table as a difference from the currently displayed value, although original value in the cube is preserved and an audit trail of changes is recorded in the write-back table
· changes can only be made to cells at the lowest level - cells that do not contain aggregated information
· separate write back table facilitates conversion to a partition to permanently incorporate changes into the cube or to discard, which returns the cube to its original state
· end user is permitted to record changes in a cube's write-back table only if the user belongs to a role with read/write permissions assigned to the cube
Dimension Wizard
· Create shared dimensional data.
|
Disaster Recovery
|
The enterprise data warehouse database is the most critical. Recovering this database requires more work and data transformations than recovering the data marts. If you lose the enterprise database, the data marts cannot gather new data until the enterprise data warehouse is back online. Save transformation scripts and DTS packages to rebuild the database if the restore process fails.
The metadata repository for each OLAP server is also critical to recover.
|
DSO
|
Decision Support Objects (DSO) library supplies an object model for any Component Object Model (COM) interface (e.g. Visual C++, Visual Basic, and Visual Basic Scripting Edition.
DSO enables you to create applications that enhance, augment, and automate your OLAP installation . Use DSO within a COM-compliant language to allow users to build cubes on the OLAP Server. Used to automate admin tasks using OLAP manager.
|
DTS
|
TOOLS
· DTS wizard tool can move and transform data from one or more sources to a destination or multiple destinations
· DTS Designer tool - for DBA's to transfer data from multiple sources and for setting up complex workflow
· DTSRUN tool - Execute a DTS package in a batch or scheduled job
DTS lookups - data is transferred one row at a time
Source and destination types can be OLE DB data sources, ODBC data sources, and text files
DTS packages be stored in:
· Microsoft Repository (MSDB table - need sql permission select & read (update to edit). If stored here, , the metadata for the package is available to repository tools and data lineage can be tracked for the package )
· SQL Server (MSDB table - need sql permission select & read (update to edit) )
· COM-structured files (need NTFS read)
Package version - Package ID and version ID are generated when the package is created or updated.
Package passwords - Owner password is required to view, edit and execute. User password can execute a package. If passwords are used, the package is encrypted.
|
|
|
|