The Data Warehouse Toolkit: The Definitive Guide to Dimensional Modeling

Description:
Updated new edition of Ralph Kimball's groundbreaking book on dimensional modeling for data warehousing! The first edition of Ralph Kimball's The Data Warehouse Toolkit introduced the foundation on which the data warehousing industry has been built, and the Toolkit books are now considered the most authoritative guides on dimensional modeling. This new third edition is a complete library of dimensional modeling techniques, the most comprehensive collection ever.
Revised and updated, it covers design techniques that support big data analytics, adds a new chapter on ETL techniques, provides new and enhanced modeling patterns, and much more.
- Authored by Ralph Kimball, known worldwide as an innovator, consultant, and influential thought leader in the field of data warehousing
- Begins with fundamental design recommendations and progresses step by step through increasingly complex scenarios
- Presents unique modeling techniques for e-commerce and common business applications such as inventory management, procurement, invoicing, accounting, and more
- Draws real-world case studies from a variety of industries, including retail sales, financial services, telecommunications, health care, and insurance
Design dimensional databases that are easy to understand and provide fast query response with The Data Warehouse Toolkit: The Definitive Guide to Dimensional Modeling, 3rd Edition.
Other
- Author: Ralph Kimball
- Edition: 3
- Publication date: 2013-07-01
- Printing allowed: 2 pages
- Copying allowed: 10 pages
- Format: Page Fidelity
- ISBN 13: 9781118530771
- Print ISBN: 9781118530801
- ISBN 10: 1118530772
Table of Contents
- Title Page
- Copyright
- Contents
- 1 Data Warehousing, Business Intelligence, and Dimensional Modeling Primer
- Different Worlds of Data Capture and Data Analysis
- Goals of Data Warehousing and Business Intelligence
- Publishing Metaphor for DW/BI Managers
- Dimensional Modeling Introduction
- Star Schemas Versus OLAP Cubes
- Fact Tables for Measurements
- Dimension Tables for Descriptive Context
- Facts and Dimensions Joined in a Star Schema
- Kimball’s DW/BI Architecture
- Operational Source Systems
- Extract, Transformation, and Load System
- Presentation Area to Support Business Intelligence
- Business Intelligence Applications
- Restaurant Metaphor for the Kimball Architecture
- Alternative DW/BI Architectures
- Independent Data Mart Architecture
- Hub-and-Spoke Corporate Information Factory Inmon Architecture
- Hybrid Hub-and-Spoke and Kimball Architecture
- Dimensional Modeling Myths
- Myth 1: Dimensional Models are Only for Summary Data
- Myth 2: Dimensional Models are Departmental, Not Enterprise
- Myth 3: Dimensional Models are Not Scalable
- Myth 4: Dimensional Models are Only for Predictable Usage
- Myth 5: Dimensional Models Can’t Be Integrated
- More Reasons to Think Dimensionally
- Agile Considerations
- Summary
- 2 Kimball Dimensional Modeling Techniques Overview
- Fundamental Concepts
- Gather Business Requirements and Data Realities
- Collaborative Dimensional Modeling Workshops
- Four-Step Dimensional Design Process
- Business Processes
- Grain
- Dimensions for Descriptive Context
- Facts for Measurements
- Star Schemas and OLAP Cubes
- Graceful Extensions to Dimensional Models
- Basic Fact Table Techniques
- Fact Table Structure
- Additive, Semi-Additive, Non-Additive Facts
- Nulls in Fact Tables
- Conformed Facts
- Transaction Fact Tables
- Periodic Snapshot Fact Tables
- Accumulating Snapshot Fact Tables
- Factless Fact Tables
- Aggregate Fact Tables or OLAP Cubes
- Consolidated Fact Tables
- Basic Dimension Table Techniques
- Dimension Table Structure
- Dimension Surrogate Keys
- Natural, Durable, and Supernatural Keys
- Drilling Down
- Degenerate Dimensions
- Denormalized Flattened Dimensions
- Multiple Hierarchies in Dimensions
- Flags and Indicators as Textual Attributes
- Null Attributes in Dimensions
- Calendar Date Dimensions
- Role-Playing Dimensions
- Junk Dimensions
- Snowflaked Dimensions
- Outrigger Dimensions
- Integration via Conformed Dimensions
- Conformed Dimensions
- Shrunken Dimensions
- Drilling Across
- Value Chain
- Enterprise Data Warehouse Bus Architecture
- Enterprise Data Warehouse Bus Matrix
- Detailed Implementation Bus Matrix
- Opportunity/Stakeholder Matrix
- Dealing with Slowly Changing Dimension Attributes
- Type 0: Retain Original
- Type 1: Overwrite
- Type 2: Add New Row
- Type 3: Add New Attribute
- Type 4: Add Mini-Dimension
- Type 5: Add Mini-Dimension and Type 1 Outrigger
- Type 6: Add Type 1 Attributes to Type 2 Dimension
- Type 7: Dual Type 1 and Type 2 Dimensions
- Dealing with Dimension Hierarchies
- Fixed Depth Positional Hierarchies
- Slightly Ragged/Variable Depth Hierarchies
- Ragged/Variable Depth Hierarchies with Hierarchy Bridge Tables
- Ragged/Variable Depth Hierarchies with Pathstring Attributes
- Advanced Fact Table Techniques
- Fact Table Surrogate Keys
- Centipede Fact Tables
- Numeric Values as Attributes or Facts
- Lag/Duration Facts
- Header/Line Fact Tables
- Allocated Facts
- Profit and Loss Fact Tables Using Allocations
- Multiple Currency Facts
- Multiple Units of Measure Facts
- Year-to-Date Facts
- Multipass SQL to Avoid Fact-to-Fact Table Joins
- Timespan Tracking in Fact Tables
- Late Arriving Facts
- Advanced Dimension Techniques
- Dimension-to-Dimension Table Joins
- Multivalued Dimensions and Bridge Tables
- Time Varying Multivalued Bridge Tables
- Behavior Tag Time Series
- Behavior Study Groups
- Aggregated Facts as Dimension Attributes
- Dynamic Value Bands
- Text Comments Dimension
- Multiple Time Zones
- Measure Type Dimensions
- Step Dimensions
- Hot Swappable Dimensions
- Abstract Generic Dimensions
- Audit Dimensions
- Late Arriving Dimensions
- Special Purpose Schemas
- Supertype and Subtype Schemas for Heterogeneous Products
- Real-Time Fact Tables
- Error Event Schemas
- 3 Retail Sales
- Fundamental Concepts
- Four-Step Dimensional Design Process
- Step 1: Select the Business Process
- Step 2: Declare the Grain
- Step 3: Identify the Dimensions
- Step 4: Identify the Facts
- Retail Case Study
- Step 1: Select the Business Process
- Step 2: Declare the Grain
- Step 3: Identify the Dimensions
- Step 4: Identify the Facts
- Dimension Table Details
- Date Dimension
- Product Dimension
- Store Dimension
- Promotion Dimension
- Other Retail Sales Dimensions
- Degenerate Dimensions for Transaction Numbers
- Retail Schema in Action
- Retail Schema Extensibility
- Factless Fact Tables
- Dimension and Fact Table Keys
- Dimension Table Surrogate Keys
- Dimension Natural and Durable Supernatural Keys
- Degenerate Dimension Surrogate Keys
- Date Dimension Smart Keys
- Fact Table Surrogate Keys
- Resisting Normalization Urges
- Snowflake Schemas with Normalized Dimensions
- Outriggers
- Centipede Fact Tables with Too Many Dimensions
- Summary
- 4 Inventory
- Value Chain Introduction
- Inventory Models
- Inventory Periodic Snapshot
- Inventory Transactions
- Inventory Accumulating Snapshot
- Fact Table Types
- Transaction Fact Tables
- Periodic Snapshot Fact Tables
- Accumulating Snapshot Fact Tables
- Complementary Fact Table Types
- Value Chain Integration
- Enterprise Data Warehouse Bus Architecture
- Understanding the Bus Architecture
- Enterprise Data Warehouse Bus Matrix
- Conformed Dimensions
- Drilling Across Fact Tables
- Identical Conformed Dimensions
- Shrunken Rollup Conformed Dimension with Attribute Subset
- Shrunken Conformed Dimension with Row Subset
- Shrunken Conformed Dimensions on the Bus Matrix
- Limited Conformity
- Importance of Data Governance and Stewardship
- Conformed Dimensions and the Agile Movement
- Conformed Facts
- Summary
- 5 Procurement
- Procurement Case Study
- Procurement Transactions and Bus Matrix
- Single Versus Multiple Transaction Fact Tables
- Complementary Procurement Snapshot
- Slowly Changing Dimension Basics
- Type 0: Retain Original
- Type 1: Overwrite
- Type 2: Add New Row
- Type 3: Add New Attribute
- Type 4: Add Mini-Dimension
- Hybrid Slowly Changing Dimension Techniques
- Type 5: Mini-Dimension and Type 1 Outrigger
- Type 6: Add Type 1 Attributes to Type 2 Dimension
- Type 7: Dual Type 1 and Type 2 Dimensions
- Slowly Changing Dimension Recap
- Summary
- 6 Order Management
- Order Management Bus Matrix
- Order Transactions
- Fact Normalization
- Dimension Role Playing
- Product Dimension Revisited
- Customer Dimension
- Deal Dimension
- Degenerate Dimension for Order Number
- Junk Dimensions
- Header/Line Pattern to Avoid
- Multiple Currencies
- Transaction Facts at Different Granularity
- Another Header/Line Pattern to Avoid
- Invoice Transactions
- Service Level Performance as Facts, Dimensions, or Both
- Profit and Loss Facts
- Audit Dimension
- Accumulating Snapshot for Order Fulfillment Pipeline
- Lag Calculations
- Multiple Units of Measure
- Beyond the Rearview Mirror
- Summary
- 7 Accounting
- Accounting Case Study and Bus Matrix
- General Ledger Data
- General Ledger Periodic Snapshot
- Chart of Accounts
- Period Close
- Year-to-Date Facts
- Multiple Currencies Revisited
- General Ledger Journal Transactions
- Multiple Fiscal Accounting Calendars
- Drilling Down Through a Multilevel Hierarchy
- Financial Statements
- Budgeting Process
- Dimension Attribute Hierarchies
- Fixed Depth Positional Hierarchies
- Slightly Ragged Variable Depth Hierarchies
- Ragged Variable Depth Hierarchies
- Shared Ownership in a Ragged Hierarchy
- Time Varying Ragged Hierarchies
- Modifying Ragged Hierarchies
- Alternative Ragged Hierarchy Modeling Approaches
- Advantages of the Bridge Table Approach for Ragged Hierarchies
- Consolidated Fact Tables
- Role of OLAP and Packaged Analytic Solutions
- Summary
- 8 Customer Relationship Management
- CRM Overview
- Operational and Analytic CRM
- Customer Dimension Attributes
- Name and Address Parsing
- International Name and Address Considerations
- Customer-Centric Dates
- Aggregated Facts as Dimension Attributes
- Segmentation Attributes and Scores
- Counts with Type 2 Dimension Changes
- Outrigger for Low Cardinality Attribute Set
- Customer Hierarchy Considerations
- Bridge Tables for Multivalued Dimensions
- Bridge Table for Sparse Attributes
- Bridge Table for Multiple Customer Contacts
- Complex Customer Behavior
- Behavior Study Groups for Cohorts
- Step Dimension for Sequential Behavior
- Timespan Fact Tables
- Tagging Fact Tables with Satisfaction Indicators
- Tagging Fact Tables with Abnormal Scenario Indicators
- Customer Data Integration Approaches
- Master Data Management Creating a Single Customer Dimension
- Partial Conformity of Multiple Customer Dimensions
- Avoiding Fact-to-Fact Table Joins
- Low Latency Reality Check
- Summary
- 9 Human Resources Management
- Employee Profile Tracking
- Precise Effective and Expiration Timespans
- Dimension Change Reason Tracking
- Profile Changes as Type 2 Attributes or Fact Events
- Headcount Periodic Snapshot
- Bus Matrix for HR Processes
- Packaged Analytic Solutions and Data Models
- Recursive Employee Hierarchies
- Change Tracking on Embedded Manager Key
- Drilling Up and Down Management Hierarchies
- Multivalued Skill Keyword Attributes
- Skill Keyword Bridge
- Skill Keyword Text String
- Survey Questionnaire Data
- Text Comments
- Summary
- 10 Financial Services
- Banking Case Study and Bus Matrix
- Dimension Triage to Avoid Too Few Dimensions
- Household Dimension
- Multivalued Dimensions and Weighting Factors
- Mini-Dimensions Revisited
- Adding a Mini-Dimension to a Bridge Table
- Dynamic Value Banding of Facts
- Supertype and Subtype Schemas for Heterogeneous Products
- Supertype and Subtype Products with Common Facts
- Hot Swappable Dimensions
- Summary
- 11 Telecommunications
- Telecommunications Case Study and Bus Matrix
- General Design Review Considerations
- Balance Business Requirements and Source Realities
- Focus on Business Processes
- Granularity
- Single Granularity for Facts
- Dimension Granularity and Hierarchies
- Date Dimension
- Degenerate Dimensions
- Surrogate Keys
- Dimension Decodes and Descriptions
- Conformity Commitment
- Design Review Guidelines
- Draft Design Exercise Discussion
- Remodeling Existing Data Structures
- Geographic Location Dimension
- Summary
- 12 Transportation
- Airline Case Study and Bus Matrix
- Multiple Fact Table Granularities
- Linking Segments into Trips
- Related Fact Tables
- Extensions to Other Industries
- Cargo Shipper
- Travel Services
- Combining Correlated Dimensions
- Class of Service
- Origin and Destination
- More Date and Time Considerations
- Country-Specific Calendars as Outriggers
- Date and Time in Multiple Time Zones
- Localization Recap
- Summary
- 13 Education
- University Case Study and Bus Matrix
- Accumulating Snapshot Fact Tables
- Applicant Pipeline
- Research Grant Proposal Pipeline
- Factless Fact Tables
- Admissions Events
- Course Registrations
- Facility Utilization
- Student Attendance
- More Educational Analytic Opportunities
- Summary
- 14 Healthcare
- Healthcare Case Study and Bus Matrix
- Claims Billing and Payments
- Date Dimension Role Playing
- Multivalued Diagnoses
- Supertypes and Subtypes for Charges
- Electronic Medical Records
- Measure Type Dimension for Sparse Facts
- Freeform Text Comments
- Images
- Facility/Equipment Inventory Utilization
- Dealing with Retroactive Changes
- Summary
- 15 Electronic Commerce
- Clickstream Source Data
- Clickstream Data Challenges
- Clickstream Dimensional Models
- Page Dimension
- Event Dimension
- Session Dimension
- Referral Dimension
- Clickstream Session Fact Table
- Clickstream Page Event Fact Table
- Step Dimension
- Aggregate Clickstream Fact Tables
- Google Analytics
- Integrating Clickstream into Web Retailer’s Bus Matrix
- Profitability Across Channels Including Web
- Summary
- 16 Insurance
- Insurance Case Study
- Insurance Value Chain
- Draft Bus Matrix
- Policy Transactions
- Dimension Role Playing
- Slowly Changing Dimensions
- Mini-Dimensions for Large or Rapidly Changing Dimensions
- Multivalued Dimension Attributes
- Numeric Attributes as Facts or Dimensions
- Degenerate Dimension
- Low Cardinality Dimension Tables
- Audit Dimension
- Policy Transaction Fact Table
- Heterogeneous Supertype and Subtype Products
- Complementary Policy Accumulating Snapshot
- Premium Periodic Snapshot
- Conformed Dimensions
- Conformed Facts
- Pay-in-Advance Facts
- Heterogeneous Supertypes and Subtypes Revisited
- Multivalued Dimensions Revisited
- More Insurance Case Study Background
- Updated Insurance Bus Matrix
- Detailed Implementation Bus Matrix
- Claim Transactions
- Transaction Versus Profile Junk Dimensions
- Claim Accumulating Snapshot
- Accumulating Snapshot for Complex Workflows
- Timespan Accumulating Snapshot
- Periodic Instead of Accumulating Snapshot
- Policy/Claim Consolidated Periodic Snapshot
- Factless Accident Events
- Common Dimensional Modeling Mistakes to Avoid
- Mistake 10: Place Text Attributes in a Fact Table
- Mistake 9: Limit Verbose Descriptors to Save Space
- Mistake 8: Split Hierarchies into Multiple Dimensions
- Mistake 7: Ignore the Need to Track Dimension Changes
- Mistake 6: Solve All Performance Problems with More Hardware
- Mistake 5: Use Operational Keys to Join Dimensions and Facts
- Mistake 4: Neglect to Declare and Comply with the Fact Grain
- Mistake 3: Use a Report to Design the Dimensional Model
- Mistake 2: Expect Users to Query Normalized Atomic Data
- Mistake 1: Fail to Conform Facts and Dimensions
- Summary
- 17 Kimball DW/BI Lifecycle Overview
- Lifecycle Roadmap
- Roadmap Mile Markers
- Lifecycle Technology Track
- Technical Architecture Design
- Product Selection and Installation
- Lifecycle Data Track
- Dimensional Modeling
- Physical Design
- ETL Design and Development
- Lifecycle BI Applications Track
- BI Application Specification
- BI Application Development
- Lifecycle Wrap-up Activities
- Deployment
- Maintenance and Growth
- Common Pitfalls to Avoid
- Summary
- 18 Dimensional Modeling Process and Tasks
- Modeling Process Overview
- Get Organized
- Identify Participants, Especially Business Representatives
- Review the Business Requirements
- Leverage a Modeling Tool
- Leverage a Data Profiling Tool
- Leverage or Establish Naming Conventions
- Coordinate Calendars and Facilities
- Design the Dimensional Model
- Reach Consensus on High-Level Bubble Chart
- Develop the Detailed Dimensional Model
- Review and Validate the Model
- Finalize the Design Documentation
- Summary
- 19 ETL Subsystems and Techniques
- Round Up the Requirements
- Business Needs
- Compliance
- Data Quality
- Security
- Data Integration
- Data Latency
- Archiving and Lineage
- BI Delivery Interfaces
- Available Skills
- Legacy Licenses
- The 34 Subsystems of ETL
- Extracting: Getting Data into the Data Warehouse
- Subsystem 1: Data Profiling
- Subsystem 2: Change Data Capture System
- Subsystem 3: Extract System
- Cleaning and Conforming Data
- Improving Data Quality Culture and Processes
- Subsystem 4: Data Cleansing System
- Subsystem 5: Error Event Schema
- Subsystem 6: Audit Dimension Assembler
- Subsystem 7: Deduplication System
- Subsystem 8: Conforming System
- Delivering: Prepare for Presentation
- Subsystem 9: Slowly Changing Dimension Manager
- Subsystem 10: Surrogate Key Generator
- Subsystem 11: Hierarchy Manager
- Subsystem 12: Special Dimensions Manager
- Subsystem 13: Fact Table Builders
- Subsystem 14: Surrogate Key Pipeline
- Subsystem 15: Multivalued Dimension Bridge Table Builder
- Subsystem 16: Late Arriving Data Handler
- Subsystem 17: Dimension Manager System
- Subsystem 18: Fact Provider System
- Subsystem 19: Aggregate Builder
- Subsystem 20: OLAP Cube Builder
- Subsystem 21: Data Propagation Manager
- Managing the ETL Environment
- Subsystem 22: Job Scheduler
- Subsystem 23: Backup System
- Subsystem 24: Recovery and Restart System
- Subsystem 25: Version Control System
- Subsystem 26: Version Migration System
- Subsystem 27: Workflow Monitor
- Subsystem 28: Sorting System
- Subsystem 29: Lineage and Dependency Analyzer
- Subsystem 30: Problem Escalation System
- Subsystem 31: Parallelizing/Pipelining System
- Subsystem 32: Security System
- Subsystem 33: Compliance Manager
- Subsystem 34: Metadata Repository Manager
- Summary
- 20 ETL System Design and Development Process and Tasks
- ETL Process Overview
- Develop the ETL Plan
- Step 1: Draw the High-Level Plan
- Step 2: Choose an ETL Tool
- Step 3: Develop Default Strategies
- Step 4: Drill Down by Target Table
- Develop the ETL Specification Document
- Develop One-Time Historic Load Processing
- Step 5: Populate Dimension Tables with Historic Data
- Step 6: Perform the Fact Table Historic Load
- Develop Incremental ETL Processing
- Step 7: Dimension Table Incremental Processing
- Step 8: Fact Table Incremental Processing
- Step 9: Aggregate Table and OLAP Loads
- Step 10: ETL System Operation and Automation
- Real-Time Implications
- Real-Time Triage
- Real-Time Architecture Trade-Offs
- Real-Time Partitions in the Presentation Server
- Summary
- 21 Big Data Analytics
- Big Data Overview
- Extended RDBMS Architecture
- MapReduce/Hadoop Architecture
- Comparison of Big Data Architectures
- Recommended Best Practices for Big Data
- Management Best Practices for Big Data
- Architecture Best Practices for Big Data
- Data Modeling Best Practices for Big Data
- Data Governance Best Practices for Big Data
- Summary