Automating Data Quality Monitoring at Scale: Scaling Beyond Rules with Machine Learning (Final)

Book details

Authors: Jeremy Stanley, Paige Schwartz
ISBN: 9781098145934
Publisher: O'Reilly Media
Year: 2024
Pages: 170
Language: English
File format: EPUB (converted to PDF, EPUB, or AZW3 on request)
File size: 9 MB

Price (toman): 44,000





If you need the file for Automating Data Quality Monitoring at Scale: Scaling Beyond Rules with Machine Learning (Final) converted to PDF, EPUB, AZW3, MOBI, or DJVU, notify support and they will convert it for you.

Note that this book is the original-language edition, not a Persian translation. The International Library website offers only original-language books and does not provide any books translated into or written in Persian.


About the book

The world's businesses ingest a combined 2.5 quintillion bytes of data every day. But how much of this vast amount of data, used to build products, power AI systems, and drive business decisions, is poor quality or just plain bad? This practical book shows you how to ensure that the data your organization relies on contains only high-quality records.

Most data engineers, data analysts, and data scientists genuinely care about data quality, but they often don't have the time, resources, or understanding to create a data quality monitoring solution that succeeds at scale. In this book, Jeremy Stanley and Paige Schwartz from Anomalo explain how you can use automated data quality monitoring to cover all your tables efficiently, proactively alert on every category of issue, and resolve problems immediately.

This book will help you:

- Learn why data quality is a business imperative
- Understand and assess unsupervised learning models for detecting data issues
- Implement notifications that reduce alert fatigue and let you triage and resolve issues quickly
- Integrate automated data quality monitoring with data catalogs, orchestration layers, and BI and ML systems
- Understand the limits of automated data quality monitoring and how to overcome them
- Learn how to deploy and manage your monitoring solution at scale
- Maintain automated data quality monitoring for the long term



Table of contents

Foreword
Preface
   Who Should Use This Book
   Conventions Used in This Book
   O’Reilly Online Learning
   How to Contact Us
   Acknowledgments
1. The Data Quality Imperative
   High-Quality Data Is the New Gold
      Data-Driven Companies Are Today’s Disrupters
      Data Analytics Is Democratized
      AI and Machine Learning Are Differentiators
         Generative AI and data quality
      Companies Are Investing in a Modern Data Stack
   More Data, More Problems
      Issues Inside the Data Factory
      Data Migrations
      Third-Party Data Sources
      Company Growth and Change
      Exogenous Factors
   Why We Need Data Quality Monitoring
      Data Scars
      Data Shocks
   Automating Data Quality Monitoring: The New Frontier
2. Data Quality Monitoring Strategies and the Role of Automation
   Monitoring Requirements
   Data Observability: Necessary, but Not Sufficient
   Traditional Approaches to Data Quality
      Manual Data Quality Detection
      Rule-Based Testing
      Metrics Monitoring
   Automating Data Quality Monitoring with Unsupervised Machine Learning
      What Is Unsupervised Machine Learning?
      An Analogy: Lane Departure Warnings
      The Limits of Automation
         Automating rule and metric creation
            Rules
            Metrics
   A Four-Pillar Approach to Data Quality Monitoring
3. Assessing the Business Impact of Automated Data Quality Monitoring
   Assessing Your Data
      Volume
      Variety
         Unstructured data
         Semistructured data
         Structured data
            Normalized relational data
            Fact tables
            Summary tables
      Velocity
      Veracity
      Special Cases
   Assessing Your Industry
      Regulatory Pressure
      AI/ML Risks
         Feature shocks
         NULL increases
         Change in correlation
         Duplicate data
      Data as a Product
   Assessing Your Data Maturity
   Assessing Benefits to Stakeholders
      Engineers
      Data Leadership
      Scientists
      Consumers
   Conducting an ROI Analysis
      Quantitative Measures
      Qualitative Measures
   Conclusion
4. Automating Data Quality Monitoring with Machine Learning
   Requirements
      Sensitivity
      Specificity
      Transparency
      Scalability
      Nonrequirements
      Data Quality Monitoring Is Not Outlier Detection
   ML Approach and Algorithm
      Data Sampling
         Sample size
         Bias and efficiency
      Feature Encoding
      Model Development
         Training and evaluation
         Computational efficiency
      Model Explainability
   Putting It Together with Pseudocode
   Other Applications
   Conclusion
5. Building a Model That Works on Real-World Data
   Data Challenges and Mitigations
      Seasonality
      Time-Based Features
      Chaotic Tables
      Updated-in-Place Tables
      Column Correlations
   Model Testing
      Injecting Synthetic Issues
         Example
      Benchmarking
         Analyzing performance
         Putting it together with pseudocode
      Improving the Model
   Conclusion
6. Implementing Notifications While Avoiding Alert Fatigue
   How Notifications Facilitate Data Issue Response
      Triage
      Routing
      Resolution
      Documentation
   Taking Action Without Notifications
   Anatomy of a Notification
      Visualization
      Actions
      Text Description
      Who Created/Last Edited the Check
   Delivering Notifications
      Notification Audience
      Notification Channels
         Email
         Real-time communication
         PagerDuty or Opsgenie-type platforms (alerting, on-call management)
         Ticketing platforms (Jira, ServiceNow)
         Webhooks
      Notification Timing
   Avoiding Alert Fatigue
      Scheduling Checks in the Right Order
      Clustering Alerts Using Machine Learning
      Suppressing Notifications
         Priority level
         Continuous retraining
         Narrowing the scope of the model
         Making the check less sensitive
         What not to suppress: Expected changes
   Automating the Root Cause Analysis
   Conclusion
7. Integrating Monitoring with Data Tools and Systems
   Monitoring Your Data Stack
   Data Warehouses
      Integrating with Data Warehouses
      Security
      Reconciling Data Across Multiple Warehouses
         Comparing datasets with rule-based testing
         Comparing datasets with unsupervised machine learning
         Comparing summary statistics
   Data Orchestrators
      Integrating with Orchestrators
   Data Catalogs
      Integrating with Catalogs
   Data Consumers
      Analytics and BI Tools
      MLOps
   Conclusion
8. Operating Your Solution at Scale
   Build Versus Buy
      Vendor Deployment Models
         SaaS
         Fully in-VPC or on-prem
         Hybrid
   Configuration
      Determining Which Tables Are Most Important
      Deciding What Data in a Table to Monitor
      Configuration at Scale
   Enablement
      User Roles and Permissions
      Onboarding, Training, and Support
   Improving Data Quality Over Time
      Initiatives
      Metrics
         Triage and resolution
         Executive dashboards
         Scorecards
   From Chaos to Clarity
A. Types of Data Quality Issues
   Table Issues
      Late Arrival
         Definition
         Example
         Causes
         Analytics impact
         ML impact
         How to monitor
      Schema Changes
         Definition
         Example
         Causes
         Analytics impact
         ML impact
         How to monitor
      Untraceable Changes
         Definition
         Example
         Causes
         Analytics impact
         ML impact
         How to monitor
   Row Issues
      Incomplete Rows
         Definition
         Example
         Causes
         Analytics impact
         ML impact
         How to monitor
      Duplicate Rows
         Definition
         Example
         Causes
         Analytics impact
         ML impact
         How to monitor
      Temporal Inconsistency
         Definition
         Example
         Causes
         Analytics impact
         ML impact
         How to monitor
   Value Issues
      Missing Values
         Definition
         Example
         Causes
         Analytics impact
         ML impact
         How to monitor
      Incorrect Values
         Definition
         Example
         Causes
         Analytics impact
         ML impact
         How to monitor
      Invalid Values
         Definition
         Example
         Causes
         Analytics impact
         ML impact
         How to monitor
   Multi Issues
      Relational Failures
         Definition
         Example
         Causes
         Analytics impact
         ML impact
         How to monitor
      Inconsistent Sources
         Definition
         Example
         Causes
         Analytics impact
         ML impact
         How to monitor
Index



