Edition:
Authors: Sandeep Uttamchandani
Series:
ISBN: 9781492075257
Publisher: O'Reilly Media, Inc.
Year of publication: 2020
Number of pages: 0
Language: English
File format: EPUB (converted to PDF, EPUB, or AZW3 on request)
File size: 7 MB
If you need The Self-Service Data Roadmap converted to PDF, EPUB, AZW3, MOBI, or DJVU, notify support and they will convert the file for you.
Note that The Self-Service Data Roadmap is offered in its original language and has not been translated into Persian. The International Library website offers original-language books only and does not provide any books translated into or written in Persian.
The world's most valuable resource is data. Companies across all industry verticals are using data-driven insights as a key competitive advantage. But the time required for transforming raw data to insights can take days or weeks when you want it in minutes or hours. Data scientists spend nearly 80% of their time in data engineering, rather than developing insights. And most organizations can't scale their data science teams fast enough to keep up with growing business needs for better, faster insights.

This book will help data engineers, data scientists, and data team managers address these issues by building a self-service data science platform that democratizes the ability to extract insights from the data to everyone in the organization. Data scientists, software engineers, product managers, and marketers can use it to discover, transform, and analyze data and publish automated insights in production.

This book is not:
- A deep dive into the "shiny new" technologies, or any one specific technology
- A silver bullet technology for building a self-service portal. Organizations differ in their maturity, people, process, and technology and require tailored solutions

This book is:
- A collection of must-have operational capabilities for building a self-service data portal
- A blueprint for achieving better and faster insights
- A process for democratizing data engineering expertise across an organization
- A practical and indispensable guide for any decision-maker, implementer, or strategist working with an organization's data science platform
Cover
Copyright
Table of Contents

Preface
Conventions Used in This Book; Using Code Examples; O’Reilly Online Learning; How to Contact Us

Chapter 1. Introduction
Journey Map from Raw Data to Insights; Discover; Prep; Build; Operationalize; Defining Your Time-to-Insight Scorecard; Build Your Self-Service Data Roadmap

Part I. Self-Service Data Discovery

Chapter 2. Metadata Catalog Service
Journey Map; Understanding Datasets; Analyzing Datasets; Knowledge Scaling; Minimizing Time to Interpret; Extracting Technical Metadata; Extracting Operational Metadata; Gathering Team Knowledge; Defining Requirements; Technical Metadata Extractor Requirements; Operational Metadata Requirements; Team Knowledge Aggregator Requirements; Implementation Patterns; Source-Specific Connectors Pattern; Lineage Correlation Pattern; Team Knowledge Pattern; Summary

Chapter 3. Search Service
Journey Map; Determining Feasibility of the Business Problem; Selecting Relevant Datasets for Data Prep; Reusing Existing Artifacts for Prototyping; Minimizing Time to Find; Indexing Datasets and Artifacts; Ranking Results; Access Control; Defining Requirements; Indexer Requirements; Ranking Requirements; Access Control Requirements; Nonfunctional Requirements; Implementation Patterns; Push-Pull Indexer Pattern; Hybrid Search Ranking Pattern; Catalog Access Control Pattern; Summary

Chapter 4. Feature Store Service
Journey Map; Finding Available Features; Training Set Generation; Feature Pipeline for Online Inference; Minimize Time to Featurize; Feature Computation; Feature Serving; Defining Requirements; Feature Computation; Feature Serving; Nonfunctional Requirements; Implementation Patterns; Hybrid Feature Computation Pattern; Feature Registry Pattern; Summary

Chapter 5. Data Movement Service
Journey Map; Aggregating Data Across Sources; Moving Raw Data to Specialized Query Engines; Moving Processed Data to Serving Stores; Exploratory Analysis Across Sources; Minimizing Time to Data Availability; Data Ingestion Configuration and Change Management; Compliance; Data Quality Verification; Defining Requirements; Ingestion Requirements; Transformation Requirements; Compliance Requirements; Verification Requirements; Nonfunctional Requirements; Implementation Patterns; Batch Ingestion Pattern; Change Data Capture Ingestion Pattern; Event Aggregation Pattern; Summary

Chapter 6. Clickstream Tracking Service
Journey Map; Minimizing Time to Click Metrics; Managing Instrumentation; Event Enrichment; Building Insights; Defining Requirements; Instrumentation Requirements Checklist; Enrichment Requirements Checklist; Implementation Patterns; Instrumentation Pattern; Rule-Based Enrichment Patterns; Consumption Patterns; Summary

Part II. Self-Service Data Prep

Chapter 7. Data Lake Management Service
Journey Map; Primitive Life Cycle Management; Managing Data Updates; Managing Batching and Streaming Data Flows; Minimizing Time to Data Lake Management; Requirements; Implementation Patterns; Data Life Cycle Primitives Pattern; Transactional Pattern; Advanced Data Management Pattern; Summary

Chapter 8. Data Wrangling Service
Journey Map; Minimizing Time to Wrangle; Defining Requirements; Curating Data; Operational Monitoring; Defining Requirements; Implementation Patterns; Exploratory Data Analysis Patterns; Analytical Transformation Patterns; Summary

Chapter 9. Data Rights Governance Service
Journey Map; Executing Data Rights Requests; Discovery of Datasets; Model Retraining; Minimizing Time to Comply; Tracking the Customer Data Life Cycle; Executing Customer Data Rights Requests; Limiting Data Access; Defining Requirements; Current Pain Point Questionnaire; Interop Checklist; Functional Requirements; Nonfunctional Requirements; Implementation Patterns; Sensitive Data Discovery and Classification Pattern; Data Lake Deletion Pattern; Use Case–Dependent Access Control; Summary

Part III. Self-Service Build

Chapter 10. Data Virtualization Service
Journey Map; Exploring Data Sources; Picking a Processing Cluster; Minimizing Time to Query; Picking the Execution Environment; Formulating Polyglot Queries; Joining Data Across Silos; Defining Requirements; Current Pain Point Analysis; Operational Requirements; Functional Requirements; Nonfunctional Requirements; Implementation Patterns; Automatic Query Routing Pattern; Unified Query Pattern; Federated Query Pattern; Summary

Chapter 11. Data Transformation Service
Journey Map; Production Dashboard and ML Pipelines; Data-Driven Storytelling; Minimizing Time to Transform; Transformation Implementation; Transformation Execution; Transformation Operations; Defining Requirements; Current State Questionnaire; Functional Requirements; Nonfunctional Requirements; Implementation Patterns; Implementation Pattern; Execution Patterns; Summary

Chapter 12. Model Training Service
Journey Map; Model Prototyping; Continuous Training; Model Debugging; Minimizing Time to Train; Training Orchestration; Tuning; Continuous Training; Defining Requirements; Training Orchestration; Tuning; Continuous Training; Nonfunctional Requirements; Implementation Patterns; Distributed Training Orchestrator Pattern; Automated Tuning Pattern; Data-Aware Continuous Training; Summary

Chapter 13. Continuous Integration Service
Journey Map; Collaborating on an ML Pipeline; Integrating ETL Changes; Validating Schema Changes; Minimizing Time to Integrate; Experiment Tracking; Reproducible Deployment; Testing Validation; Defining Requirements; Experiment Tracking Module; Pipeline Packaging Module; Testing Automation Module; Implementation Patterns; Programmable Tracking Pattern; Reproducible Project Pattern; Summary

Chapter 14. A/B Testing Service
Journey Map; Minimizing Time to A/B Test; Experiment Design; Execution at Scale; Experiment Optimization; Implementation Patterns; Experiment Specification Pattern; Metrics Definition Pattern; Automated Experiment Optimization; Summary

Part IV. Self-Service Operationalize

Chapter 15. Query Optimization Service
Journey Map; Avoiding Cluster Clogs; Resolving Runtime Query Issues; Speeding Up Applications; Minimizing Time to Optimize; Aggregating Statistics; Analyzing Statistics; Optimizing Jobs; Defining Requirements; Current Pain Points Questionnaire; Interop Requirements; Functionality Requirements; Nonfunctional Requirements; Implementation Patterns; Avoidance Pattern; Operational Insights Pattern; Automated Tuning Pattern; Summary

Chapter 16. Pipeline Orchestration Service
Journey Map; Invoke Exploratory Pipelines; Run SLA-Bound Pipelines; Minimizing Time to Orchestrate; Defining Job Dependencies; Distributed Execution; Production Monitoring; Defining Requirements; Current Pain Points Questionnaire; Operational Requirements; Functional Requirements; Nonfunctional Requirements; Implementation Patterns; Dependency Authoring Patterns; Orchestration Observability Patterns; Distributed Execution Pattern; Summary

Chapter 17. Model Deploy Service
Journey Map; Model Deployment in Production; Model Maintenance and Upgrade; Minimizing Time to Deploy; Deployment Orchestration; Performance Scaling; Drift Monitoring; Defining Requirements; Orchestration; Model Scaling and Performance; Drift Verification; Nonfunctional Requirements; Implementation Patterns; Universal Deployment Pattern; Autoscaling Deployment Pattern; Model Drift Tracking Pattern; Summary

Chapter 18. Quality Observability Service
Journey Map; Daily Data Quality Monitoring Reports; Debugging Quality Issues; Handling Low-Quality Data Records; Minimizing Time to Insight Quality; Verify the Accuracy of the Data; Detect Quality Anomalies; Prevent Data Quality Issues; Defining Requirements; Detection and Handling Data Quality Issues; Functional Requirements; Nonfunctional Requirements; Implementation Patterns; Accuracy Models Pattern; Profiling-Based Anomaly Detection Pattern; Avoidance Pattern; Summary

Chapter 19. Cost Management Service
Journey Map; Monitoring Cost Usage; Continuous Cost Optimization; Minimizing Time to Optimize Cost; Expenditure Observability; Matching Supply and Demand; Continuous Cost Optimization; Defining Requirements; Pain Points Questionnaire; Functional Requirements; Nonfunctional Requirements; Implementation Patterns; Continuous Cost Monitoring Pattern; Automated Scaling Pattern; Cost Advisor Pattern; Summary

Index
About the Author
Colophon