Download the book Data Engineering for Machine Learning Pipelines: From Python Libraries to ML Pipelines and Cloud Platforms

Book details

Data Engineering for Machine Learning Pipelines: From Python Libraries to ML Pipelines and Cloud Platforms

Edition:
Authors:
Series:
ISBN: 9798868806018, 9798868806025
Publisher: Apress
Year of publication: 2024
Number of pages: 651
Language: English
File format: PDF (converted to PDF, EPUB, or AZW3 at the user's request)
File size: 33 MB

Price (toman): 61,000





If you need the file of Data Engineering for Machine Learning Pipelines: From Python Libraries to ML Pipelines and Cloud Platforms converted to PDF, EPUB, AZW3, MOBI, or DJVU, you can ask support to convert it for you.

Please note that Data Engineering for Machine Learning Pipelines: From Python Libraries to ML Pipelines and Cloud Platforms is the original-language edition and has not been translated into Persian. The International Library website offers original-language books only and does not offer any books translated into or written in Persian.


Book description (original language)



Table of Contents
About the Author
About the Technical Reviewer
Introduction
Chapter 1: Core Technologies in Data Engineering
	Introduction
	Python Programming
		F Strings
		Python Functions
		Advanced Function Arguments
			*args
			**kwargs
		Lambda Functions
		Decorators in Python
		Type Hinting
		Typing Module
		Generators in Python
		Enumerate Functions
		List Comprehension
		Random Module
			random()
			randint()
			getrandbits()
			choice()
			shuffle()
			sample()
			seed()
	Git Source Code Management
		Foundations of Git
		GitHub
			Setup and Installation
		Core Concepts of Git
			Cloning
			Branching
			Forking
			Pull request
			Gitignore
	SQL Programming
		Essential SQL Queries
		Conditional Data Filtering
		Joining SQL Tables
			Self-Join in SQL
		Common Table Expressions
		Views in SQL
			Standard View
			Materialized View
		Temporary Tables in SQL
		Window Functions in SQL
			SQL Aggregate Functions
			SQL Rank Functions
			Query Tuning and Execution Plan Optimizations
	Conclusion
Chapter 2: Data Wrangling using Pandas
	Introduction
	Data Structures
		Series
		Data Frame
	Indexing
		Essential Indexing Methods
		Multi-indexing
		Time Delta Index
	Data Extraction and Loading
		CSV
		JSON
		HDF5
		Feather
		Parquet
		ORC
		Avro
		Pickle
		Chunk Loading
	Missing Values
		Background
		Missing Values in Data Pipelines
			None
			NaN
			NaT
			NA
		Handling Missing Values
			isna() Method
			notna() Method
			When to Use Which Method?
	Data Transformation
		Data Exploration
		Combining Multiple Pandas Objects
			Left Join
			Left Join Using merge()
			Right Join
			Outer Join
			Inner Join
			Cross Join
	Data Reshaping
		pivot()
		pivot_table()
		stack()
		unstack()
		melt()
		crosstab()
		factorize()
		compare()
		groupby()
	Conclusion
Chapter 3: Data Wrangling using Rust’s Polars
	Introduction
	Introduction to Polars
		Lazy vs. Eager Evaluation
	Data Structures in Polars
		Polars Series
		Polars Data Frame
		Polars Lazy Frame
	Data Extraction and Loading
		CSV
		JSON
		Parquet
	Data Transformation in Polars
		Polars Context
			Selection Context
			Filter Context
			Group-By Context
		Basic Operations
		String Operations
		Aggregation and Group-By
	Combining Multiple Polars Objects
		Left Join
		Outer Join
		Inner Join
		Semi Join
		Anti Join
		Cross Join
	Advanced Operations
		Identifying Missing Values
		Identifying Unique Values
		Pivot Melt Examples
	Polars/SQL Interaction
	Polars CLI
	Conclusion
Chapter 4: GPU Driven Data Wrangling Using CuDF
	Introduction
	CPU vs. GPU
		Introduction to CUDA
	Concepts of GPU Programming
		Kernels
		Memory Management
	Introduction to CuDF
		CuDF vs. Pandas
		Setup
			Testing the Installation
		File IO Operations
			CSV
			Parquet
			JSON
	Basic Operations
		Column Filtering
		Row Filtering
		Sorting the Dataset
	Combining Multiple CuDF Objects
		Left Join
		Outer Join
		Inner Join
		Left Semi Join
		Left Anti Join
	Advanced Operations
		Group-By Function
		Transform Function
		apply()
		Cross Tabulation
		Feature Engineering Using cut()
		Factorize Function
		Window Functions
		CuDF Pandas
	Conclusion
Chapter 5: Getting Started with Data Validation using Pydantic and Pandera
	Introduction
	Introduction to Data Validation
		Need for Good Data
		Definition
		Principles of Data Validation
			Data Accuracy
			Data Uniqueness
			Data Completeness
			Data Range
			Data Consistency
			Data Format
			Referential Integrity
	Introduction to Pydantic
		Type Annotations Refresher
		Setup and Installation
		Pydantic Models
			Nested Models
		Fields
		JSON Schemas
		Constrained Types
		Validators in Pydantic
	Introduction to Pandera
		Setup and Installation
		DataFrame Schema in Pandera
		Data Coercion in Pandera
		Checks in Pandera
		Statistical Validation in Pandera
		Lazy Validation
		Pandera Decorators
	Conclusion
Chapter 6: Data Validation using Great Expectations
	Introduction
	Introduction to Great Expectations
		Components of Great Expectations
			Data Context
			Data Sources
			Expectations
			Checkpoints
	Setup and Installation
	Getting Started with Writing Expectations
	Data Validation Workflow in Great Expectations
	Creating a Checkpoint
	Data Documentation
	Expectation Store
	Conclusion
Chapter 7: Introduction to Concurrency Programming and Dask
	Introduction
	Introduction to Parallel and Concurrent Processing
		History
		Python and the Global Interpreter Lock
		Concepts of Parallel Processing
		Identifying CPU Cores
		Concurrent Processing
	Introduction to Dask
		Setup and Installation
	Features of Dask
		Tasks and Graphs
		Lazy Evaluation
		Partitioning and Chunking
		Serialization and Pickling
		Dask-CuDF
	Dask Architecture
		Core Library
		Schedulers
		Client
		Workers
		Task Graphs
	Dask Data Structures and Concepts
		Dask Arrays
		Dask Bags
		Dask DataFrames
		Dask Delayed
		Dask Futures
	Optimizing Dask Computations
		Data Locality
		Prioritizing Work
		Work Stealing
	Conclusion
Chapter 8: Engineering Machine Learning Pipelines using DaskML
	Introduction
	Machine Learning Data Pipeline Workflow
		Data Sourcing
		Data Exploration
		Data Cleaning
		Data Wrangling
		Data Integration
		Feature Engineering
		Feature Selection
		Data Splitting
		Model Selection
		Model Training
		Model Evaluation
		Hyperparameter Tuning
		Final Testing
		Model Deployment
		Model Monitoring
		Model Retraining
	Dask-ML Integration with Other ML Libraries
		scikit-learn
		XGBoost
		PyTorch
		Other Libraries
	Dask-ML Setup and Installation
	Dask-ML Data Preprocessing
		RobustScaler()
		MinMaxScaler()
		One Hot Encoding
		Cross Validation
	Hyperparameter Tuning Using Dask-ML
		Grid Search
		Random Search
		Incremental Search
	Statistical Imputation with Dask-ML
	Conclusion
Chapter 9: Engineering Real-time Data Pipelines using Apache Kafka
	Introduction
	Introduction to Distributed Computing
	Introduction to Kafka
	Kafka Architecture
		Events
		Topics
		Partitions
		Broker
		Replication
		Producers
		Consumers
		Schema Registry
			With Avro
			With Protobuf
		Kafka Connect
		Kafka Streams and ksqlDB
		Kafka Admin Client
	Setup and Development
	Kafka Application with the Schema Registry
		Protobuf Serializer
	Stream Processing
		Stateful vs. Stateless Processing
	Kafka Connect
		Best Practices
	Conclusion
Chapter 10: Engineering Machine Learning and Data REST APIs using FastAPI
	Introduction
	Introduction to Web Services and APIs
		OpenWeather API
		Types of APIs
			SOAP APIs
			REST APIs
			GraphQL APIs
			Webhooks
		Typical Process of APIs
		Endpoints
		API Development Process
		REST API
		HTTP Status Codes
	FastAPI
		Setup and Installation
		Core Concepts
		Path Parameters and Query Parameters
		Pydantic Integration
			Response Model
		Dependency Injection in FastAPI
		Database Integration with FastAPI
		Object Relational Mapping
			SQLAlchemy
				Engine
				Session
				Query API
			Alembic
		Building a REST Data API
		Middleware in FastAPI
		ML API Endpoint Using FastAPI
	Conclusion
Chapter 11: Getting Started with Workflow Management and Orchestration
	Introduction
	Introduction to Workflow Orchestration
		Workflow
		ETL and ELT Data Pipeline Workflow
		Workflow Configuration
		Workflow Orchestration
	Introduction to Cron Job Scheduler
		Concepts
		Crontab File
		Cron Logging
		Cron Job Usage
		Cron Scheduler Applications
			Database Backup
			Data Processing
			Email Notification
		Cron Alternatives
	Conclusion
Chapter 12: Orchestrating Data Engineering Pipelines using Apache Airflow
	Introduction
	Introduction to Apache Airflow
		Setup and Installation
	Airflow Architecture
		Web Server
		Database
		Executor
		Scheduler
		Configuration Files
		A Simple Example
	Airflow DAGs
		Tasks
		Operators
		Sensors
		Task Flow
		Xcom
		Hooks
		Variables
		Params
		Templates
		Macros
	Controlling the DAG Workflow
		Triggers
	Conclusion
Chapter 13: Orchestrating Data Engineering Pipelines using Prefect
	Introduction
	Introduction to Prefect
		Setup and Installation
	Prefect Server
	Prefect Development
		Flows
		Flow Runs
		Interface
		Tasks
		Results
		Persisting Results
		Artifacts in Prefect
			Link Artifacts
			Markdown Artifacts
			Table Artifacts
		States in Prefect
		State Change Hooks
		Blocks
		Prefect Variables
		Variables in .yaml Files
		Task Runners
	Conclusion
Chapter 14: Getting Started with Big Data and Cloud Computing
	Introduction
	Background of Cloud Computing
		Networking Concepts for Cloud Computing
			IP address
			DNS
			Ports
			Firewalls
			Virtual Private Cloud
			Virtualization
	Introduction to Big Data
		Hadoop
		Spark
	Introduction to Cloud Computing
		Cloud Computing Deployment Models
			Public Cloud
			Private Cloud
			Hybrid Cloud
			Community Cloud
			Government Cloud
			Multi-cloud
		Cloud Architecture Concepts
			Scalability
			Elasticity
			High Availability
			Fault Tolerance
			Disaster Recovery
			Caching
		Cloud Computing Vendors
		Cloud Service Models
			Infrastructure as a Service
			Platform as a Service
			Software as a Service
		Cloud Computing Services
			Identity and Access Management
			Compute
			Storage
			Object Storage
			Databases
			NoSQL
				Schema on Write vs. Schema on Read
				Document Databases
				Column-Oriented Databases
				Key–Value Stores
				Graph Databases
				Time Series Databases
				Vector Databases
			Data Warehouses
			Data Lakes
			Data Warehouses vs. Data Lakes
			Real-Time/Streaming Processing Service
			Serverless Functions
			Data Integration Services
			Continuous Integration Services
			Containerization
			Data Governance
			Data Catalog
			Compliance and Data Protection
			Data Lifecycle Management
			Machine Learning
	Conclusion
Chapter 15: Engineering Data Pipelines Using Amazon Web Services
	Introduction
	AWS Console Overview
		Setting Up an AWS Account
		Installing the AWS CLI
	AWS S3
		Uploading Files
	AWS Data Systems
		Amazon RDS
		Amazon Redshift
		Amazon Athena
		Amazon Glue
		AWS Lake Formation
	AWS SageMaker
	Conclusion
Chapter 16: Engineering Data Pipelines Using Google Cloud Platform
	Introduction
	Google Cloud Platform
		Set Up a GCP Account
	Google Cloud Storage
	Google Cloud CLI
	Google Compute Engine
	Cloud SQL
	Google Bigtable
	Google BigQuery
	Google Dataproc
	Google Vertex AI Workbench
	Google Vertex AI
	Conclusion
Chapter 17: Engineering Data Pipelines Using Microsoft Azure
	Introduction
	Introduction to Azure
	Azure Blob Storage
	Azure SQL
	Azure Cosmos DB
	Azure Synapse Analytics
	Azure Data Factory
	Azure Functions
	Azure Machine Learning
	Azure ML Data Assets
	Azure ML Job
	Conclusion
Index



