Edition: Second Edition
Author: Michael Walker
ISBN: 9781803239873
Publisher: Packt
Year of publication: 2024
Number of pages: 487
Language: English
File format: EPUB (converted to PDF, EPUB, or AZW3 on request)
File size: 8 MB
If you need Python Data Cleaning Cookbook: Prepare your data for analysis with pandas, NumPy, Matplotlib, scikit-learn, and OpenAI, 2nd Ed in PDF, EPUB, AZW3, MOBI, or DJVU format, contact support and they will convert the file for you.
Note that Python Data Cleaning Cookbook: Prepare your data for analysis with pandas, NumPy, Matplotlib, scikit-learn, and OpenAI, Second Edition is the original-language (English) edition and has not been translated into Persian. The International Library website offers only original-language books and does not offer any books translated into or written in Persian.
The book shows you how to clean, wrangle, and view data from multiple perspectives, including dataset and column attributes.
Cover
Copyright
Contributors
Table of Contents
Preface

Chapter 1: Anticipating Data Cleaning Issues When Importing Tabular Data with pandas
  Technical requirements
  Importing CSV files
  Importing Excel files
  Importing data from SQL databases
  Importing SPSS, Stata, and SAS data
  Importing R data
  Persisting tabular data
  Summary

Chapter 2: Anticipating Data Cleaning Issues When Working with HTML, JSON, and Spark Data
  Technical requirements
  Importing simple JSON data
  Importing more complicated JSON data from an API
  Importing data from web pages
  Working with Spark data
  Persisting JSON data
  Versioning data
  Summary

Chapter 3: Taking the Measure of Your Data
  Technical requirements
  Getting a first look at your data
  Selecting and organizing columns
  Selecting rows
  Generating frequencies for categorical variables
  Generating summary statistics for continuous variables
  Using generative AI to display descriptive statistics
  Summary

Chapter 4: Identifying Outliers in Subsets of Data
  Technical requirements
  Identifying outliers with one variable
  Identifying outliers and unexpected values in bivariate relationships
  Using subsetting to examine logical inconsistencies in variable relationships
  Using linear regression to identify data points with significant influence
  Using k-nearest neighbors to find outliers
  Using Isolation Forest to find anomalies
  Using PandasAI to identify outliers
  Summary

Chapter 5: Using Visualizations for the Identification of Unexpected Values
  Technical requirements
  Using histograms to examine the distribution of continuous variables
  Using boxplots to identify outliers for continuous variables
  Using grouped boxplots to uncover unexpected values in a particular group
  Examining both distribution shape and outliers with violin plots
  Using scatter plots to view bivariate relationships
  Using line plots to examine trends in continuous variables
  Generating a heat map based on a correlation matrix
  Summary

Chapter 6: Cleaning and Exploring Data with Series Operations
  Technical requirements
  Getting values from a pandas Series
  Showing summary statistics for a pandas Series
  Changing Series values
  Changing Series values conditionally
  Evaluating and cleaning string Series data
  Working with dates
  Using OpenAI for Series operations
  Summary

Chapter 7: Identifying and Fixing Missing Values
  Technical requirements
  Identifying missing values
  Cleaning missing values
  Imputing values with regression
  Using k-nearest neighbors for imputation
  Using random forest for imputation
  Using PandasAI for imputation
  Summary

Chapter 8: Encoding, Transforming, and Scaling Features
  Technical requirements
  Creating training datasets and avoiding data leakage
  Removing redundant or unhelpful features
  Encoding categorical features: one-hot encoding
  Encoding categorical features: ordinal encoding
  Encoding categorical features with medium or high cardinality
  Using mathematical transformations
  Feature binning: equal width and equal frequency
  k-means binning
  Feature scaling
  Summary

Chapter 9: Fixing Messy Data When Aggregating
  Technical requirements
  Looping through data with itertuples (an anti-pattern)
  Calculating summaries by group with NumPy arrays
  Using groupby to organize data by groups
  Using more complicated aggregation functions with groupby
  Using user-defined functions and apply with groupby
  Using groupby to change the unit of analysis of a DataFrame
  Using pivot_table to change the unit of analysis of a DataFrame
  Summary

Chapter 10: Addressing Data Issues When Combining DataFrames
  Technical requirements
  Combining DataFrames vertically
  Doing one-to-one merges
  Doing one-to-one merges by multiple columns
  Doing one-to-many merges
  Doing many-to-many merges
  Developing a merge routine
  Summary

Chapter 11: Tidying and Reshaping Data
  Technical requirements
  Removing duplicated rows
  Fixing many-to-many relationships
  Using stack and melt to reshape data from wide to long format
  Melting multiple groups of columns
  Using unstack and pivot to reshape data from long to wide format
  Summary

Chapter 12: Automate Data Cleaning with User-Defined Functions, Classes, and Pipelines
  Technical requirements
  Functions for getting a first look at our data
  Functions for displaying summary statistics and frequencies
  Functions for identifying outliers and unexpected values
  Functions for aggregating or combining data
  Classes that contain the logic for updating Series values
  Classes that handle non-tabular data structures
  Functions for checking overall data quality
  Pre-processing data with pipelines: a simple example
  Pre-processing data with pipelines: a more complicated example
  Summary

Packt Page
Other Books You May Enjoy
Index