The Data Analysis Workshop: Solve business problems with state-of-the-art data analysis models, developing expert data analysis skills along the way
Authors: Gururajan Govindan, Shubhangi Hora, Konstantin Palagachev
ISBN: 1839211385, 9781839211386
Publisher: Packt Publishing
Year: 2020
Pages: 625
Language: English
File format: PDF (converted to EPUB or AZW3 on request)
File size: 41 MB
Learn how to analyze data using Python models with the help of real-world use cases and guidance from industry experts
Businesses today operate online and generate data almost continuously. Not all data looks useful in its raw form, but when processed and analyzed correctly it can yield valuable hidden insights. The Data Analysis Workshop will help you learn how to discover these hidden patterns in your data, analyze them, and use the results to help transform your business.
The book begins by taking you through the use case of a bike rental shop. You'll be shown how to correlate data, plot histograms, and analyze temporal features. As you progress, you'll learn how to plot data for a hydraulic system using the Seaborn and Matplotlib libraries, and explore a variety of use cases that show you how to join and merge databases, prepare data for analysis, and handle imbalanced data.
By the end of the book, you'll have learned different data analysis techniques, including hypothesis testing, correlation, and null-value imputation, and will have become a confident data analyst.
The Data Analysis Workshop is for programmers who already know how to code in Python and want to use it to perform data analysis. If you are looking to gain practical experience in data science with Python, this book is for you.
Cover
FM
Copyright
Table of Contents
Preface

Chapter 1: Bike Sharing Analysis
  Introduction
  Understanding the Data
  Data Preprocessing
  Exercise 1.01: Preprocessing Temporal and Weather Features
  Registered versus Casual Use Analysis
  Exercise 1.02: Analyzing Seasonal Impact on Rides
  Hypothesis Tests
  Exercise 1.03: Estimating Average Registered Rides
  Exercise 1.04: Hypothesis Testing on Registered Rides
  Analysis of Weather-Related Features
  Exercise 1.05: Evaluating the Difference between the Pearson and Spearman Correlations
  Correlation Matrix Plot
  Time Series Analysis
  Exercise 1.06: Time Series Decomposition in Trend, Seasonality, and Residual Components
  ARIMA Models
  Exercise 1.07: ACF and PACF Plots for Registered Rides
  Activity 1.01: Investigating the Impact of Weather Conditions on Rides
  Summary

Chapter 2: Absenteeism at Work
  Introduction
  Initial Data Analysis
  Exercise 2.01: Identifying Reasons for Absence
  Initial Analysis of the Reason for Absence
  Analysis of Social Drinkers and Smokers
  Exercise 2.02: Identifying Reasons of Absence with Higher Probability Among Drinkers and Smokers
  Exercise 2.03: Identifying the Probability of Being a Drinker/Smoker, Conditioned to Absence Reason
  Body Mass Index
  Age and Education Factors
  Exercise 2.04: Investigating the Impact of Age on Reason for Absence
  Exercise 2.05: Investigating the Impact of Education on Reason for Absence
  Transportation Costs and Distance to Work Factors
  Temporal Factors
  Exercise 2.06: Investigating Absence Hours, Based on the Day of the Week and the Month of the Year
  Activity 2.01: Analyzing the Service Time and Son Columns
  Summary

Chapter 3: Analyzing Bank Marketing Campaign Data
  Introduction
  Initial Data Analysis
  Exercise 3.01: Analyzing Distributions of Numerical Features in the Banking Dataset
  Exercise 3.02: Analyzing Distributions of Categorical Features in the Banking Dataset
  Impact of Numerical Features on the Outcome
  Exercise 3.03: Hypothesis Test of the Difference of Distributions in Numerical Features
  Modeling the Relationship via Logistic Regression
  Linear Regression
  Logistic Regression
  Exercise 3.04: Logistic Regression on the Full Marketing Campaign Data
  Activity 3.01: Creating a Leaner Logistic Regression Model
  Summary

Chapter 4: Tackling Company Bankruptcy
  Introduction
  Explanation of Some of the Important Features
  Importing the Data
  Exercise 4.01: Importing Data into DataFrames
  Pandas Profiling
  Running Pandas Profiling
  Pandas Profiling Report for DataFrame 1
  Pandas Profiling Report for DataFrame 2
  Missing Value Analysis
  Exercise 4.02: Performing Missing Value Analysis for the DataFrames
  Imputation of Missing Values
  Mean Imputation
  Exercise 4.03: Performing Mean Imputation on the DataFrames
  Iterative Imputation
  Exercise 4.04: Performing Iterative Imputation on the DataFrame
  Splitting the Features
  Feature Selection with Lasso
  Lasso Regularization for Mean-Imputed DataFrames
  Lasso Regularization for Iterative-Imputed DataFrames
  Activity 4.01: Feature Selection with Lasso
  Summary

Chapter 5: Analyzing the Online Shopper's Purchasing Intention
  Introduction
  Data Dictionary
  Importing the Data
  Exploratory Data Analysis
  Univariate Analysis
  Baseline Conversion Rate from the Revenue Column
  Visitor-Wise Distribution
  Traffic-Wise Distribution
  Exercise 5.01: Analyzing the Distribution of Customers Session on the Website
  Region-Wise Distribution
  Exercise 5.02: Analyzing the Browser and OS Distribution of Customers
  Administrative Pageview Distribution
  Information Pageview Distribution
  Special Day Session Distribution
  Bivariate Analysis
  Revenue Versus Visitor Type
  Revenue Versus Traffic Type
  Exercise 5.03: Analyzing the Relationship between Revenue and Other Variables
  Linear Relationships
  Bounce Rate versus Exit Rate
  Page Value versus Bounce Rate
  Page Value versus Exit Rate
  Impact of Administration Page Views and Administrative Pageview Duration on Revenue
  Impact of Information Page Views and Information Pageview Duration on Revenue
  Clustering
  Method to Find the Optimum Number of Clusters
  Exercise 5.04: Performing K-means Clustering for Informational Duration versus Bounce Rate
  Performing K-means Clustering for Informational Duration versus Exit Rate
  Activity 5.01: Performing K-means Clustering for Administrative Duration versus Bounce Rate and Administrative Duration versus Exit Rate
  Summary

Chapter 6: Analysis of Credit Card Defaulters
  Introduction
  Importing the Data
  Data Preprocessing
  Exploratory Data Analysis
  Univariate Analysis
  Bivariate Analysis
  Exercise 6.01: Evaluating the Relationship between the DEFAULT Column and the EDUCATION and MARRIAGE Columns
  PAY_1 versus DEFAULT
  Balance versus DEFAULT
  Exercise 6.02: Evaluating the Relationship between the AGE and DEFAULT Columns
  Correlation
  Activity 6.01: Evaluating the Correlation between Columns Using a Heatmap
  Building a Profile of a High-Risk Customer
  Summary

Chapter 7: Analyzing the Heart Disease Dataset
  Introduction
  Exercise 7.01: Loading and Understanding the Data
  Outliers
  Exercise 7.02: Checking for Outliers
  Activity 7.01: Checking for Outliers
  Exercise 7.03: Plotting the Distributions and Relationships Between Specific Features
  Activity 7.02: Plotting Distributions and Relationships between Columns with Respect to the Target Column
  Exercise 7.04: Plotting the Relationship between the Presence of Heart Disease and Maximum Recorded Heart Rate
  Activity 7.03: Plotting the Relationship between the Presence of Heart Disease and the Cholesterol Column
  Exercise 7.05: Observing Correlations with a Heatmap
  Summary

Chapter 8: Analysis of Credit Card Defaulters
  Introduction
  Data Cleaning
  Exercise 8.01: Loading and Cleaning Our Data
  Data Preparation and Feature Engineering
  Exercise 8.02: Preparing Our Data
  Data Analysis
  Exercise 8.03: Finding the Answers in Our Data
  Activity 8.01: Performing Data Analysis on the Online Retail II Dataset
  Summary

Chapter 9: Analysis of the Energy Consumed by Appliances
  Introduction
  Exercise 9.01: Taking a Closer Look at the Dataset
  Exercise 9.02: Analyzing the Light Energy Consumption Column
  Activity 9.01: Analyzing the Appliances Energy Consumption Column
  Exercise 9.03: Performing Feature Engineering
  Exercise 9.04: Visualizing the Dataset
  Activity 9.02: Observing the Trend between a_energy and day
  Exercise 9.05: Plotting Distributions of the Temperature Columns
  Activity 9.03: Plotting Distributions of the Humidity Columns
  Exercise 9.06: Plotting out_b, out_hum, visibility, and wind
  Summary

Chapter 10: Analyzing Air Quality
  Introduction
  About the Dataset
  Exercise 10.01: Concatenating Multiple DataFrames and Checking for Missing Values
  Outliers
  Exercise 10.02: Identifying Outliers
  Activity 10.01: Checking for Outliers
  Missing Values
  Exercise 10.03: Dealing with Missing Values
  Exercise 10.04: Observing the Concentration of PM25 and PM10 per Year
  Activity 10.02: Observing the Pollutant Concentration per Year
  Activity 10.03: Observing Pollutant Concentration per Month
  Heatmaps
  Exercise 10.05: Checking for Correlations between Features
  Summary

Appendix
Index