Download the book Learning Data Science: Data Wrangling, Exploration, Visualization, and Modeling with Python (Final)

Book details

Learning Data Science: Data Wrangling, Exploration, Visualization, and Modeling with Python (Final)

Edition:
Authors: Sam Lau
Series:
ISBN: 9781098113001
Publisher: O'Reilly Media
Year of publication: 2023
Number of pages: 594
Language: English
File format: EPUB (converted to PDF, EPUB, or AZW3 on request)
File size: 16 MB

Price (toman): 55,000

Average rating:
Number of ratings: 4

If you need the file for Learning Data Science: Data Wrangling, Exploration, Visualization, and Modeling with Python (Final) converted to PDF, EPUB, AZW3, MOBI, or DJVU, notify support and they will convert it for you.

Please note that Learning Data Science: Data Wrangling, Exploration, Visualization, and Modeling with Python (Final) is the original-language edition, not a Persian translation. The International Library website offers original-language books only and does not offer any books translated into or written in Persian.

Description of the book Learning Data Science: Data Wrangling, Exploration, Visualization, and Modeling with Python (Final)

As an aspiring data scientist, you appreciate why organizations rely on data for important decisions, whether it's for companies designing websites, cities deciding how to improve services, or scientists discovering how to stop the spread of disease. And you want the skills required to distill a messy pile of data into actionable insights. We call this the data science lifecycle: the process of collecting, wrangling, analyzing, and drawing conclusions from data.

Learning Data Science is the first book to cover foundational skills in both programming and statistics that encompass this entire lifecycle. It's aimed at those who wish to become data scientists or who already work with data scientists, and at data analysts who wish to cross the "technical/nontechnical" divide. If you have a basic knowledge of Python programming, you'll learn how to work with data using industry-standard tools like pandas.

Refine a question of interest to one that can be studied with data
Pursue data collection that may involve text processing, web scraping, etc.
Glean valuable insights about data through data cleaning, exploration, and visualization
Learn how to use modeling to describe the data
Generalize findings beyond the data

Table of contents

    Preface
        Expected Background Knowledge
        Organization of the Book
        Conventions Used in This Book
            Using Code Examples
            O’Reilly Online Learning
            How to Contact Us
        Acknowledgements
    I. The Data Science Lifecycle
    1. The Data Science Lifecycle
        The Stages of the Lifecycle
        Examples of the Lifecycle
        Summary
    2. Questions and Data Scope
        Big Data and New Opportunities
            Example: Google Flu Trends
        Target Population, Access Frame, Sample
        Instruments and Protocols
        Measuring Natural Phenomena
        Accuracy
            Types of Bias
            Types of Variation
        Summary
    3. Simulation and Data Design
        The Urn Model
            Sampling Designs
            Sampling Distribution of a Statistic
            Simulating the Sampling Distribution
            The Hypergeometric Distribution
        Example: Simulating Election Poll Bias and Variance
            The Pennsylvania Urn Model
            An Urn Model with Bias
            Conducting Larger Polls
        Example: Simulating a Randomized Trial for a Vaccine
            Scope
            The Urn Model for Random Assignment
        Example: Measuring Air Quality
        Summary
    4. Modeling with Summary Statistics
        The Constant Model
        Minimizing Loss
            Mean Absolute Error
            Mean Squared Error
            Choosing Loss Functions
        Summary
    5. Case Study: Why is my Bus Always Late?
        Question and Scope
        Data Wrangling
        Exploring Bus Times
        Modeling Wait Times
        Summary
    II. Rectangular Data
    6. Working With Dataframes Using pandas
        Subsetting
            Data Scope and Question
            DataFrames and Indices
            Slicing
            Filtering Rows
            Example: How Recently Has Luna Become a Popular Name?
        Aggregating
            Basic Group-Aggregate
            Grouping on Multiple Columns
            Custom Aggregation Functions
            Example: Have People Become More Creative With Baby Names?
            Pivoting
        Joining
            Inner Joins
            Left, Right, and Outer Joins
            Example: Popularity of NYT Name Categories
        Transforming
            Apply
            Example: Popularity of “L” Names
            The Price of Apply
        How are Dataframes Different from Other Data Representations?
            Dataframes and Spreadsheets
            Dataframes and Matrices
            Dataframes and Relations
        Summary
    7. Working With Relations Using SQL
        Subsetting
            SQL Basics: SELECT and FROM
            What’s a Relation?
            Slicing
            Filtering Rows
            Example: How Recently Has Luna Become a Popular Name?
        Aggregating
            Basic Group-Aggregate using GROUP BY
            Grouping on Multiple Columns
            Other Aggregation Functions
        Joining
            Inner Joins
            Left and Right Joins
            Example: Popularity of NYT Name Categories
        Transforming and Common Table Expressions
            SQL Functions
            Multistep Queries Using a WITH Clause
            Example: Popularity of “L” Names
        Summary
    III. Understanding The Data
    8. Wrangling Files
        Data Source Examples
            Drug Abuse Warning Network (DAWN) Survey
            San Francisco Restaurant Food Safety
        File Formats
            Delimited Format
            Fixed-width Format
            Hierarchical Formats
            Loosely Formatted Text
        File Encoding
        File Size
            Working with Large Data Sets
        The Shell and Command Line Tools
        Table Shape and Granularity
            Granularity of Restaurant Inspections and Violations
            DAWN Survey Shape and Granularity
        Summary
    9. Wrangling Dataframes
        Example: Wrangling CO2 Measurements from Mauna Loa Observatory
            Quality Checks
            Addressing Missing Data
            Reshaping the Data Table
        Quality Checks
            Quality Based on Scope
            Quality of Measurements and Recorded Values
            Quality Across Related Features
            Quality for Analysis
            Fixing the Data or Not
        Missing Values and Records
            Imputing Missing Values
        Transformations and Timestamps
            Transforming Timestamps
            Piping for Transformations
        Modifying Structure
        Example: Wrangling Restaurant Safety Violations
            Narrowing the Focus
            Aggregating Violations
            Extracting Information from Violation Descriptions
        Summary
    10. Exploratory Data Analysis
        Feature Types
            Example: Dog Breeds
            Transforming Qualitative Features
            The Importance of Feature Types
        What to Look For in a Distribution
        What to Look For in a Relationship
            Two Quantitative Features
            One Qualitative and One Quantitative Variable
            Two Qualitative Features
        Comparisons in Multivariate Settings
        Guidelines for Exploration
        Example: Sale Prices for Houses
            Understanding Price
            What Next?
            Examining Other Features
            Delving Deeper into Relationships
            Fixing Location
            EDA Discoveries
        Summary
    11. Data Visualization
        Choosing Scale to Reveal Structure
            Filling the Data Region
            Including Zero
            Revealing Shape Through Transformations
            Banking to Decipher Relationships
            Revealing Relationships Through Straightening
        Smoothing and Aggregating Data
            Smoothing Techniques to Uncover Shape
            Smoothing Techniques to Uncover Relationships and Trends
            Smoothing Techniques Need Tuning
            Reducing Distributions to Quantiles
            When Not to Smooth
        Facilitating Meaningful Comparisons
            Emphasize the Important Difference
            Ordering Groups
            Avoid Stacking
            Selecting a Color Palette
            Guidelines for Comparisons in Plots
        Incorporating the Data Design
            Data Collected over Time
            Observational Studies
            Unequal Sampling
            Geographic Data
        Adding Context
            Example: 100m Sprint Times
        Creating Plots Using plotly
            Figure and Trace Objects
            Modifying Layout
            Plotting Functions
            Annotations
        Other Tools for Visualization
            matplotlib
            Grammar of Graphics
        Summary
    12. Case Study: How Accurate are Air Quality Measurements?
        Question, Design, and Scope
        Finding Collocated Sensors
            Wrangling the List of AQS Sites
            Wrangling the List of PurpleAir Sites
            Matching AQS and PurpleAir Sensors
        Wrangling and Cleaning AQS Sensor Data
            Checking Granularity
            Removing Unneeded Columns
            Checking the Validity of Dates
            Checking the Quality of PM2.5 Measurements
        Wrangling PurpleAir Sensor Data
            Checking the Granularity
            Handling Missing Values
        Exploring PurpleAir and AQS Measurements
        Creating a Model to Correct PurpleAir Measurements
        Summary
    IV. Other Data Sources
    13. Working with Text
        Examples of Text and Tasks
            Convert Text into a Standard Format
            Extract a Piece of Text to Create a Feature
            Transform Text into Features
            Text Analysis
        String Manipulation
            Converting Text to a Standard Format with Python String Methods
            String Methods in pandas
            Splitting Strings to Extract Pieces of Text
        Regular Expressions
            Concatenation of Literals
            Quantifiers
            Alternation and Grouping to Create Features
            Reference Tables
        Text Analysis
        Summary
    14. Data Exchange
        NetCDF Data
            Example: Rainfall Around the World
        JSON Data
            Example: Air Quality Data Exchange
        HTTP
        REST
            Example: Retrieving Info on Clash Songs from Spotify
        XML, HTML, and XPath
            Example: Scraping Race Times from Wikipedia
            XPath
            Example: Accessing Exchange Rates from the ECB
        Summary
    V. Linear Modeling
    15. Linear Models
        Simple Linear Model
        Example: A Simple Linear Model for Air Quality
            Interpreting Linear Models
            Assessing the Fit
        Fitting the Simple Linear Model
        Multiple Linear Model
            Example: A Multiple Linear Model for Air Quality
        Fitting the Multiple Linear Model
            A Geometric Problem
        Example: Where is the Land of Opportunity?
            Explaining Upward Mobility using Commute Time
            Relating Upward Mobility Using Multiple Variables
        Feature Engineering for Numeric Measurements
        Feature Engineering for Categorical Measurements
        Summary
    16. Model Selection
        Overfitting
            Example: Energy Consumption
        Train-Test Split
        Cross-Validation
            Example: Fitting a Bent Line Model with Cross-validation
        Regularization
            Example: A Market Analysis
        Model Bias and Variance
        Summary
    17. Theory for Inference and Prediction
        Distributions: Population, Empirical, Sampling
        Basics of Hypothesis Testing
            Example: A Rank-test to Compare Productivity of Wikipedia Contributors
            Example: A Test of Proportions for Vaccine Efficacy
        Bootstrapping for Inference
            Bootstrapping a Test for a Regression Coefficient
        Basics of Confidence Intervals
            Confidence Intervals for a Coefficient
        Basics of Prediction Intervals
            Example: Predicting Bus Lateness
            Example: Predicting Crab Size
            Example: Predicting the Incremental Growth of a Crab
        Probability for Inference and Prediction
            Formalizing the Theory for Average Rank Statistics
            General Properties of Random Variables
            Probability Behind Testing and Intervals
            Probability Behind Model Selection
        Summary
    18. Case Study: How to Weigh a Donkey
        Donkey Study Question and Scope
        Wrangling and Transforming
            Train-Test Split of the Data
        Exploring
        Modeling a Donkey’s Weight
            A Loss Function for Prescribing Anesthetics
            Fitting a Simple Linear Model
            Fitting a Multiple Linear Model
            Bringing Qualitative Features into the Model
            Model Assessment
        Summary
    VI. Classification
    19. Classification
        Example: Wind Damaged Trees
        Modeling and Classification
            A Constant Model
            Examining the Relationship Between Size and Windthrow
        Modeling Proportions (and Probabilities)
            A Logistic Model
            Log Odds
            Using a Logistic Curve
        A Loss Function for the Logistic Model
            Fitting a Logistic Model
        From Probabilities to Classification
            The Confusion Matrix
            Precision vs Recall
        Summary
    20. Numerical Optimization
        Gradient Descent Basics
        Minimizing Huber Loss
        Convex and Differentiable Loss Functions
        Variants of Gradient Descent
            Stochastic Gradient Descent
            Mini-batch Gradient Descent
            Newton’s Method
        Summary
    21. Case Study: Detecting Fake News
        Question and Scope
        Obtaining and Wrangling the Data
        Exploring the Data
            Exploring the Publishers
            Exploring Publication Date
            Exploring Words in Articles
        Modeling
            A Single-Word Model
            Multiple Word Model
            Predicting with the tf-idf Transform
        Summary
    About the Authors
