Author: Oswald Campesato
ISBN: 1683929047, 9781683929048
Publisher: Mercury Learning and Information
Year of publication: 2022
Number of pages: 300 [275]
Language: English
File format: PDF (converted to PDF, EPUB, or AZW3 on request)
File size: 2 MB
Title: Data Wrangling Using Pandas, SQL, and Java (original English-language edition, not a Persian translation)
This book is intended primarily for those who plan to become data scientists, as well as anyone who needs to perform data cleaning tasks. It covers a variety of NumPy and Pandas features and shows how to create databases and tables in MySQL. Chapter 7 covers many data wrangling tasks using Python scripts and awk-based shell scripts. Companion files with code are available for download from the publisher.

Features:
Provides the reader with basic Python 3, Java, and Pandas programming concepts, and an introduction to awk
Includes a chapter on RDBMS and SQL
Companion files with code
Table of contents:

Cover
Title Page
Copyright
Dedication
Contents
Preface

Chapter 1: Introduction to Python
  Tools for Python
  easy_install and pip
  virtualenv
  IPython
  Python Installation
  Setting the PATH Environment Variable (Windows Only)
  Launching Python on Your Machine
  The Python Interactive Interpreter
  Python Identifiers
  Lines, Indentation, and Multi-Lines
  Quotation and Comments
  Saving Your Code in a Module
  Some Standard Modules
  The help() and dir() Functions
  Compile Time and Runtime Code Checking
  Simple Data Types
  Working with Numbers
  Working with Other Bases
  The chr() Function
  The round() Function in Python
  Formatting Numbers in Python
  Working with Fractions
  Unicode and UTF-8
  Working with Unicode
  Working with Strings
  Comparing Strings
  Formatting Strings in Python
  Uninitialized Variables and the Value None
  Slicing and Splicing Strings
  Testing for Digits and Alphabetic Characters
  Search and Replace a String in Other Strings
  Remove Leading and Trailing Characters
  Printing Text Without NewLine Characters
  Text Alignment
  Working with Dates
  Converting Strings to Dates
  Exception Handling
  Handling User Input
  Command-Line Arguments
  Summary

Chapter 2: Working with Data
  Dealing with Data: What Can Go Wrong?
  What is Data Drift?
  What are Datasets?
  Data Preprocessing
  Data Types
  Preparing Datasets
  Discrete Data vs. Continuous Data
  “Binning” Continuous Data
  Scaling Numeric Data via Normalization
  Scaling Numeric Data via Standardization
  Scaling Numeric Data via Robust Standardization
  What to Look for in Categorical Data
  Mapping Categorical Data to Numeric Values
  Working with Dates
  Working with Currency
  Working with Outliers and Anomalies
  Outlier Detection/Removal
  Finding Outliers with NumPy
  Finding Outliers with Pandas
  Calculating Z-Scores to Find Outliers
  Finding Outliers with SkLearn (Optional)
  Working with Missing Data
  Imputing Values: When is Zero a Valid Value?
  Dealing with Imbalanced Datasets
  What is SMOTE?
  SMOTE Extensions
  The Bias-Variance Tradeoff
  Types of Bias in Data
  Analyzing Classifiers (Optional)
  What is LIME?
  What is ANOVA?
  Summary

Chapter 3: Introduction to Pandas
  What is Pandas?
  Pandas Data Frames
  Data Frames and Data Cleaning Tasks
  A Pandas Data Frame Example
  Describing a Pandas Data Frame
  Pandas Boolean Data Frames
  Transposing a Pandas Data Frame
  Pandas Data Frames and Random Numbers
  Converting Categorical Data to Numeric Data
  Merging and Splitting Columns in Pandas
  Combining Pandas Data Frames
  Data Manipulation with Pandas Data Frames
  Pandas Data Frames and CSV Files
  Useful Options for the Pandas read_csv() Function
  Reading Selected Rows from CSV Files
  Pandas Data Frames and Excel Spreadsheets
  Useful Options for Reading Excel Spreadsheets
  Select, Add, and Delete Columns in Data Frames
  Handling Outliers in Pandas
  Pandas Data Frames and Simple Statistics
  Finding Duplicate Rows in Pandas
  Finding Missing Values in Pandas
  Missing Values in an Iris-Based Dataset
  Sorting Data Frames in Pandas
  Working with groupby() in Pandas
  Aggregate Operations with the titanic.csv Dataset
  Working with apply() and mapapply() in Pandas
  Useful One-line Commands in Pandas
  Working with JSON-based Data
  Python Dictionary and JSON
  Python, Pandas, and JSON
  Summary

Chapter 4: RDBMS and SQL
  What is an RDBMS?
  What Relationships Do Tables Have in an RDBMS?
  Features of an RDBMS
  What is ACID?
  When Do We Need an RDBMS?
  The Importance of Normalization
  A Four-Table RDBMS
  Detailed Table Descriptions
  The customers Table
  The purchase_orders Table
  The line_items Table
  The item_desc Table
  What is SQL?
  DCL, DDL, DQL, DML, and TCL
  SQL Privileges
  Properties of SQL Statements
  The CREATE Keyword
  What is MySQL?
  What about MariaDB?
  Installing MySQL
  Data Types in MySQL
  The CHAR and VARCHAR Data Types
  String-based Data Types
  FLOAT and DOUBLE Data Types
  BLOB and TEXT Data Types
  MySQL Database Operations
  Creating a Database
  Display a List of Databases
  Display a List of Database Users
  Dropping a Database
  Exporting a Database
  Renaming a Database
  The INFORMATION_SCHEMA Table
  The PROCESSLIST Table
  SQL Formatting Tools
  Summary

Chapter 5: Java, JSON, and XML
  Working with Java and MySQL
  Performing the Set-up Steps
  Creating a MySQL Database in Java
  Creating a MySQL Table in Java
  Inserting Data into a MySQL Table in Java
  Deleting Data and Dropping MySQL Tables in Java
  Selecting Data from a MySQL Table in Java
  Updating Data in a MySQL Table in Java
  Working with JSON, MySQL, and Java
  Select JSON-based Data from a MySQL Table in Java
  Working with XML, MySQL, and Java
  What is XML?
  What is an XML Schema?
  When are XML Schemas Useful?
  Create a MySQL Table for XML Data in Java
  Read an XML Document in Java
  Read an XML Document as a String in Java
  Insert XML-based Data into a MySQL Table in Java
  Select XML-based Data from a MySQL Table in Java
  Parse XML-based String Data from a MySQL Table in Java
  Working with XML Schemas
  Summary

Chapter 6: Data Cleaning Tasks
  What is Data Cleaning?
  Data Cleaning for Personal Titles
  Data Cleaning in SQL
  Replace NULL with 0
  Replace NULL Values with Average Value
  Replace Multiple Values with a Single Value
  Handle Mismatched Attribute Values
  Convert Strings to Date Values
  Data Cleaning from the Command Line (Optional)
  Working with the sed Utility
  Working with Variable Column Counts
  Truncating Rows in CSV Files
  Generating Rows with Fixed Columns with the awk Utility
  Converting Phone Numbers
  Converting Numeric Date Formats
  Converting Alphabetic Date Formats
  Working with Date and Time Date Formats
  Working with Codes, Countries, and Cities
  Data Cleaning on a Kaggle Dataset
  Summary

Chapter 7: Data Wrangling
  What is Data Wrangling?
  Data Transformation: What Does This Mean?
  CSV Files with Multi-Row Records
  Pandas Solution (1)
  Pandas Solution (2)
  CSV Solution
  CSV Files, Multi-row Records, and the awk Command
  Quoted Fields Split on Two Lines (Optional)
  Overview of the Events Project
  Why This Project?
  Project Tasks
  Generate Country Codes
  Prepare a List of Cities in Countries
  Generating City Codes from Country Codes: awk
  Generating City Codes from Country Codes: Python
  Generating SQL Statements for the city_codes Table
  Generating a CSV File for Band Members (Java)
  Generating a CSV File for Band Members (Python)
  Generating a Calendar of Events (COE)
  Project Automation Script
  Project Follow-up Comments
  Summary

Appendix A: Working with awk
  The awk Command
  Built-in Variables That Control awk
  How Does the awk Command Work?
  Aligning Text with the printf() Statement
  Conditional Logic and Control Statements
  The while Statement
  A for Loop in awk
  A for Loop with a break Statement
  The next and continue Statements
  Deleting Alternate Lines in Datasets
  Merging Lines in Datasets
  Printing File Contents as a Single Line
  Joining Groups of Lines in a Text File
  Joining Alternate Lines in a Text File
  Matching with Meta Characters and Character Sets
  Printing Lines Using Conditional Logic
  Splitting Filenames with awk
  Working with Postfix Arithmetic Operators
  Numeric Functions in awk
  One-line awk Commands
  Useful Short awk Scripts
  Printing the Words in a Text String in awk
  Count Occurrences of a String in Specific Rows
  Printing a String in a Fixed Number of Columns
  Printing a Dataset in a Fixed Number of Columns
  Aligning Columns in Datasets
  Aligning Columns and Multiple Rows in Datasets
  Removing a Column from a Text File
  Subsets of Column-aligned Rows in Datasets
  Counting Word Frequency in Datasets
  Displaying Only “Pure” Words in a Dataset
  Working with Multi-line Records in awk
  A Simple Use Case
  Another Use Case
  Summary

Index