Title: Bash for Data Scientists
Edition: 1st ed.
Author: Oswald Campesato
ISBN: 9781683929734, 2022948076
Publisher: Mercury Learning and Information
Year: 2022
Pages: 293
Language: English
File format: PDF
File size: 6 MB
This book introduces an assortment of powerful command-line utilities that can be combined to create simple yet powerful shell scripts for processing datasets. The code samples and scripts use the bash shell and typically involve small datasets, so you can focus on understanding the features of grep, sed, and awk. Companion files with code are available for download from the publisher.
Features
+ Provides the reader with powerful command-line utilities that can be combined to create simple yet powerful shell scripts for processing datasets
+ Contains a variety of code fragments and shell scripts for data scientists, data analysts, and those who want shell-based solutions to "clean" various types of datasets
+ Companion files with code are available for download, with Amazon proof of purchase, by writing to the publisher
Table of Contents
1: Introduction to UNIX
2: Files and Directories
3: Useful Commands
4: Conditional Logic and Loops
5: Processing Datasets with grep and sed
6: Processing Datasets with awk
7: Processing Datasets (Pandas)
8: NoSQL, SQLite, and Python
Index
About the Author
Oswald Campesato (San Francisco, CA) is an adjunct instructor at UC-Santa Clara and specializes in Deep Learning, Java, Android, and NLP. He is the author of over twenty-five books, including the SQL Pocket Primer, Python 3 for Machine Learning, and the NLP Using R Pocket Primer (all Mercury Learning).
Bash for Data Scientists

CONTENTS

PREFACE
  WHAT IS THE GOAL?
  IS THIS BOOK FOR ME AND WHAT WILL I LEARN?
  HOW WERE THE CODE SAMPLES CREATED?
  WHAT YOU NEED TO KNOW FOR THIS BOOK
  WHICH BASH COMMANDS ARE EXCLUDED?
  HOW DO I SET UP A COMMAND SHELL?
  WHAT ARE THE “NEXT STEPS” AFTER FINISHING THIS BOOK?

CHAPTER 1 INTRODUCTION
  WHAT IS UNIX?
    Available Shell Types
  WHAT IS BASH?
    Getting Help for Bash Commands
    Navigating Around Directories
    The history Command
  LISTING FILENAMES WITH THE LS COMMAND
  DISPLAYING CONTENTS OF FILES
    The cat Command
    The head and tail Commands
    The Pipe Symbol
    The fold Command
  FILE OWNERSHIP: OWNER, GROUP, AND WORLD
  HIDDEN FILES
  HANDLING PROBLEMATIC FILENAMES
  WORKING WITH ENVIRONMENT VARIABLES
    The env Command
    Useful Environment Variables
    Setting the PATH Environment Variable
    Specifying Aliases and Environment Variables
  FINDING EXECUTABLE FILES
  THE printf COMMAND AND THE echo COMMAND
  THE cut COMMAND
  THE echo COMMAND AND WHITESPACES
  COMMAND SUBSTITUTION (“BACK TICK”)
  THE PIPE SYMBOL AND MULTIPLE COMMA
  USING A SEMICOLON TO SEPARATE COMMANDS
  THE paste COMMAND
    Inserting Blank Lines with the paste Command
  A SIMPLE USE CASE WITH THE paste COMMAND
  A SIMPLE USE CASE WITH cut AND paste COMMANDS
  WORKING WITH META CHARACTERS
  WORKING WITH CHARACTER CLASSES
  WHAT ABOUT ZSH?
    Switching between bash and zsh
    Configuring zsh
  SUMMARY

CHAPTER 2 FILES AND DIRECTORIES
  CREATE, COPY, REMOVE, AND MOVE FILES
    Creating Files
    Copying Files
    Copy Files with Command Substitution
    Deleting Files
    Moving Files
  THE BASENAME, DIRNAME, AND FILE COMMANDS
  THE wc COMMAND
  THE more COMMAND AND THE less COMMAND
  THE head COMMAND
  THE tail COMMAND
  FILE COMPARISON COMMANDS
  THE PARTS OF A FILENA
  WORKING WITH FILE PERMISSIONS
    The chmod Command
    The chown Command
    The chgrp Command
    The umask and ulimit Commands
  WORKING WITH DIRECTORIES
    Absolute and Relative Directories
    Absolute and Relative Path Names
    Creating Directories
    Removing Directories
    Changing Directories
    Renaming Directories
  USING QUOTE CHARACTERS
  STREAMS AND REDIRECTION COMMANDS
  METACHARACTERS AND CHARACTER CLASSES
    Digits and Characters
    Working with “^” and “\” and “!”
  FILENAMES AND METACHARACTERS
  SUMMARY

CHAPTER 3 USEFUL COMMANDS
  THE join COMMAND
  THE fold COMMAND
  THE split COMMAND
  THE sort COMMAND
  THE uniq COMMAND
  HOW TO COMPARE FILES
  THE od COMMAND
  THE tr COMMAND
  A SIMPLE USE CASE
  THE find COMMAND
  THE tee COMMAND
  FILE COMPRESSION COMMANDS
    The tar command
    The cpio Command
    The gzip and gunzip Commands
    The bunzip2 Command
    The zip Command
  COMMANDS FOR zip FILES AND bz FILES
  INTERNAL FIELD SEPARATOR (IFS)
  DATA FROM A RANGE OF COLUMNS IN A DATASET
  WORKING WITH UNEVEN ROWS IN DATASETS
  THE alias COMMAND
  SUMMARY

CHAPTER 4 CONDITIONAL LOGIC AND LOOPS
  ARITHMETIC OPERATIONS AND OPERATORS
  WORKING WITH ARRAYS
  ARRAYS AND TEXT FILES
  WORKING WITH VARIABLES
    Assigning Values to Variables
  WORKING WITH OPERATORS FOR STRINGS AND NUMBERS
  THE read COMMAND FOR USER INPUT
  THE test COMMAND FOR VARIABLES, FILES, AND DIRECTORIES
    Relational Operators
    Boolean Operators
    String Operators
    File Test Operators
  CONDITIONAL LOGIC WITH if/else STATEMENTS
  THE case/esac STATEMENT
  ARITHMETIC OPERATORS AND COMPARISONS
  WORKING WITH STRINGS IN SHELL SCRIPTS
    Working with Strings
  WORKING WITH LOOPS
    Using a for loop
  WORKING WITH NESTED LOOPS
  USING A while LOOP
  THE while, case, AND if/elif/fi STATEMENTS
  USING AN UNTIL LOOP
  USER-DEFINED FUNCTIONS
  CREATING A SIMPLE MENU FROM SHELL COMMANDS
  SUMMARY

CHAPTER 5 PROCESSING DATASETS WITH GREP AND SED
  WHAT IS THE grep COMMAND?
  METACHARACTERS AND THE grep COMMAND
  ESCAPING METACHARACTERS WITH THE grep COMMAND
  USEFUL OPTIONS FOR THE grep COMMAND
    Character Classes and the grep Command
  WORKING WITH THE –C OPTION IN grep
  MATCHING A RANGE OF LINES
  USING BACK REFERENCES IN THE grep COMMAND
  FINDING EMPTY LINES IN DATASETS
  USING KEYS TO SEARCH DATASETS
  THE BACKSLASH CHARACTER AND THE grep COMMAND
  MULTIPLE MATCHES IN THE GREP COMMAND
  THE grep COMMAND AND THE xargs COMMAND
    Searching zip Files for a String
  CHECKING FOR A UNIQUE KEY VALUE
    Redirecting Error Messages
  THE egrep COMMAND AND fgrep COMMAND
    Displaying “Pure” Words in a Dataset with egrep
    The fgrep Command
  DELETE ROWS WITH MISSING VALUES
  A SIMPLE USE CASE
  WHAT IS THE sed COMMAND?
    The sed Execution Cycle
  MATCHING STRING PATTERNS USING sed
  SUBSTITUTING STRING PATTERNS USING sed
    Replacing Vowels from a String or a File
    Deleting Multiple Digits and Letters from a String
  SEARCH AND REPLACE WITH sed
  DATASETS WITH MULTIPLE DELIMITERS
  USEFUL SWITCHES IN sed
  WORKING WITH DATASETS
    Printing Lines
    Character Classes and sed
    Removing Control Characters
  COUNTING WORDS IN A DATASET
  BACK REFERENCES IN sed
  ONE-LINE sed COMMANDS
  POPULATE MISSING VALUES WITH THE sed COMMAND
  A DATASET WITH 1,000,000 ROWS
    Numeric Comparisons
    Counting Adjacent Digits
    Average Support Rate
  SUMMARY

CHAPTER 6 PROCESSING DATASETS WITH AWK
  THE awk COMMAND
    Built-in Variables that Control awk
    How Does the awk Command Work?
  ALIGNING TEXT WITH THE printf COMMAND
  CONDITIONAL LOGIC AND CONTROL STATEMENTS
    The while Statement
    A for loop in awk
    A for loop with a break Statement
    The next and continue Statements
  DELETING ALTERNATE LINES IN DATASETS
  MERGING LINES IN DATASETS
    Printing File Contents as a Single Line
    Joining Groups of Lines in a Text File
    Joining Alternate Lines in a Text File
  MATCHING WITH METACHARACTERS AND CHARACTER SETS
  PRINTING LINES USING CONDITIONAL LOGIC
  SPLITTING FILENAMES WITH awk
  WORKING WITH POSTFIX ARITHMETIC OPERATORS
  NUMERIC FUNCTIONS IN awk
  ONE-LINE awk COMMANDS
  USEFUL SHORT awk SCRIPTS
  PRINTING THE WORDS IN A TEXT STRING IN awk
  COUNT OCCURRENCES OF A STRING IN SPECIFIC ROWS
  PRINTING A STRING IN A FIXED NUMBER OF COLUMNS
  PRINTING A DATASET IN A FIXED NUMBER OF COLUMNS
  ALIGNING COLUMNS IN DATASETS
  ALIGNING COLUMNS AND MULTIPLE ROWS IN DATASETS
  DISPLAYING A SUBSET OF COLUMNS IN A TEXT FILE
  SUBSETS OF COLUMN-ALIGNED ROWS IN DATASETS
  COUNTING WORD FREQUENCY IN DATASETS
  DISPLAYING ONLY “PURE” WORDS IN A DATASET
  DELETE ROWS WITH MISSING VALUES
  WORKING WITH MULTI-LINE RECORDS IN AWK
  A SIMPLE USE CASE
  ANOTHER USE CASE
  A DATASET WITH 1,000,000 ROWS
    Counting Adjacent Digits
    Average Support Rate
  SUMMARY

CHAPTER 7 PROCESSING DATASETS (PANDAS)
  PREREQUISITES FOR THIS CHAPTER
  ANALYZING MISSING DATA
    Causes of Missing Data
  PANDAS, CSV FILES, AND MISSING DATA
    Single Column CSV Files
    Two Column CSV Files
  MISSING DATA AND IMPUTATION
    Counting Missing Data Values
    Drop Redundant Columns
    Remove Duplicate Rows
    Display Duplicate Rows
    Uniformity of Data Values
    Too Many Missing Data Values
    Categorical Data
    Data Inconsistency
    Mean Value Imputation
    Random Value Imputation
    Multiple Imputation
    Matching and Hot Deck Imputation
    Is a Zero Value Valid or Invalid?
  SKEWED DATASETS
  CSV FILES WITH MULTI-ROW RECORDS
  COLUMN SUBSET AND ROW SUBRANGE OF THE TITANIC CSV FILE
  DATA NORMALIZATION
    Assigning Classes to Data
    Other Data Cleaning Tasks
    DeepChecks and Data Validation
  HANDLING CATEGORICAL DATA
    Processing Inconsistent Categorical Data
    Mapping Categorical Data to Numeric Values
    Mapping Categorical Data to One Hot Encoded Values
  WORKING WITH CURRENCY
  WORKING WITH DATES
    Find Missing Dates
    Find Unique Dates
    Switch Date Formats
  WORKING WITH IMBALANCED DATASETS
    Data Sampling Techniques
    Removing Noisy Data
    Cost-sensitive Learning
    Detecting Imbalanced Data
    Rebalancing Datasets
    Specify stratify in Data Splits
  WHAT IS SMOTE?
  DATA WRANGLING
    Data Transformation: What Does This Mean?
  A DATASET WITH 1,000,000 ROWS
    Dataset Details
    Numeric Comparisons
    Counting Adjacent Digits
  SAVING CSV DATA TO XML, JSON, AND HTML FILES
  SUMMARY

CHAPTER 8 NOSQL, SQLITE, AND PYTHON
  NON-RELATIONAL DATABASE SYSTEMS
    Advantages of Non-relational Databases
  WHAT IS NOSQL?
    What is NewSQL?
  RDBMS VERSUS NOSQL: WHICH ONE TO USE?
    Good Data Types for NoSQL
    Some Guidelines for Selecting a Database
    NoSQL Databases
  WHAT IS MONGODB?
    Features of MongoDB
    Installing MongoDB
    Launching MongoDB
  USEFUL MONGO APIS
    Metacharacters in Mongo Queries
  MONGODB COLLECTIONS AND DOCUMENTS
    Document Format in MongoDB
  CREATE A MONGODB COLLECTION
  WORKING WITH MONGODB COLLECTIONS
    Find All Android Phones
    Find All Android Phones in 2018
    Insert a New Item (Document)
    Update an Existing Item (Document)
    Calculate the Average Price for Each Brand
    Calculate the Average Price for Each Brand in 2019
    Import Data with mongoimport
  WHAT IS FUGUE?
  WHAT IS COMPASS?
  WHAT IS PYMONGO?
  MYSQL, SQLALCHEMY, AND PANDAS
    What is SQLAlchemy?
    Read MySQL Data via SQLAlchemy
  EXPORT SQL DATA FROM PANDAS TO EXCEL
  MYSQL AND CONNECTOR/PYTHON
    Establishing a Database Connection
    Creating a Database Table
    Reading Data from a Database Table
  WHAT IS SQLITE?
    SQLite Features
    SQLite Installation
    SQLiteStudio Installation
    DB Browser for SQLite Installation
    SQLiteDict (Optional)
  WHAT IS TIMESCALEDB?
    Install Timescaledb (Macbook)
    Setting Up the TimescaleDB Extension
    The rides Table
    The Parallel Copy Command
    Data Analysis
  LARGE SCALE DATA IMPUTATION
  SUMMARY

INDEX