Edition:
Author: Nataraj Dasgupta
Series:
ISBN: 9781783554393
Publisher: Packt Publishing
Year: 2018
Pages: 402
Language: English
File format: PDF (converted to PDF, EPUB, or AZW3 on request)
File size: 57 MB
Keywords for Practical Big Data Analytics: COM062000 - Computers / Data Modeling and Design, COM091000 - Computers / Cloud Computing, COM018000 - Computers / Data Processing
If you need the Practical Big Data Analytics file converted to PDF, EPUB, AZW3, MOBI, or DJVU, notify support and they will convert it for you.
Note that Practical Big Data Analytics is the original-language edition, not a Persian translation. The International Library website offers original-language books only and does not carry any books translated into or written in Persian.
Cover
Copyright and Credits
Packt Upsell
Contributors
Table of Contents
Preface
Chapter 1: Too Big or Not Too Big
    What is big data?
    A brief history of data
    Dawn of the information age
    Dr. Alan Turing and modern computing
    The advent of the stored-program computer
    From magnetic devices to SSDs
    Why we are talking about big data now if data has always existed
    Definition of big data
    Building blocks of big data analytics
    Types of Big Data
    Structured
    Unstructured
    Semi-structured
    Sources of big data
    The 4Vs of big data
    When do you know you have a big data problem and where do you start your search for the big data solution?
    Summary
Chapter 2: Big Data Mining for the Masses
    What is big data mining?
    Big data mining in the enterprise
    Building the case for a Big Data strategy
    Implementation life cycle
    Stakeholders of the solution
    Implementing the solution
    Technical elements of the big data platform
    Selection of the hardware stack
    Selection of the software stack
    Summary
Chapter 3: The Analytics Toolkit
    Components of the Analytics Toolkit
    System recommendations
    Installing on a laptop or workstation
    Installing on the cloud
    Installing Hadoop
    Installing Oracle VirtualBox
    Installing CDH in other environments
    Installing Packt Data Science Box
    Installing Spark
    Installing R
    Steps for downloading and installing Microsoft R Open
    Installing RStudio
    Installing Python
    Summary
Chapter 4: Big Data With Hadoop
    The fundamentals of Hadoop
    The fundamental premise of Hadoop
    The core modules of Hadoop
    Hadoop Distributed File System - HDFS
    Data storage process in HDFS
    Hadoop MapReduce
    An intuitive introduction to MapReduce
    A technical understanding of MapReduce
    Block size and number of mappers and reducers
    Hadoop YARN
    Job scheduling in YARN
    Other topics in Hadoop
    Encryption
    User authentication
    Hadoop data storage formats
    New features expected in Hadoop 3
    The Hadoop ecosystem
    Hands-on with CDH
    WordCount using Hadoop MapReduce
    Analyzing oil import prices with Hive
    Joining tables in Hive
    Summary
Chapter 5: Big Data Mining with NoSQL
    Why NoSQL?
    The ACID, BASE, and CAP properties
    ACID and SQL
    The BASE property of NoSQL
    The CAP theorem
    The need for NoSQL technologies
    Google Bigtable
    Amazon Dynamo
    NoSQL databases
    In-memory databases
    Columnar databases
    Document-oriented databases
    Key-value databases
    Graph databases
    Other NoSQL types and summary of other types of databases
    Analyzing Nobel Laureates data with MongoDB
    JSON format
    Installing and using MongoDB
    Tracking physician payments with real-world data
    Installing kdb+, R, and RStudio
    Installing kdb+
    Installing R
    Installing RStudio
    The CMS Open Payments Portal
    Downloading the CMS Open Payments data
    Creating the Q application
    Loading the data
    The backend code
    Creating the frontend web portal
    R Shiny platform for developers
    Putting it all together - The CMS Open Payments application
    Applications
    Summary
Chapter 6: Spark for Big Data Analytics
    The advent of Spark
    Limitations of Hadoop
    Overcoming the limitations of Hadoop
    Theoretical concepts in Spark
    Resilient distributed datasets
    Directed acyclic graphs
    SparkContext
    Spark DataFrames
    Actions and transformations
    Spark deployment options
    Spark APIs
    Core components in Spark
    Spark Core
    Spark SQL
    Spark Streaming
    GraphX
    MLlib
    The architecture of Spark
    Spark solutions
    Spark practicals
    Signing up for Databricks Community Edition
    Spark exercise - hands-on with Spark (Databricks)
    Summary
Chapter 7: An Introduction to Machine Learning Concepts
    What is machine learning?
    The evolution of machine learning
    Factors that led to the success of machine learning
    Machine learning, statistics, and AI
    Categories of machine learning
    Supervised and unsupervised machine learning
    Supervised machine learning
    Vehicle Mileage, Number Recognition and other examples
    Unsupervised machine learning
    Subdividing supervised machine learning
    Common terminologies in machine learning
    The core concepts in machine learning
    Data management steps in machine learning
    Pre-processing and feature selection techniques
    Centering and scaling
    The near-zero variance function
    Removing correlated variables
    Other common data transformations
    Data sampling
    Data imputation
    The importance of variables
    The train, test splits, and cross-validation concepts
    Splitting the data into train and test sets
    The cross-validation parameter
    Creating the model
    Leveraging multicore processing in the model
    Summary
Chapter 8: Machine Learning Deep Dive
    The bias, variance, and regularization properties
    The gradient descent and VC Dimension theories
    Popular machine learning algorithms
    Regression models
    Association rules
    Confidence
    Support
    Lift
    Decision trees
    The Random forest extension
    Boosting algorithms
    Support vector machines
    The K-Means machine learning technique
    The neural networks related algorithms
    Tutorial - associative rules mining with CMS data
    Downloading the data
    Writing the R code for Apriori
    Shiny (R Code)
    Using custom CSS and fonts for the application
    Running the application
    Summary
Chapter 9: Enterprise Data Science
    Enterprise data science overview
    A roadmap to enterprise analytics success
    Data science solutions in the enterprise
    Enterprise data warehouse and data mining
    Traditional data warehouse systems
    Oracle Exadata, Exalytics, and TimesTen
    HP Vertica
    Teradata
    IBM data warehouse systems (formerly Netezza appliances)
    PostgreSQL
    Greenplum
    SAP Hana
    Enterprise and open source NoSQL Databases
    Kdb+
    MongoDB
    Cassandra
    Neo4j
    Cloud databases
    Amazon Redshift, Redshift Spectrum, and Athena databases
    Google BigQuery and other cloud services
    Azure CosmosDB
    GPU databases
    Brytlyt
    MapD
    Other common databases
    Enterprise data science – machine learning and AI
    The R programming language
    Python
    OpenCV, Caffe, and others
    Spark
    Deep learning
    H2O and Driverless AI
    Datarobot
    Command-line tools
    Apache MADlib
    Machine learning as a service
    Enterprise infrastructure solutions
    Cloud computing
    Virtualization
    Containers – Docker, Kubernetes, and Mesos
    On-premises hardware
    Enterprise Big Data
    Tutorial – using RStudio in the cloud
    Summary
Chapter 10: Closing Thoughts on Big Data
    Corporate big data and data science strategy
    Ethical considerations
    Silicon Valley and data science
    The human factor
    Characteristics of successful projects
    Summary
Appendix: External Data Science Resources
    Big data resources
    NoSQL products
    Languages and tools
    Creating dashboards
    Notebooks
    Visualization libraries
    Courses on R
    Courses on machine learning
    Machine learning and deep learning links
    Web-based machine learning services
    Movies
    Machine learning books from Packt
    Books for leisure reading
Other Books You May Enjoy
    Leave a review - let other readers know what you think
Index