ورود به حساب

نام کاربری گذرواژه

گذرواژه را فراموش کردید؟ کلیک کنید

حساب کاربری ندارید؟ ساخت حساب

ساخت حساب کاربری

نام نام کاربری ایمیل شماره موبایل گذرواژه

برای ارتباط با ما می توانید از طریق شماره موبایل زیر از طریق تماس و پیامک با ما در ارتباط باشید


09117307688
09117179751

در صورت عدم پاسخ گویی از طریق پیامک با پشتیبان در ارتباط باشید

دسترسی نامحدود

برای کاربرانی که ثبت نام کرده اند

ضمانت بازگشت وجه

درصورت عدم همخوانی توضیحات با کتاب

پشتیبانی

از ساعت 7 صبح تا 10 شب

دانلود کتاب Principles Of Data Mining

دانلود کتاب اصول داده کاوی

Principles Of Data Mining

مشخصات کتاب

Principles Of Data Mining

ویرایش: [4th Edition] 
نویسندگان:   
سری: Undergraduate Topics In Computer Science 
ISBN (شابک) : 1447174925, 9781447174936 
ناشر: Springer 
سال نشر: 2020 
تعداد صفحات: 576 
زبان: English 
فرمت فایل : PDF (درصورت درخواست کاربر به PDF، EPUB یا AZW3 تبدیل می شود) 
حجم فایل: 6 Mb 

قیمت کتاب (تومان) : 82,000



ثبت امتیاز به این کتاب

میانگین امتیاز به این کتاب :
       تعداد امتیاز دهندگان : 10


در صورت تبدیل فایل کتاب Principles Of Data Mining به فرمت های PDF، EPUB، AZW3، MOBI و یا DJVU می توانید به پشتیبان اطلاع دهید تا فایل مورد نظر را تبدیل نمایند.

توجه داشته باشید کتاب اصول داده کاوی نسخه زبان اصلی می باشد و کتاب ترجمه شده به فارسی نمی باشد. وبسایت اینترنشنال لایبرری ارائه دهنده کتاب های زبان اصلی می باشد و هیچ گونه کتاب ترجمه شده یا نوشته شده به فارسی را ارائه نمی دهد.


توضیحاتی در مورد کتاب اصول داده کاوی

این کتاب تکنیک‌های اصلی داده‌کاوی، استخراج خودکار اطلاعات ضمنی و بالقوه مفید از داده‌ها را که به طور فزاینده‌ای در حوزه‌های تجاری، علمی و سایر حوزه‌های کاربردی مورد استفاده قرار می‌گیرد، توضیح می‌دهد و بررسی می‌کند. این بر طبقه بندی، استخراج قوانین انجمن و خوشه بندی تمرکز دارد. هر موضوع به وضوح توضیح داده شده است، با تمرکز بر الگوریتم ها نه فرمالیسم ریاضی، و با مثال های دقیق کار شده نشان داده شده است. این کتاب برای خوانندگان بدون پیش زمینه قوی در ریاضیات یا آمار نوشته شده است و هر فرمول استفاده شده با جزئیات توضیح داده شده است. این می تواند به عنوان یک کتاب درسی برای پشتیبانی از دوره های کارشناسی یا کارشناسی ارشد در طیف گسترده ای از موضوعات از جمله علوم کامپیوتر، مطالعات بازرگانی، بازاریابی، هوش مصنوعی، بیوانفورماتیک و علوم قانونی استفاده شود. به عنوان کمکی برای خودآموزی، هدف آن کمک به خوانندگان عمومی است تا درک لازم را از آنچه در داخل "جعبه سیاه" وجود دارد ایجاد کنند تا بتوانند از بسته های داده کاوی تجاری به طور متمایز استفاده کنند و همچنین خوانندگان پیشرفته یا محققان دانشگاهی را قادر می سازد تا درک کنند یا مشارکت کنند. به پیشرفت های فنی آینده در این زمینه. هر فصل دارای تمرین‌های عملی است تا خوانندگان بتوانند پیشرفت خود را بررسی کنند. یک واژه نامه کامل از اصطلاحات فنی استفاده شده گنجانده شده است. اصول داده کاوی شامل توصیفی از الگوریتم‌هایی برای طبقه‌بندی داده‌های جریانی است، هم داده‌های ثابت، جایی که مدل اصلی ثابت است، و هم داده‌هایی که وابسته به زمان هستند، جایی که مدل اساسی هر از گاهی تغییر می‌کند - پدیده‌ای که به عنوان رانش مفهومی شناخته می‌شود. نسخه چهارم توسعه‌یافته، شرح مفصلی از یک شبکه عصبی پیش‌خور با پس انتشار ارائه می‌دهد و نشان می‌دهد که چگونه می‌توان از آن برای طبقه‌بندی استفاده کرد.


توضیحاتی درمورد کتاب به خارجی

This book explains and explores the principal techniques of Data Mining, the automatic extraction of implicit and potentially useful information from data, which is increasingly used in commercial, scientific and other application areas. It focuses on classification, association rule mining and clustering. Each topic is clearly explained, with a focus on algorithms not mathematical formalism, and is illustrated by detailed worked examples. The book is written for readers without a strong background in mathematics or statistics and any formulae used are explained in detail. It can be used as a textbook to support courses at undergraduate or postgraduate levels in a wide range of subjects including Computer Science, Business Studies, Marketing, Artificial Intelligence, Bioinformatics and Forensic Science. As an aid to self-study, it aims to help general readers develop the necessary understanding of what is inside the 'black box' so they can use commercial data mining packages discriminatingly, as well as enabling advanced readers or academic researchers to understand or contribute to future technical advances in the field. Each chapter has practical exercises to enable readers to check their progress. A full glossary of technical terms used is included. Principles of Data Mining includes descriptions of algorithms for classifying streaming data, both stationary data, where the underlying model is fixed, and data that is time-dependent, where the underlying model changes from time to time - a phenomenon known as concept drift. The expanded fourth edition gives a detailed description of a feed-forward neural network with backpropagation and shows how it can be used for classification.



فهرست مطالب

Principles of Data Mining
	About This Book
	Contents
1. Introduction to Data Mining
	1.1 The Data Explosion
	1.2 Knowledge Discovery
	1.3 Applications of Data Mining
	1.4 Labelled and Unlabelled Data
	1.5 Supervised Learning: Classification
	1.6 Supervised Learning: Numerical Prediction
	1.7 Unsupervised Learning: Association Rules
	1.8 Unsupervised Learning: Clustering
2. Data for Data Mining
	2.1 Standard Formulation
	2.2 Types of Variable
		2.2.1 Categorical and Continuous Attributes
	2.3 Data Preparation
		2.3.1 Data Cleaning
	2.4 Missing Values
		2.4.1 Discard Instances
		2.4.2 Replace by Most Frequent/Average Value
	2.5 Reducing the Number of Attributes
	2.6 The UCI Repository of Datasets
	2.7 Chapter Summary
	2.8 Self-assessment Exercises for Chapter 2
	Reference
3. Introduction to Classification: Naïve Bayes and Nearest Neighbour
	3.1 What Is Classification?
	3.2 Naïve Bayes Classifiers
	3.3 Nearest Neighbour Classification
		3.3.1 Distance Measures
		3.3.2 Normalisation
		3.3.3 Dealing with Categorical Attributes
	3.4 Eager and Lazy Learning
	3.5 Chapter Summary
	3.6 Self-assessment Exercises for Chapter 3
4. Using Decision Trees for Classification
	4.1 Decision Rules and Decision Trees
		4.1.1 Decision Trees: The Golf Example
		4.1.2 Terminology
		4.1.3 The degrees Dataset
	4.2 The TDIDT Algorithm
	4.3 Types of Reasoning
	4.4 Chapter Summary
	4.5 Self-assessment Exercises for Chapter 4
	References
5. Decision Tree Induction: Using Entropy for Attribute Selection
	5.1 Attribute Selection: An Experiment
	5.2 Alternative Decision Trees
		5.2.1 The Football/Netball Example
		5.2.2 The anonymous Dataset
	5.3 Choosing Attributes to Split On: Using Entropy
		5.3.1 The lens24 Dataset
		5.3.2 Entropy
		5.3.3 Using Entropy for Attribute Selection
		5.3.4 Maximising Information Gain
	5.4 Chapter Summary
	5.5 Self-assessment Exercises for Chapter 5
6. Decision Tree Induction: Using Frequency Tables for Attribute Selection
	6.1 Calculating Entropy in Practice
		6.1.1 Proof of Equivalence
		6.1.2 A Note on Zeros
	6.2 Other Attribute Selection Criteria: Gini Index of Diversity
	6.3 The chi2 Attribute Selection Criterion
	6.4 Inductive Bias
	6.5 Using Gain Ratio for Attribute Selection
		6.5.1 Properties of Split Information
		6.5.2 Summary
	6.6 Number of Rules Generated by Different Attribute Selection Criteria
	6.7 Missing Branches
	6.8 Chapter Summary
	6.9 Self-assessment Exercises for Chapter 6
	References
7. Estimating the Predictive Accuracy of a Classifier
	7.1 Introduction
	7.2 Method 1: Separate Training and Test Sets
		7.2.1 Standard Error
		7.2.2 Repeated Train and Test
	7.3 Method 2: k-fold Cross-validation
	7.4 Method 3: N-fold Cross-validation
	7.5 Experimental Results I
	7.6 Experimental Results II: Datasets with Missing Values
		7.6.1 Strategy 1: Discard Instances
		7.6.2 Strategy 2: Replace by Most Frequent/Average Value
		7.6.3 Missing Classifications
	7.7 Confusion Matrix
		7.7.1 True and False Positives
	7.8 Chapter Summary
	7.9 Self-assessment Exercises for Chapter 7
	Reference
8. Continuous Attributes
	8.1 Introduction
	8.2 Local versus Global Discretisation
	8.3 Adding Local Discretisation to TDIDT
		8.3.1 Calculating the Information Gain of a Set of Pseudo-attributes
		8.3.2 Computational Efficiency
	8.4 Using the ChiMerge Algorithm for Global Discretisation
		8.4.1 Calculating the Expected Values and chi2
		8.4.2 Finding the Threshold Value
		8.4.3 Setting minIntervals and maxIntervals
		8.4.4 The ChiMerge Algorithm: Summary
		8.4.5 The ChiMerge Algorithm: Comments
	8.5 Comparing Global and Local Discretisation for Tree Induction
	8.6 Chapter Summary
	8.7 Self-assessment Exercises for Chapter 8
	Reference
9. Avoiding Overfitting of Decision Trees
	9.1 Dealing with Clashes in a Training Set
		9.1.1 Adapting TDIDT to Deal with Clashes
	9.2 More About Overfitting Rules to Data
	9.3 Pre-pruning Decision Trees
	9.4 Post-pruning Decision Trees
	9.5 Chapter Summary
	9.6 Self-assessment Exercise for Chapter 9
	References
10. More About Entropy
	10.1 Introduction
	10.2 Coding Information Using Bits
	10.3 Discriminating Amongst M Values (M Not a Power of 2)
	10.4 Encoding Values That Are Not Equally Likely
	10.5 Entropy of a Training Set
	10.6 Information Gain Must Be Positive or Zero
	10.7 Using Information Gain for Feature Reduction for Classification Tasks
		10.7.1 Example 1: The genetics Dataset
		10.7.2 Example 2: The bcst96 Dataset
	10.8 Chapter Summary
	10.9 Self-assessment Exercises for Chapter 10
	References
11. Inducing Modular Rules for Classification
	11.1 Rule Post-pruning
	11.2 Conflict Resolution
	11.3 Problems with Decision Trees
	11.4 The Prism Algorithm
		11.4.1 Changes to the Basic Prism Algorithm
		11.4.2 Comparing Prism with TDIDT
	11.5 Chapter Summary
	11.6 Self-assessment Exercise for Chapter 11
	References
12. Measuring the Performance of a Classifier
	12.1 True and False Positives and Negatives
	12.2 Performance Measures
	12.3 True and False Positive Rates versus Predictive Accuracy
	12.4 ROC Graphs
	12.5 ROC Curves
	12.6 Finding the Best Classifier
	12.7 Chapter Summary
	12.8 Self-assessment Exercise for Chapter 12
13. Dealing with Large Volumes of Data
	13.1 Introduction
	13.2 Distributing Data onto Multiple Processors
	13.3 Case Study: PMCRI
	13.4 Evaluating the Effectiveness of a Distributed System: PMCRI
	13.5 Revising a Classifier Incrementally
	13.6 Chapter Summary
	13.7 Self-assessment Exercises for Chapter 13
	References
14. Ensemble Classification
	14.1 Introduction
	14.2 Estimating the Performance of a Classifier
	14.3 Selecting a Different Training Set for Each Classifier
	14.4 Selecting a Different Set of Attributes for Each Classifier
	14.5 Combining Classifications: Alternative Voting Systems
	14.6 Parallel Ensemble Classifiers
	14.7 Chapter Summary
	14.8 Self-assessment Exercises for Chapter 14
	References
15. Comparing Classifiers
	15.1 Introduction
	15.2 The Paired t-Test
	15.3 Choosing Datasets for Comparative Evaluation
		15.3.1 Confidence Intervals
	15.4 Sampling
	15.5 How Bad Is a `No Significant Difference\' Result?
	15.6 Chapter Summary
	15.7 Self-assessment Exercises for Chapter 15
	References
16. Association Rule Mining I
	16.1 Introduction
	16.2 Measures of Rule Interestingness
		16.2.1 The Piatetsky-Shapiro Criteria and the RI Measure
		16.2.2 Rule Interestingness Measures Applied to the chess Dataset
		16.2.3 Using Rule Interestingness Measures for Conflict Resolution
	16.3 Association Rule Mining Tasks
	16.4 Finding the Best N Rules
		16.4.1 The J-Measure: Measuring the Information Content of a Rule
		16.4.2 Search Strategy
	16.5 Chapter Summary
	16.6 Self-assessment Exercises for Chapter 16
	References
17. Association Rule Mining II
	17.1 Introduction
	17.2 Transactions and Itemsets
	17.3 Support for an Itemset
	17.4 Association Rules
	17.5 Generating Association Rules
	17.6 Apriori
	17.7 Generating Supported Itemsets: An Example
	17.8 Generating Rules for a Supported Itemset
	17.9 Rule Interestingness Measures: Lift and Leverage
	17.10 Chapter Summary
	17.11 Self-assessment Exercises for Chapter 17
	Reference
18. Association Rule Mining III: Frequent Pattern Trees
	18.1 Introduction: FP-Growth
	18.2 Constructing the FP-tree
		18.2.1 Pre-processing the Transaction Database
		18.2.2 Initialisation
		18.2.3 Processing Transaction 1: f, c, a, m, p
		18.2.4 Processing Transaction 2: f, c, a, b, m
		18.2.5 Processing Transaction 3: f, b
		18.2.6 Processing Transaction 4: c, b, p
		18.2.7 Processing Transaction 5: f, c, a, m, p
	18.3 Finding the Frequent Itemsets from the FP-tree
		18.3.1 Itemsets Ending with Item p
		18.3.2 Itemsets Ending with Item m
	18.4 Chapter Summary
	18.5 Self-assessment Exercises for Chapter 18
	Reference
19. Clustering
	19.1 Introduction
	19.2 k-Means Clustering
		19.2.1 Example
		19.2.2 Finding the Best Set of Clusters
	19.3 Agglomerative Hierarchical Clustering
		19.3.1 Recording the Distance Between Clusters
		19.3.2 Terminating the Clustering Process
	19.4 Chapter Summary
	19.5 Self-assessment Exercises for Chapter 19
20. Text Mining
	20.1 Multiple Classifications
	20.2 Representing Text Documents for Data Mining
	20.3 Stop Words and Stemming
	20.4 Using Information Gain for Feature Reduction
	20.5 Representing Text Documents: Constructing a Vector Space Model
	20.6 Normalising the Weights
	20.7 Measuring the Distance Between Two Vectors
	20.8 Measuring the Performance of a Text Classifier
	20.9 Hypertext Categorisation
		20.9.1 Classifying Web Pages
		20.9.2 Hypertext Classification versus Text Classification
	20.10 Chapter Summary
	20.11 Self-assessment Exercises for Chapter 20
21. Classifying Streaming Data
	21.1 Introduction
		21.1.1 Stationary v Time-dependent Data
	21.2 Building an H-Tree: Updating Arrays
		21.2.1 Array currentAtts
		21.2.2 Array splitAtt
		21.2.3 Sorting a record to the appropriate leaf node
		21.2.4 Array hitcount
		21.2.5 Array classtotals
		21.2.6 Array acvCounts
		21.2.7 Array branch
	21.3 Building an H-Tree: a Detailed Example
		21.3.1 Step (a): Initialise Root Node 0
		21.3.2 Step (b): Begin Reading Records
		21.3.3 Step (c): Consider Splitting at Node 0
		21.3.4 Step (d): Split on Root Node and Initialise New Leaf Nodes
		21.3.5 Step (e): Process the Next Set of Records
		21.3.6 Step (f): Consider Splitting at Node 2
		21.3.7 Step (g): Process the Next Set of Records
		21.3.8 Outline of the H-Tree Algorithm
	21.4 Splitting on an Attribute: Using Information Gain
	21.5 Splitting on An Attribute: Using a Hoeffding Bound
	21.6 H-Tree Algorithm: Final Version
	21.7 Using an Evolving H-Tree to Make Predictions
		21.7.1 Evaluating the Performance of an H-Tree
	21.8 Experiments: H-Tree versus TDIDT
		21.8.1 The lens24 Dataset
		21.8.2 The vote Dataset
	21.9 Chapter Summary
	21.10 Self-assessment Exercises for Chapter 21
	References
22. Classifying Streaming Data II: Time-Dependent Data
	22.1 Stationary versus Time-dependent Data
	22.2 Summary of the H-Tree Algorithm
		22.2.1 Array currentAtts
		22.2.2 Array splitAtt
		22.2.3 Array hitcount
		22.2.4 Array classtotals
		22.2.5 Array acvCounts
		22.2.6 Array branch
		22.2.7 Pseudocode for the H-Tree Algorithm
	22.3 From H-Tree to CDH-Tree: Overview
	22.4 From H-Tree to CDH-Tree: Incrementing Counts
	22.5 The Sliding Window Method
	22.6 Resplitting at Nodes
	22.7 Identifying Suspect Nodes
	22.8 Creating Alternate Nodes
	22.9 Growing/Forgetting an Alternate Node and its Descendants
	22.10 Replacing an Internal Node by One of its Alternate Nodes
	22.11 Experiment: Tracking Concept Drift
		22.11.1 lens24 Data: Alternative Mode
		22.11.2 Introducing Concept Drift
		22.11.3 An Experiment with Alternating lens24 Data
		22.11.4 Comments on Experiment
	22.12 Chapter Summary
	22.13 Self-assessment Exercises for Chapter 22
	Reference
23. An Introduction to Neural Networks
	23.1 Introduction
	23.2 Neural Nets Example 1
	23.3 Neural Nets Example 2
		23.3.1 Forward Propagating the Values of the Input Nodes
		23.3.2 Forward Propagation: Summary of Formulae
	23.4 Backpropagation
		23.4.1 Stochastic Gradient Descent
		23.4.2 Finding the Gradients
		23.4.3 Working backwards from the output layer to the hidden layer
		23.4.4 Working backwards from the hidden layer to the input layer
		23.4.5 Updating the Weights
	23.5 Processing a Multi-instance Training Set
	23.6 Using a Neural Net for Classification: the iris Dataset
	23.7 Using a Neural Net for Classification: the seeds Dataset
	23.8 Neural Nets: A Note of Caution
	23.9 Chapter Summary
	23.10 Self-assessment Exercises for Chapter 23
A. Essential Mathematics
	A.1 Subscript Notation
		A.1.1 Sigma Notation for Summation
		A.1.2 Double Subscript Notation
		A.1.3 Other Uses of Subscripts
	A.2 Trees
		A.2.1 Terminology
		A.2.2 Interpretation
		A.2.3 Subtrees
	A.3 The Logarithm Function log2 X
		A.3.1 The Function -X log2 X
	A.4 Introduction to Set Theory
		A.4.1 Subsets
		A.4.2 Summary of Set Notation
B. Datasets
	References
C. Sources of Further Information
	Websites
	Books
	Conferences
	Information About Association Rule Mining
D. Glossary and Notation
E. Solutions to Self-assessment Exercises
Index




نظرات کاربران