Edition:
Author: Lior Rokach
Series:
ISBN: 9789811297472, 9789811297496
Publisher: World Scientific
Year of publication: 2024
Number of pages: 303
Language: English
File format: PDF (converted to PDF, EPUB, or AZW3 on user request)
File size: 10 MB
If you need the file for Cluster Analysis: A Primer Using R converted to PDF, EPUB, AZW3, MOBI, or DJVU, notify support and they will convert it for you.
Note that Cluster Analysis: A Primer Using R is the original-language edition and has not been translated into Persian. The International Library website offers original-language books only and does not offer any books translated into or written in Persian.
Contents
Preface
About the Author
1. Introduction to Data Clustering
  1.1 Overview
  1.2 Data Science and Data Mining
  1.3 The Four-Layers Model
  1.4 Taxonomy of Machine Learning Tasks
    1.4.1 Data representation
  1.5 What is Clustering?
  1.6 Taxonomy of Clustering Methods
  1.7 Data Clustering Using R
2. Similarity Measures
  2.1 Overview
  2.2 Preliminaries
    2.2.1 Data types
    2.2.2 Distance measures
  2.3 Euclidean Distance
  2.4 Minkowski: Distance Measures for Numeric Attributes
  2.5 Distance Measures for Binary Attributes
  2.6 Distance Measures for Categorical Attributes
    2.6.1 Distance metrics for ordinal attributes
  2.7 Distance Metrics for Mixed-Type Attributes
  2.8 Similarity Functions
    2.8.1 Cosine measure
    2.8.2 Pearson correlation measure
    2.8.3 Extended Jaccard measure
    2.8.4 Dice coefficient measure
  2.9 Calculating the Dissimilarity Matrix in R
3. Partitioning Methods for Minimizing Distance Measures
  3.1 Introduction
  3.2 K-Means
    3.2.1 Algorithm overview
    3.2.2 Illustration of k-means algorithm
    3.2.3 Running k-means in R
    3.2.4 The properties of k-means algorithm
  3.3 Determining the Number of Clusters
    3.3.1 The clues package
  3.4 X-Means
    3.4.1 Algorithm overview
    3.4.2 Running X-means in R
  3.5 K-Means++
    3.5.1 Algorithm overview
    3.5.2 Running k-means++ in R
  3.6 K-Medoids: Partitioning Around Medoids
    3.6.1 Algorithm overview
    3.6.2 Running k-medoids in R
    3.6.3 PROCLUS and ORCLUS algorithms
    3.6.4 CLARA and CLARANS algorithms
  3.7 Variation of k-Means
    3.7.1 K-Medians
    3.7.2 K-Modes
    3.7.3 K-Prototypes
  3.8 The BFR Algorithm
    3.8.1 Sufficient statistics
    3.8.2 Point sets in BFR
    3.8.3 Algorithm phases
      3.8.3.1 Initialization
      3.8.3.2 Summarization
      3.8.3.3 Cluster update
    3.8.4 Applications and use cases
    3.8.5 Advantages and limitations of BFR algorithm
  3.9 Canopy Clustering
    3.9.1 Detailed description of the algorithm
    3.9.2 Conclusions, advantages, limitations, and future research
  3.10 The k-SVD Algorithm
    3.10.1 Detailed description of the algorithm
    3.10.2 Conclusions, advantages, limitations, and future research
  3.11 Kernel K-Means
    3.11.1 Detailed description of the algorithm
    3.11.2 Conclusions, advantages, limitations, and future research
  3.12 Mini-Batch K-Means
    3.12.1 Detailed description of the algorithm
    3.12.2 Running the algorithm in R
    3.12.3 Conclusions, advantages, limitations, and future research
  3.13 Affinity Propagation
    3.13.1 Detailed description of the algorithm
    3.13.2 Running the algorithm in R
    3.13.3 Conclusions, advantages, limitations, and future research
  3.14 Fuzzy Clustering
    3.14.1 Fuzzy C-means (FCM) algorithm
      3.14.1.1 Algorithm explanation
      3.14.1.2 Parameter selection
    3.14.2 Implementing FCM in R
    3.14.3 Applications of fuzzy clustering
      3.14.3.1 Image segmentation
      3.14.3.2 Pattern recognition
      3.14.3.3 Market segmentation
      3.14.3.4 Medical diagnosis
    3.14.4 Conclusion
  3.15 FLAME Clustering Algorithm
    3.15.1 Detailed description of the algorithm
    3.15.2 Conclusions, advantages, limitations, and future research
  3.16 The Gath-Geva Clustering Algorithm
    3.16.1 Detailed description of the algorithm
      3.16.1.1 Initialization
      3.16.1.2 Updating cluster centroids
      3.16.1.3 Updating covariance matrices
      3.16.1.4 Updating membership degrees
      3.16.1.5 Objective function
      3.16.1.6 Convergence
    3.16.2 Running the algorithm in R
    3.16.3 Conclusions, advantages, limitations, and future research
  3.17 Gustafson-Kessel Clustering
    3.17.1 Detailed description of the algorithm
    3.17.2 Running the algorithm in R
    3.17.3 Conclusions, advantages, limitations, and future research
4. Hierarchical Methods
  4.1 Introduction
  4.2 Agglomerative Methods
    4.2.1 Graph measures
      4.2.1.1 Single-link
      4.2.1.2 Complete link
      4.2.1.3 Group average
      4.2.1.4 McQuitty
    4.2.2 Geometric measures
      4.2.2.1 Centroid
      4.2.2.2 Median
      4.2.2.3 Ward method
    4.2.3 Running the basic hierarchical clustering algorithm in R
    4.2.4 ROCK algorithm
    4.2.5 AGNES
  4.3 Divisive Methods
    4.3.1 DIANA
    4.3.2 COBWEB
  4.4 Hybrid Hierarchical Clustering
  4.5 Supporting Packages
    4.5.1 Detection of clusters in hierarchical clustering dendrograms
    4.5.2 Assessing the uncertainty in hierarchical cluster analysis
  4.6 The BIRCH Algorithm
    4.6.1 Detailed description of the algorithm
    4.6.2 CF tree structure
    4.6.3 Insertion into CF tree
    4.6.4 Node splitting
    4.6.5 Conclusions, advantages, limitations, and future research
  4.7 SLINK Algorithm: A Dive into Hierarchical Clustering
    4.7.1 Detailed description of the algorithm
    4.7.2 Conclusions
  4.8 The CLINK (Complete-Linkage Clustering)
    4.8.1 Detailed description of the algorithm
    4.8.2 Running the algorithm in R
  4.9 Unweighted Pair Group Method with Arithmetic Mean (UPGMA)
    4.9.1 Running the algorithm in R
    4.9.2 Conclusions, advantages, limitations, and future research
  4.10 WPGMA Algorithm
  4.11 Comparing the Clustering of SLINK, CLINK, UPGMA and WPGMA
  4.12 Sequential Agglomerative Hierarchical Non-overlapping (SAHN) Algorithm
  4.13 The CURE Clustering Algorithm
    4.13.1 Random sampling
    4.13.2 Partitioning and partial clustering
    4.13.3 Representative points selection
    4.13.4 Shrinking towards the mean
    4.13.5 Hierarchical clustering
    4.13.6 Conclusions, advantages, limitations, and future research
  4.14 Nearest-neighbor Chain Algorithm
    4.14.1 Detailed description of the algorithm
    4.14.2 Conclusions, advantages, limitations, and future research
5. Clustering Visualization
  5.1 Introduction
  5.2 Using Built-in Plot Function
  5.3 The Clusplot Function
  5.4 FlexClust Package
  5.5 Dendrogram
    5.5.1 Comparing a pair of dendrograms
  5.6 Clustergram
  5.7 t-Distributed Stochastic Neighbor Embedding (t-SNE)
    5.7.1 Advantages and limitations
6. Cluster Validity: Evaluation of Clustering Algorithms
  6.1 Introduction
  6.2 Internal Criteria
    6.2.1 Sum of squared error (SSE)
    6.2.2 The Ball-Hall index
    6.2.3 Other minimum variance criteria
    6.2.4 Scatter criteria
    6.2.5 C index
    6.2.6 The McClain-Rao index
    6.2.7 The Banfeld-Raftery index
    6.2.8 Condorcet's criterion
    6.2.9 The C-criterion
    6.2.10 The Calinski-Harabasz index
    6.2.11 The Silhouette index
    6.2.12 Log SS ratio index
    6.2.13 The Dunn index
    6.2.14 The generalized Dunn index (GDI)
    6.2.15 The Davies-Bouldin index
    6.2.16 The Baker-Hubert Gamma index
    6.2.17 The G-plus index
    6.2.18 The Det-ratio index
    6.2.19 The log Det ratio index
    6.2.20 The k²|W| index
    6.2.21 Category utility metric
    6.2.22 Edge cut metrics
  6.3 External Quality Criteria
    6.3.1 Mutual information based measure
    6.3.2 Precision-recall measure
    6.3.3 Rand index
    6.3.4 Folkes and Mallows index
  6.4 Calculating Validity Indices in R
  6.5 Determining the Number of Clusters
    6.5.1 Methods based on intra cluster scatter
    6.5.2 Methods based on both the inter and intra cluster scatter
    6.5.3 Criteria based on probabilistic methods
  6.6 Hypothesis Testing in Cluster Validity
7. Mixture Densities-Based Clustering
  7.1 Introduction
  7.2 DBSCAN Algorithm
    7.2.1 Running DBSCAN algorithm in R
    7.2.2 Variations of DBSCAN and the OPTICS algorithm
  7.3 Mean-shift
  7.4 EM Clustering
    7.4.1 E-step
    7.4.2 M-step
    7.4.3 Running EM algorithm in R
  7.5 Density Peak Clustering
  7.6 Latent Class Analysis
  7.7 Further Reading
8. Graph Clustering
  8.1 Introduction
  8.2 Graph Terminology
  8.3 Affinity Propagation
    8.3.1 Running affinity propagation algorithm in R
    8.3.2 Conclusions, advantages, limitations, and future research
  8.4 K-Cores
    8.4.1 Running K-cores clustering in R
  8.5 The Igraph Package
    8.5.1 Creating graphs
    8.5.2 Centrality measures
      8.5.2.1 Degree
      8.5.2.2 Betweenness
      8.5.2.3 Closeness
    8.5.3 Community structure detection based on edge betweenness
  8.6 CHAMELEON
  8.7 The CACTUS Algorithm for Clustering Categorical Data
  8.8 Markov Clustering (MCL)
      8.8.0.1 Expansion
      8.8.0.2 Inflation
      8.8.0.3 Pruning
    8.8.1 Running the MCL algorithm in R
    8.8.2 Conclusions, advantages, limitations, and future research
9. Grid-Based Clustering Methods
  9.1 CLIQUE: Clustering in QUEst
    9.1.1 How CLIQUE works
    9.1.2 Using CLIQUE in R
    9.1.3 Properties of the CLIQUE algorithm
  9.2 STING: Statistical Information Grid Clustering
    9.2.1 Overview of STING algorithm
    9.2.2 Properties of the STING algorithm
  9.3 WaveCluster
      9.3.0.1 WaveCluster algorithm steps
      9.3.0.2 Advantages and limitations
  9.4 GRIDCLUS
    9.4.1 Algorithm steps
    9.4.2 Advantages and limitations
  9.5 Applications of Grid-Based Clustering
  9.6 Conclusion and Future Research
10. Deep Learning for Clustering
  10.1 Introduction
  10.2 Foundations of Artificial Neural Networks
    10.2.1 Network architecture
  10.3 Deep Clustering: An Overview
    10.3.1 Why deep clustering?
  10.4 Types of Deep Clustering Methods
    10.4.1 Self-organizing maps (SOMs)
    10.4.2 Gaussian mixture models (GMMs) with neural networks
    10.4.3 Autoencoder-based clustering
      10.4.3.1 Deep embedded clustering (DEC)
    10.4.4 Clustering deep neural networks (CDNN)
      10.4.4.1 Algorithm implementation
    10.4.5 Generative adversarial networks (GANs)
      10.4.5.1 ClusterGAN framework
      10.4.5.2 Algorithm implementation
    10.4.6 Variational autoencoders (VAEs)
      10.4.6.1 Variational deep embedding (VaDE)
      10.4.6.2 Algorithm implementation
  10.5 Applications of Deep Clustering
    10.5.1 Image clustering
    10.5.2 Text clustering
    10.5.3 Speech and audio processing
  10.6 Conclusions, Challenges and Future Directions
    10.6.1 Challenges
      10.6.1.1 Scalability
      10.6.1.2 Interpretability
      10.6.1.3 Theoretical understanding
    10.6.2 Future research directions
      10.6.2.1 Theoretical exploration
      10.6.2.2 New architectures
      10.6.2.3 Domain adaptation
      10.6.2.4 Semi-supervised and unsupervised learning
11. Spectral Clustering
  11.1 Background and Motivation
  11.2 Graph Theory Basics
    11.2.1 Graphs and adjacency matrices
    11.2.2 Degree matrix
    11.2.3 Laplacian matrix
    11.2.4 Properties of the Laplacian matrix
    11.2.5 Spectral embedding
  11.3 Spectral Clustering Algorithm
    11.3.1 Eigenvector decomposition
    11.3.2 Clustering in reduced space
    11.3.3 Assigning original points
  11.4 Analysis and Interpretation
    11.4.1 Ideal case
    11.4.2 Practical considerations
  11.5 Multiscale Spectral Clustering
    11.5.1 Motivation for multiscale analysis
    11.5.2 Constructing multiscale similarity matrices
    11.5.3 Combining multiscale information
    11.5.4 Multiscale Laplacian matrix
    11.5.5 Eigenvector decomposition and clustering
    11.5.6 Advantages of multiscale spectral clustering
  11.6 Sparse Spectral Clustering
    11.6.1 Principles of sparse spectral clustering
    11.6.2 Techniques for sparsifying the similarity matrix
      11.6.2.1 k-Nearest neighbors
      11.6.2.2 ϵ-Neighborhood
      11.6.2.3 Combination methods
    11.6.3 Computing the sparse Laplacian
    11.6.4 Eigenvector decomposition and clustering
    11.6.5 Advantages and challenges of sparse spectral clustering
      11.6.5.1 Advantages
      11.6.5.2 Challenges
  11.7 The Normalized Cuts Algorithm
  11.8 The Ratio Cuts Algorithm
Bibliography
Index