Edition: 2
Authors: Taeho Jo
Series: Studies in Big Data, 45
ISBN: 3031759753, 9783031759789
Publisher: Springer
Year of publication: 2025
Number of pages: 0
Language: English
File format: EPUB (converted to PDF, EPUB, or AZW3 on user request)
File size: 52 MB
If you need the file for Text Mining: Concepts, Implementation, and Big Data Challenge converted to PDF, EPUB, AZW3, MOBI, or DJVU, notify support and they will convert it.
Note that Text Mining: Concepts, Implementation, and Big Data Challenge is the original-language edition, not a Persian translation. The International Library website offers original-language books only and does not provide books translated into or written in Persian.
Preface
Contents
Part I Foundation
1 Introduction
  1.1 Definition of Text Mining
  1.2 Texts
    1.2.1 Text Components
    1.2.2 Text Formats
  1.3 Data Mining Tasks
    1.3.1 Classification
    1.3.2 Clustering
    1.3.3 Association
  1.4 Data Mining Types
    1.4.1 Relational Data Mining
    1.4.2 Web Mining
    1.4.3 Big Data Mining
  1.5 Summary
  References
2 Text Indexing
  2.1 Overview of Text Indexing
  2.2 Steps of Text Indexing
    2.2.1 Tokenization
    2.2.2 Stemming
    2.2.3 Stop-Word Removal
    2.2.4 Term Weighting
  2.3 Text Indexing: Implementation
    2.3.1 Class Definition
    2.3.2 Stemming Rule
    2.3.3 Method Implementations
  2.4 Additional Steps
    2.4.1 Index Filtering
    2.4.2 Index Expansion
    2.4.3 Index Optimization
  2.5 Summary
  References
3 Text Encoding
  3.1 Overview of Text Encoding
  3.2 Feature Selection
    3.2.1 Wrapper Approach
    3.2.2 Principal Component Analysis
    3.2.3 Independent Component Analysis
    3.2.4 Singular Value Decomposition
  3.3 Feature Value Assignment
    3.3.1 Assignment Schemes
    3.3.2 Similarity Computation
  3.4 Issues of Text Encoding
    3.4.1 Huge Dimensionality
    3.4.2 Sparse Distribution
    3.4.3 Poor Transparency
  3.5 Summary
  References
4 Text Association
  4.1 Overview of Text Association
  4.2 Data Association
    4.2.1 Functional View
    4.2.2 Support and Confidence
    4.2.3 Apriori Algorithm
  4.3 Word Association
    4.3.1 Word Text Matrix
    4.3.2 Functional View
    4.3.3 Simple Example
  4.4 Text Association
    4.4.1 Functional View
    4.4.2 Simple Example
  4.5 Overall Summary
  References
Part II Text Categorization
5 Text Categorization: Conceptual View
  5.1 Definition of Text Categorization
  5.2 Data Classification
    5.2.1 Binary Classification
    5.2.2 Multiple Classification
    5.2.3 Classification Decomposition
    5.2.4 Regression
  5.3 Classification Types
    5.3.1 Hard vs. Soft Classification
    5.3.2 Flat vs. Hierarchical Classification
    5.3.3 Single vs. Multiple Viewed Classification
    5.3.4 Independent vs. Dependent Classification
  5.4 Variants of Text Categorization
    5.4.1 Spam Mail Filtering
    5.4.2 Sentimental Analysis
    5.4.3 Information Filtering
    5.4.4 Topic Routing
  5.5 Summary and Further Discussions
  References
6 Text Categorization: Approaches
  6.1 Machine Learning
  6.2 Lazy Learning
    6.2.1 K-Nearest Neighbor
    6.2.2 Radius Nearest Neighbor
    6.2.3 Distance-Based Nearest Neighbor
    6.2.4 Attribute Discriminated Nearest Neighbor
  6.3 Probabilistic Learning
    6.3.1 Bayes Rule
    6.3.2 Bayes Classifier
    6.3.3 Naive Bayes
    6.3.4 Bayesian Learning
  6.4 Kernel-Based Classifier
    6.4.1 Perceptron
    6.4.2 Kernel Functions
    6.4.3 Support Vector Machine
    6.4.4 Optimization Constraints
  6.5 Summary and Further Discussions
  References
7 Text Categorization: Implementation
  7.1 System Architecture
  7.2 Class Definitions
    7.2.1 Classes: Word, Text, and PlainText
    7.2.2 Interface and Class: Classifier and KNearestNeighbor
    7.2.3 Class: TextClassificationAPI
  7.3 SubsectionTitle
    7.3.1 Class: Word
    7.3.2 Class: PlainText
    7.3.3 Class: KNearestNeighbor
    7.3.4 Class: TextClassificationAPI
  7.4 Graphic User Interface and Demonstration
    7.4.1 Class: TextClassificationGUI
    7.4.2 Preliminary Tasks and Encoding
    7.4.3 Classification Process
    7.4.4 System Upgrading
  7.5 Summary and Further Discussions
8 Text Categorization: Evaluation
  8.1 Evaluation Overview
  8.2 Text Collections
    8.2.1 NewsPage.com
    8.2.2 20NewsGroups
    8.2.3 Reuter21578
    8.2.4 OSHUMED
  8.3 F1 Measure
    8.3.1 Contingency Table
    8.3.2 Micro-Averaged F1
    8.3.3 Macro-Averaged F1
    8.3.4 Example
  8.4 Statistical t-Test
    8.4.1 Student t-Distribution
    8.4.2 Unpaired Difference Inference
    8.4.3 Paired Difference Inference
    8.4.4 Example
  8.5 Summary and Further Discussions
  References
Part III Text Clustering
9 Text Clustering: Conceptual View
  9.1 Definition of Text Clustering
  9.2 Data Clustering
    9.2.1 SubSubsectionTitle
    9.2.2 Association vs. Clustering
    9.2.3 Classification vs. Clustering
    9.2.4 Constraint Clustering
  9.3 Clustering Types
    9.3.1 Static vs. Dynamic Clustering
    9.3.2 Crisp vs. Fuzzy Clustering
    9.3.3 SubsectionTitle
    9.3.4 Single vs. Multiple Viewed Clustering
  9.4 Derived Tasks from Text Clustering
    9.4.1 Cluster Naming
    9.4.2 Subtext Clustering
    9.4.3 Automatic Sampling for Text Categorization
    9.4.4 Redundant Project Detection
  9.5 Summary and Further Discussions
  References
10 Text Clustering: Approaches
  10.1 Unsupervised Learning
  10.2 Simple Clustering Algorithms
    10.2.1 AHC Algorithm
    10.2.2 Divisive Clustering Algorithm
    10.2.3 Single-Pass Algorithm
    10.2.4 Growing Algorithm
  10.3 K-Means Algorithm
    10.3.1 Crisp K-Means Algorithm
    10.3.2 Fuzzy K-Means Algorithm
    10.3.3 Gaussian Mixture
    10.3.4 K Medoid Algorithm
  10.4 Competitive Learning
    10.4.1 Kohonen Networks
    10.4.2 Learning Vector Quantization
    10.4.3 Two-Dimensional Self-Organizing Map
    10.4.4 Neural Gas
  10.5 Summary and Further Discussions
  References
11 Text Clustering: Implementation
  11.1 System Architecture
  11.2 Class Definitions
    11.2.1 Classes in Text Categorization System
    11.2.2 Class: Cluster
    11.2.3 Interface: ClusterAnalyzer
    11.2.4 Class: AHCAlgorithm
  11.3 Method Implementations
    11.3.1 Methods in Previous Classes
    11.3.2 Class: Cluster
    11.3.3 Class: AHC Algorithm
  11.4 Class: ClusterAnalysisAPI
    11.4.1 Class: ClusterAnalysisAPI
    11.4.2 Class: ClusterAnalyzerGUI
    11.4.3 Demonstration
    11.4.4 System Upgrading
  11.5 Summary and Further Discussions
  Reference
12 Text Clustering: Evaluation
  12.1 Introduction
  12.2 Cluster Validations
    12.2.1 Intra-cluster and Inter-cluster Similarities
    12.2.2 Internal Validation
    12.2.3 Relative Validation
    12.2.4 External Validation
  12.3 Clustering Index
    12.3.1 Computation Process
    12.3.2 Evaluation of Crisp Clustering
    12.3.3 Evaluation of Fuzzy Clustering
    12.3.4 Evaluation of Hierarchical Clustering
  12.4 Parameter Tuning
    12.4.1 Clustering Index for Unlabeled Documents
    12.4.2 Simple Clustering Algorithm with Parameter Tuning
    12.4.3 K Means Algorithm with Parameter Tuning
    12.4.4 Evolutionary Clustering Algorithm
  12.5 Summary and Further Discussions
  References
Part IV Advanced Topics
13 Text Summarization
  13.1 Definition of Text Summarization
  13.2 Text Summarization Types
    13.2.1 Manual Versus Automatic Text Summarization
    13.2.2 Single Versus Multiple Text Summarization
    13.2.3 Flat Versus Hierarchical Text Summarization
    13.2.4 Abstraction Versus Query-Based Summarization
  13.3 Approaches to Text Summarization
    13.3.1 Heuristic Approaches
    13.3.2 Mapping into Classification Task
    13.3.3 Sampling Schemes
    13.3.4 Application of Machine Learning Algorithms
  13.4 Combination with Other Text Mining Tasks
    13.4.1 Summary-Based Classification
    13.4.2 Summary-Based Clustering
    13.4.3 Topic-Based Summarization
    13.4.4 Text Expansion
  13.5 Summary and Further Discussions
14 Text Segmentation
  14.1 Definition of Text Segmentation
  14.2 Text Segmentation Type
    14.2.1 Spoken Versus Written Text Segmentation
    14.2.2 Ordered Versus Unordered Text Segmentation
    14.2.3 Exclusive Versus Overlapping Segmentation
    14.2.4 Flat Versus Hierarchical Text Segmentation
  14.3 Machine Learning-Based Approaches
    14.3.1 Heuristic Approaches
    14.3.2 Mapping into Classification
    14.3.3 Encoding Adjacent Paragraph Pairs
    14.3.4 Application of Machine Learning
  14.4 Derived Tasks
    14.4.1 Temporal Topic Analysis
    14.4.2 Subtext Retrieval
    14.4.3 Subtext Synthesization
    14.4.4 Virtual Text
  14.5 Summary and Further Discussions
15 Taxonomy Generation
  15.1 Definition of Taxonomy Generation
  15.2 Relevant Tasks to Taxonomy Generation
    15.2.1 Keyword Extraction
    15.2.2 Word Categorization
    15.2.3 Word Clustering
    15.2.4 Topic Routing
  15.3 Taxonomy Generation Schemes
    15.3.1 Index-Based Scheme
    15.3.2 Clustering-Based Scheme
    15.3.3 Association-Based Scheme
    15.3.4 Link Analysis-Based Scheme
  15.4 Taxonomy Governance
    15.4.1 Taxonomy Maintenance
    15.4.2 Taxonomy Growth
    15.4.3 Taxonomy Integration
    15.4.4 Ontology
  15.5 Summary and Further Discussions
  References
16 Dynamic Document Organization
  16.1 Definition of Dynamic Document Organization
  16.2 Online Clustering
    16.2.1 Online Clustering in Functional View
    16.2.2 Online K Means Algorithm
    16.2.3 Online Unsupervised KNN Algorithm
    16.2.4 Online Fuzzy Clustering
  16.3 Dynamic Organization
    16.3.1 Execution Process
    16.3.2 Maintenance Mode
    16.3.3 Creation Mode
    16.3.4 Additional Tasks
  16.4 Issues of Dynamic Document Organization
    16.4.1 Text Representation
    16.4.2 Binary Decomposition
    16.4.3 Transition into Creation Mode
    16.4.4 Variants of DDO System
    16.4.5 Summary and Further Discussions
  References
Part V Word Mining
17 Word Encoding
  17.1 Introduction
  17.2 Word Encoding
    17.2.1 Text Indexing
    17.2.2 Text Index Structure
    17.2.3 Word Indexing
    17.2.4 Inverted Index
  17.3 Word Representation
    17.3.1 Text Representation
    17.3.2 Word-Text Matrix
    17.3.3 Texts as Features
    17.3.4 Texts as Features
  17.4 Word Representation
    17.4.1 XML Documents
    17.4.2 Compound Data
    17.4.3 Compound Words
    17.4.4 Compound Encoding
  17.5 Summary and Further Discussions
  References
18 Word Classification
  18.1 Introduction
  18.2 Traditional Instances
    18.2.1 Lexical Word Classification
    18.2.2 POS Tagging
    18.2.3 Named Entity Extraction
    18.2.4 Frequent Word Set Extraction
  18.3 Word Classification
    18.3.1 Sampling
    18.3.2 Semantic Similarity
    18.3.3 KNN Based Classification
    18.3.4 KNN Variants
  18.4 Word and Text Classification
    18.4.1 Compound Classification
    18.4.2 Text Cluster Features
    18.4.3 Word Classification for Text Classification
    18.4.4 Text Classification for Word Classification
  18.5 Summary and Further Discussions
  References
19 Word Clustering
  19.1 Introduction
  19.2 Semantic Word Operations
    19.2.1 Word Collocation
    19.2.2 Word Similarity Matrix
    19.2.3 Word Group Characterization
    19.2.4 Word Association
  19.3 Semantic Word Clustering
    19.3.1 Bottom-Up Word Clustering
    19.3.2 Top-Down Clustering
    19.3.3 Top-Down Clustering
    19.3.4 Partitional Word Clustering
  19.4 Word and Text Clustering
    19.4.1 Three Types of Word and Text Clustering
    19.4.2 Word Clustering for Text Classification
    19.4.3 Word Clustering for Text Clustering
    19.4.4 Constraint Word Clustering
  19.5 Summary and Further Discussions
  References
20 Keyword Extraction
  20.1 Introduction
  20.2 Advanced Text Indexing
    20.2.1 Index Processing
    20.2.2 Index Optimization
    20.2.3 Index Adaptation
    20.2.4 Hierarchical Text Representation
  20.3 Keyword Extraction System
    20.3.1 Keyword Extraction as Word Classification
    20.3.2 Domain-Dependent Classification
    20.3.3 Keyword Extraction Process
    20.3.4 System Design
  20.4 Current Affair Topics
    20.4.1 Text Classification + Keyword Extraction
    20.4.2 Generative AI
    20.4.3 Large Language Modeling
    20.4.4 ChatGPT
  20.5 Summary and Further Discussions
  References
Index