
Download the book Blueprints for Text Analytics Using Python: Machine Learning-Based Solutions for Common Real World (NLP) Applications

Book Details

Title: Blueprints for Text Analytics Using Python: Machine Learning-Based Solutions for Common Real World (NLP) Applications
Category: Cybernetics: Artificial Intelligence
Edition: 1
Authors: Jens Albrecht, Sidharth Ramachandran, Christian Winkler
Series:
ISBN: 149207408X, 9781492074083
Publisher: O'Reilly Media
Publication year: 2021
Pages: 422
Language: English
File format: PDF (convertible to EPUB or AZW3 on request)
File size: 20 MB
Price (Toman): 51,000





If you need Blueprints for Text Analytics Using Python: Machine Learning-Based Solutions for Common Real World (NLP) Applications converted to PDF, EPUB, AZW3, MOBI, or DJVU format, contact support and they will convert the file for you.

Note that Blueprints for Text Analytics Using Python: Machine Learning-Based Solutions for Common Real World (NLP) Applications is the original English edition, not a Persian translation. The International Library website offers books in their original language only and does not carry titles translated into or written in Persian.


Book Description


Turning text into valuable information is essential for businesses looking to gain a competitive advantage. With recent improvements in natural language processing (NLP), users now have many options for solving complex challenges. But it's not always clear which NLP tools or libraries would work for a business's needs, or which techniques you should use and in what order. This practical book provides data scientists and developers with blueprints for best-practice solutions to common tasks in text analytics and natural language processing. Authors Jens Albrecht, Sidharth Ramachandran, and Christian Winkler provide real-world case studies and detailed code examples in Python to help you get started quickly.

• Extract data from APIs and web pages
• Prepare textual data for statistical analysis and machine learning
• Use machine learning for classification, topic modeling, and summarization
• Explain AI models and classification results
• Explore and visualize semantic similarities with word embeddings
• Identify customer sentiment in product reviews
• Create a knowledge graph based on named entities and their relations
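As a taste of the blueprint style described above, the following is a minimal sketch, assuming scikit-learn, of one technique the book covers (see "Blueprint: Ranking with TF-IDF" and "Blueprint: Calculating Similarities" in the table of contents): comparing documents via TF-IDF vectors. The toy headlines are invented here for illustration and are not from the book.

# A minimal sketch (assumed, not from the book) of a TF-IDF similarity
# "blueprint": vectorize a toy corpus and compare documents pairwise.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

headlines = [
    "Machine learning improves text classification accuracy",
    "New Python library released for natural language processing",
    "Businesses use text analytics to understand customer sentiment",
]

# Each document becomes a TF-IDF-weighted term vector.
vectorizer = TfidfVectorizer(stop_words="english")
dtm = vectorizer.fit_transform(headlines)  # document-term matrix

# Pairwise cosine similarity between all document vectors.
print(cosine_similarity(dtm).round(2))

The output is a symmetric 3x3 matrix with 1.0 on the diagonal; off-diagonal entries grow with the vocabulary two headlines share.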



Table of Contents

Cover
Copyright
Table of Contents
Preface
	Approach of the Book
	Prerequisites
	Some Important Libraries to Know
	Books to Read
	Conventions Used in This Book
	Using Code Examples
	O’Reilly Online Learning
	How to Contact Us
	Acknowledgments
Chapter 1. Gaining Early Insights from Textual Data
	What You’ll Learn and What We’ll Build
	Exploratory Data Analysis
	Introducing the Dataset
	Blueprint: Getting an Overview of the Data with Pandas
		Calculating Summary Statistics for Columns
		Checking for Missing Data
		Plotting Value Distributions
		Comparing Value Distributions Across Categories
		Visualizing Developments Over Time
	Blueprint: Building a Simple Text Preprocessing Pipeline
		Performing Tokenization with Regular Expressions
		Treating Stop Words
		Processing a Pipeline with One Line of Code
	Blueprints for Word Frequency Analysis
		Blueprint: Counting Words with a Counter
		Blueprint: Creating a Frequency Diagram
		Blueprint: Creating Word Clouds
		Blueprint: Ranking with TF-IDF
	Blueprint: Finding a Keyword-in-Context
	Blueprint: Analyzing N-Grams
	Blueprint: Comparing Frequencies Across Time Intervals and Categories
		Creating Frequency Timelines
		Creating Frequency Heatmaps
	Closing Remarks
Chapter 2. Extracting Textual Insights with APIs
	What You’ll Learn and What We’ll Build
	Application Programming Interfaces
	Blueprint: Extracting Data from an API Using the Requests Module
		Pagination
		Rate Limiting
	Blueprint: Extracting Twitter Data with Tweepy
		Obtaining Credentials
		Installing and Configuring Tweepy
		Extracting Data from the Search API
		Extracting Data from a User’s Timeline
		Extracting Data from the Streaming API
	Closing Remarks
Chapter 3. Scraping Websites and Extracting Data
	What You’ll Learn and What We’ll Build
	Scraping and Data Extraction
	Introducing the Reuters News Archive
	URL Generation
	Blueprint: Downloading and Interpreting robots.txt
	Blueprint: Finding URLs from sitemap.xml
	Blueprint: Finding URLs from RSS
	Downloading Data
	Blueprint: Downloading HTML Pages with Python
	Blueprint: Downloading HTML Pages with wget
	Extracting Semistructured Data
	Blueprint: Extracting Data with Regular Expressions
	Blueprint: Using an HTML Parser for Extraction
	Blueprint: Spidering
		Introducing the Use Case
		Error Handling and Production-Quality Software
	Density-Based Text Extraction
		Extracting Reuters Content with Readability
		Summary Density-Based Text Extraction
	All-in-One Approach
	Blueprint: Scraping the Reuters Archive with Scrapy
	Possible Problems with Scraping
	Closing Remarks and Recommendation
Chapter 4. Preparing Textual Data for Statistics and Machine Learning
	What You’ll Learn and What We’ll Build
	A Data Preprocessing Pipeline
	Introducing the Dataset: Reddit Self-Posts
		Loading Data Into Pandas
		Blueprint: Standardizing Attribute Names
		Saving and Loading a DataFrame
	Cleaning Text Data
		Blueprint: Identify Noise with Regular Expressions
		Blueprint: Removing Noise with Regular Expressions
		Blueprint: Character Normalization with textacy
		Blueprint: Pattern-Based Data Masking with textacy
	Tokenization
		Blueprint: Tokenization with Regular Expressions
		Tokenization with NLTK
		Recommendations for Tokenization
	Linguistic Processing with spaCy
		Instantiating a Pipeline
		Processing Text
		Blueprint: Customizing Tokenization
		Blueprint: Working with Stop Words
		Blueprint: Extracting Lemmas Based on Part of Speech
		Blueprint: Extracting Noun Phrases
		Blueprint: Extracting Named Entities
	Feature Extraction on a Large Dataset
		Blueprint: Creating One Function to Get It All
		Blueprint: Using spaCy on a Large Dataset
		Persisting the Result
		A Note on Execution Time
	There Is More
		Language Detection
		Spell-Checking
		Token Normalization
	Closing Remarks and Recommendations
Chapter 5. Feature Engineering and Syntactic Similarity
	What You’ll Learn and What We’ll Build
	A Toy Dataset for Experimentation
	Blueprint: Building Your Own Vectorizer
		Enumerating the Vocabulary
		Vectorizing Documents
		The Document-Term Matrix
		The Similarity Matrix
	Bag-of-Words Models
		Blueprint: Using scikit-learn’s CountVectorizer
		Blueprint: Calculating Similarities
	TF-IDF Models
		Optimized Document Vectors with TfidfTransformer
		Introducing the ABC Dataset
		Blueprint: Reducing Feature Dimensions
		Blueprint: Improving Features by Making Them More Specific
		Blueprint: Using Lemmas Instead of Words for Vectorizing Documents
		Blueprint: Limit Word Types
		Blueprint: Remove Most Common Words
		Blueprint: Adding Context via N-Grams
	Syntactic Similarity in the ABC Dataset
		Blueprint: Finding Most Similar Headlines to a Made-up Headline
		Blueprint: Finding the Two Most Similar Documents in a Large Corpus (Much More Difficult)
		Blueprint: Finding Related Words
		Tips for Long-Running Programs like Syntactic Similarity
	Summary and Conclusion
Chapter 6. Text Classification Algorithms
	What You’ll Learn and What We’ll Build
	Introducing the Java Development Tools Bug Dataset
	Blueprint: Building a Text Classification System
		Step 1: Data Preparation
		Step 2: Train-Test Split
		Step 3: Training the Machine Learning Model
		Step 4: Model Evaluation
	Final Blueprint for Text Classification
	Blueprint: Using Cross-Validation to Estimate Realistic Accuracy Metrics
	Blueprint: Performing Hyperparameter Tuning with Grid Search
	Blueprint Recap and Conclusion
	Closing Remarks
	Further Reading
Chapter 7. How to Explain a Text Classifier
	What You’ll Learn and What We’ll Build
	Blueprint: Determining Classification Confidence Using Prediction Probability
	Blueprint: Measuring Feature Importance of Predictive Models
	Blueprint: Using LIME to Explain the Classification Results
	Blueprint: Using ELI5 to Explain the Classification Results
	Blueprint: Using Anchor to Explain the Classification Results
		Using the Distribution with Masked Words
		Working with Real Words
	Closing Remarks
Chapter 8. Unsupervised Methods: Topic Modeling and Clustering
	What You’ll Learn and What We’ll Build
	Our Dataset: UN General Debates
		Checking Statistics of the Corpus
		Preparations
	Nonnegative Matrix Factorization (NMF)
		Blueprint: Creating a Topic Model Using NMF for Documents
		Blueprint: Creating a Topic Model for Paragraphs Using NMF
	Latent Semantic Analysis/Indexing
		Blueprint: Creating a Topic Model for Paragraphs with SVD
	Latent Dirichlet Allocation
		Blueprint: Creating a Topic Model for Paragraphs with LDA
		Blueprint: Visualizing LDA Results
	Blueprint: Using Word Clouds to Display and Compare Topic Models
	Blueprint: Calculating Topic Distribution of Documents and Time Evolution
	Using Gensim for Topic Modeling
		Blueprint: Preparing Data for Gensim
		Blueprint: Performing Nonnegative Matrix Factorization with Gensim
		Blueprint: Using LDA with Gensim
		Blueprint: Calculating Coherence Scores
		Blueprint: Finding the Optimal Number of Topics
		Blueprint: Creating a Hierarchical Dirichlet Process with Gensim
	Blueprint: Using Clustering to Uncover the Structure of Text Data
	Further Ideas
	Summary and Recommendation
	Conclusion
Chapter 9. Text Summarization
	What You’ll Learn and What We’ll Build
	Text Summarization
		Extractive Methods
		Data Preprocessing
	Blueprint: Summarizing Text Using Topic Representation
		Identifying Important Words with TF-IDF Values
		LSA Algorithm
	Blueprint: Summarizing Text Using an Indicator Representation
	Measuring the Performance of Text Summarization Methods
	Blueprint: Summarizing Text Using Machine Learning
		Step 1: Creating Target Labels
		Step 2: Adding Features to Assist Model Prediction
		Step 3: Build a Machine Learning Model
	Closing Remarks
	Further Reading
Chapter 10. Exploring Semantic Relationships with Word Embeddings
	What You’ll Learn and What We’ll Build
	The Case for Semantic Embeddings
		Word Embeddings
		Analogy Reasoning with Word Embeddings
		Types of Embeddings
	Blueprint: Using Similarity Queries on Pretrained Models
		Loading a Pretrained Model
		Similarity Queries
	Blueprints for Training and Evaluating Your Own Embeddings
		Data Preparation
		Blueprint: Training Models with Gensim
		Blueprint: Evaluating Different Models
	Blueprints for Visualizing Embeddings
		Blueprint: Applying Dimensionality Reduction
		Blueprint: Using the TensorFlow Embedding Projector
		Blueprint: Constructing a Similarity Tree
	Closing Remarks
	Further Reading
Chapter 11. Performing Sentiment Analysis on Text Data
	What You’ll Learn and What We’ll Build
	Sentiment Analysis
	Introducing the Amazon Customer Reviews Dataset
	Blueprint: Performing Sentiment Analysis Using Lexicon-Based Approaches
		Bing Liu Lexicon
		Disadvantages of a Lexicon-Based Approach
	Supervised Learning Approaches
		Preparing Data for a Supervised Learning Approach
	Blueprint: Vectorizing Text Data and Applying a Supervised Machine Learning Algorithm
		Step 1: Data Preparation
		Step 2: Train-Test Split
		Step 3: Text Vectorization
		Step 4: Training the Machine Learning Model
	Pretrained Language Models Using Deep Learning
		Deep Learning and Transfer Learning
	Blueprint: Using the Transfer Learning Technique and a Pretrained Language Model
		Step 1: Loading Models and Tokenization
		Step 2: Model Training
		Step 3: Model Evaluation
	Closing Remarks
	Further Reading
Chapter 12. Building a Knowledge Graph
	What You’ll Learn and What We’ll Build
	Knowledge Graphs
		Information Extraction
	Introducing the Dataset
	Named-Entity Recognition
		Blueprint: Using Rule-Based Named-Entity Recognition
		Blueprint: Normalizing Named Entities
		Merging Entity Tokens
	Coreference Resolution
		Blueprint: Using spaCy’s Token Extensions
		Blueprint: Performing Alias Resolution
		Blueprint: Resolving Name Variations
		Blueprint: Performing Anaphora Resolution with NeuralCoref
		Name Normalization
		Entity Linking
	Blueprint: Creating a Co-Occurrence Graph
		Extracting Co-Occurrences from a Document
		Visualizing the Graph with Gephi
	Relation Extraction
		Blueprint: Extracting Relations Using Phrase Matching
		Blueprint: Extracting Relations Using Dependency Trees
	Creating the Knowledge Graph
		Don’t Blindly Trust the Results
	Closing Remarks
	Further Reading
Chapter 13. Using Text Analytics in Production
	What You’ll Learn and What We’ll Build
	Blueprint: Using Conda to Create Reproducible Python Environments
	Blueprint: Using Containers to Create Reproducible Environments
	Blueprint: Creating a REST API for Your Text Analytics Model
	Blueprint: Deploying and Scaling Your API Using a Cloud Provider
	Blueprint: Automatically Versioning and Deploying Builds
	Closing Remarks
	Further Reading
Index
About the Authors
Colophon



