Download the book Data Science Bookcamp: Five real-world Python projects

Book details

Data Science Bookcamp: Five real-world Python projects

Category: Cybernetics: Artificial Intelligence
Edition: 1
Authors: Leonard Apeltsin
ISBN: 1617296252, 9781617296253
Publisher: Manning Publications
Year of publication: 2021
Number of pages: 706
Language: English
File format: PDF (converted to PDF, EPUB, or AZW3 on request)
File size: 42 MB

Price (Toman): 37,000

Keywords for Data Science Bookcamp: Five real-world Python projects: Machine Learning, Probabilistic Models, Natural Language Processing, Decision Trees, Data Science, Supervised Learning, Python, Clustering, Data Visualization, Statistics, Logistic Regression, scikit-learn, Web Scraping, NumPy, matplotlib, pandas, Graph Theory, NetworkX, Graph Algorithms, Geospatial Data, Probability Theory, Hypothesis Testing, Network Analysis, Statistical Inference, Text Processing, Markov Models, Cartopy, Elementary, Monte Carlo Simulations

Number of ratings: 10

If you need the file for Data Science Bookcamp: Five real-world Python projects converted to PDF, EPUB, AZW3, MOBI, or DJVU, let support know and they will convert it for you.

Note that Data Science Bookcamp: Five real-world Python projects is the original-language edition and has not been translated into Persian. The International Library website offers original-language books only and does not offer any books translated into or written in Persian.

Book description

Learn data science with Python by building five real-world projects! Experiment with card game predictions, tracking disease outbreaks, and more, as you build a flexible and intuitive understanding of data science.

In Data Science Bookcamp you will learn:
• Techniques for computing and plotting probabilities
• Statistical analysis using SciPy
• How to organize datasets with clustering algorithms
• How to visualize complex multi-variable datasets
• How to train a decision tree machine learning algorithm

In Data Science Bookcamp you'll test and build your knowledge of Python with the kind of open-ended problems that professional data scientists work on every day. Downloadable data sets and thoroughly explained solutions help you lock in what you've learned, building your confidence and making you ready for an exciting new data science career.

About the technology
A data science project has a lot of moving parts, and it takes practice and skill to get all the code, algorithms, datasets, formats, and visualizations working together harmoniously. This unique book guides you through five realistic projects, including tracking disease outbreaks from news headlines, analyzing social networks, and finding relevant patterns in ad click data.

About the book
Data Science Bookcamp doesn't stop with surface-level theory and toy examples. As you work through each project, you'll learn how to troubleshoot common problems like missing data, messy data, and algorithms that don't quite fit the model you're building. You'll appreciate the detailed setup instructions and the fully explained solutions that highlight common failure points. In the end, you'll be confident in your skills because you can see the results.

What's inside
• Web scraping
• Organize datasets with clustering algorithms
• Visualize complex multi-variable datasets
• Train a decision tree machine learning algorithm

About the reader
For readers who know the basics of Python. No prior data science or machine learning skills required.

About the author
Leonard Apeltsin is the Head of Data Science at Anomaly, where his team applies advanced analytics to uncover healthcare fraud, waste, and abuse.
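
The description lists "techniques for computing probabilities" among the topics, and the table of contents names sample space analysis as the starting point. As a rough illustration of that idea, and not code taken from the book, the sketch below counts favorable outcomes in a sample space of equally likely outcomes. The helper name `event_probability` and the two-dice example are assumptions chosen for this listing.

```python
from fractions import Fraction
from itertools import product


def event_probability(sample_space, condition):
    """Return the share of equally likely outcomes that satisfy `condition`.

    `event_probability` is a hypothetical helper for illustration only.
    """
    outcomes = list(sample_space)
    favorable = [outcome for outcome in outcomes if condition(outcome)]
    return Fraction(len(favorable), len(outcomes))


# Sample space for two fair six-sided dice: all 36 ordered pairs.
two_dice = list(product(range(1, 7), repeat=2))

# Probability that the two dice sum to 7 (6 favorable pairs out of 36).
p_seven = event_probability(two_dice, lambda roll: sum(roll) == 7)
print(p_seven)  # 1/6
```

Using `Fraction` keeps the result exact, which makes it easy to compare against a hand count of the favorable outcomes.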

Table of contents

Data Science Bookcamp
brief contents
contents
preface
acknowledgments
about this book
	Who should read this book
	How this book is organized
	About the code
about the author
about the cover illustration
Case study 1—Finding the winning strategy in a card game
	Section 1—Computing probabilities using Python
		1.1 Sample space analysis: An equation-free approach for measuring uncertainty in outcomes
			1.1.1 Analyzing a biased coin
		1.2 Computing nontrivial probabilities
			1.2.1 Problem 1: Analyzing a family with four children
			1.2.2 Problem 2: Analyzing multiple die rolls
			1.2.3 Problem 3: Computing die-roll probabilities using weighted sample spaces
		1.3 Computing probabilities over interval ranges
			1.3.1 Evaluating extremes using interval analysis
		Summary
	Section 2—Plotting probabilities using Matplotlib
		2.1 Basic Matplotlib plots
		2.2 Plotting coin-flip probabilities
			2.2.1 Comparing multiple coin-flip probability distributions
		Summary
	Section 3—Running random simulations in NumPy
		3.1 Simulating random coin flips and die rolls using NumPy
			3.1.1 Analyzing biased coin flips
		3.2 Computing confidence intervals using histograms and NumPy arrays
			3.2.1 Binning similar points in histogram plots
			3.2.2 Deriving probabilities from histograms
			3.2.3 Shrinking the range of a high confidence interval
			3.2.4 Computing histograms in NumPy
		3.3 Using confidence intervals to analyze a biased deck of cards
		3.4 Using permutations to shuffle cards
		Summary
	Section 4—Case study 1 solution
		4.1 Predicting red cards in a shuffled deck
			4.1.1 Estimating the probability of strategy success
		4.2 Optimizing strategies using the sample space for a 10-card deck
		Summary
Case study 2—Assessing online ad clicks for significance
	Section 5—Basic probability and statistical analysis using SciPy
		5.1 Exploring the relationships between data and probability using SciPy
		5.2 Mean as a measure of centrality
			5.2.1 Finding the mean of a probability distribution
		5.3 Variance as a measure of dispersion
			5.3.1 Finding the variance of a probability distribution
		Summary
	Section 6—Making predictions using the central limit theorem and SciPy
		6.1 Manipulating the normal distribution using SciPy
			6.1.1 Comparing two sampled normal curves
		6.2 Determining the mean and variance of a population through random sampling
		6.3 Making predictions using the mean and variance
			6.3.1 Computing the area beneath a normal curve
			6.3.2 Interpreting the computed probability
		Summary
	Section 7—Statistical hypothesis testing
		7.1 Assessing the divergence between sample mean and population mean
		7.2 Data dredging: Coming to false conclusions through oversampling
		7.3 Bootstrapping with replacement: Testing a hypothesis when the population variance is unknown
		7.4 Permutation testing: Comparing means of samples when the population parameters are unknown
		Summary
	Section 8—Analyzing tables using Pandas
		8.1 Storing tables using basic Python
		8.2 Exploring tables using Pandas
		8.3 Retrieving table columns
		8.4 Retrieving table rows
		8.5 Modifying table rows and columns
		8.6 Saving and loading table data
		8.7 Visualizing tables using Seaborn
		Summary
	Section 9—Case study 2 solution
		9.1 Processing the ad-click table in Pandas
		9.2 Computing p-values from differences in means
		9.3 Determining statistical significance
		9.4 41 shades of blue: A real-life cautionary tale
		Summary
Case study 3—Tracking disease outbreaks using news headlines
	Section 10—Clustering data into groups
		10.1 Using centrality to discover clusters
		10.2 K-means: A clustering algorithm for grouping data into K central groups
			10.2.1 K-means clustering using scikit-learn
			10.2.2 Selecting the optimal K using the elbow method
		10.3 Using density to discover clusters
		10.4 DBSCAN: A clustering algorithm for grouping data based on spatial density
			10.4.1 Comparing DBSCAN and K-means
			10.4.2 Clustering based on non-Euclidean distance
		10.5 Analyzing clusters using Pandas
		Summary
	Section 11—Geographic location visualization and analysis
		11.1 The great-circle distance: A metric for computing the distance between two global points
		11.2 Plotting maps using Cartopy
			11.2.1 Manually installing GEOS and Cartopy
			11.2.2 Utilizing the Conda package manager
			11.2.3 Visualizing maps
		11.3 Location tracking using GeoNamesCache
			11.3.1 Accessing country information
			11.3.2 Accessing city information
			11.3.3 Limitations of the GeoNamesCache library
		11.4 Matching location names in text
		Summary
	Section 12—Case study 3 solution
		12.1 Extracting locations from headline data
		12.2 Visualizing and clustering the extracted location data
		12.3 Extracting insights from location clusters
		Summary
Case study 4—Using online job postings to improve your data science resume
	Section 13—Measuring text similarities
		13.1 Simple text comparison
			13.1.1 Exploring the Jaccard similarity
			13.1.2 Replacing words with numeric values
		13.2 Vectorizing texts using word counts
			13.2.1 Using normalization to improve TF vector similarity
			13.2.2 Using unit vector dot products to convert between relevance metrics
		13.3 Matrix multiplication for efficient similarity calculation
			13.3.1 Basic matrix operations
			13.3.2 Computing all-by-all matrix similarities
		13.4 Computational limits of matrix multiplication
		Summary
	Section 14—Dimension reduction of matrix data
		14.1 Clustering 2D data in one dimension
			14.1.1 Reducing dimensions using rotation
		14.2 Dimension reduction using PCA and scikit-learn
		14.3 Clustering 4D data in two dimensions
			14.3.1 Limitations of PCA
		14.4 Computing principal components without rotation
			14.4.1 Extracting eigenvectors using power iteration
		14.5 Efficient dimension reduction using SVD and scikit-learn
		Summary
	Section 15—NLP analysis of large text datasets
		15.1 Loading online forum discussions using scikit-learn
		15.2 Vectorizing documents using scikit-learn
		15.3 Ranking words by both post frequency and count
			15.3.1 Computing TFIDF vectors with scikit-learn
		15.4 Computing similarities across large document datasets
		15.5 Clustering texts by topic
			15.5.1 Exploring a single text cluster
		15.6 Visualizing text clusters
			15.6.1 Using subplots to display multiple word clouds
		Summary
	Section 16—Extracting text from web pages
		16.1 The structure of HTML documents
		16.2 Parsing HTML using Beautiful Soup
		16.3 Downloading and parsing online data
		Summary
	Section 17—Case study 4 solution
		17.1 Extracting skill requirements from job posting data
			17.1.1 Exploring the HTML for skill descriptions
		17.2 Filtering jobs by relevance
		17.3 Clustering skills in relevant job postings
			17.3.1 Grouping the job skills into 15 clusters
			17.3.2 Investigating the technical skill clusters
			17.3.3 Investigating the soft-skill clusters
			17.3.4 Exploring clusters at alternative values of K
			17.3.5 Analyzing the 700 most relevant postings
		17.4 Conclusion
		Summary
Case study 5—Predicting future friendships from social network data
	Section 18—An introduction to graph theory and network analysis
		18.1 Using basic graph theory to rank websites by popularity
			18.1.1 Analyzing web networks using NetworkX
		18.2 Utilizing undirected graphs to optimize the travel time between towns
			18.2.1 Modeling a complex network of towns and counties
			18.2.2 Computing the fastest travel time between nodes
		Summary
	Section 19—Dynamic graph theory techniques for node ranking and social network analysis
		19.1 Uncovering central nodes based on expected traffic in a network
			19.1.1 Measuring centrality using traffic simulations
		19.2 Computing travel probabilities using matrix multiplication
			19.2.1 Deriving PageRank centrality from probability theory
			19.2.2 Computing PageRank centrality using NetworkX
		19.3 Community detection using Markov clustering
		19.4 Uncovering friend groups in social networks
		Summary
	Section 20—Network-driven supervised machine learning
		20.1 The basics of supervised machine learning
		20.2 Measuring predicted label accuracy
			20.2.1 Scikit-learn’s prediction measurement functions
		20.3 Optimizing KNN performance
		20.4 Running a grid search using scikit-learn
		20.5 Limitations of the KNN algorithm
		Summary
	Section 21—Training linear classifiers with logistic regression
		21.1 Linearly separating customers by size
		21.2 Training a linear classifier
			21.2.1 Improving perceptron performance through standardization
		21.3 Improving linear classification with logistic regression
			21.3.1 Running logistic regression on more than two features
		21.4 Training linear classifiers using scikit-learn
			21.4.1 Training multiclass linear models
		21.5 Measuring feature importance with coefficients
		21.6 Linear classifier limitations
		Summary
	Section 22—Training nonlinear classifiers with decision tree techniques
		22.1 Automated learning of logical rules
			22.1.1 Training a nested if/else model using two features
			22.1.2 Deciding which feature to split on
			22.1.3 Training if/else models with more than two features
		22.2 Training decision tree classifiers using scikit-learn
			22.2.1 Studying cancerous cells using feature importance
		22.3 Decision tree classifier limitations
		22.4 Improving performance using random forest classification
		22.5 Training random forest classifiers using scikit-learn
		Summary
	Section 23—Case study 5 solution
		23.1 Exploring the data
			23.1.1 Examining the profiles
			23.1.2 Exploring the experimental observations
			23.1.3 Exploring the Friendships linkage table
		23.2 Training a predictive model using network features
		23.3 Adding profile features to the model
		23.4 Optimizing performance across a steady set of features
		23.5 Interpreting the trained model
			23.5.1 Why are generalizable models so important?
		Summary
index
	Symbols
	A
	B
	C
	D
	E
	F
	G
	H
	I
	J
	K
	L
	M
	N
	O
	P
	R
	S
	T
	U
	V
	W
	X
	Y