

Download the book Hadoop in Practice

Book details

Hadoop in Practice

Edition: 2nd
Authors:
Series:
ISBN: 1617292222
Publisher: Manning Publications
Year: 2014
Pages: 513
Language: English
File format: PDF (can be converted to EPUB or AZW3 at the user's request)
File size: 9 MB

Price (Toman): 40,000



Rate this book

Average rating:
Number of ratings: 8


If you would like the book Hadoop in Practice converted to PDF, EPUB, AZW3, MOBI, or DJVU, contact support and they will convert the file for you.

Please note that Hadoop in Practice is the original English edition, not a Persian translation. The International Library website only offers books in their original language and does not carry any books translated into or written in Persian.


Description (original language)



Table of contents

Hadoop in Practice, Second Edition
brief contents
contents
preface
acknowledgments
about this book
	Roadmap
	What’s new in the second edition?
	Getting help
	Code conventions and downloads
	Third-party libraries
	Datasets
	NASDAQ financial stocks
	Apache log data
	Names
	Author Online
about the cover illustration
Part 1: Background and fundamentals
	Chapter 1: Hadoop in a heartbeat
		1.1 What is Hadoop?
			1.1.1 Core Hadoop components
			1.1.2 The Hadoop ecosystem
			1.1.3 Hardware requirements
			1.1.4 Hadoop distributions
			1.1.5 Who’s using Hadoop?
			1.1.6 Hadoop limitations
		1.2 Getting your hands dirty with MapReduce
		1.3 Chapter summary
	Chapter 2: Introduction to YARN
		2.1 YARN overview
			2.1.1 Why YARN?
			2.1.2 YARN concepts and components
			2.1.3 YARN configuration
			Technique 1: Determining the configuration of your cluster
			2.1.4 Interacting with YARN
			Technique 2: Running a command on your YARN cluster
			Technique 3: Accessing container logs
			Technique 4: Aggregating container log files
			2.1.5 YARN challenges
		2.2 YARN and MapReduce
			2.2.1 Dissecting a YARN MapReduce application
			2.2.2 Configuration
			2.2.3 Backward compatibility
			Technique 5: Writing code that works on Hadoop versions 1 and 2
			2.2.4 Running a job
			Technique 6: Using the command line to run a job
			2.2.5 Monitoring running jobs and viewing archived jobs
			2.2.6 Uber jobs
			Technique 7: Running small MapReduce jobs
		2.3 YARN applications
			2.3.1 NoSQL
			2.3.2 Interactive SQL
			2.3.3 Graph processing
			2.3.4 Real-time data processing
			2.3.5 Bulk synchronous parallel
			2.3.6 MPI
			2.3.7 In-memory
			2.3.8 DAG execution
		2.4 Chapter summary
Part 2: Data logistics
	Chapter 3: Data serialization— working with text and beyond
		3.1 Understanding inputs and outputs in MapReduce
			3.1.1 Data input
			3.1.2 Data output
		3.2 Processing common serialization formats
			3.2.1 XML
			Technique 8: MapReduce and XML
			3.2.2 JSON
			Technique 9: MapReduce and JSON
		3.3 Big data serialization formats
			3.3.1 Comparing SequenceFile, Protocol Buffers, Thrift, and Avro
			3.3.2 SequenceFile
			Technique 10: Working with SequenceFiles
			Technique 11: Using SequenceFiles to encode Protocol Buffers
			3.3.3 Protocol Buffers
			3.3.4 Thrift
			3.3.5 Avro
			Technique 12: Avro’s schema and code generation
			Technique 13: Selecting the appropriate way to use Avro in MapReduce
			Technique 14: Mixing Avro and non-Avro data in MapReduce
			Technique 15: Using Avro records in MapReduce
			Technique 16: Using Avro key/value pairs in MapReduce
			Technique 17: Controlling how sorting works in MapReduce
			Technique 18: Avro and Hive
			Technique 19: Avro and Pig
		3.4 Columnar storage
			3.4.1 Understanding object models and storage formats
			3.4.2 Parquet and the Hadoop ecosystem
			3.4.3 Parquet block and page sizes
			Technique 20: Reading Parquet files via the command line
			Technique 21: Reading and writing Avro data in Parquet with Java
			Technique 22: Parquet and MapReduce
			Technique 23: Parquet and Hive/Impala
			Technique 24: Pushdown predicates and projection with Parquet
			3.4.4 Parquet limitations
		3.5 Custom file formats
			3.5.1 Input and output formats
			Technique 25: Writing input and output formats for CSV
			3.5.2 The importance of output committing
		3.6 Chapter summary
	Chapter 4: Organizing and optimizing data in HDFS
		4.1 Data organization
			4.1.1 Directory and file layout
			4.1.2 Data tiers
			4.1.3 Partitioning
			Technique 26: Using MultipleOutputs to partition your data
			Technique 27: Using a custom MapReduce partitioner
			4.1.4 Compacting
			Technique 28: Using filecrush to compact data
			Technique 29: Using Avro to store multiple small binary files
			4.1.5 Atomic data movement
		4.2 Efficient storage with compression
			Technique 30: Picking the right compression codec for your data
			Technique 31: Compression with HDFS, MapReduce, Pig, and Hive
			Technique 32: Splittable LZOP with MapReduce, Hive, and Pig
		4.3 Chapter summary
	Chapter 5: Moving data into and out of Hadoop
		5.1 Key elements of data movement
		5.2 Moving data into Hadoop
			5.2.1 Roll your own ingest
			Technique 33: Using the CLI to load files
			Technique 34: Using REST to load files
			Technique 35: Accessing HDFS from behind a firewall
			Technique 36: Mounting Hadoop with NFS
			Technique 37: Using DistCp to copy data within and between clusters
			Technique 38: Using Java to load files
			5.2.2 Continuous movement of log and binary files into HDFS
			Technique 39: Pushing system log messages into HDFS with Flume
			Technique 40: An automated mechanism to copy files into HDFS
			Technique 41: Scheduling regular ingress activities with Oozie
			5.2.3 Databases
			Technique 42: Using Sqoop to import data from MySQL
			5.2.4 HBase
			Technique 43: HBase ingress into HDFS
			Technique 44: MapReduce with HBase as a data source
			5.2.5 Importing data from Kafka
			Technique 45: Using Camus to copy Avro data from Kafka into HDFS
		5.3 Moving data out of Hadoop
			5.3.1 Roll your own egress
			Technique 46: Using the CLI to extract files
			Technique 47: Using REST to extract files
			Technique 48: Reading from HDFS when behind a firewall
			Technique 49: Mounting Hadoop with NFS
			Technique 50: Using DistCp to copy data out of Hadoop
			Technique 51: Using Java to extract files
			5.3.2 Automated file egress
			Technique 52: An automated mechanism to export files from HDFS
			5.3.3 Databases
			Technique 53: Using Sqoop to export data to MySQL
			5.3.4 NoSQL
		5.4 Chapter summary
Part 3: Big data patterns
	Chapter 6: Applying MapReduce patterns to big data
		6.1 Joining
			Technique 54: Picking the best join strategy for your data
			Technique 55: Filters, projections, and pushdowns
			6.1.1 Map-side joins
			Technique 56: Joining data where one dataset can fit into memory
			Technique 57: Performing a semi-join on large datasets
			Technique 58: Joining on presorted and prepartitioned data
			6.1.2 Reduce-side joins
			Technique 59: A basic repartition join
			Technique 60: Optimizing the repartition join
			Technique 61: Using Bloom filters to cut down on shuffled data
			6.1.3 Data skew in reduce-side joins
			Technique 62: Joining large datasets with high join-key cardinality
			Technique 63: Handling skews generated by the hash partitioner
		6.2 Sorting
			6.2.1 Secondary sort
			Technique 64: Implementing a secondary sort
			6.2.2 Total order sorting
			Technique 65: Sorting keys across multiple reducers
		6.3 Sampling
			Technique 66: Writing a reservoir-sampling InputFormat
		6.4 Chapter summary
	Chapter 7: Utilizing data structures and algorithms at scale
		7.1 Modeling data and solving problems with graphs
			7.1.1 Modeling graphs
			7.1.2 Shortest-path algorithm
			Technique 67: Find the shortest distance between two users
			7.1.3 Friends-of-friends algorithm
			Technique 68: Calculating FoFs
			7.1.4 Using Giraph to calculate PageRank over a web graph
			Technique 69: Calculate PageRank over a web graph
		7.2 Bloom filters
			Technique 70: Parallelized Bloom filter creation in MapReduce
		7.3 HyperLogLog
			7.3.1 A brief introduction to HyperLogLog
			Technique 71: Using HyperLogLog to calculate unique counts
		7.4 Chapter summary
	Chapter 8: Tuning, debugging, and testing
		8.1 Measure, measure, measure
		8.2 Tuning MapReduce
			8.2.1 Common inefficiencies in MapReduce jobs
			Technique 72: Viewing job statistics
			8.2.2 Map optimizations
			Technique 73: Data locality
			Technique 74: Dealing with a large number of input splits
			Technique 75: Generating input splits in the cluster with YARN
			8.2.3 Shuffle optimizations
			Technique 76: Using the combiner
			Technique 77: Blazingly fast sorting with binary comparators
			Technique 78: Tuning the shuffle internals
			8.2.4 Reducer optimizations
			Technique 79: Too few or too many reducers
			8.2.5 General tuning tips
			Technique 80: Using stack dumps to discover unoptimized user code
			Technique 81: Profiling your map and reduce tasks
		8.3 Debugging
			8.3.1 Accessing container log output
			Technique 82: Examining task logs
			8.3.2 Accessing container start scripts
			Technique 83: Figuring out the container startup command
			8.3.3 Debugging OutOfMemory errors
			Technique 84: Force container JVMs to generate a heap dump
			8.3.4 MapReduce coding guidelines for effective debugging
			Technique 85: Augmenting MapReduce code for better debugging
		8.4 Testing MapReduce jobs
			8.4.1 Essential ingredients for effective unit testing
			8.4.2 MRUnit
			Technique 86: Using MRUnit to unit-test MapReduce
			8.4.3 LocalJobRunner
			Technique 87: Heavyweight job testing with the LocalJobRunner
			8.4.4 MiniMRYarnCluster
			Technique 88: Using MiniMRYarnCluster to test your jobs
			8.4.5 Integration and QA testing
		8.5 Chapter summary
Part 4: Beyond MapReduce
	Chapter 9: SQL on Hadoop
		9.1 Hive
			9.1.1 Hive basics
			9.1.2 Reading and writing data
			Technique 89: Working with text files
			Technique 90: Exporting data to local disk
			9.1.3 User-defined functions in Hive
			Technique 91: Writing UDFs
			9.1.4 Hive performance
			Technique 92: Partitioning
			Technique 93: Tuning Hive joins
		9.2 Impala
			9.2.1 Impala vs. Hive
			9.2.2 Impala basics
			Technique 94: Working with text
			Technique 95: Working with Parquet
			Technique 96: Refreshing metadata
			9.2.3 User-defined functions in Impala
			Technique 97: Executing Hive UDFs in Impala
		9.3 Spark SQL
			Technique 98: Calculating stock averages with Spark SQL
			Technique 99: Language-integrated queries
			Technique 100: Hive and Spark SQL
			9.3.1 Spark 101
			9.3.2 Spark on Hadoop
			9.3.3 SQL with Spark
		9.4 Chapter summary
	Chapter 10: Writing a YARN application
		10.1 Fundamentals of building a YARN application
			10.1.1 Actors
			10.1.2 The mechanics of a YARN application
		10.2 Building a YARN application to collect cluster statistics
			Technique 101: A bare-bones YARN client
			Technique 102: A bare-bones ApplicationMaster
			Technique 103: Running the application and accessing logs
			Technique 104: Debugging using an unmanaged application master
		10.3 Additional YARN application capabilities
			10.3.1 RPC between components
			10.3.2 Service discovery
			10.3.3 Checkpointing application progress
			10.3.4 Avoiding split-brain
			10.3.5 Long-running applications
			10.3.6 Security
		10.4 YARN programming abstractions
			10.4.1 Twill
			10.4.2 Spring
			10.4.3 REEF
			10.4.4 Picking a YARN API abstraction
		10.5 Chapter summary
appendix: Installing Hadoop and friends
	A.1 Code for the book
	A.2 Recommended Java versions
	A.3 Hadoop
		Apache tarball installation
		Hadoop 1.x UI ports
		Hadoop 2.x UI ports
	A.4 Flume
		Getting more information
		Installation on Apache Hadoop 1.x systems
		Installation on Apache Hadoop 2.x systems
	A.5 Oozie
		Getting more information
		Installation on Hadoop 1.x systems
		Installation on Hadoop 2.x systems
	A.6 Sqoop
		Getting more information
		Installation
	A.7 HBase
		Getting more information
		Installation
	A.8 Kafka
		Getting more information
		Installation
	A.9 Camus
		Getting more information
		Installation on Hadoop 1
		Installation on Hadoop 2
	A.10 Avro
		Getting more information
		Installation
	A.11 Apache Thrift
		Getting more information
		Building Thrift 0.7
	A.12 Protocol Buffers
		Getting more information
		Building Protocol Buffers
	A.13 Snappy
		Getting more information
	A.14 LZOP
		Getting more information
		Building LZOP
	A.15 Elephant Bird
		Getting more information
	A.16 Hive
		Getting more information
		Installation
	A.17 R
		Getting more information
		Installation on Red Hat–based systems
		Installation on non–Red Hat systems
	A.18 RHadoop
		Getting more information
		rmr/rhdfs installation
	A.19 Mahout
		Getting more information
		Installation
index
	A
	B
	C
	D
	E
	F
	G
	H
	I
	J
	K
	L
	M
	N
	O
	P
	Q
	R
	S
	T
	U
	V
	W
	X
	Y
	Z