YARN Essentials
Table of Contents
YARN Essentials
Credits
About the Authors
About the Reviewers
www.PacktPub.com
Support files, eBooks, discount offers, and more
Why subscribe?
Free access for Packt account holders
Preface
What this book covers
What you need for this book
Who this book is for
Conventions
Reader feedback
Customer support
Downloading the example code
Errata
Piracy
Questions
1. Need for YARN
The redesign idea
Limitations of the classical MapReduce or Hadoop 1.x
YARN as the modern operating system of Hadoop
What are the design goals for YARN
Summary
2. YARN Architecture
Core components of YARN architecture
ResourceManager
ApplicationMaster (AM)
NodeManager (NM)
YARN scheduler policies
The FIFO (First In First Out) scheduler
The fair scheduler
The capacity scheduler
Recent developments in YARN architecture
Summary
3. YARN Installation
Single-node installation
Prerequisites
Platform
Software
Starting with the installation
The standalone mode (local mode)
The pseudo-distributed mode
The fully-distributed mode
HistoryServer
Slave files
Operating Hadoop and YARN clusters
Starting Hadoop and YARN clusters
Stopping Hadoop and YARN clusters
Web interfaces of the Ecosystem
Summary
4. YARN and Hadoop Ecosystems
The Hadoop 2 release
A short introduction to Hadoop 1.x and MRv1
MRv1 versus MRv2
Understanding where YARN fits into Hadoop
Old and new MapReduce APIs
Backward compatibility of MRv2 APIs
Binary compatibility of org.apache.hadoop.mapred APIs
Source compatibility of org.apache.hadoop.mapred APIs
Practical examples of MRv1 and MRv2
Preparing the input file(s)
Running the job
Result
Summary
5. YARN Administration
Container allocation
Container allocation to the application
Container configurations
YARN scheduling policies
The FIFO (First In First Out) scheduler
The capacity scheduler
Capacity scheduler configurations
The fair scheduler
Fair scheduler configurations
YARN multitenancy application support
Administration of YARN
Administrative tools
Adding and removing nodes from a YARN cluster
Administrating YARN jobs
MapReduce job configurations
YARN log management
YARN web user interface
Summary
6. Developing and Running a Simple YARN Application
Running sample examples on YARN
Running a sample Pi example
Monitoring YARN applications with web GUI
YARN’s MapReduce support
The MapReduce ApplicationMaster
Example YARN MapReduce settings
YARN’s compatibility with MapReduce applications
Developing YARN applications
The YARN application workflow
Writing the YARN client
Writing the YARN ApplicationMaster
Responsibilities of the ApplicationMaster
Summary
7. YARN Frameworks
Apache Samza
Writing a Kafka producer
Writing the hello-samza project
Starting a grid
Storm-YARN
Prerequisites
Hadoop YARN should be installed
Apache ZooKeeper should be installed
Setting up Storm-YARN
Getting the storm.yaml configuration of the launched Storm cluster
Building and running Storm-Starter examples
Apache Spark
Why run on YARN?
Apache Tez
Apache Giraph
HOYA (HBase on YARN)
KOYA (Kafka on YARN)
Summary
8. Failures in YARN
ResourceManager failures
ApplicationMaster failures
NodeManager failures
Container failures
Hardware Failures
Summary
9. YARN – Alternative Solutions
Mesos
Omega
Corona
Summary
10. YARN – Future and Support
What YARN means to the big data industry
Journey – present and future
Present on-going features
Future features
YARN-supported frameworks
Summary
Index
YARN Essentials
Copyright © 2015 Packt Publishing
All rights reserved. No part of this book may be reproduced, stored in a retrieval system,
or transmitted in any form or by any means, without the prior written permission of the
publisher, except in the case of brief quotations embedded in critical articles or reviews.
Every effort has been made in the preparation of this book to ensure the accuracy of the
information presented. However, the information contained in this book is sold without
warranty, either express or implied. Neither the authors, nor Packt Publishing, and its
dealers and distributors will be held liable for any damages caused or alleged to be caused
directly or indirectly by this book.
Packt Publishing has endeavored to provide trademark information about all of the
companies and products mentioned in this book by the appropriate use of capitals.
However, Packt Publishing cannot guarantee the accuracy of this information.
First published: February 2015
Production reference: 1190215
Published by Packt Publishing Ltd.
Livery Place
35 Livery Street
Birmingham B3 2PB, UK.
ISBN 978-1-78439-173-7
www.packtpub.com
Credits
Authors
Amol Fasale
Nirmal Kumar
Reviewers
Lakshmi Narasimhan
Swapnil Salunkhe
Jenny (Xiao) Zhang
Commissioning Editor
Taron Pereira
Acquisition Editor
James Jones
Content Development Editor
Arwa Manasawala
Technical Editor
Indrajit A. Das
Copy Editors
Karuna Narayanan
Laxmi Subramanian
Project Coordinator
Purav Motiwalla
Proofreaders
Safis Editing
Maria Gould
Indexer
Priya Sane
Graphics
Sheetal Aute
Valentina D’silva
Abhinash Sahu
Production Coordinator
Shantanu N. Zagade
Cover Work
Shantanu N. Zagade
About the Authors
Amol Fasale has more than 4 years of industry experience actively working in the fields
of big data and distributed computing; he is also an active blogger in and contributor to the
open source community. Amol works as a senior data system engineer at
MakeMyTrip.com, a very well-known travel and hospitality portal in India, where he is responsible
for real-time personalization of the online user experience with Apache Kafka, Apache Storm,
Apache Hadoop, and many more. Amol also has active hands-on experience in
Java/J2EE, Spring Frameworks, Python, machine learning, Hadoop framework
components, SQL, NoSQL, and graph databases.
You can follow Amol on Twitter at @amolfasale or on LinkedIn. Amol is very active on
social media. You can catch him online for any technical assistance; he would be happy to
help.
Amol has completed his bachelor’s in engineering (electronics and telecommunication)
from Pune University and a postgraduate diploma in computers from CDAC.
The gift of love is one of the greatest blessings from parents, and I am heartily thankful to
my mom, dad, friends, and colleagues who have shown and continue to show their support
in different ways. Finally, I owe much to James and Arwa, without whose direction and
understanding I would not have completed this work.
Nirmal Kumar is a lead software engineer at iLabs, the R&D team at Impetus Infotech
Pvt. Ltd. He has more than 8 years of experience in open source technologies such as Java,
JEE, Spring, Hibernate, web services, Hadoop, Hive, Flume, Sqoop, Kafka, Storm,
NoSQL databases such as HBase and Cassandra, and MPP databases such as Teradata.
You can follow him on Twitter at @nirmal___kumar. He spends most of his time reading
about and playing with different technologies. He has also undertaken many tech talks and
training sessions on big data technologies.
He has attained his master’s degree in computer applications from Harcourt Butler
Technological Institute (HBTI), Kanpur, India and is currently part of the big data R&D
team in iLabs at Impetus Infotech Pvt. Ltd.
I would like to thank my organization, especially iLabs, for supporting me in writing this
book. Also, a special thanks to the Packt Publishing team; without you guys, this work
would not have been possible.
About the Reviewers
Lakshmi Narasimhan is a full stack developer who has been working on big data and
search since the early days of Lucene and was a part of the search team at Ask.com. He is
a big advocate of open source and regularly contributes and consults on various
technologies, most notably Drupal and technologies related to big data. Lakshmi is
currently working as the curriculum designer for his own training company.
He blogs occasionally about his technical endeavors
and can be contacted via his Twitter handle, @lakshminp.
It’s hard to find a ready reference or documentation for a subject like YARN. I’d like to
thank the author for writing a book on YARN and hope the target audience finds it useful.
Swapnil Salunkhe is a passionate software developer who is keenly interested in learning
and implementing new technologies. He has a passion for functional programming,
machine learning, and working with data. He has experience working in the finance and
telecom domains.
I’d like to thank Packt Publishing and its staff for an opportunity to contribute to this
book.
Jenny (Xiao) Zhang is a technology professional in business analytics, KPIs, and big
data. She helps businesses better manage, measure, report, and analyze data to answer
critical business questions and drive business growth. She is an expert in SaaS business
and has experience in a variety of industry domains such as telecom, oil and gas, and
finance. She has written a number of blog posts on big data,
Hadoop, and YARN. She also actively uses Twitter at @smallnaruto to share insights on
big data and analytics.
I want to thank all my blog readers. It is the encouragement from them that motivates me
to deep dive into the ocean of big data. I also want to thank my dad, Michael (Tiegang)
Zhang, for providing technical insights in the process of reviewing the book. A special
thanks to the Packt Publishing team for this great opportunity.