Table of Contents
Deep Learning with Hadoop
Credits
About the Author
About the Reviewers
www.PacktPub.com
Why subscribe?
Customer Feedback
Dedication
Preface
What this book covers
What you need for this book
Who this book is for
Conventions
Reader feedback
Customer support
Downloading the example code
Downloading the color images of this book
Errata
Piracy
Questions
1. Introduction to Deep Learning
Getting started with deep learning
Deep feed-forward networks
Various learning algorithms
Unsupervised learning
Supervised learning
Semi-supervised learning
Deep learning terminologies
Deep learning: A revolution in Artificial Intelligence
Motivations for deep learning
The curse of dimensionality
The vanishing gradient problem
Distributed representation
Classification of deep learning networks
Deep generative or unsupervised models
Deep discriminate models
Summary
2. Distributed Deep Learning for Large-Scale Data
Deep learning for massive amounts of data
Challenges of deep learning for big data
Challenges of deep learning due to massive volumes of data (first V)
Challenges of deep learning from a high variety of data (second V)
Challenges of deep learning from a high velocity of data (third V)
Challenges of deep learning to maintain the veracity of data (fourth V)
Distributed deep learning and Hadoop
Map-Reduce
Iterative Map-Reduce
Yet Another Resource Negotiator (YARN)
Important characteristics for distributed deep learning design
Deeplearning4j - an open source distributed framework for deep learning
Major features of Deeplearning4j
Summary of functionalities of Deeplearning4j
Setting up Deeplearning4j on Hadoop YARN
Getting familiar with Deeplearning4j
Integration of Hadoop YARN and Spark for distributed deep learning
Rules to configure memory allocation for Spark on Hadoop YARN
Summary
3. Convolutional Neural Network
Understanding convolution
Background of a CNN
Architecture overview
Basic layers of CNN
Importance of depth in a CNN
Convolutional layer
Sparse connectivity
Improved time complexity
Parameter sharing
Improved space complexity
Equivariant representations
Choosing the hyperparameters for Convolutional layers
Depth
Stride
Zero-padding
Mathematical formulation of hyperparameters
Effect of zero-padding
ReLU (Rectified Linear Units) layers
Advantages of ReLU over the sigmoid function
Pooling layer
Where is it useful, and where is it not?
Fully connected layer
Distributed deep CNN
Most popular aggressive deep neural networks and their configurations
Training time - major challenges associated with deep neural networks
Hadoop for deep CNNs
Convolutional layer using Deeplearning4j
Loading data
Model configuration
Training and evaluation
Summary
4. Recurrent Neural Network
What makes recurrent networks distinctive from others?
Recurrent neural networks (RNNs)
Unfolding recurrent computations
Advantages of a model unfolded in time
Memory of RNNs
Architecture
Backpropagation through time (BPTT)
Error computation
Long short-term memory
Problem with deep backpropagation with time
Long short-term memory
Bi-directional RNNs
Shortfalls of RNNs
Solutions to overcome
Distributed deep RNNs
RNNs with Deeplearning4j
Summary
5. Restricted Boltzmann Machines
Energy-based models
Boltzmann machines
How Boltzmann machines learn
Shortfall
Restricted Boltzmann machine
The basic architecture
How RBMs work
Convolutional Restricted Boltzmann machines
Stacked Convolutional Restricted Boltzmann machines
Deep Belief networks
Greedy layer-wise training
Distributed Deep Belief network
Distributed training of Restricted Boltzmann machines
Distributed training of Deep Belief networks
Distributed backpropagation algorithm
Performance evaluation of RBMs and DBNs
Drastic improvement in training time
Implementation using Deeplearning4j
Restricted Boltzmann machines
Deep Belief networks
Summary
6. Autoencoders
Autoencoder
Regularized autoencoders
Sparse autoencoders
Sparse coding
Sparse autoencoders
The k-Sparse autoencoder
How to select the sparsity level k
Effect of sparsity level
Deep autoencoders
Training of deep autoencoders
Implementation of deep autoencoders using Deeplearning4j
Denoising autoencoder
Architecture of a Denoising autoencoder
Stacked denoising autoencoders
Implementation of a stacked denoising autoencoder using Deeplearning4j
Applications of autoencoders
Summary
7. Miscellaneous Deep Learning Operations using Hadoop
Distributed video decoding in Hadoop
Large-scale image processing using Hadoop
Application of Map-Reduce jobs
Natural language processing using Hadoop
Web crawler
Extraction of keyword and module for natural language processing
Estimation of relevant keywords from a page
Summary
1. References
Deep Learning with Hadoop
Copyright © 2017 Packt Publishing
All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or
transmitted in any form or by any means, without the prior written permission of the publisher,
except in the case of brief quotations embedded in critical articles or reviews.
Every effort has been made in the preparation of this book to ensure the accuracy of the
information presented. However, the information contained in this book is sold without warranty,
either express or implied. Neither the author, nor Packt Publishing, and its dealers and
distributors will be held liable for any damages caused or alleged to be caused directly or
indirectly by this book.
Packt Publishing has endeavored to provide trademark information about all of the companies
and products mentioned in this book by the appropriate use of capitals. However, Packt
Publishing cannot guarantee the accuracy of this information.
First published: February 2017
Production reference: 1130217
Published by Packt Publishing Ltd.
Livery Place
35 Livery Street
Birmingham
B3 2PB, UK.
ISBN 978-1-78712-476-9
www.packtpub.com
Credits
Author: Dipayan Dev
Reviewers: Shashwat Shriparv, Wissem El Khlifi
Commissioning Editor: Amey Varangaonkar
Acquisition Editor: Divya Poojari
Content Development Editor: Sumeet Sawant
Technical Editor: Nilesh Sawakhande
Copy Editor: Safis Editing
Project Coordinator: Shweta H Birwatkar
Proofreader: Safis Editing
Indexer: Mariammal Chettiyar
Graphics: Tania Dutta
Production Coordinator: Melwyn Dsa
About the Author
Dipayan Dev completed his M.Tech at the National Institute of Technology, Silchar, with a
first class first and is currently working as a software professional in Bengaluru, India. He has
extensive knowledge and experience in non-relational database technologies, having primarily
worked with large-scale data over the last few years. His core expertise lies in the Hadoop
framework. During his postgraduate studies, Dipayan built an infinitely scalable framework for
Hadoop, called Dr. Hadoop, which was published in a top-tier SCI-E indexed Springer journal.
Dr. Hadoop has recently been cited by Wikipedia in its Apache Hadoop article. Apart from that,
he is interested in a wide range of distributed system technologies, such as Redis, Apache
Spark, Elasticsearch, Hive, Pig, Riak, and other NoSQL databases. Dipayan has also authored
various research papers and book chapters, which are published by IEEE and top-tier Springer
journals. To know more about him, you can also visit his LinkedIn profile
/>
About the Reviewers
Shashwat Shriparv has more than 7 years of IT experience. He has worked with various
technologies on his career path, such as Hadoop and subprojects, Java, .NET, and so on. He
has experience in technologies such as Hadoop, HBase, Hive, Pig, Flume, Sqoop, Mongo,
Cassandra, Java, C#, Linux, Scripting, PHP, C++, C, Web technologies, and various real-life
use cases in Big Data technologies as a developer and administrator. He likes to ride bikes, has
an interest in photography, and writes blogs when not working.
He has worked with companies such as CDAC, Genilok, HCL, UIDAI (Aadhaar), and Pointcross; he
is currently working with CenturyLink Cognilytics.
He is the author of Learning HBase, Packt Publishing, the reviewer of Pig Design Pattern book,
Packt Publishing, and the reviewer of Hadoop Real-World Solution cookbook, 2nd edition.
I would like to take this opportunity to thank everyone who has somehow made my life better,
appreciated me at my best, borne with me, and supported me during my bad times.
Wissem El Khlifi is the first Oracle ACE in Spain and an Oracle Certified Professional DBA
with over 12 years of IT experience. He earned a Computer Science Engineer degree from
FST Tunisia, a Masters in Computer Science from the UPC Barcelona, and a Masters in Big Data
Science from the UPC Barcelona. His areas of interest include Cloud Architecture, Big Data
Architecture, and Big Data Management & Analysis.
His career has included the roles of Java analyst/programmer, Oracle Senior DBA, and big
data scientist. He currently works as Senior Big Data and Cloud Architect for Schneider Electric
/ APC. He writes numerous articles on his website, and his Twitter
handle is @orawiss.
Dedication
To my mother, Dipti Deb and father, Tarun Kumar Deb.
And also my elder brother, Tapojit Deb.
Preface
This book will teach you how to deploy large-scale datasets in deep neural networks with
Hadoop for optimal performance.
Starting with understanding what deep learning is, and what the various models associated with
deep neural networks are, this book will then show you how to set up the Hadoop environment
for deep learning.
What this book covers
Chapter 1, Introduction to Deep Learning, covers how deep learning has gained popularity
over the last decade and is now growing even faster than machine learning due to its enhanced
functionalities. This chapter starts with an introduction to the real-life applications of Artificial
Intelligence, the associated challenges, and how effectively deep learning is able to address all
of these. The chapter provides an in-depth explanation of deep learning by addressing some of
the major machine learning problems, such as the curse of dimensionality, the vanishing gradient
problem, and the like. To get started with deep learning for the subsequent chapters, the
classification of various deep learning networks is discussed in the latter part of this chapter.
This chapter is primarily suitable for readers who are interested in the basics of deep
learning without going into the details of individual deep neural networks.
Chapter 2, Distributed Deep Learning for Large-Scale Data, explains that big data and deep
learning are undoubtedly the two hottest technical trends in recent days. Both of them are
critically interconnected and have shown tremendous growth in the past few years. This
chapter starts with how deep learning technologies can be furnished with massive amounts of
unstructured data to facilitate the extraction of valuable hidden information from it. Famous
technology companies such as Google, Facebook, Apple, and the like are using this
large-scale data in their deep learning projects to train some aggressively deep neural networks in a
smarter way. Deep neural networks, however, show certain challenges while dealing with big
data. This chapter provides a detailed explanation of all these challenges. The latter part of the
chapter introduces Hadoop, to discuss how deep learning models can be implemented using
Hadoop's YARN and its iterative Map-Reduce paradigm. The chapter further introduces
Deeplearning4j, a popular open source distributed framework for deep learning, and explains its
various components.
Chapter 3, Convolutional Neural Network, introduces the Convolutional neural network (CNN), a
deep neural network widely used by top technology companies in their various deep learning
projects. CNN comes with a vast range of applications in various fields such as image
recognition, video recognition, natural language processing, and so on. Convolution, a special
type of mathematical operation, is an integral component of CNN. To get started, the chapter
initially discusses the concept of convolution with a real-life example. Further, an in-depth
explanation of the Convolutional neural network is provided by describing each component of the
network. To improve the performance of the network, CNN comes with three important
properties, namely, sparse connectivity, parameter sharing, and equivariant representation.
The chapter explains all of these to get a better grip on CNN. Further, CNN also possesses a few
crucial hyperparameters, which help in deciding the dimension of the output volume of the network.
A detailed discussion along with the mathematical relationship among these hyperparameters
can be found in this chapter. The latter part of the chapter focuses on distributed convolutional
neural networks and shows their implementation using Hadoop and Deeplearning4j.
Chapter 4, Recurrent Neural Network, explains that a recurrent neural network (RNN) is a special
type of neural network that can work over long sequences of vectors to produce different
sequences of vectors. Recently, RNNs have become an extremely popular choice for modeling
sequences of variable length. RNN has been successfully implemented for various applications
such as speech recognition, online handwriting recognition, language modeling, and the like.
The chapter provides a detailed explanation of the various concepts of RNN by providing
essential mathematical relations and visual representations. RNN possesses its own memory to
store the output of the intermediate hidden layer. Memory is the core component of the recurrent
neural network, and is discussed in this chapter with an appropriate block diagram. Moreover,
the limitations of uni-directional recurrent neural networks are provided, and to overcome them,
the concept of the bidirectional recurrent neural network (BRNN) is introduced. Later, to address
the problem of the vanishing gradient, introduced in chapter 1, a special unit of RNN called Long
short-term Memory (LSTM) is discussed. In the end, the implementation of a distributed deep
recurrent neural network with Hadoop is shown with Deeplearning4j.
Chapter 5, Restricted Boltzmann Machines, notes that the models discussed in chapters 3
and 4 are both discriminative models, and then discusses a generative model called the Restricted
Boltzmann machine (RBM). RBM is capable of randomly producing
visible data values when hidden parameters are supplied to it. The chapter starts by
introducing the concept of an Energy-based model, and explains how Restricted Boltzmann
machines are related to it. Furthermore, the discussion progresses towards a special type of
RBM known as the Convolutional Restricted Boltzmann machine, which is a combination of both
Convolution and Restricted Boltzmann machines, and facilitates the extraction of the features
of high-dimensional images.
Deep Belief networks (DBN), a widely used multilayer network composed of several Restricted
Boltzmann machines, are introduced in the latter part of the chapter. This part also discusses
how DBN can be implemented in a distributed environment using Hadoop. The implementation
of RBM as well as distributed DBN using Deeplearning4j is discussed at the end of the chapter.
Chapter 6, Autoencoders, introduces one more generative model called the autoencoder, which is
generally used for dimensionality reduction, feature learning, or extraction. The chapter starts
by explaining the basic concept of the autoencoder and its generic block diagram. The core
structure of an autoencoder is basically divided into two parts, encoder and decoder. The
encoder maps the input to the hidden layer, whereas the decoder maps the hidden layer to the
output layer. The primary concern of a basic autoencoder is to copy certain aspects of the input
layer to the output layer. The next part of the chapter discusses a type of autoencoder called the
sparse autoencoder, which is based on the distributed sparse representation of the hidden
layer. Going further, the concept of the deep autoencoder, comprising multiple encoders and
decoders, is explained in depth with an appropriate example and block diagram. The
denoising autoencoder and stacked denoising autoencoder are explained in the latter
part of the chapter. In conclusion, chapter 6 also shows the implementation of the stacked
denoising autoencoder and deep autoencoder in Hadoop using Deeplearning4j.
Chapter 7, Miscellaneous Deep Learning Operations using Hadoop, focuses mainly on the
design of the three most commonly used machine learning applications in a distributed environment.
The chapter discusses the implementation of large-scale video processing, large-scale image
processing, and natural language processing (NLP) with Hadoop. It explains how
large-scale video and image datasets can be deployed in the Hadoop Distributed File System (HDFS)
and processed with the Map-Reduce algorithm. For NLP, an in-depth explanation of the design and
implementation is provided at the end of the chapter.
What you need for this book
We expect all readers of this book to have some background in computer science. This
book mainly discusses different deep neural networks, their designs, and their applications with
Deeplearning4j. To get the most out of the book, readers are expected to know the
basics of machine learning, linear algebra, probability theory, and the concepts of distributed
systems and Hadoop. For the implementation of deep neural networks with Hadoop,
Deeplearning4j is used extensively throughout this book. Following is the link for
everything you need to run Deeplearning4j:
/>
Who this book is for
If you are a data scientist who wants to learn how to perform deep learning on Hadoop, this is
the book for you. Knowledge of the basic machine learning concepts and some understanding
of Hadoop is required to make the best use of this book.
Conventions
In this book, you will find a number of text styles that distinguish between different kinds of
information. Here are some examples of these styles and an explanation of their meaning.
Code words in text, database table names, folder names, filenames, file extensions,
pathnames, dummy URLs, user input, and Twitter handles are shown as follows: "The .build()
function is used to build the layer."
A block of code is set as follows:
public static final String DATA_URL =
" />
When we wish to draw your attention to a particular part of a code block, the relevant lines or
items are set in bold:
MultiLayerNetwork model = new MultiLayerNetwork(getConfiguration());
model.init();
New terms and important words are shown in bold. Words that you see on the screen, for
example, in menus or dialog boxes, appear in the text like this: "In simple words, any neural
network with two or more layers (hidden) is defined as a deep feed-forward network or feed-forward neural network."
Note
Warnings or important notes appear in a box like this.
Tip
Tips and tricks appear like this.
Chapter 1. Introduction to Deep Learning
"By far the greatest danger of Artificial Intelligence is that people conclude too early that
they understand it."
-- Eliezer Yudkowsky
Ever wondered why it is often difficult to beat the computer at chess, even for the best players of
the game? How is Facebook able to recognize your face amid hundreds of millions of photos?
How can your mobile phone recognize your voice, and redirect the call to the correct person,
from hundreds of contacts listed?
The primary goal of this book is to deal with many of those queries, and to provide detailed
solutions to the readers. This book can be used for a wide range of reasons by a variety of
readers; however, we wrote the book with two main target audiences in mind. One of the
primary target audiences is undergraduate or graduate university students learning about deep
learning and Artificial Intelligence; the second group of readers are the software engineers who
already have a knowledge of big data, deep learning, and statistical modeling, but want to
rapidly gain knowledge of how deep learning can be used for big data and vice versa.
This chapter will mainly try to set a foundation for the readers by providing the basic concepts,
terminologies, characteristics, and the major challenges of deep learning. The chapter will also
put forward the classification of different deep network algorithms, which have been widely
used by researchers over the last decade. The following are the main topics that this chapter
will cover:
Getting started with deep learning
Deep learning terminologies
Deep learning: A revolution in Artificial Intelligence
Classification of deep learning networks
Ever since the dawn of civilization, people have dreamt of building artificial machines or
robots that can behave and work exactly like human beings. From the Greek mythological
characters to the ancient Hindu epics, there are numerous examples that clearly
suggest people's interest in and inclination towards creating artificial life.
During the initial computer generations, people wondered whether the computer could ever
become as intelligent as a human being. Going forward, even in medical science, the need for
automated machines has become indispensable and almost unavoidable. With this need and
constant research in the same field, Artificial Intelligence (AI) has turned out to be a
flourishing technology with various applications in several domains, such as image processing,
video processing, and many diagnosis tools in medical science.
Although there are many problems that are resolved by AI systems on a daily basis, nobody
can write down the specific rules by which such a system solves them. A few of these intuitive
problems are as follows:
Google search, which does a really good job of understanding what you type or speak
As mentioned earlier, Facebook is also somewhat good at recognizing your face, and
hence, understanding your interests
Moreover, with the integration of various other fields, for example, probability, linear algebra,
statistics, machine learning, deep learning, and so on, AI has already gained a huge amount of
popularity in the research field over the course of time.
One of the key reasons for the early success of AI could be that it basically dealt with
fundamental problems for which the computer did not require a vast amount of knowledge. For
example, in 1997, IBM's Deep Blue chess-playing system was able to defeat the world
champion Garry Kasparov [1]. Although this kind of achievement at that time can be considered
significant, it was definitely not a burdensome task to train the computer with only the limited
number of rules involved in chess. Supplying a system with a fixed and limited number of rules is
termed hard-coded knowledge. Many Artificial Intelligence projects have
tried to hard-code knowledge about the various aspects of the world in
formal languages. As time progressed, this hard-coded knowledge did not seem to work
with systems dealing with huge amounts of data. Moreover, the rules that the data
followed also kept changing frequently. Therefore, most of the projects
following that approach failed to live up to expectations.
The setbacks faced by this hard-coded knowledge implied that those artificial intelligence
systems needed some way of generalizing patterns and rules from the supplied raw data,
without the need for external spoon-feeding. The ability of a system to do so is termed
machine learning. There are various successful machine learning implementations that we
use in our daily life. A few of the most common and important implementations are as follows:
Spam detection: Given an e-mail in your inbox, the model can decide whether to put that
e-mail in the spam folder or in the inbox folder. A common naive Bayes model can distinguish
between such e-mails.
Credit card fraud detection: A model that can detect whether a number of transactions
performed within a specific time interval were carried out by the original customer or not.
One of the most popular machine learning models, given by Mor-Yosef et al. in 1990, used
logistic regression to recommend whether caesarean delivery was needed for
the patient or not.
There are many such models which have been implemented with the help of machine learning
techniques.
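The spam detection case above can be sketched with a very small multinomial naive Bayes
classifier. This is only an illustrative sketch, not code from this book or from Deeplearning4j:
the class name NaiveBayesSpamSketch, its methods, and the toy e-mails are all hypothetical,
and it assumes whitespace tokenization, add-one (Laplace) smoothing, and that at least one
spam and one inbox e-mail have been seen before classifying.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch: multinomial naive Bayes for spam vs. inbox e-mails. */
final class NaiveBayesSpamSketch {
    private final Map<String, Integer> spamCounts = new HashMap<>();
    private final Map<String, Integer> inboxCounts = new HashMap<>();
    private int spamWords = 0, inboxWords = 0; // total word occurrences per class
    private int spamDocs = 0, inboxDocs = 0;   // number of training e-mails per class

    /** Adds one labelled e-mail to the per-class word counts. */
    public void train(String text, boolean isSpam) {
        for (String word : text.toLowerCase().split("\\s+")) {
            if (word.isEmpty()) continue;
            if (isSpam) { spamCounts.merge(word, 1, Integer::sum); spamWords++; }
            else { inboxCounts.merge(word, 1, Integer::sum); inboxWords++; }
        }
        if (isSpam) spamDocs++; else inboxDocs++;
    }

    /** Returns true when the log-posterior of "spam" exceeds that of "inbox". */
    public boolean isSpam(String text) {
        // Vocabulary size is needed for add-one smoothing of unseen words.
        Set<String> vocabulary = new HashSet<>(spamCounts.keySet());
        vocabulary.addAll(inboxCounts.keySet());
        int v = vocabulary.size();

        // Start from the log-priors estimated from the class frequencies.
        double logSpam = Math.log((double) spamDocs / (spamDocs + inboxDocs));
        double logInbox = Math.log((double) inboxDocs / (spamDocs + inboxDocs));
        for (String word : text.toLowerCase().split("\\s+")) {
            if (word.isEmpty()) continue;
            logSpam += Math.log((spamCounts.getOrDefault(word, 0) + 1.0) / (spamWords + v));
            logInbox += Math.log((inboxCounts.getOrDefault(word, 0) + 1.0) / (inboxWords + v));
        }
        return logSpam > logInbox;
    }

    public static void main(String[] args) {
        NaiveBayesSpamSketch model = new NaiveBayesSpamSketch();
        model.train("win free money now", true);
        model.train("free prize claim now", true);
        model.train("meeting agenda for monday", false);
        model.train("project report attached", false);
        System.out.println("free money      -> spam? " + model.isSpam("free money"));
        System.out.println("project meeting -> spam? " + model.isSpam("project meeting"));
    }
}
```

The model never receives a rule such as "the word free means spam"; it only counts words in
labelled examples, which is the generalization from raw data described above.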
Figure 1.1: The figure shows an example of different types of representation. Let's say we
want to train the machine to detect some empty spaces in between the jelly beans. In the
image on the right side, we have sparse jelly beans, and it would be easier for the AI system to
determine the empty parts. However, in the image on the left side, we have extremely compact
jelly beans, and hence, it will be an extremely difficult task for the machine to find the empty
spaces. Images sourced from USC-SIPI image database
A large portion of the performance of machine learning systems depends on the data fed to the
system. This is called the representation of the data. Each piece of information included in the
representation is called a feature of the data. For example, if logistic regression is used to
detect a brain tumor in a patient, the AI system will not try to diagnose the patient directly.
Rather, the concerned doctor will provide the necessary inputs to the system according to the
common symptoms of that patient. The AI system will then match those inputs with the
past inputs that were used to train the system.
Based on the predictive analysis of the system, it will provide its decision regarding the
disease. Although logistic regression can learn and decide based on the features given, it
cannot influence or modify the way the features are defined. Logistic regression is a type of
regression model where the dependent variable takes a limited number of possible values
(for example, one of two classes) based on the independent variables, unlike linear regression,
where it is continuous. So, for example, if that model was
provided with a caesarean patient's report instead of the brain tumor patient's report, it would
surely fail to predict the correct outcome, as the given features would never match with the
trained data.
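The behaviour described above can be illustrated with a minimal logistic regression sketch.
It is a simplified illustration under stated assumptions, not the model from the studies
mentioned in the text: the class LogisticRegressionSketch, the two made-up "symptom score"
features, the learning rate, and the epoch count are all hypothetical. The model squashes a
weighted sum of the supplied features through the sigmoid function and thresholds the result,
so its output is always one of two classes, and it can only use the features it is given.

```java
/** Hypothetical sketch: binary logistic regression trained by stochastic gradient descent. */
final class LogisticRegressionSketch {
    private final double[] weights;
    private double bias = 0.0;

    LogisticRegressionSketch(int numFeatures) {
        this.weights = new double[numFeatures];
    }

    /** Maps any real number into the open interval (0, 1). */
    static double sigmoid(double z) {
        return 1.0 / (1.0 + Math.exp(-z));
    }

    /** Estimated probability that the example belongs to class 1. */
    public double probability(double[] features) {
        double z = bias;
        for (int i = 0; i < weights.length; i++) {
            z += weights[i] * features[i];
        }
        return sigmoid(z);
    }

    /** The dependent variable is limited to two values: 0 or 1. */
    public int predict(double[] features) {
        return probability(features) >= 0.5 ? 1 : 0;
    }

    /** Fits weights and bias by minimising the log-loss one example at a time. */
    public void fit(double[][] xs, int[] ys, int epochs, double learningRate) {
        for (int epoch = 0; epoch < epochs; epoch++) {
            for (int n = 0; n < xs.length; n++) {
                // Gradient of the log-loss with respect to the weighted sum z.
                double error = probability(xs[n]) - ys[n];
                for (int i = 0; i < weights.length; i++) {
                    weights[i] -= learningRate * error * xs[n][i];
                }
                bias -= learningRate * error;
            }
        }
    }

    public static void main(String[] args) {
        // Made-up feature vectors (two symptom scores in [0, 1]) and labels.
        double[][] xs = {{0.1, 0.2}, {0.2, 0.1}, {0.3, 0.3}, {0.8, 0.9}, {0.9, 0.7}, {0.7, 0.8}};
        int[] ys = {0, 0, 0, 1, 1, 1};
        LogisticRegressionSketch model = new LogisticRegressionSketch(2);
        model.fit(xs, ys, 2000, 0.5);
        System.out.println("low scores  -> class " + model.predict(new double[] {0.15, 0.15}));
        System.out.println("high scores -> class " + model.predict(new double[] {0.85, 0.85}));
    }
}
```

Note that the feature vector is fixed when the model is constructed. Feeding it values that
describe a different kind of report would still produce a 0 or a 1, but the answer would be
meaningless, which is the dependency on representation that the text describes.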
This dependency of machine learning systems on the representation of the data is not
really unknown to us. In fact, much of computer science depends on how the
data are represented. For example, the quality of a database is judged by how the
schema is designed. The execution of a database query, even on a thousand or a million rows
of data, becomes extremely fast if the table is indexed properly. Therefore, the dependency of
AI systems on the data representation should not surprise us.
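As a rough illustration of the indexing point above, the following sketch answers the same
lookup in two ways: by scanning every row, and by consulting an index built in advance. It is
not a real database; the class IndexedLookupSketch, its methods, and the synthetic row ids
are hypothetical, and a java.util.HashMap stands in for a database index.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch: the same data, queried with and without an index. */
final class IndexedLookupSketch {

    /** Inspects rows one by one; the work grows with the size of the table. */
    static int linearScan(int[] ids, int targetId) {
        for (int row = 0; row < ids.length; row++) {
            if (ids[row] == targetId) {
                return row;
            }
        }
        return -1; // not found
    }

    /** Builds an id -> row position map once, ahead of any query. */
    static Map<Integer, Integer> buildIndex(int[] ids) {
        Map<Integer, Integer> index = new HashMap<>();
        for (int row = 0; row < ids.length; row++) {
            index.put(ids[row], row);
        }
        return index;
    }

    /** Looks up the row position directly, without visiting other rows. */
    static int indexedLookup(Map<Integer, Integer> index, int targetId) {
        return index.getOrDefault(targetId, -1);
    }

    public static void main(String[] args) {
        int[] ids = new int[100_000];
        for (int row = 0; row < ids.length; row++) {
            ids[row] = row * 3; // synthetic ids, chosen so that id != row position
        }
        Map<Integer, Integer> index = buildIndex(ids);
        int target = ids[ids.length - 1]; // worst case for the scan: the last row
        System.out.println("scan  found row " + linearScan(ids, target));
        System.out.println("index found row " + indexedLookup(index, target));
    }
}
```

Both calls return the same row; only the representation of the data differs. The scan has to
visit every row in the worst case, while the indexed lookup takes roughly constant time on
average.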
There are many such examples in daily life too, where the representation of the data decides
our efficiency. To locate a person amidst 20 people is obviously easier than to locate the same
person in a crowd of 500 people. A visual representation of two different types of data
representation is shown in the preceding Figure 1.1.
Therefore, if AI systems are fed with appropriately featured data, even the hardest
problems could be resolved. However, collecting and feeding the desired data in the correct
way to the system has been a serious impediment for the computer programmer.
There can be numerous real-time scenarios where extracting the features could be a
cumbersome task. Therefore, the way the data are represented is a prime factor in
the intelligence of the system.
Note
Finding cats amidst a group of humans and cats can be extremely complicated if the
features are not appropriate. We know that cats have tails; therefore, we might like to
detect the presence of tails as a prominent feature. However, given the different tail shapes
and sizes, it is often difficult to describe exactly what a tail will look like in terms of pixel
values. Moreover, tails could sometimes be confused with the hands of humans. Also,
overlapping objects could hide a cat's tail, making the image even
more complicated.
From all the above discussions, it can be concluded that the success of AI systems depends
mainly on how the data are represented. Also, various representations can entangle and hide
the different explanatory factors of variation behind the data.
Representation learning is one of the most popular and widely practiced learning approaches
used to cope with these specific problems. Learning the representations of the next layer from
the existing representation of data can be defined as representation learning. Ideally,
representation learning algorithms learn representations that
capture the underlying factors, a subset of which might be applicable to each particular sub-task. A
simple illustration is given in the following Figure 1.2: