chine learning
Reading in the data
Preprocessing and cleaning the data
Choosing the right model and learning algorithm
Before building our first model...
Starting with a simple straight line
Towards some advanced stuff
Stepping back to go forward - another look at our data
Training and testing
Answering our initial question
Summary
Chapter 2: Classifying with Real-world Examples
The Iris dataset
Visualization is a good first step
Building our first classification model
Evaluation - holding out data and cross-validation
Building more complex classifiers
A more complex dataset and a more complex classifier
Learning about the Seeds dataset
Features and feature engineering
Nearest neighbor classification
Classifying with scikit-learn
Looking at the decision boundaries
Binary and multiclass classification
Summary
Chapter 3: Clustering - Finding Related Posts
Measuring the relatedness of posts
How not to do it
How to do it
Preprocessing - similarity measured as a similar number of common words
Converting raw text into a bag of words
Counting words
Normalizing word count vectors
Removing less important words
Stemming
Stop words on steroids
Our achievements and goals
Clustering
K-means
Getting test data to evaluate our ideas on
Clustering posts
Solving our initial challenge
Another look at noise
Tweaking the parameters
Summary
Chapter 4: Topic Modeling
Latent Dirichlet allocation
Building a topic model
Comparing documents by topics
Modeling the whole of Wikipedia
Choosing the number of topics
Summary
Chapter 5: Classification - Detecting Poor Answers
Sketching our roadmap
Learning to classify classy answers
Tuning the instance
Tuning the classifier
Fetching the data
Slimming the data down to chewable chunks
Preselection and processing of attributes
Defining what is a good answer
Creating our first classifier
Starting with kNN
Engineering the features
Training the classifier
Measuring the classifier's performance
Designing more features
Deciding how to improve
Bias-variance and their tradeoff
Fixing high bias
Fixing high variance
High bias or low bias
Using logistic regression
A bit of math with a small example
Applying logistic regression to our post classification problem
Looking behind accuracy - precision and recall
Slimming the classifier
Ship it!
Summary
Chapter 6: Classification II - Sentiment Analysis
Sketching our roadmap
Fetching the Twitter data
Introducing the Naive Bayes classifier
Getting to know the Bayes' theorem
Being naive
Using Naive Bayes to classify
Accounting for unseen words and other oddities
Accounting for arithmetic underflows
Creating our first classifier and tuning it
Solving an easy problem first
Using all classes
Tuning the classifier's parameters
Cleaning tweets
Taking the word types into account
Determining the word types
Successfully cheating using SentiWordNet
Our first estimator
Putting everything together
Summary
Chapter 7: Regression
Predicting house prices with regression
Multidimensional regression
Cross-validation for regression
Penalized or regularized regression
L1 and L2 penalties
Using Lasso or ElasticNet in scikit-learn
Visualizing the Lasso path
P-greater-than-N scenarios
An example based on text documents
Setting hyperparameters in a principled way
Summary
Chapter 8: Recommendations
Rating predictions and recommendations
Splitting into training and testing
Normalizing the training data
A neighborhood approach to recommendations
A regression approach to recommendations
Combining multiple methods
Basket analysis
Obtaining useful predictions
Analyzing supermarket shopping baskets
Association rule mining
More advanced basket analysis
Summary
Chapter 9: Classification - Music Genre Classification
Sketching our roadmap
Fetching the music data
Converting into a WAV format
Looking at music
Decomposing music into sine wave components
Using FFT to build our first classifier
Increasing experimentation agility
Training the classifier
Using a confusion matrix to measure accuracy in multiclass problems
An alternative way to measure classifier performance using receiver-operator characteristics
Improving classification performance with Mel Frequency Cepstral Coefficients
Summary
Chapter 10: Computer Vision
Introducing image processing
Loading and displaying images
Thresholding
Gaussian blurring
Putting the center in focus
Basic image classification
Computing features from images
Writing your own features
Using features to find similar images
Classifying a harder dataset
Local feature representations
Summary
Chapter 11: Dimensionality Reduction
Sketching our roadmap
Selecting features
Detecting redundant features using filters
Correlation
Mutual information
Asking the model about the features using wrappers
Other feature selection methods
Feature extraction
About principal component analysis
Sketching PCA
Applying PCA
Limitations of PCA and how LDA can help
Multidimensional scaling
Summary
Chapter 12: Bigger Data
Learning about big data
Using jug to break up your pipeline into tasks
An introduction to tasks in jug
Looking under the hood
Using jug for data analysis
Reusing partial results
Using Amazon Web Services
Creating your first virtual machines
Installing Python packages on Amazon Linux
Running jug on our cloud machine
Automating the generation of clusters with StarCluster
Summary
Appendix: Where to Learn More Machine Learning
Online courses
Books
Question and answer sites
Blogs
Data sources
Getting competitive
All that was left out
Summary
Index