Manning.Algorithms.of.the.Intelligent.Web.2nd.Edi.pdf
contents
foreword ix
preface xi
acknowledgments xiii
about this book xv
1 Building applications for the intelligent web 1
1.1 An intelligent algorithm in action: Google Now 3
1.2 The intelligent algorithm lifecycle 5
1.3 Further examples of intelligent algorithms 6
1.4 Things that intelligent applications are not 7
Intelligent algorithms are not all-purpose thinking machines 7
Intelligent algorithms are not a drop-in replacement for humans 7
Intelligent algorithms are not discovered by accident 8
1.5 Classes of intelligent algorithm 8
Artificial intelligence 9
Machine learning 9
Predictive analytics 10
1.6 Evaluating the performance of intelligent algorithms 12
Evaluating intelligence 12
Evaluating predictions 12
1.7 Important notes about intelligent algorithms 15
Your data is not reliable 15
Inference does not happen instantaneously 16
Size matters! 16
Different algorithms have different scaling characteristics 16
Everything is not a nail! 17
Data isn’t everything 17
Training time can be variable 17
Generalization is the goal 17
Human intuition is problematic 18
Think about engineering new features 18
Learn many different models 18
Correlation is not the same as causation 18
1.8 Summary 19
2 Extracting structure from data: clustering and transforming your data 20
2.1 Data, structure, bias, and noise 22
2.2 The curse of dimensionality 25
2.3 K-means 26
K-means in action 31
2.4 The Gaussian mixture model 33
What is the Gaussian distribution? 34
Expectation maximization and the Gaussian distribution 36
The Gaussian mixture model 36
An example of learning using a Gaussian mixture model 38
2.5 The relationship between k-means and GMM 41
2.6 Transforming the data axis 42
Eigenvectors and eigenvalues 42
Principal component analysis 43
An example of principal component analysis 44
2.7 Summary 46
3 Recommending relevant content 47
3.1 Setting the scene: an online movie store 48
3.2 Distance and similarity 49
A closer look at distance and similarity 53
Which is the best similarity formula? 55
3.3 How do recommender engines work? 56
3.4 User-based collaborative filtering 57
3.5 Model-based recommendation using singular value decomposition 62
Singular value decomposition 63
Recommendation using SVD: choosing movies for a given user 64
Recommendation using SVD: choosing users for a given movie 69
3.6 The Netflix Prize 72
3.7 Evaluating your recommendation 74
3.8 Summary 75
4 Classification: placing things where they belong 77
4.1 The need for classification 78
4.2 An overview of classifiers 81
Structural classification algorithms 82
Statistical classification algorithms 84
The lifecycle of a classifier 85
4.3 Fraud detection with logistic regression 86
A linear regression primer 86
From linear to logistic regression 88
Implementing fraud detection 91
4.4 Are your results credible? 99
4.5 Classification with very large datasets 103
4.6 Summary 105
5 Case study: click prediction for online advertising 106
5.1 History and background 107
5.2 The exchange 109
Cookie matching 110
Bid 110
Bid win (or loss) notification 111
Ad placement 111
Ad monitoring 111
5.3 What is a bidder? 112
Requirements of a bidder 112
5.4 What is a decisioning engine? 113
Information about the user 113
Information about the placement 114
Contextual information 114
Data preparation 114
Decisioning engine model 114
Mapping predicted click-through rate to bid price 115
Feature engineering 115
Model training 116
5.5 Click prediction with Vowpal Wabbit 116
Vowpal Wabbit data format 117
Preparing the dataset 119
Testing the model 124
Model calibration 126
5.6 Complexities of building a decisioning engine 128
5.7 The future of real-time prediction 129
5.8 Summary 130
6 Deep learning and neural networks 131
6.1 An intuitive approach to deep learning 132
6.2 Neural networks 133
6.3 The perceptron 135
Training 136
Training a perceptron in scikit-learn 138
A geometric interpretation of the perceptron for two inputs 140
Published: 2017-05-17
Last refreshed: 2017-05-17