Pattern recognition and machine learning / (Record no. 4063)

MARC details
000 -LEADER
fixed length control field 10378cam a22002297a 4500
020 ## - INTERNATIONAL STANDARD BOOK NUMBER
International Standard Book Number 0387310738 (hd.bd.)
020 ## - INTERNATIONAL STANDARD BOOK NUMBER
International Standard Book Number 9780387310732
040 ## - CATALOGING SOURCE
Transcribing agency CUS
082 00 - DEWEY DECIMAL CLASSIFICATION NUMBER
Classification number 006.4
Item number BIS/P
100 1# - MAIN ENTRY--PERSONAL NAME
Personal name Bishop, Christopher M.
245 10 - TITLE STATEMENT
Title Pattern recognition and machine learning /
Statement of responsibility, etc. Christopher M. Bishop.
260 ## - PUBLICATION, DISTRIBUTION, ETC. (IMPRINT)
Place of publication, distribution, etc. New York :
Name of publisher, distributor, etc. Springer,
Date of publication, distribution, etc. 2006.
300 ## - PHYSICAL DESCRIPTION
Extent xx, 738 p. :
Other physical details ill. (some col.) ;
Dimensions 25 cm.
440 #0 - SERIES
Title Information science and statistics
504 ## - BIBLIOGRAPHY, ETC. NOTE
Bibliography, etc Includes bibliographical references and index.
505 ## - FORMATTED CONTENTS NOTE
Formatted contents note 1 Introduction 1<br/>1.1 Example: Polynomial Curve Fitting 4<br/>1.2 Probability Theory 12<br/>1.2.1 Probability densities 17<br/>1.2.2 Expectations and covariances 19<br/>1.2.3 Bayesian probabilities 21<br/>1.2.4 The Gaussian distribution 24<br/>1.2.5 Curve fitting re-visited 28<br/>1.2.6 Bayesian curve fitting 30<br/>1.3 Model Selection 32<br/>1.4 The Curse of Dimensionality 33<br/>1.5 Decision Theory 38<br/>1.5.1 Minimizing the misclassification rate 39<br/>1.5.2 Minimizing the expected loss 41<br/>1.5.3 The reject option 42<br/>1.5.4 Inference and decision 42<br/>1.5.5 Loss functions for regression 46<br/>1.6 Information Theory 48<br/>1.6.1 Relative entropy and mutual information<br/>Exercises 58<br/>
2 Probability Distributions 67<br/>2.1 Binary Variables 68<br/>2.1.1 The beta distribution 71<br/>2.2 Multinomial Variables 74<br/>2.2.1 The Dirichlet distribution 76<br/>2.3 The Gaussian Distribution 78<br/>2.3.1 Conditional Gaussian distributions 85<br/>2.3.2 Marginal Gaussian distributions 88<br/>2.3.3 Bayes' theorem for Gaussian variables 90<br/>2.3.4 Maximum likelihood for the Gaussian 93<br/>2.3.5 Sequential estimation 94<br/>2.3.6 Bayesian inference for the Gaussian 97<br/>2.3.7 Student's t-distribution 102<br/>2.3.8 Periodic variables 105<br/>2.3.9 Mixtures of Gaussians 110<br/>2.4 The Exponential Family 113<br/>2.4.1 Maximum likelihood and sufficient statistics 116<br/>2.4.2 Conjugate priors 117<br/>2.4.3 Noninformative priors 117<br/>2.5 Nonparametric Methods 120<br/>2.5.1 Kernel density estimators 122<br/>2.5.2 Nearest-neighbour methods 124<br/>Exercises 127<br/>
3 Linear Models for Regression 137<br/>3.1 Linear Basis Function Models 138<br/>3.1.1 Maximum likelihood and least squares 140<br/>3.1.2 Geometry of least squares 143<br/>3.1.3 Sequential learning 143<br/>3.1.4 Regularized least squares 144<br/>3.1.5 Multiple outputs 146<br/>3.2 The Bias-Variance Decomposition 147<br/>3.3 Bayesian Linear Regression 152<br/>3.3.1 Parameter distribution 152<br/>3.3.2 Predictive distribution 156<br/>3.3.3 Equivalent kernel 159<br/>3.4 Bayesian Model Comparison 161<br/>3.5 The Evidence Approximation 165<br/>3.5.1 Evaluation of the evidence function 166<br/>3.5.2 Maximizing the evidence function 168<br/>3.5.3 Effective number of parameters 170<br/>3.6 Limitations of Fixed Basis Functions 172<br/>Exercises<br/>
4 Linear Models for Classification 179<br/>4.1 Discriminant Functions 181<br/>4.1.1 Two classes 181<br/>4.1.2 Multiple classes 182<br/>4.1.3 Least squares for classification 184<br/>4.1.4 Fisher's linear discriminant 186<br/>4.1.5 Relation to least squares 189<br/>4.1.6 Fisher's discriminant for multiple classes 191<br/>4.1.7 The perceptron algorithm 192<br/>4.2 Probabilistic Generative Models 196<br/>4.2.1 Continuous inputs 198<br/>4.2.2 Maximum likelihood solution 200<br/>4.2.3 Discrete features 202<br/>4.2.4 Exponential family 202<br/>4.3 Probabilistic Discriminative Models 203<br/>4.3.1 Fixed basis functions 204<br/>4.3.2 Logistic regression 205<br/>4.3.3 Iterative reweighted least squares 207<br/>4.3.4 Multiclass logistic regression 209<br/>4.3.5 Probit regression 210<br/>4.3.6 Canonical link functions 212<br/>4.4 The Laplace Approximation 213<br/>4.4.1 Model comparison and BIC 216<br/>4.5 Bayesian Logistic Regression 217<br/>4.5.1 Laplace approximation 217<br/>4.5.2 Predictive distribution 218<br/>Exercises 220<br/>
5 Neural Networks<br/>5.1 Feed-forward Network Functions<br/>5.1.1 Weight-space symmetries<br/>5.2 Network Training<br/>5.2.1 Parameter optimization<br/>5.2.2 Local quadratic approximation<br/>5.2.3 Use of gradient information<br/>5.2.4 Gradient descent optimization 241<br/>5.3 Error Backpropagation 242<br/>5.3.1 Evaluation of error-function derivatives 245<br/>5.3.2 A simple example 246<br/>5.3.3 Efficiency of backpropagation<br/>5.3.4 The Jacobian matrix 249<br/>5.4 The Hessian Matrix 250<br/>5.4.1 Diagonal approximation 251<br/>5.4.2 Outer product approximation 252<br/>5.4.3 Inverse Hessian<br/>5.4.4 Finite differences 252<br/>5.4.5 Exact evaluation of the Hessian 253<br/>5.4.6 Fast multiplication by the Hessian 254<br/>5.5 Regularization in Neural Networks 256<br/>5.5.1 Consistent Gaussian priors 257<br/>5.5.2 Early stopping 259<br/>5.5.3 Invariances 261<br/>5.5.4 Tangent propagation 263<br/>5.5.5 Training with transformed data 265<br/>5.5.6 Convolutional networks 267<br/>5.5.7 Soft weight sharing 269<br/>5.6 Mixture Density Networks 272<br/>5.7 Bayesian Neural Networks 277<br/>5.7.1 Posterior parameter distribution 278<br/>5.7.2 Hyperparameter optimization 280<br/>5.7.3 Bayesian neural networks for classification 281<br/>Exercises 284<br/>
6 Kernel Methods 291<br/>6.1 Dual Representations 293<br/>6.2 Constructing Kernels 294<br/>6.3 Radial Basis Function Networks 299<br/>6.3.1 Nadaraya-Watson model 301<br/>6.4 Gaussian Processes 303<br/>6.4.1 Linear regression revisited 304<br/>6.4.2 Gaussian processes for regression 306<br/>6.4.3 Learning the hyperparameters 311<br/>6.4.4 Automatic relevance determination 312<br/>6.4.5 Gaussian processes for classification 313<br/>6.4.6 Laplace approximation 315<br/>6.4.7 Connection to neural networks 319<br/>Exercises 320<br/>
7 Sparse Kernel Machines 325<br/>7.1 Maximum Margin Classifiers 326<br/>7.1.1 Overlapping class distributions 331<br/>7.1.2 Relation to logistic regression 336<br/>7.1.3 Multiclass SVMs 338<br/>7.1.4 SVMs for regression 339<br/>7.1.5 Computational learning theory 344<br/>7.2 Relevance Vector Machines<br/>7.2.1 RVM for regression 345<br/>7.2.2 Analysis of sparsity 349<br/>7.2.3 RVM for classification 353<br/>Exercises 357<br/>
8 Graphical Models 359<br/>8.1 Bayesian Networks 360<br/>8.1.1 Example: Polynomial regression 362<br/>8.1.2 Generative models 365<br/>8.1.3 Discrete variables 366<br/>8.1.4 Linear-Gaussian models 370<br/>8.2 Conditional Independence 372<br/>8.2.1 Three example graphs 373<br/>8.2.2 D-separation 378<br/>8.3 Markov Random Fields 383<br/>8.3.1 Conditional independence properties 383<br/>8.3.2 Factorization properties 384<br/>8.3.3 Illustration: Image de-noising 387<br/>8.3.4 Relation to directed graphs 390<br/>8.4 Inference in Graphical Models 393<br/>8.4.1 Inference on a chain 394<br/>8.4.2 Trees 398<br/>8.4.3 Factor graphs 399<br/>8.4.4 The sum-product algorithm 402<br/>8.4.5 The max-sum algorithm 411<br/>8.4.6 Exact inference in general graphs 416<br/>8.4.7 Loopy belief propagation 417<br/>8.4.8 Learning the graph structure 418<br/>Exercises 418<br/>
9 Mixture Models and EM 423<br/>9.1 K-means Clustering 424<br/>9.1.1 Image segmentation and compression 428<br/>9.2 Mixtures of Gaussians 430<br/>9.2.1 Maximum likelihood 432<br/>9.2.2 EM for Gaussian mixtures 435<br/>9.3 An Alternative View of EM 439<br/>9.3.1 Gaussian mixtures revisited 441<br/>9.3.2 Relation to K-means 443<br/>9.3.3 Mixtures of Bernoulli distributions 444<br/>9.3.4 EM for Bayesian linear regression 448<br/>9.4 The EM Algorithm in General 450<br/>Exercises 455<br/>
10 Approximate Inference 461<br/>10.1 Variational Inference 462<br/>10.1.1 Factorized distributions 464<br/>10.1.2 Properties of factorized approximations 466<br/>10.1.3 Example: The univariate Gaussian 470<br/>10.1.4 Model comparison 473<br/>10.2 Illustration: Variational Mixture of Gaussians 474<br/>10.2.1 Variational distribution 475<br/>10.2.2 Variational lower bound 481<br/>10.2.3 Predictive density 482<br/>10.2.4 Determining the number of components 483<br/>10.2.5 Induced factorizations 485<br/>10.3 Variational Linear Regression 486<br/>10.3.1 Variational distribution 486<br/>10.3.2 Predictive distribution 488<br/>10.3.3 Lower bound 489<br/>10.4 Exponential Family Distributions 490<br/>10.4.1 Variational message passing 491<br/>10.5 Local Variational Methods 493<br/>10.6 Variational Logistic Regression 498<br/>10.6.1 Variational posterior distribution 498<br/>10.6.2 Optimizing the variational parameters 500<br/>10.6.3 Inference of hyperparameters 502<br/>10.7 Expectation Propagation 505<br/>10.7.1 Example: The clutter problem 511<br/>10.7.2 Expectation propagation on graphs 513<br/>Exercises<br/>
11 Sampling Methods<br/>11.1 Basic Sampling Algorithms<br/>11.1.1 Standard distributions 526<br/>11.1.2 Rejection sampling 528<br/>11.1.3 Adaptive rejection sampling 530<br/>11.1.4 Importance sampling 532<br/>11.1.5 Sampling-importance-resampling 534<br/>11.1.6 Sampling and the EM algorithm 536<br/>11.2 Markov Chain Monte Carlo 537<br/>11.2.1 Markov chains 539<br/>11.2.2 The Metropolis-Hastings algorithm 541<br/>11.3 Gibbs Sampling 542<br/>11.4 Slice Sampling 546<br/>11.5 The Hybrid Monte Carlo Algorithm 548<br/>11.5.1 Dynamical systems 548<br/>11.5.2 Hybrid Monte Carlo 552<br/>11.6 Estimating the Partition Function 554<br/>Exercises 555<br/>
12 Continuous Latent Variables 559<br/>12.1 Principal Component Analysis 561<br/>12.1.1 Maximum variance formulation 561<br/>12.1.2 Minimum-error formulation 563<br/>12.1.3 Applications of PCA 565<br/>12.1.4 PCA for high-dimensional data 569<br/>12.2 Probabilistic PCA 570<br/>12.2.1 Maximum likelihood PCA 574<br/>12.2.2 EM algorithm for PCA 577<br/>12.2.3 Bayesian PCA 580<br/>12.2.4 Factor analysis 583<br/>12.3 Kernel PCA 586<br/>12.4 Nonlinear Latent Variable Models 591<br/>12.4.1 Independent component analysis 591<br/>12.4.2 Autoassociative neural networks 592<br/>12.4.3 Modelling nonlinear manifolds 595<br/>Exercises 599<br/>
13 Sequential Data 605<br/>13.1 Markov Models 607<br/>13.2 Hidden Markov Models 610<br/>13.2.1 Maximum likelihood for the HMM 615<br/>13.2.2 The forward-backward algorithm 618<br/>13.2.3 The sum-product algorithm for the HMM 625<br/>13.2.4 Scaling factors 627<br/>13.2.5 The Viterbi algorithm 629<br/>13.2.6 Extensions of the hidden Markov model 631<br/>13.3 Linear Dynamical Systems 635<br/>13.3.1 Inference in LDS 638<br/>13.3.2 Learning in LDS 642<br/>13.3.3 Extensions of LDS 644<br/>13.3.4 Particle filters 645<br/>Exercises 646<br/>
14 Combining Models 653<br/>14.1 Bayesian Model Averaging 654<br/>14.2 Committees 655<br/>14.3 Boosting 657<br/>14.3.1 Minimizing exponential error 659<br/>14.3.2 Error functions for boosting 661<br/>14.4 Tree-based Models 663<br/>14.5 Conditional Mixture Models 666<br/>14.5.1 Mixtures of linear regression models 667<br/>14.5.2 Mixtures of logistic models 670<br/>14.5.3 Mixtures of experts 672<br/>Exercises 674
650 #0 - SUBJECT
Keyword Pattern perception.
650 #0 - SUBJECT
Keyword Machine learning.
942 ## - ADDED ENTRY ELEMENTS (KOHA)
Koha item type GN Books
Holdings
Home library Central Library, Sikkim University
Current library Central Library, Sikkim University
Shelving location General Book Section
Date acquired 30/06/2016
Full call number 006.4 BIS/P
Accession number P35915
Date last seen 30/06/2016
Koha item type General Books