Artificial Intelligence Engines

A tutorial introduction to the mathematics of deep learning

About this book

The brain has always had a fundamental advantage over conventional computers: it can learn. However, a new generation of artificial intelligence algorithms, in the form of deep neural networks, is rapidly eliminating that advantage.

Deep neural networks rely on adaptive algorithms to master a wide variety of tasks, including cancer diagnosis, object recognition, speech recognition, robotic control, chess, poker, backgammon and Go, at super-human levels of performance.

In this richly illustrated book, key neural network learning algorithms are explained informally first, followed by detailed mathematical analyses. Topics include both historically important neural networks (perceptrons, Hopfield nets, Boltzmann machines, and backpropagation networks) and modern deep neural networks (variational autoencoders, convolutional networks, generative adversarial networks, and reinforcement learning using SARSA and Q-learning).

Online computer programs, collated from open source repositories, give hands-on experience of neural networks, and PowerPoint slides provide support for teaching. Written in an informal style, with a comprehensive glossary, tutorial appendices (eg Bayes' theorem, maximum likelihood estimation), and a list of further readings, this is an ideal introduction to the algorithmic engines of modern artificial intelligence.

Published 1 April 2019

Paperback ISBN: 9780956372819

Hardback ISBN: 9780956372826

Download Chapter 1 (PDF, 9.7MB)

Computer code

Corrections

Available from:

Amazon.com
Amazon.co.uk

Contents

Preface

List of pseudocode examples

Online code examples

1. Artificial neural networks

1.1 Introduction
1.2 What is an artificial neural network?
1.3 The origins of neural networks
1.4 From backprop to deep learning
1.5 An overview of chapters

2. Linear associative networks

2.1 Introduction
2.2 Setting one connection weight
2.3 Learning one association
2.4 Gradient descent
2.5 Learning two associations
2.6 Learning many associations
2.7 Learning photographs
2.8 Summary

3. Perceptrons

3.1 Introduction
3.2 The perceptron learning algorithm
3.3 The exclusive OR problem
3.4 Why exclusive OR matters
3.5 Summary

4. The backpropagation algorithm

4.1 Introduction
4.2 The backpropagation algorithm
4.3 Why use sigmoidal hidden units?
4.4 Generalisation and overfitting
4.5 Vanishing gradients
4.6 Speeding up backprop
4.7 Local and global minima
4.8 Temporal backprop
4.9 Early backprop achievements
4.10 Summary

5. Hopfield nets

5.1 Introduction
5.2 The Hopfield net
5.3 Learning one network state
5.4 Content addressable memory
5.5 Tolerance to damage
5.6 The energy function
5.7 Summary

6. Boltzmann machines

6.1 Introduction
6.2 Learning in generative models
6.3 The Boltzmann machine energy function
6.4 Simulated annealing
6.5 Learning by sculpting distributions
6.6 Learning in Boltzmann machines
6.7 Learning by maximising likelihood
6.8 Autoencoder networks
6.9 Summary

7. Deep RBMs

7.1 Introduction
7.2 Restricted Boltzmann machines
7.3 Training restricted Boltzmann machines
7.4 Deep autoencoder networks
7.5 Summary

8. Variational autoencoders

8.1 Introduction
8.2 Why favour independent features?
8.3 Overview of variational autoencoders
8.4 Latent variables and manifolds
8.5 Key quantities
8.6 How variational autoencoders work
8.7 The evidence lower bound
8.8 An alternative derivation
8.9 Maximising the lower bound
8.10 Conditional variational autoencoders
8.11 Summary

9. Deep backprop networks

9.1 Introduction
9.2 Convolutional neural networks
9.3 LeNet1
9.4 LeNet5
9.5 AlexNet
9.6 GoogLeNet
9.7 ResNet
9.8 Ladder autoencoder networks
9.9 Denoising autoencoder networks
9.10 Fooling neural networks
9.11 Generative adversarial networks
9.12 Temporal deep neural networks
9.13 Capsule neural networks
9.14 Summary

10. Reinforcement learning

10.1 Introduction
10.2 What's the problem?
10.3 Key quantities
10.4 Markov decision processes
10.5 Formalising the problem
10.6 The Bellman equation
10.7 Learning state-value functions
10.8 Eligibility traces
10.9 Learning action-value functions
10.10 Balancing a pole
10.11 Applications
10.12 Summary

11. The emperor’s new AI?

11.1 Artificial intelligence
11.2 Yet another revolution?

Further reading

Appendices

A. Glossary
B. Mathematical symbols
C. A vector and matrix tutorial
D. Maximum likelihood estimation
E. Bayes' theorem

References

Index

Book figures

The figures from this book are licensed for use as specified below (eg for teaching and for non-commercial attributed use).

All book figures in a PowerPoint file (PPTX, 22.2MB).


Only the figures from Artificial Intelligence Engines by James V Stone are licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.

Reviews

"Authoritative, funny, and concise."

Steven Strogatz, Professor of Applied Mathematics, Cornell University

"Artificial Intelligence Engines will introduce you to the rapidly growing field of deep learning networks: how to build them, how to use them and how to think about them. James Stone will guide you from the basics to the outer reaches of a technology that is changing the world."

Professor Terrence Sejnowski, Director of the Computational Neurobiology Laboratory, Salk Institute, USA, and author of The Deep Learning Revolution (MIT Press, 2018)

"This book manages the impossible: it is a fun read, intuitive and engaging, light-hearted and delightful, and cuts right through the hype and turgid terminology. Unlike many texts, this is not a shallow cookbook for some particular deep learning program-du-jure. Instead, it crisply and painlessly imparts the principles, intuitions and background needed to understand existing machine-learning systems, learn new tools, and invent novel architectures, with ease."

Professor Barak Pearlmutter, Brain and Computation Laboratory, National University of Ireland Maynooth, Ireland

"This text provides an engaging introduction to the mathematics underlying neural networks. It is meant to be read from start to finish, as it carefully builds up, chapter by chapter, the essentials of neural network theory. After first describing classic linear networks and nonlinear multilayer perceptrons, Stone gradually introduces a comprehensive range of cutting edge technologies in use today. Written in an accessible and insightful manner, this book is a pleasure to read, and I will certainly be recommending it to my students."

Dr Stephen Eglen, Department of Applied Mathematics and Theoretical Physics (DAMTP), Cambridge Computational Biology Institute (CCBI), Cambridge University, UK

Further reading

The emperor's new AI? (blog)

A very short history of artificial neural networks (blog)