Information Theory

A Tutorial Introduction (Second Edition)

About this book

"A superb introduction."

Originally developed by Claude Shannon in the 1940s, information theory laid the foundations for the digital revolution, and is now an essential tool in telecommunications, genetics, linguistics, brain sciences, and deep space communication. In this richly illustrated book, accessible examples are used to introduce information theory in terms of everyday games like "20 questions" before more advanced topics are explored.

Online MATLAB and Python computer programs provide hands-on experience of information theory in action, and PowerPoint slides give support for teaching. Written in an informal style, with a comprehensive glossary and tutorial appendices, this text is an ideal primer for novices who wish to learn the essential principles and applications of information theory.

Note that this Second Edition is identical to the First Edition, except for the addition of two new chapters: Rate Distortion Theory and Transfer Entropy. Claude Shannon introduced the idea of Rate Distortion Theory in his original book published in 1949 ("The Mathematical Theory of Communication"), and he followed this with a complete theory in a paper published in 1959 ("Coding Theorems for a Discrete Source with a Fidelity Criterion"). More recently, information theory has been applied to the problem of inferring causality, in the form of Transfer Entropy. The underlying theory of both Rate Distortion Theory and Transfer Entropy is explained in the context of recent applications in signal processing and biology.

Published 2022

ISBN: 9781739672706

Download Chapter 1 (PDF, 5MB)

Python code (2.7) – summary (TXT, 2KB)

Python code – compressed (ZIP, 136KB)

MATLAB code – summary (TXT, 4KB)

MATLAB code – compressed (ZIP, 124KB)




1. What is Information?

1.1 Introduction
1.2 Information, eyes and evolution
1.3 Finding a route, bit by bit
1.4 A million answers to twenty questions
1.5 Information, bits and binary digits
1.6 Example 1: Telegraphy
1.7 Example 2: Binary images
1.8 Example 3: Grey-level pictures
1.9 Summary

2. Entropy of discrete variables

2.1 Introduction
2.2 Ground rules and terminology
2.3 Shannon's desiderata
2.4 Information, surprise and entropy
2.5 Evaluating entropy
2.6 Properties of entropy
2.7 Independent and identically distributed values
2.8 Bits, Shannons, and bans
2.9 Summary

3. The source coding theorem

3.1 Introduction
3.2 Capacity of a discrete noiseless channel
3.3 Shannon’s source coding theorem
3.4 Calculating information rates
3.5 Data compression
3.6 The entropy of English text
3.7 Why the theorem is true
3.8 Kolmogorov complexity
3.9 Summary

4. The noisy channel coding theorem

4.1 Introduction
4.2 Joint distributions
4.3 Mutual information
4.4 Conditional entropy
4.5 Noise and cross-talk
4.6 Noisy pictures and coding efficiency
4.7 Error correcting codes
4.8 Capacity of a noisy channel
4.9 Shannon’s noisy channel coding theorem
4.10 Why the theorem is true
4.11 Summary

5. Entropy of continuous variables

5.1 Introduction
5.2 The trouble with entropy
5.3 Differential entropy
5.4 Under-estimating entropy
5.5 Properties of differential entropy
5.6 Maximum entropy distributions
5.7 Making sense of differential entropy
5.8 What is half a bit of information?
5.9 Summary

6. Mutual information: Continuous

6.1 Introduction
6.2 Joint distributions
6.3 Conditional distributions and entropy
6.4 Mutual information and conditional entropy
6.5 Mutual information is invariant
6.6 Kullback-Leibler divergence
6.7 Summary

7. Channel capacity: Continuous

7.1 Introduction
7.2 Channel capacity
7.3 The Gaussian channel
7.4 Error rates of noisy channels
7.5 Using a Gaussian channel
7.6 Mutual information and correlation
7.7 The fixed range channel
7.8 Summary

8. Rate Distortion Theory

8.1 Introduction
8.2 Informal Summary
8.3 Rate Distortion Theory
8.4 The Binary Rate Distortion Function
8.5 The Gaussian Rate Distortion Function
8.6 Deterministic vs Stochastic Encoding
8.7 Image Compression Example
8.8 Applications
8.9 Summary

9. Transfer Entropy

9.1 Introduction
9.2 Transfer Entropy
9.3 The Pendulum
9.4 Numerical Example
9.5 Summary

10. Thermodynamic entropy and information

10.1 Introduction
10.2 Physics, entropy and disorder
10.3 Information and thermodynamic entropy
10.4 Ensembles, macrostates and microstates
10.5 Pricing information: the Landauer limit
10.6 The second law of thermodynamics
10.7 Maxwell’s demon
10.8 Quantum computation
10.9 Summary

11. Information as nature’s currency

11.1 Introduction
11.2 Satellite TVs, MP3 and all that
11.3 Does sex accelerate evolution?
11.4 The human genome: how much information?
11.5 Are brains good at processing information?
11.6 A short history of information theory
11.7 Summary

Further reading


A. Glossary
B. Mathematical symbols
C. Logarithms
D. Probability density functions
E. Averages from distributions
F. The rules of probability
G. The Gaussian distribution
H. Key equations



Book figures

The figures from the first edition of this book are licensed for use as specified below (e.g. for teaching and for non-commercial attributed use). Please email me if you would like any figures from the second edition.

Download figures (PDF, 10.7MB)

Download figures in a PowerPoint file (PPT, 11.6MB)

Creative Commons License

Only figures from Information Theory: A tutorial introduction by James V Stone are licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.


Praise for the Second Edition.

"Claude Shannon introduced Information Theory, and used it to establish the fundamental limits on communication. However, textbooks on information theory can seem impenetrable to those outside the discipline. Stone's tutorial treatment provides a much needed introduction, which explains relevant details while maintaining the integrity of the topic. This book should be useful to students and researchers in related scientific fields, including machine learning and biological signal analysis."

Jerry Gibson, Distinguished Professor, University of California, USA.

Praise for the First Edition.

"Information lies at the heart of biology, societies depend on it, and our ability to process information ever more efficiently is transforming our lives. By introducing the theory that enabled our information revolution, this book describes what information is, how it can be communicated efficiently, and why it underpins our understanding of biology, brains, and physical reality. Its tutorial approach develops a deep intuitive understanding using the minimum number of elementary equations.

"Thus, this superb introduction not only enables scientists of all persuasions to appreciate the relevance of information theory, it also equips them to start using it. The same goes for students. I have used a handout to teach elementary information theory to biologists and neuroscientists for many years. I will throw away my handout and use this book."

Simon Laughlin, Professor of Neurobiology, Fellow of the Royal Society, Department of Zoology, University of Cambridge, UK

"This is a really great book – it describes a simple and beautiful idea in a way that is accessible for novices and experts alike. This 'simple idea' is that information is a formal quantity that underlies nearly everything we do. In this book, Stone leads us through Shannon’s fundamental insights; starting with the basics of probability and ending with a range of applications including thermodynamics, telecommunications, computational neuroscience and evolution.

"There are some lovely anecdotes: I particularly liked the account of how Samuel Morse (inventor of the Morse code) pre-empted modern notions of efficient coding by counting how many copies of each letter were held in stock in a printer's workshop. The treatment of natural selection as 'a means by which information about the environment is incorporated into DNA' is both compelling and entertaining. The substance of this book is a clear exposition of information theory, written in an intuitive fashion (true to Stone's observation that 'rigour follows insight').

"Indeed, I wish that this text had been available when I was learning about information theory. Stone has managed to distil all of the key ideas in information theory into a coherent story. Every idea and equation that underpins recent advances in technology and the life sciences can be found in this informative little book."

Professor Karl Friston, Fellow of the Royal Society and Scientific Director of the Wellcome Trust Centre for Neuroimaging, Institute of Neurology, University College London

Further reading

Below is part two of the book The Mathematical Theory of Communication by Shannon and Weaver, which is also freely available from the Bell Labs website. The print book contains two parts: a discursive part one by Weaver and part two by Shannon.