The authors of the present book, Gustavo Deco, trained in theoretical physics, and Dragan Obradovic, trained in control theory, are key members of such an interdisciplinary team, which is part of the central division for research and development of Siemens AG, located in Munich. For the past several years, this team has concentrated on developing and applying methods based on neural networks, information theory, and the theory of nonlinear dynamics. Much of the content of the book is based on original work by the authors, which testifies that high-quality forefront research and company goals can match very well.

As is evidenced by the title of their book, the authors make use of elements of information theory to gain a deeper understanding of how information is processed in neural networks. Historically, artificial neural networks and information theory had entirely different origins and aims. Roughly speaking, neural networks aimed to understand how the brain works, and information theory aimed to understand how information is transmitted. Chronological milestones in the history of artificial neural networks are Hebb's book on the organization of behavior; Rosenblatt's book on principles of neurodynamics, in which he defined the perceptron; Hopfield's discovery of the analogy of certain types of neural networks to spin glasses and the exploitation of the associated energy function; the generalization of simple perceptrons to feedforward multi-layer perceptrons, accompanied by the backpropagation learning algorithm of Rumelhart and others; and its extension to multi-layer perceptrons with feedback, accompanied by the recurrent backpropagation learning algorithm of Almeida, Pineda, and others.
Modern information theory started with Shannon's seminal papers on the mathematical theory of communication, continued with McMillan's work on the basic theorems of information theory, and proceeded further with the relationship between information theory and statistics investigated by Kolmogorov, Chaitin, and Solomonoff.
Despite the separate development of the two disciplines, there is an obvious link between them: artificial neural networks process information from their inputs to their outputs. This link has been investigated in the past, notably by Barlow, who proposed the principle of redundancy reduction as the goal of unsupervised learning, and by Linsker, who formulated the principle of maximizing information as a mechanism for information processing in the brain.
The authors of this book present, for the first time, a systematic and exhaustive information-theory-based approach to artificial neural networks, which amply demonstrates that this field is a high-level and rigorous research discipline in its own right, with particularly high potential when linked with the methods of nonlinear dynamics. (This link is only alluded to in the book and remains a topic for future research.) Equally important, many of the methods and results presented in the book have found, or are about to find, their way into real-world applications.

To be able to complete their book in the time allocated, the authors had to sacrifice much of their free time. As a reward, a fine book has emerged, which will attract a readership that is open to interdisciplinary new ideas and at the same time appreciates solid research in its best sense.