The importance of intrinsically two-dimensional image features in biological vision and picture coding


Christoph Zetzsche, Erhardt Barth, and Bernhard Wegmann

ABSTRACT The relation between information processing in the human visual system and the efficient encoding of images is considered. It is shown that both vision scientists and communication engineers may profit in an unconventional fashion from a joint interdisciplinary approach: in contrast to common assumptions, no further benefits for communication engineering can be gained from the reduction of visual irrelevancy. Rather, the investigation of biological image processing can lead to the emergence of new principles for the reduction of redundancy in image signals. These principles extend beyond the traditional concepts of second-order statistics and linear system theory, which are "blind" to higher-order statistical dependencies due to locally oriented structures. A biologically motivated approach to exploiting these dependencies is suggested, which models "end-stopped" cells as highly nonlinear detectors for intrinsically two-dimensional image features. Such detectors are obtained from a synthesis of differential geometry and filter theory. By reconstructing the input image, it is shown that the essential information in natural images is captured by the sparse activity of such detectors. This implies that the application of concepts from statistical information theory can offer a fruitful paradigm for the understanding of basic features of biological vision.

INTRODUCTION: Image signals represent an extraordinarily large amount of information. In technical applications, a typical value is 2 Mbit for a single gray-level picture of broadcast quality. In biology, more than one million fibres carry the image information from the retina to the visual cortex (Potts et al., 1972). Reduction of the immense data load is, therefore, an essential requirement for any kind of image processing system, be it biological or technical. Data compression can rely on two essential sources for such a reduction: statistical redundancy and subjective irrelevance.

Traditionally, research on statistical dependencies and their exploitation by redundancy reduction is associated with communication engineering, whereas a basic method of visual psychophysics and physiology is the measurement of the limits of visual performance, i.e., the determination of the irrelevance aspects. We will argue that this view deserves some revision. In particular, we will demonstrate that for the case of still images 1) no substantial further improvements in irrelevance reduction can be gained from more detailed knowledge of static spatial visual sensitivities, and 2) the standard mathematical approaches to the redundancy reduction problem are severely limited in their ability to recognize essential structural aspects of natural images. Specifically, the efficient exploitation of "orientations" in images is shown to be beyond the scope of optimum linear transform coding methods, which are based on second-order statistics.

The main conclusion to be drawn from these considerations is the following: image coding scientists should not primarily see the visual system as determining irrelevance, i.e., the limits of visibility of certain signals. Rather, they should take into account that it has adapted its information processing strategies over millions of years to the statistics and structures of our environment. Hence, it seems suited as a heuristic guide to improved encoding procedures that may overcome the limits of the existing theoretical concepts. Vision scientists, on the other hand, can expect to gain an additional theoretical concept for the interpretation and explanation of structures found in psychophysical and physiological experiments. This is not to say, however, that irrelevance aspects, i.e., the investigation of certain limits of biological structures, will play no role in future developments. Rather, redundancy and irrelevance should be seen as essential and equally important aspects of image information processing in both technical and biological systems.
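
The limitation of second-order statistics mentioned above can be illustrated with a small numerical sketch. It is not the method of this paper, only a one-dimensional toy example under stated assumptions: a localized bar is compared with a phase-scrambled copy that has the same amplitude spectrum and therefore the same circular autocorrelation (its second-order statistics). A one-dimensional signal cannot show orientation, so the bar merely stands in for a localized structure. All function names, the signal length of 64 samples, and the use of kurtosis as a stand-in for a "higher-order" statistic are illustrative assumptions.

```python
import cmath
import math
import random


def dft(x):
    """Naive discrete Fourier transform (O(n^2), fine for a toy signal)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(spectrum):
    """Naive inverse discrete Fourier transform."""
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n)
                for k in range(n)) / n
            for t in range(n)]


def phase_scramble(x, seed=0):
    """Keep the amplitude spectrum of the real signal x, randomize its phases.

    Phases are assigned in conjugate-symmetric pairs so the result stays real.
    The DC term (and the Nyquist term for even n) is left unchanged.
    """
    rng = random.Random(seed)
    n = len(x)
    spectrum = dft(x)
    scrambled = list(spectrum)
    for k in range(1, (n + 1) // 2):
        phase = rng.uniform(-math.pi, math.pi)
        value = abs(spectrum[k]) * cmath.exp(1j * phase)
        scrambled[k] = value
        scrambled[n - k] = value.conjugate()
    return [v.real for v in idft(scrambled)]


def circular_autocorrelation(x):
    """Second-order statistic: sum over t of x[t] * x[t + lag], cyclic in t."""
    n = len(x)
    return [sum(x[t] * x[(t + lag) % n] for t in range(n)) for lag in range(n)]


def circular_diff(x):
    """First difference with wrap-around, a crude local edge detector."""
    n = len(x)
    return [x[(t + 1) % n] - x[t] for t in range(n)]


def kurtosis(values):
    """Fourth standardized moment, used here as a simple higher-order statistic."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return m4 / (m2 ** 2)


# Hypothetical test signal: a single bar of width 8 in 64 samples.
bar = [1.0 if 28 <= t < 36 else 0.0 for t in range(64)]
scrambled = phase_scramble(bar, seed=0)

acf_bar = circular_autocorrelation(bar)
acf_scrambled = circular_autocorrelation(scrambled)

# Largest difference between the two autocorrelations (rounding error only).
print(max(abs(a - b) for a, b in zip(acf_bar, acf_scrambled)))
# The higher-order statistic, in contrast, separates the two signals.
print(kurtosis(circular_diff(bar)), kurtosis(circular_diff(scrambled)))
```

The two signals are indistinguishable to any measure built on the autocorrelation alone, whereas the bar's two sharp edges give its difference signal a much heavier-tailed distribution than that of the scrambled copy. A coder that is optimal only with respect to second-order statistics would thus treat both signals alike.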