Sunday 25 February 2024

Mathematical

This being a second spin-off from Francis Crick’s book at reference 1, the first being found at reference 2. This one arises from a case of achromatopsia, a variety of acquired colour-blindness, described at reference 3, which eventually led me to the fifty year old paper at reference 4, that is to say, from about the time that I was an undergraduate student of something else in London. Edwin Land being, inter alia, the inventor of the once well-known Polaroid camera. The son of a scrap metal merchant, you can read all about him at reference 5. In what follows, Land and McCann are referred to as LM.

With reference 4 being old enough to be a bit of a curiosity from a bibliographic perspective.

For example, the references section combines what we would call references with what we would call notes. And then, most of the references do not include the title of the paper in question, referring instead to a page and volume in a journal, this last usually heavily abbreviated. Perhaps there were fewer of them in those days – perhaps searchers then mostly had access to a suitable bricks & mortar library – but not particularly helpful for the online searcher of today.

The oldest reference is from 1839, referenced in item 1 of the reference list snapped above. Presumably, the Library of the Optical Society had hard copy for LM to consult – in the original French. Reference 10 of this post. From where I associate to the decidedly moribund library of the Royal Astronomical Society in Piccadilly. I remember, possibly quite wrongly, possibly conflating it with the library at the Foreign Office, a once grand but now shabby, oval room, with brown-wood bookshelves lining the walls top to bottom, with a wrought iron balcony running around those walls at mezzanine level. Plus special ladders, health and safety not having been invented. And to that of the Royal Institution in Albemarle Street – open access shelves when I first knew it, but now wired up. It still has the look of a library, but using it as a casual visitor is another matter. No doubt the Library of the Optical Society is now something of the same sort.

Then the photographs are of a poor quality, with that of Figure 3 being so poor that I was reduced to digging out an alternative copy of the paper, from ResearchGate. I forget where the first came from. And then, not all the photographs are included with the paper at all, with some being described as ‘bound transparencies’: presumably where quality mattered, separate arrangements had to be made. But I have not found them.

And then the paper itself is the written-up version of a presentation given at a Fall Meeting of the Optical Society of America in 1967; the occasion of the presentation of the Frederick Ives Medal to one of the authors. For which see reference 9. Presumably the sort of meeting where the presentation was given from an old-fashioned reading desk at the front to the assembled ranks of fellows and their guests. Visual aids a bit basic. From where I associate to tests in the use of what might have been called AVA for audio visual aids, rather dreaded by lady student teachers in England at about the same time.

Preliminaries

The starting point of reference 4 is the surprising fact that our perception of the colour of objects seems to be based to a greater extent on their intrinsic reflectances than one might otherwise have expected, given that what the eye gets is the combination of those reflectances with the ambient light, light which varies a good deal from place to place and from time to time. Think of the sunlight falling on an interior wall through a barred or dirty window. With intrinsic reflectances having the important property of constancy, which makes them useful to animals like us in the important matter of object identification. Is this a tiger or a toothbrush that I see before me?

So how does the human vision system – that combination of eyes, brain and individual history, otherwise memory – pull this trick off, given that it does not seem to have enough data to go on?

The Retinex theory of the present paper was an important milestone in the quest to find out.

In this paper, LM call the ambient light ‘illumination’ and the light coming off an object – illumination as dealt with by reflection from that object – ‘luminance’ or ‘flux’. Most of the relevant objects in the LM world have a matte finish, reflecting light equally in all directions, which means that the point of view is not important.

Paraphrasing from Wikipedia, luminance is also the photometric measure of the luminous intensity per unit area of light travelling in a given direction. It describes the amount of light that is reflected from a particular place and falls within a given solid angle. It is measured in candelas per square metre, the candela being a unit of luminous intensity, roughly speaking light output per unit solid angle, rather than an amount of energy. It is colour blind. A concept closely related to that of brightness, as opposed to hue, which is quite different.

To measure luminance at a place on an image, presumably as a ratio as compared with some baseline, they make use of a telescopic photometer. At a range of, say, a couple of metres, point the photometer at the relevant bit of the image and read off the luminance. Such a reading can then be compared with what an experimental subject says about the colour of that same bit of image.

Probably more like the instrument turned up by Bing (from reference 7) snapped above, than the sort of thing you can buy now for a few pounds from Amazon.

Mondrians

Here and elsewhere a lot of the work is done using what are called Mondrians, after the painter whose paintings they resemble. A coloured Mondrian, lifted from a subsequent article by Land in Scientific American is included above. A collection of more or less rectangular, coloured shapes, assembled into an outer rectangle, mostly meeting each other at simple line segments.

Those of the present paper are mostly done in black, white and shades of grey.

The opening snap, taken from Wikipedia, is described as ‘Mondrian dresses by Yves Saint Laurent shown with a Mondrian painting in 1966’.

The argument

A first set of experiments, using a coloured Mondrian and three smoothly variable projectors for illumination – red, green and blue – demonstrated that fairly drastic changes of illumination do not greatly affect the perception of colour, which does indeed seem to be strongly correlated with the reflectances of the objects concerned.

Note that, regarding the papers used for these experiments, LM write that: ‘… To reduce the role of specular reflectance, the papers are not only matte, but are also selected to have a minimum reflectance as high as or higher than 10% for any part of the visual spectrum…’.

And according to something in Nature turned up by Bing: ‘Specular reflection appears as a bright spot or highlight on any smooth glossy convex surface and is caused by a near mirror-like reflectance off the surface. Convex shapes always provide the ideal geometry for highlights, areas of very strong reflectance, regardless of the orientation of the surface or position of the receiver’. Clearly a bad thing in this context.

We then observe that eyes have three sorts of colour receptor, responding best to long, medium and short wavelengths, roughly speaking red, green and blue respectively, bearing in mind that the term ‘colour’ has no meaning outside of a fairly sophisticated brain. LM propose that the human vision system (VS) has three independent sub-systems, with each sub-system being called a retinex, a conflation of ‘retina’ and ‘cortex’, each dealing with one band of wavelength. Then it is enough for a retinex to know about the brightness of a patch of colour, of a patch on a Mondrian, as might be measured by the photometer. Without loss of generality, LM mostly work in terms of a retinex which does black, white and greys.

Next, we observe that eyes are very sensitive to edges. And that while the illumination might vary a good deal across a Mondrian as a whole, the illumination of two points close to but on opposite sides of an edge will be nearly exactly the same. Then the ratio of the flux at those two points, as measured by the photometer, will be the ratio of the reflectances for the wavelength band in question. The two equal illuminations which go into the two fluxes will cancel out.
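The cancellation can be written out in a line. The symbols here are mine rather than LM's: F for the flux read off by the photometer, E for illumination and R for reflectance, with subscripts 1 and 2 for two points close to, but on opposite sides of, an edge.

```latex
\[
\frac{F_1}{F_2} \;=\; \frac{E_1 R_1}{E_2 R_2} \;\approx\; \frac{R_1}{R_2}
\qquad \text{since } E_1 \approx E_2 \text{ close to the edge.}
\]
```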

And if the note above applies here, these ratios will be in the range 0.1 to 10, that is to say bounded. I dare say one could add some twiddles to the argument which would make this restriction unnecessary.

Given that reflectance within a patch on a Mondrian will be constant, this allows us to step across the Mondrian and express the reflectance of any one patch as a ratio with that of any other. That is to say, in the snap above, what we are recording is the ratios between pairs of reflectances at edges, not the reflectances themselves. These are determined by assigning some conventional value, say 100% reflectance to the starting patch, and then working through the path, one patch at a time.
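A minimal sketch of this stepping across, in Python. The function names, the four reflectances and the three illumination values are all made up for the purpose of illustration; the only thing taken from the argument above is that flux is illumination times reflectance and that illumination is the same on both sides of any one edge.

```python
def flux(illumination, reflectance):
    """Flux from a matte patch: illumination times reflectance."""
    return illumination * reflectance


def relative_reflectances(edge_readings, start_value=1.0):
    """Chain the edge ratios along a path.

    edge_readings holds one (flux_before_edge, flux_after_edge) pair per edge
    crossed, both readings taken close to the edge. The result is one estimate
    per patch on the path, relative to the conventional value given to the first.
    """
    estimates = [start_value]
    for before, after in edge_readings:
        estimates.append(estimates[-1] * after / before)
    return estimates


# Hypothetical path across four patches (assumed values, not LM's data).
reflectances = [0.8, 0.2, 0.5, 0.9]
# Illumination at each of the three edges: equal on both sides of an edge,
# but quadrupling from the first edge to the last.
edge_illumination = [10.0, 20.0, 40.0]

readings = [
    (flux(e, reflectances[i]), flux(e, reflectances[i + 1]))
    for i, e in enumerate(edge_illumination)
]
estimates = relative_reflectances(readings)
print(estimates)
```

With these numbers the estimates come out as 1.0, 0.25, 0.625 and 1.125, which are the true reflectances divided by that of the starting patch, despite the illumination quadrupling along the way. The last value is above 1 because the starting patch was not the brightest on the path.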

By example, LM illustrate the assertion that this stepping across the uniformly coloured patches of the Mondrian neatly sidesteps the problems associated with varying illumination, perhaps with the lower half of the Mondrian being more strongly illuminated than the upper half. Furthermore, it manages this despite the difficulty that we humans have comparing the colours of patches which are not right next to each other.

The catch being that the answer we get for any particular patch likely depends on the starting point, on the starting patch, to which we arbitrarily assign the reflectance of 100%.

A biologically plausible proceeding. Maybe, given the retinotopic organisation of the early visual areas of the brain, neurons could be organised to do something reasonably local like this.

Now we want 100% to be the maximum, to correspond to 100% reflectance in the wavelength band in question. To, for example, a good, strong red. So, given the vagaries of the eye, how does it know which patch to choose to represent 100% reflectance – how does it know quickly, without consuming too many processing resources?

The problem could be moved to a graph where the nodes are the patches on our Mondrian and the edges are the boundaries between adjacent pairs. We could just give our edges a direction to say this node is brighter than that node, or we could give them a weight expressing that relationship more explicitly. Or both. Having given some thought to the matter of equality.

Not any old graph would do. For example, it would have to be a planar graph. Any cycle could only involve nodes of equal brightness. Any two paths connecting the same start and end points would have to give the same answer with respect to the relationship between those start and end points.

We then reduce the search space by looking for maximal elements, elements where all the edges point away. Then reduce it to maximum elements by a relatively small number of pair-wise comparisons. Other procedures, more or less efficient, but to the same end, could be devised – but these are just the sort of global, long-winded, biologically implausible procedures which LM seek to avoid.
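The global search just described might be sketched as follows. This is my own toy rendering, not anything in LM: the patch names, the adjacencies and the reflectances are invented, edges are taken to point from the brighter patch to the dimmer one, and the matter of equality is simply ignored.

```python
def maximal_patches(patches, brighter_than):
    """Patches with no neighbour brighter than themselves.

    brighter_than is a set of (a, b) pairs, one per boundary, meaning that
    patch a is brighter than the adjacent patch b.
    """
    outshone = {b for _, b in brighter_than}
    return [p for p in patches if p not in outshone]


def brightest(candidates, is_brighter):
    """Reduce maximal patches to a maximum by pair-wise comparison."""
    best = candidates[0]
    for candidate in candidates[1:]:
        if is_brighter(candidate, best):
            best = candidate
    return best


# Hypothetical Mondrian: five patches with assumed reflectances.
reflectance = {"A": 0.9, "B": 0.4, "C": 0.7, "D": 0.95, "E": 0.3}
# Assumed adjacencies, each edge directed from brighter to dimmer.
edges = {("A", "B"), ("C", "B"), ("C", "E"), ("D", "E"), ("B", "E")}

candidates = maximal_patches(sorted(reflectance), edges)
winner = brightest(candidates, lambda a, b: reflectance[a] > reflectance[b])
print(candidates, winner)
```

Here three of the five patches survive the first cut and two further comparisons pick out D. The second step needs a long-range comparison between patches which are not neighbours, which is exactly the sort of thing the local stepping procedure was supposed to avoid.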

Instead, they suggest two wheezes. First, if we have a path terminating in our target patch, we discard the portion of the path before the patch with the highest reflectance on that path, resetting the sequential product from that patch. Second, we take the average of a number of such paths, to give us our estimate of the reflectance of the target patch. So VS, in order to decide what colour to assign to a target patch, looks to its three (or possibly some other number of) retinexes. Each retinex looks at a number of paths terminating at that patch, otherwise takes a look around the patch in question, takes context into account, to produce a reflectance for its waveband. These three reflectances, in effect an RGB triple, then determine the perceived colour.
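One way the two wheezes might be rendered in Python, for a single retinex. This is a sketch under my own assumptions rather than LM's procedure as published: each path is given as a list of edge ratios (flux just after the edge divided by flux just before), the reset is done by clamping the running product to 1.0 whenever it goes above 1.0, and the three paths are invented.

```python
def path_estimate(edge_ratios):
    """Sequential product along one path ending at the target patch.

    The running product starts at 1.0 and is reset to 1.0 whenever it exceeds
    1.0, that is, whenever the path reaches a patch brighter than any met so
    far. The result is the target's reflectance relative to the brightest
    patch on the path.
    """
    product = 1.0
    for ratio in edge_ratios:
        product *= ratio
        if product > 1.0:
            product = 1.0  # discard everything before this brighter patch
    return product


def retinex_estimate(paths):
    """Average the estimates from several paths ending at the same patch."""
    return sum(path_estimate(p) for p in paths) / len(paths)


# Three hypothetical paths ending on a target patch of reflectance 0.5.
path_a = [2.0, 0.25, 2.0]  # patches 0.5 -> 1.0 -> 0.25 -> 0.5
path_b = [2.0]             # patches 0.25 -> 0.5, never meets a brighter patch
path_c = [0.5, 4.0, 0.5]   # patches 0.5 -> 0.25 -> 1.0 -> 0.5

estimate = retinex_estimate([path_a, path_b, path_c])
print(path_estimate(path_a), path_estimate(path_b), path_estimate(path_c), estimate)
```

The two paths which pass through a patch of reflectance 1.0 both give 0.5, the right answer. The short path which does not gives 1.0, so the average comes out at two thirds, which is why the averaging only works if enough of the paths get to visit something bright.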

Note the implicit assumption that each cone on a retina is linked to its own neuron in the brain, that a neuron in the relevant part of the brain codes for type of cone. And those neurons are well mixed up, each type giving good dense coverage of the central part of the visual field. Dense enough to build satisfactory paths. Generalising a bit, they treat the path as a piece of string, wending its way across the image, with suitable sensors dotted along it, with our reading being the product of all the ratios taken along the way. Perhaps the integral if we take logarithms.
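The remark about logarithms can be spelled out, in notation which is mine rather than LM's: F_k is the reading at the k-th sensor along the string and F(s) is the reading at distance s along it.

```latex
\[
\log \prod_{k=1}^{n} \frac{F_{k+1}}{F_{k}}
\;=\; \sum_{k=1}^{n} \bigl( \log F_{k+1} - \log F_{k} \bigr)
\;\longrightarrow\; \int_{\text{path}} \frac{d}{ds} \log F(s)\, ds
\]
```

Taken literally the sum telescopes to the difference between the logs of the two end readings, so a sketch like this presumably only earns its keep if the slow drift within a patch is discounted and only the jumps at edges are counted, which is my gloss rather than anything set out above.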

Note the implicit assumption that the colours of things in the visual field are well mixed up. That this path averaging is going to work, at least most of the time.

LM do not assert that this is what VS actually does. But they do exhibit a procedure, which, in one way or another, VS will have to emulate or replace.

They also suggest how this procedure might be executed electronically, with the computing components available at the time.

Moving from the grey Mondrians to the coloured ones that would be needed for real, LM acknowledge that we move from the simple scalar product of illumination and reflectance to an integral over the waveband for the retinex concerned. However, they argue that this only slightly disturbs the argument that illumination cancels out when taking the ratio at a Mondrian boundary. The procedure of jumping across successive patches of constant reflectance, of one colour, still takes care of illumination varying across the whole.
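In symbols, and again the notation is my assumption rather than LM's, with S standing for the sensitivity of the retinex in question across its waveband:

```latex
\[
F \;=\; \int_{\text{band}} E(\lambda)\, R(\lambda)\, S(\lambda)\, d\lambda ,
\qquad
\frac{F_1}{F_2} \;=\;
\frac{\int_{\text{band}} E(\lambda)\, R_1(\lambda)\, S(\lambda)\, d\lambda}
     {\int_{\text{band}} E(\lambda)\, R_2(\lambda)\, S(\lambda)\, d\lambda}
\]
```

The illumination E no longer drops out of the ratio exactly, as it did in the scalar case, but it does so to a good approximation if E does not vary much with wavelength across the band.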

The procedure still seems to work, more or less.

LM acknowledge that the overlapping of the cone response curves means that the three retinexes are not completely independent. However, they argue that this does not disturb the main argument, rather goes to explain some oddities previously passed over.

They go on to say that further work will be reported in a forthcoming paper. Work which had largely been done at the time of this writing, but which had not been done when the draft, as it were, was presented to the Optical Society.

More speculations

One might, for one reason or another, want to reduce a rectangular, coloured image to a set of polygons which add up to that rectangle, in such a way that it is reasonable to assign a single colour to each polygon. A common way to do this is to divide the image into a large, rectangular array of rectangular pixels, perhaps a million or more of them, with each pixel carrying a colour code. Such an array will do very well on the average laptop screen. 

One could also regard this array of pixels as a simple, regular form of Mondrian, in which each area has four neighbours and four ratios – up, down, right and left. And then compute the perceived colour from the ratios using the procedure outlined above. All of which might be of interest if one was trying to model what the brain does, rather than present an image for a brain to actually consume.
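A small sketch of the pixel array treated this way, for one waveband. The function name and the 3 by 3 array of values are invented for illustration, and the values are assumed to be non-zero so that the ratios exist.

```python
def neighbour_ratios(image, row, col):
    """Ratio of each existing neighbour's value to the value at (row, col).

    image is a list of equal-length rows of non-zero single-band values.
    Pixels on the border simply have fewer than four neighbours.
    """
    here = image[row][col]
    offsets = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}
    ratios = {}
    for name, (d_row, d_col) in offsets.items():
        r, c = row + d_row, col + d_col
        if 0 <= r < len(image) and 0 <= c < len(image[0]):
            ratios[name] = image[r][c] / here
    return ratios


# Hypothetical 3 by 3 image, values doubling to the right and downwards.
image = [
    [10.0, 20.0, 40.0],
    [20.0, 40.0, 80.0],
    [40.0, 80.0, 160.0],
]

centre = neighbour_ratios(image, 1, 1)
corner = neighbour_ratios(image, 0, 0)
print(centre, corner)
```

The centre pixel gets its four ratios, the corner pixel only two. Chaining such ratios along a path of pixels would then proceed much as for the patches of a Mondrian.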

Another way is triangulation, rather less expensive in digital space and much more convenient for three-dimensional modelling of two-dimensional surfaces. Colouring the triangles is a bonus. All of which is also a shift from the raster graphics of the pixels to vector graphics. 

In two dimensions, the analysis offered by a Mondrian is usually very coarse compared with a million pixels – and I don’t know if they have much application outside of art galleries, neurology and psychology labs. 

Another important issue with all this is how far do you go in the interest of verisimilitude. When do you stop dividing up the image into smaller and smaller patches, smaller and smaller parcels?

One answer, provided by Microsoft’s PowerPoint package, is to go for quite big parcels, but then to allow for a modest amount of variation within the parcel. To have a stab at texture – to which end you are offered a range of choices, in addition to straightforward colour fill: gradient fill, texture fill, pattern fill and image fill, illustrated in the snap above. In effect, adding texture to colour as a property of the parcel, with texture needing a few more bytes than colour, but not that many more.

No doubt specialised packages offer a lot more in this department. All of which would break the rule on which the present procedure is built: colour constancy within each area.

Note that the need for such variation vanishes as the parcels get small. With the small pixels of the average computer screen, variation is subsumed in the colour of those pixels – with a lot of small pixels potentially carrying a lot more information than a much smaller number of big parcels, even when these carry more than just colour.

Note that VS makes use of colour itself to give what it projects into consciousness shape in three dimensions. Which probably amounts to one of the exceptions to the rule about perceiving intrinsic colour rather than something else.

Maybe the brain does neither raster nor vector. Maybe whatever it does to make us conscious of the Mondrian – or any other visual scene – is derived directly from projection from a dense, more or less random but retinotopic array of neurons on a patch of cortex. Position in the conscious field of view is a consequence of position on this patch, rather than of any coding of position. Of ‘x’ and ‘y’ coordinates. Or ‘r’ and ‘θ’ coordinates.

Lastly, given that we are talking about an iterative process, there is the option of taking an early result for consciousness to be going on with, then using a later result, which the unconscious has been beavering away at in the meantime, for the next frame of consciousness. A beavering away which might take account of what it was in the visual scene which is being attended to.

Other matters

Land goes over similar ground in a discourse subsequently given at the Royal Institution in London, a discourse which was printed up at reference 8. Back in the day when discourses were serious business, rather than the book promotion events they became by the time that I found them, something over five years ago. When eminent speakers did live experiments to serious audiences. But the heritage desk is still recognisable from the sketch provided above.

LM were by no means the first on this block, that is to say onto the interdependence of the colours on the page. The artist Josef Albers, to name but one, published in 1963. See reference 6. And then, while I was writing this, I happened to put my hand on a rather different take on colour, at least the story is told from a different point of view, by Rudolf Arnheim, at chapter VII, reference 11.

While turning to reference 10, Bing turns me up in seconds a good quality facsimile provided by the helpful people at reference 12 – and the librarians at the University of Ottawa. It is no wonder that real libraries are going out of fashion – provided only that there are enough of them left to feed the Internet.

Even the French is not too bad, but I don’t suppose that I will get through much of it. Life is too short!

Not convinced about the AVA mentioned above, I thought to ask Microsoft’s Bing. Whereupon Copilot offers, unsolicited: ‘Certainly! The three-letter acronym for audio visual equipment is AV. AV stands for “audio-visual”, encompassing the technology used to transmit and display visual data’. Not too impressed with his counting skills, I try Google’s Gemini with the prompt ‘I am trying to recall a three letter acronym for audio visual aids and equipment current in England in the late 1960s. Can you help’. After a brief exchange, he offers ‘AVT’ for audio visual technology. He is properly tentative about it and lists the right sort of stuff, but I am not convinced. AVT does not ring any bells – but I can’t yet do any better.

Conclusions

We have speculated about how it is that we are so good at computing reflectances of objects in the world about us, on the basis of what seems like inadequate information. We have come up with a procedure which seems at least vaguely plausible in biological terms.

While the Wikipedia entry at reference 14 ends: ‘… Although retinex models are still widely used in computer vision, actual human color perception has been shown to be more complex’ – which, if true, is not bad for a fifty year old theory in a crowded and busy part of the scientific world.

Along the way we have been reminded that the perception of colour of something does not just depend on the light coming from that something, but also on the light coming from the scene around. With the brain making, in effect, some assumptions about the world – assumptions which do not always work. The brain can be tricked.

PS 1: Microsoft Word is trying to smooth me out, to make me conform with its idea of the way to do things, which can often be irritating. So not content with red and blue underlinings everywhere, he has now started to tell me what I am about to type. So where, near the top, I have ‘bibliographic perspective’, I had been going to have ‘bibliographic point of view’. Furthermore ‘underlinings’ attracted the red underline, even though allowed by Merriam-Webster online.

PS 2: a last trick for the computer buff. We define a map M from one Mondrian to another, M: A → B, where B has the same patches as A, but with the colour of each patch in B being the colour perceived by our experimental subject for the corresponding patch in A, or at least the proxy for that colour delivered by the procedure outlined above. What happens to this map if we apply it iteratively? Does it grind to a halt? Does it go round in circles? Does it just wander around, rather aimlessly, for ever? Can we say what it is about a Mondrian or its illumination that makes it do one thing rather than the other?
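For the buff who wants to try it, a small harness which applies a map over and over and reports which of the three things happens. Everything here is hypothetical: the function name is mine, a Mondrian is reduced to a tuple of single-band values, and the toy map (divide every patch by the brightest) is a crude stand-in for the procedure above, not the procedure itself.

```python
def classify_iteration(step, start, max_steps=1000):
    """Apply step repeatedly and report how the sequence behaves.

    States must be hashable. Returns ("fixed", n) if the state stops changing
    after n applications, ("cycle", period) if an earlier state recurs, and
    ("undecided", max_steps) if neither has happened within max_steps.
    """
    seen = {start: 0}
    state = start
    for n in range(1, max_steps + 1):
        following = step(state)
        if following == state:
            return ("fixed", n - 1)
        if following in seen:
            return ("cycle", n - seen[following])
        seen[following] = n
        state = following
    return ("undecided", max_steps)


def normalise(mondrian):
    """Toy stand-in for M: express each patch relative to the brightest."""
    top = max(mondrian)
    return tuple(round(value / top, 6) for value in mondrian)


def swap(pair):
    """A second toy map, which just swaps two patches."""
    return (pair[1], pair[0])


halts = classify_iteration(normalise, (0.2, 0.4, 0.8))
circles = classify_iteration(swap, (1, 2))
print(halts, circles)
```

The toy map grinds to a halt after one application, while the swap goes round in circles with period two. Whether the real procedure does anything more interesting is the question left open above.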

References

Reference 1: The astonishing hypothesis: The scientific search for the soul – Francis Crick – 1994.

Reference 2: https://psmv5.blogspot.com/2024/02/an-invisible-fingerprint.html

Reference 3: The curious case of Jonathan I. ‘The Case of the Colorblind Painter’ – Oliver Sacks, Robert Wasserman – New York Review of Books – 1987.

Reference 4: Lightness and Retinex theory – Land, E. H. and McCann, J. J. – 1971. 

Reference 5: https://en.wikipedia.org/wiki/Edwin_H._Land

Reference 6: https://psmv3.blogspot.com/2017/06/late-convert.html

Reference 7: https://physicsmuseum.uq.edu.au/. They also provide short explanations of how things work.

Reference 8: The retinex theory of colour vision – Land, E. – 1974. A transcript of a discourse given at the Royal Institution.

Reference 9: https://www.optica.org/get_involved/awards_and_honors/awards

Reference 10: De la Loi du Contraste Simultané des Couleurs – M. E. Chevreul – 1839. Pitois-Levrault, Paris.

Reference 11: Art and visual perception: A psychology of the creative eye – Rudolf Arnheim – 1954/1974.

Reference 12: https://archive.org/

Reference 14: https://en.wikipedia.org/wiki/Color_constancy
