Thursday, 3 November 2022

WEBP pictures

From time to time I come across image files with the extension WEBP and for some reason, now lost, I was in the habit of passing these images through Microsoft’s Paint tool to convert them to JPG. I can only think that this was because either Blogger or Word would not load images in WEBP format – but that does not seem to be the case now.

Whatever the case, I was puzzled about the message you got when you did the save as. What exactly were you losing? If the image in question is a simple rectangular array of coloured pixels, what difference does it make what particular file format you use? Does it matter to me that one file format is more cunning than another when it comes to dealing with repeating pixels or repeating patterns of pixels?

Which prompted inquiry into what exactly WEBP was. I was surprised at how long it took to find someone who would tell me what I wanted to know, pitched at the right level for me, computer literate but not graphics software literate.

The high level story that I have come to is summarised above. Left we have moving pictures, the sort of thing you watch on television or at the cinema. A lot of this is organised by frame, so many frames, each containing a stationary image, a second. Maybe thirty of them a second – and to achieve that sort of speed digitally may require a great deal of computer power. With the result that the human thinks that it is the real thing, it has all got blurred into a continuously moving image. 

All supported by a once expensive bit of machinery called a frame buffer, a large two dimensional array of bytes, encoding a large two dimensional array of pixels, millions of them, corresponding to an output frame, embedded in a support package, the whole embedded – at least conceptually – in a host computer.
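To give a rough sense of scale, a minimal sketch of a frame buffer as a flat run of bytes. The frame size of 1920 by 1080, the three bytes per pixel and the thirty frames a second are assumptions made for illustration, not figures taken from any particular machine, and the helper names are made up.

```python
# Assumed, illustrative figures: a 1920 x 1080 frame, 3 bytes (R, G, B) per pixel.
WIDTH, HEIGHT, BYTES_PER_PIXEL = 1920, 1080, 3
FRAMES_PER_SECOND = 30

# The frame buffer itself: one flat run of bytes, row after row.
frame_buffer = bytearray(WIDTH * HEIGHT * BYTES_PER_PIXEL)


def set_pixel(buffer, x, y, rgb):
    """Write one (r, g, b) pixel at column x, row y."""
    offset = (y * WIDTH + x) * BYTES_PER_PIXEL
    buffer[offset:offset + BYTES_PER_PIXEL] = bytes(rgb)


def get_pixel(buffer, x, y):
    """Read back the (r, g, b) pixel at column x, row y."""
    offset = (y * WIDTH + x) * BYTES_PER_PIXEL
    return tuple(buffer[offset:offset + BYTES_PER_PIXEL])


set_pixel(frame_buffer, 10, 20, (255, 0, 0))  # one red pixel

pixels = WIDTH * HEIGHT
bytes_per_second = len(frame_buffer) * FRAMES_PER_SECOND
print(pixels, len(frame_buffer), bytes_per_second)
```

On these assumptions, a single frame is around two million pixels and six million bytes, and refreshing it thirty times a second means moving getting on for 190 million bytes a second.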

But some of it is not organised by frame in that at least some systems update a two-dimensional image, a two-dimensional version of the three dimensional world in a continuous, real-time way – for example the human retina. Maybe there are computers out there which do something of the same sort.

In the middle we have essentially stationary images, although provision is made for a modest amount of movement, for short videos or animations. These seem to come in one of two varieties. The raster images which are two dimensional arrays of coloured pixels. If the pixels are small enough and you have enough of them, you get a pretty good image. A system which works pretty well, for example, for photographs, although things can get tricky if you want to change the dimensions of the pixel array in flight. A system which is used by WEBP. Whereas in a vector image the contents are described by mathematical formulae, often describing polygons and their fill, very much in the way of a PowerPoint rectangle, ellipse or line.

The open arrow from vector to raster is there to remind us that in order to render a vector image on a screen it usually has to be converted into an array of pixels, this being how most computer screens are organised. You might play around with it in vector form, but if you want to look at it, you need pixels.
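The conversion from vector to raster can be sketched in a few lines. This is only a toy, assuming a single filled ellipse described by its centre and two radii, and a rule that a pixel is switched on when its centre falls inside the ellipse; the function name and the sizes are made up for illustration and real renderers are a good deal more subtle.

```python
def rasterise_ellipse(width, height, cx, cy, rx, ry):
    """Turn a vector description of a filled ellipse into rows of pixels.

    A pixel is switched on ('#') when its centre satisfies
    ((x - cx) / rx)^2 + ((y - cy) / ry)^2 <= 1, otherwise it is off ('.').
    """
    rows = []
    for y in range(height):
        row = ""
        for x in range(width):
            px, py = x + 0.5, y + 0.5  # centre of the pixel
            inside = ((px - cx) / rx) ** 2 + ((py - cy) / ry) ** 2 <= 1.0
            row += "#" if inside else "."
        rows.append(row)
    return rows


# The vector form is just six numbers; the raster form is 200 pixels.
for row in rasterise_ellipse(width=20, height=10, cx=10, cy=5, rx=8, ry=4):
    print(row)
```

Change the six numbers and you get a clean new ellipse at any size, which is the attraction of the vector form. Once you have the grid of pixels, enlarging it is a matter of guesswork.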

While right we have documents and pages, either for printing on paper or for display on a computer screen. Documents which might contain images, but which were, at least in the beginning, mainly about text. The languages which describe such documents today amount to specialised computer programming languages, used to build programs which are usually content to leave pushing pixels around to others. 

The desktop publishing packages around in the 1980s were very much this sort of thing and I recall using the one called Ventura Publisher, capable of reproducing quite complicated text. A product now gobbled up by the Ottawa based Corel Corporation, last noticed at reference 9 and now itself gobbled up by Kohlberg Kravis Roberts & Co, a US investment house. The fate, it seems, of many once prominent and successful software products.

Another way of looking at this is more source orientated. On the left we have a three dimensional world which we wish to render down, to reduce to one or more two dimensional images. A reduction which involves the idea of a point of view, the point from where we are looking at this world. In green in the snap above to suggest its process rather than data declaration flavour.

We might start with something like a camera capturing a real world scene. Or with an artist creating a cartoon. Or with three dimensional modelling on a computer. Or we might have something in between. 

But what we end up with is one, or often a lot more, two dimensional images. 

On the right we start in what can be thought of as a two dimensional world, a two dimensional image which we want to appear on a page or on a computer screen. This might be a book, a diagram or some sort of composite, quite possibly involving image inserts from the three dimensional world. With the organisation into wrappers, pages and inserts intended to be suggestive rather than prescriptive. Red stars for optional repeats.

Again, what we end up with can be thought of as one or more two dimensional images. This is the world of, for example, the ubiquitous PDF file, the portable document format file.

WEBP

Having provided some context, we can now get back to the WEBP file, first advertised by Google about ten years ago and brought into the real world much more recently, with support for WEBP now present in all the big browsers – not least Google’s Chrome – and files with the WEBP file extension appearing more and more often.

The driver for WEBP appears to have been efficiency, to get the size of Internet image files down, so reducing the demand for broadband capacity, so increasing the speed with which pages from the Internet can be displayed. Along the way it added modest video capability and something called alpha channel, with this last being what is of present interest, described at reference 2.

At the end of which we have: ‘The use of the term alpha is explained by Smith as follows: “We called it that because of the classic linear interpolation formula α×A+(1-α)×B that uses the Greek letter α to control the amount of interpolation between, in this case, two images A and B”. That is, when compositing image A atop image B, the value of α in the formula is taken directly from A's alpha channel’.

Alpha channel is all about transparency: when α is 1, A is opaque and when it is 0, A is fully transparent, invisible. In between we might have A as the glass of a window or water in a pond and B something seen through the window or in the pond.
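The interpolation formula quoted above can be tried out on a single pixel. A minimal sketch, assuming colours held as (red, green, blue) triples in the range 0 to 255 and α as a number between 0 and 1; the function name and the two colours are made up for illustration.

```python
def composite(a, b, alpha):
    """Blend colour a atop colour b: alpha x A + (1 - alpha) x B, per channel."""
    return tuple(round(alpha * ca + (1 - alpha) * cb) for ca, cb in zip(a, b))


window_tint = (0, 0, 255)  # A: blue glass (made-up colour)
garden = (0, 128, 0)       # B: the green garden seen through it (made-up colour)

print(composite(window_tint, garden, 1.0))   # (0, 0, 255): A opaque, B hidden
print(composite(window_tint, garden, 0.0))   # (0, 128, 0): A invisible, all B
print(composite(window_tint, garden, 0.25))  # (0, 96, 64): mostly garden, a little blue
```

In a real image α is not one number for the whole picture. It is held pixel by pixel in the alpha channel of A, so the same calculation is repeated for every pixel.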

Along the way I read that some tricky images – perhaps involving surfers, long hair and sea spray – work best when composed from several alpha channelled layers.

Loss of transparency

Which alpha channel is lost when a WEBP file is converted into, for example, a JPG file. What I think this means is that where α is zero, the image is null, perhaps conventionally displayed on a screen as white space, black space or grey chequerboard. Where α is one, you get the full-strength image. Where it is between zero and one, you get the original colour watered down, or put another way, the starting RGB values have been multiplied by alpha, making them smaller, actually bringing the colour closer to black, to absence of colour. Without the benefit of the (1-α)×B part of the formula above, because we do not have the contribution of a B image.
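A small sketch of what dropping the alpha channel might amount to for one pixel, following the reading given above. It assumes a pixel held as (red, green, blue, α) with α between 0 and 1. The function name is made up, and a real converter may well make a different choice of background.

```python
def flatten(rgba, background=None):
    """Drop the alpha channel from one (r, g, b, alpha) pixel.

    With no background, the colour is simply multiplied by alpha, which is
    the same as compositing onto black. With a background colour, the
    (1 - alpha) x B part of the formula is put back.
    """
    r, g, b, alpha = rgba
    if background is None:
        background = (0, 0, 0)  # no B image: everything heads towards black
    return tuple(
        round(alpha * c + (1 - alpha) * bg)
        for c, bg in zip((r, g, b), background)
    )


pixel = (200, 100, 40, 0.25)  # a made-up, three-quarters transparent orange

print(flatten(pixel))                   # (50, 25, 10): watered down towards black
print(flatten(pixel, (255, 255, 255)))  # (241, 216, 201): watered down towards white
print(flatten((200, 100, 40, 0.0)))     # (0, 0, 0): fully transparent shows as black
```

Either way the result is three numbers rather than four. Once the alpha value has been folded into the colour, there is no getting it back out again.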

Black being the colour of a dark pond, with no light being reflected from objects in the pond – like silver fish or green weed – or from the bottom of the pond. The pond only looks white to the extent that white light is being reflected off its surface, in which case it will no longer be transparent.

When the snap above right was first lifted from the Internet as a PNG file back in October, the α=0 part of the image was displayed as black, which apart from being rather unsightly, made the labels very hard to read. But in Paint, the black was replaced by grey chequerboard, which is much better. This was what was saved in JPG and it was this JPG version which was used at reference 5. Since then both the black and the grey chequerboard for the transparent regions in PNG images seem to have been replaced by white, which I can only put down to one of the regular Microsoft Windows upgrades.

In sum, there is no loss of quality of the image at hand, but you can no longer combine this image with another image, making use of the alpha channel. So you have lost transparency to come rather than transparency that was already there. And I think that this is the meaning of the warning pop-up with which I started.

It took a long time to get there!

In part, because references 1 and 2 from Wikipedia were rather dense and reference 3 from Google was rather bland. While reference 4 from Adobe gave a rather good introductory story, only omitting their own PDF product, only mentioned in passing. I suppose the trouble was that the sort of introductory material I was looking for is more likely to be found in a text book than on a page on the Internet – but the interiors of text books are not usually visible to search.

A plus point for Wikipedia was the wealth of further reading to be found for the industrious, some of which is to be found at references 6, 7 and 8.

Commercial considerations

Graphics processing is big business and so big money. So there are lots of patents and a fair number of legal skirmishes around them. On the other hand, there is lots going for standards, with standards working best when the supporting technology is open and freely available.

No doubt the stuff of study for wannabe masters of business administration. Incidentally, a degree which, I seem to recall reading recently, is no longer the magic bullet to propel you to the upper reaches that it once was.

A sample

For the sake of completeness, a complete WEBP image. The image which was started at the beginning of this post.

Postscript

The next day. A black version of the WEBP brain has now turned up, rather degraded, in JPG, a version which had escaped the OneDrive carnage. The black letters of the labels have vanished, but the brain end of the lines connecting brain to labels survive.

References

Reference 1: https://en.wikipedia.org/wiki/WebP

Reference 2: https://en.wikipedia.org/wiki/Alpha_compositing

Reference 3: https://developers.google.com/speed/webp/

Reference 4: https://www.adobe.com/creativecloud/file-types/image/raster/webp-file.html

Reference 5: https://psmv5.blogspot.com/2022/10/swings-and-roundabouts.html

Reference 6: Do we know what the early visual system does? - Carandini, M. and others – 2005. Open access at https://www.jneurosci.org/content/jneuro/25/46/10577.full.pdf

Reference 7: Technical Memo 7: Alpha and the History of Digital Compositing – Alvy Ray Smith, Microsoft – 1995. A reference from reference 2 above. Open access.

Reference 8: Compositing Digital Images – Porter, Thomas; Duff, Tom – 1984. A reference from reference 7 above. Open access.

Reference 9: http://psmv4.blogspot.com/2021/08/sir-roland-storrs.html

