Chasing Delta-E, Grey Scale and Primary Colours for Perfect Calibration

For obvious reasons, discussions of calibration focus heavily on Delta-E values, grey scale, and primary colours as signs of accuracy. Many calibration system suppliers go out of their way to provide all sorts of 'reporting' capabilities to prove their calibration is accurate using Delta-E values for grey scale and RGB primary colours.

But, is this correct?

The reality of relying on Delta-E


The simple answer is no, it is not at all correct, and often has little real relevance to the overall accuracy of calibration.

The simplest explanation is that real-world images do not contain grey scales, or even much in the way of pure grey, red, green or blue as actual colours. Only technically generated images have such 'perfect' colours, which exposes an underlying issue with the way many systems approach calibration: it is these 'unnatural' colours that are the focus of most calibration systems.

The image here shows the standard Delta-E values reported as an example of calibration accuracy. All the black space is not verified for accurate calibration, and can easily be wildly inaccurate.

Because of the limited number of points that Delta-E verifies, it is very possible for the actual calibration to be wildly inaccurate when real-world images are viewed, even though the Delta-E values report accurate calibration.

A far better and very obvious way to verify calibration is to perform a second profile with the calibration LUT active and assess the 3D Cube generated - the closer to a perfect 'cube' the better the accuracy of the calibration.

Delta-E (dE) is a single number that represents the difference between two colours, on the basis that a dE of 2.3 is the Just Noticeable Difference (JND), or smallest colour difference the human eye can see.

So, theoretically any dE less than 2.3 is imperceptible, while any dE greater than 2.3 is noticeable. However, some colour differences greater than 2.3 can be imperceptible, while some colour differences below 2.3 can be very visible, depending on the colour being measured.
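As a minimal sketch of how a single dE number is derived, the following assumes the CIE76 formula (a straight-line distance in CIELAB space), which is the formula the 2.3 JND figure is usually quoted for. The text above does not say which Delta-E formula is in use, and the two Lab colours are hypothetical values chosen purely for illustration.

```python
import math

def delta_e_cie76(lab1, lab2):
    """Euclidean distance between two CIELAB colours given as (L*, a*, b*)."""
    return math.sqrt(sum((c1 - c2) ** 2 for c1, c2 in zip(lab1, lab2)))

JND = 2.3  # Just Noticeable Difference threshold quoted in the text

target = (50.0, 10.0, 10.0)    # hypothetical target colour
measured = (51.0, 12.0, 9.0)   # hypothetical measured colour

de = delta_e_cie76(target, measured)
print(f"dE = {de:.2f}, below JND: {de < JND}")  # dE = 2.45, below JND: False
```

Note that the result says nothing about where in the colour space the two colours sit, which is why the same dE value can be visible for one colour and invisible for another.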

Additionally, and more importantly, when Delta-E is used to represent calibration accuracy it is normal to report only a limited number of colour points. Usually only the grey scale and RGB primary colours are represented, as shown above, or a small selection of colours based on something like the Macbeth ColorChecker. Neither is good enough in reality, as far too few points are being used to try to verify the total volumetric colour space.

Note: Although a dE of 2.3 is regarded as the technical JND value, many refer to a value of 1.0 as being a more realistic threshold for imperceptible difference.

Problems with Delta-E
Grey Scale and RGB

To gain a mental image of the problems being outlined here, think of skin tones. The average Caucasian skin tone resides well away from any grey scale or primary colour, and as such is ignored by most calibration systems. More importantly, colours such as skin tones, grass, sky, etc., are memory colours, which means the human eye has a good idea of what they should look like, as they are seen almost daily. Equally importantly, there are many different variations of hue, saturation and brightness associated with each 'colour' or tone. Without accurate display verification that includes these variations, the calibrated results can never be considered accurate.

The cube image here shows a standard grey scale and primary RGB calibration, with skin tone added to show its approximate location for reference. All the black space (including the skin tones) is effectively un-calibrated.

If displays were 'linear' in colour reproduction - any change in input signal produced an exactly equal change in the displayed colour - it would be possible to perform a grey scale and primary colour calibration only, and extrapolate/interpolate the calibration of the remaining colours. Unfortunately, very few displays are in any way 'linear'. More annoyingly, those displays that are close to linear are the highly expensive professional monitors, which are routinely calibrated with professional 3D LUT profiling systems whether the display needs it or not. It is lower-cost displays, such as home TVs, that are almost always of poor linearity, and therefore can only be accurately calibrated via professional-level full 3D cube based profiling and calibration.

Grey Scale Only Calibration

If you have a perfectly linear display (with perfect RGB separation, and therefore zero cross-coupling errors), you could actually profile just the grey scale and single-patch RGB readings, and then generate a perfect 3D LUT that would control not just the grey scale, but total gamut (all the colours) as well. Very few calibration systems understand this fact.

But, as few displays are truly linear in response - with the output changing directly in line with any input change - it's a concept that cannot really be used.
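The idea can be sketched as a simple additive model: on an ideal display, the colour of any RGB triplet is the sum of the three channel contributions, each being the primary's full-drive reading scaled by the tone response taken from the grey scale. The sketch below is an illustration only, not how any particular calibration system works. The primary XYZ values are the published Rec.709/sRGB figures standing in for measured single-patch readings, and the pure 2.4 power-law tone response is an assumption. It shows the forward prediction only; a 3D LUT would be derived by inverting this model.

```python
# Full-drive XYZ for each primary (Rec.709/sRGB values used as stand-ins
# for real single-patch measurements).
PRIMARY_XYZ = {
    "r": (0.4124, 0.2126, 0.0193),
    "g": (0.3576, 0.7152, 0.1192),
    "b": (0.1805, 0.0722, 0.9505),
}
GAMMA = 2.4  # assumed tone response, as would be derived from the grey scale

def predict_xyz(r, g, b, gamma=GAMMA):
    """Predict XYZ for a normalised (0..1) RGB input on an ideal additive display."""
    drive = {"r": r ** gamma, "g": g ** gamma, "b": b ** gamma}
    return tuple(
        sum(drive[ch] * PRIMARY_XYZ[ch][i] for ch in "rgb")
        for i in range(3)
    )

white = predict_xyz(1.0, 1.0, 1.0)
mid_grey = predict_xyz(0.5, 0.5, 0.5)
orange = predict_xyz(0.8, 0.5, 0.2)  # an interior colour that was never measured
print(white, mid_grey, orange)
```

Any cross-coupling between channels breaks the additive assumption, and the predicted interior colours then no longer match what the display actually shows.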

Notice the use of 'profiling' and 'calibration' as separate functions - this is of critical importance for accurate final calibration. See later.

Accurate calibration of displays with poor linearity (which is most displays, as stated above) requires the use of 3D LUTs generated from full 3D cube based profiles.

But, not all 3D LUT based calibration is made equal - because of the overriding desire of many calibration systems to focus, incorrectly, on Delta-E, grey scale and primary colours as the definition for accurate calibration.

This is not to say that Delta-E reports are useless and should be ignored, or that the values they report are untrustworthy (ignoring the fact that the actual values reported can be deceptive), but that good Delta-E values alone are no guarantee of accurate calibration. All the colours that Delta-E values do not report on are equally important, and must be equally accurate for good final calibration.

Every Colour Point MUST Be Considered Equal

From the above description of calibration issues it can be seen that every colour point has to be given equal importance during profiling and calibration, not just grey scale and primary colours.

That above statement is so important for accurate calibration it is worth stating again!

Every colour point has to be given equal importance during profiling and calibration, not just grey scale and primary colours.

And the only way to do that is to profile all points within a 3D cube equally, and we recommend the use of full 21^3 profiling for any critical calibration, as this covers the entire colour space very accurately, with a good level of granularity.

21 Point Cube

The cube here shows a full 21^3 point profile, with all colours measured equally, and so all colours calibrated to a higher level of accuracy.
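To give a feel for the size of a 21^3 profile, the sketch below builds an evenly spaced patch list. The function name, the 8-bit code values and the patch ordering are assumptions made for illustration; this is not the patch sequence LightSpace CMS itself generates.

```python
from itertools import product

def cube_patches(points_per_axis, bit_depth=8):
    """Return evenly spaced RGB patch triplets for an N^3 cube profile."""
    max_code = (1 << bit_depth) - 1
    levels = [round(i * max_code / (points_per_axis - 1))
              for i in range(points_per_axis)]
    # Every combination of the per-axis levels, so all colours are
    # sampled with the same density as the grey scale and primaries.
    return list(product(levels, repeat=3))

patches = cube_patches(21)
print(len(patches))             # 9261
print(patches[0], patches[-1])  # (0, 0, 0) (255, 255, 255)
```

A 17^3 profile built the same way contains 4913 patches, and a 5^3 profile only 125.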

Within LightSpace CMS there are options for Quick profiles, which can produce excellent results, but they will never be as 'accurate' as a 17^3, or better still a 21^3 profile.
(Depending on the display, the difference may be next to invisible, but a Quick profile will never be as accurate.)

5 Point Cube

The image here shows a 5^3 cube profile combined with 17 point grey scale and RGB colour. It is easy to see that none of the points align with the skin tone colour locations, and there is still a lot of 'black' un-calibrated space within the cube.
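The amount of un-measured space can be put into rough numbers by asking how far a given colour sits from the nearest measured point. The sketch below works in normalised RGB and compares a 5^3 cube plus 17-point grey scale and RGB ramps against a full 21^3 cube. The sample colour is a hypothetical skin-tone-like value, not a measured one, and distance in RGB is only a crude stand-in for perceptual distance.

```python
import math
from itertools import product

def grid(n):
    """Normalised 0..1 node positions of an n^3 cube profile."""
    axis = [i / (n - 1) for i in range(n)]
    return set(product(axis, repeat=3))

def ramps(n):
    """n-point grey scale plus n-point red, green and blue ramps."""
    axis = [i / (n - 1) for i in range(n)]
    points = set()
    for v in axis:
        points.update({(v, v, v), (v, 0.0, 0.0), (0.0, v, 0.0), (0.0, 0.0, v)})
    return points

def nearest_distance(colour, points):
    """Distance from a colour to the closest measured point."""
    return min(math.dist(colour, p) for p in points)

sample = (0.78, 0.57, 0.44)  # hypothetical skin-tone-like RGB value

sparse = grid(5) | ramps(17)
dense = grid(21)

print(len(sparse), round(nearest_distance(sample, sparse), 3))
print(len(dense), round(nearest_distance(sample, dense), 3))
```

The extra grey scale and ramp points add density only along the cube's neutral axis and three of its edges, so they do nothing to bring measured data closer to colours in the interior.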

The use of skin tones as an example of 'real accuracy' is a perfect one, as they are critical memory colours and you can only get them accurate by accurate profiling - no interpolation will ever be accurate as it is - well - interpolated information, not accurate profiling.

The problem with interpolated data is lack of accuracy, as can be seen in the following images, where you can see the difference between calibrations generated by the above two approaches on the same display.

Interpolated Fully Measured

The edges of the LUT (primary colours) and the grey scale are identical, while the interior of the cube is very different, showing the additional accuracy afforded by treating all colours equally.
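The effect of interpolating between sparse measurements can be reproduced with a toy model. The sketch below 'profiles' a made-up non-linear display at the nodes of a 5^3 and a 21^3 grid, estimates the in-between colours by trilinear interpolation, and reports the worst disagreement with the true response. The display function, its cross-coupling terms and the choice of trilinear interpolation are all assumptions made for illustration; real displays and real LUT engines will behave differently.

```python
from itertools import product

def display(r, g, b):
    """Hypothetical non-linear display with channel cross-coupling."""
    return (
        r ** 2.2 + 0.05 * g * (1 - r),
        g ** 2.2 + 0.05 * b * (1 - g),
        b ** 2.2 + 0.05 * r * (1 - b),
    )

def measure(n):
    """'Profile' the display at every node of an n^3 grid."""
    return {idx: display(*(i / (n - 1) for i in idx))
            for idx in product(range(n), repeat=3)}

def trilinear(profile, n, rgb):
    """Estimate the display output at rgb from the 8 surrounding nodes."""
    base, frac = [], []
    for v in rgb:
        pos = v * (n - 1)
        i = min(int(pos), n - 2)   # clamp so the top edge uses the last cell
        base.append(i)
        frac.append(pos - i)
    out = [0.0, 0.0, 0.0]
    for corner in product((0, 1), repeat=3):
        weight = 1.0
        for f, c in zip(frac, corner):
            weight *= f if c else 1.0 - f
        node = profile[tuple(b + c for b, c in zip(base, corner))]
        for k in range(3):
            out[k] += weight * node[k]
    return out

def max_error(n, steps=12):
    """Largest per-channel gap between interpolated and true output."""
    profile = measure(n)
    worst = 0.0
    for idx in product(range(steps + 1), repeat=3):
        rgb = tuple(i / steps for i in idx)
        estimate = trilinear(profile, n, rgb)
        actual = display(*rgb)
        worst = max(worst, max(abs(e - a) for e, a in zip(estimate, actual)))
    return worst

print("5^3 worst error: ", max_error(5))
print("21^3 worst error:", max_error(21))
```

With this toy display the sparse profile's worst error is many times larger than the dense profile's, because far more of the colour space is estimated rather than measured.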

What all the above shows is that relying on Delta-E as a measure of overall calibration accuracy can be fraught with issues, and it should be used as a 'guide' to accuracy only.