Changes between Version 1 and Version 2 of libcaca/study/1


Timestamp: 08/04/2009 12:07:57 AM (15 years ago)
Author: Sam Hocevar
Comment: Chapter 1. Colour quantisation

Legend: lines removed from version 1 are prefixed with “-”; unprefixed lines are the new version 2 text.
  • libcaca/study/1 (v1 → v2)
- = Libcaca study: the science behind colour ASCII art =
= 1. Colour quantisation =

- This document is an attempt at extending the leverage of skilled resources by uncovering and addressing the challenges the industry faces today in the area of colour ASCII art generation.
The process of reducing the number of colours used in an image is called colour quantisation. It is a very old and common computer graphics problem. Many methods exist to perform the task, and their efficiency depends on several parameters:

- Seriously, guys. If you think that what libcaca does is easy, you either don’t know what you are talking about, or we want you in the team.
 * the input image: is it a photograph? a vector drawing? a composition of both?
 * the target medium: is it a computer screen? if so, what are the size and the position of the pixels? is it a printed document? if so, what kind of paper? what kind of ink? or maybe the conversion should be optimised for both targets?
 * the quality requirements: for instance, can contrast be raised for a more appealing result at the expense of accuracy?
 * the allowed computation time: do we need 50 fps or can we afford to wait 10 seconds for a better result?

- == Foreword ==
== 1.1. Black and white thresholding ==

- Meet Lena. She will guide us through this document, because the seriousness of a scientific document in the area of computer graphics can be measured by the number of times Lena appears in it. She truly is the Mona Lisa of image processing. ![2]
Since a greyscale pixel has a value between 0 and 1, a fast way to convert an image to black and white is to set all pixels below 0.5 to black and all pixels above 0.5 to white. This method is called thresholding and, in our case, results in the following images:

- [[Image(source:/web/trunk/static/study/lena256.png,alt="Lena (256×256)")]]
- [[Image(source:/web/trunk/static/study/gradient256.png,alt="colour gradient (64×256)")]]
- [[Image(source:/web/trunk/static/study/lena256bw.png,alt="Lena (256×256BW)")]]
- [[Image(source:/web/trunk/static/study/gradient256bw.png,alt="greyscale gradient (32×256)")]]
[[Image(source:/web/trunk/static/study/out/lena1-1-1.png,alt="50% threshold")]]
[[Image(source:/web/trunk/static/study/out/grad1-1-1.png,alt="50% threshold gradient")]]
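
As a minimal sketch (not code taken from libcaca itself; the function name and the flat float-array image representation are illustrative assumptions), the whole operation is a single comparison per pixel, with `t` being the cut-off value, 0.5 in the images above:

{{{
#include <stddef.h>

/* Threshold a w×h greyscale image stored as floats in [0,1]:
 * every pixel below t becomes black (0.0), the others white (1.0). */
static void threshold(float *pixels, size_t w, size_t h, float t)
{
    for (size_t i = 0; i < w * h; i++)
        pixels[i] = (pixels[i] < t) ? 0.0f : 1.0f;
}
}}}

Calling the same function with t = 0.4 or t = 0.6 gives the brighter and darker variants discussed next.
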
- This document makes a lot of assumptions, such as the fact that input images are made of pixels that have either one (grey level) or three (red, green and blue) values uniformly spread between 0 and 1 (with regards to human contrast perception). Real life is more complicated than that, but that is beyond the scope of this document for now.
- Table of contents
- * [wiki:libcaca/study/1 1. Colour quantisation]
-   * 1.1. Black and white thresholding
-   * 1.2. Greyscale thresholding
-   * 1.3. Dynamic thresholding
-   * 1.4. Random dithering
- * [wiki:libcaca/study/2 2. Halftoning]
-   * 2.1. Halftoning patterns
-   * 2.2. Screen artifacts
-   * 2.3. Ordered dithering
-   * 2.4. Random ordered dithering
-   * 2.5. Non-rectangular dither tiles
-   * 2.6. Supercell dithering
-   * 2.7. Void and cluster method
- * [wiki:libcaca/study/3 3. Error diffusion]
-   * 3.1. Floyd-Steinberg and !JaJuNi error diffusion
-   * 3.2. Floyd-Steinberg derivatives
-   * 3.3. Changing image parsing direction
-   * 3.4. Variable coefficients error diffusion
-   * 3.5. Block error diffusion
-   * 3.6. Sub-block error diffusion
-   * 3.7. Direct binary search
- * [wiki:libcaca/study/4 4. Model-based dithering]
-   * 4.1. Gaussian human visual system model
-   * 4.2. Direct binary search
-   * 4.3 Comparing dithering algorithms
- * [wiki:libcaca/study/5 5. Greyscale dithering]
-   * 5.1. Introducing gamma
-   * 5.2. Gamma correction
-   * 5.3. Greyscale sub-block error diffusion
- * [wiki:libcaca/study/6 6. Colour dithering]
-   * 6.1. Separate-space dithering
-   * 6.2. Accounting for other dimensions
-   * 6.3. Reducing visual artifacts
-   * 6.4. Colour sub-block error diffusion
- * [wiki:libcaca/study/7 7. Photographic mosaics]
-   * 7.1. Image classification
-   * 7.2. Error diffusion
-   * 7.3. Colour ASCII art
- * [wiki:libcaca/study/bibliography Bibliography]

Not that bad, but we were pretty lucky: the original image’s brightness was rather well balanced. A lot of detail is lost, though. Different results can be obtained by choosing threshold values other than 0.5, for instance 0.4 or 0.6, resulting in a much brighter or darker image:

[[Image(source:/web/trunk/static/study/out/lena1-1-2.png,alt="40% threshold")]]
[[Image(source:/web/trunk/static/study/out/grad1-1-2.png,alt="40% threshold gradient")]]
[[Image(source:/web/trunk/static/study/out/lena1-1-3.png,alt="60% threshold")]]
[[Image(source:/web/trunk/static/study/out/grad1-1-3.png,alt="60% threshold gradient")]]

Choosing the best thresholding value for a given image is called average dithering. But even with the best value, the results will not improve tremendously.
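
In practice this usually means taking the mean brightness of the image itself as the threshold. A possible sketch, under the same assumed float-array representation as above:

{{{
#include <stddef.h>

/* Average dithering: threshold the image at its own mean brightness
 * rather than at a fixed 0.5. */
static float average_threshold(const float *pixels, size_t w, size_t h)
{
    double sum = 0.0;

    for (size_t i = 0; i < w * h; i++)
        sum += pixels[i];

    return (float)(sum / (double)(w * h));
}
}}}

The returned value is then simply passed to the threshold() sketch above.
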
== 1.2. Greyscale thresholding ==

Better results can be achieved with a slightly bigger palette. Here is thresholding applied to a 3-colour and to a 5-colour palette:

[[Image(source:/web/trunk/static/study/out/lena1-2-1.png,alt="3-colour threshold")]]
[[Image(source:/web/trunk/static/study/out/grad1-2-1.png,alt="3-colour threshold gradient")]]
[[Image(source:/web/trunk/static/study/out/lena1-2-2.png,alt="5-colour threshold")]]
[[Image(source:/web/trunk/static/study/out/grad1-2-2.png,alt="5-colour threshold gradient")]]
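
A sketch of this generalisation, under the same assumptions as before: each of the `levels` output shades covers an equally wide slice of the input range, so for 3 colours the thresholds sit at 1/3 and 2/3.

{{{
#include <stddef.h>

/* N-level thresholding: each output shade covers an equally wide
 * slice of the input range, so all shades are used evenly.
 * levels = 3 or 5 corresponds to the examples above. */
static void quantise_even(float *pixels, size_t w, size_t h, int levels)
{
    for (size_t i = 0; i < w * h; i++)
    {
        int slice = (int)(pixels[i] * levels);

        if (slice > levels - 1)
            slice = levels - 1; /* pixel value exactly 1.0 */

        pixels[i] = (float)slice / (float)(levels - 1);
    }
}
}}}
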
Using this method, shades of grey are evenly used. However, the global error is far from optimal, as the following graphs show:

[[Image(source:/web/trunk/static/study/fig1-2-1.png,alt="mean error 0.138")]]
[[Image(source:/web/trunk/static/study/fig1-2-2.png,alt="mean error 0.075")]]

The following thresholding method minimises the error, at the cost of under-using the pure black and white colours:

[[Image(source:/web/trunk/static/study/out/lena1-2-3.png,alt="3-colour threshold, minimal error")]]
[[Image(source:/web/trunk/static/study/out/grad1-2-3.png,alt="3-colour threshold gradient, minimal error")]]
[[Image(source:/web/trunk/static/study/out/lena1-2-4.png,alt="5-colour threshold, minimal error")]]
[[Image(source:/web/trunk/static/study/out/grad1-2-4.png,alt="5-colour threshold gradient, minimal error")]]

[[Image(source:/web/trunk/static/study/fig1-2-3.png,alt="mean error 0.125")]]
[[Image(source:/web/trunk/static/study/fig1-2-4.png,alt="mean error 0.0625")]]

This is a perfect example of a situation where colour accuracy does not help achieve a better result.
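
The minimal-error variant simply snaps every pixel to the nearest output shade, which amounts to putting the thresholds halfway between consecutive shades (0.25 and 0.75 for 3 colours). A sketch under the same assumptions:

{{{
#include <math.h>
#include <stddef.h>

/* Minimal-error N-level thresholding: round every pixel to the
 * nearest of the `levels` output shades. Pure black and white only
 * cover half-width slices of the input range, hence their under-use. */
static void quantise_nearest(float *pixels, size_t w, size_t h, int levels)
{
    for (size_t i = 0; i < w * h; i++)
    {
        float x = pixels[i] * (float)(levels - 1);
        pixels[i] = floorf(x + 0.5f) / (float)(levels - 1);
    }
}
}}}
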
     51
     52== 1.3. Dynamic thresholding ==
     53
     54Dynamic thresholding consists in studying the image before selecting the threshold values. One strategy, for instance, is to choose the median pixel value. This is done simply by computing a histogram of the image.
     55
     56[[Image(source:/web/trunk/static/study/out/lena1-3-1.png,alt="2-colour dynamic threshold")]]
     57[[Image(source:/web/trunk/static/study/out/grad1-3-1.png,alt="2-colour dynamic threshold gradient")]]
     58[[Image(source:/web/trunk/static/study/out/lena1-3-2.png,alt="5-colour dynamic threshold")]]
     59[[Image(source:/web/trunk/static/study/out/grad1-3-2.png,alt="5-colour dynamic threshold gradient")]]
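
A sketch of the 2-colour case, again assuming greyscale floats in [0,1] (the 256-bucket histogram resolution is an arbitrary choice): build a histogram, walk it until half of the pixels have been counted, and use that value as the threshold.

{{{
#include <stddef.h>

#define BUCKETS 256 /* histogram resolution, arbitrary */

/* Dynamic thresholding, 2-colour case: pick the median pixel value
 * as the threshold so that about half of the pixels turn black. */
static float median_threshold(const float *pixels, size_t w, size_t h)
{
    size_t histogram[BUCKETS] = { 0 };
    size_t seen = 0;

    for (size_t i = 0; i < w * h; i++)
        histogram[(int)(pixels[i] * (BUCKETS - 1))]++;

    for (int b = 0; b < BUCKETS; b++)
    {
        seen += histogram[b];
        if (seen >= w * h / 2)
            return (float)b / (float)(BUCKETS - 1);
    }

    return 0.5f; /* not reached for a non-empty image */
}
}}}

For more colours, the same histogram can be cut into equally populated slices instead of just two.
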
== 1.4. Random dithering ==

Instead of constantly using the same threshold value, one can use a different random value for each pixel in the image. This technique is simply called random dithering.

Here are two simple examples. On the left, threshold values are chosen uniformly between 0 and 1. On the right, threshold values are chosen with a Gaussian distribution (mean 0.5, standard deviation 0.15):

[[Image(source:/web/trunk/static/study/out/lena1-4-1.png,alt="random dithering")]]
[[Image(source:/web/trunk/static/study/out/grad1-4-1.png,alt="random dithering gradient")]]
[[Image(source:/web/trunk/static/study/out/lena1-4-2.png,alt="gaussian (0.5, 0.15) random dithering")]]
[[Image(source:/web/trunk/static/study/out/grad1-4-2.png,alt="gaussian (0.5, 0.15) random dithering gradient")]]

The images look very noisy, but they are arguably an improvement over standard constant thresholding.
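
A sketch of the uniform case; drawing a fresh threshold per pixel is the only change with respect to plain thresholding (rand() is used for brevity, a real implementation would prefer a better generator):

{{{
#include <stddef.h>
#include <stdlib.h>

/* Random dithering: compare each pixel against its own random
 * threshold, drawn uniformly from [0,1]. */
static void random_dither(float *pixels, size_t w, size_t h)
{
    for (size_t i = 0; i < w * h; i++)
    {
        float t = (float)rand() / (float)RAND_MAX;
        pixels[i] = (pixels[i] < t) ? 0.0f : 1.0f;
    }
}
}}}

The Gaussian variant only changes how t is drawn; see the sketch at the end of this section.
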
Finally, this is dynamic thresholding with 4 colours where the threshold values at every pixel are computed as usual, but then perturbed using a Gaussian distribution (mean 0, standard deviation 0.08):

[[Image(source:/web/trunk/static/study/out/lena1-4-1.png,alt="4-colour dynamic thresholding, gaussian (0, 0.08)")]]
[[Image(source:/web/trunk/static/study/out/grad1-4-1.png,alt="4-colour dynamic thresholding, gaussian (0, 0.08) gradient")]]
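
As a sketch of the perturbation step (the Box-Muller transform is one way to draw Gaussian noise; it is an assumption here, not necessarily what was used to generate the images above): add gauss(0.0f, 0.08f) to each per-pixel threshold obtained from the dynamic method before doing the comparison.

{{{
#include <math.h>
#include <stdlib.h>

/* Gaussian noise with mean mu and standard deviation sigma,
 * via the Box-Muller transform. */
static float gauss(float mu, float sigma)
{
    float u1 = ((float)rand() + 1.0f) / ((float)RAND_MAX + 1.0f); /* avoid log(0) */
    float u2 = (float)rand() / (float)RAND_MAX;

    return mu + sigma * sqrtf(-2.0f * logf(u1)) * cosf(2.0f * 3.14159265f * u2);
}
}}}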