source: research/2008-displacement/paper/paper.tex @ 4838

Last change on this file since 4838 was 2763, checked in by Sam Hocevar, 12 years ago
  • spelling
File size: 19.2 KB
Line 
1% This is LLNCS.DEM the demonstration file of
2% the LaTeX macro package from Springer-Verlag
3% for Lecture Notes in Computer Science,
4% version 2.2 for LaTeX2e
5%
6\documentclass{llncs}
7%
8\usepackage{makeidx}  % allows for indexgeneration
9\usepackage{graphicx} % for gnuplot epslatex stuff
10\usepackage{color}    % ditto
11\usepackage{pstricks} % for inkscape TeX output
12%
13\begin{document}
14%
15\mainmatter              % start of the contributions
16%
17\title{Reinstating Floyd-Steinberg: Improved Metrics for Quality Assessment
18of Error Diffusion Algorithms}
19%
20\titlerunning{Adapting Qualitative Metrics to Common Error Diffusion Algorithms}  % abbreviated title (for running head)
21%                                     also used for the TOC unless
22%                                     \toctitle is used
23%
24\author{Sam Hocevar\inst{1} \and Gary Niger\inst{2}}
25%
26\authorrunning{Sam Hocevar et al.} % abbreviated author list (for running head)
27%
28%%%% modified list of authors for the TOC (add the affiliations)
29\tocauthor{Sam Hocevar, Gary Niger (Laboratoire d'Imagerie Bureautique et de
30Conception Artistique)}
31%
32\institute{Laboratoire d'Imagerie Bureautique et de Conception Artistique\\
3314 rue de Plaisance, Paris, France
34\and
35143 Rolloffle Avenue, Tarzana, California 91356\\
36\email{sam@hocevar.net}, \email{gary\_niger@gnaa.us}}
37
38\maketitle              % typeset the title of the contribution
39
40\begin{abstract}
41In this contribution we introduce a little-known property of error diffusion
42halftoning algorithms which we call {\it error diffusion displacement}.
43By accounting for the inherent sub-pixel displacement caused by the error
44propagation, we correct an important flaw in most metrics used to assess the
45quality of resulting halftones. We find these metrics to usually highly
46underestimate the quality of error diffusion in comparison to more modern
47algorithms such as direct binary search.
48Using empirical observation, we give a method for creating computationally
49efficient, image-independent, model-based metrics for this quality assessment.
50Finally, we use the properties of error diffusion displacement to justify
51Floyd and Steinberg's well-known choice of algorithm coefficients.
52
53{\bf Keywords}: halftoning, error diffusion, image quality, human visual
54system, color quantization
55\end{abstract}
56%
57\section{Introduction}
58
59Image dithering is the process of reducing continuous-tone images to images
60with a limited number of available colours. Applications vary tremendously,
61from laser and ink-jet printing to display on small devices such as cellphones,
62or even the design of banknotes.
63
64Countless methods have been published for the last 40 years that try to best
65address the problem of colour reduction. Comparing two algorithms in terms of
66speed or memory usage is often straightforward, but how exactly a halftoning
67algorithm performs quality-wise is a far more complex issue, as it highly
68depends on the display device and the inner workings of the human eye.
69
70Though this document focuses on the particular case of bilevel halftoning,
71most of our results can be directly adapted to the more generic problem of
72colour reduction.
73
74\section{Halftoning algorithms}
75
76The most ancient halftoning method is probably classical screening. This highly
77parallelisable algorithm consists in tiling a dither matrix over the image
78and using its elements as threshold values. Classical screening is known for
79its structural artifacts such as the cross-hatch patterns caused by Bayer
80ordered dither matrices \cite{bayer}. However, modern techniques such as the
81void-and-cluster method \cite{void1}, \cite{void2} allow to generate screens
82yielding visually pleasing results.
83
84\medskip Error diffusion dithering, introduced in 1976 by Floyd and Steinberg
85\cite{fstein}, tries to compensate for the thresholding error through the use
86of feedback. Typically applied in raster scan order, it uses an error diffusion
87matrix such as the following one, where $x$ denotes the pixel being processed:
88
89\[ \frac{1}{16} \left| \begin{array}{ccc}
90- & x & 7 \\
913 & 5 & 1 \end{array} \right| \]
92
93Though efforts have been made to make error diffusion parallelisable
94\cite{parfstein}, it is generally considered more computationally expensive
95than screening, but carefully chosen coefficients yield good visual results
96\cite{kite}.
97
98\medskip Model-based halftoning is the third important algorithm category. It
99relies on a model of the human visual system (HVS) and attempts to minimise
100an error value based on that model. One such algorithm is direct binary seach
101(DBS) \cite{allebach}, also referred to as least-squares model-based halftoning
102(LSMB) \cite{lsmb}.
103
104HVS models are usually low-pass filters. Nasanen \cite{nasanen}, Analoui and
105Allebach found that using Gaussian models gave visually pleasing results, an
106observation confirmed by independent visual perception studies \cite{mcnamara}.
107
108DBS yields halftones of impressive quality. However, despite efforts to make
109it more efficient \cite{bhatt}, it suffers from its large computational
110requirements and error diffusion remains a more widely used technique.
111
112\section{Error diffusion displacement}
113
114Most error diffusion implementations parse the image in raster scan order.
115Boustrophedonic (serpentine) scanning has been shown to cause fewer visual
116artifacts \cite{halftoning}, but other, more complex processing paths such as
117Hilbert curves \cite{spacefilling} are seldom used as they do not improve the
118image quality significantly.
119
120Intuitively, as the error is always propagated to the bottom-left or
121bottom-right of each pixel (Fig. \ref{fig:direction}), one may expect the
122resulting image to be slightly translated. This expectation is confirmed
123visually when rapidly switching between an error diffused image and the
124corresponding DBS halftone.
125
126\begin{figure}
127  \begin{center}
128    \input{direction}
129    \caption{Floyd-Steinberg error diffusion direction in raster scan (left)
130             and serpentine scan (right).}\label{fig:direction}
131  \end{center}
132\end{figure}
133
134This small translation is visually innocuous but we found that it means a lot
135in terms of error computation. A common way to compute the error between an
136image $h_{i,j}$ and the corresponding binary halftone $b_{i,j}$ is to compute
137the mean square error between modified versions of the images, in the form:
138
139\begin{equation}
140  E(h,b) = \frac{(||v * h_{i,j} - v * b_{i,j}||_2)^2}{wh}
141\end{equation}
142
143\noindent where $w$ and $h$ are the image dimensions, $*$ denotes the
144convolution and $v$ is a model for the human visual system.
145
146To compensate for the slight translation observed in the halftone, we use the
147following error metric instead:
148
149\begin{equation}
150  E_{dx,dy}(h,b) = \frac{(||v * h_{i,j} - v * t_{dx,dy} * b_{i,j}||_2)^2}{wh}
151\end{equation}
152
153\noindent where $t_{dx,dy}$ is an operator which translates the image along the
154$(dx,dy)$ vector. By design, $E_{0,0} = E$.
155
156A simple example can be given using a Gaussian HVS model:
157
158\begin{equation}
159  v(x,y) = e^{\frac{x^2+y^2}{2\sigma^2}}
160\end{equation}
161
162Finding the second filter is then straightforward:
163
164\begin{equation}
165  (v * t_{dx,dy})(x,y) = e^{\frac{(x-dx)^2+(y-dy)^2}{2\sigma^2}}
166\end{equation}
167
168Experiments show that for a given image and a given corresponding halftone,
169$E_{dx,dy}$ has a local minimum almost always away from $(dx,dy) = (0,0)$ (Fig.
170\ref{fig:lena-values}). Let $E$ be an error metric where this remains true. We
171call the local minimum $E_{min}$:
172
173\begin{equation}
174  E_{min}(h,b) = \min_{dx,dy}E_{dx,dy}(h,b)
175\end{equation}
176
177\begin{figure}
178  \begin{minipage}[c]{0.8\textwidth}
179    \input{lena-values}
180  \end{minipage}
181  \begin{center}
182    \caption{Mean square error for the \textit{Lena} image ($\times10^4$). $v$
183             is a simple $11\times11$ Gaussian convolution kernel with $\sigma
184             = 1.2$ and $(dx,dy)$ vary in $[-1,1]\times[-1,1]$.}
185    \label{fig:lena-values}
186  \end{center}
187\end{figure}
188
189For instance, a Floyd-Steinberg dither of \textit{Lena} with $\sigma = 1.2$
190yields a per-pixel mean square error of $3.67\times10^{-4}$. However, when
191taking the displacement into account, the error becomes $3.06\times10^{-4}$ for
192$(dx,dy) = (0.165,0.293)$. The new, corrected error is significantly smaller,
193with the exact same input and output images.
194
195Experiments show that the corrected error is always noticeably smaller except
196in the case of images that are already mostly pure black and white. The
197experiment was performed on a database of 10,000 images from common computer
198vision sets and from the image board \textit{4chan}, providing a representative
199sampling of the photographs, digital art and business graphics widely exchanged
200on the Internet nowadays \cite{4chan}.
201
202In addition to the classical Floyd-Steinberg and Jarvis-Judice-Ninke kernels,
203we tested two serpentine error diffusion algorithms: Ostromoukhov's simple
204error diffusion \cite{ostromoukhov}, which uses a variable coefficient kernel,
205and Wong and Allebach's optimum error diffusion kernel \cite{wong}:
206
207\begin{center}
208  \begin{tabular}{|l|c|c|}
209  \hline
210  &~ $E\times10^4$ ~&~ $E_{min}\times10^4$ ~\\ \hline
211  ~raster Floyd-Steinberg ~&~ 3.7902 ~&~ 3.1914 ~\\ \hline
212  ~raster Ja-Ju-Ni        ~&~ 9.7013 ~&~ 6.6349 ~\\ \hline
213  ~Ostromoukhov           ~&~ 4.6892 ~&~ 4.4783 ~\\ \hline
214  ~optimum kernel         ~&~ 7.5209 ~&~ 6.5772 ~\\
215  \hline
216  \end{tabular}
217\end{center}
218
219We clearly see that usual metrics underestimate the quality of error-diffused
220halftones, especially in raster scan. Algorithms such as direct binary search,
221on the other hand, do not suffer from this bias since they are designed to
222minimise the very error induced by the HVS model.
223
224\section{An image-independent corrected quality metric for error-diffused
225halftones}
226
227We have seen that for a given image, $E_{min}(h,b)$ is a better and fairer
228visual error measurement than $E(h,b)$. However, its major drawback is that it
229is highly computationally expensive: for each image, the new $(dx,dy)$ values
230need to be calculated to minimise the error value.
231
232Fortunately, we found that for a given raster or serpentine scan
233error diffusion algorithm, there was often very little variation in
234the optimal $(dx,dy)$ values (Fig. \ref{fig:table-historaster} and
235\ref{fig:table-histoserp}).
236
237\begin{figure}
238  \begin{center}
239    \begin{minipage}[c]{0.50\textwidth}
240      \input{fs-histo}
241    \end{minipage}
242    \begin{minipage}[c]{0.40\textwidth}
243      \input{jajuni-histo}
244    \end{minipage}
245    \caption{error diffusion displacement histograms for the raster
246             Floyd-Steinberg (left) and raster Jarvis, Judis and Ninke (right)
247             algorithms applied to a corpus of 10,000 images}
248    \label{fig:table-historaster}
249  \end{center}
250\end{figure}
251
252\begin{figure}
253  \begin{center}
254    \begin{minipage}[c]{0.50\textwidth}
255      \input{ostro-histo}
256    \end{minipage}
257    \begin{minipage}[c]{0.40\textwidth}
258      \input{serpopt-histo}
259    \end{minipage}
260    \caption{error diffusion displacement histograms for the Ostromoukhov                    (left) and optimum kernel (right) algorithms applied to a corpus
261             of 10,000 images}
262    \label{fig:table-histoserp}
263  \end{center}
264\end{figure}
265
266For each algorithm, we choose the $(dx,dy)$ values at the histogram peak and
267we refer to them as the \textit{algorithm's displacement}, as opposed to the
268\textit{image's displacement} for a given algorithm. We call $E_{fast}(h,b)$
269the error computed at $(dx,dy)$. As $E_{fast}$ does not depend on the image, it
270is a lot faster to compute than $E_{min}$, and as it is statistically closer to
271$E_{min}$, we can expect it to be a better error estimation than $E$:
272
273\begin{center}
274  \begin{tabular}{|l|c|c|c|c|c|}
275  \hline
276  &~ $E\times10^4$ ~&~ $E_{min}\times10^4$ ~&~ $dx$ ~&~ $dy$ ~&~ $E_{fast}\times10^4$ ~\\ \hline
277  ~raster Floyd-Steinberg ~&~ 3.7902 ~&~ 3.1914 ~&~ 0.16 ~&~ 0.28 ~&~ 3.3447 ~\\ \hline
278  ~raster Ja-Ju-Ni        ~&~ 9.7013 ~&~ 6.6349 ~&~ 0.26 ~&~ 0.76 ~&~ 7.5891 ~\\ \hline
279  ~Ostromoukhov           ~&~ 4.6892 ~&~ 4.4783 ~&~ 0.00 ~&~ 0.19 ~&~ 4.6117 ~\\ \hline
280  ~optimum kernel         ~&~ 7.5209 ~&~ 6.5772 ~&~ 0.00 ~&~ 0.34 ~&~ 6.8233 ~\\
281  \hline
282  \end{tabular}
283\end{center}
284
285\section{Using error diffusion displacement for optimum kernel design}
286
287We believe that our higher quality $E_{min}$ error metric may be useful in
288kernel design, because it is the very same error that admittedly superior yet
289computationally expensive algorithms such as DBS try to minimise.
290
291Our first experiment was a study of the Floyd-Steinberg-like 4-block error
292diffusion kernels. According to the original authors, the coefficients were
293found "mostly by trial and error" \cite{fstein}. With our improved metric, we
294now have the tools to confirm or infirm Floyd and Steinberg's initial choice.
295
296We chose to do an exhaustive study of every $\frac{1}{16}\{a,b,c,d\}$ integer
297combination. We deliberately chose positive integers whose sum was 16: error
298diffusion coefficients smaller than zero or adding up to more than 1 are known
299to be unstable \cite{stability}, and diffusing less than 100\% of the error
300causes important loss of detail in the shadow and highlight areas of the image.
301
302We studied all possible coefficients on a pool of 3,000 images with an error
303metric $E$ based on a standard Gaussian HVS model. $E_{min}$ is only given here
304as an indication and only $E$ was used to elect the best coefficients:
305
306\begin{center}
307  \begin{tabular}{|c|c|c|c|}
308  \hline
309  ~ rank ~&~ coefficients ~&~ $E\times10^4$ ~&~ $E_{min}\times10^4$ ~\\ \hline
310  ~ 1 ~&~ 7 3 6 0 ~&~ 4.65512 ~&~ 3.94217 ~\\ \hline
311  ~ 2 ~&~ 8 3 5 0 ~&~ 4.65834 ~&~ 4.03699 ~\\ \hline
312  \hline
313  ~ 5 ~&~ 7 3 5 1 ~&~ 4.68588 ~&~ 3.79556 ~\\ \hline
314  \hline
315  ~ 18 ~&~ 6 3 5 2 ~&~ 4.91020 ~&~ 3.70465 ~\\ \hline
316  ~ \dots ~&~ \dots ~&~ \dots ~&~ \dots ~\\
317  \hline
318  \end{tabular}
319\end{center}
320
321The exact same operation using $E_{min}$ as the decision variable yields very
322different results. Similarly, $E$ is only given here as an indication:
323
324\begin{center}
325  \begin{tabular}{|c|c|c|c|}
326  \hline
327  ~ rank ~&~ coefficients ~&~ $E_{min}\times10^4$ ~&~ $E\times10^4$ ~\\ \hline
328  ~ 1 ~&~ 6 3 5 2 ~&~ 3.70465 ~&~ 4.91020 ~\\ \hline
329  ~ 2 ~&~ 7 3 5 1 ~&~ 3.79556 ~&~ 4.68588 ~\\ \hline
330  \hline
331  ~ 15 ~&~ 7 3 6 0 ~&~ 3.94217 ~&~ 4.65512 ~\\ \hline
332  \hline
333  ~ 22 ~&~ 8 3 5 0 ~&~ 4.03699 ~&~ 4.65834 ~\\ \hline
334  ~ \dots ~&~ \dots ~&~ \dots ~&~ \dots ~\\
335  \hline
336  \end{tabular}
337\end{center}
338
339Our improved metric allowed us to confirm that the original Floyd-Steinberg
340coefficients were indeed amongst the best possible for raster scan.
341More importantly, using $E$ as the decision variable may have elected
342$\frac{1}{16}\{7,3,6,0\}$ or $\frac{1}{16}\{8,3,5,0\}$, which are in fact poor
343choices.
344
345For serpentine scan, however, our experiment suggests that
346$\frac{1}{16}\{7,4,5,0\}$ is a better choice than the Floyd-Steinberg
347coefficients that have nonetheless been widely in use so far (Fig.
348\ref{fig:lena7450}).
349
350\begin{figure}
351  \begin{center}
352    \includegraphics[width=0.4\textwidth]{output-7-3-5-1-serp.eps}
353    ~
354    \includegraphics[width=0.4\textwidth]{output-7-4-5-0-serp.eps}
355  \end{center}
356  \begin{center}
357    \includegraphics[width=0.4\textwidth]{crop-7-3-5-1-serp.eps}
358    ~
359    \includegraphics[width=0.4\textwidth]{crop-7-4-5-0-serp.eps}
360    \caption{halftone of \textit{Lena} using serpentine error diffusion
361             (\textit{left}) and the optimum coefficients
362             $\frac{1}{16}\{7,4,5,0\}$ (\textit{right}) that improve on the
363             standard Floyd-Steinberg coefficients in terms of visual quality
364             for the HVS model used in section 3. The detailed area
365             (\textit{bottom}) shows fewer structure artifacts in the regions
366             with low contrast.}
367    \label{fig:lena7450}
368  \end{center}
369\end{figure}
370
371\section{Conclusion}
372
373We have disclosed an interesting property of error diffusion algorithms
374allowing to more precisely measure the quality of such halftoning methods.
375Having showed that such quality is often underestimated by usual metrics,
376we hope to see even more development in simple error diffusion methods.
377
378Confirming Floyd and Steinberg's 30-year old "trial-and-error" result with our
379work is only the beginning: future work may cover more complex HVS models,
380for instance by taking into account the angular dependance of the human eye
381\cite{sullivan}. We plan to use our new metric to improve all error diffusion
382methods that may require fine-tuning of their propagation coefficients.
383
384%
385% ---- Bibliography ----
386%
387\begin{thebibliography}{}
388%
389\bibitem[1]{bayer}
390B. Bayer,
391\textit{Color imaging array}.
392U.S. patent 3,971,065 (1976)
393
394\bibitem[2]{void1}
395R.A. Ulichney (Digital Equipment Corporation),
396\textit{Void and cluster apparatus and method for generating dither templates}.
397U.S. patent 5,535,020 (1992)
398
399\bibitem[3]{void2}
400H. Ancin, A. Bhattacharjya and J. Shu (Seiko Epson Corporation),
401\textit{Void-and-cluster dither-matrix generation for better half-tone
402uniformity}.
403U.S. patent 6,088,512 (1997)
404
405\bibitem[4]{fstein}
406R.W. Floyd, L. Steinberg,
407\textit{An adaptive algorithm for spatial grey scale}.
408Proceedings of the Society of Information Display 17, (1976) 75--77
409
410\bibitem[5]{parfstein}
411P. Metaxas,
412\textit{Optimal Parallel Error-Diffusion Dithering}.
413Color Imaging: Device-Indep. Color, Color Hardcopy, and Graphic Arts IV, Proc.
414SPIE 3648, 485--494 (1999)
415
416\bibitem[6]{kite}
417T. D. Kite,
418\textit{Design and Quality Assessment of Forward and Inverse Error-Diffusion
419Halftoning Algorithms}.
420PhD thesis, Dept. of ECE, The University of Texas at Austin, Austin, TX, Aug.
4211998
422
423\bibitem[7]{halftoning}
424R. Ulichney,
425\textit{Digital Halftoning}.
426MIT Press, 1987
427
428\bibitem[8]{spacefilling}
429L. Velho and J. Gomes,
430\textit{Digital halftoning with space-filling curves}.
431Computer Graphics (Proceedings of SIGGRAPH 91), 25(4):81--90, 1991
432
433\bibitem[9]{nasanen}
434R. Nasanen,
435\textit{Visibility of halftone dot textures}.
436IEEE Trans. Syst. Man. Cyb., vol. 14, no. 6, pp. 920--924, 1984
437
438\bibitem[10]{allebach}
439M. Analoui and J.~P. Allebach,
440\textit{Model-based halftoning using direct binary search}.
441Proc. of SPIE/IS\&T Symp. on Electronic Imaging Science and Tech.,
442February 1992, San Jose, CA, pp. 96--108
443
444\bibitem[11]{mcnamara}
445Ann McNamara,
446\textit{Visual Perception in Realistic Image Synthesis}.
447Computer Graphics Forum, vol. 20, no. 4, pp. 211--224, 2001
448
449\bibitem[12]{bhatt}
450Bhatt \textit{et al.},
451\textit{Direct Binary Search with Adaptive Search and Swap}.
452\url{http://www.ima.umn.edu/2004-2005/MM8.1-10.05/activities/Wu-Chai/halftone.pdf}
453
454\bibitem[13]{4chan}
455moot,
456\url{http://www.4chan.org/}
457
458\bibitem[14]{wong}
459P.~W. Wong and J.~P. Allebach,
460\textit{Optimum error-diffusion kernel design}.
461Proc. SPIE Vol. 3018, p. 236--242, 1997
462
463\bibitem[15]{ostromoukhov}
464Victor Ostromoukhov,
465\textit{A Simple and Efficient Error-Diffusion Algorithm}.
466in Proceedings of SIGGRAPH 2001, in ACM Computer Graphics,  Annual Conference
467Series, pp. 567--572, 2001
468
469\bibitem[16]{lsmb}
470T.~N. Pappas and D.~L. Neuhoff,
471\textit{Least-squares model-based halftoning}.
472in Proc. SPIE, Human Vision, Visual Proc., and Digital Display III, San Jose,
473CA, Feb. 1992, vol. 1666, pp. 165--176
474
475\bibitem[17]{stability}
476R. Eschbach, Z. Fan, K.~T. Knox and G. Marcu,
477\textit{Threshold Modulation and Stability in Error Diffusion}.
478in Signal Processing Magazine, IEEE, July 2003, vol. 20, issue 4, pp. 39--50
479
480\bibitem[18]{sullivan}
481J. Sullivan, R. Miller and G. Pios,
482\textit{Image halftoning using a visual model in error diffusion}.
483J. Opt. Soc. Am. A, vol. 10, pp. 1714--1724, Aug. 1993
484
485\end{thebibliography}
486
487\end{document}
Note: See TracBrowser for help on using the repository browser.