
Changes between Version 10 and Version 11 of img2twit Tweet

Timestamp: 05/25/2009 12:35:27 PM
Author: Sam Hocevar
Comment: --

img2twit

-                      v10
+                      v11
 The first I read about this "competition" was [http://www.flickr.com/photos/quasimondo/3518306770/in/set-72057594062596732/ here].
 = Discussion =
+= How it works =
+== Bit availability ==
+My goal is to reach a reasonable compromise between the following:
+ * Allow fast decompression
+ * Achieve decent reconstruction quality
+ * Work with various message lengths and character sets
+ * Do not waste a single bit of information
+Twitter allows for 140 characters in a message. UTF-8 is allowed.
+Here is a rough overview of the encoding process:
+ * The number of available bits is computed from the desired message length and the usable charset
+ * The source image is segmented into as many square cells as the available bits permit
+ * A fixed number of points (currently 2) is assigned to each cell, currently by selecting the darkest and brightest pixels in the cell
+ * The following is repeated until a quality condition is met:
+   * A point is chosen at random
+   * An operation is performed at random on this point (moving it inside its cell, changing its colour)
+   * If the resulting image (see the decoding process below) is closer to the source image, the operation is kept
+ * The image size and list of points are encoded in UTF-8
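The optimisation loop in the list above can be sketched roughly as follows. This is only an illustration under several assumptions that are not in the original text: images are greyscale lists of rows, the stand-in `render` uses the nearest point instead of the natural neighbour interpolation of the real decoder, the "quality condition" is replaced by a fixed iteration count, and every function name is hypothetical.

```python
import random

def render(points, width, height):
    """Stand-in decoder: each pixel takes the value of its nearest point.
    (Assumption: the real decoder interpolates over natural neighbours.)"""
    image = []
    for y in range(height):
        row = []
        for x in range(width):
            nearest = min(points, key=lambda p: (p[0] - x) ** 2 + (p[1] - y) ** 2)
            row.append(nearest[2])
        image.append(row)
    return image

def error(a, b):
    """Sum of squared differences between two greyscale images."""
    return sum((pa - pb) ** 2 for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))

def initial_points(source, cell):
    """Two points per square cell: its darkest and its brightest pixel."""
    height, width = len(source), len(source[0])
    points = []
    for cy in range(0, height, cell):
        for cx in range(0, width, cell):
            pixels = [(source[y][x], x, y)
                      for y in range(cy, min(cy + cell, height))
                      for x in range(cx, min(cx + cell, width))]
            for value, x, y in (min(pixels), max(pixels)):
                # Each point is [x, y, value, cell_x, cell_y].
                points.append([x, y, value, cx, cy])
    return points

def optimise(source, cell, iterations, rng):
    """Random search: keep a random change only if the error decreases."""
    height, width = len(source), len(source[0])
    points = initial_points(source, cell)
    best = error(render(points, width, height), source)
    for _ in range(iterations):
        p = rng.choice(points)
        saved = p[:]
        if rng.random() < 0.5:
            # Move the point, staying inside its own cell.
            p[0] = rng.randrange(p[3], min(p[3] + cell, width))
            p[1] = rng.randrange(p[4], min(p[4] + cell, height))
        else:
            # Change the point's grey value.
            p[2] = rng.randrange(256)
        candidate = error(render(points, width, height), source)
        if candidate < best:
            best = candidate
        else:
            p[:] = saved  # undo the operation
    return points, best
```

Calling `optimise(source, 4, 200, random.Random(0))` on an 8×8 image would start from 8 points (4 cells, 2 points each) and can never end with a higher error than the initial placement, since rejected operations are undone.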
+UTF-8 is restricted to the formal Unicode definition by RFC 3629. This means that the only legal UTF-8 characters range from U+0000 to U+10FFFF. The following must also be excluded:
+ * The 2¹¹ high and low surrogates, used for UTF-16 encoding, which restricts the Unicode range to U+0000..U+D7FF and U+E000..U+10FFFF.
+ * The 66 non-characters.
+And this is the decoding process:
+ * The image size and points are read from the UTF-8 stream
+ * For each pixel in the destination image:
+   * The list of natural neighbours is computed
+   * The pixel's final colour is set as a weighted average of its natural neighbours' colours
+The final size of this set is:
+== Bit allocation ==
+UTF-8 is restricted to the formal Unicode definition by RFC 3629, meaning that once the 2¹¹ high and low surrogates and the 66 non-characters are removed from the U+0000..U+10FFFF range, the final size of the UTF-8 character set is 1111998. However, a lot of these characters are undefined, not yet allocated or are control characters. As of Unicode 5.1 there are only 100507 graphic characters.
+The number of bits that can be expressed in a 140-character message using this charset is:
 {{{
 #!latex
 $(2^{20} + 2^{16}) - 2^{11} - 66 = 1111998$
+$n_{bits} = \dfrac{140 \log(100507)}{\log(2)} = 2326.37$
 }}}
 The number of bits that can be encoded using 140 such characters is computed as follows:
+If we restrict ourselves to the 20902 characters available in the ''CJK Unified Ideographs'' block, the number of bits becomes:
 {{{
 #!latex
 $n_{bits} = \mathrm{floor}\left(\dfrac{140 \log(1111998)}{\log(2)}\right) = 2811$
+$n_{bits} = \dfrac{140 \log(20902)}{\log(2)} = 2009.18$
 }}}
 In theory, 2811 bits is therefore the maximum we can stuff into a Twitter message. However, a lot of these characters are undefined, not yet allocated or are control characters. As of Unicode 5.1 there are 100507 graphic characters, reducing the number of expressed bits to:
+And finally, using the 94 non-spacing, printable ASCII characters:
 {{{
 #!latex
 $n_{bits} = \mathrm{floor}\left(\dfrac{140 \log(100507)}{\log(2)}\right) = 2326$
+$n_{bits} = \dfrac{140 \log(94)}{\log(2)} = 917.64$
 }}}
+We'll go on with this value of 2326 encodable bits.
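The capacity figures on this page can be checked with a few lines of Python. This is a minimal sketch; the function name `message_bits` and the constant `UTF8_LEGAL` are made up for the illustration.

```python
import math

def message_bits(charset_size, message_length=140):
    """Bits expressible by message_length characters drawn from a charset."""
    return message_length * math.log2(charset_size)

# Legal UTF-8 characters: U+0000..U+10FFFF minus surrogates and non-characters.
UTF8_LEGAL = 0x110000 - 2**11 - 66

for name, size in [
    ("all legal UTF-8 characters", UTF8_LEGAL),
    ("Unicode 5.1 graphic characters", 100507),
    ("CJK Unified Ideographs", 20902),
    ("printable ASCII", 94),
]:
    print(f"{name}: {size} characters, {math.floor(message_bits(size))} whole bits")
```

The fractional part matters when the whole message is treated as one large number written in base `charset_size`, which is one way (an assumption here, not something this page states) to avoid wasting bits at character boundaries.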
+== Optimised bitstream ==
+== Bit allocation ==
+A compressed image usually contains the following information:
+ * The image geometry information (width and height)
+ * Optional colour information (palette)
+ * Elementary picture elements (encoded as pixels, triangles, vectors...)
+Given the amount of compression we are doing, there is little point in compressing images larger than 512×512. This reduces image geometry information to 18 bits, leaving us with 2308 bits to encode the image information.
+Whether to use a palette or to encode colour information into the picture elements is not yet decided. We'll cover both options.
+== Strategy 1: colour information in picture elements ==
+Each picture element will hold data for:
+ * coordinates
+ * colour information
+ * additional control information
+Coordinates could be absolute (therefore requiring 16 or 14 bits, maybe 12) or relative. I would favour a coordinate system relative to predefined image cells because there is a good chance that each cell will hold a point. Assuming at least 8 horizontal and vertical subdivisions, 6 bits can be gained this way. The final coordinate bit allocation is now 10, 8 or 6. We'll pick 8 to be safe for now: 16 X values and 16 Y values.
+Using 7 bits per colour allows for the following options:
+ * full bit range usage: 4 red values, 8 green values, 4 blue values
+ * almost full bit range usage: 5 red values, 5 green values, 5 blue values
+Finally, a weight value could be added, using a final bit.
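As an illustration, the fields discussed above (4 + 4 coordinate bits, 7 colour bits, 1 weight bit) fit in one 16-bit word. The field order and the 2/3/2 split for the "4 red, 8 green, 4 blue" option are assumptions made for this sketch; the page does not specify a layout, and the function names are hypothetical.

```python
def pack_point(x, y, red, green, blue, weight):
    """Pack one point into 16 bits, laid out as x:4 | y:4 | r:2 g:3 b:2 | weight:1."""
    fields = ((x, 16), (y, 16), (red, 4), (green, 8), (blue, 4), (weight, 2))
    for value, limit in fields:
        if not 0 <= value < limit:
            raise ValueError(f"field value {value} outside 0..{limit - 1}")
    colour = (red << 5) | (green << 2) | blue  # 7 colour bits
    return (x << 12) | (y << 8) | (colour << 1) | weight

def unpack_point(word):
    """Inverse of pack_point: returns (x, y, red, green, blue, weight)."""
    weight = word & 1
    colour = (word >> 1) & 0x7F
    blue = colour & 0x3
    green = (colour >> 2) & 0x7
    red = colour >> 5
    y = (word >> 8) & 0xF
    x = word >> 12
    return x, y, red, green, blue, weight
```

The "5 red, 5 green, 5 blue" option would instead store the colour as a single number such as `red * 25 + green * 5 + blue`, which uses 125 of the 128 available values.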
+The proposed allocation is then 16 bits per point (8 for coordinates, 7 for colour, 1 for weight), allowing 144 points to be stored in the 2308 available bits, in one of the following configurations:
+ * 12×12
+ * 10×14 (wasting 4 point slots)
+ * 9×16
+ * 8×18
+ * 7×20 (wasting 4 point slots)
+ * 6×24
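The point budget and the wasted slots in the list above follow from simple integer arithmetic, sketched here. The helper names are hypothetical, and the 18 geometry bits are taken as 9 bits each for width and height.

```python
def point_budget(total_bits=2326, geometry_bits=18, bits_per_point=16):
    """Number of whole points that fit once the geometry bits are reserved."""
    return (total_bits - geometry_bits) // bits_per_point

def wasted_slots(columns, rows, budget):
    """Point slots left unused by a columns x rows grid of points."""
    if columns * rows > budget:
        raise ValueError("grid does not fit in the point budget")
    return budget - columns * rows

budget = point_budget()
for columns, rows in [(12, 12), (10, 14), (9, 16), (8, 18), (7, 20), (6, 24)]:
    print(f"{columns}x{rows}: {wasted_slots(columns, rows, budget)} slots wasted")
```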
+== Strategy 2: colour information in a separate palette ==
+''To do.''
+== Image reconstruction ==
+Image reconstruction is an interpolation problem on a Delaunay triangulation. We use the natural neighbour coordinates to interpolate between nodes and obtain a first-order smooth image.
+''To do.''
 = Preliminary results =