9 | | Twitter allows up to 140 characters per message, and any UTF-8 text is accepted. |
| 13 | Here is a rough overview of the encoding process: |
| 14 | * The number of available bits is computed from desired message length and usable charset |
| 15 | * The source image is segmented into as many square cells as the available bits permit |
| 16 | * A fixed number of points (currently 2) is assigned to each cell, for now by selecting the darkest and brightest pixels in the cell
| 17 | * The following is repeated until a quality condition is met: |
| 18 |   * A point is chosen at random
| 19 |   * An operation is performed at random on this point (moving it inside its cell, changing its colour)
| 20 |   * If the resulting image (see the decoding process below) is closer to the source image, the operation is kept, otherwise it is undone
| 21 | * The image size and the list of points are encoded in UTF-8
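The encoding loop above can be sketched as a random hill climb. This is only an illustration under several assumptions that are not in the text: the image is greyscale (a list of rows), the decoder is replaced by a stand-in that paints each pixel with the value of its nearest point, and the quality condition is replaced by a fixed iteration count. All function names are hypothetical.

```python
import random

def render(points, width, height):
    """Stand-in decoder (assumption): each pixel takes its nearest point's value."""
    return [[min(points, key=lambda p: (p[0] - x) ** 2 + (p[1] - y) ** 2)[2]
             for x in range(width)] for y in range(height)]

def error(a, b):
    """Sum of squared differences between two greyscale images."""
    return sum((pa - pb) ** 2 for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))

def initial_points(src, cell):
    """Two points per square cell: its darkest and its brightest pixel."""
    points = []
    for cy in range(0, len(src), cell):
        for cx in range(0, len(src[0]), cell):
            pixels = [(src[y][x], x, y)
                      for y in range(cy, min(cy + cell, len(src)))
                      for x in range(cx, min(cx + cell, len(src[0])))]
            for v, x, y in (min(pixels), max(pixels)):
                points.append([x, y, v, cx, cy])  # cx, cy: origin of the owning cell
    return points

def optimise(src, cell, iterations, rng):
    """Keep a random change only if it brings the rendering closer to src."""
    h, w = len(src), len(src[0])
    points = initial_points(src, cell)
    best = error(render(points, w, h), src)
    for _ in range(iterations):  # assumption: fixed budget instead of a quality test
        p = rng.choice(points)
        saved = p[:]
        if rng.random() < 0.5:
            # Move the point, staying inside its own cell.
            p[0] = rng.randrange(p[3], min(p[3] + cell, w))
            p[1] = rng.randrange(p[4], min(p[4] + cell, h))
        else:
            # Change the point's colour (here: its grey level).
            p[2] = rng.randrange(256)
        e = error(render(points, w, h), src)
        if e < best:
            best = e
        else:
            p[:] = saved  # undo the operation
    return points, best
```

Calling `optimise(src, 4, 200, random.Random(0))` on an 8×8 image yields 8 points (4 cells, 2 points each) and an error no worse than that of the initial darkest/brightest selection.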
11 | | UTF-8 is restricted to the formal Unicode definition by RFC 3629. This means that the only legal UTF-8 characters range from U+0000 to U+10FFFF. The following restrictions must also be added: |
12 | | * The 2¹¹ high and low surrogates, used for UTF-16 encoding, which restricts the Unicode range to U+0000..U+D7FF and U+E000..U+10FFFF. |
13 | | * The 66 non-characters. |
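As a quick check of these restrictions, the sketch below counts the code points that remain. It applies only the two exclusions listed above (surrogates and non-characters); any further filtering of the charset is outside its scope. The function name `is_allowed` is hypothetical.

```python
def is_allowed(cp):
    """True if code point cp survives the two restrictions listed above."""
    if not 0 <= cp <= 0x10FFFF:
        return False
    if 0xD800 <= cp <= 0xDFFF:        # the 2**11 UTF-16 surrogates
        return False
    if 0xFDD0 <= cp <= 0xFDEF:        # 32 non-characters in the BMP
        return False
    if (cp & 0xFFFE) == 0xFFFE:       # U+nFFFE and U+nFFFF in each of the 17 planes
        return False
    return True

usable = sum(is_allowed(cp) for cp in range(0x110000))
print(usable)  # 1114112 - 2048 - 66 = 1111998
```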
| 23 | And this is the decoding process: |
| 24 | * The image size and points are read from the UTF-8 stream |
| 25 | * For each pixel in the destination image: |
| 26 |   * The list of natural neighbours is computed
| 27 |   * The pixel's final colour is set as a weighted average of its natural neighbours' colours
38 | | == Bit allocation == |
39 | | |
40 | | A compressed image usually contains the following information: |
41 | | * The image geometry information (width and height) |
42 | | * Optional colour information (palette) |
43 | | * Elementary picture elements (encoded as pixels, triangles, vectors...) |
44 | | |
45 | | Given the amount of compression we are doing, there is little point in compressing images larger than 512×512. This reduces image geometry information to 18 bits (9 bits each for width and height), leaving us with 2308 bits to encode the image information. |
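One way to fit a size of up to 512×512 into 18 bits is sketched below. Storing each dimension minus one (so that 512 fits in 9 bits) and putting the width in the high bits are both assumptions; the text does not specify the layout. The function names are hypothetical.

```python
def pack_geometry(width, height):
    """Pack width and height (1..512 each) into an 18-bit integer."""
    if not (1 <= width <= 512 and 1 <= height <= 512):
        raise ValueError("dimensions must be in 1..512")
    return ((width - 1) << 9) | (height - 1)

def unpack_geometry(bits):
    """Inverse of pack_geometry."""
    return (bits >> 9) + 1, (bits & 0x1FF) + 1
```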
46 | | |
47 | | Whether to use a palette or to encode colour information into the picture elements is not decided yet. We'll cover both options. |
48 | | |
49 | | == Strategy 1: colour information in picture elements == |
50 | | |
51 | | Each picture element will hold data for: |
52 | | * coordinates |
53 | | * colour information |
54 | | * additional control information |
55 | | |
56 | | Coordinates could be absolute (therefore requiring 16 or 14 bits, maybe 12) or relative. I would favour a coordinate system relative to predefined image cells because there is a good chance that each cell will hold a point. Assuming at least 8 horizontal and vertical subdivisions, 6 bits (3 per axis) can be saved this way. The final coordinate bit allocation is then 10, 8 or 6 bits. We'll pick 8 to be safe for now: 16 X values and 16 Y values within each cell. |
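To illustrate cell-relative coordinates, the sketch below turns a cell index and a 4-bit offset per axis into pixel coordinates. It assumes the image is split into equal cells and that the 16 offsets are spread evenly across each cell; neither detail is fixed by the text, and the function name is hypothetical.

```python
def cell_to_pixel(cell_x, cell_y, dx, dy, width, height, cols=8, rows=8, steps=16):
    """Convert a cell index plus an offset in 0..steps-1 per axis to pixel coordinates."""
    x = (cell_x * steps + dx) * width // (cols * steps)
    y = (cell_y * steps + dy) * height // (rows * steps)
    return x, y
```

For a 512×512 image with 8×8 cells, each cell is 64 pixels wide, so the 16 offsets land 4 pixels apart.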
57 | | |
58 | | Using 7 bits per colour allows for the following options: |
59 | | * full bit range usage: 4 red values, 8 green values, 4 blue values |
60 | | * almost full bit range usage: 5 red values, 5 green values, 5 blue values |
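Both colour options can be sketched as quantisers from 8-bit RGB to a 7-bit value. The bit order and the rounding rule are assumptions made for illustration, and the function names are hypothetical.

```python
def rgb_to_bits_484(r, g, b):
    """4 red, 8 green and 4 blue levels: 2 + 3 + 2 bits, values 0..127."""
    return ((r >> 6) << 5) | ((g >> 5) << 2) | (b >> 6)

def rgb_to_index_555(r, g, b):
    """5 levels per channel as a base-5 index, values 0..124 (3 codes unused)."""
    def level(c):
        return c * 5 // 256
    return (level(r) * 5 + level(g)) * 5 + level(b)
```

The first option uses all 128 codes but treats the channels unevenly; the second is uniform across channels and leaves 128 − 125 = 3 codes unused.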
61 | | |
62 | | Finally, a weight value could be added, using a final bit. |
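Taken together, the fields described so far (8 coordinate bits, 7 colour bits, 1 weight bit) fit in a 16-bit word. The sketch below shows one possible layout; the field order is an assumption and the function names are hypothetical.

```python
def pack_point(dx, dy, colour, weight):
    """Layout (assumed): 4 bits dx, 4 bits dy, 7 bits colour, 1 bit weight."""
    if not (0 <= dx < 16 and 0 <= dy < 16 and 0 <= colour < 128 and weight in (0, 1)):
        raise ValueError("field out of range")
    return (dx << 12) | (dy << 8) | (colour << 1) | weight

def unpack_point(bits):
    """Inverse of pack_point: returns (dx, dy, colour, weight)."""
    return bits >> 12, (bits >> 8) & 0xF, (bits >> 1) & 0x7F, bits & 1
```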
63 | | |
64 | | The proposed allocation is then 16 bits per point (8 for coordinates, 7 for colour, 1 for weight), allowing 144 points (2308 / 16, rounded down) to be stored in the following configurations: |
65 | | * 12×12 |
66 | | * 10×14 (wasting 4 point slots) |
67 | | * 9×16 |
68 | | * 8×18 |
69 | | * 7×20 (wasting 4 point slots) |
70 | | * 6×24 |
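The arithmetic behind this list can be reproduced with a few lines. The sketch takes the 2308-bit budget from the text and, for each grid width from 6 to 12, reports the tallest grid that fits and the number of point slots it wastes. The lower bound of 6 is an assumption chosen to match the list; `layouts` is a hypothetical helper.

```python
import math

BITS_AVAILABLE = 2308          # from the text, after removing 18 geometry bits
BITS_PER_POINT = 8 + 7 + 1     # coordinates + colour + weight

max_points = BITS_AVAILABLE // BITS_PER_POINT

def layouts(points, min_side=6):
    """For each width, the tallest grid that fits and the slots left unused."""
    result = []
    for w in range(min_side, math.isqrt(points) + 1):
        h = points // w
        result.append((w, h, points - w * h))
    return result

for w, h, wasted in layouts(max_points):
    print(f"{w}x{h} (wasting {wasted} point slots)")
```

Besides the configurations listed above, this also reports 11×13, which wastes a single slot.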
71 | | |
72 | | == Strategy 2: colour information in a separate palette == |
73 | | |
74 | | ''To do.'' |
75 | | |
76 | | == Image reconstruction == |
77 | | |
78 | | Image reconstruction is an interpolation problem on a Delaunay triangulation. We use the natural neighbour coordinates to interpolate between nodes and obtain a first-order smooth image. |
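For intuition, natural neighbour interpolation can be approximated on the pixel grid without building the triangulation. The sketch below uses a discrete variant of Sibson interpolation, which is a swapped-in technique and not necessarily the method used here: every pixel finds its nearest point and hands that point's value to all pixels lying strictly closer to it than that point. It assumes scalar (greyscale) values, ignores point weights and pixels outside the image, and runs in brute force, so it only suits small images. The function name is hypothetical.

```python
import math

def discrete_natural_neighbour(sites, width, height):
    """sites: {(x, y): value}. Returns rows of interpolated values."""
    total = [[0.0] * width for _ in range(height)]
    count = [[0] * width for _ in range(height)]
    for py in range(height):
        for px in range(width):
            # Nearest site to pixel p, with its squared distance.
            site, d2 = min(
                (((x, y), (x - px) ** 2 + (y - py) ** 2) for (x, y) in sites),
                key=lambda item: item[1])
            value = sites[site]
            r = math.isqrt(d2) + 1
            # Any pixel q closer to p than p's nearest site would "steal" p
            # if q were inserted as a new site, so p votes for its site at q.
            for qy in range(max(0, py - r), min(height, py + r + 1)):
                for qx in range(max(0, px - r), min(width, px + r + 1)):
                    if (qx - px) ** 2 + (qy - py) ** 2 < d2:
                        total[qy][qx] += value
                        count[qy][qx] += 1
    # Pixels that received no vote coincide with a site and keep its value.
    return [[sites[(x, y)] if count[y][x] == 0 else total[y][x] / count[y][x]
             for x in range(width)] for y in range(height)]
```

The result reproduces each point's value exactly at its own position and blends between neighbouring points elsewhere; at low resolutions the blend is visibly stepped.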
| 56 | ''Todo'' |