Changes between Version 11 and Version 12 of img2twit


Timestamp: 05/25/2009 03:40:04 PM
Author: Sam Hocevar

A few notes and thoughts about compressing images to 140 characters, for use on Twitter.

I first read about this problem [http://www.flickr.com/photos/quasimondo/3518306770/in/set-72057594062596732/ here].

There is now a [http://stackoverflow.com/questions/891643/twitter-image-encoding-challenge competition] on StackOverflow.com.

= How it works =

My goal is to reach a reasonable compromise between the following:
 * Fast decompression (a fraction of a second)
 * Reasonable compression time (around 1 minute for high quality)
 * Decent reconstruction quality
 * Support for various message lengths and character sets
     
The main steps of the algorithm are:
 * The number of available bits is computed from the desired message length and usable charset
 * The source image is segmented into as many square cells as the available bits permit
 * A fixed number of points (currently 2) is assigned to each cell, with initial coordinates and colour values
 * The following is repeated until a quality condition is met (one iteration is sketched below):
   * A point is chosen at random
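
To make the iteration concrete, here is a minimal sketch of one optimisation step. The `Point` layout, the `cellError()` helper and the accept-if-better rule are illustrative assumptions, not the actual `img2twit` code:

{{{
#!cpp
#include <cstdlib>

/* Hypothetical per-cell data: each point has coordinates within its
 * cell (16 possible values each) and a colour (6 values per component),
 * matching the figures given in the "Point selection" section. */
struct Point { int x, y, r, g, b; };

/* Placeholder error measure: the real code would render the cell from
 * its points and return the distance to the source image. */
static double cellError(Point const *points, int npoints)
{
    (void)points; (void)npoints;
    return 0.0;
}

/* One optimisation step: pick a point at random, perturb either its
 * position or its colour, and keep the change only if the
 * reconstruction error does not increase. */
static void step(Point *points, int npoints)
{
    int i = std::rand() % npoints;
    Point backup = points[i];
    double before = cellError(points, npoints);

    if (std::rand() & 1)
    {
        points[i].x = std::rand() % 16;
        points[i].y = std::rand() % 16;
    }
    else
    {
        points[i].r = std::rand() % 6;
        points[i].g = std::rand() % 6;
        points[i].b = std::rand() % 6;
    }

    if (cellError(points, npoints) > before)
        points[i] = backup; /* revert: the change made things worse */
}
}}}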
     
== Optimised bitstream ==

There is a common misconception that values in an information stream should be bit-aligned. This is not the case at all in `img2twit`: the bitstream can store integer values with non-power-of-two bounds.

As an illustration, take the example of 16-bit colour coding. There are basically two widespread ways of storing RGB data into a 16-bit word: either using 5 bits per component and wasting the last bit, or using 6 bits for the G component, because the human eye is more sensitive to variations of green. The RGB triplet is then computed as ''((R * 64 + G) * 32 + B)'', and e.g. the green value is retrieved as ''(RGB / 32) % 64''. Though 5-6-5 looks rather balanced, that's 32 values of red, 64 values of green and 32 values of blue: actually as many values of green as the other two put together!
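
In code, such a 5-6-5 scheme reduces to shifts and masks; a minimal sketch (not taken from `img2twit`):

{{{
#!cpp
#include <stdint.h>

/* Pack R (0-31), G (0-63), B (0-31) into a 16-bit word. */
uint16_t pack565(unsigned r, unsigned g, unsigned b)
{
    return (uint16_t)((r * 64 + g) * 32 + b); /* (r << 11) | (g << 5) | b */
}

/* With power-of-two bounds, division and modulo are shifts and masks. */
unsigned red565(uint16_t rgb)   { return rgb / (64 * 32); } /* rgb >> 11 */
unsigned green565(uint16_t rgb) { return (rgb / 32) % 64; } /* (rgb >> 5) & 63 */
unsigned blue565(uint16_t rgb)  { return rgb % 32; }        /* rgb & 31 */
}}}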

In this case, `img2twit`'s approach would be to use 40 values for each component, where the RGB triplet is computed as ''((R * 40 + G) * 40 + B)'', and e.g. the green value is retrieved as ''(RGB / 40) % 40''. This makes the computations a lot slower: with power-of-two bounds, extracting a value is just a shift and a mask, while a bound of 40 requires dividing the whole bitstream by 40. However, it offers better packing and a lot more control over the data representation. For instance, I found that 2 bits (4 values) per colour component was not enough, but that 3 bits (8 values) was more than needed. I went for 6 values (2.58 bits).
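
This amounts to mixed-radix arithmetic: the whole stream is one large integer, and each value is appended with a multiply-and-add by its own bound. Here is a minimal sketch of the idea; a real implementation would use an arbitrary-precision integer instead of the 64-bit accumulator assumed here:

{{{
#!cpp
#include <stdint.h>
#include <stdio.h>
#include <vector>

/* Append each value v[i], of bound b[i], to a single integer. */
uint64_t encode(std::vector<unsigned> const &v, std::vector<unsigned> const &b)
{
    uint64_t n = 0;
    for (size_t i = 0; i < v.size(); i++)
        n = n * b[i] + v[i];
    return n;
}

/* Decode in reverse order with division and modulo by each bound. */
std::vector<unsigned> decode(uint64_t n, std::vector<unsigned> const &b)
{
    std::vector<unsigned> v(b.size());
    for (size_t i = b.size(); i-- > 0; )
    {
        v[i] = n % b[i];
        n /= b[i];
    }
    return v;
}

int main(void)
{
    /* An RGB triplet with 40 values per component, as above. */
    std::vector<unsigned> bounds(3, 40);
    uint64_t n = encode(decode(20565, bounds), bounds);
    printf("%llu\n", (unsigned long long)n); /* prints 20565 again */
    return 0;
}
}}}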

One drawback is that this makes the bitstream completely unable to recover from bit corruption. However, for such small streams I don't care at all.

== Point selection ==

I use 16 values for the X and Y positions of a point within a cell, and 6 values for each of the R, G and B components. This uses roughly 15.755 bits per point. There is a way to reduce this down to 15.252 bits by accounting for the fact that the two points within a cell are interchangeable, but I have not yet found an elegant way to remove the duplicate information without butchering a lot of code.
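
The saving would come from encoding the unordered pair of points rather than two ordered ones: with ''N'' possible point descriptions there are only ''N(N+1)/2'' distinct pairs, about half a bit less per point. A minimal sketch of such a pairing function (hypothetical, not in `img2twit`):

{{{
#!cpp
#include <stdint.h>

/* With N possible point descriptions, an unordered pair {a, b} can be
 * stored as one index below N * (N + 1) / 2 rather than N * N. */
uint64_t packPair(uint64_t a, uint64_t b)
{
    if (a > b) { uint64_t t = a; a = b; b = t; } /* order is irrelevant */
    return b * (b + 1) / 2 + a; /* triangular-number layout */
}

void unpackPair(uint64_t n, uint64_t &a, uint64_t &b)
{
    /* Find the largest b with b * (b + 1) / 2 <= n; a is the remainder.
     * A closed form using a square root would avoid the loop. */
    b = 0;
    while ((b + 1) * (b + 2) / 2 <= n)
        b++;
    a = n - b * (b + 1) / 2;
}
}}}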

== Algorithm seed ==

For the algorithm to converge quickly, the initial image needs to be as close as possible to an ideal solution. Of course the method works correctly with a random initial seed, but since the fitting is very slow, it can take 2 to 3 times as long to converge to the same solution.

My current approach is to look at each cell's brightest and darkest pixels, and to set the two initial points at these locations. There is a lot to be improved in this domain.
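
A minimal sketch of this seeding step, assuming a simple integer luminance measure and a hypothetical cell layout (not the actual `img2twit` code):

{{{
#!cpp
/* Hypothetical 8-bit RGB pixel type, for illustration. */
struct Pixel { unsigned char r, g, b; };

/* Rough integer approximation of perceived brightness. */
static int luminance(Pixel p)
{
    return 2 * p.r + 7 * p.g + p.b;
}

/* Locate the darkest and brightest pixels of a w x h cell; these give
 * the initial positions (and colours) of the cell's two points. */
void seedCell(Pixel const *cell, int w, int h,
              int &xmin, int &ymin, int &xmax, int &ymax)
{
    int lmin = luminance(cell[0]), lmax = lmin;
    xmin = ymin = xmax = ymax = 0;
    for (int y = 0; y < h; y++)
        for (int x = 0; x < w; x++)
        {
            int l = luminance(cell[y * w + x]);
            if (l < lmin) { lmin = l; xmin = x; ymin = y; }
            if (l > lmax) { lmax = l; xmax = x; ymax = y; }
        }
}
}}}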

== Image rendering ==

The final image is simply an interpolation of the control points stored in the message. I use natural neighbour interpolation because it is reasonably fast (remember that the main algorithm loop computes the effect of changes to individual points, and thus requires at least a partial render).
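
For illustration only, here is a sketch that blends control points with inverse distance weighting, a much simpler stand-in for the natural neighbour interpolation actually used:

{{{
#!cpp
struct ControlPoint { double x, y, r, g, b; };

/* Render one output pixel by blending all control points, each
 * weighted by the inverse square of its distance. Natural neighbour
 * interpolation weights by overlapping Voronoi areas instead, which
 * is more local and produces the sharp edges mentioned below. */
void renderPixel(ControlPoint const *pts, int npts, double x, double y,
                 double &r, double &g, double &b)
{
    double wsum = 0.0;
    r = g = b = 0.0;
    for (int i = 0; i < npts; i++)
    {
        double dx = pts[i].x - x, dy = pts[i].y - y;
        double d2 = dx * dx + dy * dy;
        if (d2 < 1e-12) /* on a control point: copy its colour */
        {
            r = pts[i].r; g = pts[i].g; b = pts[i].b;
            return;
        }
        double w = 1.0 / d2;
        wsum += w;
        r += w * pts[i].r;
        g += w * pts[i].g;
        b += w * pts[i].b;
    }
    r /= wsum;
    g /= wsum;
    b /= wsum;
}
}}}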

The rendering method is responsible for both the artifacts (mainly the embossed dots) and the nice edges found in some parts of the resulting images.

= Preliminary results =

An example image encoded as a 140-character CJK message:

豪弅淶鑆斳愔耐俬秱戂孤访红艶劃嬌躑擣昗呎腆猎扭僬题猛嬰頽恓劉檭橀韮闼帣赑峈鮦宝鰢斣麙蓑騰騺鹸希關鈯穨唖秴斮圫究傲駛朘铃邂巢沿譓船櫃晒峩泪蝻鵲皲販口谹鎺侒戣耔凉蠛抏槱戛蝂荄勞攞咉闏涪彃沏全偫吒溸乎洸螕慹鳩弭蚕弣寽砰薨埻铥恣噿悏镏雈壭蒬礡靑徠鼛慗泏郄渺婥俦攨賌羢髙壶耔僪爯姉蔮蠬伣豖弫

= Getting `img2twit` =

`img2twit` is currently research material and is not available in any released software. You can however use it, modify it and redistribute it for any purpose, according to the terms of the [http://sam.zoy.org/wtfpl/ WTFPL].

If you wish to try `img2twit`, get the [wiki:libpipi] source code.

If you just want to have a look at the code, get it [/browser/libpipi/trunk/examples/img2twit.cpp from the web-based browser].