These are chat archives for caryll/otfcc

18th
Sep 2016
Belleve Invis
@be5invis
Sep 18 2016 03:49
Tested using the 65535-glyph cff font.
makeotf takes 6 hours to optimize it.
while otfcc only 24 seconds.
Cosimo Lupo
@anthrotype
Sep 18 2016 09:06
No way! That's awesome! What is the size comparison?
Belleve Invis
@be5invis
Sep 18 2016 10:19
@anthrotype 16344KB(otfcc) vs 16041KB(makeotf)
Cosimo Lupo
@anthrotype
Sep 18 2016 10:20
Not bad Indeed!
Belleve Invis
@be5invis
Sep 18 2016 10:20
otfcc performs better (on size) on some cases
Cosimo Lupo
@anthrotype
Sep 18 2016 10:21
I see
Well, it's pretty astonishing the timing difference
How come is that even possible?
Belleve Invis
@be5invis
Sep 18 2016 10:21
Adobe found a wrong algorithm.
they try to find a "best" subroutine on each pass, but the subroutinization is actually a grammar-based compression
and there are some very fast grammar-based compression algorithms.
Cosimo Lupo
@anthrotype
Sep 18 2016 10:22
Interesting
How would you categorise compreffor on that regard?
I have no experience in compression algorithms
Belleve Invis
@be5invis
Sep 18 2016 10:23
compreffor's implementation looks like FDK
but, well, it used multiple threads
so it performs faster
Cosimo Lupo
@anthrotype
Sep 18 2016 10:25
Ah so yours is single threaded and fares so fast!
Belleve Invis
@be5invis
Sep 18 2016 10:26
yes
btw, have you, and your manager received my mail?
I really want to join DM.
Cosimo Lupo
@anthrotype
Sep 18 2016 10:32
Hehe :). Of she did receive it, but she was at the conference. I'm sure she'll get back to you soon.
*Of course
Belleve Invis
@be5invis
Sep 18 2016 10:32
Thank you.
But, well, please do some promotion for me :)
like that speed comparison :P
Cosimo Lupo
@anthrotype
Sep 18 2016 10:34
Sure, I am a big fan!
Behdad Esfahbod
@behdad
Sep 18 2016 12:45
We thought about the trivial greedy algorithm but dismissed it. What's in compreffor is different algorithm and compresses better than afdko.
So I'm interested to see a comparison
Grammar based algorithms are very fast indeed, but speed is not primary factor for this.
Belleve Invis
@be5invis
Sep 18 2016 12:47
There is a better variant of SEQUITUR called Re-pair, and it is also a kind of grammar-based compression.
Behdad Esfahbod
@behdad
Sep 18 2016 12:48
Anyway. Thanks for the link for the paper. I have not had seen it before. Will read during my flight today
Belleve Invis
@be5invis
Sep 18 2016 12:48
@bv
@behdad Should I provide you the paper of Re-pair?
otfcc's SEQUITUR is a slightly modified one.
It enables the generation rule A → a when the terminal “a” is long enough.
Behdad Esfahbod
@behdad
Sep 18 2016 12:50
Sure. Yes please.
Belleve Invis
@be5invis
Sep 18 2016 12:52
Larsson, N. J.; Moffat, A. (2000), "Offline Dictionary-Based Compression", IEEE, 88 (11): 1722–1732, doi:10.1109/5.892708
Re-pair may have better compression ratio, while its amortized run time is still, well, linear
Behdad Esfahbod
@behdad
Sep 18 2016 12:54
Thanks!
If you get compreffor numbers let me know please. I don't remember how its running time compared with afdko.
Belleve Invis
@be5invis
Sep 18 2016 12:56
I do not have compreffor installed. Maybe you can do the test, for otfcc, just do : otfccdump your.otf | otfccbuild -O3 -s -o out.otf.
Behdad Esfahbod
@behdad
Sep 18 2016 12:57
I don't have age of set up either. Will try when on laptop.
Belleve Invis
@be5invis
Sep 18 2016 12:59
Some other papers and reviews:
Belleve Invis
@be5invis
Sep 18 2016 13:06
Another one callsed SEQUENTIAL: J. C. Kieffer and E. Yang. Grammar-Based Codes: a New Class of Universal Lossless Source Codes. IEEE Transactions on Information Theory, vol. 46 (2000), pp. 737–754.
Behdad Esfahbod
@behdad
Sep 18 2016 13:09
I trust you find the best one :).
Belleve Invis
@be5invis
Sep 18 2016 13:12
Compreffor's method looks like the GREEDY mentioned in one of the links above: repeatly replace a n-ary tuple occured in the entire grammar into a new rule.
this one is not that fast compared to SEQUITUR or RE-PAIR