So I'm sure you've done a website that needed to be in 1,238 different languages. And every time you reached Chinese, Japanese, Korean... you were surprised to find that the embedded font alone made your .swf 9,000 kbytes big.
For the project we're working on, I'm mainly building little tools in PHP. One of them is a translations manager: a little SQL database with all the keywords and languages, which someone fills with data. At any point you can export it as .xml, ready to be used in the website.
Having this set up, Theo came up with the idea that, since we had control over the text that was going to be needed for each language, we could write a script to output the list of characters needed for each font.
The PHP script boils down to this:
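// In this case $lines is an array of associative arrays (one per row) that comes from MySQL.
$list = array();
foreach ($lines as $line)
{
    $string = strip_tags($line["text"]);
    // Note the single quotes: this strips the literal two-character sequence \n
    // (backslash + n), not real newlines. Real newlines are skipped by '.' below.
    $string = str_replace('\n', '', $string);
    // The /u modifier makes '.' match whole UTF-8 characters instead of bytes.
    preg_match_all('/./u', $string, $chars);
    foreach ($chars[0] as $char)
    {
        // Only add characters that are not in the list yet.
        if (!in_array($char, $list, true))
            $list[] = $char;
    }
}
$codes = array();
foreach ($list as $item)
{
    // mb_encode_numericentity() returns "&#NNNN;"; substr() keeps the decimal code point,
    // which is then converted to uppercase hex and padded to 4 digits.
    // The 0x0-0xffff map only covers the Basic Multilingual Plane.
    $codes[] = "U+" . zeropad( strtoupper( dechex( substr( mb_encode_numericentity( $item, array(0x0, 0xffff, 0, 0xffff), 'UTF-8' ), 2, -1 ) ) ), 4 );
}
echo implode(", ", $codes);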
You'll also need this:
function zeropad($num, $lim)
{
    // Prepend zeros until the string is at least $lim characters long.
    return (strlen($num) >= $lim) ? $num : zeropad("0" . $num, $lim);
}
What this code does (once properly set up in your own project) is split the whole text into characters and check them one by one against the list of characters already used; if a character is new, it gets added. Then it writes a Unicode list formatted as U+XXXX. The output looks something like this:
U+0043, U+0048, U+0041, U+004E, U+0045, U+004C, U+002E, U+004F, U+004D, U+0052, U+0044, U+0049, U+0054, U+0053, U+5168, U+5C4F, U+89C2, U+770B, U+5176, U+5B83, U+8BED, U+8A00, U+6CD5, U+5F8B, U+58F0, U+660E, U+97F3, U+91CF, U+5E55, U+540E, U+82B1, U+7D6E, U+5965, U+9EDB, U+4E3D, U+2022, U+5854, U+56FE, U+0020, U+4E0E, U+8BA9, U+002D, U+76AE, U+8036, U+5C14, U+70ED, U+5185, U+62CD, U+6444, U+8BB0, U+5F55, U+73B0, U+573A, U+5F71, U+7247, U+0032, U+5206, U+0030, U+79D2, U+0036, U+00B0, U+0035, U+4F20, U+5947, U+4E3A, U+4EC0, U+4E48, U+9009, U+5851, U+9020, U+5973, U+795E, U+642D, U+4E58, U+591C, U+95F4, U+5217, U+8F66, U+7684, U+4EBA, U+6027, U+611F, U+8BF1, U+60D1, U+4F60, U+6700, U+559C, U+7231, U+955C, U+5934, U+7B2C, U+4E00, U+6B21, U+7EED, U+5199, U+8F89, U+714C, U+4EE3, U+00BA, U+9999, U+6C34, U+6C1B, U+5FC6, U+6211, U+53F7, U+2014, U+79D8, U+6570, U+5B57, U+0039, U+5948, U+513F, U+4E4B, U+5E74, U+7537, U+4E3B, U+89D2, U+5D14, U+7EF4, U+65AF, U+0660, U+8FBE, U+6587, U+6CE2, U+7279, U+8FC7, U+7A0B, U+4E2D, U+7F8E, U+597D, U+56DE, U+5609, U+4F2F, U+8389, U+5212, U+65F6, U+521B, U+4F5C, U+73CD, U+8D35, U+6735, U+539F, U+6599, U+5999, U+8C03, U+548C, U+5242, U+7A7F, U+8D8A, U+5149, U+7ECF, U+5178, U+56DB, U+79CD, U+6F14, U+7ECE
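If you're on a newer PHP, the same idea can probably be written a bit more compactly. This is just a sketch, not the script we actually used: it assumes PHP 7.2+ with the mbstring extension (for mb_ord()), and unicode_list() is a name I made up for illustration. Unlike the version above, it doesn't need zeropad() and it also copes with characters outside the 0x0-0xffff range.

```php
<?php
// Hypothetical helper (not part of the original script): returns the
// "U+XXXX" codes for every distinct character in the given strings.
// Assumes PHP 7.2+ with the mbstring extension, for mb_ord().
function unicode_list(array $texts): array
{
    $seen = array();
    foreach ($texts as $text) {
        // Strip markup and the literal "\n" sequence, as in the script above.
        $text = str_replace('\n', '', strip_tags($text));
        // Split into whole UTF-8 characters rather than bytes.
        $chars = preg_split('//u', $text, -1, PREG_SPLIT_NO_EMPTY);
        foreach ($chars as $char) {
            if ($char === "\n" || $char === "\r") {
                continue; // skip real line breaks
            }
            // Key by code point so each character is stored only once.
            $seen[mb_ord($char, 'UTF-8')] = true;
        }
    }
    $codes = array();
    foreach (array_keys($seen) as $codepoint) {
        // %04X pads to at least four uppercase hex digits.
        $codes[] = sprintf('U+%04X', $codepoint);
    }
    return $codes;
}

// Sample input, made up for this example.
echo implode(', ', unicode_list(array('Hello', '全屏'))), "\n";
// prints: U+0048, U+0065, U+006C, U+006F, U+5168, U+5C4F
```

Either way, what you end up with is the same kind of comma-separated U+XXXX list as above.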
What's this for, you'll ask? Well, just look at this:
[Embed(source="yourfont.ttf", fontFamily="YourFont", fontWeight="bold", fontStyle="normal", advancedAntiAliasing="true", mimeType="application/x-font-truetype",
unicodeRange="U+0043, U+0048, U+0041, U+004E, U+0045, U+004C, U+002E, U+004F, U+004D, U+0052, U+0044, U+0049, U+0054, U+0053, U+5168, U+5C4F, U+89C2, U+770B, U+5176, U+5B83, U+8BED, U+8A00, U+6CD5, U+5F8B, U+58F0, U+660E, U+97F3, U+91CF, U+5E55, U+540E, U+82B1, U+7D6E, U+5965, U+9EDB, U+4E3D, U+2022, U+5854, U+56FE, U+0020, U+4E0E, U+8BA9, U+002D, U+76AE, U+8036, U+5C14, U+70ED, U+5185, U+62CD, U+6444, U+8BB0, U+5F55, U+73B0, U+573A, U+5F71, U+7247, U+0032, U+5206, U+0030, U+79D2, U+0036, U+00B0, U+0035, U+4F20, U+5947, U+4E3A, U+4EC0, U+4E48, U+9009, U+5851, U+9020, U+5973, U+795E, U+642D, U+4E58, U+591C, U+95F4, U+5217, U+8F66, U+7684, U+4EBA, U+6027, U+611F, U+8BF1, U+60D1, U+4F60, U+6700, U+559C, U+7231, U+955C, U+5934, U+7B2C, U+4E00, U+6B21, U+7EED, U+5199, U+8F89, U+714C, U+4EE3, U+00BA, U+9999, U+6C34, U+6C1B, U+5FC6, U+6211, U+53F7, U+2014, U+79D8, U+6570, U+5B57, U+0039, U+5948, U+513F, U+4E4B, U+5E74, U+7537, U+4E3B, U+89D2, U+5D14, U+7EF4, U+65AF, U+0660, U+8FBE, U+6587, U+6CE2, U+7279, U+8FC7, U+7A0B, U+4E2D, U+7F8E, U+597D, U+56DE, U+5609, U+4F2F, U+8389, U+5212, U+65F6, U+521B, U+4F5C, U+73CD, U+8D35, U+6735, U+539F, U+6599, U+5999, U+8C03, U+548C, U+5242, U+7A7F, U+8D8A, U+5149, U+7ECF, U+5178, U+56DB, U+79CD, U+6F14, U+7ECE")]
public var FontClass:Class;
This way, you only embed in the .swf the characters you're actually using from the .ttf. In our case, Chinese went down from 9,554 kbytes to 45 kbytes. That's a 99.5% reduction. Pretty cool!
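One more thing you could try if the list gets really long: unicodeRange also accepts ranges like U+0041-U+005A, so runs of consecutive code points can be collapsed. Here's a rough sketch of that. We didn't do this in the project, collapse_ranges() is a hypothetical helper name, and it expects integer code points rather than the U+XXXX strings. With scattered CJK characters the savings will likely be small; it helps more with Latin text and digits.

```php
<?php
// Hypothetical helper: collapses a list of integer code points into a
// unicodeRange string, merging consecutive values into "U+XXXX-U+YYYY".
function collapse_ranges(array $codepoints): string
{
    $codepoints = array_values(array_unique($codepoints));
    sort($codepoints);
    $parts = array();
    $count = count($codepoints);
    for ($i = 0; $i < $count; $i++) {
        $start = $codepoints[$i];
        $end = $start;
        // Extend the run while the next code point is adjacent.
        while ($i + 1 < $count && $codepoints[$i + 1] === $end + 1) {
            $end = $codepoints[++$i];
        }
        $parts[] = ($start === $end)
            ? sprintf('U+%04X', $start)
            : sprintf('U+%04X-U+%04X', $start, $end);
    }
    return implode(',', $parts);
}

// Sample input, made up for this example.
echo collapse_ranges(array(0x30, 0x32, 0x31, 0x5168, 0x4E00)), "\n";
// prints: U+0030-U+0032,U+4E00,U+5168
```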
Hopefully this will save someone a few sleepless nights.
shrinking some embeds.
doobsterelly, with the unicooodes.
lowering fiiilesiiize.
alriiight