Note: The below makes sense only if the program is scored in characters, not in bytes.
I haven't seen this, though somebody may have posted it somewhere.
I needed some long literal ASCII strings in my code, so shortening them somehow (in characters, not bytes) would be beneficial. After some experiments I came up with what I call the "Chinese reencoding". I call it that because the ASCII characters mostly end up squashed into Unicode code points that represent Chinese characters. You take an ASCII string S, encode it to bytes as ASCII, and then decode those bytes as UTF-16-BE, that is, S.encode('ascii').decode('utf-16-be').
The resulting string is half the length. It has to be big-endian, as otherwise the reverse reencoding may not work, and on most systems the shorter 'utf16' is little-endian. You may also need to append a character such as a space if the original string has odd length, but in many cases that is harmless. Also, for non-ASCII characters this does not save length, because they result in Unicode code points that are too big and have to be represented in the long form ("\uXXXX").
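The re-encoding described above can be sketched as a small helper, assuming Python 3. The name shorten and its pad parameter are made up for illustration; they are not part of any library.

```python
def shorten(s: str, pad: str = " ") -> str:
    """Pack an ASCII string into half as many characters (hypothetical helper)."""
    if len(s) % 2:
        s += pad  # UTF-16 needs an even number of bytes, so pad odd-length input
    # Every pair of ASCII bytes is read as one 16-bit code unit, high byte first.
    return s.encode("ascii").decode("utf-16-be")


packed = shorten("First Second")
print(packed, len(packed))
```

Because every ASCII byte is below 0x80, the resulting code units stay below U+8000, so they can never fall into the surrogate range and the decode step should not fail for pure ASCII input.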
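In your code, use the reverse reencoding, '[E]'.encode('utf-16-be').decode(), in order to get the original longer string, where [E] is the literal shortened string. This costs 29 additional characters, and since the reencoding saves only half of the original length, the original string has to be longer than 58, obviously.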
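A quick round-trip check of the reverse step, plus a rough break-even test, might look like the sketch below. The helpers restore and worth_it are hypothetical names, and the default overhead of 29 is simply the figure quoted above, not something the code measures.

```python
def restore(e: str) -> str:
    """Undo the re-encoding (hypothetical helper)."""
    # decode() defaults to UTF-8, which is enough because the bytes are plain ASCII.
    return e.encode("utf-16-be").decode()


def worth_it(n: int, overhead: int = 29) -> bool:
    """True if packing an n-character ASCII string saves more than the overhead."""
    packed_len = (n + 1) // 2  # odd lengths need one padding character
    return n - packed_len > overhead


original = "Eleven Pipers Piping"
packed = original.encode("ascii").decode("utf-16-be")
assert restore(packed) == original
print(worth_it(58), worth_it(60))
```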
One example - below is my 12 days of Christmas (it can be shortened additionally, but let's use that as an example):
for i in range(12):print('On the %s day of Christmas\nMy true love sent to me\n%s'%('First Second Third Fourth Fifth Sixth Seventh Eighth Ninth Tenth Eleventh Twelfth'.split()[i],'\n'.join('Twelve Drummers Drumming,+Eleven Pipers Piping,+Ten Lords-a-Leaping,+Nine Ladies Dancing,+Eight Maids-a-Milking,+Seven Swans-a-Swimming,+Six Geese-a-Laying,+Five Gold Rings,+Four Calling Birds,+Three French Hens,+Two Turtle Doves, and+A Partridge in a Pear Tree.\n'.split('+')[11-i:])))
It's 477 characters long. Let's apply the "Chinese" trick to the two long strings:
r=lambda s:s.encode('utf-16be').decode()
for i in range(12):print('On the %s day of Christmas\nMy true love sent to me\n%s'%(r('䙩牳琠卥捯湤⁔桩牤⁆潵牴栠䙩晴栠卩硴栠卥癥湴栠䕩杨瑨⁎楮瑨⁔敮瑨⁅汥癥湴栠呷敬晴栠').split()[i],'\n'.join(r('呷敬癥⁄牵浭敲猠䑲畭浩湧Ⱛ䕬敶敮⁐楰敲猠偩灩湧Ⱛ呥渠䱯牤猭愭䱥慰楮本⭎楮攠䱡摩敳⁄慮捩湧Ⱛ䕩杨琠䵡楤猭愭䵩汫楮本⭓敶敮⁓睡湳ⵡⵓ睩浭楮本⭓楸⁇敥獥ⵡⵌ慹楮本⭆楶攠䝯汤⁒楮杳Ⱛ䙯畲⁃慬汩湧⁂楲摳Ⱛ周牥攠䙲敮捨⁈敮猬⭔睯⁔畲瑬攠䑯癥猬\u2061湤⭁⁐慲瑲楤来\u2069渠愠健慲⁔牥攮ਠ').split('+')[11-i:])))
That's 362, including the lambda (it happens to be worth it, as it is used twice).
Now, code itself is mostly ASCII characters, so you may have already guessed that you can use the same trick with exec. The overhead is higher, 43 chars for "exec(''.encode('utf-16be').decode())" (in addition to the whole compressed program), and you may need to double-escape some escaped characters in your literal strings (like '\n' in mine, which has to become '\\n'). As a bonus, you can always easily add that one padding space. The compressed program is 299 characters long. Some code points that have to be written in the long escaped form can still appear. I have not found a way to eliminate them, as the added handling code is not worth the benefit.
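The exec variant can be sketched on a toy program as follows. This is only an illustration under the assumption of pure-ASCII source; pack_program is a hypothetical helper and the one-line program is made up, it is not the compressed 12 days solution.

```python
def pack_program(src: str) -> str:
    """Wrap ASCII source code in an exec() call with a half-length literal."""
    if len(src) % 2:
        src += " "  # a trailing space at the end of the source is harmless
    packed = src.encode("ascii").decode("utf-16-be")
    # The wrapper re-encodes the literal back to ASCII at run time and executes it.
    return "exec('" + packed + "'.encode('utf-16-be').decode())"


# Pass real programs as raw strings so that escapes such as \n stay two characters.
src = r"total=sum(range(10))"
wrapper = pack_program(src)

namespace = {}
exec(wrapper, namespace)
print(namespace["total"], len(src), len(wrapper))
```

For a program this short the wrapper is longer than the source; the trick only starts to pay off once the program is long enough to absorb the fixed overhead.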
This is a cheap trick, in fact, but it can always be applied on top of your solution when the program is longish and contains no or few non-ASCII characters. Often you can devise a custom encoding that stuffs more than two ASCII chars into one Unicode character, but it is specific to the task.
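As one illustration of such a task-specific encoding, the sketch below packs three characters into one, assuming the text only uses a 32-symbol alphabet (5 bits per character). ALPHABET, BASE, pack3 and unpack3 are all hypothetical choices made for this example; the offset 0x4E00 is picked so that every result stays between U+4E00 and U+CDFF, below the surrogate range.

```python
ALPHABET = "abcdefghijklmnopqrstuvwxyz ,.-'\n"  # hypothetical 32-symbol alphabet
BASE = 0x4E00  # hypothetical offset: results land in U+4E00..U+CDFF


def pack3(s: str) -> str:
    """Pack three 5-bit symbols into each output character."""
    s += " " * (-len(s) % 3)  # pad to a multiple of three symbols
    out = []
    for i in range(0, len(s), 3):
        a, b, c = (ALPHABET.index(ch) for ch in s[i:i + 3])
        out.append(chr(BASE + (a << 10 | b << 5 | c)))
    return "".join(out)


def unpack3(e: str) -> str:
    """Reverse pack3; the result may carry up to two padding spaces."""
    out = []
    for ch in e:
        v = ord(ch) - BASE
        out.append(ALPHABET[v >> 10] + ALPHABET[v >> 5 & 31] + ALPHABET[v & 31])
    return "".join(out)


text = "two turtle doves"
packed = pack3(text)
print(len(text), len(packed), unpack3(packed).rstrip() == text)
```

The decoder here is much longer than the 29-character reencoding call, so a scheme like this would only be worth considering for considerably longer strings.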