Free recode NEWS - User visible changes.    -*- outline -*- (allout)
Copyright © 1993-1999, 2000, 2001 Free Software Foundation, Inc.

* Version 3.6 - François Pinard, Bruno Haible, 2001-01.

.* General changes

. + The recode manual is now indexed, by charset, by concept, etc.
. + Program messages are also available in Greek, Galician and Italian.
. + Bruno Haible's nice portable iconv library has been integrated.
. + RFC 1345 tables and French character names have been updated.
. + The Texinfo charset has been refreshed, and made reversible.

.* New charsets (most from libiconv)

. + Japanese
      EUC-JP (csEUCPkdFmtJapanese, EUC_JP,
      Extended_UNIX_Code_Packed_Format_for_Japanese); ISO-2022-JP
      (csISO2022JP); ISO-2022-JP-1; ISO-2022-JP-2 (csISO2022JP2);
      JIS_C6220-1969-ro (csISO14JISC6220ro, ISO646-JP, iso-ir-14, jp);
      JIS_X0201 (csHalfWidthKatakana, JIS0201, JISX0201-1976,
      JISX0201.1976-0, X0201); JIS_X0208 (csISO87JISX0208, ISO-IR-87,
      JIS0208, JIS_X0208.1983-0, JIS_X0208.1983-1, JIS_X0208-1990-0,
      JIS_X0208.1983-1, X0208); JIS_X0212 (csISO159JISX02121990,
      ISO-IR-159, JIS0212, JIS_X0212.1990-0, JIS_X0212-1990, X0212);
      SJIS (csShiftJIS, MS_KANJI, SHIFT-JIS).

. + Chinese
      BIG5 (BIG-5, BIG-FIVE, BIGFIVE, CN-BIG5, csBig5); BIG5HKSCS;
      EUC-CN (CN-GB, csGB2312, EUC_CN, GB2312); EUC-TW (csEUCTW, EUC_TW);
      GB18030; HZ (HZ-GB-2312); ISO-2022-CN (csISO2022CN);
      ISO-2022-CN-EXT; GB_1988-80 (cn, csISO57GB1988, ISO646-CN,
      iso-ir-57); GB_2312-80 (CHINESE, csISO58GB231280, GB2312.1980-0,
      ISO-IR-58); ISO-IR-165 (CN-GB-ISOIR165).

. + Korean
      JOHAB (CP1361); EUC-KR (csEUCKR, EUC_KR); GBK (CP936); ISO-2022-KR
      (csISO2022KR); KSC_5601 (CP949, csKSC56011987, ISO-IR-149, KOREAN,
      KSC5601.1987-0, KS_C_5601-1987, KS_C_5601-1989, KSX1001:1992).

. + Vietnamese (independently of libiconv)
      TCVN; VIQR; VISCII; VNI; VPS.

. + Other languages
      ARMSCII-8; Georgian-Academy; Georgian-PS; WINDOWS-874 (CP874);
      MuleLao-1; CP1133 (IBM-CP1133); CP1258 (WINDOWS-1258); TIS-620
      (ISO-IR-166, TIS620, TIS620.2529-1, TIS620-0, TIS620.2533-0,
      TIS620.2533-1).

. + Apple specifics
      MacArabic; MacCentralEurope; MacCroatian; MacCyrillic; MacGreek;
      MacHebrew; MacIceland; MacRomania; MacThai; MacTurkish; MacUkraine.

. + Unicode
      JAVA; UCS-2-INTERNAL; UCS-2LE (UnicodeLITTLE); UCS-2-SWAPPED;
      UCS-4BE; UCS-4-INTERNAL; UCS-4LE; UCS-4-SWAPPED; UTF-16BE;
      UTF-16LE.

. + Others
      CP932; CP949 (UHC); CP950; CP866 (866, csIBM866, IBM866);
      ISO-8859-16 (ISO-IR-226, ISO_8859-16:2000).

. + Recode internal
      :libiconv: (:) [so option -x: avoids going through libiconv]

.* New aliases (from libiconv) [list to be revised]
    csASCII (for ANSI_X3.4-1968); csHPRoman8 (for hp-roman8);
    csISOLatin1 (for ISO-8859-1); csISOLatin2 (for ISO-8859-2);
    csISOLatin3 (for ISO-8859-3); csISOLatin4 (for ISO-8859-4);
    csISOLatin5 (for ISO-8859-9); csISOLatin6 and ISO_8859-10:1992 (for
    ISO-8859-10); csISOLatinArabic (for ISO-8859-6); csISOLatinCyrillic
    (for ISO-8859-5); csISOLatinGreek (for ISO-8859-7); csISOLatinHebrew
    (for ISO-8859-8); csKOI8R (for KOI8-R); csPC850Multilingual (for
    IBM850); csUCS4 (for ISO-10646-UCS-4); csUnicode, csUnicode11,
    UCS-2BE, UnicodeBIG (for ISO-10646-UCS-2); csUnicode11UTF7 (for
    UNICODE-1-1-UTF-7); csVISCII and VISCII1.1-1 (for VISCII); ISO-IR-179
    (for ISO-8859-13); csMacintosh and MacRoman (for macintosh);
    TCVN5712-1, TCVN5712-1:1993 and TCVN-5712 (for TCVN).

.* New surfaces tree (experimental).

* Version 3.5 - François Pinard, 1999-05.

.* Incompatible changes

. + A double dot `..' should now be used instead of a colon `:'.
. + Option --force (-f) is needed to pursue recoding despite errors.
. + There is no more quoting for special characters within charset names.
. + Auto check (`-a') and popen (`-o') options have been withdrawn.
. + Some charsets and aliases were deleted, see `Charsets & aliases' below.

.* Extended features

. + Program messages are available in localised form for many languages.
. + Long character names are available in French, if LANGUAGE is set to `fr'.
. + A new request syntax allows for recode chaining, and for surfaces.
. + Option --header-file (-h) accepts a language parameter, and Perl is new.
. + Full charset listings now show the UCS-2 value for characters.
. + Option --known=PAIRS (-k) also accepts octal and hexadecimal numbers.
. + Option --list (-l) better sorts charsets and aliases, also fully written.
. + Charset `RFC1345' implements mnemonic+ascii+38, and is now reversible.
. + HTML is not limited anymore to Latin-1, HTML 4.0 entities are supported.

.* New features

. + Euro support.
. + Updated RFC 1345 set of tables, from Keld Simonsen.
. + Some African charsets and transliterated forms.
. + Conversions for ISO 10646 and Unicode.
. + Combining or explosion of UCS-2 diacriticized characters and ligatures.
. + Implementation of surfaces, see `Surfaces & aliases' below.
. + Mixed mode for recoding only comments and strings in C sources or PO files.
. + A stand-alone recoding library gets installed, often as a shared library.
. + Option --find-subsets (-T) lists charsets which are subsets of another.
. + The library may generate testing data, and study character frequencies.

.* Charsets & aliases

. + New ISO 10646 and Unicode charsets

.  - combined-UCS-2: pseudo-charset.
.  - count-characters: pseudo-charset.
.  - dump-with-names: pseudo-charset.
.  - ISO-10646-UCS-2 (UNICODE-1-1, BMP, rune, u2).
.  - ISO-10646-UCS-4 (10646, ISO-10646, UCS-4, u4).
.  - UNICODE-1-1-UTF-7 (TF-7, u7).
.  - UTF-8 (UTF-2, UTF-FSS, FSS_UTF, TF-8, u8).
.  - UTF-16 (Unicode, TF-16, u6).

. + RFC 1345.bis matters

.  - Deleted charsets
        dk-us, us-dk (because of &duplicate which `recode' does not
        handle yet).

.  - New charsets
        baltic (alias is iso-ir-179); CP1250 (1250, ms-ee, windows-1250);
        CP1251 (1251, ms-cyrl, windows-1251); CP1252 (1252, ms-ansi,
        windows-1252); CP1253 (1253, ms-greek, windows-1253); CP1254
        (1254, ms-turk, windows-1254); CP1255 (1255, ms-hebr,
        windows-1255); CP1256 (1256, ms-arab, windows-1256); CP1257
        (1257, WinBaltRim, windows-1257); CWI (CWI-2, cp-hu);
        EBCDIC-IS-FRISS (friss); GOST_19768-87 with aliases of previous
        GOST_19768-74; IBM256 (256, CP256, EBCDIC-INT1); IBM875 (875,
        CP875, EBCDIC-Greek); IBM1004 (1004, CP1004, os2latin1); IBM1047
        (1047, CP1047); ISO-8859-13 (ISO_8859-13:1998, iso-baltic,
        iso-ir-179a, l7, latin7); ISO-8859-14 (ISO_8859-14:1998,
        iso-celtic, iso-ir-199, l8, latin8); ISO-8859-15
        (ISO_8859-15:1998, iso-ir-203, l9, latin9); KOI-7; KOI-8
        (GOST_19768-74); KOI8-R; KOI8-RU; KOI8-U; macintosh_ce (macce);
        mac-is; NeXTSTEP (next) yet previous `recode' had it outside
        RFC 1345.

.  - Alias promoted to charset (with previous charset becoming alias)
        ISO-646.basic (with ISO-646.basic:1983); ISO-646.irv
        (ISO-646.irv:1983); ISO_5427-ext (ISO_5427:1981); ISO_5428
        (ISO_5428:1980); ISO-8859-1 (ISO_8859-1:1987); ISO-8859-2
        (ISO_8859-2:1987); ISO-8859-3 (ISO_8859-3:1988); ISO-8859-4
        (ISO_8859-4:1988); ISO-8859-5 (ISO_8859-5:1988); ISO-8859-6
        (ISO_8859-6:1987); ISO-8859-7 (ISO_8859-7:1987); ISO-8859-8
        (ISO_8859-8:1988); ISO-8859-9 (ISO_8859-9:1989); ISO-8859-10
        (latin6); NC_NC00-10 (NC_NC00-10:81); sami (latin-lap).

.  - New aliases
        037 (for charset IBM037); 038 (IBM038); 273 (IBM273); 274
        (IBM274); 275 (IBM275); 278 (IBM278); 280 (IBM280); 281 (IBM281);
        284 (IBM284); 285 (IBM285); 290 (IBM290); 297 (IBM297); 367
        (ANSI_X3.4-1968); 420 (IBM420); 423 (IBM423); 424 (IBM424); 500,
        500V1 (IBM500); 819 (ISO-8859-1); 864 (IBM864); 868 (IBM868); 870
        (IBM870); 871 (IBM871); 880 (IBM880); 891 (IBM891); 903 (IBM903);
        905 (IBM905); 912, CP912, IBM912 (ISO-8859-2); 918 (IBM918); 1026
        (IBM1026); ECMA-113, ECMA-113:1986 (ECMA-Cyrillic); GOST_19768-74
        (KOI8); ISO_8859-N (ISO-8859-N) for N = 1 through 10 and 13
        through 15; ISO_8859-10:1993 (ISO-8859-10); iso-ir-170
        (INVARIANT); KOI8_L2 (CSN_369103); pclatin2, pcl2 (IBM852);
        SS636127 (SEN_850200_B).

. + New African charsets
      AFRL1-101-BPI_OCIL (t-francais, t-fra); AFRFUL-102-BPI_OCIL
      (bambara, bra, ewondo, fulfulde); AFRFUL-103-BPI_OCIL (t-bambara,
      t-bra, t-ewondo, t-fulfulde); AFRLIN-104-BPI_OCIL (lingala, lin,
      sango, wolof); AFRLIN-105-BPI_OCIL (t-lingala, t-lin, t-sango,
      t-wolof).

. + Extra miscellaneous charsets
      KEYBCS2 (Kamenicky); CORK (T1); KOI-8_CS2.

. + New HTML pseudo-charsets
      HTML_1.1 (h1); HTML_2.0 (RFC 1866, 1866, h2); HTML-i18n (RFC 2070);
      HTML_3.2 (h3) reimplemented; HTML_4.0 (h4, HTML, h); deleted
      aliases HTF, 8859, ISO 8859, Entities, SGML, WWW, w3.

.* Surfaces & aliases
    Base64 (64, b64); Quoted-Printable (qp, Quote-Printable);
    21-Permutation (swabytes); 4321-Permutation; CR; CR-LF (cl);
    Decimal-1 (d, d1); Decimal-2 (d2); Decimal-4 (d4); Hexadecimal-1
    (x, x1); Hexadecimal-2 (x2); Hexadecimal-4 (x4); Octal-1 (o, o1);
    Octal-2 (o2); Octal-4 (o4).
    data; test7; test8; test15; test16.

* Version 3.4 - François Pinard, 1994-11.

.* Charset HTML is new, it handles `&...;' sequences for Latin-1.
.* Charset AtariST handling is more general, --list may be used with it.
.* Charset ASCII-BS overstriking has been extended, mainly for German.
.* Charset RFC1345 may be a goal, to debug or study RFC 1345 short names.
.* Charset names have been revised.  Note that nextstep is now NeXT.
.* Option --force (-f) is accepted, but does not yet protect reversibility.
.* Option --quiet or --silent (-q) silences irreversible recoding messages.
.* Option --known=PAIRS (-k) helps searching through recodings.
.* Option --sequence=pipe (-p) does not fall back on -o anymore.
.* Option --auto-check may narrow its study around one particular charset.
.* An MSDOS port is available, check ftp.iro.umontreal.ca in pub/gnuish.
.* Compilation should now succeed on OS/2 EMX.  Thanks to Kai Uwe Rommel.
.* Program initialization is almost three times faster on average.
.* Corrected reported bugs, added small improvements, some aesthetic.

* Version 3.3 - François Pinard, 1993-12.

.* Charsets atarist, ebcdic-ccc, ebcdic-ibm and nextstep have been added.
.* Also, most RFC 1345 charsets and aliases are handled.  That's a bunch!
.* Old ascii disappears because of RFC 1345's ascii, use ascii-bs instead.
.* Old maci disappears because of RFC 1345's macintosh, use applemac instead.
.* Charsets cccascii and cdcascii disappear, use ebcdic-ccc and ebcdic instead.
.* Recoding between latin1, ibmpc and applemac is (almost) reversible.
.* The texinfo documentation has been reorganized, this to be continued.
.* Long options are accepted, charset names may be abbreviated.
.* Option --list (-l) displays charsets, aliases and contents in many formats.
.* Option --strict (-s) asks for stricter, non-reversible recodings.
.* Option --graphics (-g) approximates ibmpc rulers with ASCII graphics.
.* Option --header (-h) produces C source for many recoding tables.
.* Option --auto-check (-a) reports about all possible recodings.
.* Option --ignore (-x) prevents a charset from being selected.
.* Execution has been sped up through step merging, hashing for charset names.
.* Many various buglets have been eradicated, portability increased.
.* Charsets may be edited out by modifying the Makefile only.
.* Configuration is made through the use of an external config.h file.

* Version 3.2.4 - François Pinard, 1992-10.

.* None.

* Version 3.2.3 - François Pinard, 1992-09.

.* New -d `diacritics_only' option for LaTeX.
.* A few bugs have been corrected.
.* Documentation reorganization and improvements.
.* Increased portability, now uses Autoconf.
.* A few bugs solved.

* Version 3.2 - François Pinard, 1991-10.

.* MSDOS port redone.
.* New check goal at installation time.
.* Add -v option for verbose processing, remove old -q.
.* Add -i, -o and -p for letting the user control the strategy.
.* A few bugs corrected.
.* Embedded NULs should now be transmitted.

* Version 3.1 - François Pinard, 1990-03.

.* Rename -V to -C for showing Copyright.
.* Calling sequence changed, said files now recoded on themselves.
.* Add -t option for touching files.
.* Better on-line help.

* Version 3.0.1 - François Pinard, 1990-02.

.* Add -q option for quiet processing.
.* Executable file now considerably smaller, also speedier.
.* A few bugs corrected.

* Version 3.0 - François Pinard, 1989-10.

.* New Text to Latin1 processing, should be faster.
.* A few bugs corrected.

* For prior history down to 1980, see the end of the ChangeLog.