encoding(3tcl)



NAME

   encoding - Manipulate encodings

SYNOPSIS

   encoding option ?arg arg ...?
______________________________________________________________________________

INTRODUCTION

   Strings  in Tcl are encoded using 16-bit Unicode characters.  Different
   operating system interfaces or applications  may  generate  strings  in
   other  encodings  such  as  Shift-JIS.   The  encoding command helps to
   bridge the gap between Unicode and these other formats.

DESCRIPTION

   Performs  one  of  several  encoding-related  operations, depending  on
   option.  The legal options are:

   encoding convertfrom ?encoding? data
          Convert  data  to  Unicode  from  the  specified  encoding.  The
          characters in data are treated as binary data, where the low  8
          bits  of  each  character  are  taken as a single byte.  The
          resulting sequence of bytes  is  treated  as  a  string  in  the
          specified  encoding.   If encoding is not specified, the current
          system encoding is used.

   encoding convertto ?encoding? string
          Convert string from Unicode  to  the  specified  encoding.   The
          result  is  a  sequence  of  bytes that represents the converted
          string.  Each byte is stored in the low  8  bits  of  a  Unicode
          character.   If  encoding  is  not specified, the current system
          encoding is used.

   encoding dirs ?directoryList?
          Tcl can load encoding data  files  from  the  file  system  that 
          describe  additional encodings for it to work with. This command 
          sets the search path for *.enc encoding data files to  the  list 
          of  directories  directoryList. If directoryList is omitted then 
          the command returns the current list of directories that make up 
          the  search  path.  It is an error for directoryList to not be a 
          valid list. If, when a search  for  an  encoding  data  file  is 
          happening,  an  element  in  directoryList  does  not refer to a 
          readable, searchable directory, that element is ignored.

   encoding names
          Returns a list containing the names of all of the encodings that
          are currently available.

   encoding system ?encoding?
          Set the system encoding to encoding. If encoding is omitted then
          the command returns the current  system  encoding.   The  system
          encoding is used whenever Tcl passes strings to system calls.
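
   The options above can be combined as in the following minimal sketch.
   It assumes that the utf-8 encoding is available (it is built in to
   standard Tcl builds) and that encoding dirs is supported by the
   interpreter in use. The variable names are illustrative only.

```tcl
# Round trip: encode a Unicode string to UTF-8 bytes, then decode it again.
set text  "\u306F"                              ;# HIRAGANA LETTER HA
set bytes [encoding convertto utf-8 $text]      ;# byte sequence E3 81 AF
set back  [encoding convertfrom utf-8 $bytes]

puts "encoded length: [string length $bytes]"
puts "round trip ok:  [string equal $text $back]"

# Query the available encodings, the system encoding, and the search path.
if {[lsearch -exact [encoding names] utf-8] >= 0} {
    puts "utf-8 is available"
}
puts "system encoding: [encoding system]"
puts "search path:     [encoding dirs]"
```

   Each byte of the converted result occupies one character of the
   returned string, so string length reports the number of bytes here.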

EXAMPLE

   It  is  common  practice to write script files using a text editor that
   produces output in the euc-jp  encoding,  which  represents  the  ASCII
   characters  as single bytes and Japanese characters as two bytes.  This
   makes it easy to embed literal strings  that  correspond  to  non-ASCII
   characters  by  simply  typing  the  strings  in  place  in the script.
   However, because the  source  command  always  reads  files  using  the
   current system encoding, Tcl will only source such files correctly when
   the encoding used to write the file is the same.  This tends not to  be
   true  in an internationalized setting.  For example, if such a file were
   sourced in North America (where ISO8859-1  is  normally  used),  each
   byte  in the file would be treated as a separate character that maps to
   the 00 page in Unicode.  The resulting Tcl strings will not contain the
   expected Japanese characters.  Instead, they will contain a sequence of
   Latin-1 characters that correspond to the bytes of the original string.
   The encoding command can be used to convert this string to the expected
   Japanese Unicode characters.  For example,
          set s [encoding convertfrom euc-jp "\xA4\xCF"]
   would return the Unicode string "\u306F", which is the Hiragana  letter
   HA.
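
   One way to avoid depending on the system encoding is to read the
   script through a channel with an explicit encoding and evaluate the
   result. The sketch below is only an illustration: it assumes the
   euc-jp encoding is installed, and the file name eucjp_demo.tcl and the
   variable greeting are hypothetical. It writes its own small input file
   so that it is self-contained.

```tcl
# Create a tiny script file stored in euc-jp (hypothetical file name).
set path [file join [pwd] eucjp_demo.tcl]
set out [open $path w]
fconfigure $out -encoding euc-jp
puts $out "set greeting \"\u306F\""
close $out

# Read the file back with an explicit encoding rather than the system one.
set in [open $path r]
fconfigure $in -encoding euc-jp
set script [read $in]
close $in
file delete $path

# Evaluate the decoded script; greeting now holds the Hiragana letter HA.
eval $script
puts "decoded correctly: [string equal $greeting "\u306F"]"
```

   Because the channel performs the conversion, the script text is
   already Unicode when it is evaluated, and no call to encoding
   convertfrom is needed.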

SEE ALSO

   Tcl_GetEncoding(3tcl)

KEYWORDS

   encoding



