  1. FreeSWITCH
  2. FS-3679

internationalization / localization / conversion of char strings into other charsets (e.g. UTF-8 -> ISO-8859-1) / mod_charset

    Details

    • Type: New Feature
    • Status: Closed
    • Priority: Minor
    • Resolution: No Reporter Response
    • Affects Version/s: 1.3.0
    • Fix Version/s: 1.2.23
    • Component/s: freeswitch-core
    • Labels:
      None
    • Environment:
      windows VS2010 and centos 5.x
    • CPU Architecture:
      x86
    • Kernel:
      Linux
    • Userland:
      GNU/Linux
    • Compiler:
      gcc
    • FreeSWITCH GIT Revision:
      git-65a7566 2011-11-07 14-31-36 -0600
    • GIT Master Revision hash:
      n.a.
    • Target Version:

      Description

      This patch allows one to do conversions between char strings encoded with different charsets.

      The fs and fs_cli consoles cannot process accented characters such as á, ä, é and ë.
      On Linux: the fs console and fs_cli are restricted to 7-bit ASCII.
      On Windows: the default input and output codepage is CP437.
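The codepage mismatch described above can be illustrated with a short sketch. It uses Python's built-in codecs, not anything from FreeSWITCH or mod_charset, to show that the same character has different byte values in UTF-8 and CP437, and what a CP437 console would display when handed UTF-8 bytes. The helper name `show_as` is made up for this illustration.

```python
# Illustration only: uses Python's standard codecs, not FreeSWITCH code.

def show_as(raw: bytes, codepage: str) -> str:
    """Interpret raw bytes the way a console using `codepage` would."""
    return raw.decode(codepage, errors="replace")

utf8_bytes = "é".encode("utf-8")    # two bytes: 0xC3 0xA9
cp437_bytes = "é".encode("cp437")   # one byte: 0x82

# A CP437 console renders the two UTF-8 bytes as two unrelated glyphs.
garbled = show_as(utf8_bytes, "cp437")

print(utf8_bytes.hex(), cp437_bytes.hex(), garbled)
```

The point is only that the console and the data must agree on a charset; which codepage a given installation actually uses is something to check locally.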

      I made a module, mod_charset, that lets you set the default codepage of the fs console / fs_cli to whatever you specify in charset.conf; you can also change it through new API functions. It also provides various routines for converting char strings (streams) between charsets (e.g. for Western Europe, from UTF-8 into CP1252 or ISO-8859-1).
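As a rough picture of what such a charset conversion does at the byte level, here is a minimal sketch. It is not the code from the patch: Python's codecs stand in for iconv / apr-iconv, the function name `transcode` is hypothetical, and replacing unmappable characters with "?" is an assumed policy.

```python
# Illustration only: Python's codecs stand in for iconv / apr-iconv here.

def transcode(data: bytes, src: str, dst: str) -> bytes:
    """Re-encode `data` from charset `src` to charset `dst`.

    Characters that do not exist in `dst` are replaced with "?",
    which is one possible policy; the patch may behave differently.
    """
    text = data.decode(src)  # raises UnicodeDecodeError on invalid input
    return text.encode(dst, errors="replace")

utf8_greeting = "Avé".encode("utf-8")
latin1 = transcode(utf8_greeting, "utf-8", "iso-8859-1")
cp1252 = transcode("€".encode("utf-8"), "utf-8", "cp1252")

print(latin1, cp1252)
```

The euro sign shows why the target charset matters: it exists in CP1252 but not in ISO-8859-1, so the same input converts cleanly to one and falls back to the replacement character in the other.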

      On Linux, libedit restricts fs_cli / fs console input to 7-bit ASCII. A newer version of libedit supports UTF-8, but only through wide characters, so full UTF-8 support would require rewriting fs_cli. I chose to update fs_cli.c in a minimal way instead: it can now also be built without libedit, which creates fs_cli_i (in addition to fs_cli). If you use fs_cli_i, you lose all functionality provided by libedit (tab completion, history, function keys), but at least you can use accents in your text.

      Sample use cases:

      You want to store a default greeting and use it with TTS in your dialplan:
      fs> db insert/greeting/avé
      fs> db select/greeting

      You receive a UTF-8 encoded web request (through xmlrpc) on your Windows FS installation to execute charset.lua
      (e.g. from a Linux browser: http://host:8080/txtapi/lua?charset.lua Avé )
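To make the web request case concrete, the sketch below shows how the example URL would look once its query is percent-encoded as UTF-8, and how the same request reads when decoded with the wrong charset. It uses only Python's `urllib.parse`. How a particular browser encodes the request and how FreeSWITCH's HTTP handler decodes it are not stated in this report, so treat both as assumptions.

```python
from urllib.parse import quote, unquote

# Host, port and path are copied from the example URL above.
base = "http://host:8080/txtapi/lua"
query = "charset.lua Avé"

# quote() percent-encodes the UTF-8 bytes of each non-ASCII character.
url = base + "?" + quote(query)

# Decoding with the wrong charset yields mojibake instead of "é".
as_utf8 = unquote(url, encoding="utf-8")
as_latin1 = unquote(url, encoding="iso-8859-1")

print(url)
```

This is the situation the script is meant to handle: the argument arrives as UTF-8 bytes and has to be decoded before it is stored or spoken.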

      charset.lua:

      -- ---
      -- Separator appended after the arguments so an empty argument list can be detected.
      local sep = "###"
      local api = freeswitch.API()
      -- decode_utf8 is expected to be provided by mod_charset.
      local arglist = api:executeString("decode_utf8 " .. table.concat(argv, sep) .. sep)

      -- An unknown command means mod_charset is not loaded.
      if string.find(arglist, "INVALID COMMAND!", 1, true) then
        stream:write("-ERR - mod_charset not loaded, cannot decode from utf-8 charset")
        return
      end

      if arglist == sep then
        stream:write("-ERR - Invalid or no parameters, try: " .. argv[0] .. " Avé")
        return
      end

      api:executeString("db insert/greeting/" .. arglist)
      stream:write("+OK - string stored")
      -- ---

      A few notes:

      diff and patch do not allow filenames to contain spaces. After applying this patch on Windows, you may need to rename "Download_apr-iconv" in libs/win32 to "Download apr-iconv" before opening the VS2010 sln file. For now it downloads apr-iconv from an Apache repository (Windows only; on Linux it uses the iconv library through apr-util xlate).

      I added a "pwd" command; this should move to mod_commands.

      I typed the above Lua sample / HTTP request by hand, so it may contain typos.

            People

            • Assignee:
              anthm Anthony Minessale II
              Reporter:
              garmt garmt

                Time Tracking

                • Estimated: 1h
                • Remaining: 1h
                • Logged: Not Specified