Uploaded image for project: 'FreeSWITCH'
  1. FreeSWITCH
  2. FS-10021

Large RTP timestamp jump when system clock is late from internal timer

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 1.6.12
    • Fix Version/s: None
    • Component/s: RTP
    • Security Level: public
    • Labels:
      None
    • Environment:
      Debian 3.16.0-4-amd64 #1 SMP Debian 3.16.39-1 (2016-12-30) x86_64 GNU/Linux
    • CPU Architecture:
      x86-64
    • Kernel:
      Linux
    • Userland:
      GNU/Linux
    • Distribution:
      Debian
    • Distribution Version:
      Debian 8 jessie
    • lsb_release:
      Hide
      No LSB modules are available.
      Distributor ID: Debian
      Description: Debian GNU/Linux 8.7 (jessie)
      Release: 8.7
      Codename: jessie
      Show
      No LSB modules are available. Distributor ID: Debian Description: Debian GNU/Linux 8.7 (jessie) Release: 8.7 Codename: jessie
    • Compiler:
      gcc
    • FreeSWITCH GIT Revision:
      v1.5.final-3610-g5619ffd
    • GIT Master Revision hash::
      v1.5.final-3610-g5619ffd

      Description

      The high-level observable symptom is a large RTP timestamp jump.
      It's so large that it's clearly a roll-over of the 32 bit unsigned due to a negative number, e.g. it goes from 6720 to 4294173376.

      I _think_ this is caused by the handling of STUN binding request packets, which causes a read to "now" from the system clock (clock_gettime(CLOCK_REALTIME, &ts); at src/switch_apr.c:312).

      If the system clock has drifted a little in the past since FS was started, the difference (now - timer->start) needed to compute the next timestamp becomes negative, and then rolls-over.

      When there's the jump the Mark bit is correctly set. In most of the cases the client can cope with this jump, but I'm raising this JIRA because the behaviour seems undesired/unneeded and would like to ensure it's not overlooked, should it hide other issues.

      I noticed this with 1.6.10 first, then 1.6.12 and was able to reproduce with current master.

      The configuration is a transcoding Opus to G.711:

      <configuration name="opus.conf">
          <settings>
              <param name="use-vbr" value="1"/>
              <param name="use-dtx" value="0"/>
              <param name="complexity" value="5"/>
              <param name="maxaveragebitrate" value="20000"/>
              <param name="maxplaybackrate" value="8000"/>
              <param name="packet-loss-percent" value="0"/>
              <param name="keep-fec-enabled" value="1"/>
              <param name="use-jb-lookahead" value="1"/>
              <param name="advertise-useinbandfec" value="1"/>
              <param name="adjust-bitrate" value="1"/>
          </settings>
      </configuration>

      opus lib at version 1.1-2.

      Dialplan (default.xml):

          <extension name="gv_test">
            <condition field="destination_number" expression="^gv$">
                     <action application="set" data="ringback=${us-ring}"/>
                     <action application="set" data="instant_ringback=true"/>
                     <action application="set" data="ignore_early_media=true" />
                     <action application="set" data="call_timeout=120"/>
                     <action application="set" data="hangup_after_bridge=true"/>
                     <action application="set" data="continue_on_fail=GATEWAY_DOWN,INVALID_GATEWAY,UNALLOCATED_NUMBER"/>
                     <action application="export" data="nolocal:absolute_codec_string=PCMA"/>
                     <action application="bridge" data="sofia/$${domain}/2000@139.59.163.24"/>
            </condition>
          </extension>

      You can see what happens in master_fail_test1_20170209.pcap, RTP stream 0x592127D8, and the STUN binding request at frame 13.

      The caller (sipp) "plays" the RTP contained in rtcp_rtp_fake.pcap (2 STUN binding requests, one for the RTP port and one for the RTCP port, followed by 3 RTP packets). This is the minimum amount of RTP that can trigger the issue.

      Of course in my experiment I had to manually change the system clock, which ended up about 16" late from FS' internal. This though was visible with small differences, in the order of a few hundreds of msec.

      The timestamp jump goes to 4294173376:

      4294967296-
      4294173376=
      793920
      827 frames (920 ticks per frame) == 16,54" (at 20 ms/frame)

      A backtrace for the relevant call in bt_FS_now_20170209.rtf.

      Any other detail, please let me know.
      1. bt_FS_now_20170209.rtf
        4 kB
        Giacomo Vacca
      2. master_fail_test1_20170209.pcap
        5 kB
        Giacomo Vacca
      3. rtcp_rtp_fake.pcap
        0.4 kB
        Giacomo Vacca

        Activity

        There are no comments yet on this issue.

          People

          • Assignee:
            mikej Mike Jerris
            Reporter:
            giavac Giacomo Vacca
          • Votes:
            0 Vote for this issue
            Watchers:
            1 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development