[FS-9367] Add logging of session creation Created: 19/Jul/16  Updated: 11/Nov/16  Resolved: 11/Nov/16

Status: Resolved
Project: FreeSWITCH
Component/s: core
Affects Version/s: None
Fix Version/s: 1.8

Type: New Feature Priority: Minor
Reporter: Nathan Neulinger Assignee: Mike Jerris
Resolution: Fixed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

CPU Architecture:
x86-64
Kernel:
Linux
Userland:
GNU/Linux
Distribution:
Debian
Distribution Version:
Debian 8 jessie
Compiler:
gcc
FreeSWITCH GIT Revision: 35ee4ee593e729dd0bc5d672fac5ef35928b68aa
GIT Master Revision hash:: 35ee4ee593e729dd0bc5d672fac5ef35928b68aa

 Description   
I had an event that caused a leak/exhaustion of sessions. The problem is - I had NO way to determine from logs where/when they were created - such as accumulation or a sudden flood/burst.

This patch (adding PR shortly) adds DEBUG level logging of session creation to correspond with the Session X (...) Ended logging output that currently exists at NOTICE level.

 Comments   
Comment by Nathan Neulinger [ 19/Jul/16 ]
Example log snippet:

{noformat}
3a133109-e1a1-4fc5-af91-6506578e748f 2016-07-19 09:53:30.233585 [DEBUG] switch_core_session.c:2358 Session 1 Created (Total 1)
66746841-77f6-432a-a51a-3df2c619dc36 2016-07-19 09:53:30.253546 [DEBUG] switch_core_session.c:2358 Session 2 Created (Total 2)
a5e59172-0eca-436d-93f0-2a4cbeda0134 2016-07-19 09:53:30.253546 [DEBUG] switch_core_session.c:2358 Session 3 Created (Total 3)
96cc65b7-0b1d-4837-9e36-7c47792c2625 2016-07-19 09:53:30.273545 [DEBUG] switch_core_session.c:2358 Session 4 Created (Total 4)
66746841-77f6-432a-a51a-3df2c619dc36 2016-07-19 09:53:32.713574 [DEBUG] switch_core_session.c:1647 Session 2 (loopback/5733417975-a) Locked, Waiting on external entities
3a133109-e1a1-4fc5-af91-6506578e748f 2016-07-19 09:53:32.713574 [DEBUG] switch_core_session.c:1647 Session 1 (sofia/freesw-vip-voip-5090/5733416679@voice-dev.mst.edu) Locked, Waiting on external entities
3a133109-e1a1-4fc5-af91-6506578e748f 2016-07-19 09:53:32.713574 [NOTICE] switch_core_session.c:1665 Session 1 (sofia/freesw-vip-voip-5090/5733416679@voice-dev.mst.edu) Ended
66746841-77f6-432a-a51a-3df2c619dc36 2016-07-19 09:53:32.713574 [NOTICE] switch_core_session.c:1665 Session 2 (loopback/5733417975-a) Ended
a5e59172-0eca-436d-93f0-2a4cbeda0134 2016-07-19 09:53:32.713574 [DEBUG] switch_core_session.c:1647 Session 3 (loopback/5733417975-b) Locked, Waiting on external entities
a5e59172-0eca-436d-93f0-2a4cbeda0134 2016-07-19 09:53:32.713574 [NOTICE] switch_core_session.c:1665 Session 3 (loopback/5733417975-b) Ended
96cc65b7-0b1d-4837-9e36-7c47792c2625 2016-07-19 09:53:32.713574 [DEBUG] switch_core_session.c:1647 Session 4 (sofia/internal/5733417975@67.43.241.123:5060) Locked, Waiting on external entities
96cc65b7-0b1d-4837-9e36-7c47792c2625 2016-07-19 09:53:32.713574 [NOTICE] switch_core_session.c:1665 Session 4 (sofia/internal/5733417975@67.43.241.123:5060) Ended
{noformat}
Comment by Nathan Neulinger [ 19/Jul/16 ]
https://freeswitch.org/stash/projects/FS/repos/freeswitch/pull-requests/914/overview
Comment by Mike Jerris [ 19/Jul/16 ]
isn't there already logging for them in NEW state that should cover this?
Comment by Nathan Neulinger [ 19/Jul/16 ]
Not that I'm seeing that easily ties to the numeric session #. It may be possible to backtrack based on the uuid to find the start of that session, but it's not clear that is always useful - such as sessions created by mod skinny.
Comment by Nathan Neulinger [ 19/Jul/16 ]
It's also not clear when the session was created in those cases.
Comment by Mike Jerris [ 19/Jul/16 ]
whats the advantage of that session number being logged at all? The reason i push on this is these messages seem to add no real value over what is already there.
Comment by Nathan Neulinger [ 19/Jul/16 ]
The scenario is "session exhaustion". While YOU might know every place in the code that a session could have been created, I certainly don't.

When the error is specifically talking about being out of sessions - how do I go look at the logs and find out the list of created sessions without knowing every possible pattern to look at?

Is the session solely tied to the UUID? in which case, I suppose I could backtrack and look at the 'first log entry I see for a given uuid' - but is there guaranteed to even be a log entry when the session is created?

Why log the session id when the session is closed then?

For me - it would have (clearly not useful now since it's not in the logs) been very helpful to be able to pair up: "Ok, I see sessions 1-999 created here with no corresponding "Ended" log entry.

Plus - the uuid based analysis isn't really a good indicator since there are additional entries in the log. Here's a full example of one particular call - note that stuff is happening to create a channel prior to CS_NEW and after the session is created - but that's not the only place sessions wind up getting created.

5bbb38ed-8574-44e4-acce-385022f462e5 2016-07-19 11:12:28.733609 [DEBUG] switch_core_session.c:2358 Session 27 Created (Total 1)
5bbb38ed-8574-44e4-acce-385022f462e5 2016-07-19 11:12:28.733609 [NOTICE] switch_channel.c:1104 New Channel sofia/internal/100@131.151.246.101 [5bbb38ed-8574-44e4-acce-385022f462e5]
5bbb38ed-8574-44e4-acce-385022f462e5 2016-07-19 11:12:28.733609 [DEBUG] switch_core_state_machine.c:543 (sofia/internal/100@131.151.246.101) Running State Change CS_NEW
5bbb38ed-8574-44e4-acce-385022f462e5 2016-07-19 11:12:28.733609 [DEBUG] sofia.c:9456 sofia/internal/100@131.151.246.101 receiving invite from 209.58.128.150:5107 version: 1.6.9 git d574870 2016-06-13 18:10:44Z 64bit
5bbb38ed-8574-44e4-acce-385022f462e5 2016-07-19 11:12:28.733609 [DEBUG] switch_core_state_machine.c:562 (sofia/internal/100@131.151.246.101) State NEW
2016-07-19 11:12:28.733609 [DEBUG] sofia.c:2248 detaching session 5bbb38ed-8574-44e4-acce-385022f462e5
2016-07-19 11:12:28.893660 [DEBUG] sofia.c:2356 Re-attaching to session 5bbb38ed-8574-44e4-acce-385022f462e5
5bbb38ed-8574-44e4-acce-385022f462e5 2016-07-19 11:12:28.893660 [DEBUG] sofia.c:9456 sofia/internal/100@131.151.246.101 receiving invite from 209.58.128.150:5107 version: 1.6.9 git d574870 2016-06-13 18:10:44Z 64bit
5bbb38ed-8574-44e4-acce-385022f462e5 2016-07-19 11:12:28.893660 [NOTICE] sofia.c:2246 Hangup sofia/internal/100@131.151.246.101 [CS_NEW] [CALL_REJECTED]
5bbb38ed-8574-44e4-acce-385022f462e5 2016-07-19 11:12:28.913543 [DEBUG] sofia.c:1430 Channel is already hungup.
5bbb38ed-8574-44e4-acce-385022f462e5 2016-07-19 11:12:28.913543 [DEBUG] switch_core_state_machine.c:543 (sofia/internal/100@131.151.246.101) Running State Change CS_HANGUP
5bbb38ed-8574-44e4-acce-385022f462e5 2016-07-19 11:12:28.913543 [DEBUG] switch_core_state_machine.c:809 (sofia/internal/100@131.151.246.101) Callstate Change DOWN -> HANGUP
5bbb38ed-8574-44e4-acce-385022f462e5 2016-07-19 11:12:28.913543 [DEBUG] switch_core_state_machine.c:811 (sofia/internal/100@131.151.246.101) State HANGUP
5bbb38ed-8574-44e4-acce-385022f462e5 2016-07-19 11:12:28.913543 [DEBUG] mod_sofia.c:437 Channel sofia/internal/100@131.151.246.101 hanging up, cause: CALL_REJECTED
5bbb38ed-8574-44e4-acce-385022f462e5 2016-07-19 11:12:28.913543 [DEBUG] switch_core_state_machine.c:60 sofia/internal/100@131.151.246.101 Standard HANGUP, cause: CALL_REJECTED
5bbb38ed-8574-44e4-acce-385022f462e5 2016-07-19 11:12:28.913543 [DEBUG] switch_core_state_machine.c:811 (sofia/internal/100@131.151.246.101) State HANGUP going to sleep
5bbb38ed-8574-44e4-acce-385022f462e5 2016-07-19 11:12:28.913543 [DEBUG] switch_core_state_machine.c:578 (sofia/internal/100@131.151.246.101) State Change CS_HANGUP -> CS_REPORTING
5bbb38ed-8574-44e4-acce-385022f462e5 2016-07-19 11:12:28.913543 [DEBUG] switch_core_state_machine.c:543 (sofia/internal/100@131.151.246.101) Running State Change CS_REPORTING
5bbb38ed-8574-44e4-acce-385022f462e5 2016-07-19 11:12:28.913543 [DEBUG] switch_core_state_machine.c:897 (sofia/internal/100@131.151.246.101) State REPORTING
5bbb38ed-8574-44e4-acce-385022f462e5 2016-07-19 11:12:28.913543 [DEBUG] switch_core_state_machine.c:174 sofia/internal/100@131.151.246.101 Standard REPORTING, cause: CALL_REJECTED
5bbb38ed-8574-44e4-acce-385022f462e5 2016-07-19 11:12:28.913543 [DEBUG] switch_core_state_machine.c:897 (sofia/internal/100@131.151.246.101) State REPORTING going to sleep
5bbb38ed-8574-44e4-acce-385022f462e5 2016-07-19 11:12:28.913543 [DEBUG] switch_core_state_machine.c:569 (sofia/internal/100@131.151.246.101) State Change CS_REPORTING -> CS_DESTROY
5bbb38ed-8574-44e4-acce-385022f462e5 2016-07-19 11:12:28.913543 [DEBUG] switch_core_session.c:1647 Session 27 (sofia/internal/100@131.151.246.101) Locked, Waiting on external entities
5bbb38ed-8574-44e4-acce-385022f462e5 2016-07-19 11:12:28.913543 [NOTICE] switch_core_session.c:1665 Session 27 (sofia/internal/100@131.151.246.101) Ended
5bbb38ed-8574-44e4-acce-385022f462e5 2016-07-19 11:12:28.913543 [NOTICE] switch_core_session.c:1669 Close Channel sofia/internal/100@131.151.246.101 [CS_DESTROY]
5bbb38ed-8574-44e4-acce-385022f462e5 2016-07-19 11:12:28.913543 [DEBUG] switch_core_state_machine.c:700 (sofia/internal/100@131.151.246.101) Running State Change CS_DESTROY
5bbb38ed-8574-44e4-acce-385022f462e5 2016-07-19 11:12:28.913543 [DEBUG] switch_core_state_machine.c:710 (sofia/internal/100@131.151.246.101) State DESTROY
5bbb38ed-8574-44e4-acce-385022f462e5 2016-07-19 11:12:28.913543 [DEBUG] mod_sofia.c:342 sofia/internal/100@131.151.246.101 SOFIA DESTROY
5bbb38ed-8574-44e4-acce-385022f462e5 2016-07-19 11:12:28.913543 [DEBUG] switch_core_state_machine.c:181 sofia/internal/100@131.151.246.101 Standard DESTROY
5bbb38ed-8574-44e4-acce-385022f462e5 2016-07-19 11:12:28.913543 [DEBUG] switch_core_state_machine.c:710 (sofia/internal/100@131.151.246.101) State DESTROY going to sleep
Comment by Nathan Neulinger [ 19/Jul/16 ]
Note that is WITH the added patch.
Comment by Mike Jerris [ 19/Jul/16 ]
why not just add the count to the new/init logs out of the state machine instead of adding more log lines?
Comment by Nathan Neulinger [ 19/Jul/16 ]
That would work - but then it would also break anything that anyone else had already written that looked at those log lines for State Change. That seems far more likely to be rejected as a PR. If you think that would be preferred/approved, I can resubmit. Suggested content/format?
Comment by Mike Jerris [ 19/Jul/16 ]
if someone is using log lines and parsing them instead of using a state hook or event, thats their problem. I don't have any specific suggestions on format, other than least obtrusive.
Comment by Nathan Neulinger [ 19/Jul/16 ]
Updated:

0f84aa76-9801-4f94-8e18-4a0def0cc0de 2016-07-19 13:04:27.274964 [DEBUG] switch_core_state_machine.c:543 (SKINNY/internal/SEP00115C0F6B2B:1/1) Running State Change CS_NEW (Cur 1 Tot 6)
Comment by Nathan Neulinger [ 19/Jul/16 ]
Cur = now, Tot = since startup
Generated at Wed Jul 18 16:40:09 CDT 2018 using JIRA 7.3.3#73014-sha1:d5be8da522213be2ca9ad7b043c51da6e4cc9754.