Deployment guidelines for async and regular chat

This page provides guidelines for deploying regular (traditional) and async (asynchronous) chat.

Regular vs. async chat mode

In general, all components process async and regular chats the same way. The exception is that async chat provides additional capabilities that require more planning and workflow implementation.

Topic	Regular Chat	Async Chat
Duration of a single conversation	Lasts minutes or tens of minutes	Can last for days or even weeks
Agent handling	Can transfer the chat, start a conference or consultation, or stop the chat session.	Additionally, can place the chat session on hold (i.e. place it in a workbin until a customer reply or another qualified event). Can resume the chat conversation later by taking the interaction from the workbin.
Mobile oriented	Can be implemented, but is not suited for lengthy conversations.	Naturally suitable for mobile applications because it allows long conversations with periods of inactivity.
Workflow	Mostly aimed at routing (i.e. selecting the best available agent).	Additionally, must handle "wake-ups" of on-hold interactions upon qualified events (for example, a new message from a customer or expiration of the async idle timeout).
Performance implications	Must be sized to conduct a certain number of active chat sessions.	Must take into account the presence of a large number of chat sessions, most of which are expected to be in a dormant (i.e. sleeping) state. See the sizing recommendations below.

Short polling vs. CometD-based chat

End-user (customer-facing) web or mobile chat applications communicate with Genesys Mobile Engagement (GMS) through one of two alternative APIs:

Short polling (REST API) - This API requires the chat application to send frequent polling requests (every 3 seconds) to keep the chat session transcript updated.
CometD-based - This API can use either WebSockets or long polling, which provides more prompt chat session transcript updates.

Although the CometD approach looks more efficient, because it reduces the overall load on the system by eliminating unproductive API calls, the following table compares other aspects that must be taken into account when selecting an approach for deployment and implementation:

Topic name	Short polling (REST API)	CometD-based
Performance	Consumes much more CPU and network traffic because it produces many unproductive API calls (according to various studies, 98.5% of polls are wasted on average). For example, if a message is posted into a chat session every 30 seconds, GMS and Chat Server must process about 10 unproductive API requests during that time. In practice, the load on these components is therefore determined by how many concurrent chat sessions an instance holds, independently of the scenario density (see the sizing recommendations below).	CPU and network traffic are used almost exclusively for the productive load.
Connections	Each API call is executed on a separate connection, which is closed immediately after the HTTP response is received. The number of concurrent connections (between GMS and Chat Server) depends on the number of concurrent chat sessions divided by the short polling interval (usually 3 seconds).	The number of concurrent connections approximately equals the number of concurrent chat sessions. Also, GMS imposes a limit of one connection per chat session at a time.
Complexity	Simple to implement and troubleshoot (because it is based on a pure REST API).	Troubleshooting may require knowledge of how the CometD protocol works.
Client library	There are numerous stable HTTP REST libraries.	A CometD client library is required, which increases the complexity of the chat web application.
Timeouts (in Chat Server)	The "flex-disconnect-timeout" configuration option is used to disconnect a chat participant that does not send any API requests for the specified amount of time.	The "flex-push-timeout" configuration option is used to disconnect a chat participant that GMS does not confirm as alive for the specified amount of time.
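The polling arithmetic from the Performance row can be checked with a short calculation. This is a minimal sketch: the function names are hypothetical, and the 3-second default only mirrors the usual polling interval mentioned above.

```python
def polls_per_message(message_interval_s: float, poll_interval_s: float = 3.0) -> float:
    """Number of short polling requests sent between two consecutive messages."""
    return message_interval_s / poll_interval_s


def background_requests_per_second(concurrent_sessions: int,
                                   poll_interval_s: float = 3.0) -> float:
    """Request rate produced by short polling alone, regardless of message traffic."""
    return concurrent_sessions / poll_interval_s


if __name__ == "__main__":
    # One message every 30 seconds with a 3-second polling interval.
    print(polls_per_message(30))
    # Background request rate for 3000 concurrent sessions.
    print(background_requests_per_second(3000))
```

The second function shows why the short polling load follows the number of concurrent sessions rather than the message rate.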

Sizing recommendations

At a high level, sizing guidelines depend on several factors:

Short polling (GMS Chat API Version 2) vs. CometD (GMS Chat API Version 2 with CometD) mode:
- Short polling produces a constant "background" (i.e. noise) load on GMS and Chat Server, and so consumes much more CPU and network resources. It is also important to tune the operating system TCP parameters to minimize the duration of the TIME_WAIT state, because each short polling request establishes and closes a TCP connection.
- The CometD approach requires keeping a long-lived connection to GMS for each chat session. Take into account that some load balancing solutions do not handle long-lived connections well and may close an inactive connection.
In async chat mode, dormant vs. active sessions:
- Active chat sessions usually constitute only a fraction of all ongoing async chat sessions. The number of such chat sessions should be around the number of active chat agents multiplied by the agent capacity (i.e. how many parallel chat sessions an agent can work on). These chat sessions consume almost all assigned resources (primarily CPU).
- Dormant chat sessions are those that do not have an active agent (or bot) in the chat session. Especially in short polling mode, the customer-facing application must minimize resource consumption by reducing (or completely eliminating) the periodic short polling requests.
Scenario density: the more often messages are sent by either chat participant, the more load the chat session generates. This factor is more applicable to the CometD approach, because in short polling mode the "noise" load largely masks the productive load (i.e. the load generated by messages).
In high availability, UCS vs. Cassandra:
- The UCS-based HA option requires less deployment and maintenance effort, and guarantees that the latest transcript version for ongoing chat sessions is present in the UCS DB. However, in large deployments, UCS and the UCS DB can be overloaded with intermediate transcript updates (generated by Chat Server after each chat session message).
- Cassandra offloads this unnecessary load from UCS. However, after an unplanned Chat Server switchover during an ongoing chat session, the chat transcript might never be propagated into the UCS record if the chat session is not restored on another Chat Server instance (for example, when the switchover coincides with a failure or closure of the customer-facing chat application).
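The active vs. dormant estimate described in the sizing factors above can be sketched as follows. The function and class names are hypothetical, and the input figures in the usage example (200 agents with a capacity of 4) are illustrative assumptions, not recommendations.

```python
from dataclasses import dataclass


@dataclass
class SessionEstimate:
    active: int
    dormant: int


def estimate_sessions(active_agents: int, agent_capacity: int,
                      total_sessions: int) -> SessionEstimate:
    """Split ongoing async sessions into active and dormant ones.

    Active sessions are approximated as agents * capacity, capped by the
    total number of ongoing sessions; the remainder is treated as dormant.
    """
    active = min(active_agents * agent_capacity, total_sessions)
    return SessionEstimate(active=active, dormant=total_sessions - active)


if __name__ == "__main__":
    # Hypothetical deployment: 200 agents, 4 parallel chats each, 8000 sessions.
    print(estimate_sessions(200, 4, 8000))
```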

Important

For async chat, especially in short polling mode, a customer-facing web or mobile chat application must noticeably reduce the frequency of short polling requests when it detects that a session was placed on hold (i.e. when the agent leaves the chat session).
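The effect of reducing the polling frequency for an on-hold session can be illustrated with a small simulation. This is only a sketch: the function name and the 60-second on-hold interval are assumptions chosen for illustration, and how the client detects the on-hold state is left out.

```python
def count_polls(duration_s: float, on_hold_from_s: float,
                active_interval_s: float = 3.0,
                hold_interval_s: float = 60.0) -> int:
    """Count polling requests for a session that is placed on hold at on_hold_from_s."""
    t = 0.0
    polls = 0
    while t < duration_s:
        polls += 1
        # Poll slowly once the session is on hold, quickly otherwise.
        t += hold_interval_s if t >= on_hold_from_s else active_interval_s
    return polls


if __name__ == "__main__":
    # 11-minute session, placed on hold after the first minute.
    print(count_polls(660, on_hold_from_s=60))
    # Same session if the client never slows down.
    print(count_polls(660, on_hold_from_s=660))
```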

Performance benchmarks

The following benchmarks were produced on hardware with an "Intel Xeon E7-8880L 2 GHz" processor, using a single instance of Chat Server (which consumed one CPU core on average) and two instances of GMS (each consuming less than one CPU core). The average chat session lasted about 35 seconds (with 3 messages from a customer and 3 messages from an agent), which is a very dense scenario. Each cell in the table contains the total number of concurrent chat sessions (with the active / dormant split in parentheses).

Mode	Active-to-dormant ratio 1:1	Active-to-dormant ratio 1:10	Active-to-dormant ratio 1:50
Short polling mode	1000 (all active)	8000 (800 / 7200)	35000 (700 / 34300)
CometD-based mode	1500 (all active)	11000 (1000 / 10000)	39000 (900 / 38100)

Important

Real chat session scenarios are usually much sparser, so the benchmark numbers represent an upper bound on the expected load.

Async chat workflow recommendations

In essence, for async chat, the workflow (i.e. the set of URS/ORS strategies) must, in addition to the regular chat workflow, handle chat sessions that an agent places on hold. A session that is placed on hold can be processed in the following ways:

Upon a qualified event (a message or a configured notice) from a customer in the chat session, Chat Server updates a special key/value pair in the corresponding interaction (which is handled by Interaction Server). As implemented in the provided "Chat Business Process Sample", the workflow can force the interaction into routing, which first makes several attempts to route it to the last handling agent and then routes it to any other agent.
Alternatively, the workflow can place the interaction back into the agent workbin if the last handling agent is not available at the moment. In this case, however, the workflow must implement an "escape" so that the interaction is not stuck forever if the last handling agent never comes back.
With a custom desktop, the workflow might not force the interaction into routing at all upon the qualified event. Instead, the agent desktop can subscribe directly to notifications from Interaction Server about changes in interaction properties (for interactions in the agent workbin) and notify the agent about activity in the chat session that was placed on hold.
In any case, the workflow must ensure that there are no stuck interactions (interactions that were placed on hold and stay there forever). In the provided "Chat Business Process Sample", this is implemented in the "async-chat-main-check-view" view of the "async-chat-main-queue" queue with the condition "GCTI_Chat_AsyncCheckAt < _current_time()", where "GCTI_Chat_AsyncCheckAt" is set by Chat Server to the sum of the "async-idle-alert" and "async-idle-close" configuration options of the Chat Server application.
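The logic of the stuck-interaction check and of the "last handling agent first" routing can be sketched in Python. This is not the sample's actual strategy code: the function names, the attempt limit of 3, and the treatment of "GCTI_Chat_AsyncCheckAt" as a timestamp in seconds comparable to the current time are all assumptions made for illustration.

```python
import time
from typing import Optional


def async_check_due(async_check_at: float, now: Optional[float] = None) -> bool:
    """Mirror of the view condition GCTI_Chat_AsyncCheckAt < _current_time().

    Assumes async_check_at is a timestamp in seconds (an assumption).
    """
    if now is None:
        now = time.time()
    return async_check_at < now


def choose_target(last_agent_available: bool, attempt: int,
                  max_last_agent_attempts: int = 3) -> str:
    """Prefer the last handling agent for the first few attempts, then any agent."""
    if attempt <= max_last_agent_attempts:
        return "last_agent" if last_agent_available else "retry"
    return "any_agent"


if __name__ == "__main__":
    print(async_check_due(100.0, now=200.0))
    print(choose_target(last_agent_available=False, attempt=4))
```

The fallback to "any_agent" plays the role of the "escape" mentioned above: an interaction cannot wait indefinitely for an agent who never returns.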

How disconnect and idle timeouts work

Chat Server configuration options allow you to specify timeouts that control two different functional areas:

The disconnect of a chat session participant, which leads to the removal of that participant from the chat session.
The absence of activity from participants in a chat session, which leads first to an alert notification and then to the closing of the chat session if no activity occurs after the alert was sent.

To describe each functional area, the following definitions are needed:

Protocol inactivity means the absence of any protocol requests from a client to Chat Server for a certain period of time. It is used to detect the disconnect of a chat participant. For example, if a client application sends short polling refresh requests, each request resets the protocol inactivity timeout even if it carries no useful payload, so the client is considered active at the protocol communication level.
Session inactivity means the absence of qualified events (messages and so on, see below) with full visibility from chat participants in the chat session. For example, if a customer and an agent do not send messages for a certain period of time, this is considered session inactivity. If, at the same time, the agent communicates with another agent invisibly to the customer (i.e. a consultation), this does not affect the decision, because that conversation is not fully visible to all chat session participants.
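The difference between the two definitions above can be made concrete with a small tracker. This is an illustrative model only, not Chat Server code; the class and method names are hypothetical.

```python
class SessionClock:
    """Tracks protocol inactivity per participant and session inactivity per session."""

    def __init__(self, start: float = 0.0):
        self.start = start
        self.last_request = {}            # participant -> time of last protocol request
        self.last_qualified_event = start

    def on_request(self, participant: str, now: float,
                   qualified: bool = False, visible_to_all: bool = True) -> None:
        # Any request, even an empty poll, resets protocol inactivity.
        self.last_request[participant] = now
        # Only fully visible qualified events reset session inactivity.
        if qualified and visible_to_all:
            self.last_qualified_event = now

    def protocol_inactivity(self, participant: str, now: float) -> float:
        return now - self.last_request.get(participant, self.start)

    def session_inactivity(self, now: float) -> float:
        return now - self.last_qualified_event


if __name__ == "__main__":
    clock = SessionClock()
    clock.on_request("customer", 3.0)                        # empty poll
    clock.on_request("agent", 8.0, qualified=True,
                     visible_to_all=False)                   # consultation message
    print(clock.protocol_inactivity("customer", 10.0))
    print(clock.session_inactivity(10.0))
```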

Important

A chat session stays alive in Chat Server as long as at least one participant is present. As soon as the last participant leaves, Chat Server closes the chat session permanently (it cannot be resumed).

Chat session participant disconnect and removal

In terms of connectivity, different types of chat session participants are processed differently, depending on how the application representing the participant communicates with GMS and/or Chat Server:

An agent (or bot) participant communicates with Chat Server over a persistent TCP network connection, so a disconnect leads to the immediate removal of the participant from the chat session.
A chat participant that represents a customer in the chat session (a so-called client participant in Chat Server terminology) can communicate with GMS in three different modes. Each mode uses different configuration options:
- Short polling (REST API). In this mode, Chat Server uses "flex-disconnect-timeout", which defines the maximum period of protocol inactivity. As soon as the timeout expires, Chat Server removes the participant from the chat session. If this is the last participant, Chat Server closes the chat session.
- CometD only. If the customer web application communicates with GMS over CometD, then immediately after Chat Server successfully creates the chat session, GMS subscribes to unsolicited notifications from Chat Server for this chat participant. This request forces Chat Server to disable "flex-disconnect-timeout" for that participant and instead use "flex-push-timeout" to periodically query GMS to confirm that the participant is still connected over CometD. When GMS sends a confirmation, Chat Server considers the chat participant alive. As soon as GMS detects a disconnect over CometD, it sends an "unsubscribe" request, which forces Chat Server to enable "flex-disconnect-timeout" until GMS sends a new subscribe request to Chat Server (upon client reconnection to GMS over CometD).
- CometD and short polling with a subscription for either mobile or custom-http push notifications. This mode operates almost exactly like "CometD only", except that GMS never sends an "unsubscribe" request to Chat Server (upon CometD disconnect), so Chat Server uses only "flex-push-timeout" to ping GMS. In this mode, "flex-disconnect-timeout" is activated only when a client chat participant is forcibly removed from the chat session by another chat participant (for example, an agent or bot).
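The three client modes above can be summarized as a small decision function that returns the Chat Server option governing a client participant. This is a simplified model for reasoning about the rules, not Chat Server logic; the enum, function, and parameter names are hypothetical.

```python
from enum import Enum


class ClientMode(Enum):
    SHORT_POLLING = "short polling"
    COMETD_ONLY = "CometD only"
    COMETD_WITH_PUSH = "CometD and short polling with push subscription"


def active_timeout(mode: ClientMode, cometd_connected: bool = True,
                   removed_by_participant: bool = False) -> str:
    """Return the option that controls disconnect detection for a client participant."""
    if mode is ClientMode.SHORT_POLLING:
        return "flex-disconnect-timeout"
    if mode is ClientMode.COMETD_ONLY:
        # GMS unsubscribes on CometD disconnect, which re-enables flex-disconnect-timeout.
        return "flex-push-timeout" if cometd_connected else "flex-disconnect-timeout"
    # With a push subscription, GMS never unsubscribes on CometD disconnect.
    return "flex-disconnect-timeout" if removed_by_participant else "flex-push-timeout"


if __name__ == "__main__":
    for mode in ClientMode:
        print(mode.value, "->", active_timeout(mode, cometd_connected=False))
```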

Important

When a Genesys agent desktop (WDE or WWE) receives the event indicating that a client left the chat session (for any reason), the agent desktop automatically sends a request to Chat Server to close the chat session. A custom desktop can implement this differently if needed, because Chat Server keeps the chat session alive until the last participant leaves.

Inactivity control and chat session closure

Chat session inactivity is defined as the absence of a qualified event in the chat session for a certain period of time (defined by timeouts in the configuration). A qualified event can be a message, a notice (as defined by "async-idle-notices" or "include-notices"), or a participant (for example, an agent) joining or leaving the chat session. Only events with full visibility (i.e. visible to all participants) are taken into account.

There are two complementary inactivity control configurations in Chat Server:

Generic chat configuration (applicable to both async and regular chat sessions).
- It is enabled
  - if the option "enable" in section [inactivity-control] is set to "true"
  - and only while both a customer and an agent are present in the chat session. As soon as the (last) agent leaves the chat session, Chat Server disables this configuration.
- After a certain period of inactivity (defined by option "timeout-alert"), Chat Server sends an alert notification (with the text defined in option "message-alert").
- If there is still no activity (for the period defined by option "timeout-alert2"), Chat Server sends a second alert (with the text defined in option "message-alert2").
- If there is still no activity (for the period defined by option "timeout-close"), Chat Server sends a "close" notification and immediately closes the chat session.
Async-only chat configuration (applicable only to async chat sessions).
- It is enabled from the very start of an async chat session. It is activated independently of the presence of an agent in the chat session (i.e. it is active even if only a customer is in the chat session).
- After a certain period of inactivity (defined by option "async-idle-alert"), Chat Server sends an alert notification (with the text defined in option "message-alert").
- If there is still no activity (for the period defined by option "async-idle-close"), Chat Server sends a "close" notification and immediately closes the chat session.

Chat Server resets the inactivity period after any qualified event occurs in the chat session. Both inactivity configurations can be active simultaneously.
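The two inactivity sequences can be modeled as functions that list the notifications already due after a given idle time. This sketch assumes that the periods run consecutively (each one starts when the previous notification is sent), which is how the steps above read; the function and parameter names are hypothetical, and all timeout values are in seconds.

```python
from typing import List


def generic_actions(idle_s: float, timeout_alert: float, timeout_alert2: float,
                    timeout_close: float, enabled: bool = True,
                    agent_present: bool = True) -> List[str]:
    """Notifications due under the generic [inactivity-control] configuration."""
    if not (enabled and agent_present):
        return []
    actions = []
    if idle_s >= timeout_alert:
        actions.append("alert")
    if idle_s >= timeout_alert + timeout_alert2:
        actions.append("alert2")
    if idle_s >= timeout_alert + timeout_alert2 + timeout_close:
        actions.append("close")
    return actions


def async_actions(idle_s: float, async_idle_alert: float,
                  async_idle_close: float) -> List[str]:
    """Notifications due under the async-only configuration (agent presence not required)."""
    actions = []
    if idle_s >= async_idle_alert:
        actions.append("alert")
    if idle_s >= async_idle_alert + async_idle_close:
        actions.append("close")
    return actions


if __name__ == "__main__":
    print(generic_actions(100, timeout_alert=60, timeout_alert2=30, timeout_close=30))
    print(async_actions(60, async_idle_alert=40, async_idle_close=20))
```

Because any qualified event resets the inactivity period, idle_s in this model is always measured from the most recent qualified event.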
