

Botsquad bots can work as interactive, smart IVR systems on traditional phone lines. Contact us for more information about this.

The dial statement

By using dial, you can forward the current phone call to a new destination.

The phone number given to dial must be a full E.164 phone number, i.e. including the country code and the + sign.

dialog main do
  say "Please hold."
  dial "+31201234567"
end

If the called party does not pick up, the dialog continues where it left off. The event.payload variable then contains a string describing why the call was not connected:

dialog main do
  say "Please hold."
  dial "+31201234567"

  branch event.payload do
  "BUSY" ->
    say "The line is busy."

  "NOANSWER" ->
    say "The caller did not answer."

  true ->
    say "We are unable to connect you through due to an unknown error."
  end
end

The following call denial reasons are supported:

event.payload   Description
CHANUNAVAIL     Channel unavailable (for example in sip.conf, when using qualify=, the SIP chan is unavailable)
BUSY            Returned busy
NOANSWER        No answer (i.e. SIP 480 or 604 response)
ANSWER          Call was answered
CANCEL          Call attempt cancelled (i.e. the user hung up before the call connected)
DONTCALL        Privacy manager don’t call
CONGESTION      Congestion, or anything else (some other error setting up the call)

Speech adaptation

The phone backend uses Google Speech to Text for recognizing a user's voice.

As such, it is possible to hint the recognizer about certain words, phrases or entities that should be recognized better. This process is called speech adaptation.

The adapter takes hints that are extracted from the expecting: part of ask. Certain references to entities that are contained in the expecting clause are automatically converted to their corresponding class token:

Bubblescript entity   Class token
[amount_of_money]     $MONEY
[phone_number]        $FULLPHONENUM
[date]                $FULLDATE
[time]                $TIME

So when you have a script like this:

dialog main do
  ask "When is your birthdate?", expecting: entity(match: "[time]")
end

The phone adapter will automatically add $TIME to its speech context.

Other entities, plain text and labels are not automatically used as speech hints. For those, you need to define explicit speech_hints, as described in the next section.
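The conversion described above can be pictured with a small Python sketch. It is only an illustration of the table, not the platform's actual implementation: the function name class_tokens_for and the regular expression used to find [entity] references are assumptions.

```python
import re
from typing import List

# Mapping copied from the table above; the rest of this sketch is illustrative.
ENTITY_CLASS_TOKENS = {
    "amount_of_money": "$MONEY",
    "phone_number": "$FULLPHONENUM",
    "date": "$FULLDATE",
    "time": "$TIME",
}


def class_tokens_for(match_pattern: str) -> List[str]:
    """Return the class tokens for the [entity] references in a match pattern.

    Entities without a known class token are skipped, mirroring the rule that
    only the listed entities are converted automatically.
    """
    tokens: List[str] = []
    for name in re.findall(r"\[([a-z_]+)\]", match_pattern):
        token = ENTITY_CLASS_TOKENS.get(name)
        if token is not None and token not in tokens:
            tokens.append(token)
    return tokens
```

Under these assumptions, class_tokens_for("[time]") yields only the $TIME token, while a pattern that references an unlisted entity yields an empty list and would need explicit hints.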

speech_hints: option to ask

It is also possible to override these speech hints by providing a speech_hints: option to ask:

dialog main do
  ask "Where do you live?", speech_hints: ["i live at $POSTALCODE"]
end

In this case, you would use the class tokens directly as part of the sample phrases.

DTMF and quick replies

Whenever an ask with quick_replies is encountered, these quick replies are automatically mapped onto the DTMF numbers.

So in the following case:

dialog main do
  ask "Do you want to continue?", expecting: ["Yes", "No"]
end

Pressing 1 would automatically select "Yes" as the answer.

Note that no hint ("press 1 for yes") is spoken, nor are the actual options (the quick replies) read aloud; you have to implement this yourself in Bubblescript.
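As a rough illustration of the mapping, the following Python sketch resolves a pressed key to a quick reply. The text above only states that pressing 1 selects the first option; that the remaining options follow in order (2 for the second, and so on) and that other keys select nothing are assumptions, as is the function name.

```python
from typing import List, Optional


def dtmf_to_quick_reply(digit: str, quick_replies: List[str]) -> Optional[str]:
    """Map a single DTMF key onto a quick reply, counting from 1.

    Returns None for keys that do not correspond to an option
    (assumed behaviour, not confirmed by the platform docs).
    """
    if len(digit) != 1 or digit not in "123456789":
        return None
    index = int(digit) - 1
    return quick_replies[index] if index < len(quick_replies) else None
```

With the example above, dtmf_to_quick_reply("1", ["Yes", "No"]) gives "Yes".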

Automatic replacements

In some cases the text-to-speech engine consistently mispronounces certain names or words. The phone adapter can be configured to automatically replace strings in the SSML output. Note that this replacement is done before any Speech Markdown processing.

To do this, create a voice_config YAML file with the following contents:

  $i18n: true
    - { from: "Hi", to: "Hai" }
    - { from: "Arjan", to: '(Arjan)[sub:"Aryuhn"]' }

This would replace all occurrences of the word Hi with the word Hai and, more realistically, annotate the name Arjan with a Speech Markdown sub tag so that it is pronounced correctly.
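To make the effect of such from/to rules concrete, here is a minimal Python sketch that applies them to a piece of text. It is not the adapter's implementation: matching on whole words, case-sensitively, and applying the rules in list order are assumptions made for the example.

```python
import re
from typing import Dict, List

# The two rules from the example configuration above.
REPLACEMENTS = [
    {"from": "Hi", "to": "Hai"},
    {"from": "Arjan", "to": '(Arjan)[sub:"Aryuhn"]'},
]


def apply_replacements(text: str, replacements: List[Dict[str, str]]) -> str:
    """Apply each from/to rule in order, matching whole words only (assumed)."""
    for rule in replacements:
        pattern = r"\b" + re.escape(rule["from"]) + r"\b"
        # A function is used as the replacement so that backslashes or group
        # references in the "to" string are inserted literally.
        text = re.sub(pattern, lambda _match, to=rule["to"]: to, text)
    return text
```

Because the replacement runs before Speech Markdown processing, the sub annotation produced by the second rule would still be interpreted afterwards.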

Turn taking

The bot automatically starts recognizing speech as soon as it finishes its sentence.

Turn taking on voice is quite tricky and there are several timeouts involved. When speech hints are given, the timeouts are shortened because a speech hint in the prompt means that we expect a closed answer. The following table shows how the timings are configured:

When              Idle timeout   Active timeout
Open question     8 seconds      1 second
Closed question   4 seconds      300 ms
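The table can be read as a simple selection rule, sketched below in Python. The values come from the table; the names Timeouts and timeouts_for are hypothetical, and treating any non-empty list of speech hints as a closed question is an assumption based on the paragraph above.

```python
from typing import NamedTuple, Sequence


class Timeouts(NamedTuple):
    idle_seconds: float    # "Idle timeout" column
    active_seconds: float  # "Active timeout" column


OPEN_QUESTION = Timeouts(idle_seconds=8.0, active_seconds=1.0)
CLOSED_QUESTION = Timeouts(idle_seconds=4.0, active_seconds=0.3)


def timeouts_for(speech_hints: Sequence[str]) -> Timeouts:
    """Pick the shorter timeouts when speech hints are present (closed question)."""
    return CLOSED_QUESTION if speech_hints else OPEN_QUESTION
```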

To make it clearer that the bot expects the user to say something, the user can be played short "beep" tones: a low beep before the user is supposed to speak, and a high one once the bot has finished speech recognition. This setting can be enabled in the settings of the phone connector in the Botsquad studio.

Webhook API

After connecting a bot to the phone channel in the settings in the studio, a new webhook endpoint is available for your bot which should be called from the PBX voice adapter.

The endpoint is called:


Its URL parameters are:

  • identifier - the bot ID (when "connect through bot ID" is chosen in the settings), or otherwise <pool>-<number>.
  • callerid - The E.164-formatted phone number of the person who is calling (no leading + sign). In the case of an anonymous call, it should be a random string starting with x_.
  • channelid - A unique string identifying the current call. The channel ID must stay the same for the duration of the call.

The webhook POST body is a JSON-formatted payload, identical to the Chat REST API chat input payload.
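A PBX-side adapter could assemble the three URL parameters along these lines. This is a sketch under stated assumptions: the base URL is not given here and is passed in as a hypothetical base_url argument, and both helper names are made up for the example.

```python
import secrets
from typing import Optional
from urllib.parse import quote


def normalize_callerid(phone_number: Optional[str]) -> str:
    """Return the caller ID in the form described above.

    E.164 numbers lose their leading + sign; anonymous calls (no number)
    get a random string starting with x_.
    """
    if not phone_number:
        return "x_" + secrets.token_hex(8)
    return phone_number.lstrip("+")


def webhook_url(base_url: str, identifier: str, callerid: str, channelid: str) -> str:
    """Join the three URL parameters onto a base URL (base_url is hypothetical)."""
    parts = [quote(part, safe="") for part in (identifier, callerid, channelid)]
    return base_url.rstrip("/") + "/" + "/".join(parts)
```

The adapter should compute the channel ID and the caller ID once per call and reuse them for every request, since the channel ID must stay the same for the duration of the call.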

Webhook API request payload examples

Most chatbot conversations start with an initial, empty, request:


Request which includes recognized speech:

{
  "action": {
    "type": "message",
    "payload": {
      "text": "Hello my name is John",
      "input_type": "voice"
    }
  }
}

Request which includes a DTMF sequence:

{
  "action": {
    "type": "message",
    "payload": {
      "text": "1",
      "input_type": "dtmf"
    }
  }
}

The following requests send events instead of user messages:

The $no_input event is used to indicate to the bot that no speech was captured while it was listening:

{
  "action": {
    "type": "event",
    "payload": {
      "name": "$no_input"
    }
  }
}

The $dial_return event is used to indicate the result of a previous dial command, e.g. when the called party does not respond or is busy:

{
  "action": {
    "type": "event",
    "payload": {
      "name": "$dial_return",
      "payload": "BUSY"
    }
  }
}

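The request bodies shown above share one shape, which the following Python sketch builds as plain dictionaries ready to be serialized with json.dumps. The helper names message_request and event_request are hypothetical; the field names and values are taken from the examples.

```python
import json
from typing import Any, Dict, Optional


def message_request(text: str, input_type: str) -> Dict[str, Any]:
    """Build a user-message body; input_type is "voice" or "dtmf" as in the examples."""
    if input_type not in ("voice", "dtmf"):
        raise ValueError("input_type must be 'voice' or 'dtmf'")
    return {
        "action": {
            "type": "message",
            "payload": {"text": text, "input_type": input_type},
        }
    }


def event_request(name: str, payload: Optional[str] = None) -> Dict[str, Any]:
    """Build an event body such as $no_input or $dial_return."""
    body: Dict[str, Any] = {"name": name}
    if payload is not None:
        body["payload"] = payload
    return {"action": {"type": "event", "payload": body}}


if __name__ == "__main__":
    print(json.dumps(event_request("$dial_return", "BUSY"), indent=2))
```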
Webhook API response payload

The <identifier>/<callerid>/<channelid> endpoint has the following return value:

  // whether or not this is the final response
  "is_final": false,

  // the locale in which the bot speaks
  "locale": "nl",

  // the text that the bot says, as SSML.
  "ssml": "<speak><s>Hello...</s></speak>",

  // the configured text-to-speech voice (optional)
  "voice": {
    "type": "google",
    "name": "nl-NL-Standard-A",
    "gender": "FEMALE",
    "locale": "nl"
  },

  // any speech context elements to be passed into Google's speech-to-text `speechContexts` object (optional)
  "speechContexts": [{
    "phrases": ["I want a cookie"]
  }],

  // if dial is set, redirect this call to the given number (optional)
  "dial": "+31641322222",

  // when beep is true, play short beeps around user speech-to-text capturing
  "beep": true,

  // when we request DTMF input
  "get_dtmf": {
    // the maximum nr of digits
    "num_digits": 10,
    // send when pressing this char (result is sent excluding this char)
    "finish_on_key": "#",