In natural language processing, there is a difference between intents and entities. An intent captures the overall meaning of a sentence: what the speaker wants to achieve. An entity, on the other hand, is a single piece of information within that sentence. An intent may need one or more entities to do its work. Consider the following sentence:

“I want to book a flight tomorrow to New York”.

The intent here is “booking” – someone planning to make a trip. The “booking” intent here has two entities, “tomorrow”, which is a point in time, and “New York”, which is a location.

Intent classification

As stated, an intent captures the overall meaning of a sentence. For instance, the sentences “hello!”, “Hi there”, and “How are you?” can all be seen as utterances of the same intent, namely a greeting.

In the DSL, such an intent is defined by providing example sentences. The dialog matcher then parses each incoming sentence and tries to classify it as one of the defined intents.

@greeting intent(learn: ["hi", "hello there", "howdy", "how are you doing"])

dialog trigger: @greeting do
  say "Hello to you too!"
  say intent
end

Internally, this uses a sentence classifier, which compares the uttered sentence against each of the example sentences and takes the best match. The classifier is trained on a standard Wikipedia dataset, so it tolerates synonyms and variations in the input sentence.
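The "compare against every example and take the best match" idea can be sketched outside the platform in a few lines of Python. This is only an illustration, not the platform's classifier: it swaps the trained model for plain character-level similarity from the standard library, and the `goodbye` intent, the `classify` function, and the 0.6 threshold are all assumptions made for the example.

```python
from difflib import SequenceMatcher

# Hypothetical training data: intent label -> example sentences.
EXAMPLES = {
    "greeting": ["hi", "hello there", "howdy", "how are you doing"],
    "goodbye": ["bye", "see you later", "goodbye"],
}


def similarity(a: str, b: str) -> float:
    """Character-level similarity in [0, 1]; a stand-in for a trained model."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def classify(sentence: str, threshold: float = 0.6):
    """Return the label of the best-matching example, or None if too weak."""
    best_label, best_score = None, 0.0
    for label, examples in EXAMPLES.items():
        for example in examples:
            score = similarity(sentence, example)
            if score > best_score:
                best_label, best_score = label, score
    # Reject weak matches so unrelated input does not trigger an intent.
    return best_label if best_score >= threshold else None
```

Unlike a trained classifier, string similarity has no notion of synonyms, so this sketch only shows the matching loop and the role of a confidence threshold.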

When the intent has matched, the intent variable will be filled with the intent’s label, which is greeting in the above example.

The intent classifier is language-specific. English is the default language, but a different language can be specified by passing the locale: option to intent().

Using an external list of examples

Maintaining all example sentences for an intent in the DSL code can be a tedious task. Luckily it is possible to define a YAML script in which you can list all the example sentences.

For instance, when you have a YAML script called sentences, with the following contents:

greeting:
- Hello
- Hi there
- How are you doing?

You can then reference this set of sentences in your intent definition, as follows:

@greeting intent(learn: @sentences.greeting)

This way, instead of editing the code, you can just update the YAML file to add new example sentences for various intents.

Entity extraction

The entity builtin defines a matcher which can be used for fuzzy matching in user input and in message triggers. The result is an extracted “entity”: a part of the string, possibly normalized into a generic format.

@yes entity(match: "yes|yep|sure", label: "Yes", return: :yes)
@no  entity(match: "no|nope|nah", label: "No", return: :no)

@yes shows the label “Yes” in the quick replies and returns the atom :yes on match. @no shows the label “No” in the quick replies and returns the atom :no on match.
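As a rough illustration of how a pattern, a label, and a return value fit together, here is a small Python sketch. It is not the platform's implementation: the `ENTITIES` table and both helper functions are hypothetical, and matching on whole words, case-insensitively, is an assumption.

```python
import re

# Hypothetical mirror of the @yes and @no definitions:
# (pattern, quick-reply label, normalized return value).
ENTITIES = [
    (re.compile(r"\b(?:yes|yep|sure)\b", re.IGNORECASE), "Yes", "yes"),
    (re.compile(r"\b(?:no|nope|nah)\b", re.IGNORECASE), "No", "no"),
]


def match_entity(text: str):
    """Return the normalized value of the first entity found in text, else None."""
    for pattern, _label, value in ENTITIES:
        if pattern.search(text):
            return value
    return None


def quick_replies():
    """Labels that would be offered to the user as buttons."""
    return [label for _pattern, label, _value in ENTITIES]
```

The point of the normalized value is that the dialog logic can branch on a single value regardless of which variant ("yep", "sure", ...) the user typed.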

@postcode entity(match: "[0-9]{4}\\s*[a-z]{2}")

This creates an entity extractor that matches on a regular expression.

Entity usage

Entities can be used in the ask statement to directly match something and return the value. Pass one or more entities in the expecting: option of the ask statement; the ask then blocks until one of the given entities has matched.

@number entity(match: "[1-9][0-9]*")
dialog main do
  age = ask "What is your age?", expecting: @number
end
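The blocking behaviour of ask with expecting: can be pictured as a loop that keeps consuming user replies until one of them matches. The Python sketch below is a simplified model under assumptions: the `ask` function here is a hypothetical stand-in, a list plays the role of the incoming message stream, and returning the matched text as a plain string is a simplification.

```python
import re
from typing import Iterable, Optional

NUMBER = re.compile(r"[1-9][0-9]*")  # same pattern as the @number entity


def ask(prompt: str, replies: Iterable[str], expecting: re.Pattern) -> Optional[str]:
    """Consume replies until one matches `expecting`; return the matched text.

    `prompt` is what the bot would say first; `replies` stands in for the
    stream of incoming user messages.
    """
    for reply in replies:
        found = expecting.search(reply)
        if found:
            return found.group(0)
        # No match: a real bot would typically re-prompt and keep waiting.
    return None  # the stream ended without a valid answer


# The first reply contains no number, so it is skipped.
age = ask("What is your age?", ["not telling", "ok, I am 42"], NUMBER)
```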

Entities can also be used as dialog triggers:

dialog(trigger: @email) do
  say "Thank you for your email! You entered: #{entity.value}"
end

Lastly, entities can be extracted from arbitrary strings using the extract_entity() function:

@postcode entity(match: "[0-9]{4}\\s*[a-z]{2}")

dialog extract_entity do
  sentence = "My postcode is 1061BM now"
  entity = extract_entity(sentence, @postcode)
  say entity.value
end
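To see which part of the sample sentence the postcode pattern picks out, the same regular expression can be tried in Python. This is an illustrative sketch only: `extract_first` is a hypothetical helper, not the platform's extract_entity(), and case-insensitive matching is an assumption made here so that the upper-case "BM" in the sample satisfies the lower-case `[a-z]` class.

```python
import re

# Same pattern as the @postcode entity. Case-insensitive matching is an
# assumption made so that the upper-case "BM" in the sample matches.
POSTCODE = re.compile(r"[0-9]{4}\s*[a-z]{2}", re.IGNORECASE)


def extract_first(sentence: str, pattern: re.Pattern):
    """Return the first substring of sentence matching pattern, or None."""
    found = pattern.search(sentence)
    return found.group(0) if found else None


value = extract_first("My postcode is 1061BM now", POSTCODE)
```

Because of the `\s*` in the pattern, a postcode written with a space between the digits and the letters is matched as well.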

duckling

The duckling option to entity() specifies that the integrated Duckling library will be used for the matching of the message.

Duckling supports extraction of various entities like phone numbers, email addresses, dates, et cetera.

@email entity(duckling: "email")
@time  entity(duckling: "time", locale: "nl_NL", timezone: "UTC")

Duckling matchers are sensitive to locale and time zone. By default, the locale is en_US and the time zone is Europe/Amsterdam.

Duckling extraction supports the following entities:

Matchers defined as entity() attributes can be used both as dialog triggers and in ask:

@email entity(match: "[a-zA-Z0-9_.+-]+@[a-zA-Z0-9-]+\.[a-zA-Z0-9-.]+")
# alternatively use duckling:
# @email entity(duckling: "email")

dialog(trigger: @email) do
  say "Thank you for your email! You entered: #{entity.value}"
end

dialog ask do
  entity = ask "What is your email?", expecting: @email
  say entity.value
end

The fields of the entity variable that Duckling returns are: