Flow Basics: Data Extraction

A core feature of the NLP engine is the ability to extract data from user input. Data can be captured in two ways: linearly (using slots) and contextually (using entities).

Slot filling

The easiest way to capture user input is by using Triggers: Any Text. By default, this captures any user input and stores it inside a param.

The following example demonstrates a linear flow where Flow captures a user's name.

Typed data

In addition to capturing text, Flow provides a way to capture data of a specific type (for example, e-mail addresses, dates, phone numbers, or a custom list of data).

Matching entity types do not require exact input from a user. For example, the sentences "my email is [email protected]" or "it's [email protected]" would match and extract [email protected].

As the above example illustrates, you can also combine capturing user input with other trigger types by simply branching them.
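
As a rough illustration of why typed matching does not need exact input, the following JavaScript sketch pulls an e-mail address out of a longer sentence. The function name extractEmail, the regular expression, and the address jane@example.com are assumptions made for illustration; this is not Flow's actual matcher, which may work quite differently.

```javascript
// Hypothetical helper: returns the first e-mail-like token found in free text.
// The pattern is deliberately simple and only meant to show the idea.
const EMAIL_PATTERN = /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/;

function extractEmail(input) {
  const found = input.match(EMAIL_PATTERN);
  return found ? found[0] : null; // null when the input holds no address
}

// Both phrasings yield the same extracted value.
console.log(extractEmail("my email is jane@example.com"));
console.log(extractEmail("it's jane@example.com"));
console.log(extractEmail("I have no address"));
```

The surrounding words are ignored, so the user is free to phrase the answer naturally while the param still receives only the address.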

Benefits

The main benefits of using Triggers: Any Text for slot filling are:

  • Does not require any training data
  • Extracts specific data more accurately

Contextual

For use cases without a linear flow, you can extract data contextually from user input by annotating entities.

For example, when a user sends "I want to fly from Amsterdam to San Francisco", we want to detect that Amsterdam is the place of departure and San Francisco the place of arrival.

Entities

Within the training view of any intent, you can mark entities by annotating them. To train the entity classifier, annotate lots of examples. When you annotate lots of cities, for example, the classifier learns to recognize cities.

This means it will also pick up on cities you didn't explicitly mark. Of course, you are not limited to cities; you could create a food entity, an animal entity, a movie entity, or something else entirely.
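
To make the idea of an annotated example concrete, here is a minimal JavaScript sketch of how a marked training sentence could be represented as text plus entity spans. The annotate helper and the field names (name, value, start, end) are hypothetical; Flow's internal training format is not described here.

```javascript
// Hypothetical representation of one annotated training example.
// Each entity records its name and the character span it covers.
function annotate(text, marks) {
  const entities = Object.entries(marks).map(([name, value]) => {
    const start = text.indexOf(value);
    if (start === -1) throw new Error(`"${value}" not found in example`);
    return { name, value, start, end: start + value.length };
  });
  return { text, entities };
}

const example = annotate("I want to fly from Amsterdam to San Francisco", {
  departure: "Amsterdam",
  arrival: "San Francisco",
});

console.log(example.entities);
```

Many examples of this shape, each marked with the same entity names, are what give a classifier enough signal to recognize cities it has not seen before.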

📘

Tip: Mark and name entities consistently

Make sure you always mark every example. Do not skip any, as this harms the classification process.

Benefits

Although it requires more work, contextual entity matching has its own benefits for advanced use cases:

  • Extract data non-linearly from any input
  • Extract multiple values contextually, such as departure and destination

Validation and formatting

You can use a variety of entity types, enabling validation and data transformation. For example, the system will convert any entity containing tomorrow to a UTC date-time format if it's a date entity type.
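
The transformation described above can be sketched in JavaScript as follows. This is a simplified, assumed resolver that handles only three relative words; the names resolveDate and OFFSETS_IN_DAYS are hypothetical, and the real date entity type understands far more phrases.

```javascript
// Hypothetical resolver for a few relative date words (not Flow's parser).
const DAY_MS = 24 * 60 * 60 * 1000;
const OFFSETS_IN_DAYS = new Map([
  ["today", 0],
  ["tomorrow", 1],
  ["yesterday", -1],
]);

function resolveDate(match, now = new Date()) {
  const days = OFFSETS_IN_DAYS.get(match.trim().toLowerCase());
  if (days === undefined) return null; // unknown phrase: nothing to convert
  // toISOString always renders the instant in UTC.
  return new Date(now.getTime() + days * DAY_MS).toISOString();
}

const reference = new Date("2024-03-01T09:30:00Z");
console.log(resolveDate("Tomorrow", reference)); // "2024-03-02T09:30:00.000Z"
```

Passing the reference time explicitly keeps the conversion predictable, since the same word resolves to a different date depending on when the message arrives.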

System entity types

We support the following system entity types:

Name | Example
Text | Anything from names to cities. This is the most used entity type. The AI will match anything you train it on using your examples. If you'd like to match music artists, train the AI with examples like Madonna, Michael Jackson, Rush, etc.
Date | Tomorrow, next week, 1st of July
Time | 12am, 23:59, now, tomorrow at 8am
Number | 2453, twenty one
Email | [email protected]
URL | examplecorp.com
Distance | 200 meters, 12 miles, 1.5cm
Money | 42€, $99

Custom entity type

Flow currently supports one custom type: List entity type.

List entity types work much like the Text entity type, with one important difference: lists give boundaries to what the AI will match.

Let's say we take the above example of a departure and destination city. Perhaps you only want to offer a select number of cities a person can travel to. You can limit these options by creating a custom list entity type named city. Within this entity type, you provide only the city names that are valid to travel to.

When the AI detects an entity of this custom entity type, it checks whether the value matches the list. The AI can find an entity that is not valid for the entity type. For example, "I want to travel to Washington" could be detected in the above example, but since Washington is not in our list, it is immediately dropped.

📘

Tip: Matching is fuzzy

You do not need to add typos as synonyms; matching is case-insensitive and works with grammar errors.

Exact matching

By default, list entity types perform an exact match. This means a list entity only matches an extracted value if that value is actually present in the list. When you turn off exact matching, custom entities work the same way as the regular Text entity type.
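
The difference between the two modes can be sketched in JavaScript as below. Everything here is an assumption made for illustration: the cityList structure, the synonyms field, the resolveListEntity function, and its exact option are hypothetical. The sketch only covers case-insensitive lookup, not the typo-tolerant matching mentioned in the tip above.

```javascript
// Hypothetical list entity: each item has an id, a value, and synonyms.
const cityList = [
  { id: "NY", value: "New York", synonyms: ["NYC"] },
  { id: "AMS", value: "Amsterdam", synonyms: [] },
];

function resolveListEntity(extracted, list, { exact = true } = {}) {
  const needle = extracted.trim().toLowerCase();
  const item = list.find((entry) =>
    [entry.value, ...entry.synonyms].some(
      (name) => name.toLowerCase() === needle
    )
  );
  if (item) return { match: extracted, value: item.value, id: item.id };
  // Exact matching on: values outside the list are dropped.
  // Exact matching off: the raw text is kept, much like a Text entity.
  return exact ? null : { match: extracted, value: extracted };
}

console.log(resolveListEntity("NYc", cityList));
console.log(resolveListEntity("Washington", cityList));
console.log(resolveListEntity("Washington", cityList, { exact: false }));
```

With exact matching on, "Washington" resolves to null because it is not in the list; with it off, the raw text is returned as the value.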

Using extracted data

Any extracted data is present within params. Those params can be used within Params, Overview of Webhook Actions and String templates.

The following example shows a try-out window that displays extracted data within the right sidebar.

Params always come in the form of an array, even when only one match is found.

The following shows an example of an extracted entity within a param named destination.

Example payload inside cloud code with a destination match

{
    params: {
        destination: [{
            match: "NYc",
            value: "New York",
            id: "NY"        
        }]
    }
}

Example payload inside cloud code with 2 destination matches

{  
    params: {  
        destination: [{  
            match: "NYc",  
            value: "New York",  
            id: "NY"  
        },{  
            match: "Amsterdam",  
            value: "Amsterdam",  
            id: "AMS"  
        }]  
    }  
}
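
Because a param is always an array, code that consumes it can treat the single-match and multi-match cases the same way. The following JavaScript sketch shows one way to read the resolved values. The helper name paramValues is hypothetical, and the payload object is a hand-written copy of the example above rather than output captured from the platform.

```javascript
// Hypothetical helper: collects the resolved values of one param.
// Returns an empty array when the param was not extracted at all.
function paramValues(payload, name) {
  const matches = (payload.params && payload.params[name]) || [];
  return matches.map((entry) => entry.value);
}

const payload = {
  params: {
    destination: [
      { match: "NYc", value: "New York", id: "NY" },
      { match: "Amsterdam", value: "Amsterdam", id: "AMS" },
    ],
  },
};

console.log(paramValues(payload, "destination")); // values: New York, Amsterdam
console.log(paramValues(payload, "departure"));   // empty array
```

Reading value rather than match gives the normalized form ("New York") instead of whatever the user typed ("NYc").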

Filling user profile attributes

User profile attributes can be set using params by giving them a system-defined name.

You can fill the following profile data:

  • user.name
  • user.profile.fullName
  • user.profile.firstName
  • user.profile.lastName
  • user.profile.gender (M/F/U)
  • user.profile.locale
  • user.profile.timezone (offset from UTC, for example -1)
  • user.profile.email
  • user.profile.picture (url)

Read more

Check out these articles about use cases for extracted data:

Capture and validate an Address
Working with dates
Verify user input