Slot Example Sentence
Rhasspy is designed to recognize voice commands in a template language. These commands are categorized by intent, and may contain slots or named entities, such as the color and name of a light.
- Intent Recognition
- Slots
- Speech Recognition
sentences.ini
Voice commands stored in an ini file whose 'sections' are intents and 'values' are sentence templates.
Basic Syntax
To get started, simply list your intents (surrounded by brackets) and the possible ways of invoking them below:
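A minimal sketch of such a file. The intent name TestIntent1 and its first sentence come from the text of this section; the second intent and the other sentences are hypothetical placeholders.

```ini
# Each [section] is an intent; each line under it is a sentence template.
[TestIntent1]
this is a sentence
this is another sentence

# Hypothetical second intent, for illustration only
[TestIntent2]
this is a sentence for a different intent
```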
If you say 'this is a sentence' after hitting the Train
button, it will generate a TestIntent1
.
Groups
You can group multiple words together using (parentheses)
like:
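As an illustration, the template below groups three words so they can be treated as a unit. The intent name and wording are borrowed from examples elsewhere in this document and are assumptions here.

```ini
[ChangeLightState]
# "living room lamp" is one group (sequence)
turn on the (living room lamp)
```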
Groups (sometimes called sequences) can be tagged and substituted like single words. They may also contain alternatives.
Optional Words
Within a sentence template, you can specify optional word(s) by surrounding them [with brackets]
. For example:
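The template below is a sketch reconstructed from the four matches listed in this section; the intent name is a hypothetical placeholder.

```ini
[ExampleIntent]
# "an" and "with" are optional
[an] example sentence [with] some optional words
```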
will match:
an example sentence with some optional words
example sentence with some optional words
an example sentence some optional words
example sentence some optional words
Alternatives
A set of items where only one is matched at a time is (specified | like | this)
. For N items, there will be N matched sentences (unless you nest optional words, etc.). The template:
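A sketch of such a template, reconstructed from the three matches listed in this section (the intent name is taken from the Tags section and is an assumption here):

```ini
[SetLightColor]
# Exactly one of the three colors is matched
set the light to (red | green | blue)
```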
will match:
set the light to red
set the light to green
set the light to blue
Tags
Named entities are marked in your sentence templates with {tags}
. The name of the {entity}
is between the curly braces, while the (value of the){entity}
comes immediately before:
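A sketch consistent with the SetLightColor intent and the color values described in this section:

```ini
[SetLightColor]
# The matched alternative becomes the value of the "color" entity
set the light to (red | green | blue){color}
```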
With the {color}
tag attached to (red | green | blue)
, Rhasspy will match:
set the light to [red](color)
set the light to [green](color)
set the light to [blue](color)
When the SetLightColor
intent is recognized, the JSON event will contain a color
property whose value is either 'red', 'green' or 'blue'.
Tag Synonyms
Tag/named entity values can be substituted using the colon (:
) inside the {curly:braces}
like:
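A sketch using the values mentioned in this section ('living room lamp' and 'light_1'); the intent name is an assumption borrowed from a later example.

```ini
[ChangeLightState]
# Spoken: "living room lamp"; value of "name" in the intent: light_1
turn on the (living room lamp){name:light_1}
```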
Now the name
property of the intent JSON event will contain 'light_1' instead of 'living room lamp'.
Substitutions
The colon (:
) is used to put something different than what's spoken into the recognized intent JSON. The left-hand side of the :
is what Rhasspy expects to hear, while the right-hand side is what gets put into the intent:
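A sketch of a group substitution, using the phrase and replacement described in this section (the intent name is an assumption):

```ini
[ChangeLightState]
# Heard: "living room lamp"; written to the intent: light_1
turn on the (living room lamp):light_1
```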
In this example, the spoken phrase 'living room lamp' will be replaced by 'light_1' in the recognized intent. Substitutions work for single words, groups, alternatives, and tags:
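The four forms can be sketched side by side. The room names and the light_2 identifier are hypothetical.

```ini
[ChangeLightState]
# Single word: only "lamp" is replaced
turn on the living room lamp:light_1
# Group: the whole phrase is replaced
turn on the (living room lamp):light_1
# Alternatives: either choice is replaced by light_2
turn on the (den | playroom):light_2
# Tag: the entity value is replaced
turn on the (living room lamp){name:light_1}
```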
See tag synonyms for more details on tag substitution.
Lastly, you can leave the left-hand or right-hand side (or both!) of the :
empty:
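A hypothetical sentence showing both cases, using the words "dropped" and "added" that this section refers to:

```ini
[ExampleIntent]
# "dropped" is spoken but omitted from the intent;
# "added" is never spoken but is inserted into the intent
this is a dropped: sentence with an :added word
```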
When the right-hand side is empty (dropped:
), the spoken word will not appear in the intent. An empty left-hand side (:added
) means the word is not spoken, but will appear in the intent.
The right-hand side of a substitution can also be a group:
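As a hypothetical illustration (the word "token" and its replacement words are placeholders):

```ini
[ExampleIntent]
# Spoken "token" is replaced by three words in the intent
token:(one two three)
```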
which is equivalent to:
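The same hypothetical substitution written out word by word, with an empty left-hand side for the extra words:

```ini
[ExampleIntent]
# "token" becomes "one"; "two" and "three" are added without being spoken
token:one :two :three
```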
Leaving both sides empty does nothing unless you attach a tag to it. This allows you to embed a named entity in a voice command without matching specific words:
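A sketch of this technique. The domain entity and its value light come from this section; the rest of the sentence is an assumption.

```ini
[SetLightColor]
# (:) matches nothing and outputs nothing, but carries the tag domain=light
set the light to (red | green | blue){color} (:){domain:light}
```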
An intent from the example above will contain a domain
entity whose value is light
.
Rules
Rules allow you to reuse parts of your sentence templates. They're defined by rule_name = ...
alongside other sentences and referenced by <rule_name>
. For example:
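A sketch using the colors rule and SetLightColor intent named later in this section:

```ini
[SetLightColor]
# Rule definition, then a sentence that references it
colors = (red | green | blue)
set the light to <colors>
```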
which is equivalent to:
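The expanded form, sketched with the same assumed colors:

```ini
[SetLightColor]
# The rule body written inline
set the light to (red | green | blue)
```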
You can share rules across intents by referencing them as <IntentName.rule_name>
like:
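A sketch with the two intents this section describes. The wording of the GetLightColor sentence is a hypothetical placeholder.

```ini
[SetLightColor]
colors = (red | green | blue)
set the light to <colors>

[GetLightColor]
# Reuses the rule defined in the SetLightColor intent
is the light <SetLightColor.colors>
```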
The second intent (GetLightColor
) references the colors
rule from SetLightColor
. Rule references without a dot must exist in the current intent.
Number Ranges
Rhasspy supports using number literals (75
) and number ranges (1..10
) directly in your sentence templates. During training, the num2words package is used to generate words that the speech recognizer can handle ('seventy five'). For example:
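A sketch of a range tagged as the brightness entity of the SetBrightness intent; the bounds 0 and 100 are assumptions.

```ini
[SetBrightness]
# Matches "zero" through "one hundred"
set brightness to (0..100){brightness}
```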
The brightness
property of the recognized SetBrightness
intent will automatically be converted to an integer for you. You can optionally add a step to the integer range:
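A sketch of stepped ranges, assuming the step is written after a comma following the upper bound. The rule names and bounds are hypothetical.

```ini
[SetBrightness]
# Assumed syntax: lower..upper,step
evens = 0..100,2
odds = 1..100,2
set brightness to (<evens> | <odds>){brightness}
```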
Under the hood, number ranges are actually references to the rhasspy/number
slot program. You can override this behavior by creating your slot_programs/rhasspy/number
program or disable it entirely by setting intent.replace_numbers
to false
in your profile.
Slots Lists
Large alternatives can become unwieldy quickly. For example, say you have a list of movie names:
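For instance, an inline list might look like the sketch below. 'Primer' and 'Moon' are named later in this section; the remaining titles are placeholders.

```ini
[PlayMovie]
# This rule grows with every movie you add
movies = (Primer | Moon | Chronicle | Timecrimes)
play (<movies>){movie_name}
```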
Rather than keep this list in sentences.ini
, you may put each movie name on a separate line in a file named slots/movies
(no file extension) and reference it as $movies
. Rhasspy automatically loads all files in the slots
directory of your profile and makes them available as slots lists.
For the example above, the file slots/movies
should contain:
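A sketch of the file contents, one value per line. 'Primer' and 'Moon' are named in this section; the other titles are placeholders.

```text
Primer
Moon
Chronicle
Timecrimes
```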
Now you can simply use the placeholder $movies
in your sentence templates:
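A sketch using the PlayMovie intent and movie_name entity described in this section (the word "play" is an assumption):

```ini
[PlayMovie]
# $movies expands to the lines of slots/movies
play ($movies){movie_name}
```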
When matched, the PlayMovie
intent JSON will contain a movie_name
property with either 'Primer', 'Moon', etc.
Make sure to re-train Rhasspy whenever you update your slot values!
Slot Directories
Slot files can be put in sub-directories under slots
. A list in slots/foo/bar
should be referenced in sentences.ini
as $foo/bar
.
Slot Synonyms
Slot values are themselves sentence templates! So you can use all of the familiar syntax from above. Slot 'synonyms' can be created simply using substitutions. So a file named slots/rooms
may contain:
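A sketch of the file contents, consistent with the matches and the [the:] explanation given in this section:

```text
[the:] (den | playroom | downstairs):den
```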
which is referenced by $rooms
and will match:
- the den
- den
- the playroom
- playroom
- the downstairs
- downstairs
This will always output just 'den' because [the:]
optionally matches 'the' and then drops the word.
Slot Programs
Slot lists are great if your slot values always stay the same and are easily written out by hand. If you have slot values that you need to be generated each time Rhasspy is trained, you can use slot programs.
Create a directory named slot_programs
in your profile (e.g., $HOME/.config/rhasspy/profiles/en/slot_programs
):
Add a file in the slot_programs
directory with the name of your slot, e.g. colors
. Write a program in this file, such as a bash script. Make sure to include the shebang and mark the file as executable:
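The text mentions a bash script; the sketch below uses Python instead, assuming any executable file with a shebang is accepted. The function name slot_values and the three colors are hypothetical.

```python
#!/usr/bin/env python3
"""Hypothetical slot program saved as slot_programs/colors."""


def slot_values():
    # Each returned string becomes one slot value (one line of output).
    return ["red", "green", "blue"]


if __name__ == "__main__":
    for value in slot_values():
        print(value)
```

After saving the file, mark it as executable (for example with chmod +x).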
Now, when you reference $colors
in your sentences.ini
, Rhasspy will run the program you wrote and collect the slot values from each line. Note that you can output all the same things as regular slots lists, including optional words, alternatives, etc.
You can pass arguments to your program using the syntax $name,arg1,arg2,...
in sentences.ini
(no spaces). Arguments will be passed on the command-line, so arg1
and arg2
will be $1
and $2
in a bash script.
Like regular slots lists, slot programs can also be put in sub-directories under slot_programs
. A program in slot_programs/foo/bar
should be referenced in sentences.ini
as $foo/bar
.
Built-in Slots
Rhasspy includes a few built-in slots for each language:
- $rhasspy/days - day names of the week
- $rhasspy/months - month names of the year
Converters
By default, all named entity values in a recognized intent's JSON are strings. If you need a different data type, such as an integer or float, or want to do some kind of complex conversion, use a converter:
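A sketch that combines substitutions with the built-in float converter. The level names and numeric values are assumptions.

```ini
[SetBrightness]
# Spoken word is replaced by a number, then converted from string to float
set brightness to (low:0 | medium:0.5 | high:1){brightness!float}
```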
The !name
syntax calls a converter by name. Rhasspy includes several built-in converters:
- int - convert to integer
- float - convert to real
- bool - convert to boolean (False for zero or 'false', case insensitive)
- lower - lower-case
- upper - upper-case
You can define your own converters by placing a file in the converters
directory of your profile. Like slot programs, this file should contain a shebang and be marked as executable (chmod +x
). A file named converters/foo/bar
should be referenced as !foo/bar
in sentences.ini
.
Your custom converter will receive the value to convert on standard in (stdin
) encoded as JSON. You should print a converted JSON value to standard out (stdout
). The example below demonstrates converting a string value into an integer:
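A minimal sketch of such a converter, assuming it receives a single JSON value such as "75" on stdin. The function names convert and main are hypothetical, and empty input is ignored as a precaution rather than documented behavior.

```python
#!/usr/bin/env python3
"""Hypothetical custom converter: JSON value on stdin, integer on stdout."""
import json
import sys


def convert(raw_json):
    # Decode the incoming JSON value (e.g. the string "75") and coerce to int.
    return int(json.loads(raw_json))


def main():
    raw = sys.stdin.read()
    if raw.strip():
        # Emit the converted value as JSON on standard out.
        print(json.dumps(convert(raw)))


if __name__ == "__main__":
    main()
```

Saved under the converters directory and marked executable, a file like this would be referenced by its name after the ! in a tag.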
Converters can be chained, so !foo!bar
will call the foo
converter and then pass the result to bar
.
Special Cases
If one of your sentences happens to start with an optional word (e.g., [the]
), this can lead to a problem:
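A sketch of the problematic case; the sentence wording is a hypothetical placeholder.

```ini
[SomeIntent]
# The next line starts with "[" and is misread as a section header
[the] problem sentence
```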
Python's configparser will interpret [the]
as a new section header, which will produce a new intent, grammar, etc. Rhasspy handles this special case by using a backslash escape sequence (\[
):
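The escaped form of the same hypothetical sentence:

```ini
[SomeIntent]
# The leading backslash keeps configparser from seeing a section header
\[the] problem sentence
```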
Now [the]
will be properly interpreted as a sentence under [SomeIntent]
. You only need to escape a [
if it's the very first character in your sentence.
Motivation
The combination of an ini file and JSGF is arguably an abuse of two file formats, so why do this? At a minimum, Rhasspy needs a set of sentences grouped by intent in order to train the speech and intent recognizers. A fairly pleasant way to express this in text is as follows:
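A sketch of that plain layout, using the [Intent 1] heading style referred to later in this section. The sentences are placeholders.

```ini
[Intent 1]
sentence 1
sentence 2

[Intent 2]
sentence 3
```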
Compared to JSON, YAML, etc., there is minimal syntactic overhead for the purposes of just writing down sentences. However, its shortcomings become painfully obvious once you have more than a handful of sentences and intents:
- If two sentences are nearly identical, save for an optional word like 'the' or 'a', you have to maintain two nearly identical copies of a sentence.
- When speaking about collections of things, like colors or states (on/off), you need a sentence for every alternative choice.
- You cannot share commonly repeated phrases across sentences or intents.
- There is no way to tag phrases so the intent recognizer knows the values for an intent's slots (e.g., color).
Each of these shortcomings is addressed by considering the space between intent headings ([Intent 1]
, etc.) as a grammar that represents many possible voice commands. The possible sentences, stripped of their tags, are used as input to opengrm to produce a standard ARPA language model for pocketsphinx or Kaldi. The tagged sentences are then used to train an intent recognizer.
Custom Words
Rhasspy looks for words you've defined outside of your profile's base dictionary (typically base_dictionary.txt
) in a custom words file (typically custom_words.txt
). This is just a CMU phonetic dictionary with words/pronunciations separated by newlines:
You can use the Words tab in Rhasspy's web interface to generate this dictionary. During training, Rhasspy will merge custom_words.txt
into your dictionary.txt
file so the speech to text system knows how the words in your voice commands are pronounced.
Language Model Mixing
Rhasspy is designed to only respond to the voice commands you specify in sentences.ini, but both the Pocketsphinx and Kaldi speech to text systems are capable of transcribing open ended speech. While this will never be as good as a cloud-based system, Rhasspy offers it as an option.
A middle ground between open transcription and custom voice commands is language model mixing. During training, Rhasspy can mix a (large) pre-built language model with the custom-generated one. You specify a mixture weight (0-1), which controls how much of an influence the large language model has; a mixture weight of 0 makes Rhasspy sensitive only to your voice commands, which is the default.
To see the effect of language model mixing, consider a simple sentences.ini
file:
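A sketch of that file, consistent with the ChangeLightState intent and the single command described in this section. The state tag is an assumption.

```ini
[ChangeLightState]
# The only voice command Rhasspy is trained on
turn (on){state} the living room lamp
```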
This will only allow Rhasspy to understand the voice command 'turn on the living room lamp'. If we train Rhasspy and perform speech to text on a WAV file with this command, the output is no surprise:
Now let's do speech to text on a variation of the command, a WAV file with the speech 'would you please turn on the living room lamp':
The word salad here is because we're trying to recognize a voice command that was not present in sentences.ini
. We could always add it, of course, and that is the preferred method for Rhasspy. There may be cases, however, where we cannot anticipate all of the variations of a voice command. For these cases, you should increase the mix_weight
in your profile to something above 0:
Note that training will take significantly longer because of the size of the base language model. Now, let's test our two WAV files:
Great! Rhasspy was able to transcribe a sentence that it wasn't explicitly trained on. If you're trying this at home, you surely noticed that it takes a lot longer to process the WAV files too. In practice, it's not recommended to do mixed language modeling on lower-end hardware like a Raspberry Pi. If you need open ended speech recognition, try running Rhasspy in a client/server setup.
The Elephant in the Room
This isn't the end of the story for open ended speech recognition in Rhasspy, however, because Rhasspy also does intent recognition using the transcribed text as input. When the set of possible voice commands is known ahead of time, it's relatively easy to know what to do with each and every sentence. The flexibility gained from mixing in a base language model unfortunately places a large burden on the intent recognizer.
In our ChangeLightState
example above, we're fortunate that everything works as expected:
But this works only because the default intent recognizer (fsticuffs) ignores unknown words by default, so 'would you please' is not interpreted. Changing 'lamp' to 'light' in the input sentence will reveal the problem:
This sentence would be impossible for the speech to text system to recognize without language model mixing, but it's quite possible now. We can band-aid over the problem a bit by switching to the fuzzywuzzy intent recognizer:
Now when we interpret the sentence with 'light' instead of 'lamp', we still get the expected output:
This works well for our toy example, but will not scale well when there are thousands of voice commands represented in sentences.ini
or if the words used are significantly different than in the training set ('light' and 'lamp' are close enough for fuzzywuzzy
).
A machine learning-based intent recognizer, like Rasa, would be a better choice for open ended speech.