Featurised slots with `initial_value` get wrongly embedded in general rules

Rasa Open Source version

2.8.3

Rasa SDK version

No response

Rasa X version

No response

Python version

3.8

What operating system are you using?

OSX

What happened?

We encountered this issue a long time ago and thought we had fixed it in #8161, but it looks like we didn’t. If a featurised slot has an initial_value set, the slot gets included in rules that don’t mention this slot explicitly. This is incorrect, as those rules are then not applicable in states where the slot has a value different from the initial one.

Findings from initial investigation

See below for an example setup to reproduce this issue.

In #8161, we introduced the omit_unset_slots flag, which, when training RulePolicy, is used to exclude slots that are unset (i.e. set to their initial values) from general rules (i.e. from rules that don’t mention those slots and should be applicable regardless of those slots’ values).
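
To make the intent of the flag concrete, here is a minimal sketch in plain Python. It is not Rasa's implementation: `Slot`, `slots_to_state`, `rule_applies`, and the dict-based state are hypothetical stand-ins, and only the `omit_unset_slots` name and the `num_appts` slot come from this issue.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Slot:
    """Hypothetical stand-in for a featurised slot (not Rasa's Slot class)."""

    name: str
    initial_value: Optional[str] = None
    value: Optional[str] = None

    def __post_init__(self) -> None:
        # A slot starts out at its initial value.
        if self.value is None:
            self.value = self.initial_value

    def is_unset(self) -> bool:
        # "Unset" in the sense used in this issue: still at the initial value.
        return self.value == self.initial_value


def slots_to_state(slots: List[Slot], omit_unset_slots: bool = False) -> Dict[str, str]:
    """Build the slot part of a state, optionally dropping unset slots."""
    return {
        s.name: s.value
        for s in slots
        if s.value is not None and not (omit_unset_slots and s.is_unset())
    }


def rule_applies(rule_state: Dict[str, str], current_state: Dict[str, str]) -> bool:
    """A rule applies if every slot it mentions matches the current state."""
    return all(current_state.get(k) == v for k, v in rule_state.items())


# The rule never mentions num_appts, so the slot sits at its initial value.
rule_slots = [Slot("num_appts", initial_value="single")]
# In the conversation the slot has been set to something else.
conversation = slots_to_state([Slot("num_appts", "single", value="multiple")])

general_rule = slots_to_state(rule_slots, omit_unset_slots=True)
polluted_rule = slots_to_state(rule_slots, omit_unset_slots=False)

print(rule_applies(general_rule, conversation))   # rule stays general
print(rule_applies(polluted_rule, conversation))  # rule is blocked by the initial value
```

With the flag on, the rule state carries no slot information and matches any conversation; with the flag off, the initial value becomes an implicit condition of the rule.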

Unfortunately, the intended pattern with omit_unset_slots needs further work (and testing). What is going wrong is:

1. During data generation from rules, the generator removes duplicated trackers: deduplication is done by creating tracker states (and caching them) and subsequently hashing them. In this step, omit_unset_slots is switched off.
2. Subsequently, RulePolicy trains: it creates its rule lookup, with omit_unset_slots switched on. In doing this, trackers need to be turned into states. And, since we have the states cached from the previous step, we re-use them. And this is where things break: the previously created states incorrectly contain initial values of featurised slots because they were created with omit_unset_slots=False.
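
The interaction between the two steps can be reproduced in isolation. The sketch below is a simplified model, not Rasa code: the `Tracker` class, its `states` method, and the single-entry state list are assumptions made for illustration. It only shows how a cache that ignores the flag hands stale states to the second caller.

```python
from typing import Dict, List, Optional


class Tracker:
    """Hypothetical tracker that caches its states (not Rasa's tracker class)."""

    def __init__(self, slot_values: Dict[str, str], initial_values: Dict[str, str]) -> None:
        self.slot_values = slot_values
        self.initial_values = initial_values
        self._cached_states: Optional[List[Dict[str, str]]] = None

    def states(self, omit_unset_slots: bool = False) -> List[Dict[str, str]]:
        # Bug being illustrated: the cache ignores `omit_unset_slots`,
        # so the first caller decides what every later caller receives.
        if self._cached_states is None:
            self._cached_states = [
                {
                    name: value
                    for name, value in self.slot_values.items()
                    if not (omit_unset_slots and value == self.initial_values.get(name))
                }
            ]
        return self._cached_states


rule_tracker = Tracker({"num_appts": "single"}, {"num_appts": "single"})

# Step 1: deduplication creates (and caches) states with the flag switched off.
dedup_states = rule_tracker.states(omit_unset_slots=False)

# Step 2: the rule lookup asks for states with the flag switched on,
# but gets the cached states from step 1 back.
lookup_states = rule_tracker.states(omit_unset_slots=True)

print(lookup_states)  # [{'num_appts': 'single'}]
```

A fresh tracker queried only with `omit_unset_slots=True` would return `[{}]`, which is what the rule lookup expects.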

I see a few options for possible fixes:

see if it makes sense to use omit_unset_slots=True also during tracker deduplication
don’t use the caching
use caching but check that the cached states were created with the same value of omit_unset_slots
…
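
As a rough illustration of the third option, the cache could be keyed by the flag value so that states created under one setting are never served for the other. This is a sketch under assumptions, not a proposed patch: `FlagAwareTracker` and its methods are hypothetical names, and the state is reduced to a dict of slot values.

```python
from typing import Dict, List


class FlagAwareTracker:
    """Hypothetical tracker whose state cache is keyed by `omit_unset_slots`."""

    def __init__(self, slot_values: Dict[str, str], initial_values: Dict[str, str]) -> None:
        self.slot_values = slot_values
        self.initial_values = initial_values
        # One cache entry per flag value instead of a single shared entry.
        self._cache: Dict[bool, List[Dict[str, str]]] = {}

    def _compute_states(self, omit_unset_slots: bool) -> List[Dict[str, str]]:
        return [
            {
                name: value
                for name, value in self.slot_values.items()
                if not (omit_unset_slots and value == self.initial_values.get(name))
            }
        ]

    def states(self, omit_unset_slots: bool = False) -> List[Dict[str, str]]:
        # Recompute only if no states exist for this particular flag value.
        if omit_unset_slots not in self._cache:
            self._cache[omit_unset_slots] = self._compute_states(omit_unset_slots)
        return self._cache[omit_unset_slots]


tracker = FlagAwareTracker({"num_appts": "single"}, {"num_appts": "single"})
print(tracker.states(omit_unset_slots=False))  # [{'num_appts': 'single'}]
print(tracker.states(omit_unset_slots=True))   # [{}]
```

The trade-off in this sketch is that up to two copies of the states are held per tracker, so memory use grows in exchange for keeping the cache.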

Example setup to reproduce the issue

Command:

rasa train --augmentation 0
rasa test

Config: the default config.

Training data:

version: "2.0"
rules:
- rule: goodbye
  steps:
  - intent: goodbye
  - action: utter_goodbye
nlu:
- intent: greet
  examples: |
    - hey
    - hello
    - hi
    - hello there
    - good morning
    - good evening
    - moin
    - hey there
    - let's go
    - hey dude
    - goodmorning
    - goodevening
    - good afternoon
- intent: goodbye
  examples: |
    - cu
    - good by
    - cee you later
    - good night
    - bye
    - goodbye
    - have a nice day
    - see you around
    - bye bye
    - see you later

Test stories:

version: "2.0"
stories:
- story: goodbye - num_appts slot
  steps:
  - slot_was_set:
    - num_appts: multiple
  - intent: goodbye
  - action: utter_goodbye

Domain:

version: "2.0"
intents:
- greet
- goodbye
slots:
  num_appts:
    type: categorical
    initial_value: single
    values:
    - none
    - single
responses:
  utter_goodbye:
  - text: "Bye"

Definition of done

[x] Verify that the cause of the issue is as outlined above
[x] Write tests that catch the bug (turns out the tests in #8161 weren’t sufficient)
Decide which way to fix it (involve CSE and Product Management)
Create followup issue: “Implement the fix, adding more tests if necessary”

Issue Analytics

Created 2 years ago
Comments: 14 (11 by maintainers)

Top GitHub Comments

TyDunn commented, Mar 17, 2022

this issue fits well with the domain + tracker featurizer disentangling work stream (Note that the domain disentangling alone does not cover this – but the prototype for the tracker featurizer rework does).

I also think the discussion we’re having here definitely belongs to Enable/Engine, and they’re better equipped to decide how to fix the bug. Wdyt?

@ka-bu @samsucik Sounds good to me. Let’s move this to Engine

ka-bu commented, Mar 17, 2022

Regarding the other approaches above:

1. Think this might break: Looks to me like this could differ between consumers? And if there is no agreement then the caching will definitely break.
2. Sounds not too bad: If we clear the cache, re-compute the states, and add them to the lookup one by one then we don’t increase memory consumption - but just pay more time (re-compute the states again).

Adding one more option:

3. This is most costly in terms of refactoring: We could adapt the state creation so that a state is unambiguous - and create both needed versions from that. That is, currently, a state can take two forms based on the `omit_unset_slots` flag, which breaks the caching idea (which will only cache one of those). If we had a more general representation then we could remove the slots that have not been set for the rules in the rule policy (right before hashing again for the lookup).
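
One way to picture the "more general representation" is a state that records, per slot, both the value and whether it was explicitly set, with each consumer deriving its own view just before hashing. The sketch below is only an illustration of that idea; `SlotEntry`, `build_general_state`, `project`, and `state_hash` are hypothetical names, not part of Rasa.

```python
from typing import Dict, NamedTuple


class SlotEntry(NamedTuple):
    """Hypothetical general slot record: the value plus whether it was set."""

    value: str
    is_set: bool  # False while the slot is still at its initial value


GeneralState = Dict[str, SlotEntry]


def build_general_state(
    slot_values: Dict[str, str], initial_values: Dict[str, str]
) -> GeneralState:
    """Create the single, unambiguous state that would be cached."""
    return {
        name: SlotEntry(value, value != initial_values.get(name))
        for name, value in slot_values.items()
    }


def project(state: GeneralState, omit_unset_slots: bool) -> Dict[str, str]:
    """Derive the view a consumer needs from the cached general state."""
    return {
        name: entry.value
        for name, entry in state.items()
        if entry.is_set or not omit_unset_slots
    }


def state_hash(view: Dict[str, str]) -> int:
    """Hash a projected view, e.g. for deduplication or the rule lookup."""
    return hash(frozenset(view.items()))


cached = build_general_state({"num_appts": "single"}, {"num_appts": "single"})

dedup_view = project(cached, omit_unset_slots=False)   # used for deduplication
lookup_view = project(cached, omit_unset_slots=True)   # used for the rule lookup

print(dedup_view)
print(lookup_view)
```

In this sketch only the general state is cached, so both consumers can share the cache without agreeing on a flag value.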