[BUG] GUI utilizes hierarchy incorrectly
From #403: the issue relating to the priority-based hierarchy builder appears to be only tangentially related, so I was asked to create it as a separate issue.
The hierarchy created via the priority-based hierarchy builder produces a different output than making the same data transformation manually.
In the case mentioned, the anonymization step used a level that didn’t exist when you reference the data transformation: `null` is kept while `White` is dropped, and thus it produced a sub-optimal output.
Original Message:
prasser There’s something funky with the underlying hierarchy currently.
The hierarchy creation functionality works great and exactly as I’d like, though a bit backwards! Highest to lowest is the default option, which goes in the exact opposite order from what I’d expect. With frequency prioritization (highest to lowest), the mode of the data should be the last one to be dropped instead of the first.
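To make the expected ordering concrete, here is a minimal Python sketch of a frequency-prioritized hierarchy in which one more value is suppressed per level and the mode is the last to go. This is not ARX's implementation: the function `build_priority_hierarchy`, the `mode_dropped_last` flag, the `*` suppression marker, and the sample values are all assumptions made purely for illustration.

```python
from collections import Counter

SUPPRESSED = "*"  # assumed suppression marker, for illustration only


def build_priority_hierarchy(values, mode_dropped_last=True):
    """Map each distinct value to its row of generalization levels.

    One additional value is suppressed at every level. With
    mode_dropped_last=True the rarest value is suppressed first and the
    most frequent value (the mode) is suppressed last.
    """
    counts = Counter(values)
    # Order in which values get suppressed; ties are broken by the value itself.
    drop_order = sorted(
        counts,
        key=lambda v: (counts[v], str(v)),
        reverse=not mode_dropped_last,
    )
    levels = len(drop_order)
    hierarchy = {}
    for position, value in enumerate(drop_order):
        # The value is kept for levels 0..position and suppressed afterwards.
        kept = [value] * (position + 1)
        dropped = [SUPPRESSED] * (levels - position)
        hierarchy[value] = kept + dropped
    return hierarchy


# Hypothetical column: "White" is the mode, "" stands for an empty entry.
race = ["White"] * 5 + ["Asian"] * 2 + [""]
hierarchy = build_priority_hierarchy(race)
for value, row in hierarchy.items():
    print(repr(value), row)
```

In this sketch, the only value still visible at the second-to-last level is the mode (`White`), which matches the result I expected from the manual hierarchy. Passing `mode_dropped_last=False` reverses the priority, so the mode is suppressed first.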
The issue I’m seeing is that the anonymizing doesn’t apply the hierarchy as it displays in the “data transformation” tab.
On a dataset I’m using (unfortunately I can’t paste it for easy reference), the hierarchy generated lowest to highest (to ensure that the mode of the data is the last dropped) looks like this:
If I generate the results, I see that it decides to drop the Race column instead of keeping “White” only, which is the result I got when I manually created this hierarchy earlier.
I can find the exact value that I used originally for this data set when I look for “Non-anonymous” transformations, which is really odd, because like I said, it should be the same result.
Instead of keeping “White”, it kept “” (empty) as “Level-3” Race. I also tried the “Highest to Lowest” option, and the same result occurred.
When I edit the hierarchy and tell it to remove the underlying representation of the current hierarchy, it suddenly works exactly as it should, with the same results as v3.9.0 and the manual transformation settings I put together.
_Originally posted by @ZachHaber in https://github.com/arx-deidentifier/arx/issues/403#issuecomment-1230492046_
Ok, thanks! I can now see that problem. I’m sure that it’s not related to compilation.
Thanks. Resolved.