Tableau's documentation adjustments (Tableau communicates with Python by lists)
By playing around, I discovered that the documentation in Tableau isn’t up to date.
The examples don’t work because in Python 3 (3.6 in my case) the “map” function returns a map object, whereas Tableau needs to receive a list. We therefore need to wrap the map call in list( ) so that a list is returned.
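The difference is easy to check in a plain Python 3 session, outside Tableau. This is only a minimal sketch; the sample names are made up for illustration.

```python
# Hypothetical sample values standing in for a column sent by Tableau.
names = ["Alice", "Bob", "Carol"]

lazy = map(len, names)          # Python 3: a lazy map object, not a list
eager = list(map(len, names))   # materialised list, one result per input value

print(type(lazy).__name__)      # map
print(eager)                    # [5, 3, 5]
```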
Also, it would be good to demonstrate how to use Tableau’s Parameters.
In my example below, [Filter for regex] is a String Parameter inside Tableau. Because Tableau sends the parameter to Python as a list with the same value repeated for every row in the visual, we need to reduce it to a single value by using [0] to take the first element of the list.
SCRIPT_BOOL(
"import re
return list(map(lambda x: bool(re.search(_arg1[0],x)),_arg2))
",[Filter for regex],Attr([Customer Name])
)
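The Python part of this calculation can be tried without Tableau by feeding it lists by hand. The sketch below assumes only what is described in this issue: both arguments arrive as lists and the parameter is repeated once per row. The function name regex_filter and the customer names are hypothetical, and this is a stand-in for illustration, not TabPy's actual wrapper.

```python
import re

def regex_filter(_arg1, _arg2):
    """Plain-function stand-in for the script body of the SCRIPT_BOOL calculation."""
    pattern = _arg1[0]  # the parameter is repeated once per row; take the first copy
    return list(map(lambda x: bool(re.search(pattern, x)), _arg2))

# Hypothetical data: the parameter value repeated for each of the three rows.
customers = ["Aaron Bergman", "Maria Etezadi", "Aaron Hawkins"]
result = regex_filter(["^Aaron"] * len(customers), customers)
print(result)  # [True, False, True]
```

The returned list has the same length as the input list, which is what lets Tableau line the results up with the rows in the view.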
From this experimentation, I assume that Tableau always expects to receive a list back that matches the rows, and always sends lists to Python (so we have to adjust for that in Python).
PS: it’s easier to use Tableau’s own regex functions, but this is a good way to test how Tableau works with Python 😉
You can find more information about how the integration works on TabWiki: https://community.tableau.com/docs/DOC-10856
To explain a little bit more about how TabPy handles data from Tableau: values are passed to Python as lists, and the result is expected to be returned as a list. NULLs from Tableau are converted to the Python None type. This can cause issues when a model or function does not expect, or cannot handle, None values.
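As an illustration of the None issue, re.search raises a TypeError when it is handed None instead of a string. One possible guard is sketched below; the function name safe_regex_filter and the sample values are hypothetical, and treating a NULL as "no match" is an assumption that may not suit every calculation.

```python
import re

def safe_regex_filter(pattern, values):
    # Treat None (a Tableau NULL) as "no match" rather than passing it to
    # re.search, which would raise TypeError for a non-string argument.
    return [v is not None and bool(re.search(pattern, v)) for v in values]

# Hypothetical column with one NULL in the middle.
values = ["Aaron Bergman", None, "Maria Etezadi"]
flags = safe_regex_filter("^Aaron", values)
print(flags)  # [True, False, False]
```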
Regarding partitioning and addressing table calculation dimensions in Tableau: a separate call is made to the TabPy server for each combination of partitioned dimensions. In the example above, separate clusters are computed for each window in the view, i.e. Consumer/Furniture or Corporate/Office Supplies. Having the box checked next to a dimension, as is the case for Customer Name, tells the calculation not to partition on that dimension. In the case above, this makes sense because we would not want to send a separate call for each customer. That would result in ~9500 calls in Superstore, each with a sample size of 1, producing either all failed clusters or 9500 unique clusters depending on the model.
There are cases when it makes sense to put everything on addressing (check all boxes), for example text data where we are running sentiment analysis. The results of a given sample don’t depend on other samples and checking all boxes will result in only a single server call being made, which will improve performance.
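The effect of partitioning on the number of server calls can be pictured with a small grouping sketch. It does not talk to TabPy at all; it only counts how many batches (one per call) different partitioning choices would produce. The rows, column order, and the batches helper are hypothetical.

```python
from collections import defaultdict

# Hypothetical rows: (segment, category, customer)
rows = [
    ("Consumer", "Furniture", "A"),
    ("Consumer", "Furniture", "B"),
    ("Corporate", "Office Supplies", "C"),
    ("Corporate", "Office Supplies", "D"),
    ("Corporate", "Office Supplies", "E"),
]

def batches(rows, partition_by):
    """Group rows into one batch per distinct combination of the partitioning columns."""
    groups = defaultdict(list)
    for row in rows:
        key = tuple(row[i] for i in partition_by)
        groups[key].append(row)
    return dict(groups)

by_window = batches(rows, partition_by=(0, 1))        # partition on segment and category
by_customer = batches(rows, partition_by=(0, 1, 2))   # also partition on customer
single = batches(rows, partition_by=())               # everything on addressing

print(len(by_window), len(by_customer), len(single))  # 2 5 1
```

Partitioning on customer yields one batch per row, each with a sample size of 1, while putting every dimension on addressing yields a single batch containing all rows.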
I’m going to close this issue since the underlying questions have been addressed. For further questions on TabPy functionality, please reach out to us in the Tableau forums!