sklearn.tree.export_dict

Currently, there are two options to get the decision tree representations: export_graphviz and export_text.

I would like to add export_dict, which would output the decision tree as a nested dictionary.

While you can convert the graphviz representation using CLI tools, it’s a bit unruly and makes for a weird workflow. When converting back, the nesting is also kind of weird, with \n characters floating around.

The main benefits of JSON/dict representations are:

  1. They’re far easier to work with for any front-end work.
  2. Working with a dict lets you reuse the representation for any downstream task built on the decision tree (other than raw inference).
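
As a rough illustration of such a downstream task, the sketch below walks a nested dict and lists every root-to-leaf rule. The key names ("feature", "value", "left", "right") and the sample tree are assumptions borrowed from the layout proposed in a comment on this issue; export_dict does not exist in scikit-learn, and iter_rules is a hypothetical helper.

```python
def iter_rules(node, path=()):
    """Yield (conditions, leaf_label) for every root-to-leaf path."""
    if not isinstance(node, dict):
        # Anything that is not a dict is treated as a leaf label.
        yield list(path), node
        return
    feature, value = node["feature"], node["value"]
    yield from iter_rules(node["left"], path + (f"{feature} <= {value}",))
    yield from iter_rules(node["right"], path + (f"{feature} > {value}",))


# Hypothetical exported tree (assumed layout, not a scikit-learn format).
tree = {
    "feature": "feature 1",
    "value": 0.5,
    "left": {
        "feature": "feature 2",
        "value": 0.1,
        "left": "category 1",
        "right": "category 2",
    },
    "right": "category 3",
}

rules = list(iter_rules(tree))
for conditions, label in rules:
    print(" and ".join(conditions), "->", label)
```

Doing the same from the graphviz output would mean parsing DOT text first, which is the step the proposal tries to avoid.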

Issue Analytics

  • State: open
  • Created 2 years ago
  • Reactions: 1
  • Comments: 7 (4 by maintainers)

Top GitHub Comments

2 reactions
thomasjpfan commented, May 17, 2021

I see export_dict as another way to serialize and inspect the tree without having to go through the Python object. Going through the Python object can be a bit involved, as demonstrated by our example. Other libraries have similar functions to serialize trees into something human-readable, such as LightGBM’s model_to_string or XGBoost’s Model IO.

I am overall +0.5 on adding this to sklearn. It has the secondary benefit of making it a little easier to export the trees into another format. I suggest we wait to see others’ opinions before working on this.

1 reaction
luxedo commented, Jul 1, 2021

Hi! I’d just like to add the description of the issue I just posted.

Describe the workflow you want to enable

In my opinion, it would be great to have a dict representation of decision trees in the library.

Describe your proposed solution

The interface would be similar to tree.export_text, maybe tree.export_dict?

Additional context

More than once I manually exported a decision tree from a model to a dictionary like so:

tree = {
  "feature": "feature 1",
  "value": 0.5,
  "left": {
    "feature": "feature 2",
    "value": 0.1,
    "left": "category 1",
    "right": "category 2"
  },
  "right": "category 3"
}

And then run the tree with a simple recursive function like:

def run_tree(leaf, data):
    if not isinstance(leaf, dict):
        return leaf
    next_leaf = leaf["left"] if data[leaf["feature"]] <= leaf["value"] else leaf["right"]
    return run_tree(
        next_leaf,
        data,
    )
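
For reference, here is a small self-contained run of the two snippets above. The sample rows and the names samples and predictions are made up for illustration; the tree and run_tree are repeated only so the block runs on its own.

```python
tree = {
    "feature": "feature 1",
    "value": 0.5,
    "left": {
        "feature": "feature 2",
        "value": 0.1,
        "left": "category 1",
        "right": "category 2",
    },
    "right": "category 3",
}


def run_tree(leaf, data):
    # A non-dict node is a leaf: return its label.
    if not isinstance(leaf, dict):
        return leaf
    # Go left when the feature value is at or below the split value.
    next_leaf = leaf["left"] if data[leaf["feature"]] <= leaf["value"] else leaf["right"]
    return run_tree(next_leaf, data)


# Made-up rows, keyed by the same feature names used in the tree.
samples = [
    {"feature 1": 0.3, "feature 2": 0.05},
    {"feature 1": 0.3, "feature 2": 0.7},
    {"feature 1": 0.9, "feature 2": 0.0},
]
predictions = [run_tree(tree, row) for row in samples]
print(predictions)  # ['category 1', 'category 2', 'category 3']
```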

I like this representation because:

  1. The inference function is very simple.
  2. It removes the sklearn dependency for inference.
  3. It can be easily ported to other languages.
  4. It’s very descriptive and good for educational purposes.
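
Points 2 and 3 can be sketched with nothing but the standard library: the dict survives a JSON round trip, and inference can be written as a plain loop, which may translate more directly to languages or targets where recursion is awkward. The predict function and the sample row are illustrative assumptions, not part of any proposed API.

```python
import json

tree = {
    "feature": "feature 1",
    "value": 0.5,
    "left": {
        "feature": "feature 2",
        "value": 0.1,
        "left": "category 1",
        "right": "category 2",
    },
    "right": "category 3",
}

# Serialize to a JSON string (this could be written to a file or sent to a
# front end) and load it back; no sklearn import is involved.
payload = json.dumps(tree)
restored = json.loads(payload)


def predict(node, data):
    """Loop-based equivalent of the recursive run_tree above."""
    while isinstance(node, dict):
        branch = "left" if data[node["feature"]] <= node["value"] else "right"
        node = node[branch]
    return node


result = predict(restored, {"feature 1": 0.5, "feature 2": 0.2})
print(result)  # category 2
```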

My quick implementation of export_dict:

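import numpy as np

def export_dict(clf, feature_names=None):
    """Export a fitted DecisionTreeClassifier as a nested dictionary."""
    tree = clf.tree_
    if feature_names is None:
        # n_features_in_ covers every feature index the tree can reference;
        # max_features_ can be smaller when max_features is set.
        feature_names = range(clf.n_features_in_)

    # Build tree nodes
    tree_nodes = []
    for i in range(tree.node_count):
        # Leaves have both children set to the same sentinel (-1)
        if tree.children_left[i] == tree.children_right[i]:
            tree_nodes.append(
                clf.classes_[np.argmax(tree.value[i])]
            )
        else:
            tree_nodes.append({
                "feature": feature_names[tree.feature[i]],
                # Cast NumPy scalars to plain Python types so the result
                # can be passed to json.dumps
                "value": float(tree.threshold[i]),
                "left": int(tree.children_left[i]),
                "right": int(tree.children_right[i]),
            })

    # Link tree nodes: replace child indices with the child nodes themselves
    for node in tree_nodes:
        if isinstance(node, dict):
            node["left"] = tree_nodes[node["left"]]
            node["right"] = tree_nodes[node["right"]]

    # Return root node
    return tree_nodes[0]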
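
To make the array layout easier to follow, here is a sketch of the same conversion done recursively on plain Python lists standing in for the tree_ arrays (children_left, children_right, feature, threshold). The list contents and leaf_label are hand-written assumptions for illustration; only the sentinels are taken from scikit-learn, which uses -1 for a leaf's children and -2 for its feature and threshold.

```python
TREE_LEAF = -1  # scikit-learn marks a leaf by setting both children to -1

# Stand-ins for clf.tree_ arrays, indexed by node id (node 0 is the root).
children_left = [1, 2, -1, -1, -1]
children_right = [4, 3, -1, -1, -1]
feature = [0, 1, -2, -2, -2]
threshold = [0.5, 0.1, -2.0, -2.0, -2.0]
# Hypothetical per-leaf labels; the real code derives these from tree.value.
leaf_label = [None, None, "category 1", "category 2", "category 3"]
feature_names = ["feature 1", "feature 2"]


def build(node_id=0):
    """Return the nested dict (or leaf label) rooted at node_id."""
    if children_left[node_id] == TREE_LEAF:
        return leaf_label[node_id]
    return {
        "feature": feature_names[feature[node_id]],
        "value": threshold[node_id],
        "left": build(children_left[node_id]),
        "right": build(children_right[node_id]),
    }


nested = build()
print(nested)
```

The recursive form avoids the separate linking pass, at the cost of recursion depth equal to the depth of the tree.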

If this is a good idea I’d be glad to help!

The main reason this feature would be useful for me is that I can easily port decision trees to other places. I already put some trees in web applications and microcontrollers. I would love to help if this would benefit the community as well.
