DecisionTreeClassifier serialization/deserialization

I'm submitting a question about how to use this project.

Summary: I am trying to serialize and deserialize a DecisionTreeClassifier, but it throws an error. What is a valid way to save a tree to a file and restore it later?

Source:

const decision = new DecisionTreeClassifier({ featureLabels: ['a', 'b'] });
const deserialized = new DecisionTreeClassifier({ featureLabels: ['a', 'b'] });

const X = [[0, 0], [1, 1]];
const y = [0, 1];
decision.fit(X, y);

// Round-trip the fitted tree through a JSON string.
const serialized = JSON.stringify(decision.toJSON());
deserialized.fromJSON(JSON.parse(serialized));
// Calling predict on `deserialized` afterwards (see the stack trace) throws the error below.

Stacktrace:

TypeError: node.question.match is not a function
    at DecisionTreeClassifier._predict (C:\apps\ml\src\lib\tree\tree.ts:434:23)
    at DecisionTreeClassifier.predict (C:\apps\ml\src\lib\tree\tree.ts:202:24)
    at Object.predict (C:\apps\ml/dttest.js:12:27)

Issue Analytics

  • State: open
  • Created 4 years ago
  • Comments: 5 (1 by maintainers)

Top GitHub Comments

gilles-bara commented, Jul 29, 2019 (2 reactions)

The problem is that JSON.stringify serializes class instances as plain data, so JSON.parse gives back regular, untyped objects. Passing those to fromJSON only produces objects with the same properties; they are not instances of the original classes and do not have their methods, which is why node.question.match is not a function. I suppose fromJSON needs to be extended so that proper instances of the correct classes are created.
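To make the loss of methods concrete, here is a minimal sketch. The `Question` class below is a hypothetical stand-in written for this example (its constructor arguments and `match` logic are assumptions, not the library's actual implementation); it only shows what a JSON round trip does to any class instance.

```javascript
// Hypothetical stand-in for the library's Question class (not the real implementation).
class Question {
  constructor(column, value) {
    this.column = column;
    this.value = value;
  }
  // Returns true when the row's value in this column reaches the threshold.
  match(row) {
    return row[this.column] >= this.value;
  }
}

const original = new Question(0, 1);
const restored = JSON.parse(JSON.stringify(original));

// The data survives the round trip, but the prototype does not.
const originalIsQuestion = original instanceof Question;
const restoredIsQuestion = restored instanceof Question;
const restoredMatchType = typeof restored.match;

console.log(originalIsQuestion, restoredIsQuestion, restoredMatchType);
// → true false undefined
```

Calling `restored.match(...)` at this point would raise the same kind of TypeError as in the stack trace.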

For the forest this could be something like this:

    // Rebuild every tree as a real DecisionTreeClassifier instance instead of
    // keeping the plain objects that JSON.parse returns.
    BaseRandomForest.prototype.fromJSON = function ({ trees = null }) {
        if (!trees) {
            throw new Error('You must provide trees to restore the model');
        }
        this.trees = [];
        for (const tree of trees) {
            const t = new tree_1.DecisionTreeClassifier();
            t.fromJSON(tree);
            this.trees.push(t);
        }
    };

A single classifier could then be:

    DecisionTreeClassifier.prototype.fromJSON = function ({
        featureLabels = null,
        tree = null,
        verbose = false,
        random_state = null,
    }) {
        this.featureLabels = featureLabels;
        // DecisionNode.fromJSON returns the rebuilt node (or a Leaf), so keep its result.
        this.tree = new DecisionNode().fromJSON(tree);
        this.verbose = verbose;
        this.randomState = random_state;
    };

And the nodes:

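    DecisionNode.prototype.fromJSON = function (json) {
        // Serialized leaves carry a prediction; restore them as Leaf instances.
        if (typeof json.prediction !== "undefined") {
            return Object.assign(new Leaf([]), json);
        }
        // Internal node: rebuild the Question so that its match method exists again.
        const q = json.question;
        this.question = new Question(q.features, q.column, q.value);
        // Recurse into both branches.
        this.falseBranch = new DecisionNode().fromJSON(json.falseBranch);
        this.trueBranch = new DecisionNode().fromJSON(json.trueBranch);
        return this;
    };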

To be consistent with the existing fromJSON methods, this fromJSON should probably not return its result, but you get the idea.
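The recursive rebuild can also be tried out in isolation. The sketch below is self-contained and uses simplified, hypothetical stand-ins for `Question`, `Leaf` and `DecisionNode` (their fields and constructor signatures are assumptions for illustration, not the library's real classes). It uses a standalone `reviveNode` function that returns the rebuilt node, which is one way to avoid the question of whether fromJSON itself should return a value.

```javascript
// Hypothetical stand-ins for the library's classes; names and fields are
// assumptions for illustration only.
class Question {
  constructor(column, value) {
    this.column = column;
    this.value = value;
  }
  match(row) {
    return row[this.column] >= this.value;
  }
}

class Leaf {
  constructor(prediction) {
    this.prediction = prediction;
  }
}

class DecisionNode {
  constructor(question, trueBranch, falseBranch) {
    this.question = question;
    this.trueBranch = trueBranch;
    this.falseBranch = falseBranch;
  }
}

// Recursively turn plain parsed JSON back into class instances.
function reviveNode(json) {
  if (typeof json.prediction !== 'undefined') {
    return new Leaf(json.prediction);
  }
  const q = json.question;
  return new DecisionNode(
    new Question(q.column, q.value),
    reviveNode(json.trueBranch),
    reviveNode(json.falseBranch)
  );
}

// Walk the tree, calling question.match at every internal node.
function predictRow(node, row) {
  while (node instanceof DecisionNode) {
    node = node.question.match(row) ? node.trueBranch : node.falseBranch;
  }
  return node.prediction;
}

const tree = new DecisionNode(new Question(0, 1), new Leaf(1), new Leaf(0));
const serialized = JSON.stringify(tree);
const restoredTree = reviveNode(JSON.parse(serialized));

const predictions = [[0, 0], [1, 1]].map((row) => predictRow(restoredTree, row));
console.log(predictions);
// → [ 0, 1 ]
```

Because `serialized` is an ordinary string, it can be written to a file and read back later before calling `reviveNode`.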

danechung commented, Aug 4, 2020 (0 reactions)

I have the same problem. I tried to apply the example above, but some of the classes are not exported. Only the classifier is exported:

    import { DecisionTreeClassifier } from './tree';
    export { DecisionTreeClassifier };

So I forked the project and fixed it by exporting the other classes as well:

    import { DecisionTreeClassifier, DecisionNode, Question, Leaf } from './tree';
    export { DecisionTreeClassifier, DecisionNode, Question, Leaf };
