Model interpretation in Python script takes several minutes

Hello!

I am trying to implement model interpretation within my ML pipeline. The interpretation instructions in the README work fine when I run them on the command line. I then tried to include this in my code by writing a small function. The function works, but it takes several minutes to produce an output, unlike the command line, which returns the result immediately. I tried this with just one compound: Clc1ccc(c(Cl)c1Cl)c2c(Cl)cc(Cl)c(Cl)c2Cl

I would really appreciate your help in fixing this. Code and output below.

Best, Vishal

Code:

import os
import sys
import tempfile
import time

import pandas as pd

sys.path.insert(0, './predictors/chemprop')
from chemprop.args import InterpretArgs
from chemprop.interpret import interpret

def get_interpretation(kekule_smiles_df):

    start = time.time()

    # a temporary directory is removed on exit, so no stray CSV file is left behind
    with tempfile.TemporaryDirectory() as tmp_dir:

        data_path = os.path.join(tmp_dir, 'smiles.csv')
        kekule_smiles_df.to_csv(data_path, index=False)

        # interpretation arguments
        intrprt_args = [
            '--data_path', data_path,
            '--checkpoint_path', './models/pampa/gcnn_model.pt',
            '--property_id', '1',
        ]

        # slightly modified the interpret method to return a dataframe for further use
        intrprt_df = interpret(args=InterpretArgs().parse_args(intrprt_args))

    end = time.time()
    print(f'{end - start} seconds to interpret {kekule_smiles_df.shape[0]} molecules')
    return intrprt_df

Output:

Loading pretrained parameter "encoder.encoder.cached_zero_vector".
Loading pretrained parameter "encoder.encoder.W_i.weight".
Loading pretrained parameter "encoder.encoder.W_h.weight".
Loading pretrained parameter "encoder.encoder.W_o.weight".
Loading pretrained parameter "encoder.encoder.W_o.bias".
Loading pretrained parameter "ffn.1.weight".
Loading pretrained parameter "ffn.1.bias".
Loading pretrained parameter "ffn.4.weight".
Loading pretrained parameter "ffn.4.bias".
smiles,score,rationale,rationale_score
ClC1=C(Cl)C(Cl)=C(C2=C(Cl)C(Cl)=C(Cl)C=C2Cl)C=C1,0.981,Clc1c[cH:1]c(Cl)[cH:1]c1[CH3:1],1.000
1469.1523442268372 seconds to interpret 1 molecules

Issue Analytics

  • State: open
  • Created 2 years ago
  • Comments: 7 (4 by maintainers)

Top GitHub Comments

1 reaction
hesther commented, Oct 8, 2021

I will need to run some tests. Unfortunately, the MCTS algorithm is generally not fast. However, you can parallelize it manually, e.g. by splitting your test data into multiple files (command line) or by running multiple processes within a Python script.
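The "multiple processes within a Python script" option could be sketched roughly as follows. This is a minimal illustration, not chemprop's API: `split_into_chunks`, `interpret_chunk`, and `interpret_in_parallel` are hypothetical names, and `interpret_chunk` is only a stand-in for one interpretation run over a subset of the SMILES.

```python
from concurrent.futures import ProcessPoolExecutor
from typing import Callable, List, Sequence


def split_into_chunks(items: Sequence[str], n_chunks: int) -> List[List[str]]:
    """Split items into at most n_chunks contiguous, near-equal chunks."""
    if not items:
        return []
    n_chunks = max(1, min(n_chunks, len(items)))
    size, extra = divmod(len(items), n_chunks)
    chunks, start = [], 0
    for i in range(n_chunks):
        # The first `extra` chunks each take one additional item.
        end = start + size + (1 if i < extra else 0)
        chunks.append(list(items[start:end]))
        start = end
    return chunks


def interpret_chunk(smiles_chunk: List[str]) -> List[dict]:
    """Hypothetical worker: stand-in for one interpretation run.

    In real use this would write smiles_chunk to a CSV file and call the
    interpretation routine on it; here it only echoes its input.
    """
    return [{"smiles": s, "rationale": None} for s in smiles_chunk]


def interpret_in_parallel(
    smiles: Sequence[str],
    n_workers: int = 4,
    worker: Callable[[List[str]], List[dict]] = interpret_chunk,
    executor_cls=ProcessPoolExecutor,
) -> List[dict]:
    """Run worker on each chunk in its own process and merge the rows in order."""
    chunks = split_into_chunks(smiles, n_workers)
    if not chunks:
        return []
    with executor_cls(max_workers=len(chunks)) as pool:
        # map() yields results in the same order as the input chunks.
        results = pool.map(worker, chunks)
        return [row for chunk_rows in results for row in chunk_rows]


if __name__ == "__main__":
    demo = ["CCO", "c1ccccc1", "CC(=O)O", "CCN"]
    for row in interpret_in_parallel(demo, n_workers=2):
        print(row)
```

Each worker process would load the model checkpoint separately, so this kind of split likely only pays off when there are enough molecules per chunk to outweigh that start-up cost.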

I will try to reproduce the weird timing behavior and track it down (if I can reproduce it).
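One generic way to track down where the time goes is to run the slow call under Python's built-in `cProfile` and look at the functions with the highest cumulative time. The sketch below is not specific to chemprop: `profile_call` is a hypothetical helper, and `slow_step` is a placeholder for the function being investigated (for example, `get_interpretation` from the issue).

```python
import cProfile
import io
import pstats
import time


def profile_call(func, *args, top_n=10, **kwargs):
    """Run func under cProfile; return (result, text report of the top_n entries)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args, **kwargs)
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    # Sort by cumulative time so the most expensive call paths come first.
    stats.sort_stats("cumulative").print_stats(top_n)
    return result, buffer.getvalue()


def slow_step():
    """Hypothetical stand-in for the slow interpretation call."""
    time.sleep(0.05)
    return "done"


if __name__ == "__main__":
    result, report = profile_call(slow_step)
    print(report)
```

Comparing such a report for the in-script call against one for the command-line entry point may show whether the extra time is spent in the search itself or in setup such as argument parsing and model loading.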

0 reactions
iwwwish commented, Oct 13, 2021

Thank you @hesther. I am not sure I’ll be able to do this entirely by myself. But if I do get around it, I will definitely open a PR.

