API for getting DBSCAN-like clusterings out of OPTICS with `fit_predict`

Currently, OPTICS exposes DBSCAN-style extraction through a custom method, extract_dbscan. This is good for the usability and visibility of the functionality, but it means that a generic parameter search tool (like GridSearchCV) can’t use OPTICS to perform DBSCAN at various values of eps.

Supporting this would involve adding an eps parameter: when None, the default OPTICS clustering would be used; when not None, extract_dbscan would be used. But we would also need to retain the model across multiple fits…
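For context, here is a minimal sketch of the manual workflow that the proposed eps parameter would wrap: compute the reachability graph once, then extract DBSCAN-like labels at several eps values. It assumes a scikit-learn release that provides the function `sklearn.cluster.cluster_optics_dbscan` (the issue itself refers to a method named extract_dbscan); the blob data and the eps values are illustrative and do not come from the issue.

```python
import numpy as np
from sklearn.cluster import OPTICS, cluster_optics_dbscan
from sklearn.datasets import make_blobs

# Illustrative data (an assumption, not from the issue): three separated blobs.
X, _ = make_blobs(
    n_samples=90,
    centers=[[0, 0], [10, 10], [-10, 10]],
    cluster_std=0.5,
    random_state=0,
)

# Expensive step: compute reachability_, core_distances_ and ordering_ once.
optics = OPTICS(min_samples=5).fit(X)

# Cheap step: reuse the fitted graph to extract labels at several eps values.
labels_by_eps = {
    eps: cluster_optics_dbscan(
        reachability=optics.reachability_,
        core_distances=optics.core_distances_,
        ordering=optics.ordering_,
        eps=eps,
    )
    for eps in (0.5, 1.0, 2.0)
}

for eps, labels in labels_by_eps.items():
    n_clusters = len(set(labels) - {-1})  # -1 marks noise
    n_noise = int(np.sum(labels == -1))
    print(f"eps={eps}: {n_clusters} clusters, {n_noise} noise points")
```

Because the extraction is a plain function call rather than a constructor parameter, a tool such as GridSearchCV has no way to vary eps here, which is the gap the issue describes.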

Here are two alternative interfaces:

  • Add a warm_start parameter (as many classifiers and regressors have, though this is uncharted territory for clusterers). When it is True and fit or fit_predict is called, the current reachability_, ordering_ and core_distances_ would be kept, but a different final clustering step would be used to output and store labels_.
  • Add a memory parameter, as in hierarchical clustering. This would cache the mapping from parameters to reachability_, ordering_ and core_distances_ using a joblib.Memory.

I think the first option sounds more appropriate.
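To make the first option concrete, here is a rough sketch of the warm_start semantics written as a wrapper around OPTICS. The class name `WarmStartOPTICS`, the `optics_` attribute and the `n_graph_fits_` counter are hypothetical names introduced for illustration; none of this is scikit-learn's API, and the sketch assumes `cluster_optics_dbscan` is available.

```python
from sklearn.base import BaseEstimator, ClusterMixin
from sklearn.cluster import OPTICS, cluster_optics_dbscan
from sklearn.datasets import make_blobs


class WarmStartOPTICS(ClusterMixin, BaseEstimator):
    """Hypothetical wrapper illustrating warm_start; not part of scikit-learn."""

    def __init__(self, min_samples=5, eps=None, warm_start=False):
        self.min_samples = min_samples
        self.eps = eps
        self.warm_start = warm_start

    def fit(self, X, y=None):
        # Recompute the graph unless warm_start is set and a graph already exists.
        # Caveat: nothing here checks that X matches the data from the earlier fit.
        if not (self.warm_start and hasattr(self, "optics_")):
            self.optics_ = OPTICS(min_samples=self.min_samples).fit(X)
            self.n_graph_fits_ = getattr(self, "n_graph_fits_", 0) + 1

        if self.eps is None:
            # Default OPTICS clustering.
            self.labels_ = self.optics_.labels_
        else:
            # DBSCAN-like extraction from the stored graph.
            self.labels_ = cluster_optics_dbscan(
                reachability=self.optics_.reachability_,
                core_distances=self.optics_.core_distances_,
                ordering=self.optics_.ordering_,
                eps=self.eps,
            )
        return self


# Illustrative data and eps values (assumptions, not from the issue).
X, _ = make_blobs(n_samples=90, centers=3, random_state=0)

model = WarmStartOPTICS(warm_start=True)
for eps in (0.5, 1.0, 2.0):
    labels = model.set_params(eps=eps).fit_predict(X)

print("graph computations:", model.n_graph_fits_)
```

Since eps is now a constructor parameter, a parameter search can set it like any other. The sketch also exposes the main design risk of warm_start for a clusterer: the estimator silently reuses the old graph even if different data or a different min_samples is passed on a later fit.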

Issue Analytics

  • State: closed
  • Created: 5 years ago
  • Reactions: 1
  • Comments: 76 (74 by maintainers)

Top GitHub Comments

jnothman commented, Oct 30, 2018 (1 reaction)

Regarding the verbs: optics_distances sounds like euclidean_distances. I might prefer optics_sort or sort_optics or optics_order or something. I’m also not sure that extract is better than cluster.

jnothman commented, Mar 12, 2019 (0 reactions)

Yes, adding a memory parameter is one of the options here, and perhaps the simplest and the most consistent with other clustering estimators, i.e. hierarchical clustering.
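As a rough illustration of the memory approach, the sketch below caches the expensive graph computation with joblib.Memory, so that repeated calls with the same data and min_samples only redo the cheap extraction step. The helpers `optics_graph` and `dbscan_labels` and the `graph_computations` list are hypothetical names introduced for this example, and the temporary cache directory is an assumption; this does not show how scikit-learn implements any such parameter.

```python
import tempfile

from joblib import Memory
from sklearn.cluster import OPTICS, cluster_optics_dbscan
from sklearn.datasets import make_blobs

graph_computations = []  # records each cache miss, for illustration only


def optics_graph(X, min_samples):
    """Hypothetical helper: the expensive step, computing the OPTICS graph."""
    graph_computations.append(min_samples)
    model = OPTICS(min_samples=min_samples).fit(X)
    return model.reachability_, model.core_distances_, model.ordering_


# Cache results on disk, keyed by the function's arguments (X and min_samples).
memory = Memory(location=tempfile.mkdtemp(), verbose=0)
cached_optics_graph = memory.cache(optics_graph)


def dbscan_labels(X, eps, min_samples=5):
    """Hypothetical helper: cheap DBSCAN-like extraction on the cached graph."""
    reachability, core_distances, ordering = cached_optics_graph(X, min_samples)
    return cluster_optics_dbscan(
        reachability=reachability,
        core_distances=core_distances,
        ordering=ordering,
        eps=eps,
    )


# Illustrative data and eps values (assumptions, not from the issue).
X, _ = make_blobs(n_samples=90, centers=3, random_state=0)
labels_by_eps = {eps: dbscan_labels(X, eps) for eps in (0.5, 1.0, 2.0)}

print("graph computations:", len(graph_computations))
```

Unlike warm_start, the cache key includes the input data, so a call with a different X or min_samples triggers a fresh computation instead of reusing a stale graph.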
