Kmeans processing of spectral.io.bipfile.BipFile versus numpy array

See original GitHub issue

Hello, I have an MSAM result (7 bands, default float64 array) that I want to run k-means classification on. I need to run np.nan_to_num() on the array before applying k-means. That works fine, and the terminal/notebook output looks like this:

spectral:INFO: k-means iteration 1 - 999998 pixels reassigned.
INFO:spectral:k-means iteration 1 - 999998 pixels reassigned.
spectral:INFO: k-means iteration 2 - 128415 pixels reassigned.
INFO:spectral:k-means iteration 2 - 128415 pixels reassigned.
spectral:INFO: k-means iteration 3 - 127319 pixels reassigned.
INFO:spectral:k-means iteration 3 - 127319 pixels reassigned.
...

I want to save the msam result to disk before the kmeans processing, to save time. To save, I used:

outfilename = 'msam_result.hdr'
envi.save_image(outfilename, cube_msam, dtype=np.float64, force=True)

When I read that file back in to run kmeans on it, I use this code:

msam_result_envi = envi.open('msam_result.hdr', image='msam_result.img')

The result is a BipFile. Here is its metadata:

Data Source:   '.\msam_result.img'
	# Rows:           1000
	# Samples:        1000
	# Bands:             7
	Interleave:        BIP
	Quantization:  64 bits
	Data format:   float64

When I try to run Kmeans on it, I get this terminal output:

Iteration 1... 0.Iteration 1... 0.Iteration 1... 0.Iteration 1... 0.Iteration 1... 0.Iteration 1... 0.Iteration 1... 0.Iteration 1... 0.Iteration 1... 0.Iteration 1... 0.Iteration 1... 0.Iteration 1... 1.Iteration 1... 1.Iteration 1... 1.Iteration 1... 1.Iteration 1... 1.Iteration 1... 1.Iteration 1... 1.Iteration 1... 1.Iteration 1... 1.Iteration 1... 1.Iteration 1... 2.Iteration 1... 2.Iteration 1... 2.Iteration 1... 2.Iteration 1... 2.Iteration 1... 2.Iteration 1... 2.Iteration 1... 2.Iteration 1... 2.Iteration 1... 2.Iteration 1... 3.Iteration 1... 3.Iteration 1... 3.Iteration 1... 3.Iteration 1... 3.Iteration 1... 3.Iteration 1... 3.Iteration 1... 3.Iteration 1... 3.Iteration 1... 3.Iteration 1... 4.Iteration 1... 4.Iteration 1... 4.Iteration 1... 4.Iteration 1

The processing takes too long, and I kill it each time, but it seems to be working. What am I doing wrong here? Do I need to alter the data type or band interleaving of the BipFile, or convert it to an array, before running Kmeans on it? Sorry to trouble you, and thank you in advance…

Issue Analytics

  • State: closed
  • Created 3 years ago
  • Comments: 5 (2 by maintainers)

Top GitHub Comments

1 reaction
tboggs commented, Dec 1, 2020

By default, load() returns the image as a 32-bit float array. To keep a different type (for example, float64), pass the optional dtype keyword to the load method.

1 reaction
tboggs commented, Dec 1, 2020

BipFile (a subclass of SpyFile) is just an interface to the image data file. Accessing the data through it repeatedly is very slow, because the data are read from disk again every time they are needed. It will be much faster if you read the data into memory first:

msam_result_envi = envi.open('msam_result.hdr', image='msam_result.img').load()