Unexpected results when loading images with readImages

Hi,

I am getting unexpected results when loading images (*.jpg) with readImages() from a directory in HDFS. I am getting nulls instead of the binary image data.

The results below were stored as a *.csv file with image_df.repartition(1).write.format("csv").save("/path/to/output_csv"):

hdfs://path/to/test_data/Koala.jpg,"[[B@26e4b33,768,null,3,null]"
hdfs://path/to/test_data/Hydrangeas.jpg,"[[B@170b0284,768,null,3,null]"
hdfs://path/to/test_data/Lighthouse.jpg,"[[B@53d233ed,768,null,3,null]"
hdfs://path/to/MA/test_data/Desert.jpg,"[[B@dcacdc,768,null,3,null]"
hdfs://path/to/MA/test_data/Jellyfish.jpg,"[[B@17d89ff5,768,null,3,null]"
hdfs://path/to/test_data/Penguins.jpg,"[[B@eed9a0e,768,null,3,null]"
hdfs://path/to/test_data/Chrysanthemum.jpg,"[[B@111d14ca,768,null,3,null]"
hdfs://path/to/test_data/Tulips.jpg,"[[B@c3fe28d,768,null,3,null]"

ENV: CDH 5.7.0, Spark 1.6.0, Anaconda 4.2.0

Any help is appreciated.

Issue Analytics

  • State: open
  • Created: 6 years ago
  • Comments: 15 (4 by maintainers)

Top GitHub Comments

1 reaction
thunterdb commented, Aug 18, 2017

I have not investigated further than that, but Spark is not behaving as you would like here. It is taking a string representation of the Java object that contains the row, and by default in Java, arrays are printed as a type tag and hash code rather than as their contents (the [[B@6ad79116 elements). It seems to be a Spark issue independent of images, which I will try to reproduce.
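A quick way to confirm that a CSV dump holds only these array placeholders, and no pixel data, is to scan it for the pattern. The following is a minimal sketch in plain Python; the helper name `has_lost_binary_data` and the regular expression are illustrative assumptions, and the sample rows are copied from the output in the question.

```python
import re

# Java's default toString() for a byte array looks like "[B@26e4b33":
# "[B" is the JVM type tag for byte[], followed by "@" and a hex hash code.
JAVA_BYTE_ARRAY_REPR = re.compile(r"\[B@[0-9a-f]+")


def has_lost_binary_data(csv_line):
    """Return True if the line holds a byte-array placeholder instead of data.

    Hypothetical helper for illustration; not part of Spark or sparkdl.
    """
    return JAVA_BYTE_ARRAY_REPR.search(csv_line) is not None


# Two rows copied from the CSV output shown in the issue.
rows = [
    'hdfs://path/to/test_data/Koala.jpg,"[[B@26e4b33,768,null,3,null]"',
    'hdfs://path/to/test_data/Tulips.jpg,"[[B@c3fe28d,768,null,3,null]"',
]
flags = [has_lost_binary_data(row) for row in rows]
```

If every row is flagged, the image bytes were never written to the file, so they cannot be recovered from the CSV and the data has to be exported again in a binary-safe format.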

In the meantime, though, can you store your images in the Parquet format, for example? It will be more compact, and high-quality readers exist for various languages.
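The Parquet suggestion could look roughly like the sketch below. It assumes `image_df` is the DataFrame returned by readImages() and `sql_context` is an existing SQLContext; the two function names and any output path passed to them are hypothetical, and this has not been verified against the Spark version in the question.

```python
def save_images_as_parquet(image_df, output_path):
    """Write the image DataFrame as Parquet, keeping binary columns intact.

    Hypothetical helper; image_df is assumed to be a Spark DataFrame.
    """
    # DataFrame.write returns a DataFrameWriter. Parquet stores byte arrays
    # as binary values, so no string conversion of the row takes place.
    image_df.write.mode("overwrite").parquet(output_path)


def load_images_from_parquet(sql_context, input_path):
    """Read the Parquet files back into a DataFrame.

    Hypothetical helper; sql_context is assumed to be a SQLContext.
    """
    return sql_context.read.parquet(input_path)
```

For example, `save_images_as_parquet(image_df, "/path/to/output_parquet")` would replace the `write.format("csv")` call from the question, with the output path chosen to suit the cluster.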

0 reactions
hainingren commented, Nov 10, 2017

Hi there - I came across this thread while investigating a similar error: ImportError: No module named sparkdl.image.imageIO. Could you share how you resolved this? Thanks!
