How to read CIFAR images with scala in mmlspark


Currently, example 301, which evaluates a pre-trained CNTK model on CIFAR-10 images, is written entirely in Python. It uses pickle.load to read cifar-10-batches-py/test_batch and then parallelizes the result into a distributed RDD. However, I cannot use pickle directly to read the data in a Scala application.

I tried spark.readImages in mmlspark, but it does not seem to handle the cifar-10-batches-bin data well. In the end I chose cookie-datasets to read the CIFAR data in Scala (the master branch of cookie-datasets still uses spark-1.5, so I upgraded it to spark-2.1 with the necessary changes).
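For reference, the binary version of CIFAR-10 can also be parsed by hand, because each record is a fixed 3073 bytes: one label byte followed by 3072 pixel bytes (1024 red, 1024 green, 1024 blue, each plane in row-major order). Below is a minimal sketch under that assumption. The names `CifarRecord` and `CifarBinReader` are hypothetical and are not part of mmlspark or cookie-datasets.

```scala
import java.nio.file.{Files, Paths}

// Hypothetical container for one CIFAR-10 image:
// a label in 0..9 plus 3072 raw pixel bytes (R plane, then G, then B).
final case class CifarRecord(label: Int, pixels: Array[Byte])

// Hypothetical helper, not an mmlspark API.
object CifarBinReader {
  val ImageBytes: Int = 3 * 32 * 32      // 3072 pixel bytes per image
  val RecordBytes: Int = 1 + ImageBytes  // 1 label byte + pixels = 3073

  // Split the raw contents of one batch file into fixed-size records.
  def parse(bytes: Array[Byte]): Vector[CifarRecord] = {
    require(bytes.length % RecordBytes == 0,
      s"expected a multiple of $RecordBytes bytes, got ${bytes.length}")
    bytes.grouped(RecordBytes).map { rec =>
      // JVM bytes are signed, so mask to get the unsigned label value.
      CifarRecord(rec(0) & 0xff, rec.slice(1, RecordBytes))
    }.toVector
  }

  // Read and parse a single local file such as test_batch.bin.
  def readFile(path: String): Vector[CifarRecord] =
    parse(Files.readAllBytes(Paths.get(path)))
}
```

On a cluster, the same per-record logic could probably be applied to the output of `SparkContext.binaryRecords(path, 3073)`, which yields one `Array[Byte]` per fixed-length record. I have not verified this against mmlspark's image schema, so the conversion into the format the CNTK model expects is still left open.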

BTW, since you only have Python examples, and I have already translated example 101 and part of example 301 to Scala, would you be interested in having this example code?

Issue Analytics

  • State: closed
  • Created 6 years ago
  • Comments: 8 (5 by maintainers)

Top GitHub Comments

1 reaction
alg-jmx commented, Oct 16, 2017

Why doesn't mmlspark have more examples in Scala? I'm more interested in using Scala with Spark in mmlspark~

0 reactions
drdarshan commented, Jul 25, 2017

Hi @Myasuka, I’m closing this issue for now… please reopen if you are still blocked. Thank you!

