[KED-921] Dataset class to save matplotlib figures to images locally

Description

Add a dataset class, MatplotlibWriter, that saves matplotlib figures/plt objects locally as image files. I cannot currently think of a use case for reading images into matplotlib, so I have not included any support for load.

Context

While documenting features, programmatically plotting all features in a standardised way and embedding links to the plots in .md files has been a big timesaver.
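
As an illustration of the workflow described above, here is a minimal sketch of turning a set of saved plot files into Markdown image links. The function name `markdown_image_links` and the `docs_root` parameter are hypothetical and not part of kedro or this proposal; the sketch only assumes that the plots have already been written to disk as image files.

```python
import os
from pathlib import Path
from typing import Iterable, List


def markdown_image_links(image_paths: Iterable[str], docs_root: str = ".") -> List[str]:
    """Return one Markdown image link per path, relative to ``docs_root``.

    Hypothetical helper for illustration only.
    """
    links = []
    for image_path in sorted(image_paths):
        # Alt text is the file name without its extension.
        alt_text = Path(image_path).stem
        # Link targets are written relative to the docs folder, with "/" separators.
        relative = Path(os.path.relpath(image_path, docs_root)).as_posix()
        links.append("![{}]({})".format(alt_text, relative))
    return links
```

The returned lines can then be written into whichever .md file documents the features.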

Possible Implementation

# Copyright 2018-2019 QuantumBlack Visual Analytics Limited
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
# EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES
# OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE, AND
# NONINFRINGEMENT. IN NO EVENT WILL THE LICENSOR OR OTHER CONTRIBUTORS
# BE LIABLE FOR ANY CLAIM, DAMAGES, OR OTHER LIABILITY, WHETHER IN AN
# ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF, OR IN
# CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
#
# The QuantumBlack Visual Analytics Limited (“QuantumBlack”) name and logo
# (either separately or in combination, “QuantumBlack Trademarks”) are
# trademarks of QuantumBlack. The License does not grant you any right or
# license to the QuantumBlack Trademarks. You may not use the QuantumBlack
# Trademarks or any confusingly similar mark as a trademark for your product,
# or use the QuantumBlack Trademarks in any other manner that might cause
# confusion in the marketplace, including but not limited to in advertising,
# on websites, or on software.
#
# See the License for the specific language governing permissions and
# limitations under the License.


"""
``AbstractDataSet`` implementation to save matplotlib objects as image files.
"""

import os.path
from typing import Any, Dict, Optional

from kedro.io import AbstractDataSet, DataSetError, ExistsMixin


class MatplotlibWriter(AbstractDataSet, ExistsMixin):
    """
        ``MatplotlibWriter`` saves matplotlib objects as image files.

        Example:
        ::

            >>> import matplotlib.pyplot as plt
            >>> from kedro.contrib.io.matplotlib import MatplotlibWriter
            >>>
            >>> plt.plot([1,2,3],[4,5,6])
            >>>
            >>> single_plot_writer = MatplotlibWriter(filepath="docs/new_plot.png")
            >>> single_plot_writer.save(plt)
            >>>
            >>> plt.close()
            >>>
            >>> plots = dict()
            >>>
            >>> for colour in ['blue', 'green', 'red']:
            ...     plots[colour] = plt.figure()
            ...     plt.plot([1,2,3],[4,5,6], color=colour)
            ...     plt.close()
            >>>
            >>> multi_plot_writer = MatplotlibWriter(filepath="docs/",
            ...                                      save_args={'multiFile': True})
            >>> multi_plot_writer.save(plots)

    """

    def _describe(self) -> Dict[str, Any]:
        return dict(
            filepath=self._filepath,
            load_args=self._load_args,
            save_args=self._save_args,
        )

    def __init__(
        self,
        filepath: str,
        load_args: Optional[Dict[str, Any]] = None,
        save_args: Optional[Dict[str, Any]] = None,
    ) -> None:
        """Creates a new instance of ``MatplotlibWriter``.

        Args:
            filepath: path to the image file to write, or to the target
                directory when ``multiFile`` is True.
            load_args: Currently ignored as loading is not supported.
            save_args: multiFile: allows for multiple plot objects
                to be saved. Additional save arguments can be found at
                https://matplotlib.org/api/_as_gen/matplotlib.pyplot.savefig.html
        """
        default_save_args = {"multiFile": False}
        default_load_args = {}

        self._filepath = filepath
        self._load_args = self._handle_default_args(load_args, default_load_args)
        self._save_args = self._handle_default_args(save_args, default_save_args)
        # multiFile is an option of this writer, not of savefig, so remove it
        # from the arguments that are forwarded to savefig.
        self._multifile_mode = self._save_args.pop("multiFile")

    @staticmethod
    def _handle_default_args(user_args: dict, default_args: dict) -> dict:
        return {**default_args, **user_args} if user_args else default_args

    def _load(self) -> str:
        raise DataSetError("Loading not supported for MatplotlibWriter")

    def _save(self, data) -> None:

        if self._multifile_mode:

            # exist_ok avoids a race between checking for and creating the directory.
            os.makedirs(self._filepath, exist_ok=True)

            if isinstance(data, list):
                for index, plot in enumerate(data):
                    plot.savefig(
                        os.path.join(self._filepath, str(index)), **self._save_args
                    )

            elif isinstance(data, dict):
                for plot_name, plot in data.items():
                    plot.savefig(
                        os.path.join(self._filepath, plot_name), **self._save_args
                    )

            else:
                raise DataSetError(
                    "multiFile is True but data is of type {} rather "
                    "than dict or list.".format(type(data))
                )

        else:
            data.savefig(self._filepath, **self._save_args)

    def _exists(self) -> bool:
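        # In multiFile mode the filepath is a directory, so check for any
        # existing path rather than for a regular file only.
        return os.path.exists(self._filepath)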

Possible Alternatives

I thought of writing the images from inside pipelines, but this seemed hacky. Leaving out the multiFile option would be technically viable, but given how the writer is meant to be used, it seems a required feature.
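
To make the single-file versus multi-file behaviour easier to discuss, here is a minimal sketch of the same dispatch outside of kedro. The function `save_plots` and its `multi_file` parameter are hypothetical names used only for illustration; the sketch assumes nothing about the plot objects except that they expose a `savefig(path, **kwargs)` method, as matplotlib's `Figure` and the `pyplot` module do.

```python
import os
from typing import Any, List


def save_plots(data: Any, filepath: str, multi_file: bool = False, **save_args: Any) -> List[str]:
    """Save one plot, or a list/dict of plots, and return the names passed to savefig.

    Hypothetical stand-alone sketch of the proposed _save logic.
    """
    if not multi_file:
        # Single-file mode: filepath is the image file itself.
        data.savefig(filepath, **save_args)
        return [filepath]

    if isinstance(data, dict):
        # Dict keys become file names.
        named_plots = list(data.items())
    elif isinstance(data, list):
        # List positions become file names: "0", "1", ...
        named_plots = [(str(index), plot) for index, plot in enumerate(data)]
    else:
        raise TypeError(
            "multi_file is True but data is of type {} rather "
            "than dict or list.".format(type(data))
        )

    # Multi-file mode: filepath is the directory that receives the images.
    os.makedirs(filepath, exist_ok=True)
    written = []
    for name, plot in named_plots:
        target = os.path.join(filepath, name)
        plot.savefig(target, **save_args)
        written.append(target)
    return written
```

With matplotlib, a call such as `save_plots(plots, "docs/", multi_file=True)` would mirror the second docstring example. Note that the names built in multi-file mode carry no extension, so the final file names depend on how `savefig` handles extension-less names.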

Checklist

Include labels so that we can categorise your issue:

  • Add a “Component” label to the issue
  • Add a “Priority” label to the issue

Issue Analytics

  • State: closed
  • Created 4 years ago
  • Reactions: 4
  • Comments: 6 (5 by maintainers)

Top GitHub Comments

willashford commented, Jun 25, 2019 (1 reaction)

True, I will raise the PR with this amendment

921kiyo commented, Oct 4, 2019 (0 reactions)
