Mounting existing PVC not possible.
See original GitHub issue

Hi all,
first of all, thank you for this great project. I think it really adds a lot of value to the whole Kubeflow experience. While playing around a bit with Kale, I ran into an issue while trying something out:
The idea was to see pipeline outputs as files directly in JupyterLab. Therefore I mounted an RWX volume `data-volume` to my notebook pod at `/home/jovyan/data`. Then I wanted to pass the name of the volume to Kale and define the mount point of the volume as `/data`.

The code in my pipeline was prepared to write some output to `/data/filename`. As the volume is RWX, I assumed that the output would be directly visible in my JupyterLab file browser. I think this workflow would really add to the data science experience and the convenience of using pipelines.
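For illustration, a minimal sketch of what such a step's output code might look like (the helper name and file name are hypothetical, not from the original issue):

```python
from pathlib import Path

def write_output(base_dir: str, name: str, data: str) -> Path:
    # In the pipeline step, base_dir would be the volume mount point
    # "/data"; the notebook mounts the same RWX PVC at /home/jovyan/data,
    # so the file would appear in the JupyterLab file browser.
    path = Path(base_dir) / name
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(data)
    return path
```

In the step this would be called as, e.g., `write_output("/data", "results.csv", rows)`.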
However, when running this pipeline I encountered an error in the first pipeline step. The step failed with the following message:
This step is in Error state with this message: Pod "rwx-test2-et0i4-tkphz-2706597555" is invalid: [spec.volumes[2].name: Invalid value: "pvolume-ca6c4cec0854efe7746ed49b8661abc2a924aef77f23218ad5c43fc8258b0dfe": must be no more than 63 characters, spec.containers[0].volumeMounts[2].name: Not found: "pvolume-ca6c4cec0854efe7746ed49b8661abc2a924aef77f23218ad5c43fc8258b0dfe", spec.containers[1].volumeMounts[0].name: Not found: "pvolume-ca6c4cec0854efe7746ed49b8661abc2a924aef77f23218ad5c43fc8258b0dfe"]
I was wondering why this happens, as the volume name `pvolume-ca6c4cec0854efe7746ed49b8661abc2a924aef77f23218ad5c43fc8258b0dfe` was obviously not the one I specified. I tried to track down the problem and came across the part in the `pipeline.py` file where you create the volume in the pipeline:
```python
def auto_generated_pipeline(vol_data='data-volume'):
    pvolumes_dict = OrderedDict()
    annotations = {}
    volume = dsl.PipelineVolume(pvc=vol_data)
    ...
```
This looked fine to me, so I wondered whether something goes wrong in `dsl.PipelineVolume`, and indeed there seems to be a problem with how it is called. In https://github.com/kubeflow/pipelines/blob/df4bc2365e9bfe01e06fb12ce407130ec598d7ce/sdk/python/kfp/dsl/_pipeline_volume.py#L71 the code checks `kwargs` for a `name` parameter. If this parameter is not given, a new name containing a hash is generated.
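A quick back-of-the-envelope check shows why the generated name fails validation. The suffix in the error message above is 64 hex characters, the length of a SHA-256 hex digest (the exact input KFP hashes is an assumption here; only the lengths matter):

```python
import hashlib

# A SHA-256 hex digest is always 64 characters; prefixing "pvolume-"
# adds 8 more, so the generated volume name is 72 characters long,
# which exceeds the 63-character limit Kubernetes enforces.
suffix = hashlib.sha256(b"data-volume").hexdigest()
generated = "pvolume-" + suffix

print(len(suffix))     # 64
print(len(generated))  # 72
```

So any hash-derived name built this way can never pass the Pod spec validation, regardless of how short the original PVC name is.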
Right now I am not exactly sure whether this is a kale issue or a pipelines issue, but I guess this is not the behavior you expected, right?
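For context, the 63-character limit comes from Kubernetes validating volume names as DNS-1123 labels. The following stdlib-only sketch mimics that check (the helper is illustrative, not Kubernetes' actual code):

```python
import re

# DNS-1123 label: at most 63 characters, lowercase alphanumerics and
# '-', starting and ending with an alphanumeric.
DNS1123_LABEL = re.compile(r"^[a-z0-9]([-a-z0-9]*[a-z0-9])?$")

def is_valid_volume_name(name: str) -> bool:
    return len(name) <= 63 and bool(DNS1123_LABEL.fullmatch(name))

print(is_valid_volume_name("data-volume"))          # True
print(is_valid_volume_name("pvolume-" + "f" * 64))  # False: 72 chars
```

Assuming `dsl.PipelineVolume` really does honor a `name` kwarg (as the linked code suggests), passing an explicit short name that satisfies this check would sidestep the hash-generated one.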
Issue Analytics
- Created: 4 years ago
- Comments: 5 (2 by maintainers)
Top GitHub Comments
Yes, PVCs cannot be mounted on pods living in a different namespace. If I'm not wrong, with multi-user support, pipelines will be created in different namespaces (wherever the user has access to, e.g., `kubeflow-user1` or `kubeflow-myteam`). Then, the steps will be able to mount PVCs in that same namespace.

I think I should close this issue. Feel free to reopen if anything occurs!
@h4gen @Felihong,

I forgot to mention that this functionality is possible using Rok, which is included in MiniKF. We overcome this issue by taking a snapshot of the PVC and then cloning it in the `kubeflow` namespace, where the workflow is deployed, as the first step of the pipeline.

You can try it out by deploying the latest MiniKF on GCP. Please find a tutorial exploiting this feature here.