selecting a file/dir from directory listing with symlinking


Expected Behavior

In an ExpressionTool, when selecting a file or subdirectory from an input directory, I want the selected file to be symlinked rather than copied.

Is there a way to do that?

This feature would be really useful when files are big, especially when the ExpressionTool output is an intermediate file in a workflow and is not kept in the final output.

Actual Behavior

I did not find any way to implement the behavior I am looking for. The selected file is copied in my workflows. Below is an example of an ExpressionTool that selects a file from a directory.

Workflow Code

cwlVersion: v1.0
class: ExpressionTool
id: fileFromDir
requirements:
  InlineJavascriptRequirement: {}
inputs:
  dir:
    type: Directory
  prefix:
    type: string
    default: ''
  basename:
    type: string
  suffix:
    type: string
outputs:
  xfile:
    type: 'File'
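expression: >
  ${
    /* Name of the file to pick from the directory listing. */
    var targetFileName = inputs.prefix + inputs.basename + inputs.suffix;
    var targetFile = null;
    for (var i = 0; i < inputs.dir.listing.length; i++) {
      var itemFile = inputs.dir.listing[i];
      if (itemFile.basename == targetFileName) {
        targetFile = itemFile;
        break;
      }
    }
    /* Fail loudly instead of returning an undefined File. */
    if (targetFile === null) {
      throw new Error("No entry named " + targetFileName + " in " + inputs.dir.basename);
    }
    return {'xfile': targetFile};
  }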
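
The selection logic in the expression can be tried outside a CWL runner. The following is a minimal sketch for Node.js, not part of the original issue: `pickFromListing` is a hypothetical helper name, and `exampleDir` is made-up data shaped like a CWL Directory object with a shallow listing. It returns the matching File object unchanged, which suggests that the expression itself moves no data and that copying or symlinking is decided by the runner when it stages the result.

```javascript
// Stand-alone version of the picker logic, runnable with Node.js.
// `pickFromListing` is a hypothetical helper, not part of CWL or cwltool.
function pickFromListing(dir, prefix, basename, suffix) {
  const targetFileName = prefix + basename + suffix;
  // Look for the listing entry whose basename matches exactly.
  const match = (dir.listing || []).find((item) => item.basename === targetFileName);
  if (match === undefined) {
    throw new Error("No entry named " + targetFileName + " in listing");
  }
  // The File object is returned as-is: same location, no data copied here.
  return { xfile: match };
}

// Made-up input shaped like a CWL Directory object (paths are illustrative).
const exampleDir = {
  class: "Directory",
  basename: "test_dir",
  listing: [
    { class: "File", basename: "hello1.txt", location: "file:///tmp/test_dir/hello1.txt" },
    { class: "File", basename: "hello2.txt", location: "file:///tmp/test_dir/hello2.txt" },
  ],
};

console.log(pickFromListing(exampleDir, "", "hello1", ".txt").xfile.location);
```

Unlike a plain loop that falls through, this sketch throws when no entry matches, so a typo in the prefix or suffix fails early instead of yielding an undefined output.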

Your Environment

  • cwltool version: 3.0.20200709181526

Issue Analytics

  • State: closed
  • Created 3 years ago
  • Comments: 5 (3 by maintainers)

Top GitHub Comments

jfouret commented, Dec 16, 2020 (1 reaction)

Thank you very much for your responsiveness.

My bad, everything is fine with this implementation in cwltool.

Because of a problem I had related to #888, I had temporarily replaced the cwltool command line with toil-cwl-runner.

Using cwltool while keeping temporary files, we can see that everything is fine:

$ cwltool --parallel --force-docker-pull --verbose --no-match-user --on-error 'continue' --tmpdir-prefix tmp --cachedir cache --leave-tmpdir --timestamps --outdir outdir file-picker-wf.cwl --my_dir test_dir

[...]

$ tree .
.
├── cache
│   ├── 0e647ac658d6b4319997e3bdbb3e5a33
│   │   └── 413bfea21ae37cc658b007d2fd2b80b4d5dd3db0
│   └── 0e647ac658d6b4319997e3bdbb3e5a33.status
├── file-picker-wf.cwl
├── outdir
│   └── 413bfea21ae37cc658b007d2fd2b80b4d5dd3db0
├── test_dir
│   ├── hello1.txt
│   └── hello2.txt
└── tmp

5 directories, 6 files

The file hello1.txt is not copied, neither into cache nor into tmp.

However, the problem remains with toil-cwl-runner:

$ toil-cwl-runner --jobStore file:cache/test --strict-memory-limit --force-docker-pull --logDebug --clean never --outdir outdir file-picker-wf.cwl --my_dir test_dir

[...]

$ tree .

.
├── cache
│   └── test
│       ├── files
│       │   ├── for-job
│       │   │   ├── kind-CWLWorkflow
│       │   │   │   └── instance-f0yieq5v
│       │   │   └── kind-file_home_users_jfouret_toil_test_file-picker-wf.cwl_first_20f90064-e089-49f8-b671-2b61809f0bf9
│       │   │       └── instance-wqh50svk
│       │   │           ├── file-c7rkl6um
│       │   │           │   └── hello1.txt
│       │   │           └── file-zktxmvep
│       │   │               └── 65588ba8669eb87edeb7befb2706bef493704366 -> /home/users/jfouret/toil_test/outdir/65588ba8669eb87edeb7befb2706bef493704366
│       │   ├── no-job
│       │   │   ├── file-3l6j87kh
│       │   │   ├── file-d197eyo7
│       │   │   │   └── stream
│       │   │   ├── file-qyhetvsm
│       │   │   ├── file-sc4qb0xf
│       │   │   │   └── stream
│       │   │   └── file-z9hgcs_j
│       │   └── shared
│       │       ├── config.pickle
│       │       ├── environment.pickle
│       │       ├── pid.log
│       │       ├── rootJobReturnValue
│       │       ├── rootJobStoreID
│       │       └── succeeded.log
│       ├── jobs
│       │   ├── kind-CWLWorkflow
│       │   ├── kind-ResolveIndirect
│       │   ├── kind-file_home_users_jfouret_toil_test_file-picker-wf.cwl_first_20f90064-e089-49f8-b671-2b61809f0bf9
│       │   └── kind-file_home_users_jfouret_toil_test_file-picker-wf.cwl_second_50e162d9-d940-426c-b499-3f1ab7e8218b
│       └── stats
│           ├── stats8p8nzd5o.new
│           ├── stats8pz85nbi.new
│           ├── statsacbvpvk8.new
│           ├── statsnjpn2rro
│           └── statsvrbu1oen.new
├── file-picker-wf.cwl
├── outdir
│   ├── 2093ea3651180b15e0b70374fab093583577c459
│   └── 65588ba8669eb87edeb7befb2706bef493704366
├── test_dir
│   ├── hello1.txt
│   └── hello2.txt
└── tmp

26 directories, 20 files

The problem is that hello1.txt is copied into the job store (the temporary directory), at cache/test/files/for-job/kind-file_home_users_jfouret_toil_test_file-picker-wf.cwl_first_20f90064-e089-49f8-b671-2b61809f0bf9/instance-wqh50svk/file-c7rkl6um/hello1.txt, rather than being symlinked.

I had not seen that #888 had been fixed. I will try the fix.

I am going to create an issue for toil and reference this post.

Thanks,

mr-c commented, Dec 16, 2020 (0 reactions)

Thank you @jfouret for the update and your persistence! I’ll close this issue for now. I hope the fix for #888 works for you 😃
