Add a flag to preserve the call.name field
If not using the sample optimized tables, but only using the variant optimized tables, it would be nice for vcf_to_bq to produce a table with the name field in the call record instead of the sample_id.
One can translate the sample_id to the name, and the table with the sample_id is (typically) going to be smaller than one with the name, but the SQL complexity and overhead of doing the translation do not seem worthwhile for all users.
Adding a flag, such as --preserve-call-name (boolean), would be one option. That said, it might be nice to allow users to specify what fields they want in the call record. For example:
--call-fields="name,genotype,DP,AD"
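To make the two proposed options concrete, here is a minimal sketch of how they might be declared with Python's standard argparse module. The function name build_parser and the help texts are hypothetical, and this is not necessarily the interface that was eventually merged into vcf_to_bq; the default value simply mirrors the behavior described in this issue as the current default.

```python
import argparse


def build_parser():
    """Build a parser holding the two options proposed in this issue (hypothetical sketch)."""
    parser = argparse.ArgumentParser(
        description="Sketch of the proposed call-record options.")
    # Option 1 from the issue: a plain boolean switch.
    parser.add_argument(
        "--preserve-call-name",
        action="store_true",
        help="Keep the call name in the call record instead of sample_id.")
    # Option 2 from the issue: an explicit, comma-separated field list.
    # The default is an assumption based on the behavior the issue calls
    # "the current default".
    parser.add_argument(
        "--call-fields",
        default="sample_id,*",
        help="Comma-separated list of fields to keep in the call record.")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args(["--call-fields=name,genotype,DP,AD"])
    print(args.call_fields.split(","))
```

argparse converts the dashes in each flag to underscores, so the values are read back as args.preserve_call_name and args.call_fields.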
The tricky case would be around the sample_id and name, and making the most common cases straightforward. It would be nice not to have to specify a full list of fields just to pick up the call name instead of the sample_id, so I propose the following:
- All fields, replace sample_id with name: --call-fields="name,*"
- All fields, replace name with sample_id (the current default behavior): --call-fields="sample_id,*"
- All fields, include both name and sample_id: --call-fields="sample_id,name,*"
I’m not really sure when someone would use the last option, so I’m not opposed to it being unavailable.
You could make the options SQL-like with the SELECT * EXCEPT syntax, but that feels a bit like overkill.
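The wildcard forms proposed above could be resolved along the following lines. This is only a sketch of one possible interpretation: the function resolve_call_fields, the constant IDENTIFIER_FIELDS, and the example field list are all hypothetical, and it assumes that * expands to every call field except sample_id and name, which appear only when listed explicitly.

```python
# Assumption: these two fields are never pulled in by "*"; they must be named.
IDENTIFIER_FIELDS = ("sample_id", "name")


def resolve_call_fields(spec, available_fields):
    """Expand a --call-fields style spec into an ordered list of field names.

    spec: comma-separated string such as "name,*" (hypothetical syntax).
    available_fields: every field that could appear in the call record.
    """
    resolved = []
    for field in (part.strip() for part in spec.split(",")):
        if not field:
            continue  # tolerate stray commas
        if field == "*":
            # Wildcard: every available field that is not an identifier.
            expansion = [f for f in available_fields
                         if f not in IDENTIFIER_FIELDS]
        elif field in available_fields:
            expansion = [field]
        else:
            raise ValueError("Unknown call field: %s" % field)
        for name in expansion:
            if name not in resolved:  # keep first occurrence, drop repeats
                resolved.append(name)
    return resolved


if __name__ == "__main__":
    fields = ["sample_id", "name", "genotype", "phaseset", "DP", "AD"]
    for spec in ("name,*", "sample_id,*", "sample_id,name,*"):
        print(spec, "->", resolve_call_fields(spec, fields))
```

Under this interpretation an explicit list such as "name,genotype,DP,AD" passes through unchanged, so the short wildcard forms and the full list share one code path.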
Issue Analytics
- Created 3 years ago
- Comments:9 (5 by maintainers)
This change was pushed to master in PR #677 and will be included in the next release.
Yes, most probably call.name, to keep consistency with v1 schema.