In DITA-OT project files, allow subsets of deliverables to be published
Description
Currently, DITA-OT project file publishing is limited to publishing one deliverable or all deliverables.
To publish subsets of deliverables, the <deliverable> definitions must be split into multiple files, with subsets built using a hierarchy of <include> directives.
This has the following drawbacks:
- Many files can be required.
  - We have 100+ contexts, shared by dozens of products, organized into several help collections.
  - Our deliverable definitions are resistant to grouping in fewer files, due to being shared where products overlap.
- It is difficult to know what a particular file publishes without tracing through multiple files.
- It is difficult to explore the evolution of project file content over time in a revision control system.
- Maintenance operations (search-and-replace, adjusting collections) are more complex.
- <context> and <publication> information must be replicated in every deliverable file to allow individual publishing (inelegant, but technically harmless).
Possible Solution
It would be useful to be able to natively define collections of deliverables (and collections of collections), perhaps like this:
where a “collection” represents its subset of referenced deliverables:
dita --project project.xml --deliverable product3
When a collection is published, its deliverables should be published in the order resulting from expanding the references. This provides support for order-dependent deliverables (such as HTML online help that cleans its output directory first, followed by PDF deliverables written to that same output directory).
If the reference expansion includes a deliverable multiple times, the deliverable should only be published once (the first occurrence seems reasonable).
Detection for circular collection reference loops would be needed.
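The three rules above (publish in expansion order, publish each deliverable only once at its first occurrence, and reject circular references) can be illustrated with a small sketch. This is not DITA-OT code: the `expand` function, the dictionary layout, and the ids `help-html`, `guide-pdf`, `ref-pdf`, and `product1` are hypothetical names chosen for the example; only `product3` appears in the command above.

```python
from typing import Dict, List, Set

# Hypothetical example data: three deliverables and two collections,
# where "product3" references another collection and repeats "guide-pdf".
deliverables: Set[str] = {"help-html", "guide-pdf", "ref-pdf"}
collections: Dict[str, List[str]] = {
    "product1": ["help-html", "guide-pdf"],
    "product3": ["product1", "ref-pdf", "guide-pdf"],
}


def expand(target: str,
           deliverables: Set[str],
           collections: Dict[str, List[str]]) -> List[str]:
    """Return the deliverable ids to publish for `target`, in publish order."""
    ordered: List[str] = []   # result, in first-occurrence order
    seen: Set[str] = set()    # deliverables already scheduled
    active: List[str] = []    # collections currently being expanded

    def visit(ref: str) -> None:
        if ref in deliverables:
            if ref not in seen:              # keep only the first occurrence
                seen.add(ref)
                ordered.append(ref)
        elif ref in collections:
            if ref in active:                # the collection references itself
                loop = " -> ".join(active[active.index(ref):] + [ref])
                raise ValueError(f"circular collection reference: {loop}")
            active.append(ref)
            for child in collections[ref]:   # depth-first, in document order
                visit(child)
            active.pop()
        else:
            raise KeyError(f"unknown deliverable or collection: {ref}")

    visit(target)
    return ordered


print(expand("product3", deliverables, collections))
# -> ['help-html', 'guide-pdf', 'ref-pdf']
```

A depth-first walk in document order keeps the HTML help ahead of the PDFs that share its output directory, and tracking the collections currently being expanded is enough to report the exact loop when one exists.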
Potential Alternatives
I considered having a deliverable declare its dependencies via a new <depends> element, but this made it difficult to specify whether to publish a deliverable alone or with its dependencies, plus there was no intuitive way to describe order dependency.
Additional Context
A testcase is included:
ditaot_project_file_collections.zip
The project.xml file in the testcase uses the format proposed above, although other implementations are possible.
@chrispy-snps, not much to add. This is exactly how it should work.
@jelovirt - I am normally not a fan of hyphens in element names, but I have come to prefer <deliverable-set> because of dita --deliverable. So regardless of the ordering connotation that comes with <deliverable-set>, I think it’s more intuitive for users.
An ordering guarantee for deliverable generation is a nice-to-have, but not a requirement.
For us, the documentation for a product family is (1) one Oxygen WebHelp deliverable that contains all the books as submaps, plus (2) individual book PDFs placed into that same output directory:
Because WebHelp is HTML-based, we must clean the output directory before publishing it to avoid orphaned files (see #1199), but that must be done before the PDF deliverables are also written into that directory.
If there were an ordering guarantee “built into” deliverable sets, then we could invoke output-cleaning plugins in a certain way. If not, then output-cleaning must be moved to a preprocessing operation that computes and cleans all output directories for the deliverables to be published. This is easy in automated Linux publishing, where I can write a wrapper script to do that, but not easy in Oxygen, where writers publish interactively from the software’s UI. They will need to remember to manually delete their output directory from time to time.
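As a rough illustration of the preprocessing alternative, a wrapper script could first collect the distinct output directories of the deliverables to be published, empty them, and only then run the builds. The sketch below is an assumption-laden outline, not an existing tool: the function names and the `output_dirs` mapping (deliverable id to output directory) are hypothetical, and how that mapping is derived from the project file is left open. The `dita --project ... --deliverable ...` invocation is the one shown earlier in this issue.

```python
import shutil
from pathlib import Path
from typing import Dict, List


def unique_output_dirs(to_publish: List[str],
                       output_dirs: Dict[str, str]) -> List[Path]:
    """Distinct output directories of the deliverables, in first-use order."""
    dirs: List[Path] = []
    for deliverable_id in to_publish:
        path = Path(output_dirs[deliverable_id])
        if path not in dirs:      # several deliverables may share a directory
            dirs.append(path)
    return dirs


def clean_output_dirs(dirs: List[Path]) -> None:
    """Empty each directory once, before any deliverable is published."""
    for path in dirs:
        shutil.rmtree(path, ignore_errors=True)   # a missing directory is fine
        path.mkdir(parents=True, exist_ok=True)


def publish_commands(project_file: str, to_publish: List[str]) -> List[List[str]]:
    """Build one dita command line per deliverable (nothing is executed here)."""
    return [["dita", "--project", project_file, "--deliverable", d]
            for d in to_publish]
```

Because every directory is cleaned before the first build starts, the result no longer depends on the order in which the deliverables are published. A caller would pass each returned command to `subprocess.run` after cleaning.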
And if the DITA-OT some day implements a “parallel deliverable publishing” capability (wouldn’t that be cool!), the ordering guarantee would have to be discarded, because a complete linear sequencing of deliverables is inherently incompatible with parallelization.
So if an ordering guarantee is implemented, I think value could be obtained from it. But it’s just a nice-to-have, not a requirement.
@xephon2 - do you have any thoughts on order of deliverable generation in a deliverable set?