In DITA-OT project files, allow subsets of deliverables to be published

Description

Currently, DITA-OT project file publishing is limited to publishing one deliverable or all deliverables.

To publish subsets of deliverables, the <deliverable> definitions must be split into multiple files, with subsets built using a hierarchy of <include> directives:

[Image: files]

This has the following drawbacks:

  • Many files can be required.
    • We have 100+ contexts, shared by dozens of products, organized into several help collections.
    • Our deliverable definitions are resistant to grouping in fewer files, due to being shared where products overlap.
  • It is difficult to know what a particular file publishes without tracing through multiple files.
  • It is difficult to explore the evolution of project file content over time in a revision control system.
  • Maintenance operations (search-and-replace, adjusting collections) are more complex.
  • <context> and <publication> information must be replicated in every deliverable file to allow individual publishing (inelegant, but technically harmless).

Possible Solution

It would be useful to be able to natively define collections of deliverables (and collections of collections), perhaps like this:

[Image: collections]

where a “collection” represents its subset of referenced deliverables:

dita --project project.xml --deliverable product3

When a collection is published, its deliverables should be published in the order resulting from expanding the references. This provides support for order-dependent deliverables (such as HTML online help that cleans its output directory first, followed by PDF deliverables written to that same output directory).

If the reference expansion includes a deliverable multiple times, the deliverable should only be published once (the first occurrence seems reasonable).

Detection of circular collection references would be needed.
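The expansion rules described above (publish in reference order, keep only the first occurrence of a repeated deliverable, reject circular references) could be sketched roughly as follows. This is only an illustration in Python, not DITA-OT code: the `expand` function, the dictionary representation of collections, and the ids `product1`, `p1_html`, `p1_pdf`, `shared_pdf`, and `p3_html` are hypothetical. Only `product3` comes from the command shown earlier.

```python
from typing import Dict, List, Tuple


class CircularReferenceError(ValueError):
    """Raised when a collection refers back to itself, directly or indirectly."""


def expand(name: str, collections: Dict[str, List[str]]) -> List[str]:
    """Return deliverable ids in publishing order, each listed only once.

    Assumption: any id that is not a key of `collections` is a deliverable.
    """
    ordered: List[str] = []
    seen = set()

    def visit(ref: str, path: Tuple[str, ...]) -> None:
        if ref in path:
            cycle = " -> ".join(path + (ref,))
            raise CircularReferenceError(f"circular collection reference: {cycle}")
        if ref in collections:
            # A collection: expand its references in document order.
            for child in collections[ref]:
                visit(child, path + (ref,))
        elif ref not in seen:
            # A deliverable: keep only its first occurrence.
            seen.add(ref)
            ordered.append(ref)

    visit(name, ())
    return ordered


# Hypothetical collections; "product3" nests "product1" and repeats "p1_pdf".
collections = {
    "product1": ["p1_html", "p1_pdf"],
    "product3": ["product1", "shared_pdf", "p1_pdf", "p3_html"],
}

print(expand("product3", collections))
# ['p1_html', 'p1_pdf', 'shared_pdf', 'p3_html']
```

Tracking the current reference path, rather than a global "visited" set, lets the same collection be referenced from several places without being mistaken for a loop.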

Potential Alternatives

I considered having a deliverable declare its dependencies via a new <depends> element, but this made it difficult to specify whether to publish a deliverable alone or with its dependencies, plus there was no intuitive way to describe order dependency.

Additional Context

A testcase is included:

ditaot_project_file_collections.zip

The project.xml file in the testcase uses the format proposed above, although other implementations are possible.

Issue Analytics

  • State: open
  • Created a year ago
  • Comments: 14 (14 by maintainers)

Top GitHub Comments

xephon2 commented, Apr 13, 2022 (1 reaction)

@chrispy-snps, not much to add. This is exactly how it should work.

chrispy-snps commented, Jun 3, 2022 (0 reactions)

@jelovirt - I am normally not a fan of hyphens in element names, but I have come to prefer <deliverable-set> because:

  • It makes more sense with dita --deliverable.
  • It keeps terminology focused on the “big three” - contexts, publications, deliverables - without introducing a fourth term.
  • Its purpose and relationship to the “big three” are more intuitively clear without referring back to documentation.

So regardless of the ordering connotation that comes with <deliverable-set>, I think it’s more intuitive for users.

An ordering guarantee for deliverable generation is a nice-to-have, but not a requirement.

For us, the documentation for a product family is (1) one Oxygen WebHelp deliverable that contains all the books as submaps, plus (2) individual book PDFs placed into that same output directory:

[Image: html5_and_pdf]

Because WebHelp is HTML-based, we must clean the output directory before publishing it to avoid orphaned files (see #1199), and that cleaning must happen before the PDF deliverables are written into the same directory.

If there were an ordering guarantee “built into” deliverable sets, then we could invoke output-cleaning plugins in a certain way. If not, then output-cleaning must be moved to a preprocessing operation that computes and cleans all output directories for the deliverables to be published. This is easy in automated Linux publishing, where I can write a wrapper script to do that, but not easy in Oxygen, where writers publish interactively from the software’s UI. They will need to remember to manually delete their output directory from time to time.
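The preprocessing operation mentioned above might look something like this in a wrapper script. It is a minimal sketch under stated assumptions, not an existing DITA-OT feature: the function names are hypothetical, and the list of output directories is assumed to have been gathered already from the deliverables to be published (how to extract it from the project file is not shown).

```python
import shutil
import tempfile
from pathlib import Path
from typing import Iterable, List


def unique_output_dirs(output_dirs: Iterable[str]) -> List[Path]:
    """Resolve each directory and keep only the first occurrence of each."""
    unique: List[Path] = []
    for directory in output_dirs:
        resolved = Path(directory).resolve()
        if resolved not in unique:
            unique.append(resolved)
    return unique


def clean_output_dirs(output_dirs: Iterable[str]) -> List[Path]:
    """Empty every distinct output directory once, before any publishing starts."""
    cleaned = unique_output_dirs(output_dirs)
    for directory in cleaned:
        if directory.is_dir():
            shutil.rmtree(directory)  # destructive: removes all existing output
        directory.mkdir(parents=True, exist_ok=True)
    return cleaned


if __name__ == "__main__":
    # Demonstration in a temporary directory: an HTML and a PDF deliverable
    # (hypothetical) share one output directory, which is cleaned only once.
    base = Path(tempfile.mkdtemp())
    shared = base / "out" / "family1"
    shared.mkdir(parents=True)
    (shared / "orphan.html").write_text("stale")
    print(clean_output_dirs([str(shared), str(shared)]))
```

Because all cleaning happens before the first deliverable is published, the result does not depend on the order in which the deliverables are later generated.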

And someday, if DITA-OT implements a “parallel deliverable publishing” capability (wouldn’t that be cool!), the ordering guarantee would have to be discarded, because a complete linear sequencing of deliverables is inherently incompatible with parallelization.

So if an ordering guarantee is implemented, I think value could be obtained from it. But it’s just a nice-to-have, not a requirement.

@xephon2 - do you have any thoughts on order of deliverable generation in a deliverable set?