New common merge process

We currently have a single merge process in OT, the one originally developed for PDF. However, it’s not really optimal for general use and IMO is not particularly good. We should develop new merge code based on this code and migrate PDF to use it.

The current output from merge is

<map>
  <opentopic:map>
    <!-- map contents -->
  </opentopic:map>
  <!-- topics -->
</map>

I suggest the new output be like a compound topic, but one that can also contain a map. Also, the map should be output at the end of the document, because that’s more efficient from a processing point of view.

<dita>
  <!-- topics -->
  <map>
    <!-- map contents -->
  </map>
</dita>
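
As a rough illustration of the proposed layout (not DITA-OT's actual merge code), the following Python sketch rearranges a document in the current merged shape into the suggested one: topics first, map last. The function name `to_new_layout` is hypothetical, and the namespace URI bound to the `opentopic` prefix is an assumption; adjust it to match your files.

```python
import xml.etree.ElementTree as ET

# Assumed namespace URI for the opentopic prefix; adjust to match your files.
OPENTOPIC_NS = "http://www.idiominc.com/opentopic"


def to_new_layout(merged_root):
    """Rearrange <map><opentopic:map/>topics</map> into <dita>topics<map/></dita>."""
    new_root = ET.Element("dita")
    new_map = ET.Element("map")
    for child in merged_root:
        if child.tag == f"{{{OPENTOPIC_NS}}}map":
            # Map contents move into a plain <map> that is emitted last.
            new_map.extend(list(child))
        else:
            # Topics keep their original order at the start of the document.
            new_root.append(child)
    new_root.append(new_map)
    return new_root


sample = f"""<map xmlns:opentopic="{OPENTOPIC_NS}">
  <opentopic:map><topicref href="a.dita"/></opentopic:map>
  <topic id="a"><title>A</title></topic>
</map>"""
result = to_new_layout(ET.fromstring(sample))
print([child.tag for child in result])
```

With the map at the end, a processor that reads the document sequentially has already seen every topic by the time it reaches the map.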

Issue Analytics

  • State: closed
  • Created 7 years ago
  • Comments: 10 (8 by maintainers)

Top GitHub Comments

ballums commented, Sep 23, 2016

Radu and Eliot,

I can offer some feedback on creating a single merged map at the beginning of the process. That’s how ePublisher handles DITA map inputs, because it closely approximates ePublisher’s expectations for legacy document inputs (Word and FrameMaker).

Since ePublisher has been working with a single merged map for all operations for the past 10 years, I can tell you the approach has its ups and downs. The biggest issue relates to memory. While you can side-step certain problems by running under 64-bit Java, the fact is, if you load a 110 MB DITA map into memory, you have a lot of data to work with. XPath operations are simple (you can use xsl:number operations). But breaking files back out into HTML chunks is complicated, because ePublisher flattens DITA hierarchies in its intermediate format. This issue would not affect DITA-OT if you preserve the DITA map hierarchy in the merged document. For our 2016.1 release, we spent a not insignificant amount of time finding ways to chunk large intermediate files in order to conserve memory during processing. It works very well, yet I wouldn’t wish the work on anyone if they can avoid it.

The main point is that some folks pull in a LOT of data using nested DITA maps. I’d suggest you benchmark the generation of HTML files (roughly one per DITA topic) against the generation of a PDF (merging all topics) and see how your memory/performance numbers line up. To really stress test it, use 32-bit Java. In the past, when adding support for DITA-OT 1.8 to ePublisher, we encountered memory issues with 32-bit Java. We did add support for 64-bit Java, but also wound up implementing an XSL merge that operates within 32-bit memory limits for large DITA maps (100 MB+). I understand there have been many updates to the merge code in later DITA-OT releases. I’m not sure if the memory issue was addressed. It might be worth double-checking before going down the single merged file path.
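
To make the memory concern concrete, here is a minimal Python sketch (an illustration only, unrelated to how ePublisher or DITA-OT are implemented) that walks a merged document incrementally instead of loading the whole tree at once. The helper name `count_topics_streaming` and the synthetic input are hypothetical.

```python
import io
import xml.etree.ElementTree as ET


def count_topics_streaming(source):
    """Count <topic> elements while parsing the input incrementally."""
    count = 0
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == "topic":
            count += 1
            # Drop the finished topic's children and text to reduce memory held.
            elem.clear()
    return count


# Synthetic merged document: 1000 small topics followed by an empty map.
merged = "<dita>" + "".join(
    f'<topic id="t{i}"><title>T{i}</title></topic>' for i in range(1000)
) + "<map/></dita>"
topic_count = count_topics_streaming(io.BytesIO(merged.encode("utf-8")))
print(topic_count)
```

A similar loop over a real merged file, wrapped in a memory profiler, would be one way to compare the per-topic and merged-document approaches before committing to either.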

stale[bot] commented, Dec 12, 2018

This issue has been automatically marked as stale because it has not been updated recently. It will be closed soon if no further activity occurs. Thank you for your contributions.
