Repetitive merge_range to same range causes warning in Excel 2016


Hi,

When I, through a bug in my program, repeatedly call merge_range on the same range, Excel produces two warnings:

  1. "We found a problem with some content in ‘Range.XLSX’. Do you want us to try to recover as much as we can? If you trust the source of this workbook, click Yes."
  2. After clicking Yes: "Excel was able to open the file by repairing or removing the unreadable content. Removed records: Merge cells from /xl/worksheets/sheet2.xml part"

I am using Python version 3.4.3 and XlsxWriter 0.8.6 and Excel 2016.

Here is some code that demonstrates the problem:

    import xlsxwriter

    workbook = xlsxwriter.Workbook('range.xlsx')
    worksheet = workbook.add_worksheet()

    merge_format = workbook.add_format({
        'bold': True,
        'align': 'center'
        })

    headings = (
        ("Alpha",   "L1:W1"),
        ("Bravo",   "X1:AI1"),
        ("Charlie", "AJ1:AU1"),
        ("Delta",   "AV1:BG1"))

    for (product, coordinates) in headings:
        # Bug being demonstrated: every pass merges "L1:W1" instead of `coordinates`.
        worksheet.merge_range("L1:W1", product, merge_format)

    workbook.close()

Granted, it was my own program error that caused merge_range to be called repeatedly on the same coordinates. My guess is that all four merge_range calls were written to the spreadsheet, and when Excel saw four merges on the same range, it threw its hands up. In recovering the spreadsheet, Excel used the last value, Delta, in the range.
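Until the library rejects duplicate merges itself, a caller can check its own list of ranges before writing anything. The sketch below is one way to do that using only the standard library; `parse_range` and `first_conflict` are hypothetical helper names, not part of XlsxWriter, and the parser assumes plain A1-style ranges such as "L1:W1" with no sheet name or `$` markers.

```python
import re

def parse_range(a1_range):
    """Convert an A1 range such as 'L1:W1' to a zero-based, half-open
    (top, left, bottom, right) tuple. Hypothetical helper, not an XlsxWriter API."""
    match = re.fullmatch(r"([A-Z]+)(\d+):([A-Z]+)(\d+)", a1_range)
    if match is None:
        raise ValueError("not a plain A1 range: %r" % a1_range)
    col1, row1, col2, row2 = match.groups()

    def col_index(letters):
        # Column letters are a base-26 number with digits A=1 .. Z=26.
        index = 0
        for ch in letters:
            index = index * 26 + (ord(ch) - ord("A") + 1)
        return index - 1

    top, bottom = sorted((int(row1) - 1, int(row2) - 1))
    left, right = sorted((col_index(col1), col_index(col2)))
    return top, left, bottom + 1, right + 1

def first_conflict(ranges):
    """Return the first pair of overlapping ranges, or None if there is none."""
    seen = []
    for name in ranges:
        t, l, b, r = parse_range(name)
        for other, (T, L, B, R) in seen:
            # Half-open boxes overlap when they overlap on both axes.
            if t < B and T < b and l < R and L < r:
                return other, name
        seen.append((name, (t, l, b, r)))
    return None

# The intended headings do not collide; the buggy loop's ranges do.
print(first_conflict(["L1:W1", "X1:AI1", "AJ1:AU1", "AV1:BG1"]))
print(first_conflict(["L1:W1"] * 4))
```

Running a check like this over `headings` before the loop would have flagged the repeated "L1:W1" immediately. It is a quadratic scan, which is fine for a handful of headings but not for thousands of ranges.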

I did not have time to test whether a similar error occurs if I write to the same cell repeatedly. I hope to follow up with that test in a couple of days.

Attachments: e1, e2, range.xlsx

Issue Analytics

  • State: closed
  • Created 7 years ago
  • Comments: 9 (5 by maintainers)

Top GitHub Comments

1 reaction
jmcnamara commented, Jun 10, 2018

@hamx0r

If the first one is a genuine use case then you should open an issue in Pandas.

For the second one open an XlsxWriter issue with a small working example that shows the issue. Follow the instructions in the ISSUE_TEMPLATE.

1 reaction
kjosib commented, Apr 13, 2017

Modify worksheet.merge_range to do something like the following:

	def merge_range(self, top, left, bottom, right, datum=None, format=None):
		
		# Make sure coordinates are in correct order:
		if top > bottom: top, bottom = bottom, top
		if left > right: left, right = right, left
		
		# Delegate to write_cell if caller tries to merge a single cell:
		if top == bottom and left == right: return self.write_cell(top, left, datum, format) # PLEASE ADD THIS LINE!!!!
		
		# Now the overlap test, which deals in half-open ranges:
		self.__insert_merged_range(top, left, bottom+1, right+1)

		# Then do whatever else merge_range currently does...

Now what does that do? It mainly adds a call to this __insert_merged_range routine:

	def __insert_merged_range(self, T, L, B, R):
		# Don't forget: being half-open ranges, B and R are
		# the first row/column NOT in the box...
		for group in self.__merge_groups:
			if group.intersects(T, L, B, R): raise MergeRangeException()
		for (t,l,b,r) in self.__merge_blocks:
			if overlap(t,b, T,B) and overlap(l,r, L,R): raise MergeRangeException()
		self.__merge_blocks.append((T, L, B, R))
		if len(self.__merge_blocks) >= PERFORMANCE_THRESHOLD:
			self.__merge_groups.append(MergeGroup(self.__merge_blocks))
			self.__merge_blocks = []

To support that, we must also modify the constructor as follows:

	def __init__(self, *args, **kwargs):  # keep whatever arguments it currently takes
		self.__merge_blocks = [] # 4-tuples describing half-open ranges on rows and columns.
		self.__merge_groups = [] # Objects that can speed up the merge range overlap test.
		# Do whatever else it currently does

It’s handy to define an “overlap” function for half-open ranges. The proof of correctness is left as an exercise for the reader.

def overlap(begin1,end1, begin2,end2):
	# Recall, this deals ONLY in half-open ranges. Begin is INCLUSIVE, end is EXCLUSIVE.
	return not (begin1 >= end2 or begin2 >= end1)
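A few concrete cases make the half-open convention easier to check by hand. The sketch below restates the `overlap` function and exercises it; the sample values are illustrative only.

```python
def overlap(begin1, end1, begin2, end2):
    # Half-open ranges: begin is inclusive, end is exclusive.
    return not (begin1 >= end2 or begin2 >= end1)

# Ranges that merely touch do not overlap, because the end is exclusive.
touching = overlap(0, 5, 5, 10)
# One shared position is enough to overlap.
sharing = overlap(0, 6, 5, 10)
# A range nested inside another overlaps it.
nested = overlap(2, 3, 0, 10)
# Disjoint ranges do not overlap, whichever one comes first.
disjoint = overlap(8, 10, 0, 4)

print(touching, sharing, nested, disjoint)
```

The touching case is the one that matters for merge ranges: "L1:W1" and "X1:AI1" share a boundary once converted to half-open form, and must not be reported as a collision.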

I’ll make an arbitrary decision for PERFORMANCE_THRESHOLD:

PERFORMANCE_THRESHOLD = 20

Now we need to figure out how to make a decently efficient MergeGroup object. A naive approach would simply compute a bounding box for the group, and use that bounding box as a way to possibly skip a detailed test of each contained block. In fact the bounding box idea is valuable, but we can do considerably better with the detailed test. Merge ranges will probably exhibit good spatial locality and alignment. We can take advantage of that fact with minimal pre-processing:

import bisect

class MergeGroup:
	def __init__(self, blocks):
		# First figure out the sorted list of interesting rows:
		rows = set()
		for (top,left, bottom,right) in blocks:
			rows.add(top)
			rows.add(bottom)
		self.row_classes = sorted(rows)

At this point, each sorted boundary row marks the top of a row equivalence class: a band of rows that every block in the group either covers entirely or does not touch at all.

		# Fetch out the bounding box. Remove the ending row bound from our list
		# in the process, because it is not helping anything.
		self.top = self.row_classes[0]
		self.bottom = self.row_classes.pop()
		self.left = min(_[1] for _ in blocks)
		self.right = max(_[3] for _ in blocks)

We’re going to develop a simple run-length encoding of columns within the row equivalence classes. By considering merge blocks in left-to-right order by left bound, we get a surprisingly useful guarantee predicated on the knowledge that no two input blocks overlap:

		self.RLEs = [[] for r in self.row_classes]
		for (top,left, bottom,right) in sorted(blocks, key=lambda _:_[1]):
			for RLE in self.__relevant_RLEs(top, bottom):
				# This is where we use the guarantee alluded to above:
				if RLE and RLE[-1] == left:
					# The new block is adjacent to the last block, so extend that run:
					RLE[-1] = right
				else:
					RLE.append(left)
					RLE.append(right)
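To see what one of these run-length encodings looks like, here is the same inner loop pulled out into a standalone function for a single row class. `build_row_RLE` is a hypothetical name used only for this illustration, and the input is assumed to be non-overlapping half-open column spans.

```python
def build_row_RLE(column_spans):
    """Encode non-overlapping half-open (left, right) spans as a flat,
    increasing list of boundaries: [start0, end0, start1, end1, ...]."""
    RLE = []
    # Visiting spans by left bound means a new span can only start at or
    # after the end of the last run, never inside it.
    for left, right in sorted(column_spans):
        if RLE and RLE[-1] == left:
            # Adjacent to the previous run: extend it instead of adding a new one.
            RLE[-1] = right
        else:
            RLE.extend((left, right))
    return RLE

# Spans [0, 3) and [3, 7) fuse into one run; [11, 23) stays separate.
print(build_row_RLE([(11, 23), (0, 3), (3, 7)]))  # [0, 7, 11, 23]
```

Even positions in the list open a run and odd positions close it, which is the property the later bisect-based intersection test relies on.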

Finally, we’d like to coalesce consecutive identical row classes:

		victim = 1
		while victim < len(self.row_classes):
			if self.RLEs[victim] == self.RLEs[victim-1]:
				del self.RLEs[victim]
				del self.row_classes[victim]
			else: victim += 1

The intersection test is now straightforward, although I’m going to defer solving the RLE intersection problem until later:

	def intersects(self, top,left, bottom,right):
		# First check the bounding-box:
		if ( top    >= self.bottom or
			 left   >= self.right  or
			 bottom <= self.top    or
			 right  <= self.left      ): return False
		# Otherwise, check the RLEs:
		return any(RLE_intersect(RLE, left, right) for RLE in self.__relevant_RLEs(top, bottom))

How can we tell which RLE lines are relevant to a given merge range? Like this:

	def __relevant_RLEs(self, top, bottom):
		index = 0 if top < self.top else bisect.bisect_right(self.row_classes, top) - 1
		# NB: the rightward insertion point is one more than the index we want, but don't start before the beginning.
		length = len(self.row_classes)
		while index < length and self.row_classes[index] < bottom:
			yield self.RLEs[index]
			index += 1
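The lookup is easier to follow with numbers. The sketch below is a standalone version of the same logic that yields indices instead of RLE lists; `relevant_indices` is a hypothetical name, and it assumes the group's top row equals `row_classes[0]`, which holds for the constructor shown above.

```python
import bisect

def relevant_indices(row_classes, top, bottom):
    """Yield the indices of the row classes touched by the half-open
    row range [top, bottom). row_classes must be sorted and non-empty."""
    if top < row_classes[0]:
        index = 0
    else:
        # bisect_right gives the insertion point after any equal entry,
        # so subtracting one lands on the class that contains `top`.
        index = bisect.bisect_right(row_classes, top) - 1
    while index < len(row_classes) and row_classes[index] < bottom:
        yield index
        index += 1

# Hypothetical group whose row classes start at rows 2, 5 and 9.
row_classes = [2, 5, 9]
print(list(relevant_indices(row_classes, 6, 10)))  # [1, 2]
print(list(relevant_indices(row_classes, 0, 3)))   # [0]
print(list(relevant_indices(row_classes, 5, 6)))   # [1]
```

A range that ends at or past the group's bottom row can still yield the last class here; in the original design the bounding-box check in `intersects` is what limits such queries.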

Finally, we must solve the RLE_intersect problem. The solution is surprisingly straightforward. Proof of correctness is again left as an exercise to the reader. As a hint, recall that we’re dealing with half-open ranges.

def RLE_intersect(RLE, left, right):
	# You may assume: left < right.
	lo = bisect.bisect_right(RLE, left)
	hi = bisect.bisect_left(RLE, right, lo=lo)
	return (lo < hi) or (lo % 2 == 1)

The remaining open question is what to do if the number of merge groups gets large. I have an idea, but that is a problem for another day.

WARNING: I have not tested most of this code. I have only proved it correct. (I did considerable testing of RLE_intersect, though.)
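For readers who would rather test than prove, one way to gain confidence in RLE_intersect is to compare it against a direct run-by-run check on random inputs. The sketch below is such a harness, not the testing the comment author did; `brute_force` and `random_RLE` are hypothetical helpers, and the generated RLEs are assumed to be strictly increasing boundary lists, which is what the construction above produces.

```python
import bisect
import random

def RLE_intersect(RLE, left, right):
    # Restated from above; assumes left < right and half-open ranges.
    lo = bisect.bisect_right(RLE, left)
    hi = bisect.bisect_left(RLE, right, lo=lo)
    return (lo < hi) or (lo % 2 == 1)

def brute_force(RLE, left, right):
    # Pair the boundaries into half-open runs and test each one directly.
    runs = zip(RLE[0::2], RLE[1::2])
    return any(left < run_end and run_start < right for (run_start, run_end) in runs)

def random_RLE(rng, size=6, span=30):
    # Distinct sorted values: even positions open a run, odd positions close it.
    return sorted(rng.sample(range(span), size))

rng = random.Random(0)  # fixed seed so the run is repeatable
mismatches = 0
for _ in range(2000):
    RLE = random_RLE(rng)
    left = rng.randrange(0, 30)
    right = rng.randrange(left + 1, 32)
    if bool(RLE_intersect(RLE, left, right)) != brute_force(RLE, left, right):
        mismatches += 1

print("mismatches:", mismatches)
```

The two bisect calls handle the two ways a query can hit a run: an odd `lo` means `left` falls inside a run, and `lo < hi` means some run starts strictly between `left` and `right`.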
