Can't get the content in revision mode in table of docx file

[image]

As shown in the image above, I want to get the last non-empty row of data in the table. However, because the last row is in the revision (tracked changes) state, its data cannot be read normally with python-docx.

The Python code is as follows:

# coding: utf-8
import os
from docx import Document


def parse_docx(path):
    """Return the last fully populated row of the first table, or None."""
    table = Document(path).tables[0]
    # Walk the rows bottom-up so the first complete row found is the last one.
    for index in reversed(range(len(table.rows))):
        version = table.cell(index, 0).text
        date = table.cell(index, 1).text
        modify_content = table.cell(index, 2).text
        author = table.cell(index, 3).text
        # A row counts only when all four cells contain text.
        if version and date and modify_content and author:
            return path + '\n' + version + '   ' + date + '   ' + modify_content + '   ' + author + '\n'
    return None


if __name__ == "__main__":
    # Scan the script's own folder for .docx files.
    folder = os.path.dirname(os.path.abspath(__file__))
    for name in os.listdir(folder):
        if os.path.splitext(name)[1] == '.docx':
            try:
                print(parse_docx(os.path.join(folder, name)))
            except Exception as e:
                print(e)

The output after running the script is "V0.0.3 2014-01-22 Modify 2 Test3", which is not the data I expected to get.

Issue Analytics

  • State: closed
  • Created 4 years ago
  • Comments: 8 (3 by maintainers)

Top GitHub Comments

1 reaction
scanny commented, Dec 3, 2019

Something like this should work:

tbl = table._tbl
# ---move each run inside a `w:ins` element up to be a sibling of the `w:ins`---
for r in tbl.xpath(".//w:ins/w:r"):
    r.getparent().addnext(r)
# ---then get rid of all the (now empty) `w:ins` elements---
for ins in tbl.xpath(".//w:ins"):
    ins.getparent().remove(ins)

I expect there are more elegant ways to do this, but this should do the trick.
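
If changing the document tree is not desirable, a read-only alternative is to collect every `w:t` element under each paragraph of the cell, whether it sits directly in a `w:r` or inside a `w:ins` wrapper. Text removed by a tracked deletion is stored in `w:delText` rather than `w:t`, so it is skipped. The sketch below is only an illustration using the standard library on a hand-written cell fragment; `SAMPLE_CELL` and `cell_text` are made-up names, not part of python-docx, and the sample values are invented.

```python
import xml.etree.ElementTree as ET

W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
NS = {"w": W_NS}

# Hypothetical sample: one table cell whose paragraph holds a normal run,
# a tracked insertion (w:ins) and a tracked deletion (w:del).
SAMPLE_CELL = f"""
<w:tc xmlns:w="{W_NS}">
  <w:p>
    <w:r><w:t>V0.0.</w:t></w:r>
    <w:ins w:id="1" w:author="Test4"><w:r><w:t>4</w:t></w:r></w:ins>
    <w:del w:id="2" w:author="Test4"><w:r><w:delText>3</w:delText></w:r></w:del>
  </w:p>
</w:tc>
"""


def cell_text(tc):
    """Join the text of every w:t in the cell, one line per paragraph."""
    lines = []
    for p in tc.iterfind(".//w:p", NS):
        # ".//w:t" reaches runs nested in w:ins; w:delText is not matched.
        lines.append("".join(t.text or "" for t in p.iterfind(".//w:t", NS)))
    return "\n".join(lines)


tc = ET.fromstring(SAMPLE_CELL)
print(cell_text(tc))
```

Looking only at runs that are direct children of the paragraph (`w:p/w:r/w:t`) would miss the inserted "4", which appears to be why the cell text comes back incomplete in the question. With python-docx the same idea could presumably be applied to the element behind a cell (the private `cell._tc` attribute), though that is an assumption and not tested here.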

0 reactions
gagmeng commented, Dec 10, 2019

The issue has been resolved; closing.
