Delete hyperlinks from word document

I want to delete a hyperlink from a .docx file. The XML looks similar to:

<w:p w:rsidP="00893654" w:rsidRDefault="00A04494" w:rsidRPr="00893654" w:rsidR="002B1D7E">
  <w:r w:rsidRPr="00893654">
    <w:t xml:space="preserve">Here is a text section. Hyperlink here: </w:t>
  </w:r>
  <w:hyperlink r:id="rId12" w:history="1" w:tgtFrame="_blank">
    <w:r w:rsidRPr="00893654">
      <w:t xml:space="preserve">This is the Hyperlink.  </w:t>
    </w:r>
  </w:hyperlink>
</w:p>

I can see it in paragraph._element.xml, but that isn’t editable. I’ve been able to add XML tags to runs using the docx.oxml library, but I haven’t found a way to edit existing XML tags. The worst case would be to unzip the Word file, edit it with lxml, and then re-zip it, but I would rather have a more elegant solution. Any ideas?

Issue Analytics

  • State: closed
  • Created 4 years ago
  • Comments: 7 (3 by maintainers)

Top GitHub Comments

3 reactions
scanny commented, Sep 12, 2019

I’d say the job is this:

  • Promote the enclosed w:r element to a peer of w:hyperlink
  • Remove the relationship "rId12" from the document part (or other story part like header).
  • Remove the w:hyperlink element.

You’ll probably want to start by getting references to the various elements, beginning with the w:p container. This is simple enough once you’ve identified the paragraph:

p = paragraph._p

The w:hyperlink is a child, so something like:

hyperlink = p.xpath("./w:hyperlink")[0]

If you don’t care about the enclosed run (the hyperlink “label”) you can delete it with the hyperlink element:

p.remove(hyperlink)

If you want to preserve it, you’ll need to promote it (that is, move it out of the enclosing w:hyperlink element) first:

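label_r = p.xpath("./w:hyperlink/w:r")[0]
# lxml spells this method `addprevious` (no underscore); it moves label_r
# out of the hyperlink and places it immediately before it in the paragraph.
hyperlink.addprevious(label_r)
p.remove(hyperlink)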

The last bit is the relationship. It might be okay to leave it dangling (worth a try), or you may have to remove it:

from docx.oxml.ns import qn  # ---for "qualified name", aka. Clark-notation tagname---
hyperlink_rel_id = hyperlink.get(qn("r:id"))
document_part = document.part
document_part.drop_rel(hyperlink_rel_id)

Of course, all this uses internals, so it comes with some risk of breaking with new releases, but these parts have been stable for some time, so it will probably keep working fine.

0 reactions
yadneshSalvi commented, Aug 16, 2022

This worked well. Note that the lxml method is spelled ‘addprevious’, with no underscore; ‘add_previous’ fails. I’m writing this in a comment because I spent an hour finding out why it was failing.

Also, to replace multiple hyperlinks in a paragraph with their text, you can try the following loop:

p = paragraph._p
# The XPath result is a plain list, so removing elements inside the loop is safe.
for hyperlink in p.xpath("./w:hyperlink"):
    # Move every child out, in order, so labels split across several runs stay whole.
    for child in list(hyperlink):
        hyperlink.addprevious(child)
    p.remove(hyperlink)
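
For anyone who wants to try the promote-then-remove step without a .docx file at hand, here is a minimal sketch on a trimmed copy of the XML from the question. It is an illustration under assumptions, not the python-docx approach itself: it uses the standard library’s xml.etree.ElementTree instead of lxml, so there is no addprevious and the children are inserted by index. The function name unwrap_hyperlinks and the SAMPLE string are made up for this example.

```python
import xml.etree.ElementTree as ET

# Namespace URIs used by WordprocessingML documents.
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
R = "http://schemas.openxmlformats.org/officeDocument/2006/relationships"

# Trimmed version of the paragraph from the question (illustrative only).
SAMPLE = f"""
<w:p xmlns:w="{W}" xmlns:r="{R}">
  <w:r><w:t xml:space="preserve">Here is a text section. Hyperlink here: </w:t></w:r>
  <w:hyperlink r:id="rId12" w:history="1">
    <w:r><w:t xml:space="preserve">This is the Hyperlink.  </w:t></w:r>
  </w:hyperlink>
</w:p>
"""


def unwrap_hyperlinks(p):
    """Replace each direct w:hyperlink child of p with that hyperlink's children.

    Returns the r:id values of the removed hyperlinks so the caller can
    decide what to do with the corresponding relationships.
    """
    removed_ids = []
    # findall() returns a list, so changing p inside the loop is safe.
    for hyperlink in p.findall(f"{{{W}}}hyperlink"):
        index = list(p).index(hyperlink)
        # Insert the children where the hyperlink sits, keeping their order.
        for offset, child in enumerate(list(hyperlink)):
            p.insert(index + offset, child)
        p.remove(hyperlink)
        removed_ids.append(hyperlink.get(f"{{{R}}}id"))
    return removed_ids


p = ET.fromstring(SAMPLE)
rel_ids = unwrap_hyperlinks(p)
text = "".join(t.text for t in p.iter(f"{{{W}}}t"))
print(rel_ids)
print(text)
```

With python-docx, the returned ids are what you would pass to drop_rel on the document part, as in the earlier comment.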