No item named 'ppt/drawings/NULL' in the archive


When I open a particular PPTX file (which opens without error or complaint in multiple versions/platforms of PowerPoint) I get the following exception:

$ python
Python 2.7.10 (v2.7.10:15c95b7d81dc, May 23 2015, 09:33:12)
[GCC 4.2.1 (Apple Inc. build 5666) (dot 3)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> from pptx import Presentation
>>> prs = Presentation("MyPresentation.pptx")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/pptx/api.py", line 26, in __init__
    self._package = Package.open(pkg_file)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/pptx/package.py", line 44, in open
    return super(Package, cls).open(pkg_file)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/pptx/opc/package.py", line 122, in open
    pkg_reader = PackageReader.from_file(pkg_file)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/pptx/opc/pkgreader.py", line 36, in from_file
    phys_reader, pkg_srels, content_types
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/pptx/opc/pkgreader.py", line 69, in _load_serialized_parts
    for partname, blob, srels in part_walker:
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/pptx/opc/pkgreader.py", line 104, in _walk_phys_parts
    phys_reader, part_srels, visited_partnames):
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/pptx/opc/pkgreader.py", line 104, in _walk_phys_parts
    phys_reader, part_srels, visited_partnames):
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/pptx/opc/pkgreader.py", line 104, in _walk_phys_parts
    phys_reader, part_srels, visited_partnames):
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/pptx/opc/pkgreader.py", line 101, in _walk_phys_parts
    blob = phys_reader.blob_for(partname)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/pptx/opc/phys_pkg.py", line 109, in blob_for
    return self._zipf.read(pack_uri.membername)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/zipfile.py", line 935, in read
    return self.open(name, "r", pwd).read()
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/zipfile.py", line 961, in open
    zinfo = self.getinfo(name)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/zipfile.py", line 909, in getinfo
    'There is no item named %r in the archive' % name)
KeyError: "There is no item named 'ppt/drawings/NULL' in the archive"

I’m using Pillow 3.1.1, XlsxWriter 0.8.4, lxml 3.6.0, and python-pptx 0.5.8 on Python 2.7.10 on OS X. Thoughts on this error? Thanks! -Josh
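
One way to confirm where the `NULL` comes from is to scan the package's relationship files without loading the presentation. The following is only a diagnostic sketch using the standard library; the function name `find_null_targets` is hypothetical and not part of python-pptx, and it assumes the offending relationship has the literal target `NULL`, as the error message suggests.

```python
import zipfile
import xml.etree.ElementTree as ET

# Namespace used by OPC relationship (.rels) parts.
RELS_NS = "{http://schemas.openxmlformats.org/package/2006/relationships}"


def find_null_targets(pptx_path):
    """Return (rels_file, rId, rel_type) for every relationship whose Target is 'NULL'."""
    hits = []
    with zipfile.ZipFile(pptx_path) as zf:
        for name in zf.namelist():
            if not name.endswith(".rels"):
                continue
            root = ET.fromstring(zf.read(name))
            for rel in root.iter(RELS_NS + "Relationship"):
                if rel.get("Target") == "NULL":
                    hits.append((name, rel.get("Id"), rel.get("Type")))
    return hits
```

Calling `find_null_targets("MyPresentation.pptx")` returns a list of tuples naming the `.rels` file and the relationship Id, which shows which part still refers to the placeholder.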

Issue Analytics

  • State: closed
  • Created: 7 years ago
  • Reactions: 1
  • Comments: 11 (4 by maintainers)

Top GitHub Comments

victai commented, Jun 4, 2021 (3 reactions)

I wrote a script that goes through all .rels files, finds items with Target="NULL", and removes those items from the .rels file and from the corresponding .xml file. For example, if there is an item with Target="NULL" in ppt/slides/_rels/slide1.xml.rels, I get the Id of that item, then remove the object with the same Id from ppt/slides/slide1.xml.

I’m not sure if it covers all possibilities, and I have not tested extensively, but hope this helps.

import os
import zipfile
import tempfile
from lxml import etree

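def remove_NULL_from_rels(data):
    # Drop every relationship whose Target is "NULL"; return the new XML and the removed Ids.
    ET = etree.fromstring(data)
    rels_to_remove = []
    for node in list(ET.iter()):  # snapshot so removals cannot disturb the iteration
        if node.attrib.get('Target') == 'NULL':
            rels_to_remove.append(node.attrib['Id'])
            node.getparent().remove(node)
    return etree.tostring(ET, xml_declaration=True), rels_to_remove

def remove_rels_from_slide(data, rels):
    # Remove every element that references one of the removed relationship Ids.
    ET = etree.fromstring(data)
    for node in list(ET.iter()):
        parent = node.getparent()
        # Remove each node at most once, and never try to remove the root (no parent).
        if parent is not None and any(rId in node.attrib.values() for rId in rels):
            parent.remove(node)
    return etree.tostring(ET, xml_declaration=True)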

def fixCorruptedPPTX(zipname):
    # generate a temp file
    tmpfd, tmpname = tempfile.mkstemp(dir=os.path.dirname(zipname))
    os.close(tmpfd)

    # create a temp copy of the archive without filename            
    with zipfile.ZipFile(zipname, 'r') as zin:
        with zipfile.ZipFile(tmpname, 'w') as zout:
            zout.comment = zin.comment # preserve the comment
            rels_xml_mapping = {}
            modified_xml = {}
            modified_rels = {}
            for item in zin.infolist():
                if item.filename.endswith('.rels'):
                    rels_file = item.filename
                    rels_dir, rels_name = os.path.split(rels_file)
                    rels_data = zin.read(rels_file)
                    rels_data, rels_to_remove = remove_NULL_from_rels(rels_data)
                    # zip member names always use '/', so build the part name without os.path.join
                    xml_file = (os.path.split(rels_dir)[0] + '/' + rels_name.rsplit('.', 1)[0]).lstrip('/')
                    if len(rels_to_remove) > 0 and xml_file in zin.namelist():
                        xml_data = zin.read(xml_file)
                        xml_data = remove_rels_from_slide(xml_data, rels_to_remove)
                        modified_xml[xml_file] = xml_data
                        modified_rels[rels_file] = rels_data
                
            for item in zin.infolist():
                if item.filename not in modified_xml and item.filename not in modified_rels:
                    zout.writestr(item, zin.read(item.filename))
                    
    # replace with the temp archive
    os.remove(zipname)
    os.rename(tmpname, zipname)

    # now add filename with its new data
    with zipfile.ZipFile(zipname, mode='a', compression=zipfile.ZIP_DEFLATED) as zf:
        for rels_file, rels_data in modified_rels.items():
            zf.writestr(rels_file, rels_data)
        for xml_file, xml_data in modified_xml.items():
            zf.writestr(xml_file, xml_data)

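pptx_file = "MyPresentation.pptx"  # path to the affected file; it is rewritten in place, so keep a backup
fixCorruptedPPTX(pptx_file)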

scanny commented, Sep 13, 2021 (2 reactions)

This should be fixed in the upcoming release 0.6.20 due out today or tomorrow. Any relationship “pointing” to a package part that cannot be found (e.g. “NULL”) is now ignored. If the file is truly corrupted and a part is in fact missing, this isn’t going to help, but the common-ish case where a neglectful client has “removed” relationships simply by marking them “NULL” no longer raises an error on load.
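
To tell a `NULL` placeholder apart from a genuinely missing part, it may help to list every internal relationship whose target does not resolve to a member of the archive. This is a rough sketch, not how python-pptx implements the fix: `find_dangling_relationships` is a hypothetical helper, and it assumes targets are plain relative or absolute part names (no URL-encoding or fragments).

```python
import posixpath
import zipfile
import xml.etree.ElementTree as ET

# Namespace used by OPC relationship (.rels) parts.
RELS_NS = "{http://schemas.openxmlformats.org/package/2006/relationships}"


def find_dangling_relationships(pptx_path):
    """Return (rels_file, rId, resolved_target) for internal targets absent from the archive."""
    dangling = []
    with zipfile.ZipFile(pptx_path) as zf:
        members = set(zf.namelist())
        for name in sorted(members):
            if not name.endswith(".rels"):
                continue
            # "ppt/slides/_rels/slide1.xml.rels" describes a part in "ppt/slides".
            base_dir = posixpath.dirname(posixpath.dirname(name))
            root = ET.fromstring(zf.read(name))
            for rel in root.iter(RELS_NS + "Relationship"):
                if rel.get("TargetMode") == "External":
                    continue  # hyperlinks and similar targets live outside the package
                target = rel.get("Target", "")
                if target.startswith("/"):
                    resolved = target.lstrip("/")
                else:
                    resolved = posixpath.normpath(posixpath.join(base_dir, target))
                if resolved not in members:
                    dangling.append((name, rel.get("Id"), resolved))
    return dangling
```

A relationship with Target="NULL" in a part under ppt/drawings resolves to ppt/drawings/NULL, the same member name that appears in the KeyError. Any other entry in the result points at a part that is actually absent, which is the case the fix does not help with.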
