One illegal uri in Relationship will destroy the document parsing

Description

A PPTX document that includes an illegal URI in a Relationship makes System.IO.Packaging.InternalRelationshipCollection.ProcessRelationshipAttributes throw an exception, which propagates up to OpenXmlPart.Load.

OpenXmlPart.Load does not catch the exception, so PresentationDocument.Open fails.

Information

  • .NET Target: All
  • DocumentFormat.OpenXml Version: 2.10.1

Repro

var document = PresentationDocument.Open("hyperlink.pptx", isEditable: false, openSettings);

Here is the hyperlink.pptx file: https://1drv.ms/p/s!AiKjiQqRWKThlv5zkY4HoRvvJ3Ppdg?e=3kfdNU

Observed

PresentationDocument.Open throws a UriFormatException:

System.UriFormatException: 'Invalid URI: The hostname could not be parsed.'

This happens because ppt\slides\_rels\slide1.xml.rels contains the following element:

<Relationship Id="rId1" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/hyperlink" Target="mailto:!@#$%^&amp;*()_+}{:”?&gt;&lt;,./;’[]=-098766554321" TargetMode="External"/>

As you can see, the Target is not a valid URI.
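
To check a relationship target before handing a package to the SDK, something like the following sketch may help. RelationshipTargetCheck and IsParsableTarget are hypothetical names, not part of the Open XML SDK or System.IO.Packaging, and the helper only asks whether System.Uri can parse the string at all, which is an approximation of what the packaging layer does internally.

```csharp
using System;

public static class RelationshipTargetCheck
{
    // Hypothetical helper (not part of the Open XML SDK): returns true when
    // System.Uri can parse the string as an absolute or a relative URI.
    // Uri.TryCreate reports failure through its return value instead of
    // throwing UriFormatException.
    public static bool IsParsableTarget(string target)
    {
        return Uri.TryCreate(target, UriKind.RelativeOrAbsolute, out _);
    }
}
```

Calling RelationshipTargetCheck.IsParsableTarget with the XML-decoded Target value from the element above (mailto:!@#$%^&*()_+}{:”?><,./;’[]=-098766554321) is the case this issue is about: the issue reports that parsing this value fails with "The hostname could not be parsed".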

Expected

We could design an exception-handling API so that callers can handle some illegal documents.

See #38 #274 #297 #298

#298 only adds more information to the exception; it does not tolerate the error.

As @twsouthwick says, this cannot be fixed in the Open XML SDK project itself (https://github.com/OfficeDev/Open-XML-SDK/issues/297#issuecomment-345560540), but I think we can tolerate some errors.

Issue Analytics

  • State: closed
  • Created 3 years ago
  • Reactions: 2
  • Comments: 29 (21 by maintainers)

Top GitHub Comments

abelykh0 commented, Aug 17, 2020 (3 reactions)
// Namespaces used by the two methods below (place at the top of the file).
using System;
using System.IO;
using System.IO.Compression;
using System.Linq;
using System.Xml;
using System.Xml.Linq;
using DocumentFormat.OpenXml.Packaging;

private void OpenDocument()
{
    // TODO remove this workaround once the issue is fixed
    // https://github.com/OfficeDev/Open-XML-SDK/issues/715

    SpreadsheetDocument result = null;
    try
    {
        result = SpreadsheetDocument.Open(this._fileStream, false);
    }
    catch (OpenXmlPackageException e)
    {
        if (e.ToString().Contains("Invalid Hyperlink", StringComparison.Ordinal))
        {
            var stream = new MemoryStream();
            try
            {
                this._fileStream.Position = 0;
                this._fileStream.CopyTo(stream);
                this._fileStream.Dispose();
                this._fileStream = stream;
                // Replace every unparsable external target with a placeholder URI.
                this.FixInvalidUri(this._fileStream, brokenUri => new Uri("http://broken-link/"));
            }
            catch
            {
                stream.Dispose();
                throw;
            }

            result = SpreadsheetDocument.Open(this._fileStream, false);
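        }
        else
        {
            // Not the invalid-hyperlink case: rethrow instead of silently
            // leaving the document null.
            throw;
        }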
    }

    this._spreadsheetDocument = result;
}

private void FixInvalidUri(Stream fs, Func<string, Uri> invalidUriHandler)
{
    XNamespace relNs = "http://schemas.openxmlformats.org/package/2006/relationships";
    using (ZipArchive za = new ZipArchive(fs, ZipArchiveMode.Update, true))
    {
        foreach (var entry in za.Entries.ToList())
        {
            if (!entry.Name.EndsWith(".rels", StringComparison.Ordinal))
            {
                continue;
            }

            bool replaceEntry = false;
            XDocument entryXDoc = null;
            using (var entryStream = entry.Open())
            {
                try
                {
                    entryXDoc = XDocument.Load(entryStream);
                    if (entryXDoc.Root != null && entryXDoc.Root.Name.Namespace == relNs)
                    {
                        var urisToCheck = entryXDoc
                            .Descendants(relNs + "Relationship")
                            .Where(r => r.Attribute("TargetMode") != null && (string)r.Attribute("TargetMode") == "External");
                        foreach (var rel in urisToCheck)
                        {
                            var target = (string)rel.Attribute("Target");
                            if (target != null)
                            {
                                try
                                {
                                    Uri uri = new Uri(target);
                                }
                                catch (UriFormatException)
                                {
                                    Uri newUri = invalidUriHandler(target);
                                    rel.Attribute("Target").Value = newUri.ToString();
                                    replaceEntry = true;
                                }
                            }
                        }
                    }
                }
                catch (XmlException)
                {
                    continue;
                }
            }

            if (replaceEntry)
            {
                var fullName = entry.FullName;
                entry.Delete();
                var newEntry = za.CreateEntry(fullName);
                using (StreamWriter writer = new StreamWriter(newEntry.Open()))
                using (XmlWriter xmlWriter = XmlWriter.Create(writer))
                {
                    entryXDoc.WriteTo(xmlWriter);
                }
            }
        }
    }
}
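
The handler above maps every broken target to the same fixed URI, so the original value is lost. One possible variation, sketched below, keeps the original string as an escaped query parameter so it can be recovered later. BrokenLinkPlaceholder, Wrap, and Unwrap are hypothetical names introduced here for illustration; only System.Uri members from the base class library are used.

```csharp
using System;

public static class BrokenLinkPlaceholder
{
    // Assumed placeholder prefix; it extends the "http://broken-link/" URI
    // used in the workaround above with a query parameter.
    private const string Prefix = "http://broken-link/?original=";

    // Builds a well-formed placeholder URI that carries the unparsable
    // target as percent-encoded data.
    public static Uri Wrap(string brokenTarget)
    {
        return new Uri(Prefix + Uri.EscapeDataString(brokenTarget));
    }

    // Recovers the original target from a placeholder created by Wrap,
    // or returns null when the URI is not such a placeholder.
    public static string Unwrap(Uri placeholder)
    {
        string absolute = placeholder.AbsoluteUri;
        if (!absolute.StartsWith(Prefix, StringComparison.Ordinal))
        {
            return null;
        }

        return Uri.UnescapeDataString(absolute.Substring(Prefix.Length));
    }
}
```

Wrap has the shape of a Func<string, Uri>, so it could be passed as the invalidUriHandler argument of FixInvalidUri. Note that FixInvalidUri writes newUri.ToString() into the .rels file, and ToString() can differ from AbsoluteUri in how characters are escaped, so the round trip through a saved file is an assumption that should be verified before relying on it.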

twsouthwick commented, Jan 10, 2023 (2 reactions)

See #1295 for the package abstraction being proposed
