add method to read text boxes


Feature request:

Document#getTextBoxes()

it must:

  • Get all text from text boxes in the document

it should:

  • Return the text boxes in the same order as they appear in the document

it could:

  • Somehow try to reproduce the document text order by combining the content from getBody() with the text box content (maybe as a separate Document#getBody({includeTextBoxes: true})). I imagine this is insanely difficult, but just putting it out there 😃
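
As a rough illustration of the request, here is a self-contained sketch. `SketchDocument`, its constructor arguments, and the `anchor` field are assumptions invented for this example and say nothing about the library's real internals; only the method names `getBody()` and `getTextBoxes()` and the `includeTextBoxes` option come from the request above.

```javascript
// Hypothetical stand-in for the library's Document class.
// Paragraphs and text boxes each carry an `anchor`: an assumed numeric
// position in the document's text stream.
class SketchDocument {
  constructor(paragraphs, textBoxes) {
    this.paragraphs = paragraphs;
    this.textBoxes = textBoxes;
  }

  // "must": return all text box content; "should": in document order.
  getTextBoxes() {
    return [...this.textBoxes]
      .sort((a, b) => a.anchor - b.anchor)
      .map((box) => box.text)
      .join("\n");
  }

  // "could": optionally interleave text boxes with the body text.
  getBody({ includeTextBoxes = false } = {}) {
    const parts = includeTextBoxes
      ? [...this.paragraphs, ...this.textBoxes]
      : [...this.paragraphs];
    return parts
      .sort((a, b) => a.anchor - b.anchor)
      .map((part) => part.text)
      .join("\n");
  }
}

// Made-up sample data, with the text boxes deliberately stored out of order.
const doc = new SketchDocument(
  [
    { anchor: 0, text: "First paragraph" },
    { anchor: 40, text: "Second paragraph" },
  ],
  [
    { anchor: 30, text: "Box B" },
    { anchor: 10, text: "Box A" },
  ]
);

console.log(doc.getTextBoxes());
console.log(doc.getBody({ includeTextBoxes: true }));
```

In this sketch the interleaving is easy only because every piece of text is assumed to have a comparable anchor; whether the real file format makes that so simple is exactly the open question in the request.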

Issue Analytics

  • State: closed
  • Created 2 years ago
  • Comments: 7 (4 by maintainers)

Top GitHub Comments

1 reaction
thegoatherder commented, Jun 16, 2021

@morungos my apologies for the delay on this. Please see this test document, which I hope will be useful. Note that I intentionally created the 3rd text box after creating the 4th text box (but positioned it above the 4th visually).

The expected test output (if we remove all blank lines from the output) would be:

This is in the header
This is the first paragraph
This is the first textbox
This is the second textbox
This is the second paragraph
This is the third paragraph
This is the third text box
This is the fourth text box
This is the last paragraph
This is in the footer
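
One way to check extractor output against this list is to drop blank lines before comparing. The sketch below is only an illustration: `stripBlankLines` is a hypothetical helper, and `rawOutput` is made-up sample text standing in for whatever the extractor actually returns.

```javascript
// Hypothetical helper: remove empty and whitespace-only lines.
function stripBlankLines(text) {
  return text
    .split(/\r?\n/)
    .filter((line) => line.trim() !== "")
    .join("\n");
}

// The expected lines, copied from the list above.
const expectedLines = [
  "This is in the header",
  "This is the first paragraph",
  "This is the first textbox",
  "This is the second textbox",
  "This is the second paragraph",
  "This is the third paragraph",
  "This is the third text box",
  "This is the fourth text box",
  "This is the last paragraph",
  "This is in the footer",
];

// Made-up stand-in for real output: the same lines separated by blank lines.
const rawOutput = expectedLines.join("\n\n") + "\n";

const matches = stripBlankLines(rawOutput) === expectedLines.join("\n");
console.log(matches ? "order matches" : "order differs");
```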

0 reactions
morungos commented, Jun 6, 2021

Hi @thegoatherder

Many many thanks for this feedback. There might be a challenge here, so here’s what’s happening inside. Word has two things going on:

  1. Character positions, or CPs, which are essentially the linear character position of the text box “anchor”, where 1 is the first character, and so on. Almost all Word data is sorted by character position, and I believe (and did test) that text boxes are extracted in order of character position. So I’d be surprised if they came out in the wrong order, but it’s possible.
  2. Visual positions, which are attached using some weird metadata, and more or less say where the X and Y coordinates of the text box are supposed to be. These matter when a text box has the “same page” type display, rather than a strictly in-line display.

The problem is that visual positions can differ from character positions. (There is some option somewhere which allows you to see where the anchor is when you click on a text box.) It’s very easy to simply drag one text box above another even if the character position order is the other way around.
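
To make the mismatch concrete, here is a small sketch. The record shape is an assumption for illustration only: `cp` stands for the anchor's character position and `y` for the vertical offset on the page (smaller means higher up); neither name is taken from the library.

```javascript
// Hypothetical text box records. Box B is anchored later in the text
// (higher cp) but has been dragged above box A on the page (lower y).
const boxes = [
  { name: "A", cp: 10, y: 500 },
  { name: "B", cp: 20, y: 100 },
];

// Order by anchor character position.
const byCharacterPosition = [...boxes]
  .sort((a, b) => a.cp - b.cp)
  .map((box) => box.name);

// Order by where the boxes sit on the page.
const byVisualPosition = [...boxes]
  .sort((a, b) => a.y - b.y)
  .map((box) => box.name);

console.log(byCharacterPosition.join(", ")); // A, B
console.log(byVisualPosition.join(", ")); // B, A
```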

This is another of those issues where, if it is visual positions you want to order by, things get tricky, because page breaks can get in the way. If there’s a page break between (say) boxes A and B, then A appears before B because they are on different pages, but if they’re on the same page, B appears above A. I remember this hell from trying to typeset long documents with multiple images per page. Adding code to sort the boxes might not be reliable. However, adding code to expose the coordinates of the boxes would be much simpler and easier to do, if that works for you.
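
If the coordinates were exposed, a caller could do the ordering themselves. The sketch below assumes each box comes with `page`, `x`, and `y` fields; those names, and the idea that a page number is available at all, are assumptions for illustration, since working out the page is the unreliable part described above.

```javascript
// Hypothetical caller-side sort: by page first, then top to bottom,
// then left to right.
function sortVisually(boxes) {
  return [...boxes].sort(
    (a, b) => a.page - b.page || a.y - b.y || a.x - b.x
  );
}

function names(boxes) {
  return boxes.map((box) => box.name).join(", ");
}

// The same two boxes, once with a page break between them...
const withPageBreak = [
  { name: "B", page: 2, x: 72, y: 100 },
  { name: "A", page: 1, x: 72, y: 500 },
];

// ...and once on the same page, where B sits above A.
const onSamePage = [
  { name: "A", page: 1, x: 72, y: 500 },
  { name: "B", page: 1, x: 72, y: 100 },
];

console.log(names(sortVisually(withPageBreak))); // A, B
console.log(names(sortVisually(onSamePage))); // B, A
```

The two inputs differ only in the assumed page number, yet the resulting order flips, which is why a built-in visual sort could give surprising results.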

I mean, I might be wrong and you found a case where the code handles character positions incorrectly, but I suspect you might be after text boxes ordered by visual position. If so, I’d very much like a test file I can use to make sure I get it right. In fact, if you can send me a test file which shows me what you need, then I’ll be happy to add it and make it work, especially if I can add it to the open repository as a test file.

Let me know either way and I’ll see what I can do.
