
Extracting coordinates instead of drawing a box

See original GitHub issue

EDIT: Nevermind, got it to work!

Hey, first off, great tutorial, thank you so much.

I got it to run on Ubuntu 16.04 as well with ease, but I have a problem. I'm running on a CLI Ubuntu server, so instead of using an image as output, I'd just like to have the coordinates of the boxes.

I looked into Object_detection_image.py and found where the boxes are being drawn, but it uses a function named visualize_boxes_and_labels_on_image_array to draw them. If I try to output np.squeeze(boxes), it returns this:

[[0.5897823  0.35585764 0.87036747 0.5124078 ]
 [0.6508235  0.13419046 0.85757935 0.2114587 ]
 [0.64070517 0.14992228 0.8580698  0.23488007]
 ...
 [0.         0.         0.         0.        ]
 [0.         0.         0.         0.        ]
 [0.         0.         0.         0.        ]]

Is there a way to just get the coordinates from that?

Thank you for your time!

EDIT: Okay, I added a new function to visualization_utils.py that returns the "ymin, ymax, xmin, xmax" variables used in the other functions of that file to draw the boxes. The problem is, they look like this: [[0.5897822976112366, 0.8703674674034119, 0.35585764050483704, 0.5124077796936035], [0.6508234739303589, 0.8575793504714966, 0.13419045507907867, 0.2114586979150772]]. I was expecting pixel coordinates; these seem like percentages.
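For context, those values are not percentages but normalized coordinates in the [0, 1] range; in the raw output of np.squeeze(boxes) shown above, each row is [ymin, xmin, ymax, xmax] expressed as fractions of the image height and width. A minimal sketch of converting one row to pixel values, where the helper name normalized_box_to_pixels is made up for illustration and image is assumed to be the NumPy array fed to the detector:

import numpy as np

def normalized_box_to_pixels(box, image):
    # box is [ymin, xmin, ymax, xmax] with values in [0, 1]; image is H x W x C.
    height, width = image.shape[:2]
    ymin, xmin, ymax, xmax = box
    return int(ymin * height), int(xmin * width), int(ymax * height), int(xmax * width)

# e.g. the first detection:
# ymin, xmin, ymax, xmax = normalized_box_to_pixels(np.squeeze(boxes)[0], image)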

EDIT: Okay, I got it to work.

Issue Analytics

  • State: open
  • Created: 5 years ago
  • Reactions: 2
  • Comments: 126

Top GitHub Comments

40 reactions
psi43 commented, Aug 6, 2018

Add this to utils/visualization_utils.py:

def return_coordinates(
    image,
    boxes,
    classes,
    scores,
    category_index,
    instance_masks=None,
    instance_boundaries=None,
    keypoints=None,
    use_normalized_coordinates=False,
    max_boxes_to_draw=20,
    min_score_thresh=.5,
    agnostic_mode=False,
    line_thickness=4,
    groundtruth_box_visualization_color='black',
    skip_scores=False,
    skip_labels=False):
  # Create a display string (and color) for every box location, group any boxes
  # that correspond to the same location.
  box_to_display_str_map = collections.defaultdict(list)
  box_to_color_map = collections.defaultdict(str)
  box_to_instance_masks_map = {}
  box_to_instance_boundaries_map = {}
  box_to_score_map = {}
  box_to_keypoints_map = collections.defaultdict(list)
  if not max_boxes_to_draw:
    max_boxes_to_draw = boxes.shape[0]
  for i in range(min(max_boxes_to_draw, boxes.shape[0])):
    if scores is None or scores[i] > min_score_thresh:
      box = tuple(boxes[i].tolist())
      if instance_masks is not None:
        box_to_instance_masks_map[box] = instance_masks[i]
      if instance_boundaries is not None:
        box_to_instance_boundaries_map[box] = instance_boundaries[i]
      if keypoints is not None:
        box_to_keypoints_map[box].extend(keypoints[i])
      if scores is None:
        box_to_color_map[box] = groundtruth_box_visualization_color
      else:
        display_str = ''
        if not skip_labels:
          if not agnostic_mode:
            if classes[i] in category_index.keys():
              class_name = category_index[classes[i]]['name']
            else:
              class_name = 'N/A'
            display_str = str(class_name)
        if not skip_scores:
          if not display_str:
            display_str = '{}%'.format(int(100*scores[i]))
          else:
            display_str = '{}: {}%'.format(display_str, int(100*scores[i]))
        box_to_display_str_map[box].append(display_str)
        box_to_score_map[box] = scores[i]
        if agnostic_mode:
          box_to_color_map[box] = 'DarkOrange'
        else:
          box_to_color_map[box] = STANDARD_COLORS[
              classes[i] % len(STANDARD_COLORS)]

  # Convert each kept box from normalized coordinates to pixel coordinates.
  coordinates_list = []
  for box, color in box_to_color_map.items():
    ymin, xmin, ymax, xmax = box
    height, width, channels = image.shape
    ymin = int(ymin*height)
    ymax = int(ymax*height)
    xmin = int(xmin*width)
    xmax = int(xmax*width)
    # Cast the score to a plain float so the result can be serialized with json.dumps.
    coordinates_list.append([ymin, ymax, xmin, xmax, float(box_to_score_map[box]*100)])

  return coordinates_list

Add this to Object_detection_dir.py:

coordinates = vis_util.return_coordinates(
    image,
    np.squeeze(boxes),
    np.squeeze(classes).astype(np.int32),
    np.squeeze(scores),
    category_index,
    use_normalized_coordinates=True,
    line_thickness=8,
    min_score_thresh=0.80)

as well as this:

# json must be imported at the top of the script for json.dumps.
textfile = open("json/" + filename_string + ".json", "a")
textfile.write(json.dumps(coordinates))
textfile.write("\n")

I think this should be all.
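For reference, here is a small sketch of consuming the list that return_coordinates gives back; it assumes the coordinates and filename_string variables from the snippets above are in scope:

import json

# Each row is [ymin, ymax, xmin, xmax, score_percent] in pixel coordinates.
for ymin, ymax, xmin, xmax, score in coordinates:
    print("detection at y=[{}..{}], x=[{}..{}], score {:.1f}%".format(
        ymin, ymax, xmin, xmax, score))

# Same append-mode JSON dump as above, but a context manager closes the file.
with open("json/" + filename_string + ".json", "a") as textfile:
    textfile.write(json.dumps(coordinates) + "\n")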

4 reactions
PraveenNellihela commented, Sep 11, 2018

This was very helpful, thank you so much. If anyone needs to access each coordinate separately, change the third-to-last line of the code newly added to utils/visualization_utils.py, which is

coordinates_list.append([ymin, ymax, xmin, xmax, (box_to_score_map[box]*100)])

into coordinates_list=[ymin, ymax, xmin, xmax, (box_to_score_map[box]*100)]

and you can access the ymin, ymax, xmin, xmax values separately using ymin = coordinates[0] etc. in your object detection file.
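Note that swapping the append for an assignment keeps only the last detection; if you need every box, you can leave the append in place and unpack each row in the detection script instead. A small sketch, assuming coordinates holds the unmodified return value of return_coordinates:

for ymin, ymax, xmin, xmax, score in coordinates:
    # ymin/ymax/xmin/xmax are pixel values, score is a percentage.
    box_width = xmax - xmin
    box_height = ymax - ymin
    print("box {}x{} px at ({}, {}), score {:.1f}%".format(box_width, box_height, xmin, ymin, score))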

