question about input data format / using different pose extractors
Hi, I have just started looking into geometric learning and as a first try I want to get the network running in my environment. My issue is that I am not using the joints from OpenPose, so my input is formatted in a different way. I am specifically talking about `N, C, T, V, M = x.size()` from `forward()` and `extract_feature()`. Going by the paper "Spatial Temporal Graph Convolutional Networks for Skeleton-Based Action Recognition", I am guessing that N is the number of joints, C is the number of channels of the feature (2 for 2D joint positions), and T is time, as in the number of frames that are processed. For V and M I am at a loss, and now I'm stuck because I can't convert my own pose coordinates into the proper format; I would appreciate any help. I tried installing OpenPose just to explore the data format more, but after endless conflicts caused by Anaconda/CUDA version mismatches I gave up.
tl;dr - what are `N, C, T, V, M = x.size()` for the pose data?
Issue Analytics
- Created 3 years ago
- Reactions: 1
- Comments: 5 (2 by maintainers)
Hi, I just noticed my error with N as well - thanks for reiterating. Unfortunately I can't use the DataLoader because of my RL setup, so right now I'm trying to figure out how the normalization was done (my working theory: subtract half the width/height and then divide by width/height, since the values are distributed between -0.5 and 0.5). As for the score, I was just setting the visible joints to confidence 1, but maybe your idea is better. In case anyone is working on a similar issue and is interested:
`x.size() = [64, 3, 300, 18, 2]`

minibatch size = 64; channels (x, y, score) = 3; T = 300, i.e. twice the temporal window size of 150 as per the config; V = 18 joints for Kinetics as per the paper. For the last dimension I'm not quite sure; the only thing I can say is that the second "person" seems to be missing quite often during training. This is an output of `print(x[0, :, 150, :, 0])`, i.e. one middle frame of the first sample in the minibatch for the first person.

Good luck with your application and thanks for your help!
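The working theory above (subtract half the width/height, then divide by width/height, so values land between -0.5 and 0.5) can be sketched as follows. This is only a guess at the dataset's normalization, not confirmed against the original preprocessing code; `normalize_pose` is a hypothetical helper name:

```python
import numpy as np

def normalize_pose(coords, width, height):
    """Centre pixel coordinates and scale them to roughly [-0.5, 0.5].

    coords: array of shape (..., 2) holding (x, y) in pixels.
    (x - width/2) / width simplifies to x / width - 0.5, and likewise for y.
    """
    out = coords.astype(np.float32).copy()
    out[..., 0] = out[..., 0] / width - 0.5
    out[..., 1] = out[..., 1] / height - 0.5
    return out

# A joint at the image centre maps to (0, 0); image corners map to +/-0.5.
centre = normalize_pose(np.array([[320.0, 240.0]]), width=640, height=480)
corner = normalize_pose(np.array([[0.0, 0.0]]), width=640, height=480)
```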
I meant the dimensions are in the order given:
(id_within_minibatch, channels, frame_num_aka_time, keypoints, person_id)
So:
- N = id_within_minibatch (hint: use a DataLoader to make minibatches in the 1st dimension)
- C = channels: (x, y, score) OR (x, y) – has to match num_channels
- T = frame_num_aka_time
- V = keypoint/joint (probably stands for vertex)
- M = person ID (for when there are multiple people within a frame, I would suppose)
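For anyone converting the output of a different pose extractor, assembling the dimensions above might look like this. A minimal NumPy sketch, assuming your extractor gives you one `(num_people, num_joints, 3)` array of (x, y, score) per frame; the variable names are illustrative, not from the repo:

```python
import numpy as np

# Hypothetical per-frame keypoint arrays from your own pose extractor,
# each of shape (num_people, num_joints, channels) = (M, V, C).
T, V, M, C = 300, 18, 2, 3
frames = [np.random.rand(M, V, C).astype(np.float32) for _ in range(T)]

# Stack frames into (T, M, V, C), then rearrange axes into the
# (C, T, V, M) layout forward() expects, and prepend N for the batch.
x = np.stack(frames)         # (T, M, V, C)
x = x.transpose(3, 0, 2, 1)  # (C, T, V, M)
x = x[np.newaxis]            # (N, C, T, V, M) with N = 1
```

From here, `torch.from_numpy(x)` would give a tensor satisfying `N, C, T, V, M = x.size()`; if only one person is tracked, pad the M axis with zeros.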
By the way, I have been passing just (x, y) without score, since I'm working with images + OpenPose and I think the score might be rather dependent upon camera setup/resolution, so I would prefer to sacrifice in-domain accuracy for generalisation. It's up to you whether you include the score or not.
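Dropping the score is just a slice along the channel axis; the only caveat is that the channel count must match the model's configured number of input channels. A small sketch (the tensor here is a placeholder, and `in_channels` refers generically to whatever the yaml config calls that setting):

```python
import numpy as np

# Placeholder tensor in (N, C, T, V, M) layout with C = 3, i.e. (x, y, score).
x = np.zeros((1, 3, 300, 18, 2), dtype=np.float32)

# Keep only the first two channels (x, y) for a model configured
# with in_channels = 2; C shrinks from 3 to 2, all other axes unchanged.
x_xy = x[:, :2]
```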
I would paste my code, but it isn't working at the moment.
Here’s one of the yaml files used:
https://github.com/open-mmlab/mmskeleton/blob/master/configs/recognition/st_gcn/kinetics-skeleton-from-openpose.yaml