
question about input data format / using different pose extractors


Hi, I have just started looking into geometric learning, and as a first step I want to get the network running in my environment. My issue is that I am not using the joints from OpenPose, so my input is formatted in a different way. I am specifically talking about N, C, T, V, M = x.size() from forward() and extract_feature(). Going from the paper “Spatial Temporal Graph Convolutional Networks for Skeleton-Based Action Recognition”, I am guessing that N is the number of joints, C is the number of channels of the feature (2 for 2D joint positions), and T is time, as in the number of frames being processed. For V and M I am at a loss, and now I’m stuck because I can’t convert my own pose coordinates into the proper format; I would appreciate any help. I tried installing OpenPose just to explore the data format further, but after endless conflicts caused by Anaconda/CUDA mismatches I gave up.

tl;dr - what are N, C, T, V, M = x.size() of the pose data?

Issue Analytics

  • State: open
  • Created: 3 years ago
  • Reactions: 1
  • Comments: 5 (2 by maintainers)

Top GitHub Comments

1 reaction
ronjamoller commented, Jul 8, 2020

Hi, I just noticed my error with N as well - thanks for reiterating. I can’t use the DataLoader, unfortunately, because of my RL setup, so right now I’m trying to figure out how the normalization was done (my working theory: subtract half the width/height and then divide by the width/height, since the values are distributed between -0.5 and 0.5). As for the score, I was just setting the visible joints to confidence 1, but maybe your idea is better.

In case anyone is working on a similar issue and is interested: x.size() = [64, 3, 300, 18, 2], where 64 is the minibatch size, 3 are the channels (x, y, score), T is 2 times the temporal window size, which is 150 as per the config, and V is 18 for Kinetics as per the paper. For the last dimension I’m not quite sure; the only thing I can say is that the second “person” seems to be missing quite often during training. Below is the output of print(x[0, :, 150, :, 0]), i.e. one middle frame of the first sample in the minibatch, for the first person:

tensor([[-0.0380,  0.0420, -0.1280, -0.2930,  0.0000,  0.2120,  0.3430,  0.0000,
         -0.0210,  0.0000,  0.0000,  0.1160,  0.0000,  0.0000, -0.0480,  0.0160,
          0.0000,  0.1260],
        [-0.1630,  0.0050,  0.0110,  0.4430,  0.0000, -0.0220,  0.4160,  0.0000,
          0.4920,  0.0000,  0.0000,  0.4920,  0.0000,  0.0000, -0.2530, -0.2230,
          0.0000, -0.2090],
        [ 0.7610,  0.5250,  0.4730,  0.2970,  0.0000,  0.4380,  0.3610,  0.0000,
          0.0650,  0.0000,  0.0000,  0.0530,  0.0000,  0.0000,  0.8300,  0.7860,
          0.0000,  0.8090]], device='cuda:0')

Good luck with your application and thanks for your help !
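For anyone following along, the working theory above (subtract half the width/height, then divide by width/height, so values land in roughly [-0.5, 0.5]) can be sketched like this. This is only a sketch of that theory, not code from the repo; `width`, `height`, and `normalize_keypoints` are assumed names, and NumPy stands in for PyTorch tensors:

```python
import numpy as np

def normalize_keypoints(xy, width, height):
    """Map pixel coordinates into roughly [-0.5, 0.5], per the working
    theory above: subtract half the frame size, then divide by the
    frame size. xy has shape (..., 2), ordered as (x, y)."""
    xy = np.asarray(xy, dtype=np.float64)
    out = np.empty_like(xy)
    out[..., 0] = (xy[..., 0] - width / 2.0) / width
    out[..., 1] = (xy[..., 1] - height / 2.0) / height
    return out

# The image centre maps to (0, 0); corners map to +/-0.5.
print(normalize_keypoints([[0, 0], [320, 240], [640, 480]], 640, 480))
```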

0 reactions
frankier commented, Jul 8, 2020

I meant the dimensions are in the order given:

(id_within_minibatch, channels, frame_num_aka_time, keypoints, person_id)

So:

  • N = id_within_minibatch (hint: use a DataLoader to make minibatches in the 1st dimension)
  • C = channels - (x, y, score) OR (x, y); has to match num_channels
  • T = frame_num_aka_time
  • V = keypoint/joint (probably stands for vertex)
  • M = person ID (for when there are multiple people within a frame, I would suppose)
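As a sanity check on the layout above, here is a sketch of packing one sample into that (N, C, T, V, M) order, using the Kinetics sizes quoted in this thread (3 channels, 300 frames, 18 joints, 2 people). NumPy stands in for PyTorch here purely to illustrate the indexing; `my_pose` is a hypothetical (18, 2) array of your own (x, y) joints for one frame, not anything from the repo:

```python
import numpy as np

# Shape convention from the comment above:
# N = samples in the minibatch, C = channels (x, y, score),
# T = frames, V = joints/keypoints, M = people per frame.
N, C, T, V, M = 1, 3, 300, 18, 2

x = np.zeros((N, C, T, V, M), dtype=np.float32)

# Hypothetical input: one person's 18 joints for one frame, as (x, y) pairs.
my_pose = np.random.rand(V, 2).astype(np.float32)

t, m = 150, 0                      # middle frame, first person
x[0, 0, t, :, m] = my_pose[:, 0]   # channel 0: x coordinates
x[0, 1, t, :, m] = my_pose[:, 1]   # channel 1: y coordinates
x[0, 2, t, :, m] = 1.0             # channel 2: score, visible joints set to 1

print(x.shape)  # (1, 3, 300, 18, 2)
```

Frames, joints, or people you have no data for simply stay zero, which matches the many zeros in the tensor dump above.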

By the way, I have been passing just (x, y) without the score, since I’m working with images + OpenPose and I think the score might be rather dependent on camera setup/resolution, so I would prefer to sacrifice in-domain accuracy for generalisation. It’s up to you whether you include the score or not.
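If you go the (x, y)-only route, dropping the score channel is just a slice along the C dimension (a sketch with NumPy standing in for PyTorch; num_channels in the config would then need to be 2):

```python
import numpy as np

# x has the (N, C, T, V, M) layout with C = 3, ordered (x, y, score).
x = np.random.rand(4, 3, 300, 18, 2).astype(np.float32)

x_no_score = x[:, :2]    # keep only the (x, y) channels
print(x_no_score.shape)  # (4, 2, 300, 18, 2)
```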

I would paste my code, but it isn’t working at the moment.

Here’s one of the YAML files used:

https://github.com/open-mmlab/mmskeleton/blob/master/configs/recognition/st_gcn/kinetics-skeleton-from-openpose.yaml
