The GQA results are lower than the reported performance.
We followed the default settings provided by the repo, but got lower performance on both the online and offline leaderboards. The log file is as follows. By the way, I tested the official mcan_small model and it achieved 58.3 on the online leaderboard, which is strange. Can anyone help fix this?
{ BATCH_SIZE }->64
{ BBOXFEAT_EMB_SIZE }->2048
{ CACHE_PATH }->./results/cache
{ CKPTS_PATH }->./ckpts
{ CKPT_EPOCH }->0
{ CKPT_PATH }->None
{ CKPT_VERSION }->2134787
{ DATASET }->gqa
{ DATA_PATH }->{'vqa': './data/vqa', 'gqa': './data/gqa', 'clevr': './data/clevr'}
{ DATA_ROOT }->./data
{ DEVICES }->[0]
{ DROPOUT_R }->0.1
{ EVAL_BATCH_SIZE }->32
{ EVAL_EVERY_EPOCH }->True
{ FEATS_PATH }->{'vqa': {'train': './data/vqa/feats/train2014', 'val': './data/vqa/feats/val2014', 'test': './data/vqa/feats/test2015'}, 'gqa': {'default-frcn': './data/gqa/feats/gqa-frcn', 'default-grid': './data/gqa/feats/gqa-grid'}, 'clevr': {'train': './data/clevr/feats/train', 'val': './data/clevr/feats/val', 'test': './data/clevr/feats/test'}}
{ FEAT_SIZE }->{'vqa': {'FRCN_FEAT_SIZE': (100, 2048), 'BBOX_FEAT_SIZE': (100, 5)}, 'gqa': {'FRCN_FEAT_SIZE': (100, 2048), 'GRID_FEAT_SIZE': (49, 2048), 'BBOX_FEAT_SIZE': (100, 5)}, 'clevr': {'GRID_FEAT_SIZE': (196, 1024)}}
{ FF_SIZE }->2048
{ FLAT_GLIMPSES }->1
{ FLAT_MLP_SIZE }->512
{ FLAT_OUT_SIZE }->1024
{ GPU }->2
{ GRAD_ACCU_STEPS }->1
{ GRAD_NORM_CLIP }->-1
{ HIDDEN_SIZE }->512
{ LAYER }->6
{ LOG_PATH }->./results/log
{ LOSS_FUNC }->ce
{ LOSS_FUNC_NAME_DICT }->{'ce': 'CrossEntropyLoss', 'bce': 'BCEWithLogitsLoss', 'kld': 'KLDivLoss', 'mse': 'MSELoss'}
{ LOSS_FUNC_NONLINEAR }->{'ce': [None, 'flat'], 'bce': [None, None], 'kld': ['log_softmax', None], 'mse': [None, None]}
{ LOSS_REDUCTION }->sum
{ LR_BASE }->0.0001
{ LR_DECAY_LIST }->[8, 10]
{ LR_DECAY_R }->0.2
{ MAX_EPOCH }->11
{ MODEL }->mcan_small
{ MODEL_USE }->mcan
{ MULTI_HEAD }->8
{ NUM_WORKERS }->8
{ N_GPU }->1
{ OPT }->Adam
{ OPT_PARAMS }->{'betas': (0.9, 0.98), 'eps': 1e-09, 'weight_decay': 0, 'amsgrad': False}
{ PIN_MEM }->True
{ PRED_PATH }->./results/pred
{ RAW_PATH }->{'vqa': {'train': './data/vqa/raw/v2_OpenEnded_mscoco_train2014_questions.json', 'train-anno': './data/vqa/raw/v2_mscoco_train2014_annotations.json', 'val': './data/vqa/raw/v2_OpenEnded_mscoco_val2014_questions.json', 'val-anno': './data/vqa/raw/v2_mscoco_val2014_annotations.json', 'vg': './data/vqa/raw/VG_questions.json', 'vg-anno': './data/vqa/raw/VG_annotations.json', 'test': './data/vqa/raw/v2_OpenEnded_mscoco_test2015_questions.json'}, 'gqa': {'train': './data/gqa/raw/questions1.2/train_balanced_questions.json', 'val': './data/gqa/raw/questions1.2/val_balanced_questions.json', 'testdev': './data/gqa/raw/questions1.2/testdev_balanced_questions.json', 'test': './data/gqa/raw/questions1.2/submission_all_questions.json', 'val_all': './data/gqa/raw/questions1.2/val_all_questions.json', 'testdev_all': './data/gqa/raw/questions1.2/testdev_all_questions.json', 'train_choices': './data/gqa/raw/eval/train_choices', 'val_choices': './data/gqa/raw/eval/val_choices.json'}, 'clevr': {'train': './data/clevr/raw/questions/CLEVR_train_questions.json', 'val': './data/clevr/raw/questions/CLEVR_val_questions.json', 'test': './data/clevr/raw/questions/CLEVR_test_questions.json'}}
{ RESULT_PATH }->./results/result_test
{ RESUME }->False
{ RUN_MODE }->train
{ SEED }->2134787
{ SPLIT }->{'train': 'train+val', 'val': 'testdev', 'test': 'test'}
{ SPLITS }->{'vqa': {'train': '', 'val': 'val', 'test': 'test'}, 'gqa': {'train': 'train+val', 'val': 'testdev', 'test': 'test'}, 'clevr': {'train': '', 'val': 'val', 'test': 'test'}}
{ SUB_BATCH_SIZE }->64
{ TASK_LOSS_CHECK }->{'vqa': ['bce', 'kld'], 'gqa': ['ce'], 'clevr': ['ce']}
{ TEST_SAVE_PRED }->False
{ TRAIN_SPLIT }->train+val
{ USE_AUX_FEAT }->True
{ USE_BBOX_FEAT }->True
{ USE_GLOVE }->True
{ VERBOSE }->True
{ VERSION }->2134787
{ WARMUP_EPOCH }->2
{ WORD_EMBED_SIZE }->300
=====================================
nowTime: 2020-01-19 14:05:12
Epoch: 1, Loss: 1.7046293609006327, Lr: 6.666666666666667e-05
Elapsed time: 5132, Speed(s/batch): 0.3055514312835215
Binary: 58.18%
Open: 36.41%
Accuracy: 46.41%
Distribution: 3.07 (lower is better)
Accuracy / structural type:
choose: 60.05% (1129 questions)
compare: 54.33% (589 questions)
logical: 56.63% (1803 questions)
query: 36.41% (6805 questions)
verify: 59.50% (2252 questions)
Accuracy / semantic type:
attr: 50.46% (5186 questions)
cat: 42.04% (1149 questions)
global: 45.22% (157 questions)
obj: 64.14% (778 questions)
rel: 40.83% (5308 questions)
Accuracy / steps number:
1: 60.76% (237 questions)
2: 43.16% (6395 questions)
3: 47.63% (4266 questions)
4: 45.02% (793 questions)
5: 59.37% (822 questions)
6: 78.05% (41 questions)
7: 100.00% (20 questions)
8: 100.00% (3 questions)
9: 100.00% (1 questions)
Accuracy / words number:
3: 32.45% (151 questions)
4: 45.56% (630 questions)
5: 38.53% (1290 questions)
6: 43.30% (2074 questions)
7: 43.97% (1642 questions)
8: 47.85% (1185 questions)
9: 50.12% (1281 questions)
10: 51.88% (1249 questions)
11: 45.47% (994 questions)
12: 51.10% (638 questions)
13: 50.43% (462 questions)
14: 50.72% (345 questions)
15: 57.81% (237 questions)
16: 49.57% (117 questions)
17: 44.68% (94 questions)
18: 52.63% (76 questions)
19: 60.47% (43 questions)
20: 53.12% (32 questions)
21: 57.89% (19 questions)
22: 50.00% (12 questions)
23: 25.00% (4 questions)
24: 100.00% (2 questions)
25: 100.00% (1 questions)
=====================================
nowTime: 2020-01-19 15:31:07
Epoch: 2, Loss: 1.3994251725464903, Lr: 0.0001
Elapsed time: 5100, Speed(s/batch): 0.3036757551451245
Binary: 63.94%
Open: 37.90%
Accuracy: 49.85%
Distribution: 2.36 (lower is better)
Accuracy / structural type:
choose: 65.46% (1129 questions)
compare: 55.86% (589 questions)
logical: 61.40% (1803 questions)
query: 37.90% (6805 questions)
verify: 67.32% (2252 questions)
Accuracy / semantic type:
attr: 52.97% (5186 questions)
cat: 40.91% (1149 questions)
global: 46.50% (157 questions)
obj: 79.05% (778 questions)
rel: 44.56% (5308 questions)
Accuracy / steps number:
1: 64.56% (237 questions)
2: 45.54% (6395 questions)
3: 52.93% (4266 questions)
4: 48.17% (793 questions)
5: 62.53% (822 questions)
6: 65.85% (41 questions)
7: 100.00% (20 questions)
8: 100.00% (3 questions)
9: 100.00% (1 questions)
Accuracy / words number:
3: 31.13% (151 questions)
4: 44.44% (630 questions)
5: 38.76% (1290 questions)
6: 45.90% (2074 questions)
7: 47.69% (1642 questions)
8: 53.84% (1185 questions)
9: 55.35% (1281 questions)
10: 55.40% (1249 questions)
11: 51.21% (994 questions)
12: 54.86% (638 questions)
13: 53.68% (462 questions)
14: 59.13% (345 questions)
15: 56.96% (237 questions)
16: 57.26% (117 questions)
17: 51.06% (94 questions)
18: 53.95% (76 questions)
19: 69.77% (43 questions)
20: 56.25% (32 questions)
21: 57.89% (19 questions)
22: 50.00% (12 questions)
23: 0.00% (4 questions)
24: 50.00% (2 questions)
25: 100.00% (1 questions)
=====================================
nowTime: 2020-01-19 16:56:30
Epoch: 3, Loss: 1.2595117600660048, Lr: 0.0001
Elapsed time: 5073, Speed(s/batch): 0.3020486486869416
Binary: 68.66%
Open: 37.74%
Accuracy: 51.93%
Distribution: 2.90 (lower is better)
Accuracy / structural type:
choose: 68.47% (1129 questions)
compare: 58.23% (589 questions)
logical: 66.89% (1803 questions)
query: 37.74% (6805 questions)
verify: 72.91% (2252 questions)
Accuracy / semantic type:
attr: 56.85% (5186 questions)
cat: 40.91% (1149 questions)
global: 53.50% (157 questions)
obj: 82.52% (778 questions)
rel: 44.99% (5308 questions)
Accuracy / steps number:
1: 67.09% (237 questions)
2: 46.25% (6395 questions)
3: 55.53% (4266 questions)
4: 55.11% (793 questions)
5: 67.76% (822 questions)
6: 70.73% (41 questions)
7: 95.00% (20 questions)
8: 100.00% (3 questions)
9: 100.00% (1 questions)
Accuracy / words number:
3: 34.44% (151 questions)
4: 46.19% (630 questions)
5: 38.99% (1290 questions)
6: 47.73% (2074 questions)
7: 51.34% (1642 questions)
8: 53.76% (1185 questions)
9: 58.24% (1281 questions)
10: 58.29% (1249 questions)
11: 52.21% (994 questions)
12: 58.78% (638 questions)
13: 54.76% (462 questions)
14: 60.00% (345 questions)
15: 63.71% (237 questions)
16: 63.25% (117 questions)
17: 52.13% (94 questions)
18: 53.95% (76 questions)
19: 79.07% (43 questions)
20: 50.00% (32 questions)
21: 52.63% (19 questions)
22: 66.67% (12 questions)
23: 50.00% (4 questions)
24: 100.00% (2 questions)
25: 100.00% (1 questions)
=====================================
nowTime: 2020-01-19 18:21:26
Epoch: 4, Loss: 1.1537336046805873, Lr: 0.0001
Elapsed time: 5056, Speed(s/batch): 0.3010697898679644
Binary: 70.19%
Open: 38.28%
Accuracy: 52.93%
Distribution: 2.06 (lower is better)
Accuracy / structural type:
choose: 69.26% (1129 questions)
compare: 46.01% (589 questions)
logical: 71.05% (1803 questions)
query: 38.28% (6805 questions)
verify: 76.29% (2252 questions)
Accuracy / semantic type:
attr: 58.79% (5186 questions)
cat: 40.30% (1149 questions)
global: 56.69% (157 questions)
obj: 81.36% (778 questions)
rel: 45.65% (5308 questions)
Accuracy / steps number:
1: 64.98% (237 questions)
2: 47.58% (6395 questions)
3: 54.78% (4266 questions)
4: 62.80% (793 questions)
5: 68.86% (822 questions)
6: 87.80% (41 questions)
7: 95.00% (20 questions)
8: 100.00% (3 questions)
9: 100.00% (1 questions)
Accuracy / words number:
3: 25.83% (151 questions)
4: 48.25% (630 questions)
5: 39.38% (1290 questions)
6: 50.00% (2074 questions)
7: 52.19% (1642 questions)
8: 55.70% (1185 questions)
9: 59.80% (1281 questions)
10: 57.49% (1249 questions)
11: 55.23% (994 questions)
12: 55.96% (638 questions)
13: 56.71% (462 questions)
14: 59.42% (345 questions)
15: 62.45% (237 questions)
16: 66.67% (117 questions)
17: 54.26% (94 questions)
18: 59.21% (76 questions)
19: 79.07% (43 questions)
20: 53.12% (32 questions)
21: 52.63% (19 questions)
22: 58.33% (12 questions)
23: 50.00% (4 questions)
24: 100.00% (2 questions)
25: 100.00% (1 questions)
=====================================
nowTime: 2020-01-19 19:46:06
Epoch: 5, Loss: 1.0821526790137108, Lr: 0.0001
Elapsed time: 5034, Speed(s/batch): 0.2997583388487309
Binary: 72.18%
Open: 38.87%
Accuracy: 54.16%
Distribution: 2.16 (lower is better)
Accuracy / structural type:
choose: 72.54% (1129 questions)
compare: 60.95% (589 questions)
logical: 70.66% (1803 questions)
query: 38.87% (6805 questions)
verify: 76.15% (2252 questions)
Accuracy / semantic type:
attr: 61.16% (5186 questions)
cat: 44.21% (1149 questions)
global: 57.96% (157 questions)
obj: 82.52% (778 questions)
rel: 45.20% (5308 questions)
Accuracy / steps number:
1: 71.31% (237 questions)
2: 47.82% (6395 questions)
3: 57.59% (4266 questions)
4: 60.40% (793 questions)
5: 72.02% (822 questions)
6: 80.49% (41 questions)
7: 100.00% (20 questions)
8: 100.00% (3 questions)
9: 100.00% (1 questions)
Accuracy / words number:
3: 27.81% (151 questions)
4: 47.30% (630 questions)
5: 39.38% (1290 questions)
6: 50.29% (2074 questions)
7: 53.17% (1642 questions)
8: 56.20% (1185 questions)
9: 61.83% (1281 questions)
10: 59.01% (1249 questions)
11: 57.44% (994 questions)
12: 60.97% (638 questions)
13: 59.31% (462 questions)
14: 61.16% (345 questions)
15: 64.56% (237 questions)
16: 64.10% (117 questions)
17: 58.51% (94 questions)
18: 64.47% (76 questions)
19: 74.42% (43 questions)
20: 59.38% (32 questions)
21: 63.16% (19 questions)
22: 66.67% (12 questions)
23: 50.00% (4 questions)
24: 100.00% (2 questions)
25: 100.00% (1 questions)
=====================================
nowTime: 2020-01-19 21:10:24
Epoch: 6, Loss: 1.0242240601529657, Lr: 0.0001
Elapsed time: 5283, Speed(s/batch): 0.3145479110343935
Binary: 72.37%
Open: 39.44%
Accuracy: 54.56%
Distribution: 2.30 (lower is better)
Accuracy / structural type:
choose: 69.97% (1129 questions)
compare: 64.86% (589 questions)
logical: 69.94% (1803 questions)
query: 39.44% (6805 questions)
verify: 77.49% (2252 questions)
Accuracy / semantic type:
attr: 60.70% (5186 questions)
cat: 43.08% (1149 questions)
global: 54.78% (157 questions)
obj: 84.32% (778 questions)
rel: 46.67% (5308 questions)
Accuracy / steps number:
1: 70.46% (237 questions)
2: 48.54% (6395 questions)
3: 58.49% (4266 questions)
4: 59.02% (793 questions)
5: 69.34% (822 questions)
6: 82.93% (41 questions)
7: 100.00% (20 questions)
8: 100.00% (3 questions)
9: 100.00% (1 questions)
Accuracy / words number:
3: 29.80% (151 questions)
4: 45.56% (630 questions)
5: 40.47% (1290 questions)
6: 51.74% (2074 questions)
7: 54.57% (1642 questions)
8: 55.70% (1185 questions)
9: 59.88% (1281 questions)
10: 61.81% (1249 questions)
11: 56.34% (994 questions)
12: 61.91% (638 questions)
13: 56.06% (462 questions)
14: 63.77% (345 questions)
15: 60.76% (237 questions)
16: 63.25% (117 questions)
17: 69.15% (94 questions)
18: 65.79% (76 questions)
19: 72.09% (43 questions)
20: 56.25% (32 questions)
21: 63.16% (19 questions)
22: 66.67% (12 questions)
23: 25.00% (4 questions)
24: 100.00% (2 questions)
25: 100.00% (1 questions)
=====================================
nowTime: 2020-01-19 22:38:50
Epoch: 7, Loss: 0.975194708264801, Lr: 0.0001
Elapsed time: 5540, Speed(s/batch): 0.32989268883207523
Binary: 72.91%
Open: 38.75%
Accuracy: 54.43%
Distribution: 1.97 (lower is better)
Accuracy / structural type:
choose: 72.45% (1129 questions)
compare: 62.48% (589 questions)
logical: 70.99% (1803 questions)
query: 38.75% (6805 questions)
verify: 77.40% (2252 questions)
Accuracy / semantic type:
attr: 60.89% (5186 questions)
cat: 42.91% (1149 questions)
global: 59.87% (157 questions)
obj: 84.96% (778 questions)
rel: 45.97% (5308 questions)
Accuracy / steps number:
1: 71.73% (237 questions)
2: 48.18% (6395 questions)
3: 58.06% (4266 questions)
4: 62.04% (793 questions)
5: 69.46% (822 questions)
6: 78.05% (41 questions)
7: 95.00% (20 questions)
8: 100.00% (3 questions)
9: 100.00% (1 questions)
Accuracy / words number:
3: 30.46% (151 questions)
4: 46.51% (630 questions)
5: 40.47% (1290 questions)
6: 50.72% (2074 questions)
7: 54.93% (1642 questions)
8: 57.13% (1185 questions)
9: 60.73% (1281 questions)
10: 59.89% (1249 questions)
11: 57.34% (994 questions)
12: 61.13% (638 questions)
13: 51.73% (462 questions)
14: 64.06% (345 questions)
15: 63.29% (237 questions)
16: 66.67% (117 questions)
17: 58.51% (94 questions)
18: 64.47% (76 questions)
19: 69.77% (43 questions)
20: 62.50% (32 questions)
21: 63.16% (19 questions)
22: 75.00% (12 questions)
23: 50.00% (4 questions)
24: 100.00% (2 questions)
25: 100.00% (1 questions)
=====================================
nowTime: 2020-01-20 00:11:56
Epoch: 8, Loss: 0.9346485017632614, Lr: 0.0001
Elapsed time: 5532, Speed(s/batch): 0.3294190824187974
Binary: 72.63%
Open: 38.85%
Accuracy: 54.36%
Distribution: 2.27 (lower is better)
Accuracy / structural type:
choose: 69.97% (1129 questions)
compare: 62.48% (589 questions)
logical: 71.77% (1803 questions)
query: 38.85% (6805 questions)
verify: 77.31% (2252 questions)
Accuracy / semantic type:
attr: 60.20% (5186 questions)
cat: 44.73% (1149 questions)
global: 57.32% (157 questions)
obj: 84.19% (778 questions)
rel: 46.27% (5308 questions)
Accuracy / steps number:
1: 69.20% (237 questions)
2: 48.71% (6395 questions)
3: 56.92% (4266 questions)
4: 61.41% (793 questions)
5: 71.53% (822 questions)
6: 82.93% (41 questions)
7: 90.00% (20 questions)
8: 66.67% (3 questions)
9: 100.00% (1 questions)
Accuracy / words number:
3: 29.14% (151 questions)
4: 47.46% (630 questions)
5: 41.55% (1290 questions)
6: 51.25% (2074 questions)
7: 53.78% (1642 questions)
8: 55.86% (1185 questions)
9: 59.48% (1281 questions)
10: 61.33% (1249 questions)
11: 56.54% (994 questions)
12: 61.91% (638 questions)
13: 56.93% (462 questions)
14: 58.84% (345 questions)
15: 62.03% (237 questions)
16: 63.25% (117 questions)
17: 62.77% (94 questions)
18: 65.79% (76 questions)
19: 60.47% (43 questions)
20: 56.25% (32 questions)
21: 63.16% (19 questions)
22: 75.00% (12 questions)
23: 25.00% (4 questions)
24: 100.00% (2 questions)
25: 100.00% (1 questions)
=====================================
nowTime: 2020-01-20 01:44:43
Epoch: 9, Loss: 0.6903910459351245, Lr: 2e-05
Elapsed time: 5527, Speed(s/batch): 0.32908785358784626
Binary: 75.78%
Open: 41.45%
Accuracy: 57.21%
Distribution: 1.63 (lower is better)
Accuracy / structural type:
choose: 74.58% (1129 questions)
compare: 66.89% (589 questions)
logical: 74.27% (1803 questions)
query: 41.45% (6805 questions)
verify: 79.93% (2252 questions)
Accuracy / semantic type:
attr: 64.02% (5186 questions)
cat: 46.30% (1149 questions)
global: 59.87% (157 questions)
obj: 86.50% (778 questions)
rel: 48.55% (5308 questions)
Accuracy / steps number:
1: 72.15% (237 questions)
2: 51.34% (6395 questions)
3: 60.34% (4266 questions)
4: 63.68% (793 questions)
5: 73.60% (822 questions)
6: 82.93% (41 questions)
7: 100.00% (20 questions)
8: 100.00% (3 questions)
9: 100.00% (1 questions)
Accuracy / words number:
3: 29.80% (151 questions)
4: 48.73% (630 questions)
5: 44.65% (1290 questions)
6: 54.44% (2074 questions)
7: 55.54% (1642 questions)
8: 60.00% (1185 questions)
9: 64.09% (1281 questions)
10: 63.33% (1249 questions)
11: 59.66% (994 questions)
12: 64.26% (638 questions)
13: 59.31% (462 questions)
14: 61.45% (345 questions)
15: 64.14% (237 questions)
16: 64.96% (117 questions)
17: 62.77% (94 questions)
18: 68.42% (76 questions)
19: 76.74% (43 questions)
20: 53.12% (32 questions)
21: 73.68% (19 questions)
22: 66.67% (12 questions)
23: 25.00% (4 questions)
24: 100.00% (2 questions)
25: 100.00% (1 questions)
=====================================
nowTime: 2020-01-20 03:17:26
Epoch: 10, Loss: 0.5974722676920379, Lr: 2e-05
Elapsed time: 5601, Speed(s/batch): 0.3334813243748319
Binary: 75.92%
Open: 41.72%
Accuracy: 57.42%
Distribution: 1.49 (lower is better)
Accuracy / structural type:
choose: 75.29% (1129 questions)
compare: 66.38% (589 questions)
logical: 74.32% (1803 questions)
query: 41.72% (6805 questions)
verify: 80.02% (2252 questions)
Accuracy / semantic type:
attr: 64.02% (5186 questions)
cat: 45.95% (1149 questions)
global: 57.32% (157 questions)
obj: 87.15% (778 questions)
rel: 49.10% (5308 questions)
Accuracy / steps number:
1: 75.11% (237 questions)
2: 51.49% (6395 questions)
3: 60.45% (4266 questions)
4: 63.43% (793 questions)
5: 74.21% (822 questions)
6: 85.37% (41 questions)
7: 100.00% (20 questions)
8: 100.00% (3 questions)
9: 100.00% (1 questions)
Accuracy / words number:
3: 35.76% (151 questions)
4: 51.27% (630 questions)
5: 44.88% (1290 questions)
6: 54.05% (2074 questions)
7: 56.52% (1642 questions)
8: 58.48% (1185 questions)
9: 63.39% (1281 questions)
10: 63.73% (1249 questions)
11: 58.75% (994 questions)
12: 64.11% (638 questions)
13: 60.39% (462 questions)
14: 64.06% (345 questions)
15: 64.56% (237 questions)
16: 65.81% (117 questions)
17: 62.77% (94 questions)
18: 72.37% (76 questions)
19: 76.74% (43 questions)
20: 65.62% (32 questions)
21: 63.16% (19 questions)
22: 75.00% (12 questions)
23: 25.00% (4 questions)
24: 100.00% (2 questions)
25: 100.00% (1 questions)
=====================================
nowTime: 2020-01-20 04:51:12
Epoch: 11, Loss: 0.5036989731823418, Lr: 4.000000000000001e-06
Elapsed time: 5452, Speed(s/batch): 0.3246474958766384
Binary: 75.91%
Open: 41.63%
Accuracy: 57.36%
Distribution: 1.54 (lower is better)
Accuracy / structural type:
choose: 75.02% (1129 questions)
compare: 67.91% (589 questions)
logical: 74.32% (1803 questions)
query: 41.63% (6805 questions)
verify: 79.71% (2252 questions)
Accuracy / semantic type:
attr: 64.29% (5186 questions)
cat: 46.65% (1149 questions)
global: 58.60% (157 questions)
obj: 85.86% (778 questions)
rel: 48.70% (5308 questions)
Accuracy / steps number:
1: 75.95% (237 questions)
2: 51.07% (6395 questions)
3: 60.81% (4266 questions)
4: 64.06% (793 questions)
5: 73.97% (822 questions)
6: 85.37% (41 questions)
7: 100.00% (20 questions)
8: 100.00% (3 questions)
9: 100.00% (1 questions)
Accuracy / words number:
3: 32.45% (151 questions)
4: 49.68% (630 questions)
5: 44.42% (1290 questions)
6: 54.39% (2074 questions)
7: 57.06% (1642 questions)
8: 59.16% (1185 questions)
9: 63.08% (1281 questions)
10: 63.49% (1249 questions)
11: 59.36% (994 questions)
12: 63.79% (638 questions)
13: 58.87% (462 questions)
14: 64.06% (345 questions)
15: 63.71% (237 questions)
16: 66.67% (117 questions)
17: 63.83% (94 questions)
18: 73.68% (76 questions)
19: 76.74% (43 questions)
20: 65.62% (32 questions)
21: 63.16% (19 questions)
22: 66.67% (12 questions)
23: 25.00% (4 questions)
24: 100.00% (2 questions)
25: 100.00% (1 questions)
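To compare epochs at a glance, the log above can be condensed into one record per epoch. The following is a minimal sketch, assuming the log keeps the `Epoch: N, Loss: ..., Lr: ...` and `Accuracy: xx%` line shapes shown above. `parse_epochs` and `best_epoch` are hypothetical helper names for illustration, not part of the repo.

```python
import re

# Matches lines such as "Epoch: 9, Loss: 0.69, Lr: 2e-05".
EPOCH_RE = re.compile(r"^Epoch:\s*(\d+),\s*Loss:\s*([\d.eE+-]+),\s*Lr:\s*([\d.eE+-]+)")
# Matches only the overall "Accuracy: 57.21%" line, not "Accuracy / structural type:".
ACC_RE = re.compile(r"^Accuracy:\s*([\d.]+)%")


def parse_epochs(log_text):
    """Return one dict per epoch with its epoch number, loss, lr and overall accuracy."""
    records = []
    current = None
    for raw in log_text.splitlines():
        line = raw.strip()
        m = EPOCH_RE.match(line)
        if m:
            current = {
                "epoch": int(m.group(1)),
                "loss": float(m.group(2)),
                "lr": float(m.group(3)),
            }
            continue
        m = ACC_RE.match(line)
        if m and current is not None:
            current["accuracy"] = float(m.group(1))
            records.append(current)
            current = None
    return records


def best_epoch(records):
    """Return the record with the highest overall accuracy."""
    return max(records, key=lambda r: r["accuracy"])


if __name__ == "__main__":
    # Excerpt copied from the last three epochs of the log above.
    sample = """
    Epoch: 9, Loss: 0.6903910459351245, Lr: 2e-05
    Accuracy: 57.21%
    Accuracy / structural type:
    Epoch: 10, Loss: 0.5974722676920379, Lr: 2e-05
    Accuracy: 57.42%
    Epoch: 11, Loss: 0.5036989731823418, Lr: 4.000000000000001e-06
    Accuracy: 57.36%
    """
    best = best_epoch(parse_epochs(sample))
    print(best["epoch"], best["accuracy"])
```

On the log above, this picks epoch 10 (57.42%) as the best testdev result, with epoch 11 at 57.36%.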

Actually, I can't reach 58.3; my best result is around 57.3~57.4. However, I downloaded the official pretrained model for testing, and it gets 58.4. So I would also like to know the random seed. I would appreciate it if @MIL-VLG could share some training details.
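For reference, the config dump at the top of the issue lists `{ SEED }->2134787` for the posted run. Below is a minimal sketch of pinning the usual sources of randomness to a given value in a PyTorch training script. `seed_everything` is a hypothetical helper, and how the repo itself consumes its seed is not shown in this thread, so treat this as an assumption.

```python
import os
import random


def seed_everything(seed):
    """Seed Python, and NumPy / PyTorch when they are installed (hypothetical helper)."""
    random.seed(seed)
    # Only affects Python processes started after this point.
    os.environ["PYTHONHASHSEED"] = str(seed)
    try:
        import numpy as np
        np.random.seed(seed)
    except ImportError:
        pass
    try:
        import torch
        torch.manual_seed(seed)
        # Silently ignored when CUDA is not available.
        torch.cuda.manual_seed_all(seed)
        # Prefer deterministic cuDNN kernels over autotuned ones.
        torch.backends.cudnn.deterministic = True
        torch.backends.cudnn.benchmark = False
    except ImportError:
        pass


# Seed value taken from the config dump in the original post.
seed_everything(2134787)
```

Even with a fixed seed, some GPU operations are nondeterministic, so small run-to-run differences in accuracy can remain.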
I am trying to run on the same dataset, but I am running into a key error. Could you please help me?