
Improper normalization of the scores?

See original GitHub issue

In train.py, you normalize the scores according to:

test_map = [list() for p in pool_layers]
for l, p in enumerate(pool_layers):
    test_norm = torch.tensor(test_dist[l], dtype=torch.double)  # EHWx1
    test_norm -= torch.max(test_norm)  # normalize likelihoods to (-Inf, 0] by subtracting a constant
    test_prob = torch.exp(test_norm)   # convert to probs in range [0, 1]
    test_mask = test_prob.reshape(-1, height[l], width[l])
    # upsample
    test_map[l] = F.interpolate(test_mask.unsqueeze(1),
        size=c.crp_size, mode='bilinear', align_corners=True).squeeze().numpy()
# score aggregation
score_map = np.zeros_like(test_map[0])
for l, p in enumerate(pool_layers):
    score_map += test_map[l]

This normalization is fine as long as it is applied to a single map, since subtracting a constant and exponentiating is a monotonically increasing transformation. But summing up the maps from the different layers makes no sense to me: the relative weighting of the score maps in the aggregation (last line) then depends on the test set, or more precisely, on the maximum of each individual map over the test set. Am I missing something here, or is this normalization improper?
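To illustrate the concern, here is a small numpy sketch with made-up numbers (not taken from train.py): subtracting each layer's own test-set max before exponentiating rescales that whole layer by exp(-max_l), so a single extreme score in the test set shrinks every other score of its layer and shifts the aggregation weighting.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical per-pixel log-likelihoods for two pooling layers on very
# different scales (as multi-scale feature maps typically are).
layer_a = rng.normal(loc=-5.0, scale=1.0, size=100)
layer_b = rng.normal(loc=-50.0, scale=1.0, size=100)

# The normalization from train.py: subtract each layer's own max, exponentiate.
prob_a = np.exp(layer_a - layer_a.max())  # values in (0, 1], max is exactly 1
prob_b = np.exp(layer_b - layer_b.max())

# Both layers now contribute on the same order to the sum, even though their
# raw likelihoods differ by ~45 nats.
print(prob_a.mean(), prob_b.mean())

# Add one extreme outlier to layer A: its max jumps, and every other
# normalized score in that layer collapses toward zero, changing the
# relative weighting in the aggregated score_map.
layer_a_out = np.append(layer_a, 10.0)
prob_a_out = np.exp(layer_a_out - layer_a_out.max())
print(prob_a_out[:-1].mean())  # far smaller than prob_a.mean()
```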

Issue Analytics

  • State: open
  • Created: a year ago
  • Comments: 9 (4 by maintainers)

Top GitHub Comments

1 reaction
marco-rudolph commented, Mar 29, 2022

@marco-rudolph hmm, I think you are right that the max is improper in the multi-scale case if we cannot use any statistics. In practice, we probably know past statistics and can assume the max. This might hold in some practical cases, but it cannot be assumed in the anomaly detection setting. Furthermore, the max is very sensitive to the test set, since the exponentiation can produce very large values and the scores can explode. Using only train data would change the weighting a lot. In general, using the max is very sensitive to outliers. In practice, I observed that a simple addition without weighting worsens the mean AUPRO score by about 3%, which is quite significant compared to other work.

0 reactions
gudovskiy commented, Apr 19, 2022

@alevangel well, you can replace test with train
