Improper normalization of the scores?
In train.py, you normalize the scores according to:
test_map = [list() for p in pool_layers]
for l, p in enumerate(pool_layers):
    test_norm = torch.tensor(test_dist[l], dtype=torch.double)  # EHWx1
    test_norm -= torch.max(test_norm)  # normalize likelihoods to (-Inf:0] by subtracting a constant
    test_prob = torch.exp(test_norm)   # convert to probs in range [0:1]
    test_mask = test_prob.reshape(-1, height[l], width[l])
    # upsample
    test_map[l] = F.interpolate(test_mask.unsqueeze(1),
                                size=c.crp_size, mode='bilinear',
                                align_corners=True).squeeze().numpy()
# score aggregation
score_map = np.zeros_like(test_map[0])
for l, p in enumerate(pool_layers):
    score_map += test_map[l]
This normalization is fine as long as it is applied to a single map, since subtracting the maximum is a monotonically increasing transformation and does not change the ranking. Once the maps from the different layers are added up, however, it no longer makes sense to me: the relative weighting of the score maps in the aggregation (the last loop) depends on the test set, or more precisely on the maximum of each individual map over the test set. Am I missing something here, or is this normalization improper?
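To illustrate the concern, here is a small toy sketch (not taken from the repository; the tensors and their values are made up) showing how subtracting each layer's own test-set maximum implicitly re-weights the layers in the sum:

import torch

# Hypothetical log-likelihoods of three test samples for two layers
# whose scores live on very different scales.
log_lik_a = torch.tensor([-100.0, -98.0, -90.0], dtype=torch.double)  # layer A over the test set
log_lik_b = torch.tensor([-10.0, -8.0, -0.5], dtype=torch.double)     # layer B over the test set

# Per-layer normalization as in the quoted code: subtract each layer's own test-set max.
prob_a = torch.exp(log_lik_a - log_lik_a.max())
prob_b = torch.exp(log_lik_b - log_lik_b.max())
print(prob_a + prob_b)  # aggregation by plain addition

# Adding one extra test sample that only raises layer A's maximum rescales
# prob_a for every other sample while leaving prob_b untouched, so the relative
# contribution of the two layers to the sum depends on the test set itself.
log_lik_a_ext = torch.cat([log_lik_a, torch.tensor([-50.0], dtype=torch.double)])
prob_a_ext = torch.exp(log_lik_a_ext - log_lik_a_ext.max())
print(prob_a_ext[:3] + prob_b)  # same three samples, different aggregated scores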
Issue Analytics
- State:
- Created a year ago
- Comments:9 (4 by maintainers)
Top GitHub Comments
@marco-rudolph hmm, I think you are right that the max is improper for the multi-scale case if we cannot use any statistics. In practice, we probably know past statistics and can assume the max.
This might hold in some practical cases, but it cannot be assumed in the anomaly detection setting. Furthermore, the max is very sensitive to the test set, since the exponentiation can produce very large values and the scores can explode. Using only train data would change the weighting a lot. In general, the use of the max is very sensitive to outliers. In practice, I observed that a simple addition without weighting worsens the mean AUPRO score by about 3%, which is quite significant compared to other work.

@alevangel well, you can replace test with train
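For reference, a minimal sketch of that suggested workaround, with the normalization constant estimated on the training set instead of the test set. It assumes a train_dist list with the same per-layer layout as test_dist (log-likelihoods collected on train data); the remaining variables are the ones from the snippet quoted in the issue:

import numpy as np
import torch
import torch.nn.functional as F

# Sketch only: train_dist is assumed to hold per-layer log-likelihoods
# collected on the training set, analogous to test_dist.
test_map = [list() for p in pool_layers]
for l, p in enumerate(pool_layers):
    train_norm = torch.tensor(train_dist[l], dtype=torch.double)
    test_norm = torch.tensor(test_dist[l], dtype=torch.double)  # EHWx1
    test_norm -= torch.max(train_norm)  # constant estimated on train data, independent of the test set
    test_prob = torch.exp(test_norm)    # no longer guaranteed to be <= 1
    test_mask = test_prob.reshape(-1, height[l], width[l])
    # upsample
    test_map[l] = F.interpolate(test_mask.unsqueeze(1),
                                size=c.crp_size, mode='bilinear',
                                align_corners=True).squeeze().numpy()
# score aggregation
score_map = np.zeros_like(test_map[0])
for l, p in enumerate(pool_layers):
    score_map += test_map[l]

As noted in the comment above, this still affects the relative weighting of the layers, since the constants now come from train-set rather than test-set maxima.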