
neptune.ai logger console error: X-coordinates must be strictly increasing

See original GitHub issue

🐛 Bug

When using the neptune.ai logger, epochs are logged automatically, even though I never explicitly asked for that. I also get a warning, yet everything else seems to be logged correctly (apart from the epochs, which are logged every training step rather than once per epoch):

WARNING:neptune.internal.channels.channels_values_sender:Failed to send channel value: Received batch errors sending channels' values to experiment BAC-32. Cause: Error(code=400, message='X-coordinates must be strictly increasing for channel: 2262414e-e5fc-4a8f-b3f8-4a8d84d7a5e2. Invalid point: InputChannelValue(2020-03-18T17:18:04.233Z,Some(164062.27599999998),None,Some(Epoch 35: 99%|#######', type=None) (metricId: '2262414e-e5fc-4a8f-b3f8-4a8d84d7a5e2', x: 164062.27599999998) Skipping 3 values.
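For context, the Neptune backend rejects any point whose x value is not strictly greater than the previous point on the same channel, which is what the 400 above complains about. A minimal sketch of the constraint (assuming the legacy neptune-client experiment API, the same one the code below uses; the project name is a placeholder):

import neptune

neptune.init(project_qualified_name='my_workspace/sandbox')  # placeholder project
exp = neptune.create_experiment()

exp.log_metric('loss', x=0, y=1.0)  # accepted
exp.log_metric('loss', x=1, y=0.8)  # accepted: x increased
exp.log_metric('loss', x=1, y=0.7)  # would trigger the same "X-coordinates must be strictly increasing" error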

To Reproduce

Steps to reproduce the behavior:

Write a LightningModule and use the neptune.ai logger; see my own code below.

Code sample

import numpy as np
import torch
import torch.nn.functional as F
from argparse import ArgumentParser
from torch.utils.data import DataLoader

import pytorch_lightning as pl
from pytorch_lightning.callbacks import EarlyStopping
from pytorch_lightning.loggers import NeptuneLogger

# `dg` (dataset helpers) and `RecurrentModel` are project-local modules, not shown here.

class MemoryTest(pl.LightningModule):
    # Main Testing Unit for Experiments on Recurrent Cells
    def __init__(self, hp):
        super(MemoryTest, self).__init__()
        self.predict_col = hp.predict_col
        self.n_datasamples = hp.n_datasamples
        self.dataset = hp.dataset
        if self.dataset == 'rand':
            self.seq_len = None
        else:
            self.seq_len = hp.seq_len
        self.hparams = hp
        self.learning_rate = hp.learning_rate
        self.training_losses = []
        self.final_loss = None

        self.model = RecurrentModel(1, hp.n_cells, hp.n_layers, celltype=hp.celltype)

    def forward(self, input, input_len):
        return self.model(input, input_len)

    def training_step(self, batch, batch_idx):
        x, y, input_len = batch
        features_y = self.forward(x, input_len)

        loss = F.mse_loss(features_y, y)
        mean_absolute_loss = F.l1_loss(features_y, y)

        self.training_losses.append(mean_absolute_loss.item())

        tensorboard_logs = {'batch/train_loss': loss, 'batch/mean_absolute_loss': mean_absolute_loss}
        return {'loss': loss, 'batch/mean_absolute_loss': mean_absolute_loss, 'log': tensorboard_logs}

    def on_epoch_end(self):
        train_loss_mean = np.mean(self.training_losses)
        self.final_loss = train_loss_mean
        self.logger.experiment.log_metric('epoch/mean_absolute_loss', y=train_loss_mean, x=self.current_epoch)
        self.training_losses = []  # reset for next epoch

    def on_train_end(self):
        self.logger.experiment.log_text('network/final_loss', str(self.final_loss))

    def configure_optimizers(self):
        return torch.optim.SGD(self.parameters(), lr=self.learning_rate)

    @pl.data_loader
    def train_dataloader(self):
        train_dataset = dg.RandomDataset(self.predict_col, self.n_datasamples)
        if self.dataset == 'rand_fix':
            train_dataset = dg.RandomDatasetFix(self.predict_col, self.n_datasamples, self.seq_len)
        if self.dataset == 'correlated':
            train_dataset = dg.CorrelatedDataset(self.predict_col, self.n_datasamples)
        train_loader = DataLoader(dataset=train_dataset, batch_size=1)
        return train_loader

    @staticmethod
    def add_model_specific_args(parent_parser):
        # MODEL specific
        model_parser = ArgumentParser(parents=[parent_parser])
        model_parser.add_argument('--learning_rate', default=1e-2, type=float)
        model_parser.add_argument('--n_layers', default=1, type=int)
        model_parser.add_argument('--n_cells', default=5, type=int)
        model_parser.add_argument('--celltype', default='LSTM', type=str)

        # training specific (for this model)
        model_parser.add_argument('--epochs', default=500, type=int)
        model_parser.add_argument('--patience', default=200, type=int)
        model_parser.add_argument('--min_delta', default=0.01, type=float)

        # data specific
        model_parser.add_argument('--n_datasamples', default=1000, type=int)
        model_parser.add_argument('--seq_len', default=10, type=int)
        model_parser.add_argument('--dataset', default='rand', type=str)
        model_parser.add_argument('--predict_col', default=2, type=int)

        return model_parser

def main(hparams):
    neptune_logger = NeptuneLogger(
        project_name="dunrar/bachelor-thesis",
        params=vars(hparams),
    )
    early_stopping = EarlyStopping('batch/mean_absolute_loss', min_delta=hparams.min_delta, patience=hparams.patience)
    model = MemoryTest(hparams)
    trainer = pl.Trainer(logger=neptune_logger,
                         gpus=hparams.cuda,
                         val_percent_check=0,
                         early_stop_callback=early_stopping,
                         max_epochs=hparams.epochs)
    trainer.fit(model)

Expected behavior

Epochs should not be logged without my explicit instruction, and running this code should not produce that warning.
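One possible workaround until this is resolved upstream (a sketch only, not the official fix: it assumes the pytorch-lightning version in use forwards every metrics dict through the logger's log_metrics hook, and that the automatically logged value is keyed 'epoch'):

from pytorch_lightning.loggers import NeptuneLogger

class FilteredNeptuneLogger(NeptuneLogger):
    # Hypothetical workaround: drop the 'epoch' value that Lightning
    # injects into every metrics batch so it never reaches Neptune.
    def log_metrics(self, metrics, step=None):
        metrics = {k: v for k, v in metrics.items() if k != 'epoch'}
        super().log_metrics(metrics, step=step)

Passing an instance of this subclass to pl.Trainer in place of the plain NeptuneLogger should leave the explicit epoch/mean_absolute_loss channel untouched.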

Issue Analytics

  • State: closed
  • Created: 4 years ago
  • Comments: 45 (27 by maintainers)

Top GitHub Comments

2 reactions
Dunrar commented on Mar 19, 2020

Okay, I’ll compare, make a few updates/downgrades and report back

1 reaction
jakubczakon commented on Mar 20, 2020

Sorry for the trouble @Dunrar 😃 We will work on a better Windows experience.

I will post updates about this here once I have them.

Read more comments on GitHub.
