❓ Questions & Help

Hi

I’m currently trying to combine two very similar datasets. They are stored at two distinct locations, and my goal is to load both into the same dataset, using one as the training data and the other as the test and validation data. With an InMemoryDataset, I thought about using masks for that. But as far as I can tell from the provided datasets, masks are used to select certain parts of a graph for training and other parts for testing (am I right, or am I misunderstanding them?).

Is there a way to declare certain graphs as part of the training data and others as part of the test data?

Sorry if it’s a simple question, but I haven’t worked with masks before.

Issue Analytics

  • State: closed
  • Created 4 years ago
  • Comments: 5 (2 by maintainers)

Top GitHub Comments

1 reaction
rusty1s commented, Nov 21, 2019

Just split them into a train and a test set and load them separately, e.g., like here.
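As a rough illustration of this suggestion, here is a minimal, library-agnostic sketch. It loads the graphs from each location on their own, keeps the first set for training, and cuts the second into validation and test parts by index rather than with masks. The `load_graphs` and `split_val_test` helpers, the location names, and the dict-based "graphs" are all hypothetical stand-ins, not part of PyTorch Geometric. In practice each location would probably be its own dataset object with its own `root` directory.

```python
import random


def load_graphs(location):
    """Hypothetical loader standing in for reading graphs from one location."""
    # Each "graph" is only a dict here; a real loader would parse files.
    return [{"source": location, "id": i} for i in range(10)]


def split_val_test(graphs, val_fraction=0.5, seed=0):
    """Shuffle graph indices and cut them into validation and test parts."""
    indices = list(range(len(graphs)))
    random.Random(seed).shuffle(indices)  # seeded, so the split is repeatable
    n_val = int(len(indices) * val_fraction)
    val = [graphs[i] for i in indices[:n_val]]
    test = [graphs[i] for i in indices[n_val:]]
    return val, test


# One location supplies the training graphs, the other validation and test.
train_graphs = load_graphs("location_a")
val_graphs, test_graphs = split_val_test(load_graphs("location_b"))
```

The split happens at the level of whole graphs, so no per-node masks are needed.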

0 reactions
SaschaStenger commented, Nov 27, 2019

Yes, for whatever reason it crashes when I try to load it this way. But I made a workaround: I initialize the encoder that I’m using for both datasets beforehand and then pass it to both datasets as an argument when creating them, so that both use the same encoder instance.
It’s possible that I made a mistake somewhere when coding the above section, but I can’t find it, so I’ll just move on with this solution.
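The comment does not show the encoder or the dataset classes, so the following is only a sketch of the general pattern it describes: create one encoder up front and hand the same instance to both datasets. `SharedEncoder`, `GraphDataset`, and the string labels are hypothetical names chosen for illustration. The encoder here is assumed to map labels to integer ids, which may differ from what the original code did.

```python
class SharedEncoder:
    """Hypothetical encoder that maps labels to integer ids."""

    def __init__(self):
        self.mapping = {}

    def encode(self, label):
        # Assign the next free id the first time a label is seen.
        if label not in self.mapping:
            self.mapping[label] = len(self.mapping)
        return self.mapping[label]


class GraphDataset:
    """Hypothetical dataset that encodes its raw labels on construction."""

    def __init__(self, raw_labels, encoder):
        self.encoder = encoder
        self.labels = [encoder.encode(label) for label in raw_labels]


# Create the encoder once and pass the same instance to both datasets.
encoder = SharedEncoder()
train_set = GraphDataset(["C", "N", "O"], encoder)
test_set = GraphDataset(["O", "C", "S"], encoder)
```

Because both datasets share one mapping, a label such as `"O"` receives the same id in the training and the test data, which would not be guaranteed if each dataset built its own encoder.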
