toCSV after joinOuter running very slow
Ashley, thanks for your awesome work on everything. I’m new to JavaScript and I’m not sure whether my issue comes from something I’m doing or from a problem in the library.
I’m having an issue with writing to CSV after performing an outer join. I’ve been able to verify that my data frames are being created. Displaying the first rows with head(), as in the code below, is a little slow, but writing to CSV takes minutes to complete. I originally thought it wasn’t writing, but the file does appear after some time. Additionally, the script seems to hang afterwards and never returns to the command line.
`const exceptDF = cleanRev.joinOuter(cleanSF,
cleanRev => cleanRev.sfKey,
cleanSF => cleanSF.sfKey,
(cleanRev, cleanSF) => {
return {
index: cleanRev ? cleanRev.sfKey : cleanSF.sfKey,
swanKey: cleanRev ? cleanRev.sfKey : undefined,
sfKey: cleanSF ? cleanSF.sfKey : undefined
};
}
);
console.log(exceptDF.head(3).toString()); //this works, but it's slow
exceptDF.asCSV().writeFile('exceptDF.csv'); //this writes the file, but it takes several minutes and I don't return to the command line in the terminal`
For reference, I’m loading two CSV files into separate data frames, doing some manipulation, and then performing the outer join. Each file has around 4,000 rows and 20 columns.
Thanks for any input.
Top GitHub Comments
Fantastic! Thanks again for logging the feedback.
Please make sure you star this repo!
Ashley,
Sorry for the delay here and thanks very much for your help. I’m new to JavaScript, but the calls to .bake() make a phenomenal difference in speed. It seems like that was the main reason for the slowdown. I’ll need to read through the documentation a bit more to get a better understanding of when to use them.
I see a similar time to what you saw. Thank you for all the help!
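For anyone wondering why .bake() can matter so much: the sketch below is a toy model in plain JavaScript, not Data-Forge's actual implementation. It assumes the slowdown came from a lazily evaluated pipeline being re-run every time its result is read. `LazySeries`, `expensive`, and the counters are hypothetical names made up for this illustration.

```javascript
// Toy model of lazy evaluation. This is NOT Data-Forge's real code;
// LazySeries and everything below are hypothetical names for illustration.
class LazySeries {
    // `source` is a function that returns a fresh iterable on every call.
    constructor(source) {
        this.source = source;
    }

    // Lazily transform each value; nothing runs until the series is read.
    select(fn) {
        const source = this.source;
        return new LazySeries(function* () {
            for (const value of source()) {
                yield fn(value);
            }
        });
    }

    // Run the whole pipeline and collect the values into an array.
    toArray() {
        return Array.from(this.source());
    }

    // Run the pipeline once and keep the result in memory, so later
    // reads reuse the cached values instead of recomputing them.
    bake() {
        const values = this.toArray();
        return new LazySeries(() => values);
    }
}

// Count how often the (pretend) expensive step runs.
let calls = 0;
const expensive = (x) => {
    calls += 1;
    return x * 2;
};

// Without bake(): every read re-runs the transformation.
const lazy = new LazySeries(() => [1, 2, 3]).select(expensive);
lazy.toArray();
lazy.toArray();
const lazyCalls = calls;

// With bake(): the transformation runs once, then reads hit the cache.
calls = 0;
const baked = new LazySeries(() => [1, 2, 3]).select(expensive).bake();
baked.toArray();
baked.toArray();
const bakedCalls = calls;

console.log({ lazyCalls, bakedCalls });
```

In the toy model the unbaked series repeats the work on every read, while the baked one does it once. Applied to the code in the question, that would mean calling .bake() on the result of joinOuter before passing it to head() and asCSV(), though where exactly the calls belong depends on the rest of the pipeline.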