Implement writable streams for natively streaming inserts
Is your feature request related to a problem? Please describe.
It would be really useful to be able to natively stream records directly into a table (via /insertAll). This would make it much easier to transfer significant amounts of data on low-resource platforms like cloud functions.
Describe the solution you’d like
A Table method similar to createWriteStream() in the @google-cloud/storage package. Being able to just call mySourceStream.pipe(myTable.createWriteStream()) would be super helpful.
Describe alternatives you’ve considered
I’ve managed to get roughly the functionality I’m looking for by calling insert() in arbitrarily sized batches, i.e.
const { Writable } = require('stream')

class TableWriteableStream extends Writable {
  constructor (destinationTable) {
    super({ objectMode: true })
    this.destinationTable = destinationTable
    this.records = []
  }

  _write (chunk, encoding, callback) {
    this.records.push(chunk)
    // Every 500 records, call insert()
    if (this.records.length >= 500) {
      const recordsToInsert = this.records
      this.records = []
      this.destinationTable
        .insert(recordsToInsert)
        // Pass failures to callback so the stream emits 'error' instead of stalling
        .then(() => { callback() }, callback)
    } else {
      callback()
    }
  }

  _final (callback) {
    // Insert any remaining records ...
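The snippet above is cut off inside _final. A minimal sketch of how the complete workaround might look follows. It is not the original author's code: the name BatchingTableStream, the batchSize parameter, and the _insertBuffered helper are hypothetical, and the only assumption about destinationTable is that insert(rows) returns a Promise.

```javascript
const { Writable } = require('stream')

// Hypothetical sketch: buffers rows and inserts them in fixed-size batches.
// Assumes destinationTable.insert(rows) returns a Promise.
class BatchingTableStream extends Writable {
  constructor (destinationTable, batchSize = 500) {
    super({ objectMode: true })
    this.destinationTable = destinationTable
    this.batchSize = batchSize
    this.records = []
  }

  // Insert whatever is buffered, then report success or failure via callback.
  _insertBuffered (callback) {
    if (this.records.length === 0) {
      callback()
      return
    }
    const recordsToInsert = this.records
    this.records = []
    this.destinationTable
      .insert(recordsToInsert)
      .then(() => callback(), callback)
  }

  _write (chunk, encoding, callback) {
    this.records.push(chunk)
    if (this.records.length >= this.batchSize) {
      // Holding the callback until the insert settles applies backpressure.
      this._insertBuffered(callback)
    } else {
      callback()
    }
  }

  // Called once the writable side has ended: insert the last partial batch.
  _final (callback) {
    this._insertBuffered(callback)
  }
}
```

With this in place, mySourceStream.pipe(new BatchingTableStream(myTable)) behaves roughly like the requested API. Since pipe() does not forward source errors to the destination, stream.pipeline() is probably the safer way to wire the two together.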
Additional context
This is somewhat related to #75, though that seems focused on compression (which would be a great option for this feature, just like the ‘gzip’ option for the @google-cloud/storage function).
Issue Analytics
- Created 4 years ago
- Reactions:5
- Comments:6 (2 by maintainers)
Possibly related: it would be nice for this lib to implement https://cloud.google.com/bigquery/docs/write-api
I like this proposal. Having the client do some automatic batching by time / number of records comes with a lot of complexity, though. Also, the streaming API returns errors in the response, so there’d have to be some logic to check for that as well as true HTTP error responses.
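To illustrate the error-handling point: a successful insertAll response can still carry an insertErrors array, where each entry has the index of the offending row and a list of errors. A rough sketch of the extra check might look like this. collectRowErrors and insertBatch are hypothetical names, and send stands in for whatever transport function performs the request, resolving with the parsed response body and rejecting on a true HTTP error.

```javascript
// Map each entry of insertErrors back to the row it refers to.
function collectRowErrors (rows, responseBody) {
  const insertErrors = (responseBody && responseBody.insertErrors) || []
  return insertErrors.map(({ index, errors }) => ({
    row: rows[index],
    reasons: (errors || []).map(e => e.reason)
  }))
}

// `send` is a hypothetical transport function (an assumption, not a library
// API): it resolves with the parsed response body, rejects on HTTP errors.
async function insertBatch (send, rows) {
  const body = await send(rows) // HTTP-level failures reject here
  const rowErrors = collectRowErrors(rows, body)
  if (rowErrors.length > 0) {
    // Row-level failures arrive inside an otherwise successful response.
    const err = new Error(`${rowErrors.length} row(s) were rejected`)
    err.rowErrors = rowErrors
    throw err
  }
  return rows.length
}
```

A write stream built on top of this would have to decide what to do with err.rowErrors (retry, drop, or emit 'error'), which is part of the complexity mentioned above.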
Since it’s a complex feature, we should write up a full design doc for this. Ideally we’d share similar semantics with the pub/sub API.