
Pipe broken exception on write to GCS


I’m importing data into Apache Accumulo, which runs on top of Google Cloud Storage (as an HDFS replacement). I’m using GCS connector 1.8.1-hadoop2, and Accumulo runs on Google Cloud VMs.
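
For reference, here is a minimal sketch of how the connector is typically wired into the Hadoop configuration in a setup like this; the project ID, bucket, and keyfile path below are placeholders, not values from this issue:

import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class GcsConnectorSmokeTest {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Register the GCS connector as the handler for gs:// URIs.
        conf.set("fs.gs.impl", "com.google.cloud.hadoop.fs.gcs.GoogleHadoopFileSystem");
        conf.set("fs.AbstractFileSystem.gs.impl", "com.google.cloud.hadoop.fs.gcs.GoogleHadoopFS");
        conf.set("fs.gs.project.id", "my-project"); // placeholder
        conf.set("google.cloud.auth.service.account.enable", "true");
        conf.set("google.cloud.auth.service.account.json.keyfile", "/path/to/key.json"); // placeholder

        // Quick check that gs:// paths resolve through the connector.
        FileSystem fs = FileSystem.get(URI.create("gs://my-bucket/"), conf);
        System.out.println(fs.exists(new Path("gs://my-bucket/accumulo")));
    }
}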

I see the following exceptions in the logs quite frequently (the first on GoogleHadoopOutputStream.write, the second on GoogleHadoopOutputStream.close):

java.io.IOException: java.io.IOException: Pipe broken
		at com.google.cloud.hadoop.repackaged.gcs.com.google.cloud.hadoop.util.AbstractGoogleAsyncWriteChannel.waitForCompletionAndThrowIfUploadFailed(AbstractGoogleAsyncWriteChannel.java:432)
		at com.google.cloud.hadoop.repackaged.gcs.com.google.cloud.hadoop.util.AbstractGoogleAsyncWriteChannel.write(AbstractGoogleAsyncWriteChannel.java:256)
		at java.nio.channels.Channels.writeFullyImpl(Channels.java:78)
		at java.nio.channels.Channels.writeFully(Channels.java:101)
		at java.nio.channels.Channels.access$000(Channels.java:61)
		at java.nio.channels.Channels$1.write(Channels.java:174)
		at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
		at java.io.BufferedOutputStream.write(BufferedOutputStream.java:95)
		at com.google.cloud.hadoop.fs.gcs.GoogleHadoopOutputStream.write(GoogleHadoopOutputStream.java:96)
		at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.write(FSDataOutputStream.java:50)
		at java.io.DataOutputStream.write(DataOutputStream.java:88)
		at java.io.DataOutputStream.writeByte(DataOutputStream.java:153)
		at org.apache.accumulo.tserver.logger.LogFileKey.write(LogFileKey.java:89)
		at org.apache.accumulo.tserver.log.DfsLogger.write(DfsLogger.java:616)
		at org.apache.accumulo.tserver.log.DfsLogger.logFileData(DfsLogger.java:633)
		at org.apache.accumulo.tserver.log.DfsLogger.logManyTablets(DfsLogger.java:673)
		at org.apache.accumulo.tserver.log.TabletServerLogger$7.write(TabletServerLogger.java:533)
		at org.apache.accumulo.tserver.log.TabletServerLogger.write(TabletServerLogger.java:420)
		at org.apache.accumulo.tserver.log.TabletServerLogger.write(TabletServerLogger.java:371)
		at org.apache.accumulo.tserver.log.TabletServerLogger.logManyTablets(TabletServerLogger.java:523)
		at org.apache.accumulo.tserver.TabletServer$ThriftClientHandler.flush(TabletServer.java:1030)
		at org.apache.accumulo.tserver.TabletServer$ThriftClientHandler.closeUpdate(TabletServer.java:1118)
		at sun.reflect.GeneratedMethodAccessor10.invoke(Unknown Source)
		at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
		at java.lang.reflect.Method.invoke(Method.java:498)
		at org.apache.accumulo.core.trace.wrappers.RpcServerInvocationHandler.invoke(RpcServerInvocationHandler.java:46)
		at org.apache.accumulo.server.rpc.RpcWrapper$1.invoke(RpcWrapper.java:83)
		at com.sun.proxy.$Proxy17.closeUpdate(Unknown Source)
		at org.apache.accumulo.core.tabletserver.thrift.TabletClientService$Processor$closeUpdate.getResult(TabletClientService.java:2501)
		at org.apache.accumulo.core.tabletserver.thrift.TabletClientService$Processor$closeUpdate.getResult(TabletClientService.java:2485)
		at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
		at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
		at org.apache.accumulo.server.rpc.TimedProcessor.process(TimedProcessor.java:65)
		at org.apache.thrift.server.AbstractNonblockingServer$FrameBuffer.invoke(AbstractNonblockingServer.java:518)
		at org.apache.accumulo.server.rpc.CustomNonBlockingServer$CustomFrameBuffer.invoke(CustomNonBlockingServer.java:113)
		at org.apache.thrift.server.Invocation.run(Invocation.java:18)
		at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
		at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
		at org.apache.accumulo.fate.util.LoggingRunnable.run(LoggingRunnable.java:35)
		at java.lang.Thread.run(Thread.java:748)
	Caused by: java.io.IOException: Pipe broken
		at java.io.PipedInputStream.read(PipedInputStream.java:321)
		at java.io.PipedInputStream.read(PipedInputStream.java:377)
		at com.google.cloud.hadoop.repackaged.gcs.com.google.api.client.util.ByteStreams.read(ByteStreams.java:181)
		at com.google.cloud.hadoop.repackaged.gcs.com.google.api.client.googleapis.media.MediaHttpUploader.setContentAndHeadersOnCurrentRequest(MediaHttpUploader.java:629)
		at com.google.cloud.hadoop.repackaged.gcs.com.google.api.client.googleapis.media.MediaHttpUploader.resumableUpload(MediaHttpUploader.java:409)
		at com.google.cloud.hadoop.repackaged.gcs.com.google.api.client.googleapis.media.MediaHttpUploader.upload(MediaHttpUploader.java:336)
		at com.google.cloud.hadoop.repackaged.gcs.com.google.api.client.googleapis.services.AbstractGoogleClientRequest.executeUnparsed(AbstractGoogleClientRequest.java:427)
		at com.google.cloud.hadoop.repackaged.gcs.com.google.api.client.googleapis.services.AbstractGoogleClientRequest.executeUnparsed(AbstractGoogleClientRequest.java:352)
		at com.google.cloud.hadoop.repackaged.gcs.com.google.api.client.googleapis.services.AbstractGoogleClientRequest.execute(AbstractGoogleClientRequest.java:469)
		at com.google.cloud.hadoop.repackaged.gcs.com.google.cloud.hadoop.util.AbstractGoogleAsyncWriteChannel$UploadOperation.call(AbstractGoogleAsyncWriteChannel.java:358)
		at java.util.concurrent.FutureTask.run(FutureTask.java:266)
		at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
		at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
		... 1 more
java.io.IOException: java.io.IOException: Pipe broken
		at com.google.cloud.hadoop.repackaged.gcs.com.google.cloud.hadoop.util.AbstractGoogleAsyncWriteChannel.waitForCompletionAndThrowIfUploadFailed(AbstractGoogleAsyncWriteChannel.java:432)
		at com.google.cloud.hadoop.repackaged.gcs.com.google.cloud.hadoop.util.AbstractGoogleAsyncWriteChannel.write(AbstractGoogleAsyncWriteChannel.java:256)
		at java.nio.channels.Channels.writeFullyImpl(Channels.java:78)
		at java.nio.channels.Channels.writeFully(Channels.java:101)
		at java.nio.channels.Channels.access$000(Channels.java:61)
		at java.nio.channels.Channels$1.write(Channels.java:174)
		at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
		at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:140)
		at java.io.FilterOutputStream.close(FilterOutputStream.java:158)
		at com.google.cloud.hadoop.fs.gcs.GoogleHadoopOutputStream.close(GoogleHadoopOutputStream.java:126)
		at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.close(FSDataOutputStream.java:72)
		at org.apache.hadoop.fs.FSDataOutputStream.close(FSDataOutputStream.java:106)
		at org.apache.accumulo.tserver.log.DfsLogger.close(DfsLogger.java:592)
		at org.apache.accumulo.tserver.log.TabletServerLogger.close(TabletServerLogger.java:338)
		at org.apache.accumulo.tserver.log.TabletServerLogger.access$1000(TabletServerLogger.java:70)
		at org.apache.accumulo.tserver.log.TabletServerLogger$3.withWriteLock(TabletServerLogger.java:455)
		at org.apache.accumulo.tserver.log.TabletServerLogger.testLockAndRun(TabletServerLogger.java:137)
		at org.apache.accumulo.tserver.log.TabletServerLogger.write(TabletServerLogger.java:446)
		at org.apache.accumulo.tserver.log.TabletServerLogger.write(TabletServerLogger.java:371)
		at org.apache.accumulo.tserver.log.TabletServerLogger.logManyTablets(TabletServerLogger.java:523)
		at org.apache.accumulo.tserver.TabletServer$ThriftClientHandler.flush(TabletServer.java:1030)
		at org.apache.accumulo.tserver.TabletServer$ThriftClientHandler.closeUpdate(TabletServer.java:1118)
		at sun.reflect.GeneratedMethodAccessor14.invoke(Unknown Source)
		at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
		at java.lang.reflect.Method.invoke(Method.java:498)
		at org.apache.accumulo.core.trace.wrappers.RpcServerInvocationHandler.invoke(RpcServerInvocationHandler.java:46)
		at org.apache.accumulo.server.rpc.RpcWrapper$1.invoke(RpcWrapper.java:83)
		at com.sun.proxy.$Proxy17.closeUpdate(Unknown Source)
		at org.apache.accumulo.core.tabletserver.thrift.TabletClientService$Processor$closeUpdate.getResult(TabletClientService.java:2501)
		at org.apache.accumulo.core.tabletserver.thrift.TabletClientService$Processor$closeUpdate.getResult(TabletClientService.java:2485)
		at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
		at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
		at org.apache.accumulo.server.rpc.TimedProcessor.process(TimedProcessor.java:65)
		at org.apache.thrift.server.AbstractNonblockingServer$FrameBuffer.invoke(AbstractNonblockingServer.java:518)
		at org.apache.accumulo.server.rpc.CustomNonBlockingServer$CustomFrameBuffer.invoke(CustomNonBlockingServer.java:113)
		at org.apache.thrift.server.Invocation.run(Invocation.java:18)
		at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
		at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
		at org.apache.accumulo.fate.util.LoggingRunnable.run(LoggingRunnable.java:35)
		at java.lang.Thread.run(Thread.java:748)
		Suppressed: java.io.IOException: java.io.IOException: Pipe broken
			at com.google.cloud.hadoop.repackaged.gcs.com.google.cloud.hadoop.util.AbstractGoogleAsyncWriteChannel.waitForCompletionAndThrowIfUploadFailed(AbstractGoogleAsyncWriteChannel.java:432)
			at com.google.cloud.hadoop.repackaged.gcs.com.google.cloud.hadoop.util.AbstractGoogleAsyncWriteChannel.close(AbstractGoogleAsyncWriteChannel.java:287)
			at java.nio.channels.Channels$1.close(Channels.java:178)
			at java.io.FilterOutputStream.close(FilterOutputStream.java:159)
			... 31 more
		Caused by: java.io.IOException: Pipe broken
			at java.io.PipedInputStream.read(PipedInputStream.java:321)
			at java.io.PipedInputStream.read(PipedInputStream.java:377)
			at com.google.cloud.hadoop.repackaged.gcs.com.google.api.client.util.ByteStreams.read(ByteStreams.java:181)
			at com.google.cloud.hadoop.repackaged.gcs.com.google.api.client.googleapis.media.MediaHttpUploader.setContentAndHeadersOnCurrentRequest(MediaHttpUploader.java:629)
			at com.google.cloud.hadoop.repackaged.gcs.com.google.api.client.googleapis.media.MediaHttpUploader.resumableUpload(MediaHttpUploader.java:409)
			at com.google.cloud.hadoop.repackaged.gcs.com.google.api.client.googleapis.media.MediaHttpUploader.upload(MediaHttpUploader.java:336)
			at com.google.cloud.hadoop.repackaged.gcs.com.google.api.client.googleapis.services.AbstractGoogleClientRequest.executeUnparsed(AbstractGoogleClientRequest.java:427)
			at com.google.cloud.hadoop.repackaged.gcs.com.google.api.client.googleapis.services.AbstractGoogleClientRequest.executeUnparsed(AbstractGoogleClientRequest.java:352)
			at com.google.cloud.hadoop.repackaged.gcs.com.google.api.client.googleapis.services.AbstractGoogleClientRequest.execute(AbstractGoogleClientRequest.java:469)
			at com.google.cloud.hadoop.repackaged.gcs.com.google.cloud.hadoop.util.AbstractGoogleAsyncWriteChannel$UploadOperation.call(AbstractGoogleAsyncWriteChannel.java:358)
			at java.util.concurrent.FutureTask.run(FutureTask.java:266)
			at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
			at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
			... 1 more
		[CIRCULAR REFERENCE:java.io.IOException: Pipe broken]

Accumulo logs this exception at the ERROR level.

What could be the root cause? How can I get more details about the exception (debug logs, etc.)? Thank you!
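
On the debug-logging part of the question: assuming Accumulo’s stock log4j 1.x logging, one way to get more detail is to raise the log level for the connector’s packages before reproducing the failure, either in log4j.properties (log4j.logger.com.google.cloud.hadoop=DEBUG) or programmatically:

import org.apache.log4j.Level;
import org.apache.log4j.Logger;

public class EnableGcsConnectorDebug {
    public static void main(String[] args) {
        // Covers both the public and the repackaged connector packages
        // that appear in the stack traces above.
        Logger.getLogger("com.google.cloud.hadoop").setLevel(Level.DEBUG);
        Logger.getLogger("com.google.cloud.hadoop.repackaged.gcs").setLevel(Level.DEBUG);
    }
}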

Issue Analytics

  • State: closed
  • Created: 5 years ago
  • Comments: 10 (5 by maintainers)

Top GitHub Comments

1 reaction
ctubbsii commented, Jun 22, 2018

@medb The Apache Accumulo issue you referenced did not conclude that GCS wouldn’t, or couldn’t, be supported. That issue was closed because the user’s question, asking for an explanation of the behavior they were seeing, had been answered.

The supported solution is to put a LogCloser on the user’s classpath for Accumulo that knows how to close logs on GCS. I don’t know enough about GCS to say for sure, but it may be sufficient to trivially fork Accumulo’s built-in HadoopLogCloser and do nothing instead of throwing the IllegalStateException when the FileSystem is GCS (essentially, make no attempt at lease recovery, just like in the local file system case). A rough sketch of that idea follows.
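
This is only a sketch of the fork described above, assuming Accumulo 1.x’s LogCloser SPI (org.apache.accumulo.server.master.recovery.LogCloser and HadoopLogCloser); the exact interface, method signature, and VolumeManager calls should be verified against the Accumulo version in use.

import java.io.IOException;

import org.apache.accumulo.core.conf.AccumuloConfiguration;
import org.apache.accumulo.server.fs.VolumeManager;
import org.apache.accumulo.server.master.recovery.HadoopLogCloser;
import org.apache.accumulo.server.master.recovery.LogCloser;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Hypothetical fork: behave like HadoopLogCloser, but treat GCS the way
// the built-in closer treats the local file system (no lease recovery).
public class GcsAwareLogCloser implements LogCloser {

    @Override
    public long close(AccumuloConfiguration conf, VolumeManager fs, Path path) throws IOException {
        FileSystem ns = fs.getVolumeByPath(path).getFileSystem();
        if ("gs".equals(ns.getUri().getScheme())) {
            // GCS objects have no leases to recover; returning 0 signals
            // "closed, no retry needed", as in the LocalFileSystem branch.
            return 0;
        }
        // Everything else (e.g. real HDFS) keeps the built-in behavior.
        return new HadoopLogCloser().close(conf, fs, path);
    }
}

If that pans out, it would be wired in through Accumulo’s site configuration, e.g. by setting master.walog.closer.implementation (the Accumulo 1.x property name; verify against your version) to the fork’s class name.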

I do not think the issue has anything to do with Accumulo’s write pattern, as suggested here; at least, not if it’s the same issue as the one you referenced. It’s likely a simple matter of implementing an appropriate LogCloser.

1 reaction
medb commented, Jun 22, 2018

The problem is that GCS, Azure Blob Storage, and AWS S3 are not file systems but object stores, while Apache Accumulo was written with HDFS capabilities in mind; object stores cannot fully support those capabilities.

The GCS connector tries to mimic HDFS semantics, but because of object store limitations it cannot do so completely.

We need to look into the Accumulo use case to determine whether it is possible to make it work with GCS, but because Accumulo is not currently supported by the GCS connector, this is not an immediate action item for us.

