Current code uses openWriteChannel():
protected var selectorManager = SelectorManager(Dispatchers.IO)

abstract suspend fun onConnectSocket(timeout: Long): ReadWriteSocket

override suspend fun connect() {
    val socket = onConnectSocket(timeout)
    input = socket.openReadChannel()
    output = socket.openWriteChannel(autoFlush = false)
    address = java.net.InetSocketAddress(host, port).address
    this.socket = socket
}
Problem
When the remote socket is closed while data is still being written, the actual write failure may be thrown inside Ktor's internal cio-to-nio-writer coroutine.
This failure is not thrown from the caller's writeFully() / flush() call stack, so caller-side error handling cannot reliably catch it:
runCatching {
    output.writeFully(data)
    output.flush()
}.onFailure {
    // This may not receive the actual cio-to-nio-writer failure.
}
In the observed case, the stack trace only contains Ktor internal classes, which indicates that the failure happens inside Ktor's internal writer coroutine.
Why caller-side catch does not work
openWriteChannel() only returns a ByteWriteChannel. It is not a suspend function that performs the actual socket write in the caller's coroutine chain.
The actual flow is closer to this:
caller try-catch / runCatching
↓
output.writeFully(...) / output.flush(...)
↓
ByteWriteChannel
↓
Ktor internal cio-to-nio-writer coroutine
↓
actual socket write fails
Once the failure happens inside cio-to-nio-writer, it is no longer on the caller's call stack. Therefore, wrapping writeFully() / flush() with try-catch or runCatching cannot reliably catch it.
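To make the flow above concrete, here is a minimal sketch that models it with plain kotlinx.coroutines instead of Ktor. Everything in it is hypothetical and for illustration only: `runDetachedWriterModel`, `ModelResult`, the `Channel` standing in for the ByteWriteChannel, and the "model-writer" coroutine standing in for cio-to-nio-writer. The CoroutineExceptionHandler is installed only so the demo can capture the failure; it is not something the real socket lets the caller do.

```kotlin
import kotlinx.coroutines.*
import kotlinx.coroutines.channels.Channel
import java.io.IOException

// Hypothetical model, not Ktor code: a buffered pipe drained by a coroutine
// that lives in its own scope, like the writer coroutine described above.
data class ModelResult(val callerSawFailure: Boolean, val writerFailure: Throwable)

fun runDetachedWriterModel(): ModelResult = runBlocking {
    val writerFailure = CompletableDeferred<Throwable>()
    // Independent scope; the handler exists only so this demo can capture the failure.
    val writerScope = CoroutineScope(
        Job() + Dispatchers.Default +
            CoroutineExceptionHandler { _, e -> writerFailure.complete(e) }
    )
    val pipe = Channel<ByteArray>(Channel.UNLIMITED)

    // Stand-in for the internal writer: the simulated socket write fails here.
    writerScope.launch(CoroutineName("model-writer")) {
        for (chunk in pipe) {
            throw IOException("Broken pipe while writing ${chunk.size} bytes")
        }
    }

    // Caller side: enqueueing the bytes succeeds, so runCatching sees no failure.
    val callerResult = runCatching { pipe.send(ByteArray(16)) }

    ModelResult(callerResult.isFailure, writerFailure.await())
}

fun main() {
    val result = runDetachedWriterModel()
    println("caller saw failure: ${result.callerSawFailure}")
    println("writer failed with: ${result.writerFailure.message}")
}
```

In this model the caller's runCatching reports success, while the IOException surfaces only in the writer's own scope. Real Ktor channels may additionally close the channel with the failure as cause, so a later write can still throw; the sketch only illustrates why the original failure is not on the caller's stack.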
Why external CoroutineContext cannot handle it
From Ktor's implementation, the socket itself is a CoroutineScope.
The simplified inheritance chain is:
SocketImpl(...)
    : NIOSocketImpl(...)

NIOSocketImpl(...)
    : ReadWriteSocket,
      SocketBase(EmptyCoroutineContext)

SocketBase(parent: CoroutineContext)
    : CoroutineScope {
    override val socketContext: CompletableJob = Job(parent[Job])
    override val coroutineContext: CoroutineContext
        get() = socketContext
}
Since NIOSocketImpl creates SocketBase with EmptyCoroutineContext, the socket owns its internal socketContext and does not inherit the caller's CoroutineContext.
As a result, the caller has no public entry point to inject its own CoroutineExceptionHandler into Ktor's internal cio-to-nio-writer coroutine.
So the issue is:
The caller cannot control the CoroutineContext of cio-to-nio-writer, so it cannot handle or consume exceptions thrown inside that internal coroutine with an external CoroutineExceptionHandler.
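The effect of constructing the scope from EmptyCoroutineContext can be reproduced without Ktor. The sketch below is an assumption-laden model: `ModelSocket` is a hypothetical class that copies only the SocketBase shape quoted above, and `runIsolatedScopeModel` and `HandlerReport` are illustrative names. It installs a CoroutineExceptionHandler in the caller's context and temporarily replaces the JVM default uncaught exception handler, to show which of the two receives the failure.

```kotlin
import kotlinx.coroutines.*
import java.io.IOException
import java.util.concurrent.atomic.AtomicReference
import kotlin.coroutines.CoroutineContext
import kotlin.coroutines.EmptyCoroutineContext

// Hypothetical stand-in that mirrors the SocketBase shape quoted above.
class ModelSocket(parent: CoroutineContext) : CoroutineScope {
    val socketContext: CompletableJob = Job(parent[Job])
    override val coroutineContext: CoroutineContext
        get() = socketContext
}

data class HandlerReport(val callerHandlerSaw: Throwable?, val threadHandlerSaw: Throwable?)

fun runIsolatedScopeModel(): HandlerReport {
    val callerSaw = AtomicReference<Throwable?>()
    val threadSaw = AtomicReference<Throwable?>()
    val previous = Thread.getDefaultUncaughtExceptionHandler()
    // Capture whatever ends up on the JVM's uncaught exception path.
    Thread.setDefaultUncaughtExceptionHandler { _, e -> threadSaw.set(e) }
    try {
        val callerHandler = CoroutineExceptionHandler { _, e -> callerSaw.set(e) }
        runBlocking(callerHandler) {
            // Same construction as in the chain above: no caller context is passed in.
            val socket = ModelSocket(EmptyCoroutineContext)
            // Stand-in for the internal writer coroutine failing on a socket write.
            socket.launch(Dispatchers.Default) { throw IOException("Broken pipe") }.join()
        }
    } finally {
        Thread.setDefaultUncaughtExceptionHandler(previous)
    }
    return HandlerReport(callerSaw.get(), threadSaw.get())
}

fun main() {
    val report = runIsolatedScopeModel()
    println("caller handler saw: ${report.callerHandlerSaw}")
    println("thread handler saw: ${report.threadHandlerSaw}")
}
```

On the JVM, the caller's handler should stay untouched while the thread-level handler receives the IOException, because the model scope's context contains only its own Job and nothing from the caller.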
Why invokeOnCompletion is not enough
Using attachForWriting() instead of openWriteChannel() allows the caller to keep the writer job:
val writeChannel = ByteChannel(autoFlush = false)
val writerJob = socket.attachForWriting(writeChannel)

writerJob.invokeOnCompletion { cause ->
    if (cause != null && cause !is CancellationException) {
        // The writer failure can be observed here.
    }
}

output = writeChannel
However, invokeOnCompletion only observes the job's completion cause. It does not catch or consume the exception, so the failure still travels down the uncaught coroutine exception path.
It is useful for logging and cleanup, but it is not equivalent to try-catch.
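The difference between observing and consuming can be sketched with a plain kotlinx.coroutines job. This is a model under stated assumptions, not the Ktor writer itself: the launched coroutine stands in for the job returned by attachForWriting(), and `runCompletionObserverModel` and `Observation` are hypothetical names. The JVM default uncaught exception handler is swapped temporarily so the demo can record whether the failure still reaches it.

```kotlin
import kotlinx.coroutines.*
import java.io.IOException
import java.util.concurrent.atomic.AtomicReference

data class Observation(val completionCause: Throwable?, val uncaught: Throwable?)

fun runCompletionObserverModel(): Observation {
    val uncaught = AtomicReference<Throwable?>()
    val previous = Thread.getDefaultUncaughtExceptionHandler()
    // Record anything that still reaches the uncaught exception path.
    Thread.setDefaultUncaughtExceptionHandler { _, e -> uncaught.set(e) }
    try {
        val cause = runBlocking {
            val observed = CompletableDeferred<Throwable?>()
            // Stand-in for the writer job: a coroutine in a scope the caller does not own.
            val writerJob = CoroutineScope(Job() + Dispatchers.Default).launch {
                throw IOException("Broken pipe")
            }
            // Observation only: the callback is told why the job completed.
            writerJob.invokeOnCompletion { c -> observed.complete(c) }
            observed.await()
        }
        return Observation(cause, uncaught.get())
    } finally {
        Thread.setDefaultUncaughtExceptionHandler(previous)
    }
}

fun main() {
    val observation = runCompletionObserverModel()
    println("invokeOnCompletion saw: ${observation.completionCause}")
    println("uncaught handler saw:   ${observation.uncaught}")
}
```

In this model both fields should hold the IOException: the completion callback is notified, and the same failure is still reported as uncaught, which is the behavior described above.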
Conclusion
The key point is:
The exception is thrown inside Ktor's internal cio-to-nio-writer coroutine, not from the caller's writeFully() / flush() call stack.
Therefore:
- caller-side try-catch cannot catch it;
- caller-side runCatching cannot catch it;
- openWriteChannel() is not a suspend function that can propagate this failure through the caller's coroutine chain;
- the caller cannot inject a CoroutineExceptionHandler into cio-to-nio-writer;
- invokeOnCompletion can observe the writer job failure, but cannot consume it.
Correction note
This issue has been updated.
The original version suggested handling this by installing a CoroutineExceptionHandler on SelectorManager. After checking Ktor's socket implementation, that is not considered a valid fix for this failure path.
The actual problem is that the caller has no public API to inject its own CoroutineExceptionHandler into Ktor's internal cio-to-nio-writer coroutine context.