Deadlock if a connection times out while streamDataMutex lock is held by connection thread #83

@haydenmc

In FtlServer::onNewControlConnection(...), a timeout thread is started that waits a number of seconds and then stops the control connection if it has not yet authenticated.

When the timeout lapses, the connection is stopped while a streamDataMutex lock is held.

std::unique_lock streamDataLock(streamDataMutex);
if (pendingControlConnections.count(ingestControlConnectionPtr) > 0)
{
    spdlog::info("{} didn't authenticate within {}ms, closing",
        addrString, CONNECTION_AUTH_TIMEOUT_MS);
    pendingControlConnections.at(ingestControlConnectionPtr)->Stop();
    pendingControlConnections.erase(ingestControlConnectionPtr);
}

Under some very specific timing circumstances, a deadlock can occur if the timeout elapses around the same time the connection thread attempts to acquire a lock to process stream state changes.

The deadlock looks something like this:

  1. Client connects and the timeout thread starts.
  2. FtlServer attempts to request a new Stream ID.
  3. The timeout expires; the timeout thread acquires streamDataMutex and holds it while waiting for the connection thread to stop.
  4. The connection thread is stuck trying to acquire streamDataMutex to indicate that the stream has started, so it never stops.
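
The sequence above can be mimicked in a small standalone sketch. This is not the FtlServer code: the function name `connectionThreadGotLock` and the 100ms limit are made up for illustration, and a `std::timed_mutex` is swapped in for the real mutex so that the blocked thread gives up instead of hanging forever. The join stands in for whatever waiting `Stop()` does internally.

```cpp
#include <chrono>
#include <mutex>
#include <thread>

// Hypothetical reproduction of steps 3 and 4. Returns true if the
// "connection thread" obtained the lock, false if it timed out because the
// "timeout thread" kept holding the lock while waiting for it to finish.
bool connectionThreadGotLock(bool stopWhileLocked)
{
    std::timed_mutex streamDataMutex;
    bool gotLock = false;

    // Step 3: the timeout path takes the lock first.
    std::unique_lock<std::timed_mutex> timeoutLock(streamDataMutex);

    // Step 4: the connection thread needs the same lock to report progress.
    // The real code would block indefinitely; here it gives up after 100ms.
    std::thread connectionThread([&]() {
        if (streamDataMutex.try_lock_for(std::chrono::milliseconds(100)))
        {
            gotLock = true;
            streamDataMutex.unlock();
        }
    });

    if (!stopWhileLocked)
    {
        // Releasing the lock before waiting lets the other thread proceed.
        timeoutLock.unlock();
    }

    // Stand-in for Stop(): wait for the connection thread to exit.
    connectionThread.join();
    return gotLock;
}
```

Calling `connectionThreadGotLock(true)` models the reported behavior: the connection thread never gets the lock while the caller is waiting on it. Calling it with `false` releases the lock before the wait, and the connection thread proceeds.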

I was able to reproduce this while debugging a REST server implementation by holding up the response to ServiceConnection::StartStream(...), which resulted in lock contention between

lock.lock();

and

pendingControlConnections.at(ingestControlConnectionPtr)->Stop();
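
One possible way to avoid this contention, sketched here under assumptions rather than taken from the project: remove the connection from the pending map while the lock is held, release the lock, and only then call `Stop()`. `FakeConnection`, `closeIfPending`, and the `int` key are hypothetical stand-ins; the sketch also assumes `Stop()` blocks until the connection thread exits, as the description above suggests.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <map>
#include <memory>
#include <mutex>
#include <thread>
#include <utility>

// Hypothetical stand-in for the real control connection class.
class FakeConnection
{
public:
    explicit FakeConnection(std::mutex& serverMutex) : serverMutex(serverMutex) {}
    ~FakeConnection() { Stop(); }

    // Starts a thread that, like the connection thread in this issue,
    // eventually needs the server's mutex to report a state change.
    void Start()
    {
        worker = std::thread([this]() {
            std::this_thread::sleep_for(std::chrono::milliseconds(20));
            std::lock_guard<std::mutex> lock(serverMutex);
            reportedState = true;
        });
    }

    // Blocks until the connection thread has exited (assumed behavior).
    void Stop()
    {
        if (worker.joinable())
        {
            worker.join();
        }
    }

private:
    std::mutex& serverMutex;
    std::thread worker;
    std::atomic<bool> reportedState{false};
};

// Timeout handler sketch: the lock protects only the map, never the Stop() call.
bool closeIfPending(std::mutex& streamDataMutex,
    std::map<int, std::unique_ptr<FakeConnection>>& pending, int key)
{
    std::unique_ptr<FakeConnection> expired;
    {
        std::unique_lock<std::mutex> lock(streamDataMutex);
        auto it = pending.find(key);
        if (it == pending.end())
        {
            return false; // already authenticated or removed
        }
        expired = std::move(it->second);
        pending.erase(it);
    } // lock released here, before any waiting happens

    assert(expired != nullptr);
    expired->Stop(); // the connection thread can still take the mutex
    return true;
}
```

Because the connection is already gone from the map when `Stop()` runs, other code paths that look it up under the lock would need to tolerate its absence; whether that holds for FtlServer is not something this sketch verifies.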
