File 0745-Fix-possible-hangings-in-ct_telnet_client.patch of Package erlang

From 204ff73d7d063d13a9b801f9ee3565335ad9df96 Mon Sep 17 00:00:00 2001
From: Anders Svensson <anders@erlang.org>
Date: Tue, 13 Apr 2021 10:20:54 +0200
Subject: [PATCH 4/4] Fix possible hangings in ct_telnet_client

In particular, receive on messages that will never arrive if the
ct_telnet_client process that should send them is no more, which can
happen if the peer closes the TCP connection for example. Hanging in
open/6 has been seen in practice in the case of network disturbances
that has presumably lead to such a scenario.

This is a minimal fix that just monitors on the process expected to
answer so that receive can't hang indefinitely, but there's much room
for improvement.

For one, ct_telnet isn't expecting an error return so getting one will
currently result in a failure in that module: this is a conscious choice
instead of exiting since ct_telnet should handle the error. (ct_telnet
is just a callback module for a ct_gen_conn process, so it should
probably leave it up to the latter, but the mechanics of receiving data
in ct_telnet are fairly tortured.)

For two, get_data/1 is a bit broken: it waits for a hardcoded (and
undocumented) 100 ms to return when there is no data return when the
request is received, regardless of whether data is then received after 1
ms, 99 ms, or 101 ms, and calls to get_data/1 affect the keep_alive NOP
since any message to the ct_telnet_client process is regarded as traffic
that delays the keep_alive message. Moreover, the 100 ms is on top of
poll_interval that was introduced in commit c3dd39ed. A timeout that
returns data as soon as it arrives up to a timeout would likely be a
better solution if a delay is really needed. As it is, the documented
behaviour of keep_alive is reasonable, but when NOP is actually sent
depends on more than just lack of traffic on the TCP connection.
(Sending it even after every 10th call to get_data/1 that returns
nothing was possibly meant to counteract that, but when NOP is sent is
unpredictable in practice.)

For three, close/1 will wait 5 seconds unnecessarily if the process in
question is already dead, and can leave a closed message in the message
queue after a timeout.

For four, implementing all of this more sanely as a gen_server instead
of reinventing the wheel with an own process loop would be an
improvement.
---
 lib/common_test/src/ct_telnet_client.erl | 33 ++++++++++++++----------
 1 file changed, 19 insertions(+), 14 deletions(-)

diff --git a/lib/common_test/src/ct_telnet_client.erl b/lib/common_test/src/ct_telnet_client.erl
index 6f9d05e504..f3b11c7ed5 100644
--- a/lib/common_test/src/ct_telnet_client.erl
+++ b/lib/common_test/src/ct_telnet_client.erl
@@ -83,15 +83,16 @@ open(Server, Port, Timeout, KeepAlive, ConnName) ->
 
 open(Server, Port, Timeout, KeepAlive, NoDelay, ConnName) ->
     Self = self(),
-    Pid = spawn(fun() ->
-			init(Self, Server, Port, Timeout,
-			     KeepAlive, NoDelay, ConnName)
-		end),
-    receive 
-	{open,Pid} ->
-	    {ok,Pid};
-	{Error,Pid} ->
-	    Error
+    {Pid, MRef} = spawn_monitor(fun() ->
+                                        init(Self, Server, Port, Timeout,
+                                             KeepAlive, NoDelay, ConnName)
+                                end),
+    receive
+        {Result, Pid} ->
+            demonitor(MRef, [flush]),
+            if Result == open -> {ok, Pid}; true -> Result end;
+        {'DOWN', MRef, process, _, _} = T ->
+            {error, T}
     end.
 
 close(Pid) ->
@@ -111,10 +112,14 @@ send_data(Pid, Data, false) ->
     ok.
 
 get_data(Pid) ->
-    Pid ! {get_data,self()},
-    receive 
-	{data,Data} ->
-	    {ok,Data}
+    MRef = monitor(process, Pid),
+    Pid ! {get_data, self()},
+    receive
+        {data, Data} ->
+            demonitor(MRef, [flush]),
+            {ok, Data};
+        {'DOWN', MRef, process, _, _} = T ->
+            {error, T}
     end.
 
 %%%-----------------------------------------------------------------
-- 
2.26.2
Places

File 0745-Fix-possible-hangings-in-ct_telnet_client.patch of Package erlang

Places