-
Notifications
You must be signed in to change notification settings - Fork 292
CP-54393: VNC console idle timeout #6692
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: feature/limit-vnc-console-sessions
Are you sure you want to change the base?
CP-54393: VNC console idle timeout #6692
Conversation
This commit adds idle timeout feature for vnc console connections. Key changes: - Add idle timeout detection by monitoring RFB keyEvent and pointerEvent. - Add callback function to `proxy` to parse the RFB messages and determine if the connection is idle or not. Signed-off-by: Stephen Cheng <[email protected]>
This is a re-open PR of #6667 based on the rfb parser.
|
exception Close_proxy | ||
|
||
let proxy ?(should_close : (bytes * int * int -> bool) option = None) | ||
?(poll_timeout = -1) (a : Unix.file_descr) (b : Unix.file_descr) = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The parameter poll_timeout
is optional. Should we not need special values to indicate that we have no timeout?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems to be a consequence of the interface of polly, which expects a -1
for "no timeout".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
However, you can define should_close
without option = None
(as the ?
makes it an option type already). Then you don't need to use Some
when you call the function in console.ml
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. This should be a parameter with a default but it should not be an optional value.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
proxy
is not only used by console. I made poll_timeout
as optional because I don't want to modify the code that calls proxy
in other places before.
ocaml/xapi/xapi_globs.ml
Outdated
(* The timeout (in seconds) for event polling in the proxy loop. | ||
If set to a positive value, the poll will wake up periodically, | ||
which is useful for implementing features like idle timeout or periodic inspection of proxy buffers. *) | ||
let poll_timeout_sec = ref 5 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this not be a float value?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This timeout value is for polly.wait_fold. It should be an int.
Check:
https://github.com/lindig/polly/blob/master/lib/polly.mli#L127
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The expected value at the Polly API is an int in milliseconds - so we could take a float in seconds and convert it to milliseconds when we pass it to Polly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be an Mtime.Span.t
, also the current name doesn't make it clear it's for a proxy
let poll_timeout_sec = ref 5 | |
let proxy_poll_period_timeout = ref Mtime.Span.(5 * s) |
ocaml/xapi/xapi_globs.ml
Outdated
) | ||
; ( "poll_timeout_sec" | ||
, Arg.Set_int poll_timeout_sec | ||
, (fun () -> string_of_int !poll_timeout_sec) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would use a float.
exception Close_proxy | ||
|
||
let proxy ?(should_close : (bytes * int * int -> bool) option = None) | ||
?(poll_timeout = -1) (a : Unix.file_descr) (b : Unix.file_descr) = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems to be a consequence of the interface of polly, which expects a -1
for "no timeout".
match should_close with | ||
| Some cb -> | ||
let buf, read_len, offset = CBuf.read b' b in | ||
if cb (buf, read_len, offset) then | ||
close_proxy () | ||
| None -> | ||
ignore (CBuf.read b' b) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This shorter version should work too:
let data = CBuf.read b' b in
Option.iter (fun cb ->
if cb data then close_proxy ()
) should_close
( match should_close with | ||
| Some cb -> | ||
if cb (Bytes.empty, 0, 0) then | ||
close_proxy () | ||
| None -> | ||
() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Option.iter (fun cb ->
if cb (Bytes.empty, 0, 0) then close_proxy ()
) should_close
| None -> | ||
ignore (CBuf.read b' b) | ||
) ; | ||
( match should_close with |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess this is to handle the case where polly returned after a timeout? Could it then go in an else
branch?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Modified the code to call the callback only once by providing a init value(Bytes.empty, 0, 0) to Polly.wait_fold
ocaml/xapi/console.ml
Outdated
if !Xapi_globs.poll_timeout_sec < 0 then | ||
-1 | ||
else | ||
!Xapi_globs.poll_timeout_sec * 1000 (* convert to milliseconds *) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or:
max (-1) (!Xapi_globs.poll_timeout_sec * 1000) (* convert to milliseconds *)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now I changed poll_timeout_sec
to a float type. The max (-1) ...
can not work, say poll_timeout_sec
is set to -0.00001 (though it's unlikely to set like this).
So I changed the logic back to the original one.
let get_console_idle_timeout_value ~__context ~vm = | ||
try | ||
let idle_timeout = | ||
if Db.VM.get_is_control_domain ~__context ~self:vm then |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like this triggers several DB calls every 5 seconds for every console. If the VM is running on a pool member, then that means several calls to the coordinator. I understand that we want to be responsive to timeout changes, with we should reduce this to at most one DB call.
Several of these values are static for given VM and can be captured in the closure before the line with fun (buf, read_len, offset) ->
below.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. Good point.
ocaml/xapi/console.ml
Outdated
Db.Pool.get_vm_console_idle_timeout ~__context ~self:pool | ||
in | ||
idle_timeout | ||
with _ -> 0L (* Default: no timeout *) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Make it an option
please.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
ocaml/xapi/console.ml
Outdated
false | ||
| Some parser -> ( | ||
try | ||
let data = Bytes.sub_string buf offset read_len in |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note that this still creates a copy. I had hoped that we could avoid that and pass buf+offset+len straight to the parser. Perhaps not worth doing, but I can't remember how the parser processes this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The parser calls Angstrom.Buffered.feed
which needs a string type.
So there must be a copy.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unfortunately, not even with the unsafe functions the copy can be avoided, as Bytes.sub
family of functions always allocates
ocaml/xapi/console.ml
Outdated
Int64.to_float (get_console_idle_timeout_value ~__context ~vm) | ||
in | ||
if is_idle_timeout ~idle_timeout_seconds ~last_activity then ( | ||
debug "Idle timeout exceeded! (timeout: %.1fs)" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would be handy to say that this is about a console connection and for which VM.
exception Close_proxy | ||
|
||
let proxy ?(should_close : (bytes * int * int -> bool) option = None) | ||
?(poll_timeout = -1) (a : Unix.file_descr) (b : Unix.file_descr) = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. This should be a parameter with a default but it should not be an optional value.
ocaml/xapi/xapi_globs.ml
Outdated
(* The timeout (in seconds) for event polling in the proxy loop. | ||
If set to a positive value, the poll will wake up periodically, | ||
which is useful for implementing features like idle timeout or periodic inspection of proxy buffers. *) | ||
let poll_timeout_sec = ref 5 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The expected value at the Polly API is an int in milliseconds - so we could take a float in seconds and convert it to milliseconds when we pass it to Polly.
ocaml/xapi/xapi_globs.ml
Outdated
; ( "poll_timeout_sec" | ||
, Arg.Set_int poll_timeout_sec | ||
, (fun () -> string_of_int !poll_timeout_sec) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be defined as part of the xapi_globs_spec_with_descriptions
list, in line 1259:
; ( "proxy_poll_period_timeout"
, ShortDurationFromSeconds proxy_poll_period_timeout
, "Timeout (in seconds) for event polling in network proxy loops. When \
positive, the proxy will wake up periodically to check tasks like vnc \
idle timeouts or perform other maintenance tasks. Set to -1 to wait \
indefinitely for network events without periodic wake-ups."
)
use ShortDurationFromSeconds
to have the conf value as float
, or LongDurationFromSeconds
for int
. Because polly accepts milliseconds and the polling loop is bound to timeout in a matter of seconds at most, I recommend the float.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks. I didn't know this before. But I checked the code, it seems ShortDurationFromSeconds cannot be negative, while epoll can take -1 as input. So I'll use float as Lindig suggested, and it'll be simpler to covert it to milliseconds.
val s_to_span : float -> Mtime.Span.t option
(** [s_to_span seconds] converts a float representing seconds to a timespan.
Returns None when [seconds] is negative, is not a number or larger than
~104 days. Avoid whenever possible, some RPC function already use this so
it needs to be available. *)
Signed-off-by: Stephen Cheng <[email protected]>
Signed-off-by: Stephen Cheng <[email protected]>
ocaml/xapi/console.ml
Outdated
last_activity := Mtime_clock.counter () ; | ||
|
||
(* The idle_timeout config may change during running the callback, | ||
so need to get the latest *) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the callback function is performance critical. We may have to give up on this. Specifically, I think we need to avoid any DB operations within this callback.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK. It's a design choice. In that way, the timeout config change only applys to the new connections.
I can add this to the design doc.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we need it to apply to current connections, then another option is to have a separate thread watch the DB field using event.from
and set a local ref
that all console proxies can use. Then DB calls are outside of the critical path. But even simpler to not do it at all...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, better to not do it.
If users want the idle timeout config applys to old connections, it's easy to reconnect the connections.
ocaml/xapi/console.ml
Outdated
match parser data with | ||
| Ok messages -> | ||
if is_active messages then | ||
last_activity := Mtime_clock.counter () ; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's unlikely that it would get timed out after this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are right. We should not call timed_out function if is_active is true.
ocaml/xapi/console.ml
Outdated
debug "RFB parse error: %s" error_msg ; | ||
rfb_parser := None ; | ||
false | ||
with e -> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will the parsing raise any exception? Will the result
return have covered all the error cases? It is the point of result
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Checking the parser code, it'll return Error when parsing FAIL
.
I'll remove the exception.
Signed-off-by: Stephen Cheng <[email protected]>
ocaml/xapi/console.ml
Outdated
let pool = Helpers.get_pool ~__context in | ||
Db.Pool.get_vm_console_idle_timeout ~__context ~self:pool | ||
in | ||
Some idle_timeout |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It could be
if idle_timeout > 0. then Some (Int64.to_float idle_timeout) else None
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated and tested
ocaml/xapi/console.ml
Outdated
messages | ||
|
||
let timed_out ~idle_timeout_seconds ~last_activity = | ||
idle_timeout_seconds > 0. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This checking is not needed anymore.
Signed-off-by: Stephen Cheng <[email protected]>
0e57657
to
b238556
Compare
Hello, @stephenchengCloud |
Will do. |
This commit adds idle timeout feature for vnc console connections.
Key changes:
proxy
to parse the RFB messages and determine if the connection is idle or not.