-
Notifications
You must be signed in to change notification settings - Fork 383
[Design PR] Spec for dotnet-counters and dotnet-trace run #1522
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
josalem
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(Yay, a priori documentation!!) I agree that dotnet-<tool> run ... should mirror dotnet-<tool> collect|monitor .... Can I suggest a slight variation on how to specify the executable?
dotnet-trace run <dotnet-trace args> -- <exec + args>
For example,
dotnet-trace run --buffersize 512 --profile cpu-sampling -- dotnet MyAwesomeApp.dll -myArg1 -myArg2 val
That feels more similar to what similar tools, e.g., perf, do:
root@acafcbc23e4f:/# perf sched record -h
Usage: perf record [<options>] [<command>]
or: perf record [<options>] -- <command> [<options>]
...
I'm not 100% against the -e|--exec + --args combo, but it feels a little clunky to use the verb run and then have all the other arguments be the same as collect|monitor. It might end up being simpler if we added -e|--exec as an argument on collect|monitor instead.
If we do use -e|--exec could we not have --args as well? At least for inner loop, it's common to run a dotnet process as dotnet /bin/.../my.dll. If we use -e|--exec and --args, that becomes dotnet-trace run -e dotnet --args bin/.../my.dll. I think it might be simpler if we just took the entire invocation as single string, e.g., dotnet-trace run --exec "dotnet bin/.../my.dll --my-arg" [rest of dotnet-trace args].
Thoughts?
| Syntax: | ||
|
|
||
| ``` | ||
| dotnet-counters collect [-h||--help] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| dotnet-counters collect [-h||--help] | |
| dotnet-counters run [-h||--help] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I find the verb run to be confusing. Without reading the proposal, I'd assumed dotnet counters run was similar to dotnet run where I could point it to a directory containing a project file as opposed to needing to specify an executable
|
@josalem Thanks for the thoughtful suggestions!!
We used to do this more often... I figured it would be nice if we do this at least for feature work that we do in this space :p
I actually quite like this idea since we might end up adding more commands to dotnet-trace as well (i.e. report), and it would fit in nicely with that. I'll wait for more people to put in their thoughts and if people are happy with what you suggested here, I would be happy to move forward with this proposal instead. |
|
Its nice to be able to discuss on the spec and +1 for most of what John said : ) I think a very useful scenario is that if someone has a command line X (ie MyApp.exe -appArg1) that runs their app, it is very convenient if the command to run with a diagnostic tool can be produced solely by concatenating some prefix: This makes it easy to give instructions, to manually compose the command line, or to automate composing it in a script. Other options are possible but take progressively more effort to explain or use: Doing the top option typically means allowing the "--" to ensure there is no ambiguity between options for the diagnostic tool and arguments for the app to be run that also happen to start with "-". I'd hope System.CommandLine already has support for it as it is a pattern that I've seen in a number of other tools. I like the suggestion to re-use the collect/monitor verbs rather than adding a new run verb. Combining this with the above is a little obnoxious because dotnet-counters already uses positional arguments as the counter list. I wouldn't be opposed to changing the syntax so that counters are now a comma separated list with an explicit "--counters" option to free up the positional arguments for the target app command line. We could also be nice and treat the positional arguments as a counter list when a PID option is already specified to avoid breaking anyone who might be relying on the current syntax. |
|
Thanks @sywhang for writing this up! Although not explicitly mentioned, I assume the stdout/stderr are still wired to the target process being traced? It might worth looking at other trace tools for prior art on what they do here. If I remember correctly valgrind outputs to stderr while the target process is still wired to stdout |
|
What happens to child dotnet processes that are spawned? Are they traced by default? Is it configurable? |
I was expecting something like this, but it's good to explicitly call out the behavior.
TL;DR - My preference would be to have the default be "trace all subprocesses" with a flag that makes it "trace first process only". This tech is based on the Diagnostic Port work in 5.0 which is configured via an env var. Under the covers, this is just creating a Diagnostic Port and setting an environment variable for the Another thing we can do similar to the proposed functionality would be a |
|
Thanks for the feedback everyone! I did take some time today to go through some of the existing tools and modified the proposal a bit. To summarize, I changed the spec to follow what Two main reasons for doing this were:
Some additional thoughts on other things being brought up here:
Actually, what I was thinking was the opposite - make the executable being specified on the CLI be the one and only process that gets traced, and not any of its subprocesses.
Yes it is. And for what it's worth I don't think that's something all users need to know or frankly, even care about. From an end user's perspective they may be confused on why all of the subprocesses are being collected by default when it wasn't explicitly configured to do so - the process they launch are explicitly typed by them so they know what is being traced, but tracing all subprocesses doesn't really fall into that category. I also don't know any other dev tools that you can use to launch a process and does something to that process do the same thing to all of its subprocesses. Say you launch a debugger that spawns child processes - by default the debugger won't attach to those processes. But it is a good point and I think we can add a flag in case the user wants to trace all those processes.
Yes :p Let's wait on that for a bit. |
josalem
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, what I was thinking was the opposite - make the executable being specified on the CLI be the one and only process that gets traced, and not any of its subprocesses.
I don't have any data or hypothetical scenarios for arguing in either direction, so I'm okay making that the default so long as we have a flag to trace all subprocesses. Perhaps a --trace-children or --follow-fork flag? One evokes the equivalent from windbg and the other from lldb/gdb.
I don't think that's something all users need to know or frankly, even care about
Agreed! 😃
Other than the few nits on the spec itself, I think this is going in a good direction.
The last couple things I think we should try to answer, either in this spec or as part of the implementation:
- How do we handle file/session rollover for long-lived processes? Should we?
- Can we stop tracing before the end of the process?
- E.g., if I'm doing fusion log collection, but I don't care about anything after startup, do I need to wait till process end to get my log?
|
|
||
| 2. Launch my-server-app.exe and start collecting metrics into a CSV file from its startup with the runtime and Kestrel counters and update interval of once per minute. | ||
| ``` | ||
| dotnet-counters collect System.Runtime Microsoft-AspNetCore-Server-Kestrel --format csv -- C:\path\to\my-server-app.exe --refresh-interval 60 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would expect that everything after -- will be used with the subprocess. Did you mean to have --refresh-interval 60 before --?
| [--format <csv|json>] | ||
| [--refreshInterval <sec>] | ||
| counter_list | ||
| [ -- <command-to-execute>] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| [ -- <command-to-execute>] | |
| [ -- <command-to-execute> [command options]] |
| -h, --help | ||
| Show command line help | ||
| --exec,--executable |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we still need --exec if we use the -- <command> [command args] model?
| [--profile <profile_name>] | ||
| [--providers <list-of-comma-separated-providers>] | ||
| [--format <trace-file-format>] | ||
| -- <command-to-execute> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| -- <command-to-execute> | |
| [-- <command-to-execute> [command args]] |
|
I've been thinking about what to do with the UI. The main question around this is what we do with the UI (stdout/stdin/stderr) for target application being launched since this also affects the lifetime of the target app (i.e. stop tracing before the app exits, etc.). Some options I have thought of:
This still leaves the question of what to do about stdin and the lifetime of the target app. (Whether stdin should be used for the counter, or the app, or both => stop tracing/counter session means stop both tracing/counter AND the app).
None of these options seem "complete" enough to cover all use cases, and I can think of use cases that would make each one of these options "better" than others. I was hoping to get thoughts on these options, or any other thoughts people might have. |
|
Right off the bat, my head goes to how most debuggers on Linux work:
If we follow this model, we effectively have two use cases:
We've already settled on using a console A console B This would give us a simple UX for simple apps (1), but also gives us a much more flexible UX at the cost of simplicity (2). Another thought I had was to reduce the frequency of updates to stdout for our tooling and to skip line rewriting. That's more difficult with dotnet-counters since the stdout output is the purpose of the One way we can handle the "trace ends before app" issues is to make users press ctrl-c twice to end the target app. The first press triggers the trace stopping logic and the second gets forwarded to the app. Applying the last two observations to use cases 1 and 2 from above gets us the following:
Because we are adding this functionality on to |
|
I very much like @josalems suggestion to split the scenario into those two cases that tradeoff simplicity for access to interactive console. A few spots I was uncertain of:
Was this suggestion an entirely different alternative or a suggestion of behavior to add to case 1 where we are asserting the user doesn't care about the output? If it is the latter then it wasn't clear what our rationale for changing the output would be because we are already assuming the user doesn't care about their app output.
What do you think are the odds that a user won't care about their app's output but does care about having it keep running after the trace is complete? My guess is that this is a fairly uncommon set of requirements and based on that I'd propose serving this scenario with case 2 so that we make case 1 simpler. In case 1 the app would end as soon as the tracing does, at the first ctrl-C. |
It was meant to be another suggestion on top of case 1, but your comment has made me reconsider and I don’t think it’s worth including. I’m okay, okay with your suggestions about the crew-c handling as well Noah. I’d rather case 1 be as simple as possible and we leave case 2 for anyone with more advanced requirements. If we get feedback that case 1 needs more options or different behavior, then we can course correct at that point 😄 |
noahfalk
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Arg, I wrote this feedback weeks ago and forgot to hit submit : (
|
|
||
| 1. Launch my-server-app.exe and start monitoring it from its startup with default list of counters and interval. | ||
| ``` | ||
| dotnet-counters monitor -- C:\path\to\my-server-app.exe |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not a big deal either way, but will it also work if I write?
dotnet-counters monitor C:\path\to\my-server-app.exe
Typically the "--" means everything after that point should be treated as an argument, not an option, regardless if it starts with a -. However in this case there are no tokens that start with "-" after the "--" so it isn't ambiguous. As far as I know most tools that utilize the "--" syntax don't require it in the non-ambiguous case.
| ``` | ||
|
|
||
| MONITOR / COLLECT on Startup (on .NET 5 or later) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The error case is also pretty interesting. What happens if I run this command on an executable that doesn't wind up starting a runtime version >= 5?
|
|
||
| 2. Launch my-server-app.exe and start collecting metrics into a CSV file from its startup with the runtime and Kestrel counters and update interval of once per minute. | ||
| ``` | ||
| dotnet-counters collect System.Runtime Microsoft-AspNetCore-Server-Kestrel --format csv -- C:\path\to\my-server-app.exe --refresh-interval 60 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Based on my understanding of how "--" is usually interpretted in other apps (and probably how System.CommandLine would interpret it?) I would expect what this actually should do is attempt to run the process:
System.Runtime Microsoft-AspNetCore-Server-Kestrel C:\path\to\my-server-app.exe --refresh-interval 60
I don't think that is what you meant to have it do. My understanding of the parse rules are something like this pseudo-code:
bool parsingOptions = true;
foreach(string token in commandLine.Split(whitespace))
{
if(token == "--" && parsingOptions)
{
parsingOptions = false;
continue;
}
if(parsingOptions && token.StartsWith("-"))
{
// this is an option, skip it an potentially the next token too if the option consumes a value (not shown)
continue;
}
arguments.Add(token);
}
System.Runtime and Microsoft-AspNetCore-Server-Kestrel are arguments because they don't start with a "-". "--format csv" is an option because it does start with a dash and it consumes the following token. "--" disables all further options. "C:\path\to\my-server-app.exe" "--refresh-interval" and "60" are all arguments because regardless what they start with option parsing is disabled. Then the command-line is inferred to be all the arguments joined together with single spaces.
This is why counters using using a counter_list as its current argument list is problematic, it collides with interpreting arguments as being parts of the command-line. My suggestion is to change the syntax to something like this:
dotnet-counters collect --counters System.Runtime,Microsoft-AspNetCore-Server-Kestrel --format csv --refresh-interval 60 -- C:\path\to\my-server-app.exe
Back-compat with old syntax can still work by observing if the -p {PID} option is present and interpreting any arguments as elements in the counter list in that case only.
|
|
||
| ``` | ||
| dotnet-counters collect [-h||--help] | ||
| [--exec|--executable <path>] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| [--exec|--executable <path>] |
| [--mode <monitor|collect>] | ||
| [--format <csv|json>] | ||
| [--refreshInterval <sec>] | ||
| counter_list |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| counter_list | |
| --counters counter_list |
| dotnet-counters collect [-h||--help] | ||
| [--exec|--executable <path>] | ||
| [-o|--output <name>] | ||
| [--mode <monitor|collect>] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| [--mode <monitor|collect>] |
|
I am closing old PRs that have had no progress in a long time. You are welcome to resume the work any time and open a new PR. |
Related: #1466
I've been working on implementing
dotnet-counters runanddotnet-trace runand got them working locally. Before I proceed any further I wanted to bring up a spec PR so that we can all chime in on what this command should really do. My current proposal is for dotnet-counters and dotnet-trace only, but dotnet-monitor could benefit from the same command and I believe the tool was written with the run command in mind.My proposal is to have the run command be an analogous experience to dotnet-trace collect and dotnet-counters collect/monitor, except it launches the target app and starts collecting it.
For dotnet-counters, a user may want to use the collect command instead of the monitor command, so a flag was added so that the user can specify it depending on the need.
I am open to suggestions on shaping this command so please feel free to share your thoughts!
cc @josalem @noahfalk @shirhatti