Context: Ad-hoc tests show potential issues with suggestion mode performance on an entry-level Android device (see, for instance, the numbers in T418532). We need a more systematic way of monitoring how the system is performing.
The purpose of this task is to enable the Editing team to see, more easily and quickly, when suggestion mode/edit checks are affecting user experience.
Scope: There are multiple ways to define "edit check performance". We could consider parts that aren't immediately noticeable to the user but that still warrant investigation if they perform abnormally. (For example, if tone check were suddenly taking longer than usual to generate actions, the user probably wouldn't experience typing lag, but we'd still want to look into it.) However, to keep this task bounded, we've decided to limit monitoring to the user-perceptible parts only.
Acceptance criteria:
- New metrics in the code that track how the system is performing with respect to the scope defined above. (See "Decisions made" below for the metrics we decided on.)
- A view (such as a Grafana dashboard) where we can see these metrics (in order to more easily catch when edit checks are affecting user experience).
Decisions made:
- These are the metrics we'll track:
  - Time to generate actions, sync and async, on a per-check level; computed each time the checks are run and tracked for a randomly selected 1% of sessions.
  - An aggregate of typing lags (avg, p50, p95, max) for each session.
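
To make the metrics above concrete, here is a minimal sketch of how they could be computed. It is not the actual implementation: the function names (`isSessionSampled`, `timeSync`, `summarizeTypingLags`), the millisecond unit, and the nearest-rank percentile method are all assumptions made for illustration, and sending the values to the metrics backend is left out.

```typescript
// Illustrative sketch only. All names here are hypothetical, not existing APIs.

interface LagSummary {
  avg: number;
  p50: number;
  p95: number;
  max: number;
}

// Decide once per session whether it falls into the sampled 1%.
// The random source is injectable so the decision can be tested.
function isSessionSampled(rate = 0.01, random: () => number = Math.random): boolean {
  return random() < rate;
}

// Time one synchronous check run. The clock is injectable; Date.now is the
// default here, though a higher-resolution clock may be preferable in practice.
function timeSync<T>(fn: () => T, now: () => number = Date.now): { result: T; durationMs: number } {
  const start = now();
  const result = fn();
  return { result, durationMs: now() - start };
}

// Nearest-rank percentile over a non-empty array sorted in ascending order.
// (Assumption: the task does not say which percentile method to use.)
function percentile(sorted: number[], p: number): number {
  const rank = Math.ceil((p / 100) * sorted.length);
  return sorted[Math.max(rank, 1) - 1];
}

// Aggregate the typing lags recorded during one session.
// Returns null when no lag samples were recorded.
function summarizeTypingLags(lagsMs: number[]): LagSummary | null {
  if (lagsMs.length === 0) {
    return null;
  }
  const sorted = [...lagsMs].sort((a, b) => a - b);
  const sum = sorted.reduce((acc, v) => acc + v, 0);
  return {
    avg: sum / sorted.length,
    p50: percentile(sorted, 50),
    p95: percentile(sorted, 95),
    max: sorted[sorted.length - 1],
  };
}
```

Under these assumptions, a session would call `isSessionSampled()` once at start, wrap each check run in `timeSync` (or an async equivalent) only if sampled, and call `summarizeTypingLags` once at session end so that a single aggregate is reported per session.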