
Instrument performance monitoring for Edit Checks and Suggestions
Closed, Resolved; Public

Description

Context: Ad-hoc tests show potential issues with suggestion mode performance on an entry-level Android device. (See, for instance, the numbers in T418532.) We need a more systematic way of monitoring how the system is performing.

The purpose of this task is to enable the Editing team to see, more easily and quickly, when suggestion mode/edit checks are affecting user experience.

Scope: There are multiple ways to define "edit check performance". We could consider parts that aren't immediately noticeable to the user but that still warrant investigation if performing abnormally. (For example, if tone check were suddenly taking longer than usual to generate actions, the user probably wouldn't experience typing lag but we'd still want to look into it.) However, to provide some scope to this task, we've decided to limit monitoring to only the user-perceptible parts.

Acceptance criteria:

  • New metrics in the code that track how the system is performing with respect to the scope defined above. (See below for the metrics we decided on.)
  • A view (such as a Grafana dashboard) where we can see these metrics (in order to more easily catch when edit checks are affecting user experience).

Decisions made:

  • These are the metrics we'll track:
    • time to generate actions, sync and async, on a per-check level; computed each time the checks are run and tracked for a randomly selected 1% of sessions
    • an aggregate of typing lags (avg, p50, p95, max) for each session
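
As an illustration of the first metric, here is a minimal sketch of sampling 1% of sessions and timing action generation per check. It is not the EditCheckPerformance implementation, whose API is not shown in this task; `isSessionSampled`, `createCheckTimer`, `TimingRecord` and the check name "tone" are hypothetical names, and the injectable `random` and `now` parameters are assumptions made so the sketch can be tested.

```typescript
// Hypothetical sketch, not the actual EditCheckPerformance code.
const SAMPLE_RATE = 0.01; // 1% of sessions, per the decision above

// Decide once per session whether its timings are tracked.
// `random` is injectable (an assumption) so the decision can be tested.
function isSessionSampled(random: () => number = Math.random): boolean {
  return random() < SAMPLE_RATE;
}

type Mode = "sync" | "async";

interface TimingRecord {
  check: string; // hypothetical check name, e.g. "tone"
  mode: Mode;
  durationMs: number;
}

// Collects per-check timings for one session. Unsampled sessions record nothing.
function createCheckTimer(sampled: boolean, now: () => number = () => Date.now()) {
  const records: TimingRecord[] = [];

  // Call start() just before a check begins generating actions, then call the
  // returned function once the actions are available (for async checks, when
  // the underlying promise settles).
  function start(check: string, mode: Mode): () => void {
    if (!sampled) {
      return () => {};
    }
    const startedAt = now();
    return () => {
      records.push({ check, mode, durationMs: now() - startedAt });
    };
  }

  return { records, start };
}
```

Returning a stop function from `start()` lets the same code path cover both sync and async checks. In a browser, a monotonic clock such as `performance.now()` would likely be a better default than `Date.now()`.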

Event Timeline

Change #1261681 had a related patch set uploaded (by Medelius; author: Medelius):

[mediawiki/extensions/VisualEditor@master] Edit Check: Instrument edit check & suggestion mode performance

https://gerrit.wikimedia.org/r/1261681

Change #1268278 had a related patch set uploaded (by Medelius; author: Medelius):

[mediawiki/extensions/VisualEditor@master] Compute and store various edit check performance stats

https://gerrit.wikimedia.org/r/1268278

Change #1268606 had a related patch set uploaded (by Medelius; author: Medelius):

[mediawiki/extensions/VisualEditor@master] WIP: Add tracking metrics to EditCheckPerformance

https://gerrit.wikimedia.org/r/1268606

Change #1268278 merged by jenkins-bot:

[mediawiki/extensions/VisualEditor@master] Create EditCheckPerformance class to store various metrics

https://gerrit.wikimedia.org/r/1268278

Change #1268606 merged by jenkins-bot:

[mediawiki/extensions/VisualEditor@master] Add tracking metrics to EditCheckPerformance

https://gerrit.wikimedia.org/r/1268606

Documenting the metrics we've decided on:

  • time to generate actions, sync and async, on a per-check level; computed each time the checks are run and tracked for a randomly selected 1% of sessions
  • an aggregate of typing lags (avg, p50, p95, max) for each session
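
For the second metric, the per-session aggregate could be computed along these lines. This is a sketch under assumptions rather than the merged code: `summariseTypingLags` and `LagSummary` are hypothetical names, lags are assumed to be in milliseconds, and the nearest-rank percentile definition is one choice among several.

```typescript
// Hypothetical sketch: summarise one session's typing-lag samples (assumed ms).
interface LagSummary {
  avg: number;
  p50: number;
  p95: number;
  max: number;
}

// Nearest-rank percentile over an ascending-sorted, non-empty array.
function percentile(sorted: number[], p: number): number {
  const rank = Math.ceil((p / 100) * sorted.length);
  return sorted[Math.max(rank, 1) - 1];
}

function summariseTypingLags(lagsMs: number[]): LagSummary | null {
  if (lagsMs.length === 0) {
    return null; // no lag samples in this session, so nothing to report
  }
  // Sort a copy so the caller's array is left untouched.
  const sorted = lagsMs.slice().sort((a, b) => a - b);
  const total = sorted.reduce((sum, lag) => sum + lag, 0);
  return {
    avg: total / sorted.length,
    p50: percentile(sorted, 50),
    p95: percentile(sorted, 95),
    max: sorted[sorted.length - 1],
  };
}
```

Reporting p95 and max alongside the average keeps occasional long stalls visible even when most keystrokes are fast.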

Change #1261681 abandoned by Medelius:

[mediawiki/extensions/VisualEditor@master] Edit Check: Instrument edit check & suggestion mode performance

Reason:

Went with building a whole EditCheckPerformance class instead. See 1268278.

https://gerrit.wikimedia.org/r/1261681

@medelius: a couple of questions before closing this out (below).

  1. Where do I need to go, or what do I need to do, to see "...time to generate actions, sync and async, on a per-check level..."? Right now, when I visit the Grafana dashboard, I see a Time to generate actions chart, but I've not yet found a way to filter it by check type.
  2. Would it be accurate to interpret the presence of the Typing lag and Time to generate actions charts in the Grafana dashboard as meaning that the data we'd been waiting on has "arrived"?
medelius updated the task description.
medelius updated the task description.

Sorry Peter, that still had dummy data on it. We're getting the data we expect now (2), and the graphs are updated and filters are at the top (1).