Maniphest T419447

Instrument performance monitoring for Edit Checks and Suggestions
Closed, Resolved · Public

Assigned To

Authored By

	dchan
	Mar 9 2026, 4:47 PM

Description

Context: Ad-hoc tests show potential performance issues with suggestion mode on an entry-level Android device (see, for instance, the numbers in T418532). We need a more systematic way to monitor how the system is performing.

The purpose of this task is to enable the Editing team to see, more easily and quickly, when suggestion mode/edit checks are affecting user experience.

Scope: There are multiple ways to define "edit check performance". We could consider parts that aren't immediately noticeable to the user but that still warrant investigation if performing abnormally. (For example, if tone check were suddenly taking longer than usual to generate actions, the user probably wouldn't experience typing lag but we'd still want to look into it.) However, to provide some scope to this task, we've decided to limit monitoring to only the user-perceptible parts.

Acceptance criteria:

- A new metric in the code that tracks how the system is performing with respect to the scope defined above (see "Decisions made" below for the metrics we decided on).
- A view (such as a Grafana dashboard) where we can see these metrics, so that we can more easily catch when edit checks are affecting user experience.
  - See Suggestion Mode Performance dashboard.

Decisions made:

These are the metrics we'll track:
- time to generate actions, sync and async, on a per-check level; computed each time the checks are run and tracked for a randomly selected 1% of sessions
- an aggregate of typing lags (avg, p50, p95, max) for each session
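
For illustration only, the two metrics above could be sketched roughly as follows. This is not the merged EditCheckPerformance class: every name here (`summarizeLags`, `isSessionSampled`, `timeSync`, `timeAsync`, `CheckTiming`) is hypothetical, the nearest-rank percentile method is an assumption, and `Date.now()` stands in for whatever clock the real code uses.

```typescript
// Hypothetical sketch; none of these names come from the VisualEditor codebase.

interface LagSummary {
	count: number;
	avg: number;
	p50: number;
	p95: number;
	max: number;
}

// Nearest-rank percentile over an ascending-sorted array (assumed method).
function percentile(sorted: number[], p: number): number {
	if (sorted.length === 0) {
		return 0;
	}
	const rank = Math.ceil((p / 100) * sorted.length);
	return sorted[Math.min(sorted.length, Math.max(1, rank)) - 1];
}

// Collapse one session's typing-lag samples (in ms) into avg, p50, p95 and max.
function summarizeLags(lagsMs: number[]): LagSummary {
	const sorted = lagsMs.slice().sort((a, b) => a - b);
	const total = sorted.reduce((sum, v) => sum + v, 0);
	return {
		count: sorted.length,
		avg: sorted.length ? total / sorted.length : 0,
		p50: percentile(sorted, 50),
		p95: percentile(sorted, 95),
		max: sorted.length ? sorted[sorted.length - 1] : 0
	};
}

// Decide once per session whether per-check timings are tracked (1% by default).
// `random` is injectable so the decision can be tested deterministically.
function isSessionSampled(rate: number = 0.01, random: () => number = Math.random): boolean {
	return random() < rate;
}

interface CheckTiming {
	check: string;
	phase: 'sync' | 'async';
	durationMs: number;
}

// Time one synchronous run of a check's action generation.
// The timing is only stored when the session was sampled.
function timeSync<T>(
	check: string,
	generate: () => T,
	sink: CheckTiming[],
	sampled: boolean,
	now: () => number = () => Date.now()
): T {
	const start = now();
	const result = generate();
	if (sampled) {
		sink.push({ check: check, phase: 'sync', durationMs: now() - start });
	}
	return result;
}

// Same idea for asynchronous checks: the duration runs until the promise resolves.
function timeAsync<T>(
	check: string,
	generate: () => Promise<T>,
	sink: CheckTiming[],
	sampled: boolean,
	now: () => number = () => Date.now()
): Promise<T> {
	const start = now();
	return generate().then((result) => {
		if (sampled) {
			sink.push({ check: check, phase: 'async', durationMs: now() - start });
		}
		return result;
	});
}
```

In this sketch the sampling decision is made once per session and passed to every timing call, so a sampled session contributes a timing for each check on each run, while the typing-lag summary is computed once from all of a session's samples. In a browser, `performance.now()` would likely be a better default clock than `Date.now()` because of its higher resolution.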

Details

Related Changes in Gerrit:

Subject	Repo	Branch	Lines +/-
Edit Check: Instrument edit check & suggestion mode performance	mediawiki/extensions/VisualEditor	master	+21 -0
Add tracking metrics to EditCheckPerformance	mediawiki/extensions/VisualEditor	master	+30 -11
Create EditCheckPerformance class to store various metrics	mediawiki/extensions/VisualEditor	master	+197 -0

Related Objects

Status	Assigned	Task
Open	None	T360489 [EPIC] Generate and present edit suggestions at scale
Open	None	T414853 [MILESTONE] Deploy Suggestion Mode as a default-on feature.
Open	None	T404600 [MILESTONE] Run controlled experiment of Suggestion Mode MVP
Resolved	medelius	T419447 Instrument performance monitoring for Edit Checks and Suggestions

Event Timeline

dchan created this task. Mar 9 2026, 4:47 PM

Restricted Application added a subscriber: Aklapper. Mar 9 2026, 4:47 PM

dchan updated the task description. Mar 9 2026, 4:58 PM

ppelberg subscribed. Mar 10 2026, 5:08 PM

ppelberg triaged this task as High priority. Mar 17 2026, 11:35 PM

ppelberg moved this task from Decisions to be made to Ready to Be Worked On on the Editing-team (Editing-18Mar-27Mar-2026) board.

ppelberg added a parent task: T404600: [MILESTONE] Run controlled experiment of Suggestion Mode MVP.

medelius claimed this task. Mar 25 2026, 5:21 PM

medelius moved this task from Ready to Be Worked On to Doing on the Editing-team (Editing-18Mar-27Mar-2026) board. Mar 26 2026, 5:01 PM

ppelberg added projects: VisualEditor Suggestion Mode, OKR-Work. Mar 26 2026, 6:34 PM

Change #1261681 had a related patch set uploaded (by Medelius; author: Medelius):

[mediawiki/extensions/VisualEditor@master] Edit Check: Instrument edit check & suggestion mode performance

https://gerrit.wikimedia.org/r/1261681

gerritbot added a project: Patch-For-Review. Mar 27 2026, 12:35 AM

ppelberg moved this task from Backlog to Meta on the VisualEditor Suggestion Mode board. Mar 27 2026, 7:52 PM

DLynch edited projects, added Editing-team (Editing-Q4-30Mar-10Apr-2026); removed Editing-team (Editing-18Mar-27Mar-2026). Mar 30 2026, 5:20 PM

DLynch moved this task from Decision to be made to Doing on the Editing-team (Editing-Q4-30Mar-10Apr-2026) board.

ppelberg mentioned this in T417922: Consider cost-benefit of running ToneCheck in suggestion mode. Mar 31 2026, 12:38 AM

DLynch mentioned this in T422064: [SPIKE] Investigate what's possible for an automated test that watches typing performance in VisualEditor + suggestion mode. Apr 1 2026, 5:24 PM

medelius updated the task description. Apr 1 2026, 8:05 PM

Change #1268278 had a related patch set uploaded (by Medelius; author: Medelius):

[mediawiki/extensions/VisualEditor@master] Compute and store various edit check performance stats

https://gerrit.wikimedia.org/r/1268278

Change #1268606 had a related patch set uploaded (by Medelius; author: Medelius):

[mediawiki/extensions/VisualEditor@master] WIP: Add tracking metrics to EditCheckPerformance

https://gerrit.wikimedia.org/r/1268606

DLynch moved this task from Doing to Code Review on the Editing-team (Editing-Q4-30Mar-10Apr-2026) board. Apr 9 2026, 5:14 PM

DLynch moved this task from Code Review to QA on the Editing-team (Editing-Q4-30Mar-10Apr-2026) board. Apr 12 2026, 10:34 PM

DLynch added a project: Editing QA.

Change #1268278 merged by jenkins-bot:

[mediawiki/extensions/VisualEditor@master] Create EditCheckPerformance class to store various metrics

https://gerrit.wikimedia.org/r/1268278

Change #1268606 merged by jenkins-bot:

[mediawiki/extensions/VisualEditor@master] Add tracking metrics to EditCheckPerformance

https://gerrit.wikimedia.org/r/1268606

ReleaseTaggerBot added a project: MW-1.46-notes (1.46.0-wmf.24; 2026-04-14). Apr 13 2026, 5:00 AM

Documenting the metrics we've decided on:

- time to generate actions, sync and async, on a per-check level; computed each time the checks are run and tracked for a randomly selected 1% of sessions
- an aggregate of typing lags (avg, p50, p95, max) for each session

medelius updated the task description. Apr 13 2026, 3:39 PM

Change #1261681 abandoned by Medelius:

[mediawiki/extensions/VisualEditor@master] Edit Check: Instrument edit check & suggestion mode performance

Reason:

Went with building a whole EditCheckPerformance class instead. See 1268278.

https://gerrit.wikimedia.org/r/1261681

Maintenance_bot removed a project: Patch-For-Review. Apr 13 2026, 4:31 PM

Moving to the current sprint for me to review and close out.

ppelberg moved this task from Decision to be made to Ready for Sign off on the Editing-team (Editing-Q4-13Apr-24Apr-2026) board. Apr 15 2026, 4:35 PM

medelius updated the task description. Apr 15 2026, 4:40 PM

ppelberg awarded a token. Apr 16 2026, 10:53 PM

@medelius: a couple of questions before closing this out (below).

Where would I need to go, or what would I need to do, to see "...time to generate actions, sync and async, on a per-check level..."? Right now, when I visit the Grafana dashboard, I see a Time to generate actions chart, although I have not yet found a way to filter it by check type.

Would it be accurate for me to interpret the presence of the Typing lag and Time to generate actions charts within the Grafana dashboard as meaning the data we'd been waiting on has "arrived"?