Avoid querying database on every page view to check if page is in the user's reading list
Closed, Resolved · Public · 3 Estimated Story Points
Description

In the current implementation of ReadingLists on web, we query the database on every page view to determine:

whether the page is in the user's reading list, so that the "save page" button shows the "save" or "unsave" icon;
if it is in the reading list, the reading list entry id to provide to the JS;
the reading list size, for metrics.

It is problematic for scaling reading lists to query the reading_list database tables on x1 on every page view, and it is also unnecessary.

We need to determine if we still need the reading list size metric here, and if so, consider other approaches.

For the reading list entry id, there probably is a way to use the page id for this instead.

For determining if the page is in the reading list, Amir has a suggestion:

Build a bloom filter of existing reading list page ids for each user and put it in user_properties backed by some cache. The bloom filter will take away 99% of the load, and even if it incorrectly says "this article is in the user's reading list", you can then query x1 to be sure, and that won't cause any load issues. You can also put that behind memcached to make everything faster and avoid the local db query too.

This is the idea that I have wanted to implement for many years to remove the query of the watchlist table on every logged-in page view. If you can implement it for watchlist too, to improve performance (since it'll be backed by memcached), it would be even better!
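
The lookup flow suggested above can be sketched roughly as follows. This is a minimal illustration in Python, not the ReadingLists implementation and not the API of any particular bloom filter library: the `BloomFilter` class, the `is_page_saved` helper, and the `query_db` callback are hypothetical names, and the 1% target false-positive rate is an assumption chosen to match the "99% of the load" figure.

```python
import hashlib
import math


class BloomFilter:
    """Minimal Bloom filter over integer page ids (illustrative only)."""

    def __init__(self, expected_items: int, false_positive_rate: float = 0.01):
        n = max(1, expected_items)
        # Standard sizing: m = -n * ln(p) / (ln 2)^2 bits, k = (m / n) * ln 2 hashes.
        self.num_bits = max(8, math.ceil(-n * math.log(false_positive_rate) / math.log(2) ** 2))
        self.num_hashes = max(1, round(self.num_bits / n * math.log(2)))
        self.bits = bytearray((self.num_bits + 7) // 8)

    def _positions(self, page_id: int):
        # Derive k bit positions from one SHA-256 digest via double hashing.
        digest = hashlib.sha256(str(page_id).encode()).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1
        return [(h1 + i * h2) % self.num_bits for i in range(self.num_hashes)]

    def add(self, page_id: int) -> None:
        for pos in self._positions(page_id):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, page_id: int) -> bool:
        return all(self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(page_id))


def is_page_saved(bloom: BloomFilter, page_id: int, query_db) -> bool:
    """Return True if the page is saved; touch the database only on a filter hit."""
    if not bloom.might_contain(page_id):
        return False  # Bloom filters have no false negatives, so skip the DB.
    return query_db(page_id)  # Possible false positive: confirm against the DB.


# Usage: build the filter once per user, then consult it on each page view.
saved_page_ids = {101, 202, 303}  # hypothetical saved pages
user_filter = BloomFilter(expected_items=len(saved_page_ids))
for saved_id in saved_page_ids:
    user_filter.add(saved_id)
```

Because a bloom filter never reports a saved page as absent, the "not saved" answer can be trusted without a query. Only the rare "maybe saved" answers reach the database, which is where the load reduction comes from. The filter has to be rebuilt or updated when the user saves or unsaves a page.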

A less efficient approach could be to keep a list of the page ids on the user's reading list in memcached and check against that instead of running a database query.
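
This simpler alternative could look roughly like the sketch below. It uses an in-process dictionary as a stand-in for memcached; `PageIdCache`, the `load_page_ids` callback, and the key format are all hypothetical, and a real deployment would also need an expiry time on the cached value.

```python
from typing import Callable, Dict, FrozenSet, Iterable


class PageIdCache:
    """Cache-aside lookup of a user's saved page ids (dict stands in for memcached)."""

    def __init__(self, load_page_ids: Callable[[int], Iterable[int]]):
        self._load_page_ids = load_page_ids  # hypothetical database loader
        self._store: Dict[str, FrozenSet[int]] = {}

    @staticmethod
    def _key(user_id: int) -> str:
        return f"readinglists:page-ids:{user_id}"  # illustrative key format

    def is_page_saved(self, user_id: int, page_id: int) -> bool:
        key = self._key(user_id)
        page_ids = self._store.get(key)
        if page_ids is None:
            # Cache miss: one database query, then reuse for later page views.
            page_ids = frozenset(self._load_page_ids(user_id))
            self._store[key] = page_ids
        return page_id in page_ids

    def invalidate(self, user_id: int) -> None:
        # Call after the user saves or unsaves a page so the next view reloads.
        self._store.pop(self._key(user_id), None)
```

The cached value grows linearly with the size of the reading list, whereas a bloom filter sized for a 1% false-positive rate needs roughly 10 bits per entry. In exchange, the plain list gives exact answers and never needs a confirming database query.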

Details

Related Changes in Gerrit:

Subject	Repo	Branch	Lines +/-
Split bloom filter cache-related code to BookmarkBloomFilterCache	mediawiki/extensions/ReadingLists	master	+502 -286
Use bloom filter to reduce DB queries to check page bookmark status	mediawiki/extensions/ReadingLists	master	+1 K -22
Add pleonasm/bloom-filter v1.0.4 for ReadingLists	mediawiki/vendor	master	+600 -7

Related Objects

Mentioned In: T422009: Add instrumentation and benchmarks for ReadingLists bloom filter lookups
T419743: Security Review: pleonasm/bloom-filter
T419466: Spike - Title normalization is not applied when saving pages to the reading list via the API
T417010: Avoid querying for ReadingList default list id on page views
T418001: Update save page bookmark.js button code to save page to the default list
T418013: Remove data-mw-list-page-count attribute from the "Save page" bookmark button
T417923: Update ReadingLists bookmark button to unsave page from all ReadingLists
T414252: Spike - Database scaling for reading list beta rollout
Mentioned Here: T422009: Add instrumentation and benchmarks for ReadingLists bloom filter lookups
T419826: Use a bloom filter for looking up disambig pages

Event Timeline

aude created this task. Feb 13 2026, 2:26 PM

Restricted Application added a subscriber: Aklapper. Feb 13 2026, 2:26 PM

aude triaged this task as High priority. Feb 13 2026, 2:28 PM

aude mentioned this in T414252: Spike - Database scaling for reading list beta rollout.

Ladsgroup subscribed. Feb 13 2026, 11:14 PM

aude mentioned this in T417923: Update ReadingLists bookmark button to unsave page from all ReadingLists. Feb 19 2026, 6:37 PM

aude moved this task from Incoming to Phase 2 Beta feature launch (+ mobile web improvements) on the FY25-26 Reading Lists board. Feb 19 2026, 8:12 PM

aude moved this task from Incoming to Needs refinement on the Reader Experience Team board. Feb 20 2026, 5:14 PM

This can initially be a spike with a proof of concept that considers the feasibility of the suggested approach.

AnneT subscribed. Feb 24 2026, 6:39 PM

SToyofuku-WMF edited projects, added Reader Experience Team (REx Sprint 15 [Q3 Feb 24 - Mar 9]); removed Reader Experience Team. Feb 24 2026, 6:39 PM

aude claimed this task. Feb 24 2026, 8:58 PM

aude moved this task from Ready to In progress on the Reader Experience Team (REx Sprint 15 [Q3 Feb 24 - Mar 9]) board.

Change #1245418 had a related patch set uploaded (by Aude; author: Aude):

[mediawiki/extensions/ReadingLists@master] WIP - Use bloom filter to reduce DB queries to check page bookmark status

https://gerrit.wikimedia.org/r/1245418

gerritbot added a project: Patch-For-Review. Feb 27 2026, 5:35 PM

Jdrewniak moved this task from Phase 2 Beta feature launch (+ mobile web improvements) to Phase 2 - Beta feature on the FY25-26 Reading Lists board. Mar 2 2026, 7:04 PM

Jdrewniak edited projects, added FY25-26 Reading Lists (Phase 2 - Beta feature); removed FY25-26 Reading Lists.

Jdrewniak moved this task from Backlog to Performance on the FY25-26 Reading Lists (Phase 2 - Beta feature) board. Mar 2 2026, 7:25 PM

aude renamed this task from SPIKE - Avoid querying database on every page view to check if page is in the user's reading list to Avoid querying database on every page view to check if page is in the user's reading list. Mar 4 2026, 11:54 PM

aude moved this task from In progress to Needs code review on the Reader Experience Team (REx Sprint 15 [Q3 Feb 24 - Mar 9]) board.

@Ladsgroup I uploaded a new patch for this to gerrit:

https://gerrit.wikimedia.org/r/c/mediawiki/extensions/ReadingLists/+/1245418

One question about this: I am using a library for the bloom filter (https://github.com/pleonasm/bloom-filter). It has been used by Wikimedia before, but maybe it still has to go through security review?

Or maybe a bloom filter is simple enough to implement ourselves, as a library or as something in MediaWiki?

Change #1248891 had a related patch set uploaded (by Jforrester; author: Jforrester):

[mediawiki/vendor@master] Add pleonasm/bloom-filter for ReadingLists

https://gerrit.wikimedia.org/r/1248891

Rsilvola added a project: Security-Team. Mar 9 2026, 12:10 PM

aude mentioned this in T418013: Remove data-mw-list-page-count attribute from the "Save page" bookmark button. Mar 9 2026, 4:21 PM

aude mentioned this in T418001: Update save page bookmark.js button code to save page to the default list.

aude mentioned this in T417010: Avoid querying for ReadingList default list id on page views.

To be carried forward to sprint 16 - also to be reviewed by a committee of Amir, Steph, and Anne (at minimum! feel free to also take a look)

ASanford-WMF subscribed. Mar 9 2026, 5:35 PM

cc: @ASanford-WMF

aude mentioned this in T419466: Spike - Title normalization is not applied when saving pages to the reading list via the API. Mar 9 2026, 7:15 PM

aude moved this task from REx Sprint 15 [Q3 Feb 24 - Mar 9] to REx Sprint 16 [Q3 Mar 10 - Mar 23] on the Reader Experience Team board. Mar 10 2026, 6:54 PM

aude edited projects, added Reader Experience Team (REx Sprint 16 [Q3 Mar 10 - Mar 23]); removed Reader Experience Team (REx Sprint 15 [Q3 Feb 24 - Mar 9]).

aude moved this task from Ready to Needs code review on the Reader Experience Team (REx Sprint 16 [Q3 Mar 10 - Mar 23]) board.

ASanford-WMF mentioned this in T419743: Security Review: pleonasm/bloom-filter. Mar 11 2026, 6:05 PM

@Ladsgroup would you be interested in helping with code review for the patch? Our team will also do a review, since we are more familiar with how ReadingLists works.

https://gerrit.wikimedia.org/r/c/mediawiki/extensions/ReadingLists/+/1245418

For now, we are implementing the bloom filter (with the composer package) in ReadingLists. As a follow-up, we are definitely interested in getting a bloom filter into core and having it used for watchlist (and other similar use cases), since we want to be able to handle more logged-in users.

Thanks. I did a preliminary review. I also think we can start using bloom filters in a lot more areas: T419826: Use a bloom filter for looking up disambig pages

ASanford-WMF moved this task from In Progress to Watching on the Security-Team board. Mar 23 2026, 6:20 PM

Jdrewniak edited projects, added Reader Experience Team (REx Sprint 17 [Q3 Mar 24 - Apr 3]); removed Reader Experience Team (REx Sprint 16 [Q3 Mar 10 - Mar 23]). Mar 24 2026, 5:40 PM

aude moved this task from Ready to Needs code review on the Reader Experience Team (REx Sprint 17 [Q3 Mar 24 - Apr 3]) board. Mar 24 2026, 5:56 PM

Jdrewniak removed aude as the assignee of this task. Mar 25 2026, 6:12 PM

Change #1262251 had a related patch set uploaded (by Aude; author: Aude):

[mediawiki/extensions/ReadingLists@master] Split bloom filter cache-related code to BookmarkBloomFilterCache

https://gerrit.wikimedia.org/r/1262251

matmarex subscribed. Mar 31 2026, 9:57 PM

Change #1248891 merged by jenkins-bot:

[mediawiki/vendor@master] Add pleonasm/bloom-filter v1.0.4 for ReadingLists

https://gerrit.wikimedia.org/r/1248891

ReleaseTaggerBot added a project: MW-1.46-notes (1.46.0-wmf.23; 2026-04-07). Apr 1 2026, 7:00 PM