Fix PubSub.get_feed() exposing private reading log entries by michelleallen122 · Pull Request #12209 · internetarchive/openlibrary

michelleallen122 · 2026-03-27T22:10:10Z

The fix filters usernames against each publisher's public_readlog preference (notifications.public_readlog == 'yes') before querying, matching the same check used in Bookshelves.patrons_who_also_read().

Fixes #12196

Closes #12196

Technical

get_feed() was querying bookshelves_books for all followed users without checking whether each publisher had their reading log set to public. This exposed reading activity for patrons who had set their log to private.

Testing

Tests added to openlibrary/tests/core/test_follows.py covering:

- private publisher excluded from feed
- public publisher included
- all private returns empty feed
- no subscriptions returns empty feed
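The four cases listed above can be sketched against a stand-alone version of the filter. This is a minimal illustration, not the code in openlibrary/core/follows.py: the helper names `filter_public_publishers` and `get_feed`, the plain-dict preferences, and the list-based reading log are all assumptions made so the example runs without a database. The top-level `public_readlog` key follows the review comments later in this thread, and a missing preference is assumed to mean private.

```python
def filter_public_publishers(usernames, prefs_by_user):
    """Keep only usernames whose reading log is marked public.

    `prefs_by_user` is a hypothetical mapping of username -> preferences dict.
    A missing user or missing key is treated as private (an assumption).
    """
    return [
        u
        for u in usernames
        if prefs_by_user.get(u, {}).get("public_readlog", "no") == "yes"
    ]


def get_feed(subscriptions, prefs_by_user, reading_log):
    """Return reading-log rows only for followed users with public logs."""
    public = set(filter_public_publishers(subscriptions, prefs_by_user))
    if not public:
        # No subscriptions, or every publisher is private: empty feed.
        return []
    return [row for row in reading_log if row["username"] in public]


# Illustrative data only.
prefs = {
    "alice": {"public_readlog": "yes"},
    "bob": {"public_readlog": "no"},
}
log = [
    {"username": "alice", "work_id": 1},
    {"username": "bob", "work_id": 2},
]
```

Filtering before the query, rather than after, means rows for private publishers are never fetched in the first place.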

Screenshot

Stakeholders

get_feed() was querying bookshelves_books for all followed users without checking whether each publisher had their reading log set to public. This exposed reading activity for patrons who had set their log to private.

Fix filters usernames against each publisher's public_readlog preference (notifications.public_readlog == 'yes') before querying, matching the same check used in Bookshelves.patrons_who_also_read().

Tests added to openlibrary/tests/core/test_follows.py covering:
- private publisher excluded from feed
- public publisher included
- all private returns empty feed
- no subscriptions returns empty feed

Fixes internetarchive#12196

mekarpeles · 2026-03-30T20:08:07Z

+            return []
+
+        # Filter to only publishers with public reading logs
+        prefs = web.ctx.site.get_many([f'/people/{u}/preferences' for u in usernames])


Preferences is the old way we do this; we'll want to use the user.preferences function that uses the web.ctx.store under the hood

Copilot

Pull request overview

Fixes a privacy leak in PubSub.get_feed() by ensuring only publishers with a public reading log are included when generating a subscriber’s feed.

Changes:

- Add a public-reading-log preference filter step before querying bookshelves_books in PubSub.get_feed().
- Add new unit tests covering inclusion/exclusion behavior for public vs private publishers and empty-feed cases.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| `openlibrary/core/follows.py` | Filters followed publishers by reading-log visibility before fetching feed entries. |
| `openlibrary/tests/core/test_follows.py` | Adds tests for `PubSub.get_feed()` privacy filtering and empty-feed behavior. |

Copilot · 2026-03-30T20:13:48Z

+        usernames = [
+            p.key.split('/')[2]
+            for p in prefs
+            if p.dict().get('notifications', {}).get('public_readlog') == 'yes'


The preference field lookup appears to be incorrect: user preferences store public_readlog at the top level (see openlibrary/core/models.py:949-987 and openlibrary/templates/account/privacy.html:18), not under notifications.public_readlog. As written, this filter will likely exclude all publishers and make the feed empty (and won’t actually enforce the intended privacy rule). Use p.dict().get('public_readlog', 'no') == 'yes' (or equivalent) when filtering publishers.

Suggested change

-    if p.dict().get('notifications', {}).get('public_readlog') == 'yes'
+    if p.dict().get('public_readlog', 'no') == 'yes'
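To see why the nested lookup would empty the feed, here is a small sketch comparing both lookups on a preferences dict with the top-level shape the review describes. The function names are hypothetical and the dict is illustrative; only the two expressions come from the suggestion above.

```python
def nested_lookup(prefs: dict) -> bool:
    """The lookup as originally written: expects notifications.public_readlog."""
    return prefs.get("notifications", {}).get("public_readlog") == "yes"


def top_level_lookup(prefs: dict) -> bool:
    """The suggested lookup: public_readlog as a top-level key, default 'no'."""
    return prefs.get("public_readlog", "no") == "yes"


# Assumed stored shape, per the review comment: a top-level key.
stored = {"public_readlog": "yes"}

is_public_nested = nested_lookup(stored)        # key not found under "notifications"
is_public_top_level = top_level_lookup(stored)  # key found at the top level
```

With the nested lookup, a patron who opted in to a public log is still filtered out, so every publisher is dropped regardless of their setting.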

Copilot · 2026-03-30T20:13:49Z

+    @classmethod
+    def setup_class(cls):
+        web.config.db_parameters = {"dbn": "sqlite", "db": ":memory:"}
+        get_db().query(READING_LOG_DDL)
+


This test creates bookshelves_books with a custom DDL, but openlibrary.core.db.get_db() is globally cached (openlibrary/core/db.py:_get_db uses functools.cache). Other test modules also create bookshelves_books with a different schema (e.g. openlibrary/tests/core/test_db.py), which can cause "table already exists" errors or missing-column failures depending on collection order. Consider clearing the DB cache in setup/teardown (e.g. _get_db.cache_clear()), and/or using a setup which won’t collide with existing table definitions.
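The collision described here can be reproduced in isolation. The sketch below uses a hypothetical `get_conn()` wrapped in `functools.cache` as a stand-in for the cached DB accessor; it is not Open Library's `_get_db`, and the two table schemas are invented for illustration.

```python
import functools
import sqlite3


@functools.cache
def get_conn() -> sqlite3.Connection:
    """Stand-in for a globally cached DB handle (hypothetical)."""
    return sqlite3.connect(":memory:")


def create_table_v1() -> None:
    # Schema one test module might use.
    get_conn().execute(
        "CREATE TABLE bookshelves_books (username TEXT, work_id INTEGER)"
    )


def create_table_v2() -> None:
    # A different schema another test module might use.
    get_conn().execute("CREATE TABLE bookshelves_books (username TEXT)")


create_table_v1()
try:
    # Same cached connection, so the table from v1 is still there.
    create_table_v2()
    collided = False
except sqlite3.OperationalError:
    collided = True

# Clearing the cache makes the next get_conn() call open a fresh
# in-memory database, so the second schema can be created cleanly.
get_conn.cache_clear()
create_table_v2()
```

Calling `cache_clear()` in both setup and teardown keeps one module's tables from leaking into the next, whichever order the modules are collected in.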

Copilot · 2026-03-30T20:13:49Z

+    """Build a mock preferences object like web.ctx.site.get_many returns."""
+    p = MagicMock()
+    p.key = f"/people/{username}/preferences"
+    p.dict.return_value = {"notifications": {"public_readlog": "yes" if public else "no"}}


The mocked preferences shape here doesn’t match how OL stores preferences: public_readlog is a top-level key on the preferences dict (see openlibrary/templates/account/privacy.html and openlibrary/core/models.py:973+). With the current mock (notifications.public_readlog), this test can pass while production code is checking the wrong field, so it won’t catch the real regression. Update the mock (and expectations) to reflect the real stored structure.

Suggested change

-    p.dict.return_value = {"notifications": {"public_readlog": "yes" if public else "no"}}
+    # In production, `public_readlog` is stored as a top-level preference key.
+    p.dict.return_value = {"public_readlog": "yes" if public else "no"}
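The risk raised in this comment, a test passing because the mock mirrors the bug, can be shown with a short sketch. `make_pref`, `buggy_filter`, and `fixed_filter` are hypothetical helpers written for this illustration; the mock construction follows the lines quoted above.

```python
from unittest.mock import MagicMock


def make_pref(username: str, public: bool, nested: bool) -> MagicMock:
    """Build a mock preferences object in either the nested or top-level shape."""
    p = MagicMock()
    p.key = f"/people/{username}/preferences"
    value = "yes" if public else "no"
    if nested:
        p.dict.return_value = {"notifications": {"public_readlog": value}}
    else:
        p.dict.return_value = {"public_readlog": value}
    return p


def buggy_filter(prefs) -> list:
    """Filter using the nested lookup from the original patch."""
    return [
        p.key.split("/")[2]
        for p in prefs
        if p.dict().get("notifications", {}).get("public_readlog") == "yes"
    ]


def fixed_filter(prefs) -> list:
    """Filter using the top-level lookup suggested in review."""
    return [
        p.key.split("/")[2]
        for p in prefs
        if p.dict().get("public_readlog", "no") == "yes"
    ]


# The nested mock agrees with the buggy code, so a test built on it passes.
passes_with_nested_mock = buggy_filter([make_pref("alice", True, nested=True)])

# Against the shape the review says is stored, the same code drops everyone.
result_with_real_shape = buggy_filter([make_pref("alice", True, nested=False)])
```

A mock is only as trustworthy as its resemblance to the stored data, which is why the review asks for the mock and the production lookup to be corrected together.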

- Replace web.ctx.site.get_many(...preferences) with web.ctx.site.get() + user.preferences() per mekarpeles' feedback; this uses web.ctx.store under the hood, matching the pattern used elsewhere in the codebase (mybooks.py, fastapi/public_my_books.py)
- Fix public_readlog lookup: was incorrectly nested under 'notifications', it is a top-level key on the preferences dict
- Fix test mock to reflect the real stored shape {'public_readlog': 'yes'} instead of {'notifications': {'public_readlog': 'yes'}}
- Add _get_db.cache_clear() in setup/teardown to avoid DB cache collisions with other test modules that also create bookshelves_books

michelleallen122 · 2026-03-31T17:14:11Z

Hey @jimchamp, @mekarpeles, I've made updates to the PR in my latest commit.

jimchamp · 2026-03-31T18:36:02Z

@michelleallen122, these changes were not necessary (see comment). In the future, please wait until an issue is triaged by staff before working on a solution.

michelleallen122 and others added 2 commits March 29, 2026 07:14

[pre-commit.ci] auto fixes from pre-commit.com hooks

abc924e

for more information, see https://pre-commit.ci

michelleallen122 force-pushed the 12196/hotfix/fix-get-feed-private-readlog-privacy branch from 504b443 to abc924e Compare March 29, 2026 11:14

mekarpeles requested a review from jimchamp March 30, 2026 20:06

mekarpeles added the Priority: 1 Do this week, receiving emails, time sensitive, . [managed] label Mar 30, 2026

mekarpeles requested review from Copilot and removed request for jimchamp March 30, 2026 20:07

mekarpeles assigned jimchamp Mar 30, 2026

Copilot started reviewing on behalf of mekarpeles March 30, 2026 20:07 View session

mekarpeles requested changes Mar 30, 2026

Copilot AI reviewed Mar 30, 2026

michelleallen122 requested a review from mekarpeles March 31, 2026 17:15

jimchamp closed this Mar 31, 2026

michelleallen122 wants to merge 3 commits into internetarchive:master from michelleallen122:12196/hotfix/fix-get-feed-private-readlog-privacy

4 participants
