Fix JSON Feed parser: use correct key for date_updated and prefer content_html #558

Open
danishashko wants to merge 1 commit into kurtmckee:main from danishashko:fix-json-feed-date-updated-key

@danishashko

Two bugs in the JSON Feed parser:

1. KeyError when parsing date_updated

parse_entry() checks "date_updated" in e (correct per the JSON Feed spec) but then reads e["date_modified"] which doesn't exist in JSON Feed. Every entry with a date_updated field raises a KeyError.

# current (broken):
if "date_updated" in e:
    entry["updated"] = e["date_modified"]        # KeyError
    entry["updated_parsed"] = _parse_date(e["date_modified"])  # KeyError

# fixed:
if "date_updated" in e:
    entry["updated"] = e["date_updated"]
    entry["updated_parsed"] = _parse_date(e["date_updated"])
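
A minimal reproduction of the mismatch, as a sketch that does not import feedparser: the helper names `read_updated_broken` and `read_updated_fixed` and the sample item are hypothetical, and plain dicts stand in for the parser's entry objects. The failure only needs the membership test and the lookup to disagree on the key name.

```python
def read_updated_broken(e):
    """Mirror the mismatched check/read: tests one key, reads another."""
    entry = {}
    if "date_updated" in e:
        entry["updated"] = e["date_modified"]  # key was never checked
    return entry


def read_updated_fixed(e):
    """Read the same key that the membership test checked."""
    entry = {}
    if "date_updated" in e:
        entry["updated"] = e["date_updated"]
    return entry


# Hypothetical item that carries only the key the check looks for.
item = {"id": "1", "date_updated": "2024-01-02T03:04:05Z"}

try:
    read_updated_broken(item)
    broken_raised = False
except KeyError:
    broken_raised = True

fixed = read_updated_fixed(item)
```

A regression test for the real parser could follow the same shape: feed it an item with that single date key and assert that parsing does not raise.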

2. content_text silently wins over content_html

The current if/elif chain uses content_text whenever it is present, even if content_html is also present, so the HTML body is silently dropped. The JSON Feed 1.1 spec recommends HTML-capable clients prefer content_html. Swapping the branches gives richer content by default and matches the spec intent.
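
The swapped branch order can be sketched as follows. This is an illustration rather than the patched source: `pick_content` is a hypothetical helper, a plain dict stands in for the parser's content object, and the `"html"` / `"text"` type labels are assumptions.

```python
def pick_content(e):
    """Prefer content_html; fall back to content_text; None if neither."""
    if "content_html" in e:
        return {"value": e["content_html"], "type": "html"}
    if "content_text" in e:
        return {"value": e["content_text"], "type": "text"}
    return None


# Hypothetical items covering the three cases.
both = pick_content({"content_html": "<p>Hi</p>", "content_text": "Hi"})
text_only = pick_content({"content_text": "Hi"})
neither = pick_content({"id": "3"})
```

With this ordering, an item that supplies both fields yields the HTML value, while text-only items behave as before.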

Both bugs are in feedparser/parsers/json.py, no new imports needed.
