feat: Add media content parsing for images #158

Open
DoomHammer wants to merge 1 commit into custom-components:master from DoomHammer:add-media-content-parsing

Conversation

@DoomHammer DoomHammer commented Mar 25, 2026

Add media_content parsing for images. Useful for Mastodon feeds.

@DoomHammer DoomHammer force-pushed the add-media-content-parsing branch from aee4864 to 5eab987 on March 25, 2026 16:15
Collaborator

@ogajduse ogajduse left a comment

Thank you for your patch. I left one blocking comment. I'd also ask you to cover this behavior with one or more tests.

if images:
# pick the first image found
return images[0]["url"]
elif "summary" in feed_entry:
Collaborator

The summary branch should still run when there is no media_content. As written, the summary fallback is also skipped when media_content exists but contains no images.

Suggested change
- elif "summary" in feed_entry:
+ if "summary" in feed_entry:
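
To illustrate the suggestion, here is a minimal sketch of the control flow with an independent `if` for the summary fallback. The function name `pick_image`, the placeholder `DEFAULT_THUMBNAIL` value, and the regex-based `<img>` lookup are assumptions made for this example; they are not the integration's actual `_process_image` code.

```python
import re

# Placeholder value for illustration only.
DEFAULT_THUMBNAIL = "https://example.invalid/default.png"


def pick_image(feed_entry: dict) -> str:
    """Hypothetical stand-in for _process_image, showing only the fall-through."""
    if feed_entry.get("media_content"):
        images = [
            enc
            for enc in feed_entry["media_content"]
            if enc["type"].startswith("image/")
        ]
        if images:
            # pick the first image found
            return images[0]["url"]
    # Independent `if`: still reached when media_content held no images.
    if "summary" in feed_entry:
        match = re.search(r'<img[^>]+src="([^"]+)"', feed_entry["summary"])
        if match:
            return match.group(1)
    return DEFAULT_THUMBNAIL
```

With `elif` in place of the second `if`, an entry that carries only a video in media_content would never reach the summary lookup.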


Copilot AI left a comment

Pull request overview

Adds support for parsing images from media_content entries in RSS/Atom items, improving compatibility with feeds like Mastodon that publish images via media extensions.

Changes:

  • Extend _process_image to look for image/* items in feed_entry["media_content"].
  • Return the first matching media_content image URL when present.

Comment on lines +264 to +266
images = [
enc for enc in feed_entry["media_content"] if enc['type'].startswith("image/")
]
Copilot AI Mar 30, 2026

Filtering media_content items with enc['type'].startswith("image/") can raise KeyError (missing type) or AttributeError (type is None). Using a safe accessor (e.g., enc.get("type", "")) and matching the enclosures style (enc.type) would make this more robust for varying feedparser outputs.
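
A minimal sketch of a tolerant filter, assuming the items are dict-like (as the PR's `enc["type"]` access implies). The helper name `image_items` is hypothetical. Note that `enc.get("type", "")` on its own still returns `None` when the key is present with a `None` value, so the sketch uses `or ""` to cover both cases.

```python
def image_items(media_content):
    """Return items whose "type" looks like an image MIME type.

    Assumes dict-like items; a missing or None "type" is treated as non-image.
    """
    return [
        item
        for item in media_content or []
        if (item.get("type") or "").startswith("image/")
    ]
```

Items without a usable type are skipped rather than raising, which leaves the later fallbacks free to run.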

Collaborator

I consider this a minor comment.

Comment on lines +263 to +269
if feed_entry.get("media_content"):
images = [
enc for enc in feed_entry["media_content"] if enc['type'].startswith("image/")
]
if images:
# pick the first image found
return images[0]["url"]
Copilot AI Mar 30, 2026

This change introduces a new image source (media_content) but the test suite doesn't include any fixture/feed data that exercises media_content image parsing. Adding a representative feed sample (e.g., Mastodon) and asserting the selected image URL would prevent regressions and validate the new behavior.
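
One possible shape for such a test, sketched with the standard library only. The sample item, the helper names `media_content_from_item` and `first_image_url`, and the example URLs are all assumptions; a real test in this repository would more likely feed a fixture through the project's own parsing path and call `_process_image`.

```python
import xml.etree.ElementTree as ET

MEDIA_NS = "http://search.yahoo.com/mrss/"  # Media RSS namespace

# Hypothetical item in the style of a feed that publishes media:content.
SAMPLE_ITEM = """\
<item xmlns:media="http://search.yahoo.com/mrss/">
  <title>Example post</title>
  <media:content url="https://example.invalid/clip.mp4" type="video/mp4" medium="video"/>
  <media:content url="https://example.invalid/photo.png" type="image/png" medium="image"/>
</item>
"""


def media_content_from_item(xml_text):
    """Build a list of attribute dicts, roughly the shape the PR code reads."""
    item = ET.fromstring(xml_text)
    return [dict(el.attrib) for el in item.iter(f"{{{MEDIA_NS}}}content")]


def first_image_url(media_content):
    """Return the URL of the first image/* item, or None if there is none."""
    for enc in media_content:
        if (enc.get("type") or "").startswith("image/"):
            return enc.get("url")
    return None


def test_first_image_is_selected():
    media_content = media_content_from_item(SAMPLE_ITEM)
    assert first_image_url(media_content) == "https://example.invalid/photo.png"
```

Putting a non-image item first in the sample checks that the filter, not list position alone, decides which URL is returned.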

Comment on lines +265 to 270
enc for enc in feed_entry["media_content"] if enc['type'].startswith("image/")
]
if images:
# pick the first image found
return images[0]["url"]
elif "summary" in feed_entry:
Copilot AI Mar 30, 2026

The summary fallback is skipped when media_content exists but contains no image items because this uses elif "summary" in feed_entry after the media_content block. This can cause DEFAULT_THUMBNAIL to be returned even when an <img> exists in the summary. Consider making the summary check an independent if (or otherwise falling through) so it runs when media_content has no usable images.

Suggested change
- enc for enc in feed_entry["media_content"] if enc['type'].startswith("image/")
- ]
- if images:
- # pick the first image found
- return images[0]["url"]
- elif "summary" in feed_entry:
+ enc for enc in feed_entry["media_content"] if enc["type"].startswith("image/")
+ ]
+ if images:
+ # pick the first image found
+ return images[0]["url"]
+ if "summary" in feed_entry:


3 participants