Resolving the block level rendering of <label> #233

Open
Rajdeep0511 wants to merge 8 commits into edgi-govdata-archiving:main from Rajdeep0511:main

Conversation

@Rajdeep0511

Add 'label' to block level tags in HTML rendering

Currently the code treats `<a>` as block-like because it can contain block elements. The problem is that `<label>` behaves similarly in many real-world pages (it is often styled as `display: block`), so diff markup on a label becomes confusing.

What needs to change:
Add `label` to the same condition where `<a>` is treated as block-like, and update `is_block`.

## Result
Now `<ins>` and `<del>` will not wrap `<label>`, but instead appear inside or beside it, preventing confusing rendering such as a duplicated label like:
School or school district ZIP Code
appearing as if it were two separate labels.
Looking forward to contributing to the Wayback Machine project for @gsoc 2026.

Member

@Mr0grog Mr0grog left a comment

Hi @Rajdeep0511, thanks very much for working on this.

From the original issue:

We should also unify and abstract this idea of checking whether an element should break or be broken by change element. Having this code repeated is bad.

Can you please do that, so there isn’t a list of literal tag names repeated in the code?

Looking forward to contributing to the Wayback Machine project for @gsoc 2026.

Just to be clear, while this code is used by the Wayback Machine, EDGI (this organization) is not a part of the Internet Archive and does not maintain the Wayback Machine. If you want to do GSoC work with the Internet Archive on the Wayback Machine, you should get in touch with them directly.


Side note: if you could use a more descriptive title, that would be helpful. Also, if you put “fixes #201” (or whatever issue number) in your PR description, GitHub will pick it up automatically and link this PR to the issue.

@Rajdeep0511
Author

1. Added a change-boundary tag set that includes 'a' and 'label', so the tag names are no longer repeated.
2. Added a helper function to centralize the logic and avoid repeating tag names throughout the code.

Thanks for the clarification regarding GSoC.
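
For readers following along, the two changes described above could look roughly like the sketch below. The names `CHANGE_BOUNDARY_TAGS` and `is_change_boundary` are hypothetical and chosen only for illustration; the identifiers actually used in this PR may differ.

```python
# Hypothetical sketch: these names are illustrative, not taken from the PR.

# Tags that a change element (<ins>/<del>) should not wrap. Change markup
# goes inside or beside these elements instead.
CHANGE_BOUNDARY_TAGS = frozenset({'a', 'label'})


def is_change_boundary(tag_name):
    """Return True if change elements should not wrap an element with this tag."""
    return tag_name.lower() in CHANGE_BOUNDARY_TAGS
```

With a single set and a single helper, supporting another tag later means editing one line rather than hunting for every literal tag name in the code.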

@Rajdeep0511 Rajdeep0511 changed the title fixes #201 Resolving the block level redering of <label> Mar 14, 2026
@Rajdeep0511 Rajdeep0511 changed the title Resolving the block level redering of <label> Resolving the block level rendering of <label> Mar 14, 2026
@Rajdeep0511 Rajdeep0511 requested a review from Mr0grog March 15, 2026 05:39
@Rajdeep0511
Author

A successful merge would help me formally submit my proposal for GSoC.

Change current_content from None to a list to store tag content.
Added whitespace normalization function and updated equality checks.
Refactor whitespace handling and improve DiffToken class structure
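
The commit messages above mention a whitespace normalization function and updated equality checks. A minimal sketch of that idea, assuming runs of whitespace should compare as a single space, might look like the following. The names `normalize_whitespace` and `WhitespaceInsensitiveToken` are hypothetical and do not describe the real `DiffToken` class.

```python
import re

# Matches any run of whitespace (spaces, tabs, newlines).
_WHITESPACE_RUN = re.compile(r'\s+')


def normalize_whitespace(text):
    """Collapse whitespace runs to one space and trim both ends."""
    return _WHITESPACE_RUN.sub(' ', text).strip()


class WhitespaceInsensitiveToken:
    """Hypothetical token whose equality ignores whitespace differences."""

    def __init__(self, text):
        self.text = text

    def __eq__(self, other):
        if not isinstance(other, WhitespaceInsensitiveToken):
            return NotImplemented
        return normalize_whitespace(self.text) == normalize_whitespace(other.text)

    def __hash__(self):
        # Keep hashing consistent with the whitespace-insensitive equality.
        return hash(normalize_whitespace(self.text))
```

Defining `__hash__` alongside `__eq__` keeps tokens that compare equal usable as dictionary keys or set members.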
@Rajdeep0511
Author

Please review my PRs

@Mr0grog
Member

Mr0grog commented Mar 16, 2026

Please review my PRs

Hi @Rajdeep0511, please do not post comments like this unless it has been quite a while (maybe a week, at least). Reviews take time, and you may have noticed that several people have submitted new work recently. I cannot get to every PR right away (especially since I am working on shipping v0.2.0 right now), but will get to yours as soon as I am able. Comments also take time to sort through and respond to, and comments like this take up time that I could have spent reviewing PRs instead.

2 participants