Resolving the block level rendering of <label> #233
Rajdeep0511 wants to merge 8 commits into edgi-govdata-archiving:main
Mr0grog left a comment
Hi @Rajdeep0511, thanks very much for working on this.
From the original issue:
> We should also unify and abstract this idea of checking whether an element should break or be broken by change element. Having this code repeated is bad.
Can you please do that, so there isn’t a list of literal tag names repeated in the code?
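One way to read that request, as a minimal sketch: keep the tag names in a single module-level set and route every check through one helper. The names `CHANGE_BOUNDARY_TAGS` and `is_change_boundary` are hypothetical, not identifiers from this repository, and the set holds only the tags discussed in this PR.

```python
# Hypothetical names for illustration; not the project's actual code.
# The single place where the tag names are listed.
CHANGE_BOUNDARY_TAGS = frozenset({"a", "label"})


def is_change_boundary(tag_name: str) -> bool:
    """Return True if change elements (<ins>/<del>) should be placed
    inside this element rather than wrapped around it."""
    return tag_name.lower() in CHANGE_BOUNDARY_TAGS
```

Every call site would then ask `is_change_boundary(tag)` instead of comparing against literal strings, so adding another tag later means touching only the set.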
> Looking forward to contributing to the Wayback Machine project https://github.com/gsoc 2026
Just to be clear, while this code is used by the Wayback Machine, EDGI (this organization) is not a part of the Internet Archive and does not maintain the Wayback Machine. If you want to do GSoC work with the Internet Archive on the Wayback Machine, you should get in touch with them directly.
Side note: if you could use a more descriptive title, that would be helpful. Also, if you put “fixes #201” (or whatever issue number) in your PR description, GitHub will pick it up automatically and link this PR to the issue.
… names throughout the code.
1. Added a change-boundary tag set including 'a' and 'label', so the tag names are no longer repeated.
A successful merge would help me formally submit my proposal to GSoC.
Change current_content from None to a list to store tag content.
Added whitespace normalization function and updated equality checks.
Refactor whitespace handling and improve DiffToken class structure
Please review my PRs
Hi @Rajdeep0511, please do not post comments like this unless it has been quite a while (maybe a week, at least). Reviews take time, and you may have noticed that several people have submitted new work recently. I cannot get to every PR right away (especially since I am working on shipping v0.2.0 right now), but will get to yours as soon as I am able. Comments also take time to sort through and respond to, and comments like this take time that I could have been reviewing PRs instead.
Add 'label' to block level tags in HTML rendering
Currently the code treats `<a>` as block-like because it can contain block elements. The problem is that `<label>` behaves similarly in many real-world pages (often styled as `display: block`), so diff markup inside a label becomes confusing.
What needs to change:
Add `label` to the same condition where `<a>` is treated as block-like, and update `is_block`.
## Result
Now `<ins>` and `<del>` will not wrap `<label>`, but instead appear inside or beside it, preventing confusing rendering such as a duplicated label, for example "School or school district ZIP Code" appearing as if it were two separate labels.
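To illustrate the difference in the resulting markup, here is a small sketch under stated assumptions. `BOUNDARY_TAGS` and `render_change` are hypothetical names, and the function only mimics the placement rule described above; it is not the project's rendering code.

```python
from html import escape

# Hypothetical set for illustration: tags that change elements go inside of.
BOUNDARY_TAGS = frozenset({"a", "label"})


def render_change(tag: str, text: str, change: str = "ins") -> str:
    """Render one changed element as an HTML string.

    For boundary tags the change element is nested inside the tag;
    for every other tag the change element wraps the whole tag.
    """
    body = escape(text)
    if tag in BOUNDARY_TAGS:
        return f"<{tag}><{change}>{body}</{change}></{tag}>"
    return f"<{change}><{tag}>{body}</{tag}></{change}>"
```

With this rule, a changed label renders as one `<label>` containing the `<ins>` or `<del>`, rather than as an inserted label next to a deleted one.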
Looking forward to contributing to the Wayback Machine project @gsoc 2026