Conversation
Mr0grog
left a comment
There was a problem hiding this comment.
Hi @aaxis-em, thanks for working on this! I appreciate the effort. 🙇
However, I’m not sure about the approach here. From the original issue:
a good way to handle this is to give tokens a diffable text representation (in which non-breaking spaces are replaced by spaces) and a literal text representation (where these types of characters are unchanged). The former is used for comparisons, but the latter is used when stitching the actual diff back together.
This PR normalizes all types of spaces to just " " and will cause that to be the output in the diff, even when a different kind of space is used.
A better approach here is probably to customize the __eq__() method in DiffToken (and some of its subclasses) to use normalized text, while the html() method continues to return the original text:
web-monitoring-diff/web_monitoring_diff/html_render_diff.py
Lines 637 to 676 in b5d9795
A good example of this kind of thing is href_token, which uses a special comparator function in its __eq__() method rather than comparing the actual URL that it renders in html():
web-monitoring-diff/web_monitoring_diff/html_render_diff.py
Lines 710 to 738 in b5d9795
Also, please separate these changes from the OCI label changes you made as of #227, and please also remove the style changes — they make the diff here much harder to review.
880fc7e to
9e48cc3
Compare
|
Hi @aaxis-em, sorry I have not been able to get to this. I’m at a conference this week, and will try and take a look this weekend or early next week. |
fixes #198