@yads yads commented Oct 3, 2014

Before this change, the HTML

<p>Joe went to yahoo.com&nbsp;and google.com</p>

generated the following output:

<p>Joe went to <a href="http://yahoo.com&nbsp">yahoo.com&nbsp</a>;and <a href="http://google.com">google.com</a></p>

This change handles that case and also correctly handles & characters inside URLs (as part of the query string).

The approach is to split the HTML into parts at HTML character entities and process those parts separately. & characters are handled separately in the UrlMatch code, which ideally should know whether it is dealing with HTML or text, but currently doesn't.
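
For anyone reading along, here is a rough sketch of the splitting idea in plain JavaScript. It is not the code from this patch: `ENTITY_RE`, `URL_RE`, `linkText` and `linkAroundEntities` are hypothetical names, and the URL pattern is a deliberately simplified stand-in for the library's real matcher.

```javascript
// Hypothetical sketch only; names and patterns are assumptions, not the patch's code.

// Matches named (&nbsp;), decimal (&#160;) and hex (&#xA0;) character entities.
// The capturing group makes split() keep the entities in its result.
const ENTITY_RE = /(&(?:[a-zA-Z][a-zA-Z0-9]*|#[0-9]+|#[xX][0-9a-fA-F]+);)/;

// Simplified stand-in for URL matching: a bare .com/.org/.net domain
// with an optional path or query string.
const URL_RE = /\b(?:[a-z0-9-]+\.)+(?:com|org|net)\b(?:[/?][^\s<]*)?/gi;

// Wraps every URL found in a plain text fragment in an anchor tag.
function linkText(text) {
  return text.replace(URL_RE, (url) => `<a href="http://${url}">${url}</a>`);
}

// Splits the text at character entities, links only the non-entity parts,
// and joins everything back together with the entities left untouched.
function linkAroundEntities(text) {
  return text
    .split(ENTITY_RE)
    .map((part, i) => (i % 2 === 1 ? part : linkText(part)))
    .join('');
}

console.log(linkAroundEntities('Joe went to yahoo.com&nbsp;and google.com'));
```

Because the entity is split out before any matching happens, `&nbsp;` can no longer be absorbed into the preceding URL. A bare & in a query string has no terminating semicolon pattern, so it is not treated as an entity and stays inside the URL. One limitation of this sketch: an `&amp;` inside a URL would also be split out, which is the kind of case the separate & handling in UrlMatch has to cover.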

Vadim Kazakov added 2 commits October 3, 2014 10:08
any html character entities following a url are now not being treated like part of the query string
gregjacobs added a commit that referenced this pull request Oct 5, 2014
Correctly handling html character entities that follow a url
@gregjacobs gregjacobs merged commit 8a38854 into gregjacobs:master Oct 5, 2014
@gregjacobs
Owner

Nicely done. That's been a long-standing issue.

One quick tip on English, though: "it's" is the contraction of "it is". I saw you removed the apostrophe in some of my comments ;)

@yads
Author

yads commented Oct 5, 2014

Oh jeez, that was definitely not intentional. My editor decided to remove single quotes from where I was editing. I'll fix that up tomorrow and push it in, unless you don't mind fixing it up yourself.

@gregjacobs
Owner

Haha, no worries, already fixed :) Thanks for the patch!
