Avoid unnecessary allocations while finding token matches in a file #73500
for (var i = startIndex; i <= length; i++)
return caseSensitive
    ? IndexOfCaseSensitive()
    : IndexOfCaseInsensitive();
I found this cleaner as two separate find helpers: one for the common C# case, which needs no char conversion and much less branching, and one for VB.
Neither allocates anymore. Only the VB one has some special, complex logic around case insensitivity.
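A minimal sketch of the two-helper split described above, not the code from this PR. It assumes a plain `string` as a stand-in for `SourceText`, uses `char.ToLowerInvariant` in place of `CaseInsensitiveComparison.ToLower`, and the class name `TextSearchSketch` is hypothetical. Both helpers compare character by character, so neither builds a normalized copy of the text or the search string.

```csharp
using System;

static class TextSearchSketch
{
    // Dispatch once, so the inner loops carry no caseSensitive branch.
    public static int IndexOf(string text, string searchString, int startIndex, bool caseSensitive)
        => caseSensitive
            ? IndexOfCaseSensitive(text, searchString, startIndex)
            : IndexOfCaseInsensitive(text, searchString, startIndex);

    // Common C# case: compare chars directly, no conversion at all.
    static int IndexOfCaseSensitive(string text, string searchString, int startIndex)
    {
        var lastStart = text.Length - searchString.Length;
        for (var i = startIndex; i <= lastStart; i++)
        {
            var match = true;
            for (var j = 0; j < searchString.Length; j++)
            {
                if (text[i + j] != searchString[j])
                {
                    match = false;
                    break;
                }
            }

            if (match)
                return i;
        }

        return -1;
    }

    // VB case: lower-case each char on the fly instead of allocating a lowered copy.
    static int IndexOfCaseInsensitive(string text, string searchString, int startIndex)
    {
        var lastStart = text.Length - searchString.Length;
        for (var i = startIndex; i <= lastStart; i++)
        {
            var match = true;
            for (var j = 0; j < searchString.Length; j++)
            {
                if (char.ToLowerInvariant(text[i + j]) != char.ToLowerInvariant(searchString[j]))
                {
                    match = false;
                    break;
                }
            }

            if (match)
                return i;
        }

        return -1;
    }
}
```

The sketch assumes `startIndex` is non-negative; a real implementation would validate its arguments.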
var match = true;
for (var j = 0; j < searchStringLength; j++)
{
    var matchChar = j == 0 ? normalizedFirstChar : CaseInsensitiveComparison.ToLower(searchString[j]);
Yes, but I didn't measure any problems with this, and I view allocations as much worse. Most code is ASCII, so we're going to fast-path all these ToLowers all the time.
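To illustrate what an ASCII fast path for lower-casing can look like, here is a hedged sketch. It is not the actual `CaseInsensitiveComparison.ToLower` implementation; the class name `AsciiLowerSketch` is hypothetical, and `char.ToLowerInvariant` stands in for whatever the slow path really does.

```csharp
using System;

static class AsciiLowerSketch
{
    public static char ToLower(char c)
    {
        if (c < 0x80)
        {
            // ASCII: only 'A'..'Z' change, and setting bit 0x20 maps them to 'a'..'z'.
            return (c >= 'A' && c <= 'Z') ? (char)(c | 0x20) : c;
        }

        // Non-ASCII: fall back to the general (slower) conversion.
        return char.ToLowerInvariant(c);
    }
}
```

For typical source text almost every call takes the first branch, which is one range check and at most one bitwise OR, with no allocation on either path.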
@@ -70,35 +70,64 @@ public static TextChangeRange GetEncompassingTextChangeRange(this SourceText new
    return TextChangeRange.Collapse(ranges);
}

- public static int IndexOf(this SourceText text, string value, int startIndex, bool caseSensitive)
+ public static int IndexOf(this SourceText text, string searchString, int startIndex, bool caseSensitive)
BTW, I wonder if something similar to what was done in https://devdiv.visualstudio.com/DevDiv/_git/VSUnitTesting/pullrequest/550572 might be useful in source text searching.
I'll leave that to you to implement :)
slacker! :) Did this method show up at all in the CPU side of your profile?
I'll check again. I think it's dominated by compiler time. Will do tomorrow!
//
// only one implementation we have that could have bad indexer perf is CompositeText with heavily modified text
// at compiler layer but I believe that being used in find all reference will be very rare if not none.
if (!Match(normalized[j], text[i + j], caseSensitive))
So once it's not spanable. It's a source text. :-(
makes sense, thanks!
Closing out. I haven't been able to see this again.
Saw these allocs while doing a trace that included FAR (Find All References). We have a fast path that says "the bloom filter found a hit in this file, and we know the identifier was not escaped in it". In that case, we do a textual search to find the spans to get as tokens, so we don't have to walk the entire tree looking for matches (we can instead dive right down to that span, realizing only the red nodes along that path).
However, the finding of text locations was unnecessarily allocating for each match it was looking for.
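As a rough illustration of the textual-search step, the sketch below collects every position where an identifier's text occurs, appending into a caller-supplied list so the search itself allocates nothing per match. It assumes a plain `string` instead of `SourceText`; the names `MatchPositionsSketch` and `AddMatchPositions` are hypothetical and do not come from the PR.

```csharp
using System;
using System.Collections.Generic;

static class MatchPositionsSketch
{
    // Appends the start index of each ordinal occurrence of 'identifier' in 'text'.
    public static void AddMatchPositions(string text, string identifier, List<int> positions)
    {
        if (identifier.Length == 0)
            return;

        var index = 0;
        while ((index = text.IndexOf(identifier, index, StringComparison.Ordinal)) >= 0)
        {
            positions.Add(index);

            // Resume after this hit; a match never extends past the end of the text.
            index += identifier.Length;
        }
    }
}
```

A textual hit is only a candidate: it may sit inside a comment, a string literal, or a longer identifier, so a caller would still need to look up the token at each position and check it before treating it as a match.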