String offsets should always be in bytes #1849

@marijnh

Description

Things like str::index and str::find currently return a character offset, and str::slice takes character offsets. Since indexing by character is O(n) on a UTF-8 string, this will fall down horribly in any code that needs to be fast.

I think the way to fix this is to always use byte offsets. This complicates the mental model of our strings a bit and is not safe (you can pass slice a bogus offset that lands in the middle of a multibyte character), but it certainly beats O(n) access.

(If we are that concerned about safety, we should not be representing strings as UTF-8 byte sequences.)
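To make the trade-off concrete, here is a minimal sketch written against present-day Rust's standard library, not the str::index / str::find / str::slice API this issue refers to. The helper names (`byte_offset_of`, `char_offset_of`, `char_to_byte`, `slice_by_chars`, `slice_by_bytes`) are hypothetical and exist only for illustration. It contrasts an offset that can be used directly for slicing with one that has to be translated by walking the string, and shows what happens with a mid-character byte offset.

```rust
/// Byte offset of the first occurrence of `needle`.
/// The search is linear, but the result can be used for slicing directly.
fn byte_offset_of(haystack: &str, needle: char) -> Option<usize> {
    haystack.find(needle)
}

/// Character offset of the first occurrence of `needle`.
/// Using this for slicing later requires another walk over the string.
fn char_offset_of(haystack: &str, needle: char) -> Option<usize> {
    haystack.chars().position(|c| c == needle)
}

/// Translate a character offset into a byte offset by walking the string: O(n).
/// An offset equal to the character count maps to `s.len()` (one past the end).
fn char_to_byte(s: &str, char_idx: usize) -> Option<usize> {
    s.char_indices()
        .map(|(byte_idx, _)| byte_idx)
        .chain(std::iter::once(s.len()))
        .nth(char_idx)
}

/// Slice by character offsets: two O(n) translations, then a byte slice.
fn slice_by_chars(s: &str, start: usize, end: usize) -> Option<&str> {
    let b = char_to_byte(s, start)?;
    let e = char_to_byte(s, end)?;
    s.get(b..e)
}

/// Slice by byte offsets: only bounds and char-boundary checks.
/// Returns None for an offset in the middle of a multibyte character.
fn slice_by_bytes(s: &str, start: usize, end: usize) -> Option<&str> {
    s.get(start..end)
}
```

With `s = "héllo wörld"`, the character `w` sits at byte offset 7 but character offset 6, because `é` takes two bytes. `slice_by_bytes(s, 2, 5)` returns `None`, since byte 2 falls inside `é`; this is the "bogus middle-of-multibyte-char offset" case, handled here by a cheap boundary check rather than by an O(n) character count.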

Labels

I-slow (Issue: Problems and improvements with respect to performance of generated code.)
