Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement Encode for Cow<'a, str> #27

Open
matteopolak opened this issue Jul 11, 2024 · 6 comments
Open

Implement Encode for Cow<'a, str> #27

matteopolak opened this issue Jul 11, 2024 · 6 comments

Comments

@matteopolak
Copy link
Contributor

Hey!

I'm trying to encode a Cow<'static, str> but I don't think it's implemented. Is this intentional?

Please let me know if I'm missing something or if you don't have time to add it -- I can submit a PR :)

Thanks

@finnbear
Copy link
Member

finnbear commented Jul 14, 2024

AFAIK, Cow<'_, str>: Encode is possible:

// src/derive/impls.rs
impl<T: Encode + ToOwned + ?Sized> Encode for ::alloc::borrow::Cow<'_, T> {
    type Encoder = DerefEncoder<T>;
}

The hard, error-prone part would be Cow<?, str>: Decode. Since Rust lacks lifetime specialization, we cannot implement both:

  • Cow<'static, str>: Decode which produces Cow::Owned copied from the input buffer
  • Cow<'a, str>: Decode which produces Cow::Borrowed borrowed from the 'a input buffer

So I'm wondering: Would Cow<'_, str>: Encode be useful without Cow<?, str>: Decode? You would have to use T: Decode where T is compatible with the encoded representation of Cow<'_, str>. One issue is that bitcode will probably never be able to specify what those representation(s) would be. There probably can never be a guarantee that String serializes the same as &str. Maybe there could be a guarantee that &str serializes the same as Cow<'_, str>.

@matteopolak
Copy link
Contributor Author

I think always opting for Cow::Owned upon Decode is acceptable, serde does the same by deserializing into a String first. They always encode Cow<'_, str> as a &str, so the repr of &str and String would need to be the same to do the same.

It looks like this is currently the case link (unless I'm reading it incorrectly). Could potentially add some tests for it, then if it's not feasible to keep the same repr it could be removed in some future breaking change?

@adamreichold
Copy link

serde does the same by deserializing into a String first.

Note that serde has special handling for Cow<str> and Cow<[u8]> via its borrow attribute, c.f. https://github.com/serde-rs/serde/blob/28a092261b1b47726c4c16fd00aa7de8df19785f/serde_derive/src/internals/attr.rs#L1260, i.e. it is able to deserialize Cow::Borrowed borrowing from the input.

@finnbear
Copy link
Member

finnbear commented Jul 14, 2024

I think always opting for Cow::Owned upon Decode is acceptable,

Maybe that's the best serde can do with its set of constraints.

It looks like this is currently the case

Yes. A possible reason for it to change would be if we needed one format for borrowing from the buffer (&str) but there was a more size-optimal format for copying (String). This would be the case for &[u16] vs. Vec<u16> because borrowing the former imposes an alignment requirement that copying into Vec<u16> doesn't. I realize &str doesn't impose alignment constraints, but we might have, for example, buffer layout constraints. We prioritize flexibility so we can freely optimize later.

Note that serde has special handling for Cow and Cow<[u8]> via its borrow attribute,

Yeah I'm vaguely aware of that. Maybe we could do something similar. But, if we're going to detect Cow, might as well detect non-'static lifetime too?

I doubt this issue can proceed without input from @caibear 👀

@adamreichold
Copy link

adamreichold commented Jul 14, 2024

But, if we're going to detect Cow, might as well detect non-'static lifetime too?

I think the main learning from serde is that these detections tend to have false positives and the generated code must fail safely in these cases, i.e getting confused about renamed lifetimes must not lead to invalid references.

@finnbear
Copy link
Member

finnbear commented Jul 14, 2024

tend to have false positives

I think there are three cases (for the declared lifetime of the Cow):

  • 'static (verbatim match, copy from buffer)
  • 'a: !'buffer (user error, must fail to compile e.g. due to a generated 'a: 'buffer bound)
  • 'a: 'buffer (borrow from buffer)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants