Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add proposal for non-skipping generators #62

Open
wants to merge 3 commits into
base: master
Choose a base branch
from
Open
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
151 changes: 151 additions & 0 deletions eeps/eep-0070.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,151 @@
Author: Dániel Szoboszlay <[email protected]>
Status: Draft
Type: Standards Track
Created: 01-Jul-2024
Erlang-Version: 28
Post-History:
****
EEP 70: Strict and relaxed generators
----

Abstract
========

This EEP proposes the addition of a new, *strict* variant of all
existing generators (list, bit string and map). Currently existing
generators are *relaxed*: they ignore terms in the right-hand side
expression that do not match the left-hand side pattern. Strict
generators on the other hand shall fail with exception `badmatch`.

Rationale
=========

The motivation for strict generators is that relaxed generators
can hide the presence of unexpected elements in the input data of a
comprehension. For example consider the below snippet:

[{User, Email} || #{user := User, email := Email} <- all_users()]

This list comprehension would filter out users that don't have an email
address. This may be an issue if we suspect potentially incorrect input
data, like in case `all_users/0` would read the users from a JSON file.
Therefore cautious code that would prefer crashing instead of silently
filtering out incorrect input would have to use a more verbose map
function:

lists:map(fun(#{user := User, email := Email}) -> {User, Email} end,
all_users())

Unlike the generator, the anonymous function would crash on a user
without an email address. Strict generators would allow similar
semantics in comprehensions too:

[{User, Email} || #{user := User, email := Email} <:- all_users()]

This generator would crash (with a `badmatch` error) if the pattern
wouldn't match an element of the list.

The proposed operators for strict generators are `<:-` (for lists
and maps) and `<:=` (for bit strings) instead of `<-` and `<=`. This
syntax was chosen because `<:-` and `<:=` somewhat resemble the `=:=`
operator that tests whether two terms match, and at the same time keep
the operators short and easy to type. Having the two types of operators
differ by a single character, `:`, also makes the operators easy to
remember as "`:` means strict."

Alternate Designs
=================

There are numerous other ways of achieving the goal of crashing upon
encountering non-matching input:

1. Converting the comprehension to a `map` call, just as in the example
of the previous section.

2. Moving the pattern matching that may fail to the result expression of
the comprehension:

[begin
#{user:= User, email := Email} = Usr,
{User, Email}
end
|| Usr <- all_users()]

3. Moving the pattern matching that may fail to a filter in the
comprehension (which would still have to evaluate to a boolean,
making this solution quite awkward looking):

[{User, Email}
|| Usr <- all_users(),
(#{user := User, email := Email} = Usr) > 0]

4. The same as before, but using binders proposed in [EEP 12][]:

[{User, Email}
|| Usr <- all_users(),
#{user := User, email := Email} = Usr]

Most of these solutions are much more verbose than the strict
generator syntax, undermining the compactness of comprehensions. But
there are more serious issues too:

* 1 would not work for bit string comprehensions, because there isn't
a `map` function available for bit strings.

* 2 would make it impossible to use sub terms of the pattern (`Usr` in
this example) in subsequent generators or filters.

* The intent of the code isn't clear in most of these solutions
(definitely not in 3, and arguably in 4 and 2).

* None of these techniques can solve the problem of final non-matching
bits of a bit string.

The issue with bit string generators is that unlike lists and maps, bit
strings don't have natural "elements" the generator could iterate over.
Instead the pattern in the generator dictates where to split the bit
string into parts. This could inevitably lead to situations where the
bit string ends with some bits that don't match the pattern, like in
this example:

[X || <<X:16>> <- <<1,2,3>>]

The existing generator would skip these final non-matching bits, and
this behaviour cannot be changed due to backward compatibility reasons.
It is only possible to guarantee that a bit string generator would fully
consume its input by introducing a new type of bit string generator or
some other new syntax that would alter the behaviour of the existing
generator.

Reference Implementation
========================

Current implementation: [PR #8625][].

Backward Compatibility
======================

There are no backward compatibility issues with the proposed new syntax,
since the `<:-` and `<:=` operators used to be invalid syntax in
previous versions of Erlang.

[PR #8625]: https://github.com/erlang/otp/pull/8625
"Reference implementation PR"

[EEP 12]: eep-0012.md
"Extensions to comprehensions, O'Keefe"

Copyright
=========

This document is placed in the public domain or under the CC0-1.0-Universal
license, whichever is more permissive.

[EmacsVar]: <> "Local Variables:"
[EmacsVar]: <> "mode: indented-text"
[EmacsVar]: <> "indent-tabs-mode: nil"
[EmacsVar]: <> "sentence-end-double-space: t"
[EmacsVar]: <> "fill-column: 70"
[EmacsVar]: <> "coding: utf-8"
[EmacsVar]: <> "End:"
[VimVar]: <> " vim: set fileencoding=utf-8 expandtab shiftwidth=4 softtabstop=4: "