Add proposal for non-skipping generators #62

152 changes: 152 additions & 0 deletions eeps/eep-0070.md
@@ -0,0 +1,152 @@
Author: Dániel Szoboszlay <[email protected]>
Status: Draft
Type: Standards Track
Created: 01-Jul-2024
Erlang-Version: 28
Post-History:
****
EEP 70: Non-skipping generators
----

Abstract
========

This EEP proposes the addition of a new, *non-skipping* variant of all
existing generators (list, bit string and map). Existing generators are
*skipping*: they ignore terms in the right-hand side expression that do
not match the left-hand side pattern. Non-skipping generators, on the
other hand, fail with a `badmatch` exception.

Rationale
=========

The motivation for non-skipping generators is that skipping generators
can hide the presence of unexpected elements in the input data of a
comprehension. For example, consider the snippet below:

    [{User, Email} || #{user := User, email := Email} <- all_users()]

This list comprehension skips users that don't have an email address.
This may be an issue if the input data could be incorrect, for example
if `all_users/0` reads the users from a JSON file. Cautious code that
prefers crashing to silently skipping incorrect input therefore has to
use a more verbose map function:

    lists:map(fun(#{user := User, email := Email}) -> {User, Email} end,
              all_users())

Unlike the generator, the anonymous function would crash on a user
without an email address. Non-skipping generators would allow similar
semantics in comprehensions too:

    [{User, Email} || #{user := User, email := Email} <-:- all_users()]

This generator would crash (with a `badmatch` error) if the pattern
did not match an element of the list.
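
The difference between the two existing approaches can be observed on
current releases without the proposed syntax. The following is a minimal
sketch: the module name `skip_demo` and the sample users are made up for
illustration, and the error is caught only to make the outcome visible.
Note that the anonymous function fails with `function_clause` rather
than `badmatch`.

```erlang
-module(skip_demo).
-export([all_users/0, skipping/0, crashing/0]).

%% Hypothetical input data: the second user has no email key.
all_users() ->
    [#{user => alice, email => <<"alice@example.com">>},
     #{user => bob}].

%% Existing (skipping) generator: bob is silently dropped.
skipping() ->
    [{User, Email} || #{user := User, email := Email} <- all_users()].

%% lists:map/2 with the same pattern in the fun head: the call fails
%% when it reaches bob. The error is caught here only for demonstration.
crashing() ->
    try
        lists:map(fun(#{user := User, email := Email}) -> {User, Email} end,
                  all_users())
    catch
        error:Reason -> {crashed, Reason}
    end.
```

Here `skip_demo:skipping()` returns a list with only the tuple for
`alice`, while `skip_demo:crashing()` returns
`{crashed, function_clause}`.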

The proposed operators for non-skipping generators are `<-:-` (for lists
and maps) and `<=:=` (for bit strings) instead of `<-` and `<=`. This
syntax was chosen because `<-:-` and `<=:=` resemble the `=:=` operator
that tests whether two terms match.

Alternate Designs
=================

There are numerous other ways of achieving the goal of crashing upon
encountering non-matching input:

1. Converting the comprehension to a `map` call, just as in the example
   of the previous section.

2. Moving the pattern matching that may fail to the result expression of
   the comprehension:

        [begin
             #{user := User, email := Email} = Usr,
             {User, Email}
         end
         || Usr <- all_users()]

3. Moving the pattern matching that may fail to a filter in the
   comprehension (which would still have to evaluate to a boolean,
   making this solution quite awkward looking):

        [{User, Email}
         || Usr <- all_users(),
            (#{user := User, email := Email} = Usr) > 0]

4. The same as before, but using binders proposed in [EEP 12][]:

        [{User, Email}
         || Usr <- all_users(),
            #{user := User, email := Email} = Usr]
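
Alternative 2 can be tried on existing releases. The sketch below wraps
it in a function that takes the user list as an argument; the module
name `alt_demo` and the function name `result_match/1` are hypothetical,
and the `try ... catch` is there only to show which error is raised.

```erlang
-module(alt_demo).
-export([result_match/1]).

%% Alternative 2: the match that may fail is moved into the result
%% expression, so a non-matching element raises {badmatch, Element}.
result_match(Users) ->
    try
        [begin
             #{user := User, email := Email} = Usr,
             {User, Email}
         end
         || Usr <- Users]
    catch
        error:{badmatch, Bad} -> {crashed, Bad}
    end.
```

Calling `alt_demo:result_match([#{user => bob}])` returns
`{crashed, #{user => bob}}`, since the map has no `email` key.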

Most of these solutions are much more verbose than the non-skipping
generator syntax, undermining the compactness of comprehensions. But
there are more serious issues too:

* Option 1 would not work for bit string comprehensions, because there
  isn't a `map` function available for bit strings.

* Option 2 would make it impossible to use sub-terms of the pattern
  (`User` and `Email` in this example) in subsequent generators or
  filters.

* The intent of the code isn't clear in most of these solutions
  (definitely not in option 3, and arguably not in options 2 and 4
  either).

* None of these techniques solves the problem of trailing non-matching
  bits at the end of a bit string.

The issue with bit string generators is that, unlike lists and maps, bit
strings don't have natural "elements" the generator could iterate over.
Instead, the pattern in the generator dictates where to split the bit
string into parts. This can lead to situations where the bit string ends
with some bits that don't match the pattern, like in this example:

    [X || <<X:16>> <= <<1,2,3>>]

The existing generator skips these last non-matching bits, and this
behaviour cannot be changed for backward compatibility reasons. The only
way to guarantee that a bit string generator fully consumes its input is
to introduce a new type of bit string generator, or some other new
syntax that alters the behaviour of the existing generator.
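
The skipping of trailing bits can be reproduced on existing releases
with the `<=` generator. For comparison, the sketch below also includes
a hand-written recursive function that insists on consuming the whole
input. The module name `bits_demo`, the function names, and the choice
to raise `{badmatch, Rest}` for the leftover bits are assumptions made
for illustration, not behaviour specified by this proposal.

```erlang
-module(bits_demo).
-export([skipping/1, strict/1]).

%% Existing bit string generator: trailing bits that do not fill a
%% whole 16-bit segment are silently dropped.
skipping(Bin) ->
    [X || <<X:16>> <= Bin].

%% Hand-written equivalent that refuses leftover bits. The error term
%% {badmatch, Rest} is an assumption chosen for this sketch.
strict(<<X:16, Rest/bitstring>>) -> [X | strict(Rest)];
strict(<<>>) -> [];
strict(Rest) -> error({badmatch, Rest}).
```

With these definitions, `bits_demo:skipping(<<1,2,3>>)` returns `[258]`
and ignores the final byte, whereas `bits_demo:strict(<<1,2,3>>)` fails
with `{badmatch, <<3>>}`.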

Reference Implementation
========================

Current implementation: [PR #8625][].

Backward Compatibility
======================

There are no backward compatibility issues with the proposed new syntax,
since the `<-:-` and `<=:=` operators were invalid syntax in previous
versions of Erlang.

Furthermore, non-skipping generators can be added as an experimental
feature that has to be explicitly enabled:

    -feature(non_skipping_generators, enable).

[PR #8625]: https://github.com/erlang/otp/pull/8625
"Reference implementation PR"

[EEP 12]: eep-0012.md
"Extensions to comprehensions, O'Keefe"

Copyright
=========

This document is placed in the public domain or under the CC0-1.0-Universal
license, whichever is more permissive.

[EmacsVar]: <> "Local Variables:"
[EmacsVar]: <> "mode: indented-text"
[EmacsVar]: <> "indent-tabs-mode: nil"
[EmacsVar]: <> "sentence-end-double-space: t"
[EmacsVar]: <> "fill-column: 70"
[EmacsVar]: <> "coding: utf-8"
[EmacsVar]: <> "End:"
[VimVar]: <> " vim: set fileencoding=utf-8 expandtab shiftwidth=4 softtabstop=4: "