Design for an efficient mechanism for I/O reads with Polyphony #115
noteflakes started this conversation in General
A problem with I/O read operations in Ruby is that the stock API uses `String`s as buffers. Strings are not ideal: they need to be allocated, and reusing them requires building some mechanism for managing their use. Another problem is that concatenating strings is in many situations a costly operation.
The problem is accentuated when one needs to read lines, for example in an HTTP/1 parser, especially when dealing with a slow HTTP client. Creating a string in order to read into it, then concatenating it to a previously read string, and later cutting it into smaller strings represents a duplication of effort, both in terms of memory allocations and CPU usage.
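
To make the cost concrete, here is a minimal sketch of the read, concatenate, and cut pattern using only the stock `IO#readpartial` API. The class name `NaiveLineReader` and the chunk size are assumptions made for illustration; this is not code from Polyphony.

```ruby
# Naive line reader built on the stock IO API (illustration only).
# NaiveLineReader is a hypothetical name, not part of Polyphony.
class NaiveLineReader
  CHUNK_SIZE = 4096

  def initialize(io)
    @io = io
    @pending = +'' # data read so far that does not yet form a full line
  end

  # Returns the next line without its trailing newline, or nil on EOF.
  def read_line
    until (idx = @pending.index("\n"))
      begin
        chunk = @io.readpartial(CHUNK_SIZE) # a new String for every read
      rescue EOFError
        return nil if @pending.empty?

        rest = @pending
        @pending = +''
        return rest
      end
      @pending << chunk # copies the chunk into the pending string
    end
    # slice! allocates the line; chomp allocates a second copy without "\n".
    @pending.slice!(0, idx + 1).chomp
  end
end
```

With a slow client that delivers a line in several small pieces, every piece goes through the allocate and copy steps before a single line can be returned.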
This design document describes an alternative way of performing I/O read/recv operations, which will be integrated into Polyphony and will achieve the following goals:
See also #108.
Basic design
The present proposal aims to achieve the above goals by adding two mechanisms to Polyphony:
The buffers are turned into strings only at the IOStream user-code interface.
IOStream
The `IOStream` class encapsulates a stream that can be read from. It implements all the different ways an IO can be read from, i.e. `getc`, `gets`, `read`, `readpartial`, etc. The IO instance will automatically instantiate an IOStream and delegate all of its read method calls to it.
An IOStream instance references zero or more buffer entries and maintains a cursor that marks the current reading position. Upon calling any of its read methods, if no data is available to be read, the IOStream issues a call to `Backend#stream_read` (or `#stream_recv`) and gets back a buffer entry (which can be a buffer or an `EOF` marker). Once the call returns, the IOStream can resume reading. For some types of reads, e.g. `gets` or `read(len)`, if the available data does not allow the read to complete (for example, a newline character has not been found in the available buffer entries), the IOStream will issue further read operations to the underlying backend until the read request can be satisfied or an EOF is encountered.
When the cursor position moves past an entry, the underlying buffer is released back to the buffer manager.
The IOStream class can also be used as a transport-agnostic buffer layer that allows reading from any source, as discussed here.
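
The cursor and buffer-entry behaviour described in this section can be modelled in plain Ruby. The following is only a sketch under stated assumptions: `SketchIOStream`, its constructor arguments, and the `:eof` symbol are hypothetical names; a callable stands in for `Backend#stream_read`, another callable stands in for releasing a buffer to the buffer manager, and Ruby Strings stand in for the native buffers. Offsets are character-based for brevity.

```ruby
# A simplified, pure-Ruby model of the IOStream described above.
# SketchIOStream and the :eof marker are assumptions for illustration only.
class SketchIOStream
  # source:  callable standing in for Backend#stream_read; returns a String
  #          (a buffer entry) or :eof
  # release: callable standing in for returning a buffer to the buffer manager
  def initialize(source, release = ->(_buf) {})
    @source = source
    @release = release
    @entries = [] # buffer entries not yet fully consumed
    @pos = 0      # cursor: read position inside @entries.first
    @eof = false
  end

  # Reads up to and including the next newline. Returns nil at EOF.
  def gets
    i = 0       # index of the next entry to search
    skipped = 0 # characters between the cursor and the start of entry i
    loop do
      while i < @entries.size
        from = i.zero? ? @pos : 0
        idx = @entries[i].index("\n", from)
        return consume(skipped + idx - from + 1) if idx

        skipped += @entries[i].size - from
        i += 1
      end
      break unless fill # no newline yet: ask the source for another entry
    end
    available.zero? ? nil : consume(available)
  end

  # Reads len characters, or fewer if EOF is reached first.
  def read(len)
    loop { break if available >= len || !fill }
    return nil if available.zero? && len.positive?

    consume([len, available].min)
  end

  private

  # Number of unread characters in the entries currently held.
  def available
    @entries.sum(&:size) - @pos
  end

  # Requests one more buffer entry from the source. Returns false at EOF.
  def fill
    return false if @eof

    entry = @source.call
    if entry == :eof
      @eof = true
      return false
    end
    @entries << entry
    true
  end

  # Copies count characters starting at the cursor into a new String and
  # advances the cursor. Callers guarantee count <= available.
  def consume(count)
    out = +''
    while count.positive?
      buf = @entries.first
      chunk = buf[@pos, count]
      out << chunk
      count -= chunk.size
      @pos += chunk.size
      next if @pos < buf.size

      @release.call(@entries.shift) # cursor moved past this entry
      @pos = 0
    end
    out
  end
end
```

Because the source is just a callable, the same sketch reads from an array of chunks, a socket wrapper, or anything else that can hand back entries, which is the transport-agnostic use mentioned above. A string is only built in `consume`, at the point where data is handed to user code.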
Buffer management
Buffers are allocated in pools according to size. Buffer sizes range from 4KB to 4GB, and requested buffer sizes are rounded up to the nearest power of two. A single buffer manager is tasked with managing buffers across all threads and backends. The buffer manager maintains a free list for each of the power-of-two sizes, each implemented as a doubly linked list. When a buffer is needed, the manager removes the head of the appropriate free list (according to the requested size) and returns it. If the free list is empty, the manager allocates a new buffer.
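
The size rounding and free-list lookup can be sketched in a few lines of Ruby. This is a model, not the proposed implementation: `SketchBufferManager`, `acquire`, `release`, and `size_class` are hypothetical names, Ruby Arrays stand in for the doubly linked free lists, Strings stand in for native buffers, and the `Mutex` is an assumption about how a single manager might be shared across threads.

```ruby
# Pure-Ruby model of the pooled buffer manager described above.
# All names here are hypothetical and chosen for illustration.
class SketchBufferManager
  MIN_SIZE = 4 * 1024    # 4KB
  MAX_SIZE = 4 * 1024**3 # 4GB

  Buffer = Struct.new(:capacity, :data)

  def initialize
    # One free list per power-of-two size class (Arrays stand in for the
    # doubly linked lists mentioned in the text).
    @free_lists = Hash.new { |h, size| h[size] = [] }
    @mutex = Mutex.new # assumption: guards the lists across threads
  end

  # Rounds a requested size up to the nearest power of two within range.
  def size_class(size)
    raise ArgumentError, 'size out of range' if size < 1 || size > MAX_SIZE

    size = MIN_SIZE if size < MIN_SIZE
    1 << (size - 1).bit_length
  end

  # Takes the head of the matching free list, or allocates a new buffer
  # when that list is empty.
  def acquire(size)
    capacity = size_class(size)
    buffer = @mutex.synchronize { @free_lists[capacity].shift }
    buffer || Buffer.new(capacity, String.new(capacity: capacity))
  end

  # Puts the buffer back at the head of its free list. Contents are left
  # as-is; the next reader is expected to overwrite them.
  def release(buffer)
    @mutex.synchronize { @free_lists[buffer.capacity].unshift(buffer) }
    nil
  end
end
```

For example, a request for 5000 bytes falls into the 8192-byte class, and releasing that buffer makes it the first candidate for the next request in the same class.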
io_uring provided buffers and buffer rings
This design is compatible with the way io_uring allows providing pre-allocated buffers and the newer buffer ring feature. We provide buffers to the io_uring interface, and when a CQE arrives, it will contain a reference to the buffer that was used. We will need, however, to add support for buffer group ids to the buffer manager, and we'll also need to add a buffer group id field to `struct op_ctx`.
API
In addition to supporting the usual read methods, we'll want to add some methods for parsers. Some examples: