AbstractBufReader

Core, low-level interface

The core interface of an io::AbstractBufReader consists of three functions, that are to be used together:

  • get_buffer(io) returns a view into the internal buffer with data ready to read. You read from the io by copying.
  • fill_buffer(io) attempts to append more bytes to the buffer returned by future calls to get_buffer
  • consume(io, n::Int) removes the first n bytes of the buffer from future buffers returned by get_buffer

While lots of higher-level convenience functions are also defined, nearly all functionality is defined in terms of these three core functions. See the docstrings of these functions for details and edge cases.

Let's see two use cases to demonstrate how this core interface is used.

Example: Reading N bytes

Suppose we want a function read_exact(io::AbstractBufReader, n::Int) which reads exactly n bytes to a new Vector{UInt8}, unless io hits end-of-file (EOF).

This functionality is already implemented as read(::AbstractBufReader, ::Integer), so the below implementation is for illustration purposes.

Since io itself controls how many bytes are filled with fill_buffer (typically whatever is the most efficient), we do this best by calling the functions above in a loop:

function read_exact(io::AbstractBufReader, n::Int)
    n > -1 || throw(ArgumentError("n must be non-negative"))
    result = sizehint!(UInt8[], n)
    remaining = n
    while !iszero(remaining)
        # Get the buffer to copy bytes from in order to read from `io`
        buffer = get_buffer(io)
        if isempty(buffer)
            # Fill new bytes into the buffer. This returns `0` if `io` if EOF,
            # in which case we break to return the result.
            # `fill_buffer` can return `nothing` for some reader types, but only if
            # the buffer is not empty.
            iszero(something(fill_buffer(io))) && break
            buffer = get_buffer(io)
        end
        mn = min(remaining, length(buffer))
        append!(result, buffer[mn])
        # Signal to `io` that the first `mn` bytes have already been read,
        # so these should not be output in future calls to `get_buffer`
        consume(io, mn)
        remaining -= mn
    end
    return result
end

The code above may be simplified by using the convenience function get_nonempty_buffer or by simply calling the already-implemented read(io, n).

Example: Reading a line without intermediate allocations

In this example, we want to buffer a full line in io's buffer, and then return a view into the buffer representing that line.

This function is an unusual use case, because we need to ensure the buffer is able to hold a full line. For most IO operations, we do not need, nor want to control exactly how much is buffered by io, since leaving that up to io is typically more efficient.

Therefore, this is one of the rare cases where we may need to force io to grow its buffer.

function get_line_view(io::AbstractBufReader)
    # Which position to search for a newline from
    scan_from = 1
    while true
        buffer = get_buffer(io)
        pos = findnext(==(UInt8('\n')), buffer, scan_from)
        if pos === nothing
            scan_from = length(buffer) + 1
            n_filled = fill_buffer(io)
            if n_filled === nothing
                # fill_buffer may return nothing if the buffer is not empty,
                # and the buffer cannot be expanded further.
                error("io could not buffer an entire line")
            elseif iszero(n_filled)
                # This indicates EOF, so the line is defined as the rest of the
                # content of `io`
                return buffer
            end
        else
            return buffer[1:pos]
        end
    end
end

Functionality similar to the above is provided by the line_views iterator.

BufferIO.get_bufferFunction
get_buffer(io::AbstractBufReader)::ImmutableMemoryView{UInt8}

Get the available bytes of io.

Calling this function, even when the buffer is empty, should never do actual system I/O, and in particular should not attempt to fill the buffer. To fill the buffer, call fill_buffer.

get_buffer(io::AbstractBufWriter)::MutableMemoryView{UInt8}

Get the available mutable buffer of io that can be written to.

Calling this function should never do actual system I/O, and in particular should not attempt to flush data from the buffer or grow the buffer. To increase the size of the buffer, call grow_buffer.

BufferIO.fill_bufferFunction
fill_buffer(io::AbstractBufReader)::Union{Int, Nothing}

Fill more bytes into the buffer from io's underlying buffer, returning the number of bytes added. After calling fill_buffer and getting n, the buffer obtained by get_buffer should have n new bytes appended.

This function must fill at least one byte, except

  • If the underlying io is EOF, or there is no underlying io to fill bytes from, return 0
  • If the buffer is not empty, and cannot be expanded, return nothing.

Buffered readers which do not wrap another underlying IO, and therefore can't fill its buffer should return 0 unconditionally. This function should never return nothing if the buffer is empty.

Note

Idiomatically, users should not call fill_buffer when the buffer is not empty, because doing so may force growing the buffer instead of letting io choose an optimal buffer size. Calling fill_buffer with a nonempty buffer is only appropriate if, for algorithmic reasons you need io itself to buffer some minimum amount of data.

BufferIO.consumeFunction
consume(io::Union{AbstractBufReader, AbstractBufWriter}, n::Int)::Nothing

Remove the first n bytes of the buffer of io. Consumed bytes will not be returned by future calls to get_buffer.

If n is negative, or larger than the current buffer size, throw an IOError with ConsumeBufferError kind. This check is a boundscheck and may be elided with @inbounds.

Notable AbstractReader functions

AbstractBufReader implements most of the Base.IO interface, see the section in the sidebar. They also have a few special convenience functions:

BufferIO.get_nonempty_bufferMethod
get_nonempty_buffer(x::AbstractBufReader)::Union{Nothing, ImmutableMemoryView{UInt8}}

Get a buffer with at least one byte, if bytes are available. Otherwise, fill the buffer, and return the newly filled buffer. Returns nothing only if x is EOF.

BufferIO.read_into!Function
read_into!(x::AbstractBufReader, dst::MutableMemoryView{UInt8})::Int

Read bytes into the beginning of dst, returning the number of bytes read. This function will always read at least 1 byte, except when dst is empty, or x is EOF.

This function is defined generically for AbstractBufReader. New methods should strive to do at most one read call to the underlying IO, if x wraps such an IO.

BufferIO.read_all!Function
read_all!(io::AbstractBufReader, dst::MutableMemoryView{UInt8})::Int

Read bytes into dst until either dst is filled or io is EOF, returning the number of bytes read.