AbstractBufReader

Core, low-level interface

The core interface of an io::AbstractBufReader consists of three functions that are meant to be used together:

  • get_buffer(io) returns a view into the internal buffer with data ready to read. You read from io by copying bytes out of this view.
  • fill_buffer(io) attempts to append more bytes to the buffer returned by future calls to get_buffer.
  • consume(io, n::Int) removes the first n bytes of the buffer.

While lots of higher-level convenience functions are also defined, nearly all functionality is defined in terms of these three core functions. See the docstrings of these functions for details and edge cases.
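To make the contract between the three functions concrete, here is a minimal, self-contained sketch. It does not use BufIO at all: `ToyBufIO`, `ToyReader` and `count_bytes` are hypothetical names invented for this illustration, and the toy returns a plain array view rather than the memory view type a real reader returns. It only mimics the behaviour described above.

```julia
# Hypothetical stand-in for the core interface; NOT part of BufIO.
module ToyBufIO

# Serves bytes from an in-memory vector, at most `chunk` bytes per fill.
mutable struct ToyReader
    data::Vector{UInt8}  # plays the role of the underlying source
    start::Int           # index of the first buffered, unconsumed byte
    stop::Int            # index of the last buffered byte
    chunk::Int           # maximum number of bytes added per fill
end
ToyReader(data, chunk::Int) = ToyReader(collect(UInt8, data), 1, 0, chunk)

# View of the buffered, unconsumed bytes. Does no I/O and never fills.
get_buffer(io::ToyReader) = view(io.data, io.start:io.stop)

# Append up to `chunk` bytes to the buffer. Returns 0 at EOF.
function fill_buffer(io::ToyReader)
    n = min(io.chunk, length(io.data) - io.stop)
    io.stop += n
    return n
end

# Drop the first `n` buffered bytes.
function consume(io::ToyReader, n::Int)
    0 <= n <= io.stop - io.start + 1 || throw(ArgumentError("invalid n"))
    io.start += n
    return nothing
end

# Count all remaining bytes using only the three core functions.
function count_bytes(io)
    total = 0
    while true
        buffer = get_buffer(io)
        if isempty(buffer)
            iszero(fill_buffer(io)) && break  # 0 bytes filled means EOF
            continue
        end
        total += length(buffer)
        consume(io, length(buffer))
    end
    return total
end

end # module

reader = ToyBufIO.ToyReader(codeunits("hello, world"), 5)
@assert ToyBufIO.count_bytes(reader) == 12
```

Note that the loop only fills when the buffer is empty, and that it consumes exactly the bytes it has processed.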

Let's see two use cases to demonstrate how this core interface is used.

Example: Reading N bytes

Suppose we want a function read_exact(io::AbstractBufReader, n::Int) which reads exactly n bytes to a new Vector{UInt8}, unless io hits end-of-file (EOF).

Since io itself controls how many bytes are filled with fill_buffer (typically whatever is the most efficient), the best approach is to call the functions above in a loop:

function read_exact(io::AbstractBufReader, n::Int)
    n > -1 || throw(ArgumentError("n must be non-negative"))
    result = sizehint!(UInt8[], n)
    remaining = n
    while !iszero(remaining)
        # Get the buffer to copy bytes from in order to read from `io`
        buffer = get_buffer(io)
        if isempty(buffer)
            # Fill new bytes into the buffer. This returns `0` if `io` is EOF,
            # in which case we break to return the result
            iszero(something(fill_buffer(io))) && break
            buffer = get_buffer(io)
        end
        mn = min(remaining, length(buffer))
        append!(result, buffer[1:mn])
        # Signal to `io` that the first `mn` bytes have already been read,
        # so these should not be output in future calls to `get_buffer`
        consume(io, mn)
        remaining -= mn
    end
    result
end

The code above may be simplified by using the convenience function get_nonempty_buffer or the higher-level function read_all!.
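As a sketch of that simplification, the variant below replaces the manual empty-buffer check with get_nonempty_buffer, which returns nothing only at EOF. The name `read_exact_simple` is hypothetical, and the example assumes the documented names can be imported from the BufIO module as shown.

```julia
using BufIO: AbstractBufReader, get_nonempty_buffer, consume

# Hypothetical variant of `read_exact`, for illustration only.
function read_exact_simple(io::AbstractBufReader, n::Int)
    n >= 0 || throw(ArgumentError("n must be non-negative"))
    result = sizehint!(UInt8[], n)
    while length(result) < n
        # `nothing` signals EOF; otherwise the buffer has at least one byte
        buffer = get_nonempty_buffer(io)
        buffer === nothing && break
        mn = min(n - length(result), length(buffer))
        append!(result, buffer[1:mn])
        # Mark the copied bytes as read
        consume(io, mn)
    end
    return result
end
```

The returned vector is shorter than n only if io reached EOF first.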

Example: Reading a line without intermediate allocations

This example is different, because to avoid allocations, we need an entire line to be available in the buffer of the io. Therefore, this is one of the rare cases where we may need to force io to grow its buffer.

function get_line_view(io::AbstractBufReader)
    scan_from = 1
    while true
        buffer = get_buffer(io)
        pos = findnext(==(UInt8('\n')), buffer, scan_from)
        if pos === nothing
            scan_from = length(buffer) + 1
            n_filled = fill_buffer(io)
            if n_filled === nothing
                # fill_buffer may return nothing if the buffer is not empty,
                # and the buffer cannot be expanded further.
                error("io could not buffer an entire line")
            elseif iszero(n_filled)
                # This indicates EOF, so the line is defined as the rest of the
                # content of `io`
                return buffer
            end
        else
            return buffer[1:pos]
        end
    end
end

Functionality similar to the above is provided by the line_views iterator.

BufIO.get_buffer - Function
get_buffer(io::AbstractBufReader)::ImmutableMemoryView{UInt8}

Get the available bytes of io.

Calling this function when the buffer is empty should not do actual system I/O, and in particular should not attempt to fill the buffer. To fill the buffer, call fill_buffer.

get_buffer(io::AbstractBufWriter)::MutableMemoryView{UInt8}

Get the available mutable buffer of io that can be written to.

Calling this function should not do actual system I/O, and in particular should not attempt to flush data from the buffer. To increase the size of the buffer, call grow_buffer.

BufIO.fill_buffer - Function
fill_buffer(io::AbstractBufReader)::Union{Int, Nothing}

Fill more bytes into the buffer from io's underlying IO, returning the number of bytes added. After calling fill_buffer and getting n, the buffer obtained by get_buffer should have n new bytes appended.

This function must fill at least one byte, except

  • If the underlying io is EOF, or there is no underlying io to fill bytes from, return 0
  • If the buffer is not empty, and cannot be expanded, return nothing.

Buffered readers which do not wrap another underlying IO, and therefore cannot fill their buffer, should return 0 unconditionally. This function should never return nothing if the buffer is empty.

Note

Idiomatically, users should not call fill_buffer when the buffer is not empty, because doing so forces the buffer to grow instead of letting io choose an optimal buffer size. Calling fill_buffer with a nonempty buffer is only appropriate if, for algorithmic reasons, you need io itself to buffer some minimum amount of data.
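One case where filling a nonempty buffer is justified is when an algorithm needs a minimum number of contiguous bytes, for example a fixed-size header. The helper below is a sketch of that pattern; `buffer_at_least` is a hypothetical name, not a BufIO function, and the example assumes the documented names can be imported from the BufIO module as shown.

```julia
using BufIO: AbstractBufReader, get_buffer, fill_buffer

# Hypothetical helper: return a buffer holding at least `n` bytes, or
# `nothing` if `io` hits EOF first or cannot grow its buffer far enough.
function buffer_at_least(io::AbstractBufReader, n::Int)
    while true
        buffer = get_buffer(io)
        length(buffer) >= n && return buffer
        filled = fill_buffer(io)
        # `nothing`: the nonempty buffer cannot be expanded further.
        # `0`: EOF was reached before `n` bytes became available.
        (filled === nothing || iszero(filled)) && return nothing
    end
end
```

Nothing is consumed here, so the caller still has to call consume once it has processed the bytes.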

BufIO.consume - Function
consume(io::Union{AbstractBufReader, AbstractBufWriter}, n::Int)::Nothing

Remove the first n bytes of the buffer of io. Consumed bytes will not be returned by future calls to get_buffer.

If n is negative, or larger than the current buffer size, throw an IOError with ConsumeBufferError kind. This check is a boundscheck and may be elided with @inbounds.

Notable AbstractBufReader functions

AbstractBufReader implements most of the Base.IO interface; see the section in the sidebar. It also has a few special convenience functions:

BufIO.get_nonempty_buffer - Method
get_nonempty_buffer(x::AbstractBufReader)::Union{Nothing, ImmutableMemoryView{UInt8}}

Get a buffer with at least one byte, if bytes are available. Otherwise, fill the buffer, and return the newly filled buffer. Returns nothing only if x is EOF.
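The behaviour described above can be expressed with the core functions alone. The following is a sketch of one possible definition, not necessarily how BufIO implements it; `my_get_nonempty_buffer` is a hypothetical name, and the example assumes the documented names can be imported from the BufIO module as shown.

```julia
using BufIO: AbstractBufReader, get_buffer, fill_buffer

# Hypothetical re-implementation, for illustration only.
function my_get_nonempty_buffer(io::AbstractBufReader)
    buffer = get_buffer(io)
    isempty(buffer) || return buffer
    # The buffer is empty, so `fill_buffer` must not return `nothing`;
    # a return value of 0 therefore means EOF.
    iszero(something(fill_buffer(io))) && return nothing
    return get_buffer(io)
end
```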

BufIO.read_into! - Function
read_into!(x::AbstractBufReader, dst::MutableMemoryView{UInt8})::Int

Read bytes into the beginning of dst, returning the number of bytes read. This function will always read at least 1 byte, except when dst is empty, or x is EOF.

This function is defined generically for AbstractBufReader. New methods should strive to do at most one read call to the underlying IO, if x wraps such an IO.

BufIO.read_all! - Function
read_all!(io::AbstractBufReader, dst::MutableMemoryView{UInt8})::Int

Read bytes into dst until either dst is filled or io is EOF, returning the number of bytes read.