Bit encoding of nucleic acid types

Unambiguous nucleotides are represented in one-hot encoding as follows:

NucleicAcid    Bits
A              0001
C              0010
G              0100
T/U            1000

Each ambiguous nucleotide is encoded as the bitwise OR of the unambiguous nucleotides it can stand for. For example, R (A or G) is represented as 0101 (= A: 0001 | G: 0100). The gap symbol is always 0000. The four meaningful bits are stored in the least significant bits of a byte.
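
As a rough illustration of why this layout is convenient, the following Python sketch rebuilds a few symbols from the bit patterns above and tests compatibility with a single bitwise AND. The constant and function names (`A`, `GAP`, `is_compatible`, `is_ambiguous`) are hypothetical and chosen for this example only; they are not the library's API. The codes R, Y and N follow the standard IUPAC meanings.

```python
# Illustrative sketch only: these names are assumptions, not a library API.
# Unambiguous nucleotides, one bit each (T and U share the same bit).
A, C, G, T = 0b0001, 0b0010, 0b0100, 0b1000
GAP = 0b0000

# Ambiguity codes are the bitwise OR of the bases they can stand for.
R = A | G          # purine: A or G
Y = C | T          # pyrimidine: C or T/U
N = A | C | G | T  # any base


def is_compatible(x: int, y: int) -> bool:
    """Return True if two encoded symbols share at least one possible base."""
    return (x & y) != 0


def is_ambiguous(x: int) -> bool:
    """Return True if more than one of the four low bits is set."""
    return bin(x & 0b1111).count("1") > 1


if __name__ == "__main__":
    print(format(R, "04b"))     # bit pattern of R
    print(is_compatible(R, A))  # R can be A
    print(is_compatible(R, C))  # R cannot be C
```

Because the gap is all zeros, it shares no bit with any symbol, so under this sketch's definition it is compatible with nothing, including itself.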

This encoding applies to both the DNA and RNA types.