Fast Avro for Python
Because the Apache Python avro package is written in pure Python, it is relatively slow. In one test case, it takes about 14 seconds to iterate through a file of 10,000 records. By comparison, the Java Avro SDK reads the same file in 1.9 seconds.

The fastavro library was written to offer performance comparable to the Java library. With regular CPython, fastavro uses C extensions, which allow it to iterate over the same 10,000-record file in 1.7 seconds. With PyPy, this drops to 1.5 seconds (to be fair, the Java benchmark is doing some extra JSON encoding/decoding).

Supported Features
• File Writer
• File Reader (iterating via records or blocks)
• Schemaless Writer
• Schemaless Reader
• JSON Writer
• JSON Reader
• Codecs (Snappy, Deflate, Zstandard, Bzip2, LZ4, XZ)
• Schema resolution
• Aliases
• Logical Types
• Parsing schemas into the canonical form
• Schema fingerprinting

Missing Features
• Anything involving Avro’s RPC features
Release | Stable | Testing |
---|---|---|
Fedora Rawhide | 1.4.9-1.fc36 | - |
Fedora 35 | 1.4.4-1.fc35 | - |
Fedora 34 | 1.3.4-1.fc34 | - |
You can contact the maintainers of this package via email at python-fastavro dash maintainers at fedoraproject dot org.