Apache Arrow and PyArrow
PyArrow: Units
PyArrow: RecordBatch
Creating a RecordBatch and Basic Operations
Struct Arrays: Nested Data in Columns
Flattening: Breaking Structs Apart
Conversion to Native Python (Handling Nulls)
Pro-Level Features for Data Engineering
Zero-Copy NumPy Views
Schema Evolution
The Schema Object: Your Data Contract
Bonus: RecordBatch vs Table
PyArrow: Tables
1. Creating Tables and Basic Operations
2. Adding, Removing, and Renaming Columns
3. Filtering and Sorting
4. Grouping and Aggregation
5. Combining Tables
6. Working with Chunked Columns
7. Converting Between Formats
8. Pro Pattern: Lazy Table Operations
Key Differences: RecordBatch vs Table