parquet

Read and write the Apache Parquet columnar storage format from Kit

Files

| File | Description |
| --- | --- |
| kit.toml | Package manifest with metadata and dependencies |
| src/parquet.kit | Parquet reader/writer with compression options |
| tests/parquet.test.kit | Module import verification test |
| examples/analytics.kit | Sales data aggregation with Arrow tables |
| examples/basic.kit | File validation and metadata inspection |
| examples/read.kit | Row group and column-level reading |
| examples/write.kit | Writing with compression and metadata |
| examples/sample.parquet | Sample Parquet data file |
| LICENSE | MIT license file |

Dependencies

No Kit package dependencies.

Installation

kit add gitlab.com/kit-lang/packages/kit-parquet.git

Usage

import Kit.Parquet

License

MIT License - see LICENSE for details.

Exported Functions & Types

ParquetError

Parquet error type with specific variants for different failure modes.

Variants

ParquetReadError {message}
ParquetWriteError {message}
ParquetAccessError {message}

uncompressed

Compression

snappy

Compression

gzip

Compression

lz4

Compression

zstd

Compression

brotli

Compression

default-options

WriteOptions

options

WriteOptions

with-compression

WriteOptions -> Compression -> WriteOptions

with-row-group-size

WriteOptions -> Int -> WriteOptions

with-metadata

WriteOptions -> String -> String -> WriteOptions
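
The three with-* functions above each take a WriteOptions and return a WriteOptions, so they can be chained starting from default-options. This section shows no Kit source for them, so the sketch below is a Python model of that signature shape only, not the package's API. The field names (compression, row_group_size, metadata), the default values, and the choice to return a fresh value rather than mutate are all assumptions made for illustration.

```python
from dataclasses import dataclass, replace
from typing import Optional, Tuple


@dataclass(frozen=True)
class WriteOptions:
    # Hypothetical fields; the real WriteOptions layout is not documented here.
    compression: str = "uncompressed"
    row_group_size: Optional[int] = None  # None stands for "use the writer's default"
    metadata: Tuple[Tuple[str, str], ...] = ()


def with_compression(opts: WriteOptions, compression: str) -> WriteOptions:
    # Mirrors: WriteOptions -> Compression -> WriteOptions
    return replace(opts, compression=compression)


def with_row_group_size(opts: WriteOptions, size: int) -> WriteOptions:
    # Mirrors: WriteOptions -> Int -> WriteOptions
    return replace(opts, row_group_size=size)


def with_metadata(opts: WriteOptions, key: str, value: str) -> WriteOptions:
    # Mirrors: WriteOptions -> String -> String -> WriteOptions
    return replace(opts, metadata=opts.metadata + ((key, value),))


default_options = WriteOptions()

# Each call returns a new value, so the starting options stay unchanged.
opts = with_metadata(
    with_row_group_size(with_compression(default_options, "zstd"), 100_000),
    "author",
    "kit",
)
```

Because every function returns a WriteOptions, the result of one call can be passed straight into the next, and the finished value is what a function such as create-writer-with-options or write-with-options would receive.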

close

Close a Parquet reader

Reader -> Void

read-table

Read entire file as Arrow table

Reader -> Result Ptr ParquetError

read-row-group

Read a specific row group as Arrow record batch

Reader -> Int -> Result Ptr ParquetError

read-column

Read a specific column by index

Reader -> Int -> Result Ptr ParquetError

read

Read entire Parquet file into Arrow table (convenience function)

String -> Result Ptr ParquetError

read-rows

Read with row selection

String -> Int -> Int -> Result Ptr ParquetError

read-columns

Read specific columns only

String -> List String -> Result Ptr ParquetError

create-writer

Create a Parquet writer with Arrow schema

String -> Ptr -> Result Writer ParquetError

create-writer-with-options

Create writer with options

String -> Ptr -> WriteOptions -> Result Writer ParquetError

write-table

Write Arrow table to Parquet

Writer -> Ptr -> Result () ParquetError

write-batch

Write Arrow record batch to Parquet

Writer -> Ptr -> Result () ParquetError

close-writer

Close writer and finalize file

Writer -> Result () ParquetError

write

Write Arrow table to Parquet file (convenience function)

String -> Ptr -> Ptr -> Result () ParquetError

write-with-options

Write with options

String -> Ptr -> Ptr -> WriteOptions -> Result () ParquetError

metadata

Get file metadata

Reader -> FileMetadata

num-rows

Get number of rows

Reader -> Int

num-row-groups

Get number of row groups

Reader -> Int

num-columns

Get number of columns

Reader -> Int

column-name

Get column name by index

Reader -> Int -> Option String

column-names

Get all column names

Reader -> List String

get-metadata

Get custom metadata value

Reader -> String -> Option String

row-group-metadata

Get row group metadata

Reader -> Int -> Result RowGroupMetadata ParquetError

column-stats

Get column statistics for a row group

Reader -> Int -> Int -> Result {min: Int, max: Int, null-count: Int, distinct-count: Int} ParquetError

schema

Get Arrow schema from Parquet file

Reader -> Result Ptr ParquetError

column-descriptor

Get column descriptor

Reader -> Int -> Result ColumnDescriptor ParquetError

is-parquet?

Check if file is a valid Parquet file

String -> Bool
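
How is-parquet? decides validity is not stated here. As a rough illustration of what such a check commonly involves, the Python sketch below tests the 4-byte magic number PAR1 that the Parquet format places at both the start and the end of a file. The function name looks_like_parquet and the 12-byte minimum size (two magic numbers plus the 4-byte footer length) are assumptions of this sketch, and it does not parse the footer or handle files with an encrypted footer, which use a different magic number.

```python
import os

PARQUET_MAGIC = b"PAR1"


def looks_like_parquet(path):
    """Return True if the file starts and ends with the Parquet magic bytes."""
    try:
        with open(path, "rb") as f:
            f.seek(0, os.SEEK_END)
            size = f.tell()
            # Leading magic + 4-byte footer length + trailing magic.
            if size < 12:
                return False
            f.seek(0)
            head = f.read(4)
            f.seek(-4, os.SEEK_END)
            tail = f.read(4)
    except OSError:
        # Missing or unreadable files are reported as "not Parquet".
        return False
    return head == PARQUET_MAGIC and tail == PARQUET_MAGIC
```

A check like this only rules out files that are obviously not Parquet; a file can pass it and still have a corrupt footer, which would surface later as a ParquetReadError when the file is opened.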

file-info

Get summary information for a Parquet file: path, row count, row group count, column count, and column names

String -> Result {path: String, num-rows: Int, num-row-groups: Int, num-columns: Int, columns: List String} ParquetError

summary

Print file summary

String -> Void