embeddings
| Kind | kit |
|---|---|
| Capabilities | net |
| Categories | machine-learning database |
| Keywords | embeddings vector rag similarity machine-learning |
Lightweight vector embeddings store for RAG applications
Files
| File | Description |
|---|---|
| .editorconfig | Editor formatting configuration |
| .gitignore | Git ignore rules for build artifacts and dependencies |
| .tool-versions | asdf tool versions (Zig, Kit) |
| LICENSE | MIT license file |
| README.md | This file |
| examples/postgres-store.kit | Example: PostgreSQL store |
| examples/simple-rag.kit | Example: simple RAG |
| examples/sqlite-store.kit | Example: SQLite store |
| examples/test-sqlite.kit | Example: SQLite test |
| examples/vector-ops.kit | Example: vector operations |
| kit.toml | Package manifest with metadata and dependencies |
| src/main.kit | Main module |
| src/postgres.kit | PostgreSQL backend module |
| src/sqlite.kit | SQLite backend module |
| tests/embeddings.test.kit | Tests for embeddings |
| tests/types.test.kit | Tests for types |
Architecture
RAG Pipeline
Backend Abstraction
Dependencies
- json
- postgres
- sqlite
Installation
kit add gitlab.com/kit-lang/packages/kit-embeddings.git
Usage
import Kit.Embeddings
License
MIT License - see LICENSE for details.
Exported Functions & Types
dot
Dot product of two vectors. Uses Kit's internal SIMD-accelerated implementation.
List Float -> List Float -> Float
Embeddings.dot [1.0, 2.0, 3.0] [4.0, 5.0, 6.0] # => 32.0
magnitude
Magnitude (L2 norm) of a vector. Returns sqrt(sum of squares).
List Float -> Float
Embeddings.magnitude [3.0, 4.0] # => 5.0
scale
Scale a vector by a scalar value. Returns a new vector with each element multiplied by the scalar.
List Float -> Float -> List Float
Embeddings.scale [1.0, 2.0, 3.0] 2.0 # => [2.0, 4.0, 6.0]
normalize
Normalize a vector to unit length. Returns zero vector if input has zero magnitude.
List Float -> List Float
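The zero-magnitude rule described above can be illustrated outside Kit. The following is a minimal Python sketch of the same arithmetic, not the package's implementation; the function name `normalize` here is just a Python stand-in.

```python
import math

def normalize(v):
    """Scale v to unit length; return a zero vector when the magnitude is 0."""
    mag = math.sqrt(sum(x * x for x in v))
    if mag == 0.0:
        # Avoid dividing by zero: a zero vector has no direction to preserve.
        return [0.0 for _ in v]
    return [x / mag for x in v]

unit = normalize([3.0, 4.0])   # magnitude of the input is 5.0
zero = normalize([0.0, 0.0])   # stays a zero vector
```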
Embeddings.normalize [3.0, 4.0] # => [0.6, 0.8]
add
Element-wise addition of two vectors.
List Float -> List Float -> List Float
sub
Element-wise subtraction of two vectors.
List Float -> List Float -> List Float
cosine-similarity
Cosine similarity between two vectors. Returns a value in [-1, 1], where 1 = identical direction, 0 = orthogonal, and -1 = opposite direction.
Formula: dot(a,b) / (|a| * |b|)
List Float -> List Float -> Float
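As a worked illustration of the formula, here is a short Python sketch (Python is used only to show the arithmetic; it is not how the Kit package is implemented). How the package treats a zero-magnitude input is not stated above, so this sketch simply lets the division fail in that case.

```python
import math

def cosine_similarity(a, b):
    """dot(a, b) / (|a| * |b|) for two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    # Assumption: zero vectors are not handled; this raises ZeroDivisionError.
    return dot / (norm_a * norm_b)

same = cosine_similarity([1.0, 0.0], [1.0, 0.0])
orthogonal = cosine_similarity([1.0, 0.0], [0.0, 1.0])
opposite = cosine_similarity([1.0, 0.0], [-1.0, 0.0])
```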
Embeddings.cosine-similarity [1.0, 0.0] [1.0, 0.0] # => 1.0
Embeddings.cosine-similarity [1.0, 0.0] [0.0, 1.0] # => 0.0
euclidean-distance
Euclidean distance between two vectors. Returns the L2 distance (straight-line distance).
Formula: sqrt(sum((a[i] - b[i])^2))
List Float -> List Float -> Float
Embeddings.euclidean-distance [0.0, 0.0] [3.0, 4.0] # => 5.0
angular-distance
Angular distance between two vectors. Returns a value in [0, 1], where 0 = identical direction, 0.5 = orthogonal, and 1 = opposite direction.
Formula: arccos(cosine-similarity) / pi
List Float -> List Float -> Float
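The mapping from cosine similarity to the [0, 1] range can be sketched in Python as follows. This is an illustration of the formula only; the clamp before `acos` is an assumption added here to guard against floating-point rounding and is not something the package is documented to do.

```python
import math

def angular_distance(a, b):
    """arccos(cosine-similarity) / pi, giving a value in [0, 1]."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    cos = dot / (norm_a * norm_b)
    # Assumption: clamp so rounding error cannot push the value outside acos's domain.
    cos = max(-1.0, min(1.0, cos))
    return math.acos(cos) / math.pi

same = angular_distance([1.0, 0.0], [1.0, 0.0])
orthogonal = angular_distance([1.0, 0.0], [0.0, 1.0])
opposite = angular_distance([1.0, 0.0], [-1.0, 0.0])
```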
Embeddings.angular-distance [1.0, 0.0] [1.0, 0.0] # => 0.0
Embeddings.angular-distance [1.0, 0.0] [0.0, 1.0] # => 0.5
to-bytes
Serialize a list of floats to bytes (IEEE 754 f64). Each float is stored as 8 bytes in little-endian format. Useful for storing embeddings as BLOBs in databases.
List Float -> Bytes
bytes = Embeddings.to-bytes [1.0, 2.0, 3.0] # 24 bytes
from-bytes
Deserialize bytes to a list of floats. Expects IEEE 754 f64 format (8 bytes per float).
Bytes -> List Float
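To make the byte layout concrete, the following Python sketch packs and unpacks little-endian IEEE 754 f64 values with the standard `struct` module. It illustrates the format described for to-bytes and from-bytes, not the Kit implementation; rejecting input whose length is not a multiple of 8 is an assumption made here.

```python
import struct

def to_bytes(floats):
    """Pack each float as 8 bytes, little-endian IEEE 754 double."""
    return struct.pack(f"<{len(floats)}d", *floats)

def from_bytes(data):
    """Unpack a buffer of little-endian doubles back into a list of floats."""
    if len(data) % 8 != 0:
        # Assumption: a partial trailing float is treated as an error.
        raise ValueError("byte length must be a multiple of 8")
    count = len(data) // 8
    return list(struct.unpack(f"<{count}d", data))

blob = to_bytes([1.0, 2.0, 3.0])   # 3 floats * 8 bytes each
restored = from_bytes(blob)
```

A 1536-dimension embedding stored this way occupies 1536 * 8 = 12288 bytes per row.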
floats = Embeddings.from-bytes bytes # [1.0, 2.0, 3.0]
similarity-fn
Get similarity function by metric name. Returns a function (a, b) -> Float.
Supported metrics:
- :cosine - Cosine similarity (higher = more similar)
- :euclidean - Negative Euclidean distance (higher = more similar)
- :dot - Dot product (higher = more similar)
Symbol -> (List Float -> List Float -> Float)
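One way to picture metric lookup is a table from metric name to scoring function, with Euclidean distance negated so that higher always means more similar. The Python sketch below is an illustration under that reading; the names `SIMILARITY` and `similarity_fn` are hypothetical, and what the package does for an unknown metric is not stated (here it raises `KeyError`).

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cosine(a, b):
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))

def negative_euclidean(a, b):
    # Negate the distance so that closer vectors get the higher score.
    return -math.dist(a, b)

# Hypothetical lookup table keyed by metric name.
SIMILARITY = {"cosine": cosine, "euclidean": negative_euclidean, "dot": dot}

def similarity_fn(metric):
    """Return the scoring function for a metric name."""
    return SIMILARITY[metric]

score = similarity_fn("euclidean")([0.0, 0.0], [3.0, 4.0])
```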
sim-fn = Embeddings.similarity-fn :cosine
score = sim-fn vec-a vec-b
distance-fn
Get distance function by metric name. Returns a function (a, b) -> Float where lower = more similar.
Supported metrics:
- :cosine - 1 - cosine similarity
- :euclidean - Euclidean distance
- :angular - Angular distance [0, 1]
Symbol -> (List Float -> List Float -> Float)
dist-fn = Embeddings.distance-fn :euclidean
distance = dist-fn vec-a vec-b
sqlite-create
Create or open an embedding store with SQLite backend.
String -> Int -> Symbol -> Result EmbeddingStore String
sqlite-open
Open an existing embedding store with SQLite backend.
String -> Int -> Symbol -> Result EmbeddingStore String
sqlite-close
Close the embedding store.
EmbeddingStore -> Unit
sqlite-upsert
Insert or update an embedding.
EmbeddingStore -> String -> String -> List Float -> String -> Result Unit String
sqlite-insert
Insert an embedding (alias for upsert).
EmbeddingStore -> String -> String -> List Float -> String -> Result Unit String
sqlite-get
Get an embedding by ID.
EmbeddingStore -> String -> Option Record
sqlite-delete
Delete an embedding by ID.
EmbeddingStore -> String -> Result Unit String
sqlite-count
Count total embeddings in the store.
EmbeddingStore -> Int
sqlite-exists?
Check if an embedding exists by ID.
EmbeddingStore -> String -> Bool
sqlite-search
Search for top-k similar embeddings.
EmbeddingStore -> List Float -> Int -> Result (List Record) String
sqlite-search-threshold
Search with a minimum score threshold.
EmbeddingStore -> List Float -> Float -> Result (List Record) String
sqlite-search-filter
Search with metadata filtering.
EmbeddingStore -> List Float -> Int -> (Json -> Bool) -> Result (List Record) String
sqlite-upsert-batch
Insert multiple embeddings at once.
EmbeddingStore -> List Record -> Result Int String
sqlite-get-all
Get all embeddings (for small stores).
EmbeddingStore -> List Record
sqlite-clear
Clear all embeddings from the store.
EmbeddingStore -> Result Unit String
postgres-create
Create or open an embedding store with PostgreSQL backend.
String -> Int -> Symbol -> Result PgEmbeddingStore String
postgres-search
Search for top-k similar embeddings using pgvector.
PgEmbeddingStore -> List Float -> Int -> Result (List Record) String
postgres-upsert
Insert or update an embedding.
PgEmbeddingStore -> String -> String -> List Float -> String -> Result Unit String
postgres-delete
Delete an embedding by ID.
PgEmbeddingStore -> String -> Result Unit String
postgres-count
Count total embeddings in the store.
PgEmbeddingStore -> Int
postgres-create-index
Create HNSW index for fast approximate nearest neighbor search.
PgEmbeddingStore -> Int -> Int -> Result Unit String
create
Create or open an embedding store. Creates the embeddings table with pgvector column if it doesn't exist.
Parameters:
String -> Int -> Symbol -> Result PgEmbeddingStore String
store = Embeddings.Postgres.create "postgresql://localhost/mydb" 1536 :cosine
create-with-table
Create an embedding store with a custom table name.
String -> Int -> Symbol -> NonEmptyString -> Result PgEmbeddingStore String
store = Embeddings.Postgres.create-with-table conn-string 1536 :cosine "my_vectors"
create-index
Create HNSW index for fast approximate nearest neighbor search. Call this after creating the store for better search performance.
Parameters:
PgEmbeddingStore -> Int -> Int -> Result Unit String
Embeddings.Postgres.create-index store 16 64
create-index-default
Create HNSW index with default parameters.
PgEmbeddingStore -> Result Unit String
open
Open an existing embedding store. Does not create table if it doesn't exist.
String -> Int -> Symbol -> Result PgEmbeddingStore String
store = Embeddings.Postgres.open "postgresql://localhost/mydb" 1536 :cosine
open-with-table
Open an existing embedding store with a custom table name.
String -> Int -> Symbol -> NonEmptyString -> Result PgEmbeddingStore String
close
Close the embedding store.
PgEmbeddingStore -> Unit
upsert
Insert or update an embedding. If an embedding with the same ID exists, it will be replaced.
Parameters:
PgEmbeddingStore -> String -> String -> List Float -> String -> Result Unit String
Embeddings.Postgres.upsert store "doc1" "Hello world" [0.1, 0.2, ...] "{\"source\": \"api\"}"
insert
Insert an embedding (alias for upsert).
PgEmbeddingStore -> String -> String -> List Float -> String -> Result Unit String
get
Get an embedding by ID. Returns the embedding record or None if not found.
PgEmbeddingStore -> String -> Option Record
match Embeddings.Postgres.get store "doc1"
| Some result -> println result.content
| None -> println "Not found"
delete
Delete an embedding by ID.
PgEmbeddingStore -> String -> Result Unit String
Embeddings.Postgres.delete store "doc1"
count
Count total embeddings in the store.
PgEmbeddingStore -> Int
n = Embeddings.Postgres.count store # => 1234
exists?
Check if an embedding exists by ID.
PgEmbeddingStore -> String -> Bool
search
Search for top-k similar embeddings using pgvector's native operators. Uses the configured metric for similarity comparison.
Parameters:
Returns:
PgEmbeddingStore -> List Float -> Int -> Result (List Record) String
match Embeddings.Postgres.search store query-vec 10
| Ok results ->
results |> each fn(r) => println "${r.score}: ${r.content}"
| Err e -> println "Search failed"
search-threshold
Search with a minimum score threshold. Only returns results with score >= threshold.
PgEmbeddingStore -> List Float -> Float -> Result (List Record) String
search-where
Search with metadata filtering using JSONB operators. Filter is a SQL WHERE clause fragment for the metadata column.
PgEmbeddingStore -> List Float -> Int -> String -> Result (List Record) String
# Find documents from a specific source
Embeddings.Postgres.search-where store query-vec 10 "metadata->>'source' = 'api'"
set-ef-search
Set the ef_search parameter for the HNSW index. Higher values give more accurate results at the cost of slower queries. Default is 40. Typical range: 10-200.
PgEmbeddingStore -> Int -> Result Unit String
Embeddings.Postgres.set-ef-search store 100
upsert-batch
Insert multiple embeddings at once using a transaction. More efficient than individual inserts.
PgEmbeddingStore -> List Record -> Result Int String
get-all
Get all embeddings (for small stores). Warning: May be slow for large stores. Consider using search instead.
PgEmbeddingStore -> List Record
clear
Clear all embeddings from the store.
PgEmbeddingStore -> Result Unit String
truncate
Truncate the table (faster than delete for large tables).
PgEmbeddingStore -> Result Unit String
vacuum
Vacuum the table to reclaim space and update statistics.
PgEmbeddingStore -> Result Unit String
reindex
Reindex the HNSW index (useful after large batch inserts).
PgEmbeddingStore -> Result Unit String
drop-index
Drop the HNSW index.
PgEmbeddingStore -> Result Unit String
create
Create or open an embedding store. Creates the embeddings table if it doesn't exist.
Parameters:
NonEmptyString -> Int -> Symbol -> Result EmbeddingStore String
store = Embeddings.SQLite.create "knowledge.db" 1536 :cosine
open
Open an existing embedding store. Does not create table if it doesn't exist.
NonEmptyString -> Int -> Symbol -> Result EmbeddingStore String
store = Embeddings.SQLite.open "knowledge.db" 1536 :cosine
close
Close the embedding store.
EmbeddingStore -> Unit
upsert
Insert or update an embedding. If an embedding with the same ID exists, it will be replaced.
Parameters:
EmbeddingStore -> String -> String -> List Float -> String -> Result Unit String
Embeddings.SQLite.upsert store "doc1" "Hello world" [0.1, 0.2, ...] "{\"source\": \"api\"}"
insert
Insert an embedding (alias for upsert).
EmbeddingStore -> String -> String -> List Float -> String -> Result Unit String
get
Get an embedding by ID. Returns the embedding record or None if not found.
EmbeddingStore -> String -> Option Record
match Embeddings.SQLite.get store "doc1"
| Some result -> println result.content
| None -> println "Not found"
delete
Delete an embedding by ID.
EmbeddingStore -> String -> Result Unit String
Embeddings.SQLite.delete store "doc1"
count
Count total embeddings in the store.
EmbeddingStore -> Int
n = Embeddings.SQLite.count store # => 1234
exists?
Check if an embedding exists by ID.
EmbeddingStore -> String -> Bool
search
Search for top-k similar embeddings. Performs brute-force similarity comparison against all stored embeddings.
Parameters:
Returns:
EmbeddingStore -> List Float -> Int -> Result (List Record) String
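Brute-force search means scoring the query against every stored vector and keeping the k best. The Python sketch below illustrates that idea with cosine similarity; it is not the package's code. The in-memory `rows` list and the `id` field are assumptions for the example, while `score` and `content` mirror the record fields used in the Kit examples.

```python
import heapq
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def search(rows, query, k):
    """Score every (id, content, vector) row and return the k highest-scoring records."""
    scored = (
        {"id": row_id, "content": content, "score": cosine(query, vec)}
        for row_id, content, vec in rows
    )
    # nlargest returns results sorted from highest to lowest score.
    return heapq.nlargest(k, scored, key=lambda r: r["score"])

rows = [
    ("a", "first doc", [1.0, 0.0]),
    ("b", "second doc", [0.0, 1.0]),
    ("c", "third doc", [1.0, 1.0]),
]
results = search(rows, [1.0, 0.0], 2)
```

Because every row is scored, the cost grows linearly with the number of stored embeddings, which is the trade-off against an approximate index such as HNSW on the PostgreSQL backend.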
match Embeddings.SQLite.search store query-vec 10
| Ok results ->
results |> each fn(r) => println "${r.score}: ${r.content}"
| Err e -> println "Search failed"
search-threshold
Search with a minimum score threshold. Only returns results with score >= threshold.
EmbeddingStore -> List Float -> Float -> Result (List Record) String
search-filter
Search with metadata filtering. Only searches embeddings where filter function returns true.
EmbeddingStore -> List Float -> Int -> (Json -> Bool) -> Result (List Record) String
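Filtering before ranking can be sketched as follows. This Python illustration assumes metadata is stored as a JSON string (as in the upsert examples) and parsed before the predicate is applied; the row layout and the `keep` parameter name are hypothetical, and dot product is used as the score only to keep the example short.

```python
import json

def search_filter(rows, query, k, keep):
    """Rank only the rows whose parsed metadata satisfies the predicate `keep`."""
    candidates = [
        (row_id, vec)
        for row_id, vec, metadata in rows
        if keep(json.loads(metadata))  # metadata is a JSON string in this sketch
    ]
    scored = [
        {"id": row_id, "score": sum(x * y for x, y in zip(query, vec))}
        for row_id, vec in candidates
    ]
    scored.sort(key=lambda r: r["score"], reverse=True)
    return scored[:k]

rows = [
    ("doc1", [1.0, 0.0], '{"source": "api"}'),
    ("doc2", [0.9, 0.1], '{"source": "web"}'),
    ("doc3", [0.5, 0.5], '{"source": "api"}'),
]
hits = search_filter(rows, [1.0, 0.0], 5, lambda m: m.get("source") == "api")
```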
upsert-batch
Insert multiple embeddings at once. More efficient than individual inserts.
EmbeddingStore -> List Record -> Result Int String
get-all
Get all embeddings (for small stores). Warning: May be slow for large stores.
EmbeddingStore -> List Record
clear
Clear all embeddings from the store.
EmbeddingStore -> Result Unit String