regex

PCRE2-based regular expression library for Kit

Files

| File | Description |
| --- | --- |
| .editorconfig | Editor formatting configuration |
| .gitignore | Git ignore rules for build artifacts and dependencies |
| .tool-versions | asdf tool versions (Zig, Kit) |
| LICENSE | MIT license file |
| README.md | This file |
| c/kit_regex.c | C FFI wrapper around PCRE2 |
| c/kit_regex.h | C header for the FFI wrapper |
| docs/api.html | Generated API documentation |
| docs/basic.html | Generated documentation for the basic example |
| docs/regex.html | Generated documentation for the main regex example |
| examples/basic.kit | Basic usage example |
| examples/regex.kit | Full feature walkthrough |
| kit.toml | Package manifest with metadata, native build settings, and tasks |
| src/regex.kit | Kit Regex API |
| tests/regex.test.kit | Tests for regex helpers and public API shape |

Dependencies

No Kit package dependencies.

This is an FFI package: building it requires the PCRE2 development headers and libraries.

| Platform | Native dependency |
| --- | --- |
| macOS | brew install pcre2 |
| Ubuntu/Debian | sudo apt install libpcre2-dev |
| Fedora | sudo dnf install pcre2-devel |

The package declares the ffi capability and builds c/kit_regex.c into the native wrapper library used by src/regex.kit.

Installation

kit add gitlab.com/kit-lang/packages/kit-regex.git

Usage

import Kit.Regex as Regex

main = fn =>
  # Match anywhere in a string
  if Regex.matches? "\\d+" "abc123def" then
    println "contains digits"

  # Find the first match position
  pos = Regex.find "\\d+" "abc123def"
  println "first digit starts at ${pos}"

  # Extract the first match
  match Regex.extract Regex.email-pattern "Contact support@example.com"
    | Some email -> println "email: ${email}"
    | None -> println "no email found"

  # Capture groups by index, where 0 is the full match
  match Regex.group "(\\d{4})-(\\d{2})-(\\d{2})" "Date: 2026-04-30" 1
    | Some year -> println "year: ${year}"
    | None -> println "no date found"

  # Replace and remove matches
  normalized = Regex.replace-all "\\s+" "hello   world" " "
  println normalized

  stripped = Regex.remove-all "\\d+" "foo123bar456"
  println stripped

  # Validate whole strings
  println "valid email: ${Regex.valid-email? "user@example.com"}"
  println "integer: ${Regex.full-match? "-?\\d+" "12345"}"

  # Compile once when a pattern will be reused
  match Regex.compile "\\b\\w+@\\w+[.]\\w+\\b"
    | Ok compiled ->
      println "has email: ${Regex.matches-compiled? compiled "Email: user@test.com"}"
      println "position: ${Regex.find-compiled compiled "Email: user@test.com"}"
      Regex.free compiled
    | Err e ->
      println "compile error: ${e}"

main

API Overview

The high-level API compiles patterns for one-off use and frees the native PCRE2 handle internally:

| Function | Description |
| --- | --- |
| matches? | Test whether a pattern matches anywhere in a subject |
| find | Return the first match position, or -1 |
| extract | Return the first match as Option String |
| group | Return a capture group as Option String |
| group-count | Return the number of capture groups in a pattern |
| replace | Replace the first match |
| replace-all | Replace every match |
| remove | Remove the first match |
| remove-all | Remove every match |
| starts-with? | Test whether the subject starts with a pattern |
| ends-with? | Test whether the subject ends with a pattern |
| full-match? | Test whether the entire subject matches a pattern |
| valid-email? | Validate a basic email address |
| valid-url? | Validate a basic HTTP/HTTPS URL |

For repeated matches, use compile or compile-with-options and the *-compiled functions. Always call Regex.free on compiled handles when finished.
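As a minimal sketch of that workflow, the following reuses one compiled handle across several subjects and frees it once at the end. It uses only calls that appear elsewhere in this README (compile, matches-compiled?, find-compiled, free); the pattern and subject strings are illustrative assumptions, not taken from the package examples.

```kit
import Kit.Regex as Regex

main = fn =>
  # Compile the date pattern once instead of once per call
  match Regex.compile "\\d{4}-\\d{2}-\\d{2}"
    | Ok date ->
      # The same handle serves every subject below
      println "first: ${Regex.matches-compiled? date "Released 2026-04-30"}"
      println "second: ${Regex.matches-compiled? date "no date here"}"
      println "offset: ${Regex.find-compiled date "Released 2026-04-30"}"
      # Free exactly once, after the last use of the handle
      Regex.free date
    | Err e ->
      println "compile error: ${e}"

main
```

Keeping the Regex.free call as the last statement of the Ok branch makes it harder to use the handle after it has been released.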

Option helpers:

| Function | PCRE2 behavior |
| --- | --- |
| caseless | Case-insensitive matching |
| multiline | ^ and $ match line boundaries |
| dotall | . matches newlines |
| extended | Ignore pattern whitespace and allow comments |
| ungreedy | Invert quantifier greediness |
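To illustrate an option other than caseless, here is a small sketch using extended. It mirrors the calling convention of the compile-with-options example in the reference section below (unqualified names, option helper called as extended()); that convention is an assumption carried over from that example, and the pattern and subject are made up for illustration.

```kit
# Extended mode: the spaces in the pattern are layout only and are not matched
opts = extended()
match compile-with-options "\\d{3} - \\d{4}" opts
  | Ok regex ->
    # Should be true, because whitespace in the pattern is ignored
    matches-compiled? regex "555-1234"
    free regex
  | Err err -> print err
```

Without the extended option, the same pattern would require literal spaces around the hyphen in the subject.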

Common pattern helpers:

| Function | Description |
| --- | --- |
| email-pattern | Basic email pattern |
| url-pattern | Basic HTTP/HTTPS URL pattern |
| int-pattern | Signed integer pattern |
| float-pattern | Floating-point number pattern |
| word-pattern | Word-character pattern |
| whitespace-pattern | Whitespace pattern |

Development

Running Examples

Run examples with the interpreter:

kit run examples/basic.kit
kit run examples/regex.kit

Compile examples to native binaries:

kit build examples/basic.kit && ./basic
kit build examples/regex.kit && ./regex

Running Tests

Run the test suite:

kit test

Run the test suite with coverage:

kit test --coverage

Running kit dev

Run the standard development workflow (build, format, check, test):

kit dev

This will:

  1. Build the native regex wrapper
  2. Format and check source files in src/
  3. Type check examples in examples/
  4. Run tests in tests/ with coverage

Running Parity Checks

Check that interpreted and compiled examples produce the same output:

kit parity --failures-only

For automated runs, disable the spinner:

kit parity --no-spinner --failures-only

Generating Documentation

Generate API documentation from doc comments:

kit doc src/regex.kit -o docs/api.html

Note: Kit sources that contain doc comments (##) produce HTML documents under docs/*.html.

Cleaning Build Artifacts

Remove generated files, caches, and build artifacts:

kit task clean

Note: The clean task is defined in kit.toml.

Local Installation

To install this package locally for development:

kit install

This installs the package to ~/.kit/packages/@kit/regex/, builds libkit_regex, and makes the package available for import as Kit.Regex in other projects.

License

This package is released under the MIT License - see LICENSE for details.

This package links against PCRE2.

Exported Functions & Types

caseless

Enable case-insensitive pattern matching.

Int

multiline

Make ^ and $ match line boundaries instead of just string boundaries.

Int

dotall

Allow . to match newline characters.

Int

extended

Enable extended syntax allowing whitespace and comments in patterns.

Int

ungreedy

Invert the greediness of quantifiers (make them ungreedy by default).

Int

compile

Compile a regex pattern with default options.

String -> Result Ptr String

match compile "\\d{3}-\\d{4}"
  | Ok regex ->
    # Use regex...
    free regex
  | Err err -> print "Compilation failed: ${err}"

compile-with-options

Compile a regex pattern with custom options.

String -> Int -> Result Ptr String

# Case-insensitive matching
opts = caseless()
match compile-with-options "hello" opts
  | Ok regex ->
    matches-compiled? regex "HELLO"  # Returns true
    free regex
  | Err err -> print err

free

Free a compiled regex and release its resources.

Ptr -> Unit

match compile "\\d+"
  | Ok regex ->
    # Use regex...
    free regex  # Always free when done
  | Err _ -> ()

matches?

Test if a pattern matches anywhere in the subject string.

String -> String -> Bool

if matches? "\\d+" "hello123" then
  print "Contains digits"

matches-compiled?

Test if a compiled regex matches the subject string.

Ptr -> String -> Bool

match compile "\\d+"
  | Ok regex ->
    result = matches-compiled? regex "abc123"  # true
    free regex
    # Yield the result after freeing so both branches produce a Bool
    result
  | Err _ -> false

find

Find the position of the first match in the subject string.

String -> String -> Int

pos = find "\\d+" "abc123def"
# pos = 3
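For the no-match case, the API overview states that find returns -1. A minimal sketch in the same style as the example above, with an illustrative subject that contains no digits:

```kit
# No digits in the subject, so find reports the no-match sentinel
pos = find "\\d+" "abcdef"
# pos = -1
```

When a sentinel value is awkward to handle, extract returns an Option String instead.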

find-compiled

Find the position of the first match using a compiled regex.

Ptr -> String -> Int

match compile "\\d+"
  | Ok regex ->
    pos = find-compiled regex "abc123def"  # 3
    free regex
    # Yield the position after freeing so both branches produce an Int
    pos
  | Err _ -> -1

extract

Extract the first match from the subject string as an Option.

String -> String -> Option String

match extract "\\d+" "abc123def"
  | Some digits -> print "Found: ${digits}"  # "123"
  | None -> print "No match"

extract-compiled

Extract the first match using a compiled regex.

Ptr -> String -> Option String

match compile "\\d+"
  | Ok regex ->
    result = extract-compiled regex "abc123def"  # Some "123"
    free regex
    # Yield the result after freeing so both branches produce an Option String
    result
  | Err _ -> None

group

Get a capture group by index from a pattern match.

String -> String -> NonNegativeInt -> Option String

match group "(\\w+)@(\\w+)" "user@example.com" 1
  | Some username -> print "User: ${username}"  # "user"
  | None -> print "No match"

group-compiled

Get a capture group by index using a compiled regex.

Ptr -> String -> NonNegativeInt -> Option String

match compile "(\\w+)@(\\w+)"
  | Ok regex ->
    domain = group-compiled regex "user@example.com" 2  # Some "example"
    free regex
    # Yield the group after freeing so both branches produce an Option String
    domain
  | Err _ -> None

group-count

Get the number of capture groups in a pattern.

String -> Int

count = group-count "(\\w+)@(\\w+)\\.(\\w+)"
# count = 3

replace

Replace the first occurrence of a pattern in the subject string.

String -> String -> String -> String

result = replace "\\d+" "abc123def456" "NUM"
# result = "abcNUMdef456"

replace-compiled

Replace the first occurrence using a compiled regex.

Ptr -> String -> String -> String

# Bind the subject first so the Err branch can fall back to it
subject = "abc123def456"
match compile "\\d+"
  | Ok regex ->
    result = replace-compiled regex subject "NUM"
    # result = "abcNUMdef456"
    free regex
    result
  | Err _ -> subject

replace-all

Replace all occurrences of a pattern in the subject string.

String -> String -> String -> String

result = replace-all "\\d+" "abc123def456" "NUM"
# result = "abcNUMdefNUM"

replace-all-compiled

Replace all occurrences using a compiled regex.

Ptr -> String -> String -> String

# Bind the subject first so the Err branch can fall back to it
subject = "abc123def456"
match compile "\\d+"
  | Ok regex ->
    result = replace-all-compiled regex subject "NUM"
    # result = "abcNUMdefNUM"
    free regex
    result
  | Err _ -> subject

starts-with?

Test if the subject string starts with the given pattern.

String -> String -> Bool

if starts-with? "\\d+" "123abc" then
  print "Starts with digits"

ends-with?

Test if the subject string ends with the given pattern.

String -> String -> Bool

if ends-with? "\\d+" "abc123" then
  print "Ends with digits"

full-match?

Test if the entire subject string matches the pattern exactly.

String -> String -> Bool

if full-match? "\\d{3}-\\d{4}" "123-4567" then
  print "Valid format"

remove-all

Remove all occurrences of the pattern from the subject string.

String -> String -> String

result = remove-all "\\s+" "hello  world  "
# result = "helloworld"

remove

Remove the first occurrence of the pattern from the subject string.

String -> String -> String

result = remove "\\d+" "abc123def456"
# result = "abcdef456"

email-pattern

Basic email address pattern for validation and extraction.

Returns:

String

if matches? (email-pattern()) "user@example.com" then
  print "Valid email"

url-pattern

Basic HTTP/HTTPS URL pattern for validation and extraction.

Returns:

String

match extract (url-pattern()) "Visit https://example.com for info"
  | Some url -> print "Found: ${url}"
  | None -> print "No URL found"

int-pattern

Integer number pattern (with optional negative sign).

Returns:

String

if matches? (int-pattern()) "-42" then
  print "Valid integer"

float-pattern

Floating point number pattern (with optional negative sign and decimal point).

Returns:

String

if matches? (float-pattern()) "3.14159" then
  print "Valid float"

word-pattern

Word pattern matching alphanumeric characters and underscores.

Returns:

String

# extract returns the first word-pattern match as an Option String
word = extract (word-pattern()) "hello world"

whitespace-pattern

Whitespace pattern matching spaces, tabs, and newlines.

Returns:

String

result = replace-all (whitespace-pattern()) "hello  world" " "

valid-email?

Check if a string is a valid email address format.

String -> Bool

if valid-email? "user@example.com" then
  print "Valid email address"

valid-url?

Check if a string is a valid HTTP/HTTPS URL format.

String -> Bool

if valid-url? "https://example.com" then
  print "Valid URL"