pg-parser
| Kind | ffi-c |
|---|---|
| Capabilities | ffi |
| Categories | sql parser ffi |
| Keywords | postgresql sql parser ast lint validation |
PostgreSQL SQL parser - parse, validate, normalize, and fingerprint SQL queries using libpg_query
Files
| File | Description |
|---|---|
.editorconfig | Editor formatting configuration |
.gitignore | Git ignore rules for build artifacts and dependencies |
.tool-versions | asdf tool versions (Zig, Kit) |
LICENSE | MIT license file |
README.md | This file |
examples/demo.kit | Example: demo |
kit.toml | Package manifest with metadata and dependencies |
src/pg-parser.kit | Kit PostgreSQL Parser - Parse, validate, normalize, and fingerprint SQL queries |
tests/error-types.test.kit | Tests for error-types |
tests/parser.test.kit | Tests for parser |
zig/Makefile | Build automation rules |
zig/kit_ffi.zig | Zig FFI module for kit ffi |
zig/pg_query_shim.c | C shim wrapping libpg_query |
zig/pg_query_shim.h | C header for libpg_query shim |
Dependencies
No Kit package dependencies.
Related Packages
- kit-postgres - PostgreSQL bindings for Kit using libpq
Installation
kit add gitlab.com/kit-lang/packages/kit-pg-parser.gitUsage
import Kit.PgParser as Sql
main = fn() =>
# Parse SQL to JSON AST
match Sql.parse "SELECT id, name FROM users WHERE active = true"
| Ok ast -> println "Parse OK: ${String.slice ast 0 100}..."
| Err e -> println "Parse error: ${SqlParseError.show e}"
# Check if SQL is valid
valid = Sql.valid? "SELECT * FROM users"
println "Valid: ${Bool.show valid}"
# Normalize SQL (replace literals with $1, $2, etc.)
match Sql.normalize "SELECT * FROM users WHERE id = 42 AND name = 'Alice'"
| Ok normalized -> println "Normalized: ${normalized}"
| Err e -> println "Error: ${SqlParseError.show e}"
# Fingerprint for query grouping
match Sql.fingerprint "SELECT * FROM users WHERE id = 1"
| Ok fp -> println "Fingerprint: ${fp}"
| Err e -> println "Error: ${SqlParseError.show e}"
# Extract tables referenced in a query
match Sql.tables "SELECT * FROM users u JOIN orders o ON u.id = o.user_id"
| Ok tables -> println "Tables: ${List.show tables}"
| Err e -> println "Error: ${SqlParseError.show e}"
# Extract column references
match Sql.columns "SELECT id, name FROM users WHERE active = true"
| Ok cols -> println "Columns: ${List.show cols}"
| Err e -> println "Error: ${SqlParseError.show e}"
# Detect statement type
match Sql.statement-type "CREATE TABLE users (id INT)"
| Ok stmt-type -> println "Type: ${stmt-type}"
| Err e -> println "Error: ${SqlParseError.show e}"
# Split multi-statement SQL
match Sql.split "SELECT 1; SELECT 2; SELECT 3"
| Ok stmts -> println "Statements: ${Int.to-string (List.length stmts)}"
| Err e -> println "Error: ${SqlParseError.show e}"
# Deparse (normalize SQL formatting)
match Sql.deparse "select ID,NAME from USERS where ID=1"
| Ok deparsed -> println "Deparsed: ${deparsed}"
| Err e -> println "Error: ${SqlParseError.show e}"
# Cleanup allocated memory
Sql.cleanup()
mainAPI Reference
Core Functions
| Function | Signature | Description |
|---|---|---|
parse | String -> Result String SqlParseError | Parse SQL to JSON AST |
valid? | String -> Bool | Check if SQL is syntactically valid |
normalize | String -> Result String SqlParseError | Replace literals with $1, $2, etc. |
fingerprint | String -> Result String SqlParseError | Generate a hash for query grouping |
deparse | String -> Result String SqlParseError | Convert SQL to canonical format |
pg-version | () -> String | Get the PostgreSQL parser version |
cleanup | () -> Unit | Free internal parser memory |
parse-with-position | String -> Result String {message: String, position: Int} | Parse with detailed error position |
Multi-Statement Functions
| Function | Signature | Description |
|---|---|---|
split | String -> Result (List String) SqlParseError | Split multi-statement SQL |
statement-count | String -> Result Int SqlParseError | Count statements in SQL |
statement-type | String -> Result String SqlParseError | Detect statement type (SELECT, INSERT, etc.) |
statement-types | String -> Result (List String) SqlParseError | Detect types of multiple statements |
Extraction Functions
| Function | Signature | Description |
|---|---|---|
tables | String -> Result (List String) SqlParseError | Extract table names |
columns | String -> Result (List String) SqlParseError | Extract column references |
functions | String -> Result (List String) SqlParseError | Extract function calls |
ctes | String -> Result (List String) SqlParseError | Extract CTE names |
joins | String -> Result (List String) SqlParseError | Extract join types |
schemas | String -> Result (List String) SqlParseError | Extract schema names |
table-aliases | String -> Result (List String) SqlParseError | Extract table aliases (alias:table format) |
column-aliases | String -> Result (List String) SqlParseError | Extract column aliases |
params | String -> Result (List Int) SqlParseError | Extract positional parameter numbers |
Detection Functions
| Function | Signature | Description |
|---|---|---|
utility? | String -> Result Bool SqlParseError | Check if statement is DDL/utility |
classify-statements | String -> Result (List Bool) SqlParseError | Classify multiple statements as utility/DML |
has-cte? | String -> Result Bool SqlParseError | Check if SQL contains CTEs |
has-subquery? | String -> Result Bool SqlParseError | Check if SQL contains subqueries |
has-join? | String -> Result Bool SqlParseError | Check if SQL contains joins |
has-alias? | String -> Result Bool SqlParseError | Check if SQL contains aliases |
has-schema? | String -> Result Bool SqlParseError | Check if SQL has schema-qualified identifiers |
has-params? | String -> Result Bool SqlParseError | Check if SQL contains $N parameters |
Count Functions
| Function | Signature | Description |
|---|---|---|
join-count | String -> Result Int SqlParseError | Count total joins |
subquery-count | String -> Result Int SqlParseError | Count subqueries |
param-count | String -> Result Int SqlParseError | Count unique parameters |
param-max | String -> Result Int SqlParseError | Get maximum parameter number |
Complexity Analysis
| Function | Signature | Description |
|---|---|---|
complexity | String -> Result QueryComplexity SqlParseError | Detailed complexity breakdown |
complexity-score | String -> Result Int SqlParseError | Weighted complexity score |
Error Types
type SqlParseError =
| SqlSyntaxError {message: String, position: Int}
| SqlInternalError {message: String}SqlParseError implements Show and Error traits.
Development
Running Examples
Run examples with the interpreter:
kit run examples/demo.kitCompile examples to a native binary:
kit build examples/demo.kit && ./demoRunning Tests
Run the test suite:
kit testRun the test suite with coverage:
kit test --coverageRunning kit dev
Run the standard development workflow (format, check, test):
kit devThis will:
- Format and check source files in
src/ - Run tests in
tests/with coverage
Generating Documentation
Generate API documentation from doc comments:
kit docNote: Kit sources with doc comments (##) will generate HTML documents in docs/*.html
Cleaning Build Artifacts
Remove generated files, caches, and build artifacts:
kit task cleanNote: Defined in kit.toml.
Local Installation
To install this package locally for development:
kit installThis installs the package to ~/.kit/packages/@kit/pg-parser/, making it available for import as Kit.PgParser in other projects.
License
This package is released under the MIT License - see LICENSE for details.
libpg_query is released under the BSD 3-Clause License.
Exported Functions & Types
SqlParseError
SQL parse error type with position information.
Variants
SqlSyntaxError {message, position}SqlInternalError {message}parse
Parse SQL and return the AST as a JSON string.
Parameters:
Returns:
The JSON AST follows PostgreSQL's internal parse tree structure. See: https://github.com/pganalyze/libpg_query for documentation.
NonEmptyString -> Result String SqlParseError
match PgParser.parse "SELECT id, name FROM users WHERE active = true"
| Ok ast -> print ast
| Err e -> print "Error: ${SqlParseError.message e}"valid?
Check if SQL is syntactically valid.
Parameters:
Returns:
NonEmptyString -> Bool
if PgParser.valid? "SELECT * FROM users" then
print "Valid SQL"
else
print "Invalid SQL"normalize
Normalize SQL by replacing literal values with positional parameters.
This is useful for: - Comparing queries that differ only in literal values - Preparing queries for caching/analysis - Redacting sensitive data from SQL logs
Parameters:
Returns:
NonEmptyString -> Result String SqlParseError
PgParser.normalize "SELECT * FROM users WHERE id = 42 AND name = 'Alice'"
# => Ok "SELECT * FROM users WHERE id = $1 AND name = $2"fingerprint
Generate a fingerprint hash for query grouping.
Fingerprints allow grouping similar queries that differ only in literal values. This is useful for: - Query analysis and statistics - Identifying slow query patterns - Deduplicating query logs
Parameters:
Returns:
NonEmptyString -> Result String SqlParseError
PgParser.fingerprint "SELECT * FROM users WHERE id = 1"
# => Ok "a0ead580058af585"
PgParser.fingerprint "SELECT * FROM users WHERE id = 999"
# => Ok "a0ead580058af585" # Same fingerprint!pg-version
Get the PostgreSQL version that the parser was built with.
Returns:
String
print "Parser uses PostgreSQL ${PgParser.pg-version}"parse-with-position
Try to parse SQL and return detailed error information.
Unlike parse, this returns both the error message and position in a structured format, useful for editor integrations.
Parameters:
Returns:
NonEmptyString -> Result String {message: String, position: Int}
match PgParser.parse-with-position "SELEC * FROM users"
| Ok ast -> print "Valid"
| Err e -> print "Error at ${Int.to-string e.position}: ${e.message}"cleanup
Clean up internal memory used by the parser.
Call this when you're done using the parser to free allocated memory. This is optional but recommended for long-running applications.
Unit
split
Split SQL containing multiple statements into individual statements.
This handles complex cases like: - Comments containing semicolons - Strings containing semicolons - CREATE RULE statements with embedded queries
Parameters:
Returns:
NonEmptyString -> Result (List String) SqlParseError
PgParser.split "SELECT 1; SELECT 2; SELECT 3"
# => Ok ["SELECT 1", "SELECT 2", "SELECT 3"]
PgParser.split "SELECT 'a;b'; SELECT 1"
# => Ok ["SELECT 'a;b'", "SELECT 1"] -- semicolon in string handledstatement-count
Count the number of statements in SQL.
Parameters:
Returns:
NonEmptyString -> Result Int SqlParseError
PgParser.statement-count "SELECT 1; SELECT 2"
# => Ok 2utility?
Check if a single SQL statement is a utility statement.
Utility statements are DDL and administrative commands like: - CREATE, ALTER, DROP - GRANT, REVOKE - VACUUM, ANALYZE, REINDEX - SET, SHOW, RESET - COPY, TRUNCATE - LISTEN, NOTIFY, UNLISTEN
Non-utility (DML) statements: - SELECT, INSERT, UPDATE, DELETE, MERGE
Parameters:
Returns:
NonEmptyString -> Result Bool SqlParseError
PgParser.utility? "CREATE TABLE users (id INT)"
# => Ok true
PgParser.utility? "SELECT * FROM users"
# => Ok falseclassify-statements
Classify multiple statements as utility or DML.
Returns a list of booleans, one per statement, indicating whether each statement is a utility statement.
Parameters:
Returns:
NonEmptyString -> Result (List Bool) SqlParseError
PgParser.classify-statements "CREATE TABLE t (id INT); SELECT * FROM t"
# => Ok [true, false]tables
Extract all table names referenced in SQL.
Returns unique table names from the query, including: - Tables in FROM clauses - Tables in JOIN clauses - Tables in subqueries - Tables in INSERT/UPDATE/DELETE targets
Schema-qualified names are returned as "schema.table".
Parameters:
Returns:
NonEmptyString -> Result (List String) SqlParseError
PgParser.tables "SELECT * FROM users u JOIN orders o ON u.id = o.user_id"
# => Ok ["users", "orders"]
PgParser.tables "SELECT * FROM public.users"
# => Ok ["public.users"]columns
Extract all column references from SQL.
Returns unique column names from the query, including: - Columns in SELECT list - Columns in WHERE clause - Columns in ORDER BY, GROUP BY, HAVING - Columns in JOIN conditions
Qualified column names are returned as "table.column" or "schema.table.column". Star (*) columns are excluded.
Parameters:
Returns:
NonEmptyString -> Result (List String) SqlParseError
PgParser.columns "SELECT id, name FROM users WHERE active = true"
# => Ok ["id", "name", "active"]
PgParser.columns "SELECT u.id, u.name FROM users u"
# => Ok ["u.id", "u.name"]statement-type
Detect the type of a single SQL statement.
Returns a human-readable statement type like: - "SELECT", "INSERT", "UPDATE", "DELETE", "MERGE" (DML) - "CREATE TABLE", "CREATE INDEX", "CREATE VIEW" (DDL) - "ALTER TABLE", "DROP", "TRUNCATE" (DDL) - "GRANT", "REVOKE", "CREATE ROLE" (DCL) - "BEGIN", "COMMIT", "ROLLBACK" (TCL) - "EXPLAIN", "VACUUM", "ANALYZE" (Utility)
Parameters:
Returns:
NonEmptyString -> Result String SqlParseError
PgParser.statement-type "SELECT * FROM users"
# => Ok "SELECT"
PgParser.statement-type "CREATE TABLE users (id INT)"
# => Ok "CREATE TABLE"statement-types
Detect the types of multiple SQL statements.
Returns a list of statement types, one per statement.
Parameters:
Returns:
NonEmptyString -> Result (List String) SqlParseError
PgParser.statement-types "SELECT 1; INSERT INTO t VALUES (1); DROP TABLE t"
# => Ok ["SELECT", "INSERT", "DROP"]deparse
Convert SQL to its canonical form by parsing and deparsing.
This round-trips the SQL through PostgreSQL's parser: 1. Parse the SQL to an internal AST (protobuf format) 2. Deparse the AST back to SQL
The resulting SQL is in PostgreSQL's canonical format: - Consistent formatting and whitespace - Explicit parentheses where needed - Normalized identifier casing
Use cases: - Normalizing SQL formatting - Validating SQL can be round-tripped - Comparing SQL queries for semantic equality - Pretty-printing SQL
Parameters:
Returns:
NonEmptyString -> Result String SqlParseError
PgParser.deparse "select ID,NAME from USERS where ID=1"
# => Ok "SELECT id, name FROM users WHERE (id = 1)"functions
Extract all function calls from SQL.
Returns unique function names from the query, including: - Aggregate functions (COUNT, SUM, AVG, MIN, MAX) - Scalar functions (COALESCE, NULLIF, GREATEST, LEAST) - Date/time functions (NOW, CURRENT_TIMESTAMP, DATE_TRUNC) - String functions (UPPER, LOWER, CONCAT, SUBSTRING) - User-defined functions
Schema-qualified function names are returned as "schema.function_name".
Parameters:
Returns:
NonEmptyString -> Result (List String) SqlParseError
PgParser.functions "SELECT COUNT(*), MAX(price) FROM orders WHERE created_at > NOW()"
# => Ok ["count", "max", "now"]
PgParser.functions "SELECT public.my_func(id) FROM users"
# => Ok ["public.my_func"]ctes
Extract all CTE names from SQL.
CTEs (Common Table Expressions) are defined with the WITH clause. This function extracts the names of all CTEs in a query.
Parameters:
Returns:
NonEmptyString -> Result (List String) SqlParseError
PgParser.ctes "WITH active AS (SELECT * FROM users WHERE active) SELECT * FROM active"
# => Ok ["active"]
PgParser.ctes "WITH a AS (SELECT 1), b AS (SELECT 2) SELECT * FROM a, b"
# => Ok ["a", "b"]has-cte?
Check if SQL contains any CTEs (WITH clause).
A convenience function that returns a boolean instead of the full list.
Parameters:
Returns:
NonEmptyString -> Result Bool SqlParseError
PgParser.has-cte? "WITH x AS (SELECT 1) SELECT * FROM x"
# => Ok true
PgParser.has-cte? "SELECT * FROM users"
# => Ok falsesubquery-count
Count the number of subqueries in SQL.
Subqueries are nested SELECT statements that appear in: - WHERE clause (IN, EXISTS, scalar subqueries) - FROM clause (derived tables) - SELECT list (scalar subqueries)
Parameters:
Returns:
NonEmptyString -> Result Int SqlParseError
PgParser.subquery-count "SELECT * FROM users WHERE id IN (SELECT user_id FROM orders)"
# => Ok 1
PgParser.subquery-count "SELECT * FROM (SELECT * FROM a) x WHERE x.id IN (SELECT id FROM b)"
# => Ok 2has-subquery?
Check if SQL contains any subqueries.
A convenience function that returns a boolean instead of the count.
Parameters:
Returns:
NonEmptyString -> Result Bool SqlParseError
PgParser.has-subquery? "SELECT * FROM users WHERE id IN (SELECT 1)"
# => Ok true
PgParser.has-subquery? "SELECT * FROM users"
# => Ok falsejoins
Extract join types from SQL.
Returns a list of join types found in the query: - "INNER" - INNER JOIN or just JOIN - "LEFT" - LEFT JOIN or LEFT OUTER JOIN - "RIGHT" - RIGHT JOIN or RIGHT OUTER JOIN - "FULL" - FULL JOIN or FULL OUTER JOIN - "CROSS" - CROSS JOIN
Parameters:
Returns:
NonEmptyString -> Result (List String) SqlParseError
PgParser.joins "SELECT * FROM users u JOIN orders o ON u.id = o.user_id"
# => Ok ["INNER"]
PgParser.joins "SELECT * FROM a LEFT JOIN b ON a.id = b.a_id RIGHT JOIN c ON b.id = c.b_id"
# => Ok ["RIGHT", "LEFT"]join-count
Count the number of joins in SQL.
Parameters:
Returns:
NonEmptyString -> Result Int SqlParseError
PgParser.join-count "SELECT * FROM a JOIN b ON a.id = b.a_id JOIN c ON b.id = c.b_id"
# => Ok 2has-join?
Check if SQL contains any joins.
A convenience function that returns a boolean instead of the full list.
Parameters:
Returns:
NonEmptyString -> Result Bool SqlParseError
PgParser.has-join? "SELECT * FROM users u JOIN orders o ON u.id = o.user_id"
# => Ok true
PgParser.has-join? "SELECT * FROM users"
# => Ok falsetable-aliases
Extract table aliases from SQL.
Table aliases are defined with the AS keyword or implicitly: - FROM users AS u or FROM users u - JOIN orders AS o or JOIN orders o
Returns aliases in the format "alias:table" (e.g., "u:users").
Parameters:
Returns:
NonEmptyString -> Result (List String) SqlParseError
PgParser.table-aliases "SELECT * FROM users u JOIN orders o ON u.id = o.user_id"
# => Ok ["u:users", "o:orders"]column-aliases
Extract column aliases from SQL.
Column aliases are defined with the AS keyword: - SELECT id AS user_id - SELECT name AS full_name
Parameters:
Returns:
NonEmptyString -> Result (List String) SqlParseError
PgParser.column-aliases "SELECT id AS user_id, name AS full_name FROM users"
# => Ok ["user_id", "full_name"]has-alias?
Check if SQL contains any aliases.
A convenience function that returns a boolean indicating whether the query uses any table or column aliases.
Parameters:
Returns:
NonEmptyString -> Result Bool SqlParseError
PgParser.has-alias? "SELECT * FROM users u"
# => Ok true
PgParser.has-alias? "SELECT * FROM users"
# => Ok falseschemas
Extract schema names from SQL.
Schemas are found in schema-qualified identifiers: - public.users → "public" - sales.orders → "sales"
Returns unique schema names.
Parameters:
Returns:
NonEmptyString -> Result (List String) SqlParseError
PgParser.schemas "SELECT * FROM public.users JOIN sales.orders ON ..."
# => Ok ["public", "sales"]has-schema?
Check if SQL contains any schema-qualified identifiers.
A convenience function that returns a boolean indicating whether the query uses any schema-qualified table or function names.
Parameters:
Returns:
NonEmptyString -> Result Bool SqlParseError
PgParser.has-schema? "SELECT * FROM public.users"
# => Ok true
PgParser.has-schema? "SELECT * FROM users"
# => Ok falseparams
Extract all positional parameter numbers from SQL.
PostgreSQL uses $1, $2, $3, etc. for positional parameters. This function extracts all unique parameter numbers from a query, sorted in ascending order.
Parameters:
Returns:
NonEmptyString -> Result (List Int) SqlParseError
PgParser.params "SELECT * FROM users WHERE id = $1 AND status = $2"
# => Ok [1, 2]
PgParser.params "SELECT * FROM users WHERE id = $3 AND name = $1"
# => Ok [1, 3] -- sortedparam-count
Count the number of unique positional parameters in SQL.
Parameters:
Returns:
NonEmptyString -> Result Int SqlParseError
PgParser.param-count "SELECT * FROM users WHERE id = $1 AND status = $2"
# => Ok 2param-max
Get the maximum parameter number in SQL.
This is useful for knowing how many parameters need to be provided. For example, if a query uses $1 AND $3 (skipping $2), this returns 3.
Parameters:
Returns:
NonEmptyString -> Result Int SqlParseError
PgParser.param-max "SELECT * FROM users WHERE id = $1 AND name = $5"
# => Ok 5 -- max is 5, even though $2, $3, $4 are missinghas-params?
Check if SQL contains any positional parameters.
A convenience function that returns a boolean indicating whether the query uses $N style positional parameters.
Parameters:
Returns:
NonEmptyString -> Result Bool SqlParseError
PgParser.has-params? "SELECT * FROM users WHERE id = $1"
# => Ok true
PgParser.has-params? "SELECT * FROM users WHERE id = 1"
# => Ok falseQueryComplexity
Query complexity breakdown with individual metrics and total score.
Fields: - tables: Int - Number of tables referenced - joins: Int - Number of JOIN operations - subqueries: Int - Number of subqueries - ctes: Int - Number of CTEs (WITH clauses) - functions: Int - Number of function calls - score: Int - Weighted complexity score
Weights used for score calculation: - tables: 1 (base unit) - joins: 2 (multiplies row combinations) - subqueries: 3 (nested execution, most expensive) - ctes: 2 (similar to subqueries but often optimized) - functions: 1 (adds computation)
Variants
QueryComplexity {tables, joins, subqueries, ctes, functions, score}complexity
Calculate query complexity with detailed breakdown.
Returns a QueryComplexity record containing individual metrics and a weighted total score. Higher scores indicate more complex queries.
Parameters:
Returns:
NonEmptyString -> Result QueryComplexity SqlParseError
PgParser.complexity "SELECT * FROM users"
# => Ok (QueryComplexity {tables: 1, joins: 0, subqueries: 0, ctes: 0, functions: 0, score: 1})
PgParser.complexity "SELECT * FROM a JOIN b ON a.id = b.a_id JOIN c ON b.id = c.b_id"
# => Ok (QueryComplexity {tables: 3, joins: 2, subqueries: 0, ctes: 0, functions: 0, score: 7})
PgParser.complexity "SELECT * FROM users WHERE id IN (SELECT user_id FROM orders)"
# => Ok (QueryComplexity {tables: 2, joins: 0, subqueries: 1, ctes: 0, functions: 0, score: 5})complexity-score
Calculate query complexity score (simple version).
Returns just the weighted complexity score as an integer. Use complexity for a detailed breakdown of individual metrics.
Parameters:
Returns:
NonEmptyString -> Result Int SqlParseError
PgParser.complexity-score "SELECT * FROM users"
# => Ok 1
PgParser.complexity-score "SELECT * FROM a JOIN b JOIN c"
# => Ok 7