pg-parser
| Kind | ffi-zig |
|---|---|
| Categories | sql parser ffi |
| Keywords | postgresql sql parser ast lint validation |
PostgreSQL SQL parser - parse, validate, normalize, and fingerprint SQL queries using libpg_query
Files
| File | Description |
|---|---|
kit.toml | Package manifest with metadata and dependencies |
src/main.kit | SQL parsing, validation, and analysis API |
tests/test-parser.kit | SQL validation for SELECT, INSERT, UPDATE, DELETE |
examples/demo.kit | Full API demo: parse, normalize, fingerprint, analyze |
LICENSE | MIT license file |
Dependencies
No Kit package dependencies.
Installation
kit add gitlab.com/kit-lang/packages/kit-pg-parser.gitUsage
import Kit.PgParserLicense
MIT License - see LICENSE for details.
Exported Functions & Types
SqlParseError
SQL parse error type with position information.
Variants
SqlSyntaxError {message, position}SqlInternalError {message}parse
Parse SQL and return the AST as a JSON string.
Parameters:
Returns:
The JSON AST follows PostgreSQL's internal parse tree structure. See: https://github.com/pganalyze/libpg_query for documentation.
String -> Result String SqlParseError
match PgParser.parse "SELECT id, name FROM users WHERE active = true"
| Ok ast -> print ast
| Err e -> print "Error: ${SqlParseError.message e}"valid?
Check if SQL is syntactically valid.
Parameters:
Returns:
String -> Bool
if PgParser.valid? "SELECT * FROM users" then
print "Valid SQL"
else
print "Invalid SQL"normalize
Normalize SQL by replacing literal values with positional parameters.
This is useful for: - Comparing queries that differ only in literal values - Preparing queries for caching/analysis - Redacting sensitive data from SQL logs
Parameters:
Returns:
String -> Result String SqlParseError
PgParser.normalize "SELECT * FROM users WHERE id = 42 AND name = 'Alice'"
# => Ok "SELECT * FROM users WHERE id = $1 AND name = $2"fingerprint
Generate a fingerprint hash for query grouping.
Fingerprints allow grouping similar queries that differ only in literal values. This is useful for: - Query analysis and statistics - Identifying slow query patterns - Deduplicating query logs
Parameters:
Returns:
String -> Result String SqlParseError
PgParser.fingerprint "SELECT * FROM users WHERE id = 1"
# => Ok "a0ead580058af585"
PgParser.fingerprint "SELECT * FROM users WHERE id = 999"
# => Ok "a0ead580058af585" # Same fingerprint!pg-version
Get the PostgreSQL version that the parser was built with.
Returns:
String
print "Parser uses PostgreSQL ${PgParser.pg-version}"parse-with-position
Try to parse SQL and return detailed error information.
Unlike parse, this returns both the error message and position in a structured format, useful for editor integrations.
Parameters:
Returns:
String -> Result String {message: String, position: Int}
match PgParser.parse-with-position "SELEC * FROM users"
| Ok ast -> print "Valid"
| Err e -> print "Error at ${Int.to-string e.position}: ${e.message}"cleanup
Clean up internal memory used by the parser.
Call this when you're done using the parser to free allocated memory. This is optional but recommended for long-running applications.
Unit
split
Split SQL containing multiple statements into individual statements.
This handles complex cases like: - Comments containing semicolons - Strings containing semicolons - CREATE RULE statements with embedded queries
Parameters:
Returns:
String -> Result (List String) SqlParseError
PgParser.split "SELECT 1; SELECT 2; SELECT 3"
# => Ok ["SELECT 1", "SELECT 2", "SELECT 3"]
PgParser.split "SELECT 'a;b'; SELECT 1"
# => Ok ["SELECT 'a;b'", "SELECT 1"] -- semicolon in string handledstatement-count
Count the number of statements in SQL.
Parameters:
Returns:
String -> Result Int SqlParseError
PgParser.statement-count "SELECT 1; SELECT 2"
# => Ok 2utility?
Check if a single SQL statement is a utility statement.
Utility statements are DDL and administrative commands like: - CREATE, ALTER, DROP - GRANT, REVOKE - VACUUM, ANALYZE, REINDEX - SET, SHOW, RESET - COPY, TRUNCATE - LISTEN, NOTIFY, UNLISTEN
Non-utility (DML) statements: - SELECT, INSERT, UPDATE, DELETE, MERGE
Parameters:
Returns:
String -> Result Bool SqlParseError
PgParser.utility? "CREATE TABLE users (id INT)"
# => Ok true
PgParser.utility? "SELECT * FROM users"
# => Ok falseclassify-statements
Classify multiple statements as utility or DML.
Returns a list of booleans, one per statement, indicating whether each statement is a utility statement.
Parameters:
Returns:
String -> Result (List Bool) SqlParseError
PgParser.classify-statements "CREATE TABLE t (id INT); SELECT * FROM t"
# => Ok [true, false]tables
Extract all table names referenced in SQL.
Returns unique table names from the query, including: - Tables in FROM clauses - Tables in JOIN clauses - Tables in subqueries - Tables in INSERT/UPDATE/DELETE targets
Schema-qualified names are returned as "schema.table".
Parameters:
Returns:
String -> Result (List String) SqlParseError
PgParser.tables "SELECT * FROM users u JOIN orders o ON u.id = o.user_id"
# => Ok ["users", "orders"]
PgParser.tables "SELECT * FROM public.users"
# => Ok ["public.users"]columns
Extract all column references from SQL.
Returns unique column names from the query, including: - Columns in SELECT list - Columns in WHERE clause - Columns in ORDER BY, GROUP BY, HAVING - Columns in JOIN conditions
Qualified column names are returned as "table.column" or "schema.table.column". Star (*) columns are excluded.
Parameters:
Returns:
String -> Result (List String) SqlParseError
PgParser.columns "SELECT id, name FROM users WHERE active = true"
# => Ok ["id", "name", "active"]
PgParser.columns "SELECT u.id, u.name FROM users u"
# => Ok ["u.id", "u.name"]statement-type
Detect the type of a single SQL statement.
Returns a human-readable statement type like: - "SELECT", "INSERT", "UPDATE", "DELETE", "MERGE" (DML) - "CREATE TABLE", "CREATE INDEX", "CREATE VIEW" (DDL) - "ALTER TABLE", "DROP", "TRUNCATE" (DDL) - "GRANT", "REVOKE", "CREATE ROLE" (DCL) - "BEGIN", "COMMIT", "ROLLBACK" (TCL) - "EXPLAIN", "VACUUM", "ANALYZE" (Utility)
Parameters:
Returns:
String -> Result String SqlParseError
PgParser.statement-type "SELECT * FROM users"
# => Ok "SELECT"
PgParser.statement-type "CREATE TABLE users (id INT)"
# => Ok "CREATE TABLE"statement-types
Detect the types of multiple SQL statements.
Returns a list of statement types, one per statement.
Parameters:
Returns:
String -> Result (List String) SqlParseError
PgParser.statement-types "SELECT 1; INSERT INTO t VALUES (1); DROP TABLE t"
# => Ok ["SELECT", "INSERT", "DROP"]deparse
Convert SQL to its canonical form by parsing and deparsing.
This round-trips the SQL through PostgreSQL's parser: 1. Parse the SQL to an internal AST (protobuf format) 2. Deparse the AST back to SQL
The resulting SQL is in PostgreSQL's canonical format: - Consistent formatting and whitespace - Explicit parentheses where needed - Normalized identifier casing
Use cases: - Normalizing SQL formatting - Validating SQL can be round-tripped - Comparing SQL queries for semantic equality - Pretty-printing SQL
Parameters:
Returns:
String -> Result String SqlParseError
PgParser.deparse "select ID,NAME from USERS where ID=1"
# => Ok "SELECT id, name FROM users WHERE (id = 1)"functions
Extract all function calls from SQL.
Returns unique function names from the query, including: - Aggregate functions (COUNT, SUM, AVG, MIN, MAX) - Scalar functions (COALESCE, NULLIF, GREATEST, LEAST) - Date/time functions (NOW, CURRENT_TIMESTAMP, DATE_TRUNC) - String functions (UPPER, LOWER, CONCAT, SUBSTRING) - User-defined functions
Schema-qualified function names are returned as "schema.function_name".
Parameters:
Returns:
String -> Result (List String) SqlParseError
PgParser.functions "SELECT COUNT(*), MAX(price) FROM orders WHERE created_at > NOW()"
# => Ok ["count", "max", "now"]
PgParser.functions "SELECT public.my_func(id) FROM users"
# => Ok ["public.my_func"]ctes
Extract all CTE names from SQL.
CTEs (Common Table Expressions) are defined with the WITH clause. This function extracts the names of all CTEs in a query.
Parameters:
Returns:
String -> Result (List String) SqlParseError
PgParser.ctes "WITH active AS (SELECT * FROM users WHERE active) SELECT * FROM active"
# => Ok ["active"]
PgParser.ctes "WITH a AS (SELECT 1), b AS (SELECT 2) SELECT * FROM a, b"
# => Ok ["a", "b"]has-cte?
Check if SQL contains any CTEs (WITH clause).
A convenience function that returns a boolean instead of the full list.
Parameters:
Returns:
String -> Result Bool SqlParseError
PgParser.has-cte? "WITH x AS (SELECT 1) SELECT * FROM x"
# => Ok true
PgParser.has-cte? "SELECT * FROM users"
# => Ok falsesubquery-count
Count the number of subqueries in SQL.
Subqueries are nested SELECT statements that appear in: - WHERE clause (IN, EXISTS, scalar subqueries) - FROM clause (derived tables) - SELECT list (scalar subqueries)
Parameters:
Returns:
String -> Result Int SqlParseError
PgParser.subquery-count "SELECT * FROM users WHERE id IN (SELECT user_id FROM orders)"
# => Ok 1
PgParser.subquery-count "SELECT * FROM (SELECT * FROM a) x WHERE x.id IN (SELECT id FROM b)"
# => Ok 2has-subquery?
Check if SQL contains any subqueries.
A convenience function that returns a boolean instead of the count.
Parameters:
Returns:
String -> Result Bool SqlParseError
PgParser.has-subquery? "SELECT * FROM users WHERE id IN (SELECT 1)"
# => Ok true
PgParser.has-subquery? "SELECT * FROM users"
# => Ok falsejoins
Extract join types from SQL.
Returns a list of join types found in the query: - "INNER" - INNER JOIN or just JOIN - "LEFT" - LEFT JOIN or LEFT OUTER JOIN - "RIGHT" - RIGHT JOIN or RIGHT OUTER JOIN - "FULL" - FULL JOIN or FULL OUTER JOIN - "CROSS" - CROSS JOIN
Parameters:
Returns:
String -> Result (List String) SqlParseError
PgParser.joins "SELECT * FROM users u JOIN orders o ON u.id = o.user_id"
# => Ok ["INNER"]
PgParser.joins "SELECT * FROM a LEFT JOIN b ON a.id = b.a_id RIGHT JOIN c ON b.id = c.b_id"
# => Ok ["RIGHT", "LEFT"]join-count
Count the number of joins in SQL.
Parameters:
Returns:
String -> Result Int SqlParseError
PgParser.join-count "SELECT * FROM a JOIN b ON a.id = b.a_id JOIN c ON b.id = c.b_id"
# => Ok 2has-join?
Check if SQL contains any joins.
A convenience function that returns a boolean instead of the full list.
Parameters:
Returns:
String -> Result Bool SqlParseError
PgParser.has-join? "SELECT * FROM users u JOIN orders o ON u.id = o.user_id"
# => Ok true
PgParser.has-join? "SELECT * FROM users"
# => Ok falsetable-aliases
Extract table aliases from SQL.
Table aliases are defined with the AS keyword or implicitly: - FROM users AS u or FROM users u - JOIN orders AS o or JOIN orders o
Returns aliases in the format "alias:table" (e.g., "u:users").
Parameters:
Returns:
String -> Result (List String) SqlParseError
PgParser.table-aliases "SELECT * FROM users u JOIN orders o ON u.id = o.user_id"
# => Ok ["u:users", "o:orders"]column-aliases
Extract column aliases from SQL.
Column aliases are defined with the AS keyword: - SELECT id AS user_id - SELECT name AS full_name
Parameters:
Returns:
String -> Result (List String) SqlParseError
PgParser.column-aliases "SELECT id AS user_id, name AS full_name FROM users"
# => Ok ["user_id", "full_name"]has-alias?
Check if SQL contains any aliases.
A convenience function that returns a boolean indicating whether the query uses any table or column aliases.
Parameters:
Returns:
String -> Result Bool SqlParseError
PgParser.has-alias? "SELECT * FROM users u"
# => Ok true
PgParser.has-alias? "SELECT * FROM users"
# => Ok falseschemas
Extract schema names from SQL.
Schemas are found in schema-qualified identifiers: - public.users → "public" - sales.orders → "sales"
Returns unique schema names.
Parameters:
Returns:
String -> Result (List String) SqlParseError
PgParser.schemas "SELECT * FROM public.users JOIN sales.orders ON ..."
# => Ok ["public", "sales"]has-schema?
Check if SQL contains any schema-qualified identifiers.
A convenience function that returns a boolean indicating whether the query uses any schema-qualified table or function names.
Parameters:
Returns:
String -> Result Bool SqlParseError
PgParser.has-schema? "SELECT * FROM public.users"
# => Ok true
PgParser.has-schema? "SELECT * FROM users"
# => Ok falseparams
Extract all positional parameter numbers from SQL.
PostgreSQL uses $1, $2, $3, etc. for positional parameters. This function extracts all unique parameter numbers from a query, sorted in ascending order.
Parameters:
Returns:
String -> Result (List Int) SqlParseError
PgParser.params "SELECT * FROM users WHERE id = $1 AND status = $2"
# => Ok [1, 2]
PgParser.params "SELECT * FROM users WHERE id = $3 AND name = $1"
# => Ok [1, 3] -- sortedparam-count
Count the number of unique positional parameters in SQL.
Parameters:
Returns:
String -> Result Int SqlParseError
PgParser.param-count "SELECT * FROM users WHERE id = $1 AND status = $2"
# => Ok 2param-max
Get the maximum parameter number in SQL.
This is useful for knowing how many parameters need to be provided. For example, if a query uses $1 AND $3 (skipping $2), this returns 3.
Parameters:
Returns:
String -> Result Int SqlParseError
PgParser.param-max "SELECT * FROM users WHERE id = $1 AND name = $5"
# => Ok 5 -- max is 5, even though $2, $3, $4 are missinghas-params?
Check if SQL contains any positional parameters.
A convenience function that returns a boolean indicating whether the query uses $N style positional parameters.
Parameters:
Returns:
String -> Result Bool SqlParseError
PgParser.has-params? "SELECT * FROM users WHERE id = $1"
# => Ok true
PgParser.has-params? "SELECT * FROM users WHERE id = 1"
# => Ok falseQueryComplexity
Query complexity breakdown with individual metrics and total score.
Fields: - tables: Int - Number of tables referenced - joins: Int - Number of JOIN operations - subqueries: Int - Number of subqueries - ctes: Int - Number of CTEs (WITH clauses) - functions: Int - Number of function calls - score: Int - Weighted complexity score
Weights used for score calculation: - tables: 1 (base unit) - joins: 2 (multiplies row combinations) - subqueries: 3 (nested execution, most expensive) - ctes: 2 (similar to subqueries but often optimized) - functions: 1 (adds computation)
Variants
QueryComplexity {tables, joins, subqueries, ctes, functions, score}complexity
Calculate query complexity with detailed breakdown.
Returns a QueryComplexity record containing individual metrics and a weighted total score. Higher scores indicate more complex queries.
Parameters:
Returns:
String -> Result QueryComplexity SqlParseError
PgParser.complexity "SELECT * FROM users"
# => Ok (QueryComplexity {tables: 1, joins: 0, subqueries: 0, ctes: 0, functions: 0, score: 1})
PgParser.complexity "SELECT * FROM a JOIN b ON a.id = b.a_id JOIN c ON b.id = c.b_id"
# => Ok (QueryComplexity {tables: 3, joins: 2, subqueries: 0, ctes: 0, functions: 0, score: 7})
PgParser.complexity "SELECT * FROM users WHERE id IN (SELECT user_id FROM orders)"
# => Ok (QueryComplexity {tables: 2, joins: 0, subqueries: 1, ctes: 0, functions: 0, score: 5})complexity-score
Calculate query complexity score (simple version).
Returns just the weighted complexity score as an integer. Use complexity for a detailed breakdown of individual metrics.
Parameters:
Returns:
String -> Result Int SqlParseError
PgParser.complexity-score "SELECT * FROM users"
# => Ok 1
PgParser.complexity-score "SELECT * FROM a JOIN b JOIN c"
# => Ok 7