sql

2023-06-07T17:01:06.154389Z

Hello, I posted this in find-my-lib, but cross-posting here as well: does anyone know of a library that parses SQL text (mostly with, from, join, CTEs, and subqueries) into a Graphviz diagram showing how inputs flow?

isak 2023-06-07T19:47:55.168479Z

I think you would want something that visualizes execution plans that you get from the database. Or do you mean something else when you say 'how inputs flow'?

2023-06-07T20:08:10.481659Z

No, nothing to do with execution plans. Most businesses have complex analytical queries to compute analytic targets ( DAU, MAU, churn, etc,,). Often there is complex business logic in the form of SQL queries that provide the basis for such computation (that differ for each business depending on how they wish to compute it though there are industry norms). It is often the case that there are several (or many) CTEs that do joins, groupings, filterings, and compute metrics along the way before the final projection. Hence a graphviz diagram is often helpful. The difficult part is that the SQL needs to be parsed properly before a diagram can be made. Fun project though, if it doesn't already exist.

2023-06-07T20:13:02.980279Z

Here is an example, but it could be much improved https://graphviz.org/Misc/sql2dot/

2023-06-07T20:13:19.793779Z

(i.e. it doesn't work very well on files I have tested it with)

2023-06-07T20:16:28.526669Z

e.g. Sql2Dot.java is perhaps a starting point

isak 2023-06-07T20:18:37.216679Z

I've something related before, but only for table relationships based on consuming a bunch of queries and looking at the joins. The code is here, but keep in mind it is just an example: https://github.com/isaksky/FsSqlDom/blob/master/examples/FsSqlDomGallery/FsSqlDomGallery/GraphTableRelationships.fs

isak 2023-06-07T20:19:38.789459Z

That is probably a cool idea for Clojure if there is a good SQL Parser available

🔥 1
2023-06-07T20:20:34.940619Z

Nice! My C# isn't too good, but table relationships are an important part. Typically, queries start with tables, then have CTE's and/or subqueries in route to the final projection

2023-06-07T20:21:24.821879Z

exactly, looking for a good SQL parser. Perhaps there is a BNF route with instaparse, not sure

2023-06-07T20:22:54.656359Z

some bnf grammars here: https://ronsavage.github.io/SQL/

isak 2023-06-07T20:23:24.403629Z

Yea that could work. Or maybe this one, if you can find a good antler file: https://github.com/aphyr/clj-antlr Maybe from here: https://github.com/antlr/grammars-v4/tree/master/sql

2023-06-07T20:28:32.637179Z

never heard of g4 files

2023-06-07T20:55:32.778289Z

clj-antlr errors out and seems hard to use without a better understanding. Another resource might be https://github.com/datacamp/antlr-tsql

isak 2023-06-07T21:20:14.720649Z

Ah, bummer. If you need T-SQL specifically though, I know the one inside this package works extremely well (we use it for parsing, rewriting and building SQL in production): https://www.nuget.org/packages/Microsoft.SqlServer.DacFx/162.1.39-preview https://learn.microsoft.com/en-us/dotnet/api/microsoft.sqlserver.transactsql.scriptdom.tsql150parser?view=sql-dacfx-161 Bummer if C#/F# is out of the question.

2023-06-08T01:28:29.463059Z

In R, https://cran.r-project.org/web/packages/RSqlParser/RSqlParser.pdf simplifies things without needing a parser. But anyways, a clojure library for this would be useful

refset 2023-06-28T11:43:49.986969Z

There may well be something of use here... https://github.com/xtdb/xtdb/blob/2.x/core/src/main/clojure/xtdb/sql.clj (although the upstream EBNF grammar + edn conversion isn't available for distribution)

1