Fork me on GitHub
#sql
<
2023-06-07
>
aaelony17:06:06

Hello, I posted this in find-my-lib, but cross-posting here as well: does anyone know of a library that parses SQL text (mostly with, from, join, CTEs, and subqueries) into a Graphviz diagram showing how inputs flow?

isak19:06:55

I think you would want something that visualizes execution plans that you get from the database. Or do you mean something else when you say 'how inputs flow'?

aaelony20:06:10

No, nothing to do with execution plans. Most businesses have complex analytical queries to compute analytic targets ( DAU, MAU, churn, etc,,). Often there is complex business logic in the form of SQL queries that provide the basis for such computation (that differ for each business depending on how they wish to compute it though there are industry norms). It is often the case that there are several (or many) CTEs that do joins, groupings, filterings, and compute metrics along the way before the final projection. Hence a graphviz diagram is often helpful. The difficult part is that the SQL needs to be parsed properly before a diagram can be made. Fun project though, if it doesn't already exist.

aaelony20:06:02

Here is an example, but it could be much improved https://graphviz.org/Misc/sql2dot/

aaelony20:06:19

(i.e. it doesn't work very well on files I have tested it with)

aaelony20:06:28

e.g. Sql2Dot.java is perhaps a starting point

isak20:06:37

I've something related before, but only for table relationships based on consuming a bunch of queries and looking at the joins. The code is here, but keep in mind it is just an example: https://github.com/isaksky/FsSqlDom/blob/master/examples/FsSqlDomGallery/FsSqlDomGallery/GraphTableRelationships.fs

isak20:06:38

That is probably a cool idea for Clojure if there is a good SQL Parser available

🔥 2
aaelony20:06:34

Nice! My C# isn't too good, but table relationships are an important part. Typically, queries start with tables, then have CTE's and/or subqueries in route to the final projection

aaelony20:06:24

exactly, looking for a good SQL parser. Perhaps there is a BNF route with instaparse, not sure

isak20:06:24

Yea that could work. Or maybe this one, if you can find a good antler file: https://github.com/aphyr/clj-antlr Maybe from here: https://github.com/antlr/grammars-v4/tree/master/sql

aaelony20:06:32

never heard of g4 files

aaelony20:06:32

clj-antlr errors out and seems hard to use without a better understanding. Another resource might be https://github.com/datacamp/antlr-tsql

isak21:06:14

Ah, bummer. If you need T-SQL specifically though, I know the one inside this package works extremely well (we use it for parsing, rewriting and building SQL in production): https://www.nuget.org/packages/Microsoft.SqlServer.DacFx/162.1.39-preview https://learn.microsoft.com/en-us/dotnet/api/microsoft.sqlserver.transactsql.scriptdom.tsql150parser?view=sql-dacfx-161 Bummer if C#/F# is out of the question.

aaelony01:06:29

In R, https://cran.r-project.org/web/packages/RSqlParser/RSqlParser.pdf simplifies things without needing a parser. But anyways, a clojure library for this would be useful

refset11:06:49

There may well be something of use here... https://github.com/xtdb/xtdb/blob/2.x/core/src/main/clojure/xtdb/sql.clj (although the upstream EBNF grammar + edn conversion isn't available for distribution)

metal 2