beginners

2026-02-13T08:09:28.137629Z

Hi. At work we get fairly big (>600 MB) JSON files which we need to process. I'm able to load them with Cheshire and slurp, but I would like to explore them a little. Are there any tools which could help me with that? (One idea was to load them somehow into a Datomic database so that I can query it.)

Marcin Borkowski 2026-02-16T11:30:22.492769Z

Maybe only slightly related, but I have two blog posts about exploring data with UUIDs in Emacs so that when the point (cursor) is on a UUID, the JSON representation of the entity whose primary key is that UUID is shown in the echo area. My code is Node.js and Postgres-based, but can be adapted to other stacks. https://mbork.pl/2025-07-26_Finding_entities_with_given_uuids_in_the_current_project https://mbork.pl/2025-08-11_Using_Eldoc_to_show_entities_with_given_uuids_in_the_echo_area

👍 1
adi 2026-02-13T08:13:14.955839Z

Perhaps... dump the JSON records into a SQLite JSON column and use babashka to query SQLite.

➕ 1
adi 2026-02-13T08:13:50.039249Z

ref: https://github.com/babashka/pod-babashka-go-sqlite3

adi 2026-02-13T08:14:43.675659Z

Quick duck-search showed this blog post: https://www.dbpro.app/blog/sqlite-json-virtual-columns-indexing
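
A minimal sketch of the SQLite idea above, assuming the big file can be split into individual records. It uses Python's standard `sqlite3` module rather than babashka, purely to keep the example self-contained; the table name `docs`, the sample records, and the helper `orders_over` are all made up for illustration. It also assumes the SQLite build has the JSON functions available (the default in recent versions).

```python
import json
import sqlite3

# Hypothetical sample records; in practice these would come from the large JSON file.
records = [
    {"id": "a1", "kind": "order", "customer": "c1", "total": 40},
    {"id": "a2", "kind": "order", "customer": "c2", "total": 15},
    {"id": "c1", "kind": "customer", "name": "Ada"},
]

conn = sqlite3.connect(":memory:")
# One row per JSON record, stored as text and queried with SQLite's JSON functions.
conn.execute("CREATE TABLE docs (body TEXT NOT NULL)")
conn.executemany(
    "INSERT INTO docs (body) VALUES (?)",
    [(json.dumps(record),) for record in records],
)

# Index on an extracted field so filters on '$.kind' do not have to scan every row.
conn.execute("CREATE INDEX docs_kind ON docs (json_extract(body, '$.kind'))")


def orders_over(conn, minimum):
    """Return the ids of 'order' records whose total exceeds `minimum`."""
    rows = conn.execute(
        "SELECT json_extract(body, '$.id') FROM docs "
        "WHERE json_extract(body, '$.kind') = 'order' "
        "AND json_extract(body, '$.total') > ? "
        "ORDER BY 1",
        (minimum,),
    )
    return [row[0] for row in rows]


big_orders = orders_over(conn, 20)
```

The same SQL should carry over to whichever SQLite client ends up being used; only the connection code would change.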

2026-02-13T08:16:01.431929Z

Thanks! That's a great idea. I had it in my mind that jsonb was much more limited in size.

adi 2026-02-13T08:18:26.407679Z

Physical storage limit if each file is a single 600 MB JSON object, see "Maximum length of a string or BLOB": https://sqlite.org/limits.html
> The current implementation will only support a string or BLOB length up to 2,147,483,645 bytes

liebs 2026-02-13T09:23:54.554879Z

https://jqlang.org/

2026-02-13T09:37:48.797719Z

Thanks! I had a short look there in the beginning. But my documents have a lot of internal relations (one thing references another via a UUID), so I have the feeling some database is easier.
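
To make the "one thing references another via a UUID" shape concrete, here is a rough sketch in plain Python of following such references in memory before committing to a database. Everything in it is an assumption for illustration: that each entity carries its own UUID under an `id` key, that references are top-level string fields, and the names `index_by_id` and `references`.

```python
import re

# Matches the canonical 8-4-4-4-12 hex UUID text form.
UUID_RE = re.compile(
    r"^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$",
    re.IGNORECASE,
)


def index_by_id(entities, key="id"):
    """Build a lookup table from each entity's own UUID to the entity."""
    return {entity[key]: entity for entity in entities}


def references(entity, index, key="id"):
    """Return {field: referenced entity} for top-level fields holding a known UUID."""
    found = {}
    for field, value in entity.items():
        if field == key or not isinstance(value, str):
            continue
        if UUID_RE.match(value) and value in index:
            found[field] = index[value]
    return found


# Hypothetical data: an order pointing at a customer by UUID.
entities = [
    {"id": "11111111-1111-1111-1111-111111111111", "type": "customer", "name": "Ada"},
    {
        "id": "22222222-2222-2222-2222-222222222222",
        "type": "order",
        "customer": "11111111-1111-1111-1111-111111111111",
    },
]

index = index_by_id(entities)
order_refs = references(entities[1], index)
```

This only looks at top-level fields; nested references would need a recursive walk, which is roughly the point where a database starts to look more attractive.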

Thomas McInnis 2026-02-13T09:57:31.511519Z

Paula Gearon did a great presentation about auto schema generation from JSON for Datomic at the 2017 Conj, which I happened to watch a couple of days ago: https://www.youtube.com/watch?v=8jXEqvTnOTg

2026-02-13T10:01:23.532899Z

Cool

teodorlu 2026-02-13T12:54:41.862849Z

https://github.com/cnuernber/charred might come in handy. JSON parser optimized for performance, quite a bit faster than Cheshire for large JSON files.

respatialized 2026-02-13T13:18:01.976139Z

I would recommend https://duckdb.org/ over sqlite in this instance

💯 1
➕ 3
Bailey Kocin 2026-02-13T13:48:01.866289Z

Can second DuckDB. Really useful, even just the command-line tool.

respatialized 2026-02-13T14:00:08.048709Z

If you want to go the Datalog route because of the internal relations, consider using https://github.com/datalevin/datalevin. The two big selling points for this use case are: 1. it runs locally with minimal setup; 2. you don't have to define a schema up front like you do with Datomic (though a schema may be beneficial for the subset of keys that express relations). https://github.com/quoll/asami is also a schemaless Datalog DB and has even more advanced support for graph relations. Datalevin might have better performance.

Harold 2026-02-13T16:02:29.015129Z

We use fx for this, it can be great: https://fx.wtf/ If the shape is right (an array of objects with many keys in common, like a table) then visidata https://www.visidata.org/ can be helpful too.
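
One rough way to check whether the shape is "like a table" before reaching for a tabular tool is to measure how often each key appears across the objects. This is only a sketch in plain Python; the function name `key_coverage` and the sample rows are hypothetical.

```python
from collections import Counter


def key_coverage(rows):
    """Return the fraction of objects in which each key appears."""
    counts = Counter(key for row in rows for key in row)
    total = len(rows)
    return {key: count / total for key, count in counts.items()}


# Hypothetical sample: 'id' and 'name' are shared by every row, 'tag' is rare.
rows = [
    {"id": 1, "name": "a", "tag": "x"},
    {"id": 2, "name": "b"},
    {"id": 3, "name": "c"},
    {"id": 4, "name": "d"},
]

coverage = key_coverage(rows)
```

If most keys come out close to 1.0, the data is probably table-like; many low fractions suggest a more irregular, document-style shape.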

2026-02-13T16:26:32.734129Z

Thanks

👍 1
2026-02-20T18:51:43.031169Z

Turning JSON into a graph DB is one of the selling points of Asami: https://github.com/quoll/asami There was even a talk, "Asami: Turn your JSON into a Graph in 2 Lines": https://youtu.be/-XegX_K6w-o?si=oQsomgNemoZnTDzX

2026-02-20T19:00:31.787449Z

Thanks, I will have a look 👍