@lidorcg I added experimental support for unstructured inputs (edn/json) in this PR https://github.com/replikativ/datahike/pull/730 as documented here https://github.com/replikativ/datahike/blob/729-support-unstructured-input/doc/unstructured.md. I can merge the PR in the next few days, but I would like to incorporate your feedback first if you have any.
Hey @whilo 👋🏼
I finally got around to checking out the unstructured transact support.
My experience:
When transacting deeply nested data that has no schema, only the first level gets a schema; everything below it is inserted as a single whole value.
However, when I call du/process-unstructured-data first, it does derive the full schema. I was under the impression that transact uses du/process-unstructured-data under the hood, so I was baffled at first.
I want to stress that this is an experience report, not a bug report.
I believe I found an interesting workflow with the current behavior:
1. I use a convention of my own to decide what counts as an entity (currently: maps that have a keyword key named id).
2. I clean the data of everything else.
3. I use du/process-unstructured-data to derive the schema, filter out every attribute ident that already exists, and transact the resulting schema patch.
4. Then I transact the data.
This gives me a lot of control over what becomes an entity or ref and what stays a plain value.
I can still apply arbitrary predicates to values, and use queries for entities, relations, and attributes.
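
To make the convention concrete, here is a minimal sketch of the data-side steps. It is written in Python rather than Clojure and deliberately calls nothing from Datahike. All function names (`attr_name`, `is_entity`, `collect_entities`, `schema_patch`) are hypothetical, keywords are modeled as plain strings such as `":user/id"`, and the list of attribute maps passed to `schema_patch` only stands in for whatever du/process-unstructured-data returns; its exact shape is an assumption.

```python
def attr_name(key):
    """Name part of a keyword-like string, e.g. ':user/id' -> 'id'."""
    return key.lstrip(":").rsplit("/", 1)[-1]


def is_entity(value):
    """Convention from the comment above: a map with a key named 'id'."""
    return isinstance(value, dict) and any(
        isinstance(k, str) and attr_name(k) == "id" for k in value
    )


def collect_entities(value):
    """Walk nested dicts/lists and return every map that counts as an entity."""
    found = []
    if isinstance(value, dict):
        if is_entity(value):
            found.append(value)
        for child in value.values():
            found.extend(collect_entities(child))
    elif isinstance(value, list):
        for child in value:
            found.extend(collect_entities(child))
    return found


def schema_patch(derived_schema, existing_idents):
    """Keep only derived attributes whose ident is not already in the database.

    `derived_schema` is assumed to be a list of attribute maps keyed by
    ':db/ident'; `existing_idents` is a set of ident strings.
    """
    return [a for a in derived_schema if a[":db/ident"] not in existing_idents]
```

With this split, the entity check decides what may become an entity or ref, and `schema_patch` ensures that only new attributes are transacted before the data itself.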
For now I'm quite happy 😁.
I am happy to hear that 🙂
@whilo thank you so much for your responsiveness and willingness to help. Unfortunately both @itai and I are away on holiday and will be back in about a week and a half. We'll be happy to test it and provide feedback once we're back 🙏
Sounds good!