I'm struggling to get import-cloud working with anything besides a top-level :since filter set to an extremely recent time. Outside of that I just get "Importing" printed and then nothing happens until, sometimes, an eventual timeout error.
I have two attributes I need.
::attr-a has a few hundred new datoms transacted every minute, and I only need the most recent data.
::attr-b had a bunch of data inserted about a week ago. Duplicate data gets periodically re-transacted, but I think the datoms are still marked with the :t at which they were originally asserted, so I need the filter to include that earliest time.
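To pin down that earliest time, a history query along these lines should work (a sketch assuming a datomic.client.api connection conn to the source system, with ::attr-b standing in for the real attribute):

(require '[datomic.client.api :as d])

;; earliest tx instant at which ::attr-b was ever asserted, which is the
;; lower bound the import filter needs to include
(let [hist (d/history (d/db conn))]
  (ffirst
   (d/q '[:find (min ?inst)
          :where
          [_ ::attr-b _ ?tx true]   ; assertions only
          [?tx :db/txInstant ?inst]]
        hist)))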
Overall my entire database isn't huge. I'm not sure whether this note from the docstring:
> import-cloud limits the total number of datoms imported in a transaction to 16 million and limits strings to 1 million characters.
refers to the total pr-str size of the data, but either way, what I'm trying to get should fall well below those limits.
(let [two-mins-ago (-> (java.time.Instant/now)
                       (.minus (java.time.Duration/parse "PT2M"))
                       java.util.Date/from)
      earlier      #inst "2024-03-29"]
  (dl/import-cloud
   {:source source-conf
    :dest   dest-conf
    :filter
    ;; Just the top-level filter works if I make it recent enough, but I need older data
    #_{:since two-mins-ago}
    ;; Even specifying an attribute filter for the same restriction hangs forever
    #_{:include-attrs
       {::attr-a {:since two-mins-ago}}}
    ;; This is what I really want, but it also hangs forever
    {:include-attrs
     {::attr-a {:since two-mins-ago}
      ::attr-b {:before earlier}}}}))
Anything obvious I'm doing wrong?
Anyone else have any issues like this? How can I even begin to debug this?
I'm on Windows; it's possible that's relevant since it doesn't seem well supported, but the fact that I can get it working with one of the filters makes me hopeful it's possible to get this working.

I'll have to double-check, but I don't know for sure that mixing :before and :since for different attrs is supported.
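If it turns out mixing them isn't supported, one possible workaround (assuming import-cloud can be re-run against the same dest; the docs suggest repeat imports are incremental) would be to split the filters into two calls:

(dl/import-cloud
 {:source source-conf
  :dest   dest-conf
  :filter {:include-attrs {::attr-a {:since two-mins-ago}}}})

(dl/import-cloud
 {:source source-conf
  :dest   dest-conf
  :filter {:include-attrs {::attr-b {:before earlier}}}})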
To try to narrow it down a bit, I can't even get a single specific attribute to import. Even this hangs forever for me after printing "Importing", when :my/attr has a valueType of keyword and ~3000 datoms. This should only import those :my/attr datoms, right?
(dl/import-cloud
 {:source ...
  :dest   ...
  :filter {:include-attrs {:my/attr {}}}})
Shot in the dark, and it's unlikely this is a bug rather than something I'm doing wrong, but it feels sort of like it's always doing :my/* and importing everything in the namespace, even when I give it a specific attribute (which would explain the timeout, since there's a lot of stuff in the namespaces I have). Are you able to confirm that importing just a specific attribute with no other filters imports only those datoms?
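One rough way to test that theory (a sketch assuming a datomic.client.api connection conn to the source; the "my" namespace and attribute names are placeholders) is to count the datoms per attribute in the namespace and compare against the single attribute's count:

(require '[datomic.client.api :as d])

;; datom counts for every attribute in the "my" namespace
(let [db    (d/db conn)
      attrs (->> (d/q '[:find ?ident
                        :where
                        [?a :db/ident ?ident]
                        [?a :db/valueType]]   ; only installed attributes
                      db)
                 (map first)
                 (filter #(= "my" (namespace %))))]
  (into {}
        (for [a attrs]
          ;; :limit -1 because the client API returns only 1000 datoms by default
          [a (count (seq (d/datoms db {:index :aevt :components [a] :limit -1})))])))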
I wonder if that instance can handle the import request. Are there any alerts in the logs?
Hmm, none that seem related, just some like ConsumedWriteCapacityUnits < 150 for 15 datapoints within 15 minutes.
Are you being DDB read-throttled? (Check the DDB table's "Monitoring" dashboard tab, not CloudWatch.)
That explains why you could get very recent answers (that part of the log was still in memory)
So the imports can cause throttling even when queries for similar amounts of data don't seem to come close?
Your queries are hitting the object cache/EFS/S3; the import reads the log directly from DDB by scanning DDB items.
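For reference, the same throttle metrics can be pulled programmatically; here's a sketch using Cognitect's aws-api, with datomic-my-system as a placeholder for the storage table's actual name:

(require '[cognitect.aws.client.api :as aws])

(def cw (aws/client {:api :monitoring}))

;; sum of read-throttle events on the storage table over the last hour
(aws/invoke cw
  {:op :GetMetricStatistics
   :request {:Namespace  "AWS/DynamoDB"
             :MetricName "ReadThrottleEvents"
             :Dimensions [{:Name "TableName" :Value "datomic-my-system"}]
             :StartTime  (java.util.Date/from (.minus (java.time.Instant/now)
                                                      (java.time.Duration/ofHours 1)))
             :EndTime    (java.util.Date/from (java.time.Instant/now))
             :Period     300
             :Statistics ["Sum"]}})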
(Sorry it took so long to respond to this, I saw it the day you posted but had to think about how to diagnose it in the back of my mind.)
So the solution is probably to just raise read capacity on the table before doing imports, and change it back later right?
That could work, or you might also consider switching the table to DDB on-demand capacity mode. It's (marginally) more expensive since it's pay-per-request, but as long as you're aware of that you can decide what works best for you.
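A sketch of the temporary-bump approach with Cognitect's aws-api (the table name is again a placeholder, and note that UpdateTable requires both read and write units):

(require '[cognitect.aws.client.api :as aws])

(def ddb (aws/client {:api :dynamodb}))
(def table "datomic-my-system") ; placeholder -- check your storage stack

;; remember the current throughput so it can be restored afterwards
(def current
  (get-in (aws/invoke ddb {:op :DescribeTable
                           :request {:TableName table}})
          [:Table :ProvisionedThroughput]))

;; bump read capacity before the import
(aws/invoke ddb
  {:op :UpdateTable
   :request {:TableName table
             :ProvisionedThroughput
             {:ReadCapacityUnits  1000 ; pick a value sized to your import
              :WriteCapacityUnits (:WriteCapacityUnits current)}}})

;; ... run dl/import-cloud ...

;; restore the original capacity (or switch to on-demand instead with
;; {:TableName table :BillingMode "PAY_PER_REQUEST"})
(aws/invoke ddb
  {:op :UpdateTable
   :request {:TableName table
             :ProvisionedThroughput
             (select-keys current [:ReadCapacityUnits :WriteCapacityUnits])}})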