#clojars
2018-03-15
xtreak2908:03:45

Is there a database or list of dependencies for a given artifact? I can't access it from the API. I want to build a website that prints the dependency tree of a given library at a given version and also tracks issues with the latest JDK. Is this data available via the API or as a JSON dump, or do I need to crawl the data?

juhoteperi08:03:27

@xtreak29 Dependencies are part of maven data: http://repo.clojars.org/metosin/eines/0.0.9/eines-0.0.9.pom look for <dependencies>
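
For readers following along, here is a minimal sketch of pulling the `<dependencies>` entries out of a POM with Python's standard library. The sample POM and the `parse_dependencies` helper are hypothetical, not the actual contents of eines-0.0.9.pom, and the sketch assumes the POM declares the standard Maven namespace on its root element.

```python
import xml.etree.ElementTree as ET

# Standard Maven POM namespace; POMs that omit it would need different handling.
POM_NS = {"m": "http://maven.apache.org/POM/4.0.0"}

# Hypothetical sample, made up for illustration only.
SAMPLE_POM = """<project xmlns="http://maven.apache.org/POM/4.0.0">
  <groupId>example</groupId>
  <artifactId>demo</artifactId>
  <version>0.1.0</version>
  <dependencies>
    <dependency>
      <groupId>org.clojure</groupId>
      <artifactId>clojure</artifactId>
      <version>1.9.0</version>
    </dependency>
    <dependency>
      <groupId>example</groupId>
      <artifactId>test-helper</artifactId>
      <version>0.2.0</version>
      <scope>test</scope>
    </dependency>
  </dependencies>
</project>
"""


def parse_dependencies(pom_xml):
    """Return the direct dependencies declared in a POM as a list of dicts."""
    root = ET.fromstring(pom_xml)
    deps = []
    # Only top-level <dependencies>, not <dependencyManagement> or plugin deps.
    for dep in root.findall("m:dependencies/m:dependency", POM_NS):
        deps.append({
            "group": dep.findtext("m:groupId", namespaces=POM_NS),
            "artifact": dep.findtext("m:artifactId", namespaces=POM_NS),
            "version": dep.findtext("m:version", namespaces=POM_NS),
            # Maven treats a missing <scope> as "compile".
            "scope": dep.findtext("m:scope", default="compile", namespaces=POM_NS),
        })
    return deps


for d in parse_dependencies(SAMPLE_POM):
    print(d)
```

This only reads what the POM declares directly; transitive dependencies would need the POMs of each dependency as well.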

xtreak2908:03:50

Thanks. Is the POM available as JSON or, more helpfully, as a dump? That would help me avoid scraping.

xtreak2908:03:36

Sorry. I clicked the link and it rendered in a browser. The XML output is helpful. Still, a database would be very helpful; otherwise I need to download all the POMs and construct a database myself. I am sure someone has done this already.
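
One way such a self-built database could look, as a rough sketch: an SQLite table of parent/child edges plus a small recursive walk that prints a tree. The table layout, the `build_db` and `dependency_tree` helpers, and the sample coordinates are all hypothetical. Real Maven resolution also handles version conflicts, exclusions, and scopes, which this ignores.

```python
import sqlite3


def build_db(edges):
    """Create an in-memory database holding (parent, child) dependency edges."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE dependency (parent TEXT NOT NULL, child TEXT NOT NULL)"
    )
    conn.executemany("INSERT INTO dependency VALUES (?, ?)", edges)
    return conn


def dependency_tree(conn, coord, depth=0, seen=None):
    """Return indented lines describing the dependency tree below coord."""
    seen = set() if seen is None else seen
    lines = ["  " * depth + coord]
    if coord in seen:
        # Already expanded elsewhere: list it but do not recurse again,
        # which also guards against cycles.
        return lines
    seen.add(coord)
    rows = conn.execute(
        "SELECT child FROM dependency WHERE parent = ? ORDER BY child", (coord,)
    ).fetchall()
    for (child,) in rows:
        lines.extend(dependency_tree(conn, child, depth + 1, seen))
    return lines


# Hypothetical edges, e.g. produced by parsing each downloaded POM.
EDGES = [
    ("app:1.0", "lib-a:2.0"),
    ("app:1.0", "lib-b:1.1"),
    ("lib-a:2.0", "lib-c:0.3"),
]

conn = build_db(EDGES)
print("\n".join(dependency_tree(conn, "app:1.0")))
```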

juhoteperi08:03:19

As far as I remember, Clojars doesn't have the dependency information in DB

xtreak2908:03:35

Thanks. It gave me a good pointer. It's just that I don't want to scrape and load the server if the data is already available, since there are around 153k entries according to http://clojars.org/repo/all-poms.txt.gz.

xtreak2908:03:45

With each file around 5KB it will be a lot of data to scrape 😕
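
To put rough numbers on that, here is a sketch that reads a gzipped POM listing and estimates the total download size. It assumes each line of the listing is a relative path in the standard Maven layout, such as `./metosin/eines/0.0.9/eines-0.0.9.pom`; the exact format of all-poms.txt.gz is an assumption, and the helper names are hypothetical. It works on bytes already in memory and does not fetch anything.

```python
import gzip


def parse_pom_path(line):
    """Turn a repo-relative POM path into coordinates, or None if it does not fit."""
    path = line.strip()
    if path.startswith("./"):
        path = path[2:]
    parts = path.split("/")
    # Maven layout: group/dirs/.../artifact/version/artifact-version.pom
    if len(parts) < 4 or not parts[-1].endswith(".pom"):
        return None
    return {
        "group": ".".join(parts[:-3]),
        "artifact": parts[-3],
        "version": parts[-2],
    }


def read_listing(gz_bytes):
    """Decompress a gzipped listing and parse every line that looks like a POM path."""
    text = gzip.decompress(gz_bytes).decode("utf-8")
    coords = (parse_pom_path(line) for line in text.splitlines())
    return [c for c in coords if c is not None]


def estimate_total_mb(pom_count, avg_kb=5):
    """Back-of-the-envelope size in MB (decimal), given an average POM size in KB."""
    return pom_count * avg_kb / 1000


# Hypothetical two-line listing, compressed in memory for the demo.
sample = gzip.compress(
    b"./metosin/eines/0.0.9/eines-0.0.9.pom\n"
    b"./org/clojars/someone/lib/1.0/lib-1.0.pom\n"
)
print(read_listing(sample))
print(estimate_total_mb(153_000), "MB")
```

With the figures from the chat, 153k files at about 5KB each comes to roughly 765 MB.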

juhoteperi08:03:10

Hmm, I think there is a way to load all pom files, for a development environment

xtreak2908:03:30

Yes, there is an rsync option, I think

juhoteperi08:03:50

Or maybe I have just used rsync. You could probably tell rsync to only load pom files, not jars.

xtreak2908:03:20

> If you want to use the actual repo from http://clojars.org, you can grab it via rsync.

xtreak2908:03:41

> Note that this setup task isn't perfect - SNAPSHOTS won't have version-specific metadata (which won't matter for the operation of clojars, but may matter if you try to use the resulting repo as a real repo), and versions will be listed out of order on the project pages, but it should be good enough to test with. https://github.com/clojars/clojars-web

xtreak2908:03:10

I am OK with non-snapshot data, but there is no information about the URL for rsync and so on.

juhoteperi08:03:50

Hmm? Wiki shows the rsync command

xtreak2908:03:30

But as you said I need only the poms and not the jars

juhoteperi08:03:13

--exclude '*.jar' might work, or --exclude '*' --include '*.pom'

xtreak2908:03:13

Thanks a lot. I will try that.

xtreak2908:03:38

I tried rsync -av --delete my-wonderful-copy-of-clojars --include="*/" --include="*.pom" --exclude="*" . It creates the folders. Around 42MB of empty folders but the pom files are printed on the screen and not downloaded. I cancelled in the middle. Am I missing something? I am trying to use the command against less data

juhoteperi08:03:42

maybe change the include - exclude order, so that include is done after exclude

xtreak2909:03:13

Is there a way to only do 10% of rsync or something?

xtreak2909:03:38

Gets me an empty directory

➜  ~ rsync -av --delete  my-wonderful-copy-of-clojars --exclude="*" --include="*/" --include="*.pom"
receiving incremental file list
./

sent 53 bytes  received 55 bytes  30.86 bytes/sec
total size is 0  speedup is 0.00

xtreak2909:03:21

rsync -av --delete my-wonderful-copy-of-clojars --include="*/" --include="*.pom" --exclude="*" works . It seems there are duplicate pom files that I need to clean up. Thanks a lot.
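
Why the order matters: rsync checks filter rules in the order given and the first matching rule decides, so a leading `--exclude="*"` hides everything before the includes are ever consulted. The sketch below is a deliberately simplified Python model of that behaviour, not rsync's real matcher: it only compares names with `fnmatch`, treats a trailing `/` as "directories only", and requires every parent directory to be included. The function names are hypothetical.

```python
from fnmatch import fnmatchcase


def name_included(name, is_dir, rules):
    """Apply rules in order; the first rule whose pattern matches decides."""
    for action, pattern in rules:
        if pattern.endswith("/"):
            # A trailing slash limits the pattern to directories.
            if not is_dir:
                continue
            pattern = pattern[:-1]
        if fnmatchcase(name, pattern):
            return action == "include"
    return True  # nothing matched: included by default


def is_transferred(path, rules):
    """A file is only reached if every directory above it is included too."""
    parts = path.split("/")
    for directory in parts[:-1]:
        if not name_included(directory, True, rules):
            return False
    return name_included(parts[-1], False, rules)


# Order from the command that worked.
WORKING = [("include", "*/"), ("include", "*.pom"), ("exclude", "*")]
# Order from the command that produced an empty directory.
BROKEN = [("exclude", "*"), ("include", "*/"), ("include", "*.pom")]

pom = "metosin/eines/0.0.9/eines-0.0.9.pom"
jar = "metosin/eines/0.0.9/eines-0.0.9.jar"
print(is_transferred(pom, WORKING), is_transferred(jar, WORKING))
print(is_transferred(pom, BROKEN))
```

In this model the working order keeps directories and `.pom` files and drops the jars, while the exclude-first order rejects the top-level directory straight away, which is consistent with the empty result shown earlier.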

danielcompton21:03:36

@xtreak29 there is also https://github.com/clojars/clojars-web/wiki/Data which might have what you need, if not then open an issue and we might be able to add it