#clojars
2016-09-12
onetom03:09:45

@micha i've also been thinking about these trust questions for many years. i've even participated in the http://cacert.org movement and got an assurer status:
> You have made 28 assurances which ranks you as the #1791 top assurer.
> Total Points Issued: 901

onetom03:09:45

i think the main problem is that trust is not a binary quantity and im not sure what the measure of trust should be. i understand proof of work is one proxy for this, but it's a rather complicated infrastructure to set up, maintain, etc.

onetom03:09:07

i have a gut feeling that mirroring how we gauge trust in real-world contexts every day would be a good enough solution. that could be done by:
1. visualizing a trust network
2. calculating a trust score
3. allowing hints to the calculation based on our preferences (eg. i dont really trust microsoft or verisign, but i trust micha)
4. visualizing/monitoring changes in the web of trust over time

onetom03:09:40

it all hinges on what we consider the electronic representation of identity, of course

onetom03:09:31

this was quite well summarized in this famous keynote https://www.youtube.com/watch?v=RrpajcAgR1E

onetom03:09:27

so to me it's clear that identity can be a set of interconnected identity representations, which all add to the overall trust level

onetom03:09:33

pgp signatures can prove that an avatar and an email address indeed represent a real individual

onetom03:09:18

i mean i can just choose to accept it as proof, and for example i would sign your gpg key based on the fact that i've seen you in youtube talks and we had skype video chats and conversations about specific commits which bear your github username.

onetom03:09:42

to me having such proof shown by a couple of community members who also trust each other would make your software a lot more trustworthy than being signed by some centralized CA with questionable security practices and opaque internal processes...

onetom04:09:41

but i think the main problem is the lack of good UX for all this in the 1st place, as @danielcompton points out in https://github.com/clojars/clojars-web/issues/560

onetom04:09:06

iirc the debian ecosystem has been quite successful at maintaining a gpg ecosystem for their packagers

onetom04:09:02

the 2nd problem could be the lack of protocol for representing such claims that i just made about micha for example. but let me know if you know about any good ones.

richiardiandrea04:09:11

Agree on this latter point, and specifically, PGP done right works with actual ID checks and of course is only an identity check. Then there is the "how much I trust you as a dev" part, and this can be a custom metric (# of project downloads on Clojars, maybe). This point, for me, is the part no one has solved yet, but GitHub is pretty much the implicit CA here. Everything coupled with ipfs to avoid dependency tampering (distributed, signed, highly available deps)

onetom04:09:41

im still not sure how to trust the public keys from the various keyservers though... 😕

richiardiandrea04:09:13

Keys could reside on ipfs as well?

richiardiandrea04:09:35

Do you mean in terms of trust or tampering?

onetom04:09:46

just trust

richiardiandrea04:09:28

Yeah trust as a "good dev" is a difficult concept to measure, but there can be ways imho, we have so much info on GitHub/BitBucket..of course nothing really as accurate as coding shoulder to shoulder..

onetom04:09:31

like if none of my trusted identities are transitively trusting an identity, how do i know it's not someone impersonating a specific email and natural-name combination?
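The transitive-trust question above can be illustrated with a small sketch. It assumes a web of trust is available as a plain mapping from each identity to the identities whose keys it has signed; the `trust_path` function, the `max_depth` limit, and the identities `alice`, `mallory`, and `fake-alice` are all hypothetical and not part of any existing tool.

```python
from collections import deque


def trust_path(signatures, me, target, max_depth=3):
    """Return the shortest signature chain from `me` to `target`, or None.

    `signatures` maps an identity to the set of identities it has signed.
    `max_depth` caps how many signature hops we are willing to follow.
    """
    queue = deque([[me]])
    seen = {me}
    while queue:
        path = queue.popleft()
        if path[-1] == target:
            return path
        if len(path) - 1 >= max_depth:
            continue  # do not follow chains longer than max_depth hops
        for signed in sorted(signatures.get(path[-1], ())):
            if signed not in seen:
                seen.add(signed)
                queue.append(path + [signed])
    return None


# Hypothetical web of trust: nobody I trust has signed "fake-alice".
signatures = {
    "onetom": {"micha"},
    "micha": {"alice"},
    "mallory": {"fake-alice"},
}

print(trust_path(signatures, "onetom", "alice"))
print(trust_path(signatures, "onetom", "fake-alice"))
```

An identity with no path from my own key is exactly the case described: nothing in the graph distinguishes it from an impersonator, so a tool could at least flag it.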

onetom04:09:42

well, i think trust is difficult to measure on a unified scale, BUT that's why in points 2 & 3 i proposed having parametric algorithms which everyone / every organization can customize/tune for themselves

onetom04:09:34

they can combine extra rules, like github metrics into their /user or /org trust score formula and thats it

onetom04:09:16

it helps to compare libraries against each other and also monitor over time the trust level of the various software components within the organization
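A rough sketch of what such a tunable score could look like, assuming each identity comes with a few numeric signals between 0 and 1. The signal names, the weights, and the `trust_score` function are made up for illustration; nothing like this exists in Clojars.

```python
def trust_score(identity, signals, weights, overrides=None):
    """Weighted sum of signals for `identity`, unless the user overrides it.

    `signals`   : identity -> {signal name: value in [0, 1]}
    `weights`   : signal name -> weight chosen by the user or organization
    `overrides` : identity -> fixed score (e.g. "i don't trust verisign")
    """
    overrides = overrides or {}
    if identity in overrides:
        return overrides[identity]
    facts = signals.get(identity, {})
    return sum(weight * facts.get(name, 0.0) for name, weight in weights.items())


# Hypothetical inputs: every organization would pick its own.
signals = {
    "micha": {"signed_by_trusted_peers": 1.0, "github_activity": 0.8},
    "verisign": {"central_ca": 1.0},
}
weights = {"signed_by_trusted_peers": 0.6, "github_activity": 0.3, "central_ca": 0.1}
overrides = {"verisign": 0.0}

for who in ("micha", "verisign"):
    print(who, trust_score(who, signals, weights, overrides))
```

Because the weights and overrides are plain data, two organizations can run the same formula and still disagree on the result, which sidesteps the need for a globally accepted score.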

richiardiandrea04:09:52

Yeah well, impersonation is very very complicated to ascertain in the digital world..and maybe a bit off topic here, at the end of the day we want to first of all be sure that the coder is trustworthy and she is not including strange stuff in the oss library I am using..

richiardiandrea04:09:02

Yeah I like the above

onetom04:09:32

so we don't have to tackle the problem of globally understood and accepted trust score calculation...

onetom04:09:34

yet it would be very useful and already better than what we have right now, which roughly amounts to: 1. blind trust 2. trusting central corporations because we have to pay them for witnessing information...

richiardiandrea04:09:50

Agree, also successful peer reviews could increase the "trust score"

onetom04:09:08

and what we havent talked about yet is digitally signed timestamps which should ideally accompany all signatures, so we can't cheat on time either...

onetom04:09:42

it should be easy to set up a time stamping service just for being able to fully dissolve centralization, but a few years ago when i checked i havent found any easy solution to this either...

richiardiandrea04:09:24

Uhm yeah nothing comes to mind actually, the signed timestamp is implicit when you deploy an artifact right? It would be implicit in ipfs

onetom04:09:55

@richiardiandrea which part of IPFS ensures trustable time stamping?

onetom04:09:19

(i still don't know the whole IPFS in and out, that's why im asking)

onetom04:09:47

so i imagine an interface which shows all of the players in a trust network, assigns default scores to them, but presents a list where you can override these. for example if something is timestamped with a privately operated timestamp server, i could say i don't trust it, but if i happen to trust the operator of it, then its default wouldn't be "dont trust"

richiardiandrea04:09:25

Oh ok that's a different UX, I was just thinking of an implicit stamped timestamp as part of the (immutable) metadata sent over with an artifact..I see you are talking about something else ;)

miikka18:09:41

Regarding the security discussion: here's a specification for securing package indexes like Clojars. https://theupdateframework.github.io

miikka18:09:58

And here's a summary about it in the context of Python: https://lwn.net/Articles/629426/

miikka18:09:42

Other languages have investigated it as well - at least Ruby, Haskell and Rust - but I can't tell if any of the implementations is complete

miikka18:09:36

I'm not sure how easy it would be to adapt to Maven repos, but it's worth taking a look.

flyboarder19:09:52

A few thoughts RE: Security:
1. The repo is your only real source of truth; packages and even compilers can be hacked.
2. That means that the only way to really validate code is to (a) trust a developer (and the access control on the repo) and (b) content-address the code to make sure you're getting what the trusted dev intended.
3. Ideally, at the repo level, trust is established socially via real-world interactions, which could also be related to online activity like github data (this is kinda what we already do, for better or worse).
4. In order to validate a package's content, you cannot receive the verification data and the package from the same source; this is a basic single point of failure in any architecture. It’s not that I don’t trust clojars, it’s that it makes them a single point of failure and also means they are the ones telling me what my trusted developers' packages are.
5. What's needed is a way to verify from the developer what the data should be (or from the repo directly) or any other distributed trust, and then verify that against the packages. In-transit security is “good enough” already if you are using certificates.
6. None of this takes into account the versioning or human-readable layer on top of all this. Versions would basically be sugar on top of all the verification, since they're still needed in a package manager.
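A minimal sketch of the content-addressing idea, assuming the developer publishes a SHA-256 address for each artifact through a channel other than the package index. The `content_address` and `verify` helpers and the `sha256-` prefix are hypothetical conventions, not an existing Clojars or Maven feature.

```python
import hashlib


def content_address(data: bytes) -> str:
    """Derive an address from the bytes themselves (hypothetical format)."""
    return "sha256-" + hashlib.sha256(data).hexdigest()


def verify(data: bytes, expected_address: str) -> bool:
    """Check bytes fetched from the index against an address obtained elsewhere."""
    return content_address(data) == expected_address


# The developer announces the address out of band (e.g. in the source repo).
released = b"bytes of the jar the developer built"
announced = content_address(released)

# Later, the same coordinates are downloaded from the package index.
print(verify(released, announced))
print(verify(b"bytes swapped by a compromised index", announced))
```

The check is only as strong as the separation between the two channels: if the address is served by the same host as the artifact, a single compromise can change both.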

xcthulhu19:09:38

> What's needed is a way to verify from the developer what the data should be (or from the repo directly) or any other distributed trust, and then verify that against the packages, in transit security is “good enough” already if you are using certificates

Okay, as I have pointed out elsewhere, IPFS literally dogfoods itself in its source code to get upstream dependencies and avoids trusting certs - https://github.com/ipfs/go-ipfs/blob/0e2b4eb4eb68fab5497d0e8f8472d025b5c1d431/core/corenet/net.go#L7

flyboarder19:09:05

I agree IPFS is probably the way to go

xcthulhu19:09:10

Now, registries can also be hacked, but you could write an append-only registry on a blockchain.

flyboarder19:09:42

couldnt you just use ipfs as the registry

xcthulhu19:09:52

Yeah, they have a nameservice thingy

xcthulhu19:09:59

I'm not sure how to make it append-only

flyboarder19:09:47

i think that would be more for the ipfs app to handle

xcthulhu19:09:18

IPFS isn't a consensus layer, it's a content addressable storage and delivery mechanism with a distributed hash table.

xcthulhu19:09:37

Yeah, it's a TODO

xcthulhu19:09:53

But as far as I know it's not part of the white paper or anything

flyboarder19:09:43

Yeah seems very conceptual

xcthulhu19:09:36

You can already make a registry using a blockchain

xcthulhu19:09:48

It's just a lot of work to get integration.

tcrawley20:09:10

wow, lots of activity around securing clojars while I was on vacation. That's great that folks are interested!

tcrawley20:09:31

there's lots to reply to here, but a few initial thoughts:

tcrawley20:09:13

clojars is currently a standard maven repo (from a read point of view). Moving away from that to some custom format or distribution protocol would mean providing plugins for any build tool that would access it (currently maven, boot, lein, gradle, plus scripts that wget/curl deps directly)

tcrawley20:09:23

IIUC, IPFS would provide immutability and distribution, but couldn't guarantee that version X+1 was published by the same party as version X. All we can know is the version X that we pulled yesterday is the same as the version X we pulled today (which can only be done currently by keeping the hash of X local, so when you pull it again you can verify you are getting the same thing)
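The "keep the hash of X local" approach can be sketched as a trust-on-first-use pin store. This is an illustration only: the `PinStore` class and the coordinate strings are hypothetical, and a real tool would persist the pins to a lock file rather than keep them in memory.

```python
import hashlib


class PinStore:
    """Remember the digest seen the first time each coordinate was pulled."""

    def __init__(self):
        self.pins = {}  # "group/artifact:version" -> sha256 hex digest

    def check(self, coordinate: str, data: bytes) -> bool:
        digest = hashlib.sha256(data).hexdigest()
        # The first pull pins the digest; later pulls must match it.
        pinned = self.pins.setdefault(coordinate, digest)
        return pinned == digest


store = PinStore()
print(store.check("example/lib:1.0.0", b"jar pulled yesterday"))
print(store.check("example/lib:1.0.0", b"jar pulled yesterday"))
print(store.check("example/lib:1.0.0", b"different bytes today"))
```

As noted above, this only tells you that version X did not change since you first saw it; it says nothing about whether version X+1 came from the same publisher.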

tcrawley20:09:46

please correct me if I'm wrong, my understanding of IPFS is surface-level

tcrawley20:09:48

we're working towards putting the repo behind a CDN, and have been working towards that for 9 months. That's a simple change when compared to moving to IPFS, so I can only imagine that transition would take years to implement

tcrawley20:09:50

we actually don't want true immutability, given that there are legitimate reasons to remove artifacts (which are all handled by the admins, never by users directly)

tcrawley20:09:03

we use gpg because maven does, but gpg is nothing without a WoT and a way to verify that (which has been touched on several times)

tcrawley20:09:32

I think the best approach is to enumerate the real problems here as gh issues (https://github.com/clojars/clojars-web/issues), without discussing technologies initially - we need a clear statement of what the issues actually are

tcrawley20:09:47

and, tbh, if users have to understand IPFS and/or blockchains to build, then we're here: http://howfuckedismydatabase.com/nosql/fault-tolerance.png

micha20:09:55

@tcrawley are there really cases where you must delete an artifact after it's been published? like i see reasons why it's desired, but i'm not aware of any real need?

micha20:09:33

like if i accidentally push an artifact, like boot with version 9999.0.0 by accident

micha20:09:47

it's not awesome, but it's also just a number

micha20:09:11

i could push the next artifact with version 10000.0.0

tcrawley20:09:52

@micha: the "must" case is when a user accidentally pushes private code or credentials in a jar

tcrawley20:09:12

those generally don't even show up as gh issues, they mail [email protected]

tcrawley20:09:04

I don't delete in the case of "version X is broken and I want to replace it"

micha21:09:49

ah, private code is an interesting case

micha21:09:35

credentials i don't really think is a problem that needs deletion because you must assume they've been compromised anyway. if you don't you're negligent imo

micha21:09:38

but private code though, that's interesting, or i guess code you aren't authorized to upload to clojars in the first place

tcrawley21:09:36

yeah, good point re: credentials. I don't think we've ever had that case, it's always been code you don't have rights to release

micha21:09:12

i wonder if a cooling-off period would be sufficient?

micha21:09:30

like a level 1 cache that isn't in the immutable store for some amount of time

micha21:09:46

only visible to the user who uploaded the artifact or something?

micha21:09:29

perhaps like the "promote" button we currently have

micha21:09:17

i guess the legal situation maybe makes immutability a no-go fundamentally

juhoteperi21:09:03

(There has not been a promote button in nearly a year)

micha21:09:58

so it wasn't immutable

tcrawley21:09:14

what wasn't immutable?

micha21:09:26

sorry bad joke

tcrawley21:09:53

the promote button was part of the trusted repo feature that was never finished and never used

tcrawley21:09:02

it just led to confusion

tcrawley21:09:45

that's how http://oss.sonatype.org works - you deploy to a staging repo, then manually release once you are happy with it

tcrawley21:09:06

but maven users hated that step so much that sonatype released a plugin that auto-releases for you

micha21:09:13

clojars can still be directed by a court to remove something or sued if you can't do it

tcrawley21:09:16

doing away with some of the value of the staging repo

micha21:09:41

so it seems that immutability is a pipe dream for a real repo

tcrawley21:09:52

yes, and at this point, that would mean I would get sued directly. which is fun to think about

tcrawley21:09:06

THANKS FOR BRINGING IT UP

micha21:09:12

oh man sorry

micha21:09:33

i'll send you a file in a cake

tcrawley21:09:19

a jar file?

micha21:09:49

most disappointing mail call ever

micha21:09:11

jars got me in here, jars will get me out

micha21:09:16

things like poms can be stored immutably though

micha21:09:29

or metadata

micha21:09:50

it's only the actual artifacts that are a liability, maybe

tcrawley21:09:52

what does that buy us?

micha21:09:35

the ability to verify that an artifact at some URL has the right contents

micha21:09:10

and also the history of contents that have ever been at that URL

micha21:09:26

so you'd at least be able to know if an artifact has been modified

micha21:09:07

like if there was a repo-wide metadata file in git, sort of

micha21:09:19

when something is changed or added you make a commit

micha21:09:55

tools could then refuse to use artifacts that aren't in the metadata file or have the wrong hash or whatever

micha21:09:27

you could still delete the artifact from storage if you need to

micha21:09:34

without deleting the metadata from git, you just make a revert instead, so you preserve the ability to audit
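A rough sketch of the metadata idea from the last few messages, using an in-memory append-only list in place of a git repository. The `MetadataLog` class and its method names are hypothetical; the point is only that revoking an artifact adds a new entry instead of erasing an old one.

```python
import hashlib


def sha256(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


class MetadataLog:
    """Append-only record of what should live at each artifact path."""

    def __init__(self):
        self.commits = []  # (action, path, digest); entries are never removed

    def publish(self, path, data):
        self.commits.append(("add", path, sha256(data)))

    def revoke(self, path):
        # Like a git revert: the earlier "add" stays visible in the history.
        self.commits.append(("revoke", path, None))

    def history(self, path):
        return [(action, digest) for action, p, digest in self.commits if p == path]

    def accepts(self, path, data):
        """Tools refuse artifacts that are unknown, revoked, or have the wrong hash."""
        current = None
        for action, p, digest in self.commits:
            if p == path:
                current = digest if action == "add" else None
        return current is not None and current == sha256(data)


log = MetadataLog()
log.publish("example/lib/1.0.0/lib-1.0.0.jar", b"original jar")
print(log.accepts("example/lib/1.0.0/lib-1.0.0.jar", b"original jar"))
print(log.accepts("example/lib/1.0.0/lib-1.0.0.jar", b"modified jar"))
log.revoke("example/lib/1.0.0/lib-1.0.0.jar")
print(log.accepts("example/lib/1.0.0/lib-1.0.0.jar", b"original jar"))
print(len(log.history("example/lib/1.0.0/lib-1.0.0.jar")))
```

The artifact bytes can be deleted from storage after a revoke while the log still shows that something with a given hash was once published at that path.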