WIP: nomenclature #5

Open
wants to merge 1 commit into
base: master
from

Conversation

joehand
Contributor

@joehand joehand commented Feb 6, 2018

This DEP is still work-in-progress.

I've added the summary/motivation and Bryan's questions. I need to put a bit of thought into how best to organize this, in case we end up with a lot of terms in here.

We also have existing terms in the Dat documentation: https://docs.datproject.org/terms. I can update/consolidate those to this DEP.

I will try to collect some good examples of other nomenclature/naming convention docs as motivation, if you have suggestions.

@pfrazee
Contributor

pfrazee commented Feb 6, 2018

Good call. I share the question about registers.

@joehand
Contributor Author

joehand commented Feb 6, 2018

On that topic, I started to try to add some decision making criteria when deciding between a few possible words (realizing now I should make a section for this):

By defining Dat nomenclature, we can ensure the writing of the wider Dat community also uses the preferred terms. ... To reduce barriers to entry, this DEP will prefer words that are less technical while conveying the same meaning.

So, though register is more technically accurate, it seems like feed or log may be preferred.

@joehand
Contributor Author

joehand commented Mar 21, 2018

Note to self from previous meeting:

discuss syncing/seeding/etc in nomenclature DEP - jhand will formalize terms that need definition

@martinheidegger
Contributor

Things I find missing from the spec:

  • Version: What is considered a version in DAT?
  • Bootstrapping: What do we mean when bootstrapping the network?
  • Sparse: DATs can be "sparsely" replicated?!
  • Checkout: What is considered a checkout?
  • Live: There are properties in code that refer to something being "live"

... and a link to the terminology used in the dat protocol book: https://github.com/datprotocol/book/blob/a3ca149853b9153c7140876d6f749ecad5c6edbb/src/ch03-01-terminology.md

@pfrazee
Contributor

pfrazee commented Nov 15, 2018

I'll offer some definitions here...

  • Version: Internally every dat data-structure is composed of append-only logs (hypercores). Any time an entry is appended to the log, a new version is created. The version is identified according to the semantics of the data-structure. In the case of single-writer hyperdrive, it's currently being identified by the metadata log's latest message number.
  • Bootstrapping: This is probably referring to getting connected to the discovery DHT network.
  • Sparse: Means that the data-set is only partially downloaded/replicated.
  • Checkout: Viewing a previous version of a dat.
  • Live: This one is a little vague but usually means "connected to peers and downloading updates as they come."
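As a rough illustration of the Version and Checkout definitions above, here is a toy append-only log. `AppendOnlyLog` and its methods are hypothetical names, not the hypercore API, and treating the version as the count of appended entries is an assumption made for the sketch.

```python
class AppendOnlyLog:
    """Toy stand-in for a hypercore-style log (hypothetical, not the real API)."""

    def __init__(self):
        self._entries = []

    @property
    def version(self):
        # Assumption: the version is the number of entries appended so far.
        return len(self._entries)

    def append(self, entry):
        # Appending an entry creates a new version; return it.
        self._entries.append(entry)
        return self.version

    def checkout(self, version):
        # A read-only view of the log as it looked at the given version.
        if not 0 <= version <= self.version:
            raise ValueError("unknown version: %r" % (version,))
        return tuple(self._entries[:version])
```

Calling `checkout` with the current version returns everything appended so far, and a smaller number returns the log as it was at that earlier point.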

@bnewbold
Contributor

I would define a Checkout as a folder containing files from a dat/hyperdrive feed at a specific version (which could be the most recent version; doesn't need to be "previous"). This is distinct from having the same content stored locally in SLEEP files. The terminology comes from git and git checkout.

To clarify Version, it's the integer message number of a hypercore feed. These days, with multi-writer, the term gets a bit more ambiguous because there are multiple feeds, so the version of a hyperdb overall can be an array of (feed, integer) pairs. UX/nomenclature around this will probably need an update for dat-on-multiwriter-hyperdb.

Sparse usually implies that not only is the dataset/feed only partially replicated, but that it's intentionally only partially replicated: the user only wanted, e.g., a sub-directory, or only specific versions replicated. I don't think there is clarity/terminology around the case of having "the entire most recent version of a hyperdrive/hyperdb" (e.g., full values for all keys/files at the most recent version) but not full history: is that considered Sparse? In conversation I've usually heard people refer to this as the default condition (just having the most recent version), and having the full history of the feed being a special "Full History" or "Archival" copy.

I agree with pfrazee on Bootstrapping and Live.
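To make the multi-writer version and partial-replication points above concrete, here is a small sketch. The function names and the feed names in the usage note are made up for illustration, and the code only measures which entries are held locally; it cannot tell whether the gap is intentional, which is what "Sparse" implies in the comment above.

```python
from typing import Dict, List, Set, Tuple


def overall_version(feed_lengths: Dict[str, int]) -> List[Tuple[str, int]]:
    """Express a multi-feed version as sorted (feed, integer) pairs."""
    return sorted(feed_lengths.items())


def missing_entries(length: int, downloaded: Set[int]) -> List[int]:
    """List the entry numbers in 0..length-1 that are not held locally."""
    return [i for i in range(length) if i not in downloaded]
```

For example, `overall_version({"b": 3, "a": 7})` gives one pair per feed, and `missing_entries(4, {0, 3})` reports the entries a partial copy has not replicated.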

@aral

aral commented Dec 14, 2018

Suggestion regarding key naming, to strengthen the intent and usage of the keys and remove ambiguity about what abilities various keys grant:

  • Public key → Read key
  • Secret key → Write key
  • Discovery key → Discovery key (unchanged)

This way, people new to the system will not be misled into thinking, for example, that the public key is public (where would they get such an idea?) ;P

And the keys do exactly what they say on the tin.

Example usage:

The Read Key grants read access to a DAT whereas the Write Key is required to write to a DAT. The Discovery Key is used to discover a DAT and it is derived as a hash of the Read Key. The Read Key and Write Key should both be kept secret.

Thoughts?
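A minimal sketch of "the Discovery Key is derived as a hash of the Read Key". The thread does not state the exact construction; the keyed BLAKE2b hash over the constant `b"hypercore"` used here reflects my understanding of hypercore and should be treated as an assumption, as should the function name `discovery_key`.

```python
import hashlib


def discovery_key(read_key: bytes) -> bytes:
    """Derive a 32-byte discovery key from a read (public) key.

    Assumption: keyed BLAKE2b over the constant b"hypercore", with the
    read key used as the hash key. Check the protocol spec before relying on it.
    """
    return hashlib.blake2b(b"hypercore", key=read_key, digest_size=32).digest()
```

Because the derivation is one-way, a peer who only sees the Discovery Key can find the DAT on the network but cannot recover the Read Key from it.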

@yoshuawuyts
Contributor

yoshuawuyts commented Dec 18, 2018

@aral I think that's a pretty reasonable suggestion that could remove some ambiguity. I also always confuse "secret key" with "private key", which this would also help solve.

@martinheidegger
Contributor

@aral I am considering working on an "encrypted DAT": DATs that are additionally encrypted with yet another key, in order to implement proxies/bridges that don't know about the content of a DAT. Do you have any idea what this key should be called? ;)

@aral

aral commented Jan 18, 2019

If you mean encrypting the contents of hypercores, I’d say “encryption key” does what it says on the tin.

@martinheidegger
Contributor

Thanks, naming is hard :-)

@martinheidegger
Contributor

What do "sync" in dat sync and "share" in dat share entail? (Question that came up in chat.) Also: what do we call a peer that has a write key vs. a peer that doesn't?

@aral

aral commented Feb 28, 2019

Here’s the latest glossary for Hypha, in case it helps – please feel free to use the definitions that apply: https://ar.al/2019/02/18/hypha-glossary/

Regarding the last question in your latest comment, @martinheidegger, authorised vs unauthorised is what I’m using.

@martinheidegger
Contributor

"authorised" makes sense in a multiwriter context, but in a single-writer context (that will exist in future) it feels weird, as there is no way to ever authorise another peer.

@RangerMauve
Contributor

Pinning Service: A server you can send your dat:// URL to in order for it to replicate your content and stay online. Pinning services are useful for making sure content is available in the network. Example: Hashbase

@martinheidegger
Contributor

martinheidegger commented Mar 19, 2019

@RangerMauve "pinning service" was also known as "publishing": https://github.com/datproject/dat/blob/master/src/commands/publish.js

  • registry: A server that can replicate a DAT - usually with a login
  • publish: Telling a registry to replicate a DAT

@martinheidegger
Contributor

martinheidegger commented Mar 19, 2019

@RangerMauve Do you think registry/publish is worse than "pinning service"/pinning? (is it the same thing?)

@RangerMauve
Contributor

I haven't seen registry/publish used outside of datbase, and I haven't seen datbase used much.

I like pinning because it's more descriptive of what's actually happening. A registry/publishing implies some sort of centralization or control. Whereas pinning has more of a "Hey, I'm keeping this around for you" feeling where you're still in control and it's no big deal who's pinning it.

I also like the term "Seeding" since it relates to the BitTorrent world.

@joehand
Contributor Author

joehand commented Mar 19, 2019

Some of the pinning stuff may be addressed in https://www.datprotocol.com/deps/0003-http-pinning-service-api/


7 participants