225 changes: 172 additions & 53 deletions content/user-guide.md
@@ -3,54 +3,79 @@ title: User Guide
layout: sidenav
---

## Generating a CodeMeta file
## What are CodeMeta files?

You can use the [codemeta-generator](https://codemeta.github.io/codemeta-generator/) directly at <https://codemeta.github.io/codemeta-generator/>
CodeMeta files, also called "CodeMeta instance files", are the `codemeta.json`
documents that are placed in the root of a software project's code repository.
They define various aspects of the project in a JSON variant called JSON-LD,
which uses linking attributes to connect the data in this file with data from
other available sources.

## The CodeMeta Generator

The CodeMeta Generator is a tool for taking user input and either generating a
valid `codemeta.json` file, or testing an existing file to make sure that it
is valid.

### Generating a CodeMeta instance file

CodeMeta files can be generated using the
[CodeMeta Generator](https://codemeta.github.io/codemeta-generator/).
Instructions for [using the CodeMeta Generator](create) are available.

A _*beta*_ version of an automatic generator is also linked on that page.

### Testing a CodeMeta instance file

Your CodeMeta files can be validated using the
[CodeMeta Generator](https://codemeta.github.io/codemeta-generator/). Paste
the contents of a `codemeta.json` file into the bottom box, and click the
`Validate codemeta.json` button.

## Creating a CodeMeta instance file manually

A CodeMeta instance file describes the metadata associated with a software object using JSON's linked data (JSON-LD) notation. A CodeMeta file can contain any of the properties described on the [CodeMeta terms page](/terms/). Most CodeMeta files are called `codemeta.json` by convention.
A CodeMeta instance file describes the metadata associated with a software
object using JSON for Linked Data (JSON-LD) notation. A CodeMeta file can
contain any of the properties described on the [CodeMeta terms page](terms).

Here is an example of a basic `codemeta.json` that you can put at the root of a GitHub repo
([link to full example](https://github.com/gem-pasteur/macsyfinder/blob/master/codemeta.json)):
Any plaintext or code editor is sufficient for creating a CodeMeta instance
file. An editor that has syntax highlighting for `JSON` can assist by
making errors in the syntax stand out.

```json
{
"@context": "https://w3id.org/codemeta/3.1",
"type": "SoftwareSourceCode",
"applicationCategory": "Biology",
"codeRepository": "https://github.com/gem-pasteur/macsyfinder",
"description": "MacSyFinder is a program to model and detect macromolecular systems, genetic pathways… in prokaryotes protein datasets.",
"downloadUrl": "https://pypi.org/project/MacSyFinder/",
"license": "https://spdx.org/licenses/GPL-3.0+",
"name": "macsyfinder",
"version": "2.1.4",
"continuousIntegration": "https://github.com/gem-pasteur/macsyfinder/actions",
"developmentStatus": "active",
"issueTracker": "https://github.com/gem-pasteur/macsyfinder/issues",
"referencePublication": "https://doi.org/10.24072/pcjournal.250"
}
```
Most CodeMeta files are called `codemeta.json` by convention. While other
filenames are valid, they will be less recognisable and may be overlooked.
{.tip}

### Basics
### Understanding JSON and JSON-LD

When creating a CodeMeta document, note that they contain JSON name ("property" in linked-data), value pairs where the values can be simple values, arrays or JSON objects. A simple value is a number, string, or one the literal values *false*, *null* *true*, for example:
CodeMeta files contain JSON *key-value pairs*, sometimes referred to as
*name-value pairs*, where the values can be *simple values*, *arrays*, or *JSON
objects*. Keys are also known as *properties* in linked data.

#### Simple Values

A simple value is a number, a string, or one of the literal values *false*,
*null*, or *true*. For example:

```json
"name" : "R Interface to the DataONE REST API"
```

There must be a comma between of these key-value pairs, and no comma at the end before the closing bracket (`}`).
Key-value pairs must be separated by a comma. There must be no comma at the
end before the closing brace (`}`).
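
The comma rule can be checked mechanically. The following is a minimal sketch
using Python's standard `json` module; the `is_valid_json` helper and the
sample strings are illustrative assumptions, not part of CodeMeta itself:

```python
import json

# A well-formed object: pairs separated by commas, none before the closing brace.
# The "developmentStatus" value is only an illustrative second pair.
valid = '{"name": "R Interface to the DataONE REST API", "developmentStatus": "active"}'

# The same object with a trailing comma, which JSON does not allow.
trailing_comma = '{"name": "R Interface to the DataONE REST API", "developmentStatus": "active",}'

# The same object with the separating comma missing.
missing_comma = '{"name": "R Interface to the DataONE REST API" "developmentStatus": "active"}'


def is_valid_json(text: str) -> bool:
    """Return True if text parses as JSON, False otherwise (hypothetical helper)."""
    try:
        json.loads(text)
    except json.JSONDecodeError:
        return False
    return True


print(is_valid_json(valid))           # True
print(is_valid_json(trailing_comma))  # False
print(is_valid_json(missing_comma))   # False
```

A check like this only confirms that the text is syntactically valid JSON. It
does not confirm that the file is a valid CodeMeta document.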

### Arrays
#### Arrays

A JSON array is surrounded by the characters `[` and `]`, and can contain multiple values separated by commas:
A JSON array is surrounded by square brackets: `[` and `]`. Arrays can contain
one or more values separated by commas:

```json
"keywords": [ "data sharing", "data repository", "DataONE" ]
```

As with any JSON documents, you can add line breaks between values for improved quality. For example, the former key-value pair is this is equivalent to:
Arrays should be written with line breaks between values and indentation
(spaces at the start of a line). These make the data easier for humans to
read. The above example is equivalent to:

```json
"keywords": [
@@ -60,7 +85,9 @@ As with any JSON documents, you can add line breaks between values for improved
]
```

All fields that accept a value of a given type accept an array of values of this type, and vice-versa. For example, a software with two licenses could have this attribute:
Fields that accept a value of a given type will accept an array of values of
that type. For example, software with two licenses could have this
attribute:

```json
"license": [
@@ -69,9 +96,11 @@ All fields that accept a value of a given type accept an array of values of this
]
```
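
Because a property may hold either a single value or an array, code that reads
a CodeMeta file often normalises both forms to a list first. The following is
a minimal sketch of that idea; the `as_list` helper is hypothetical, and the
two SPDX licence URLs are illustrative values rather than the ones used in the
example above:

```python
import json


def as_list(value):
    """Wrap a single value in a list; return lists unchanged (hypothetical helper)."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


# One licence given as a simple value (illustrative).
single = json.loads('{"license": "https://spdx.org/licenses/MIT"}')

# Two licences given as an array (illustrative).
multiple = json.loads(
    '{"license": ["https://spdx.org/licenses/MIT",'
    ' "https://spdx.org/licenses/Apache-2.0"]}'
)

print(as_list(single["license"]))         # ['https://spdx.org/licenses/MIT']
print(len(as_list(multiple["license"])))  # 2
```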

### Objects
#### Objects

Some properties, such as `author`, can refer to other JSON objects surrounded by curly braces and can contain other JSON values or objects, for example:
Some properties, such as `author`, can refer to other JSON objects. Objects
are surrounded by braces: `{` and `}`. These can contain other JSON values or
objects. For example:

```json
"author": {
@@ -83,18 +112,43 @@ Some properties, such as `author`, can refer to other JSON objects surrounded by
}
```

The JSON-LD "@type" keyword associates a JSON value or object with a well known type, for example, the
statement `"@type":"Person"` associates the `author` object with `http://schema.org/Person`.
It is good practice to always provide the `@type` for any property which specifies a node (JSON object).
The [terms page](/terms/) indicates these node types.
#### Keywords

JSON-LD has the concept of keywords, which are properties prefixed with `@`.
Keywords give instructions to the processor instead of describing relations
between entities.

These instructions include:

* defining shorthands (`@context`, `@vocab`),
* changing value semantics (`@list` and `@set`, `@value`, `@language`, ...),
* intrinsically describing objects (`@id` and `@type`).

The diagram below visualises how `@context` instructs the processor to *embed*
the externally stored CodeMeta definition in order to *expand* the
`codemeta.json` document:

![Diagram of a JSON-LD reference pulling data in from an external data source](/img/jsonld-references-diagram.svg)

The "author" JSON object illustrates the use of the JSON-LD keyword "@id", which is used to associate an IRI with the JSON object. Any such node object can be assigned an `@id`, and we may use the `@id` to refer to this same object (the person, Peter), elsewhere in the document; e.g. we can indicate the same individual is also the `maintainer` by adding:
The JSON-LD `@type` keyword associates a JSON value or object with a
well-known type. In the previous example, the statement `"@type":"Person"`
associates the `author` object with `http://schema.org/Person`. Provide the
`@type` for any property that specifies a node (JSON object). The
[terms page](/terms/) indicates these node types.

The `author` JSON object illustrates the use of the JSON-LD keyword `@id`,
which is used to associate an IRI with the JSON object. Any such node object
can be assigned an `@id`, and later uses of that `@id` *value* elsewhere in
the document refer to the same object (the person, Peter). For example, we
can indicate the same individual is also the `maintainer` by adding:

```json
"maintainer": "http://orcid.org/0000-0003-0077-4738"
```

This should be added at the top level of the document, indicating that this individual is the `maintainer` of the software being described, like this:
This should be added at the top level of the document, indicating that this
individual is the `maintainer` of the software being described, like this:

```json
{
@@ -113,7 +167,11 @@ This should be added at the top level of the document, indicating that this indi
}
```

JSON-LD operations can later *expand* this reference and *embed* the full information at both locations. This means the example above is equivalent to:
JSON-LD operations can later *expand* this reference and *embed* the full
information at both locations.

This means the previous example is equivalent to:


```json
{
@@ -138,9 +196,10 @@ JSON-LD operations can later *expand* this reference and *embed* the full inform
}
```
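
The idea of replacing an `@id` reference with the full object can be sketched
in a few lines of Python. This is a deliberately simplified illustration, not
a JSON-LD processor: a real processor uses the context to decide which string
values are references, whereas the hypothetical `embed_references` helper
below simply assumes that the listed top-level properties hold them. The
software `name` and the `givenName` property are illustrative assumptions.

```python
import copy


def index_nodes(node, index):
    """Record every object that has an "@id" plus at least one other key."""
    if isinstance(node, dict):
        if "@id" in node and len(node) > 1:
            index[node["@id"]] = node
        for value in node.values():
            index_nodes(value, index)
    elif isinstance(node, list):
        for item in node:
            index_nodes(item, index)


def embed_references(doc, properties=("author", "maintainer")):
    """Return a copy of doc with string references replaced by full objects.

    Only the named top-level properties are inspected (a simplification).
    """
    index = {}
    index_nodes(doc, index)
    result = copy.deepcopy(doc)
    for prop in properties:
        value = result.get(prop)
        if isinstance(value, str) and value in index:
            result[prop] = copy.deepcopy(index[value])
    return result


doc = {
    "@context": "https://w3id.org/codemeta/3.1",
    "@type": "SoftwareSourceCode",
    "name": "Example software",  # illustrative name
    "author": {
        "@id": "http://orcid.org/0000-0003-0077-4738",
        "@type": "Person",
        "givenName": "Peter",
    },
    "maintainer": "http://orcid.org/0000-0003-0077-4738",
}

expanded = embed_references(doc)
print(expanded["maintainer"]["givenName"])  # Peter
```

The original `doc` is left unchanged, so the compact form with the plain
reference remains available for writing back to `codemeta.json`.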

### Nesting objects
#### Nesting objects

We saw before a simple (root) SoftwareSourceCode object:
The following SoftwareSourceCode object is an example of a simple root
object:

```json
{
@@ -150,7 +209,8 @@ We saw before a simple (root) SoftwareSourceCode object:
}
```

and this root object can refer to other objects, for example recommend a SoftwareApplication:
A root object can refer to other objects. For example, it may recommend a
SoftwareApplication:

```json
{
@@ -165,7 +225,8 @@ and this root object can refer to other objects, for example recommend a Softwar
}
```

And you may in turn want to add attributes to this application:
Nesting can go many layers deep. In this example, to add attributes to this
application:

```json
{
@@ -185,9 +246,22 @@ And you may in turn want to add attributes to this application:
}
```

It is important to mind the order of curly brackets (an object begins with a `{` and ends with a matching `}`) and indentation (spaces at the beginning of a line) to reflect the hierarchy: "Central R Archive Network (CRAN)" is the name of the provider of "rmarkdown", which is a softwareSuggestions of CodemetaR.
Indentation and matching braces are important. These reflect the hierarchy of
the document.

Each object begins with a `{` and ends with a matching `}`. Each object should
also have a depth of indentation (the spaces at the beginning of a line) that
reflects its place in the hierarchy.

For example, the above code is not equivalent to:
The above example defines "Central R Archive Network (CRAN)" as the name of
the provider of "rmarkdown", which is a `softwareSuggestions` entry of
CodemetaR.

Putting key-value or property-value pairs in a different place in the document
hierarchy can change the meaning of the document.

The code below has the `"url"` pair at a different level of the hierarchy. As
a result, it no longer belongs with the `"provider"` information, and the
meaning of the document has changed. It is *_not_* equivalent to the code
above.

```json
{
@@ -207,23 +281,68 @@ For example, the above code is not equivalent to:
}
```

because in the latter, `"https://cran.r-project.org"` is the `"url"` of `rmarkdown`, instead of being the url of `Central R Archive Network (CRAN)`.
The change in hierarchy means that `"https://cran.r-project.org"` is
represented as the `"url"` of `rmarkdown`, instead of being the url of
`Central R Archive Network (CRAN)`.
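
The effect of moving a pair to a different level can also be shown
programmatically. The sketch below rebuilds the two hierarchies as Python
dictionaries and reports which object directly holds the `url`. The
dictionaries are a reconstruction from the names used in this section, the
`Organization` type for the provider is an assumption, and `owner_name` is a
hypothetical helper:

```python
def owner_name(node, key):
    """Return the "name" of the first object that directly contains key."""
    if isinstance(node, dict):
        if key in node:
            return node.get("name")
        for value in node.values():
            found = owner_name(value, key)
            if found is not None:
                return found
    elif isinstance(node, list):
        for item in node:
            found = owner_name(item, key)
            if found is not None:
                return found
    return None


# "url" nested inside the provider object.
url_in_provider = {
    "@type": "SoftwareSourceCode",
    "name": "CodemetaR",
    "softwareSuggestions": {
        "@type": "SoftwareApplication",
        "name": "rmarkdown",
        "provider": {
            "@type": "Organization",  # assumed type
            "name": "Central R Archive Network (CRAN)",
            "url": "https://cran.r-project.org",
        },
    },
}

# "url" moved one level up, next to the provider instead of inside it.
url_in_application = {
    "@type": "SoftwareSourceCode",
    "name": "CodemetaR",
    "softwareSuggestions": {
        "@type": "SoftwareApplication",
        "name": "rmarkdown",
        "provider": {
            "@type": "Organization",  # assumed type
            "name": "Central R Archive Network (CRAN)",
        },
        "url": "https://cran.r-project.org",
    },
}

print(owner_name(url_in_provider, "url"))     # Central R Archive Network (CRAN)
print(owner_name(url_in_application, "url"))  # rmarkdown
```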

### Example of a CodeMeta file

The following is an example of a basic `codemeta.json` that can be put at the
root of a code repository:

```json
{
"@context": "https://w3id.org/codemeta/3.1",
"type": "SoftwareSourceCode",
"applicationCategory": "Biology",
"codeRepository": "https://github.com/gem-pasteur/macsyfinder",
"description": "MacSyFinder is a program to model and detect macromolecular systems, genetic pathways… in prokaryotes protein datasets.",
"downloadUrl": "https://pypi.org/project/MacSyFinder/",
"license": "https://spdx.org/licenses/GPL-3.0+",
"name": "macsyfinder",
"version": "2.1.4",
"continuousIntegration": "https://github.com/gem-pasteur/macsyfinder/actions",
"developmentStatus": "active",
"issueTracker": "https://github.com/gem-pasteur/macsyfinder/issues",
"referencePublication": "https://doi.org/10.24072/pcjournal.250"
}
```

([Link to full example](https://github.com/gem-pasteur/macsyfinder/blob/master/codemeta.json)).

## The context

Every CodeMeta document must refer to the context file *codemeta.jsonld*, for example via a URL. This indicates that all terms in the document should be interpreted in the "context" of CodeMeta. Most terms are chosen to match the equivalent terms in <http://schema.org>, but CodeMeta provides a few additional terms not found in <http://schema.org> which may be helpful for software projects. CodeMeta also restricts the context to use only those <https://schema.org> terms that are explicitly listed on the [terms](/terms/) page. Users wanting to include additional terms must extend the context (see [developer-guide](/developer-guide/)).
Every CodeMeta document must refer to the context file *codemeta.jsonld*, for
example via a URL. This indicates that all terms in the document should be
interpreted in the "context" of CodeMeta.

Most terms are chosen to match the equivalent terms in <http://schema.org>,
but CodeMeta provides a few additional terms not found in <http://schema.org>
which may be helpful for software projects.

CodeMeta also restricts the context to use only those <https://schema.org>
terms that are explicitly listed on the [terms](/terms/) page. Users wanting
to include additional terms must extend the context (see
[the developer guide](/developer-guide/)).

The context file may be modified and updated in the future, if new JSON
properties are added or existing ones modified.

The context file may be modified and updated in the future, if new JSON properties are added or existing ones modified.
The CodeMeta GitHub repository defines tags to allow specific versions of a file to be referenced, and assigns
*digital object identifiers*, or DOIs, to each release tag. Please use the [appropriate release](https://github.com/codemeta/codemeta/releases) of the CodeMeta schema in order to refer to the
appropriate context file, e.g.
The CodeMeta GitHub repository defines tags to allow specific versions of a
file to be referenced, and assigns *digital object identifiers*, or DOIs, to
each release tag. Please use the
[appropriate release](https://github.com/codemeta/codemeta/releases) of the
CodeMeta schema in order to refer to the appropriate context file, e.g.

```json
"@context": "https://w3id.org/codemeta/3.1"
```
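
A quick check that a document refers to a CodeMeta context can be sketched as
follows. The `codemeta_contexts` helper is hypothetical, and treating any URL
that starts with `https://w3id.org/codemeta/` as a CodeMeta context is an
assumption made for this illustration; contexts published under other URLs
would not be recognised by it.

```python
# Assumed prefix for versioned CodeMeta context URLs.
CODEMETA_PREFIX = "https://w3id.org/codemeta/"


def codemeta_contexts(doc):
    """Return the CodeMeta context URLs listed in the document's "@context".

    "@context" may be a single value or a list of values; entries that are
    not strings (for example inline context objects) are ignored here.
    """
    context = doc.get("@context")
    entries = context if isinstance(context, list) else [context]
    return [
        entry
        for entry in entries
        if isinstance(entry, str) and entry.startswith(CODEMETA_PREFIX)
    ]


with_context = {
    "@context": "https://w3id.org/codemeta/3.1",
    "type": "SoftwareSourceCode",
}
without_context = {"type": "SoftwareSourceCode"}

print(codemeta_contexts(with_context))     # ['https://w3id.org/codemeta/3.1']
print(codemeta_contexts(without_context))  # []
```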

Release candidate versions may be referred to consistently using their [git tag](https://github.com/codemeta/codemeta/tags) for the raw version, e.g. <https://raw.githubusercontent.com/codemeta/codemeta/2.0-rc/codemeta.jsonld>. *Please do not refer to the raw GitHub URL for the master branch*, as this is subject to change and will not guarantee a stable metadata file.
## Referencing CodeMeta

## Testing An Instance file
Release candidate versions may be referred to consistently using their
[git tag](https://github.com/codemeta/codemeta/tags) for the raw version, e.g.
<https://raw.githubusercontent.com/codemeta/codemeta/2.0-rc/codemeta.jsonld>.
*Please do not refer to the raw GitHub URL for the master branch*, as this is
subject to change and will not guarantee a stable metadata file.

Our [codemeta-generator](https://codemeta.github.io/codemeta-generator/) can also check a codemeta.json file you wrote is valid. To do that, copy-paste your code in the bottom box, and click "Validate codemeta.json".