Showing 13 changed files with 246 additions and 57 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -9,24 +9,12 @@ libraries, other formats that aim to describe fragment ions, and software tools | |
- Official mzPAF homepage: [psidev.info/mzPAF](https://psidev.info/mzPAF) | ||
- mzPAF documentation: [mzpaf.readthedocs.io](https://mzpaf.readthedocs.io) | ||
|
||
|
||
## Specification status | ||
|
||
Updated: 2023-09-01 | ||
|
||
The specification has been resubmitted to the PSI Document Process and is undergoing final community review. Ratification to formally become a PSI standard is anticipated near the end of 2023. | ||
|
||
Your comments and suggestions are still very much welcome. Please submit an issue at the repo to | ||
provide your feedback and send an e-mail to the HUPO-PSI editor Sylvie Ricard-Blum | ||
([email protected]). | ||
|
||
|
||
## In short | ||
|
||
- mzPAF is a single string of characters, case sensitive, without length limit | ||
- Multiple possible explanations are separated with a comma | ||
- Deltas of observed – theoretical *m/z* values are prefixed with a slash (`/`) | ||
- Confidences can be provided for different annotations prefixed with an asterisk (`*`) | ||
- Multiple possible explanations are comma-separated | ||
- Deltas of observed – theoretical _m/z_ values are prefixed with a slash (`/`) | ||
- Confidences of annotations are prefixed with an asterisk (`*`) | ||
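
As a rough illustration of the delta bullet above, the two delta units can be computed as follows. This is a minimal sketch, not part of the mzPAF package: the function names are hypothetical, and it assumes the ppm delta is taken relative to the theoretical _m/z_.

```python
def delta_mz(observed_mz: float, theoretical_mz: float) -> float:
    """Absolute delta in m/z units: observed minus theoretical."""
    return observed_mz - theoretical_mz


def delta_ppm(observed_mz: float, theoretical_mz: float) -> float:
    """Relative delta in parts per million.

    Assumption: the delta is expressed relative to the theoretical m/z.
    """
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


# A peak observed 0.001 m/z above a theoretical m/z of 500 is roughly 2 ppm off.
print(delta_mz(500.001, 500.0))
print(delta_ppm(500.001, 500.0))
```

The value placed after the slash is then either the absolute delta or the ppm delta with a `ppm` suffix, as in `y1/-1.4ppm`.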
|
||
The basic format of each annotation is: | ||
|
||
|
@@ -55,32 +43,70 @@ b2-H2O/3.2ppm*0.75,b4-H2O^2/3.2ppm*0.25 | |
mzPAF supports: | ||
|
||
- Annotations of multiple analytes: `1@y12/0.13,2@b9-NH3/0.23` | ||
- Mass deltas in ppm instead of *m/z* unit: `y1/-1.4ppm` | ||
- Mass deltas in ppm instead of _m/z_ unit: `y1/-1.4ppm` | ||
- Confidence levels per annotation: `y1/-1.4ppm*0.75` | ||
- Advanced ion notation: `[ion type](neutral loss)(isotope)(adduct type)(charge)`, e.g.: `y4-H2O+2i[M+H+Na]^2`: | ||
- Ion types: | ||
- Peptide ion series (a, b, c, x, y, z): `y4` | ||
- Unknown ions: `?` | ||
- Immonium ions: `IY` | ||
- Internal fragment ions: `m3:6` | ||
- Intact precursor ions: `p^2` | ||
- A set of reference ions: `r[TMT127N]` | ||
- Named compounds: `_{Urocanic Acid}` | ||
- Chemical formulas: `f{C16H22O}` | ||
- SMILES: `s{CN=C=O}[M+H]` | ||
- Embedded ProForma annotations: `0@b2{LC[Carbamidomethyl]}` | ||
- Neutral gains and losses: `y2+CO-H2O` | ||
- Isotopes: `y2+2i` | ||
- Adduct types: `y2[M+H]` | ||
- Charge states: `^2` | ||
- Ion types: | ||
- Peptide ion series (a, b, c, x, y, z): `y4` | ||
- Unknown ions: `?` | ||
- Immonium ions: `IY` | ||
- Internal fragment ions: `m3:6` | ||
- Intact precursor ions: `p^2` | ||
- A set of reference ions: `r[TMT127N]` | ||
- Named compounds: `_{Urocanic Acid}` | ||
- Chemical formulas: `f{C16H22O}` | ||
- SMILES: `s{CN=C=O}[M+H]` | ||
- Embedded ProForma annotations: `0@b2{LC[Carbamidomethyl]}` | ||
- Neutral gains and losses: `y2+CO-H2O` | ||
- Isotopes: `y2+2i` | ||
- Adduct types: `y2[M+H]` | ||
- Charge states: `^2` | ||
- Multiple peaks per annotation: `&y7/-0.001` and `y7/0.000*0.95` | ||
|
||
Read the [full specification](https://mzpaf.readthedocs.io/specification) for more details and | ||
examples. | ||
Read the | ||
[full DRAFT specification](https://github.com/HUPO-PSI/mzPAF/blob/main/specification/mzPAF_specification_v1.0-draft14.docx?raw=true) | ||
for more details and examples. | ||
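
To make the comma, slash, and asterisk conventions concrete, here is a deliberately naive sketch in plain Python. It is not the official grammar or the mzPAF package API: `SimpleAnnotation` and `split_simple_mzpaf` are hypothetical names, and the code assumes simple annotations in which `,`, `/`, and `*` do not occur inside braces or brackets (which may not hold for SMILES, named compounds, or embedded ProForma).

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class SimpleAnnotation:
    ion: str                     # everything before the delta, e.g. "b2-H2O"
    delta: Optional[float]       # numeric mass delta, if present
    delta_unit: Optional[str]    # "ppm" or "m/z", if a delta is present
    confidence: Optional[float]  # value after "*", if present


def split_simple_mzpaf(text: str) -> List[SimpleAnnotation]:
    """Split a simple mzPAF string into its alternative explanations."""
    results = []
    for part in text.split(","):
        # Confidence follows the last asterisk, if there is one.
        head, star, conf_text = part.rpartition("*")
        if star:
            confidence = float(conf_text)
        else:
            head, confidence = part, None
        # The delta follows the last slash, if there is one.
        ion, slash, delta_text = head.rpartition("/")
        if not slash:
            ion, delta, unit = head, None, None
        elif delta_text.endswith("ppm"):
            delta, unit = float(delta_text[:-3]), "ppm"
        else:
            delta, unit = float(delta_text), "m/z"
        results.append(SimpleAnnotation(ion, delta, unit, confidence))
    return results


for item in split_simple_mzpaf("b2-H2O/3.2ppm*0.75,b4-H2O^2/3.2ppm*0.25"):
    print(item)
```

For anything beyond such simple cases, the reference implementations and the specification are the better guide.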
|
||
## Getting started | ||
|
||
### mzPAF in Python | ||
|
||
The [mzPAF Python package](https://mzpaf.readthedocs.io/en/latest/implementations/python/) can | ||
parse mzPAF strings into their components, convert to the JSON representation, or serialize back | ||
to an mzPAF string. | ||
|
||
```python | ||
>>> import mzpaf | ||
>>> annotations = mzpaf.parse_annotation("b2-H2O/3.2ppm*0.75,b4-H2O^2/3.2ppm*0.25") | ||
>>> print(annotations[0].to_json()) | ||
{'neutral_losses': ['-H2O'], 'isotope': 0, 'adducts': [], 'charge': 1, 'analyte_reference': None, 'mass_error': {'value': 3.2, 'unit': 'ppm'}, 'confidence': 0.75, 'molecule_description': {'series_label': 'peptide', 'series': 'b', 'position': 2, 'sequence': None}} | ||
>>> print(annotations[0].serialize()) | ||
b2-H2O/3.2ppm*0.75 | ||
``` | ||
|
||
Learn more at the | ||
[package documentation](https://mzpaf.readthedocs.io/en/latest/implementations/python/). | ||
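
Once an annotation is in its JSON representation, it can be handled as ordinary Python data. The following sketch copies the dictionary printed in the example above and builds a short summary from it. The `summarize` helper is hypothetical and only assumes the keys visible in that output for a peptide series ion; other ion types may use different keys.

```python
import json

# Dictionary copied from the example output above.
annotation = {
    "neutral_losses": ["-H2O"],
    "isotope": 0,
    "adducts": [],
    "charge": 1,
    "analyte_reference": None,
    "mass_error": {"value": 3.2, "unit": "ppm"},
    "confidence": 0.75,
    "molecule_description": {
        "series_label": "peptide",
        "series": "b",
        "position": 2,
        "sequence": None,
    },
}


def summarize(a: dict) -> str:
    """Build a one-line, human-readable summary of a peptide ion annotation."""
    mol = a["molecule_description"]
    label = f"{mol['series']}{mol['position']}" + "".join(a["neutral_losses"])
    err = a["mass_error"]
    return (
        f"{label} (charge {a['charge']}, "
        f"{err['value']} {err['unit']}, confidence {a['confidence']})"
    )


print(summarize(annotation))
# The dictionary also survives a round trip through a JSON string.
print(json.loads(json.dumps(annotation)) == annotation)
```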
|
||
### mzPAF regular expressions | ||
|
||
[todo] | ||
|
||
### mzPAF Lark grammar | ||
|
||
[todo] | ||
|
||
## Specification status | ||
|
||
Updated: 2023-09-01 | ||
|
||
The specification has been resubmitted to the PSI Document Process and is undergoing final | ||
community review. Ratification to formally become a PSI standard is anticipated near the end of 2023. | ||
|
||
Your comments and suggestions are still very much welcome. Please submit an issue at the repo to | ||
provide your feedback and send an e-mail to the HUPO-PSI editor [Sylvie Ricard-Blum](mailto:[email protected]). | ||
|
||
### Links | ||
|
||
### Available Materials | ||
- The current DRAFT specification: https://github.com/HUPO-PSI/mzPAF/blob/main/specification/mzPAF_specification_v1.0-draft14.docx?raw=true | ||
- The GitHub repo associated with mzPAF: https://github.com/HUPO-PSI/mzPAF | ||
- The GitHub repo associated with the related mzSpecLib standard: https://github.com/HUPO-PSI/mzSpecLib | ||
|
||
- HUPO-PSI homepage: https://www.psidev.info/ |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,17 @@ | ||
############ | ||
Lark grammar | ||
############ | ||
|
||
|
||
About | ||
===== | ||
|
||
[todo] | ||
|
||
|
||
Railroad diagram | ||
================ | ||
|
||
.. figure:: ../../_static/img/lark-railroad-diagram.svg | ||
:alt: Lark grammar | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,25 @@ | ||
################### | ||
Regular expressions | ||
################### | ||
|
||
mzPAF has been defined in several regular expression dialects. | ||
|
||
.. tip:: | ||
|
||
Regex101.com is a great tool to test regular expressions. Try out the mzPAF regex there: | ||
`regex101.com/r/gDPlJu/1 <https://regex101.com/r/gDPlJu/1>`_. | ||
|
||
Python | ||
====== | ||
|
||
.. literalinclude:: ../../../specification/grammars/regex_sre.py | ||
:language: python | ||
:linenos: | ||
|
||
|
||
JavaScript ECMA | ||
=============== | ||
|
||
.. literalinclude:: ../../../specification/grammars/regex_ecma.js | ||
:language: javascript | ||
:linenos: |
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -5,3 +5,4 @@ sphinx_click | |
myst-parser | ||
sphinx-autobuild | ||
jsonschema2md | ||
pandas |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,5 +1,17 @@ | ||
#################### | ||
Format specification | ||
#################### | ||
###################### | ||
Specification document | ||
###################### | ||
|
||
.. toctree:: | ||
:hidden: | ||
:glob: | ||
|
||
Specification document <self> | ||
./* | ||
|
||
.. | ||
TODO: Add when released | ||
The latest draft of the specification can be found on | ||
`GitHub <https://github.com/HUPO-PSI/mzPAF/blob/main/specification/mzPAF_specification_v1.0-draft14.docx?raw=true>`_. | ||
|
||
[TODO: Add when released] |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,35 @@ | ||
################### | ||
Reference molecules | ||
################### | ||
|
||
About | ||
===== | ||
|
||
.. include:: ../../specification/reference_data/README.md | ||
:parser: myst_parser.sphinx_ | ||
:start-line: 2 | ||
:end-line: -1 | ||
.. | ||
skip including title and last line with reference to this page | ||
See :ref:`Reference molecule ions` in the specification document for more information. | ||
|
||
|
||
Reference molecule table | ||
======================== | ||
|
||
The following analytes can be annotated as reference molecules with the ``r`` prefix and the | ||
listed name between square brackets (e.g. ``r[TMT127N]``). | ||
|
||
.. include:: ../../specification/reference_data/reference_molecules.md | ||
:parser: myst_parser.sphinx_ | ||
|
||
|
||
JSON schema | ||
=========== | ||
|
||
The ``reference_molecules.json`` file is defined by the following schema: | ||
|
||
.. include:: ../../specification/reference_data/reference_molecule_schema.md | ||
:parser: myst_parser.sphinx_ | ||
:start-line: 3 |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,13 +1,16 @@ | ||
# mzPAF specification reference data files | ||
|
||
The mzPAF specification uses these files as auxiliary reference data so that enumerated values can be extended without altering the specification document. | ||
The mzPAF specification uses `specification/reference_data/reference_molecules.json` as auxiliary | ||
reference data. In this way, the set of reference molecules can be extended without updating the | ||
specification document itself. | ||
|
||
- reference_molecules.json - Easily software parsable list of "reference molecules" often seen in peptide fragmentation spectra, but | ||
not normal peptide fragments, including isobaric labeling reagent related molecules, monosaccharides, nucleotides, etc. These | ||
molecules may be inidividual charged ions (typically protonated), or may be used as neutral losses as appropriate. | ||
The following files are available: | ||
|
||
- reference_molecules.md - Human-readable markdown tabular version of reference_molecules.json | ||
- `reference_molecules.json`: Software parsable list of "reference molecules" often seen in | ||
peptide fragmentation spectra, but not normal peptide fragments. This includes isobaric labeling | ||
reagent related molecules, monosaccharides, nucleotides, etc. These molecules may be individual | ||
charged ions (typically protonated), or may be used as neutral losses as appropriate. | ||
|
||
- reference_molecule_schema.json - JSON schema for reference_molecules.json | ||
- `reference_molecule_schema.json`: JSON schema defining the structure of the JSON file | ||
|
||
- reference_mol_to_md.py - Python script to transform reference_molecules.json into a markdown table | ||
A human-readable table with all reference molecules is available on https://mzpaf.readthedocs.io. |
This file was deleted.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,34 @@ | ||
# HUPO-PSI mzSpecLib reference molecule and ion list | ||
|
||
*Describe reference molecules or ions found in spectral libraries* | ||
|
||
## Pattern Properties | ||
|
||
- **`.{1,}`**: Refer to *[#/definitions/molecule](#definitions/molecule)*. | ||
## Definitions | ||
|
||
- <a id="definitions/molecule"></a>**`molecule`** *(object)*: A single molecule that may be present as a reporter ion or signature ion, or be a component of a neutral loss. | ||
- **`name`** *(string)*: The formal name for this molecule by which it should be referenced. | ||
- **`cv_term`** *(array)* | ||
- **Items** *(string)* | ||
- **`neutral_mass`** *(number)*: The neutral mass of the molecule not including any charge or charge carrier. | ||
- **`molecule_type`** *(string)*: A categorical label for this molecule. | ||
|
||
Examples: | ||
```json | ||
"monosaccharide" | ||
``` | ||
|
||
```json | ||
"reporter" | ||
``` | ||
|
||
```json | ||
"reporter+balance" | ||
``` | ||
|
||
- **`ion_mz`** *(number)*: The m/z of the molecule if it is expected to be reasonably different from the uncharged version. | ||
- **`chemical_formula`** *(string)*: The elemental formula of the neutral molecule. | ||
- **`ion_chemical_formula`** *(string)*: The chemical formula of the charged molecule. | ||
- **`references`** *(array)*: An array of sources and references describing this entity. | ||
- **Items** *(string)* |
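
The structure described above can be checked with a few lines of plain Python. This is a minimal sketch, not a replacement for validating against `reference_molecule_schema.json` with a JSON Schema validator: the helper names are hypothetical, the sample entries are made up for illustration and are not taken from `reference_molecules.json`, and because the rendered schema above does not say which fields are required, only fields that are present are type-checked.

```python
from typing import Dict, List

NUMBER = (int, float)

# Expected Python types per field, following the definitions listed above.
FIELD_TYPES = {
    "name": str,
    "cv_term": list,
    "neutral_mass": NUMBER,
    "molecule_type": str,
    "ion_mz": NUMBER,
    "chemical_formula": str,
    "ion_chemical_formula": str,
    "references": list,
}
STRING_ARRAYS = {"cv_term", "references"}


def check_molecule(entry: dict) -> List[str]:
    """Return a list of type problems found in one molecule entry."""
    problems = []
    for field, expected in FIELD_TYPES.items():
        if field not in entry:
            continue  # required fields are not known here, so absence is allowed
        value = entry[field]
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, expected):
            problems.append(f"{field}: unexpected type {type(value).__name__}")
        elif field in STRING_ARRAYS and not all(isinstance(i, str) for i in value):
            problems.append(f"{field}: all items should be strings")
    return problems


def check_library(library: Dict[str, dict]) -> Dict[str, List[str]]:
    """Check every entry; keys must be non-empty (pattern `.{1,}`)."""
    report = {}
    for key, entry in library.items():
        problems = [] if key else ["empty key"]
        problems += check_molecule(entry)
        if problems:
            report[key] = problems
    return report


# Made-up entries for illustration only.
sample = {
    "ExampleMolecule": {
        "name": "ExampleMolecule",
        "molecule_type": "reporter",
        "neutral_mass": 100.0,
        "references": [],
    },
    "Broken": {"name": 42, "cv_term": ["ok", 1]},
}
print(check_library(sample))
```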