A Unified Standard Format for Proteomics Mass Spectrometry Data

At the Conference of the American Society for Mass Spectrometry, the Human Proteome Organisation’s Proteomics Standards Initiative (HUPO-PSI) announced a roadmap for creating a unified data interchange format for proteomics mass spectrometry.

The format will combine the current HUPO-PSI format (mzData) with the mzXML format.

The format will include features from both formats:
- An interchange schema which has split data vectors compatible with other analytical interchange formats
- Support for both random access indexes and digital signatures via a wrapper schema
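To make the idea of a random access index concrete, here is a minimal Python sketch. It is not the HUPO-PSI design: the schema had not been drafted at the time of the announcement, so the element names (`run`, `spectrum`, `mz`, `intensity`) and the `id` attribute are assumptions made purely for illustration. The sample document keeps m/z and intensity values in separate vectors, and the index maps each spectrum identifier to the byte offset where its element starts, so a reader can seek straight to one spectrum without parsing the whole file.

```python
import io
import re

# Hypothetical document: all element and attribute names are illustrative only.
# The m/z and intensity values are stored as two separate ("split") vectors.
DOC = (
    b'<run>\n'
    b'  <spectrum id="1"><mz>100.0 200.0</mz><intensity>5.0 9.0</intensity></spectrum>\n'
    b'  <spectrum id="2"><mz>150.0 250.0</mz><intensity>3.0 7.0</intensity></spectrum>\n'
    b'</run>\n'
)

SPECTRUM_OPEN = re.compile(rb'<spectrum id="([^"]+)">')
SPECTRUM_CLOSE = b'</spectrum>'


def build_index(data):
    """Map each spectrum id to the byte offset of its opening tag."""
    return {m.group(1).decode(): m.start() for m in SPECTRUM_OPEN.finditer(data)}


def read_spectrum(stream, offset):
    """Seek to a stored offset and read bytes up to the closing tag."""
    stream.seek(offset)
    chunk = bytearray()
    while SPECTRUM_CLOSE not in chunk:
        block = stream.read(64)
        if not block:
            raise ValueError("unterminated spectrum element")
        chunk += block
    end = chunk.index(SPECTRUM_CLOSE) + len(SPECTRUM_CLOSE)
    return bytes(chunk[:end])


index = build_index(DOC)
record = read_spectrum(io.BytesIO(DOC), index["2"])
```

Because the index stores byte offsets, any change to whitespace or encoding in the file invalidates it, which is presumably one reason a normalization step for the XML is useful before indexing or signing.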

The project will also provide tools for developers and users of the format:
- A program to normalize XML files for random access and digital signatures
- A validation program to ensure that the use of controlled vocabulary terms matches minimum reporting ("MIAPE") requirements
- An application programming interface (API), including language bindings for popular programming languages
- Abstract data models and other documentation to assist software developers who wish to implement systems based on the interchange format
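As a rough illustration of what a controlled vocabulary validator might check, the Python sketch below flags terms that are not in the vocabulary and required terms that are missing. Everything specific here is an assumption: the `cvParam` element, the `accession` attribute, the `EX:` accession numbers, and the choice of required terms are hypothetical placeholders, not actual HUPO-PSI vocabulary entries or MIAPE rules.

```python
import xml.etree.ElementTree as ET

# Hypothetical vocabulary and reporting rules, invented for this example.
ALLOWED_TERMS = {
    "EX:0001": "instrument model",
    "EX:0002": "ionization type",
    "EX:0003": "analyzer type",
}
REQUIRED_TERMS = {"EX:0001", "EX:0002"}


def validate(xml_text):
    """Return a list of problems found in the document's vocabulary terms."""
    root = ET.fromstring(xml_text)
    used = [param.get("accession") for param in root.iter("cvParam")]
    # Terms that appear in the document but not in the vocabulary.
    problems = [f"unknown term: {acc}" for acc in used if acc not in ALLOWED_TERMS]
    # Terms the reporting rules demand but the document omits.
    problems += [
        f"missing required term: {acc}"
        for acc in sorted(REQUIRED_TERMS - set(used))
    ]
    return problems


SAMPLE = """<description>
  <cvParam accession="EX:0001" name="instrument model" value="Model A"/>
  <cvParam accession="EX:0009" name="mystery" value="?"/>
</description>"""

problems = validate(SAMPLE)
```

For the sample document, the validator reports one unknown term and one missing required term; a real validator would load both the vocabulary and the reporting rules from maintained sources rather than hard-coding them.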

In addition to the interchange format and software to help read and validate documents, the project will also develop reference implementations of data converters that produce files in the format from the output of as many mass spectrometry instruments as possible.

Reference implementations of converters will be developed as open source software projects with the assistance of mass spectrometry instrument vendors and the community of software developers working in the field of mass spectrometry informatics.

The project timeline calls for the majority of the deliverables to be completed by the end of 2006:

August:
- Data model (UML)
- Ontology models

September:
- Documentation
- Draft specification of schema
- Language bindings (Parsing API)

December:
- Binary indexing & signatures programs
- Validation program
- Reference implementations of converters