Analysemetstruct
- acceptIDsSuggested(metabolite_structure, IDsSuggested, annotationSource)
function [metabolite_structure,IDsAdded] = acceptIDsSuggested(metabolite_structure,IDsSuggested,annotationSource)
This function accepts suggested IDs from the list IDsSuggested and adds them to the metabolite_structure. Each row in IDsSuggested that is to be accepted must have an entry in the 6th column specifying that the ID in the 2nd column is accepted for the metabolite in the 1st column. Accepting an ID raises the annotation/curation level to curated, so each row and suggested ID must be evaluated carefully before it is accepted.
INPUT
    metabolite_structure    metabolite structure
    IDsSuggested            list of suggested IDs; each row to be accepted must have 'accepted' in the 6th column
    annotationSource        source of annotation, e.g., 'curator (name)'
OUTPUT
    metabolite_structure    updated metabolite structure
    IDsAdded                list of added IDs
Ines Thiele, 01/2021
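The row selection described above can be sketched as follows. This is a minimal illustration, not the function's actual implementation: the metabolite abbreviations and ID values are placeholders, and because only columns 1, 2, and 6 are described here, columns 3 to 5 are left empty.

```matlab
% Hypothetical IDsSuggested list. Column 1: metabolite, column 2: suggested
% ID, column 6: acceptance flag. Columns 3-5 are undocumented placeholders.
IDsSuggested = {
    'met_a', 'suggestedID_1', '', '', '', 'accepted'
    'met_b', 'suggestedID_2', '', '', '', ''
    'met_c', 'suggestedID_3', '', '', '', 'accepted'
    };

% Keep only the rows that carry 'accepted' in the 6th column.
isAccepted = strcmp(IDsSuggested(:, 6), 'accepted');
acceptedRows = IDsSuggested(isAccepted, :);

% List what would be added to the metabolite structure.
for i = 1:size(acceptedRows, 1)
    fprintf('%s: accept %s\n', acceptedRows{i, 1}, acceptedRows{i, 2});
end
```

Reviewing the filtered rows before calling acceptIDsSuggested is a cheap safeguard, since accepted IDs are marked as curated.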
- addAnnotations(metabolite_structure, RAW, annotationSource, annotationType, annotationVerification)
function [metabolite_structure] = addAnnotations(metabolite_structure,RAW,annotationSource,annotationType,annotationVerification)
This function adds annotations (fields) to the metabolite_structure. It is generally used to populate the metabolite_structure with new metabolites from an xlsx sheet (RAW).
INPUT
    metabolite_structure      metabolite structure
    RAW                       data read in using the function xlsread, e.g., [NUM,TXT,RAW]=xlsread('MetaboliteTranslationTable.xlsx'); Note that the xlsx sheet must have certain headers to be read in correctly.
    annotationSource          annotation source, used to track where the information came from (e.g., 'Recon3D'). If not specified, 'unknown' will be added.
    annotationType            type of annotation, e.g., 'automatic' (default), 'manual'
    annotationVerification    verification of annotation, e.g., 'not verified' (default), 'verified by curator', 'verified based on inchiKeys'
OUTPUT
    metabolite_structure      updated metabolite structure
Ines Thiele 2020/2021
- addInfoFromMolFiles(metabolite_structure, folderName, startSearch, endSearch)
This function creates inchiStrings, smiles, and inchiKeys from the provided mol files when these fields are empty (NaN) in the structure.
INPUT
    metabolite_structure    metabolite structure
    folderName              name of the folder that contains the mol structures
    startSearch             position in the metabolite structure at which the search should start. Must be numeric (optional; by default, all metabolites in the structure are searched).
    endSearch               position in the metabolite structure at which the search should end. Must be numeric (optional; by default, all metabolites in the structure are searched).
OUTPUT
    metabolite_structure    updated metabolite structure
Ines Thiele, 09/2021
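The optional startSearch and endSearch bounds, together with the NaN convention for empty fields, can be illustrated with a short sketch. The layout assumed here (one field per metabolite, each with an inchiString subfield) and the metabolite names are assumptions for illustration only; the function's internals may differ.

```matlab
% Hypothetical metabolite structure: one field per metabolite, NaN marks an
% empty entry. This layout is an assumption made for the example.
metabolite_structure = struct();
metabolite_structure.met_a.inchiString = NaN;
metabolite_structure.met_b.inchiString = 'InChI=1S/H2O/h1H2';
metabolite_structure.met_c.inchiString = NaN;

mets = fieldnames(metabolite_structure);

% Optional numeric bounds; leave empty to cover all metabolites.
startSearch = [];
endSearch = [];
if isempty(startSearch), startSearch = 1; end
if isempty(endSearch), endSearch = numel(mets); end

% Collect the metabolites in range whose inchiString is still empty (NaN).
toFill = {};
for i = startSearch:endSearch
    value = metabolite_structure.(mets{i}).inchiString;
    if isnumeric(value) && all(isnan(value))
        toFill{end + 1, 1} = mets{i}; %#ok<AGROW>
    end
end
disp(toFill);
```

Restricting the range is mainly useful for resuming a long run part-way through a large structure.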
- addMetFormulaCharge(metabolite_structure, startSearch, endSearch)
This function uses getInchiString2ChargedFormula.m to calculate the charge and the neutral formula.
INPUT
    metabolite_structure    metabolite structure
    startSearch             position in the metabolite structure at which the search should start. Must be numeric (optional; by default, all metabolites in the structure are searched).
    endSearch               position in the metabolite structure at which the search should end. Must be numeric (optional; by default, all metabolites in the structure are searched).
OUTPUT
    metabolite_structure    updated metabolite structure
Ines Thiele 09/21
- check4DuplicatesInList(list)
This function checks for duplicate entries in a list.
INPUT
    list              list of, e.g., metabolite abbreviations
OUTPUT
    listDuplicates    list of duplicated entries. The second (or later) occurrence of each duplicate is provided.
Ines Thiele, 09/2021
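One way to find second and later occurrences in a list is sketched below with MATLAB's built-in unique. This illustrates the idea only and is not necessarily how check4DuplicatesInList is implemented; the abbreviations are example values, and whether the real function reports every repeat or each duplicate once is an assumption here.

```matlab
% Example list of metabolite abbreviations with repeated entries.
list = {'glc_D'; 'atp'; 'glc_D'; 'h2o'; 'atp'; 'glc_D'};

% With 'stable', firstIdx holds the index of the first occurrence of each
% unique entry, in the original order.
[~, firstIdx] = unique(list, 'stable');

% Mark first occurrences; everything else is a second or later occurrence.
isFirst = false(numel(list), 1);
isFirst(firstIdx) = true;
listDuplicates = list(~isFirst);

disp(listDuplicates);
```

In this sketch an entry that appears three times is reported twice, once for each repeat.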
- checkAbbrExists(list, metab_rBioNet_online, rxn_rBioNet_online, metabolite_structure_rBioNet)
This function checks whether the abbreviations in the list already exist in the VMH or in the most recent rBioNetDB, either as reaction or as metabolite abbreviations.
INPUT
    list                            list of abbreviations (either metabolites or reactions). Alternatively, a metabolite structure can be given as input, in which case more fields are compared.
    metab_rBioNet_online
    rxn_rBioNet_online
    metabolite_structure_rBioNet
OUTPUT
    VMH_existance                   lists whether the abbreviation exists in the VMH (online), as a reaction (2nd entry) or as a metabolite (3rd entry)
    rBioNet_existance               lists whether the abbreviation exists in rBioNet (as deposited in the COBRA Toolbox online), as a reaction (2nd entry) or as a metabolite (3rd entry)
Ines Thiele 09/2021
- checkLinkValidity(metabolite_structure, startSearch, endSearch)
The aim of this script is to take each of the IDs collected and test whether
- getIDfromMetStructure(metabolite_structure, idName)
INPUT
    metabolite_structure    structure containing metabolite-related information and IDs
    idName                  name of the ID to be retrieved, as used in the metabolite structure (e.g., 'pubChemId')
OUTPUT
    VMH2IDmappingAll        mapping of all VMH metabolites present in the metabolite structure (including NaNs)
    VMH2IDmappingPresent    mapping of all VMH metabolites present in the metabolite structure (excluding NaNs)
    VMH2IDmappingMissing    abbreviations of metabolites whose ID is NaN in the metabolite structure
IT, Aug 2020
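The relationship between the three outputs can be sketched as follows. The structure layout (one field per metabolite with VMHId and pubChemId subfields) and all ID values are placeholders assumed for illustration; this is not the function's source code.

```matlab
% Hypothetical metabolite structure; NaN marks a missing ID.
metabolite_structure = struct();
metabolite_structure.met_a = struct('VMHId', 'met_a', 'pubChemId', 'ID_1');
metabolite_structure.met_b = struct('VMHId', 'met_b', 'pubChemId', NaN);
metabolite_structure.met_c = struct('VMHId', 'met_c', 'pubChemId', 'ID_3');

idName = 'pubChemId';
mets = fieldnames(metabolite_structure);

% Mapping of every metabolite to the requested ID, NaNs included.
VMH2IDmappingAll = cell(numel(mets), 2);
for i = 1:numel(mets)
    VMH2IDmappingAll{i, 1} = metabolite_structure.(mets{i}).VMHId;
    VMH2IDmappingAll{i, 2} = metabolite_structure.(mets{i}).(idName);
end

% Split the mapping into entries with an ID and entries without one.
hasNoId = cellfun(@(x) isnumeric(x) && all(isnan(x)), VMH2IDmappingAll(:, 2));
VMH2IDmappingPresent = VMH2IDmappingAll(~hasNoId, :);
VMH2IDmappingMissing = VMH2IDmappingAll(hasNoId, 1);
```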
- getStatsMetStruct(metabolite_structure)
INPUT
    metabolite_structure    metabolite structure
OUTPUT
    IDs                     list of ID names
    IDcount                 count per ID
    Table                   table listing IDs per reaction
Ines Thiele, 09/2021
- list2MetaboliteStructure(fileName, molFileDirectory, metList, fileNameOutput, metabolite_structure_rBioNet, customMetAbbrList)
This function reads in an xlsx file and converts it into a metabolite_structure. The minimum requirement is that the VMH IDs are present in one column of the table.
INPUT
    fileName                name of the xlsx file
    molFileDirectory        directory containing the mol files obtained from ctf; new mol files will be added here
    metList
    fileNameOutput
OUTPUT
    metabolite_structure    metabolite structure containing the metabolites with VMH IDs listed in the xlsx file
    rBioNet_existance       array indicating whether the query abbreviations already exist in rBioNet (online and the growing internal database). Col 1: assigned initial VMHId; col 2: ID exists as a reaction abbreviation; col 3: ID exists as a metabolite abbreviation; col 4: VMHId (if not empty, this abbreviation was used in the metabolite structure instead of the one given in col 1).
    VMH_existance           array indicating whether the query abbreviations already exist in the VMH (online). Col 1: assigned initial VMHId; col 2: ID exists as a reaction abbreviation; col 3: ID exists as a metabolite abbreviation.
Ines Thiele, 09/2021
- verifyInchiString(metabolite_structure)
function [metabolite_structure] = verifyInchiString(metabolite_structure)
This function verifies whether the inchiString and the formula/charge match for the entries in the metabolite_structure. If the inchiString is neutral but the chargedFormula is not neutral, only a note is added to the inchiString_source. If the inchiString does not match or represents a different charge (not neutral and not overlapping with the metabolite charge), the inchiString is removed from the metabolite_structure and added to an IDsSuggested list.
Ines Thiele 2020/2021
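How verifyInchiString decides whether an inchiString is neutral is not documented here. As a rough illustration of the distinction, standard InChI strings carry charge information in the /q (charge) and /p (proton) layers, so a simple check can look for either layer. The sketch below is an assumption-laden approximation and not the function's own logic.

```matlab
% Example InChI strings: acetic acid (neutral), acetate (deprotonated),
% and the sodium cation.
inchiStrings = {
    'InChI=1S/C2H4O2/c1-2(3)4/h1H3,(H,3,4)'
    'InChI=1S/C2H4O2/c1-2(3)4/h1H3,(H,3,4)/p-1'
    'InChI=1S/Na/q+1'
    };

% A string is treated as charged when it contains a /q or /p layer.
layerStart = regexp(inchiStrings, '/[qp]', 'once');
hasChargeLayer = ~cellfun(@isempty, layerStart);

for i = 1:numel(inchiStrings)
    fprintf('%s -> charge layer present: %d\n', inchiStrings{i}, hasChargeLayer(i));
end
```

A check like this only flags the presence of a charge layer; comparing the actual net charge against the metabolite charge would require parsing the layer values.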