Connect2resources
- VMH2Metabolon(metabolite_structure)
Reads in the Metabolon-to-VMH mapping, which was done partly manually and cross-checked independently from both sides. Currently, 400 metabolites are mapped. Information missing from the current rBioNet flat files is substituted with this information. The Metabolon ID (CHEM_ID in the input file) is also read in.
INPUT metabolite_structure metabolite structure
OUTPUT metabolite_structure Updated metabolite structure
Ines Thiele, 09/2021
- VMH2Seed(metabolite_structure)
Reads in the Metabolon-to-VMH mapping, which was done partly manually and cross-checked independently from both sides. Currently, 400 metabolites are mapped. Information missing from the current rBioNet flat files is substituted with this information. The Metabolon ID (CHEM_ID in the input file) is also read in.
INPUT metabolite_structure metabolite structure
OUTPUT metabolite_structure Updated metabolite structure
Ines Thiele, 09/2021
- assignAGORAReconPresence(metabolite_structure, reaction)
This function records whether a metabolite occurs in AGORA_X and ReconX.
INPUT metabolite_structure metabolite structure
      reaction default: false (0). Set to true (1) if the input is a reaction structure
OUTPUT metabolite_structure Updated metabolite structure
Ines Thiele, 09/2021
- assignClassyFire(metabolite_structure, startSearch, endSearch)
Gets the metabolite classification from ClassyFire.
INPUT metabolite_structure metabolite structure
      startSearch position in the metabolite structure at which the search should start. Must be numeric (optional; default: all metabolites in the structure are searched)
      endSearch position in the metabolite structure at which the search should end. Must be numeric (optional; default: all metabolites in the structure are searched)
OUTPUT metabolite_structure updated metabolite structure
Ines Thiele, 09/2021
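Several functions on this page take the optional startSearch and endSearch arguments to restrict processing to a slice of the metabolite structure. The following is a minimal MATLAB sketch of that pattern, not the toolbox code. It assumes that each metabolite is stored as one field of the structure; the field names and the `processed` list are hypothetical.

```matlab
% Hypothetical toy structure: one field per metabolite (an assumption).
metabolite_structure = struct('VMH_a', struct(), 'VMH_b', struct(), ...
                              'VMH_c', struct(), 'VMH_d', struct());
fields = fieldnames(metabolite_structure);

% Caller-supplied range; leave either variable undefined to use the default.
startSearch = 2;
endSearch = 3;

% Defaults cover all metabolites in the structure.
if ~exist('startSearch', 'var') || isempty(startSearch)
    startSearch = 1;
end
if ~exist('endSearch', 'var') || isempty(endSearch)
    endSearch = numel(fields);
end

processed = {};
for i = startSearch:endSearch
    % Stand-in for the per-metabolite query a real function would run.
    processed{end+1} = fields{i}; %#ok<SAGROW>
end
disp(processed)
```

Splitting a large structure into such ranges can be useful for the online queries, since a failed batch can be rerun without repeating the others.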
- convertOld2NewHMDB(HMDBId)
This function converts old-style HMDB IDs to the new style. Old style: ‘HMDB06525’; new style: ‘HMDB0006525’ (7 digits). The old ID is padded with zeros to obtain the new ID.
INPUT HMDBId HMDB id
OUTPUT HMDBId_new new style HMDB id
Ines Thiele 03/2022
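The zero-padding described for convertOld2NewHMDB can be sketched in a few lines of MATLAB. This illustrates the conversion rule only and is not necessarily how the function itself is implemented; the variable names are hypothetical.

```matlab
oldId = 'HMDB06525';

% Extract the trailing digits and left-pad them with zeros to 7 digits.
digitsOnly = regexp(oldId, '\d+$', 'match', 'once');
newId = sprintf('HMDB%07d', str2double(digitsOnly));

disp(newId)  % HMDB0006525
```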
- getCas2CTD(metabolite_structure)
The input file was obtained from http://ctdbase.org/reports/CTD_chemicals.csv.gz. The 1st column contains the CTD ID; the 3rd column contains the CAS number.
- getCas2Echa(metabolite_structure)
The input file was downloaded from https://echa.europa.eu/documents/10162/13629/ec_inventory_en.xlsx. The first column contains the echa_id; the 4th column contains the CAS registry number.
INPUT metabolite_structure metabolite structure
OUTPUT metabolite_structure Updated metabolite structure
Ines Thiele, 2020-2021
- getIDsFromBIGG
This m-file annotates the metabolite structure with IDs from BiGG using an offline file.
Ines Thiele 2020/2021
- getIDsfromFiehnLab(metabolite_structure, sourceId, targetId, startSearch, endSearch)
Connects to the Fiehn lab web service (associated paper: https://academic.oup.com/bioinformatics/article/26/20/2647/194184). The query URL has the form from / to / query term, e.g., http://cts.fiehnlab.ucdavis.edu/service/convert/kegg/inchikey/C00234
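The from / to / query term layout of the example URL can be assembled as follows. This is a sketch under assumptions: that sourceId and targetId hold the identifier type names seen in the example URL ('kegg', 'inchikey') is a guess, and `baseUrl` and `query` are hypothetical variable names.

```matlab
baseUrl  = 'http://cts.fiehnlab.ucdavis.edu/service/convert';
sourceId = 'kegg';      % identifier type to convert from (assumed value)
targetId = 'inchikey';  % identifier type to convert to (assumed value)
query    = 'C00234';    % identifier to look up

% Join the parts with '/' to reproduce the example URL.
url = strjoin({baseUrl, sourceId, targetId, query}, '/');
disp(url)

% Network call left commented out; the response format is not described here.
% response = webread(url);
```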
- getIds2VMH(metabolite_structure)
Maps Seed metabolites. The file was obtained from https://www.pnas.org/highwire/filestream/616377/field_highwire_adjunct_files/0/pnas.1401329111.sd01.xlsx for PMID 24927599. When retrieving the BiGG IDs, the script checks whether the IDs are still valid by testing the web link. Only valid BiGG IDs are added.
- getInchiStringFromHMDB(HMDBID)
This function retrieves the inchiString from HMDB (online) for a given HMDB ID.
INPUT HMDBID Human metabolome database (HMDB) ID
OUTPUT inchiString Retrieved inchiString
Ines Thiele, 09/2021
- getMetIdsFromInchiKeys(metabolite_structure, inchiKeyCheck, inchiStringCheck, inchiKeyAltCheck, metList)
This function connects to UniChem and grabs available IDs for metabolites that have InChI strings.
- getMetIdsFromUniChem(metabolite_structure, startSearch, endSearch, vmhIdCheck, cheBIIdCheck, drugBankCheck, pubChemIdCheck, keggIdCheck, hmdbCheck, inchiKeyCheck, inchiStringCheck, inchiKeyAltCheck)
This function connects to UniChem and grabs available IDs for metabolites that have InChI strings.
- getRxnFromKegg(metabolite_structure, metabolite_structure_rBioNet, metsField)
Gets the reaction from KEGG.
- getSeed2Kegg(metabolite_structure)
This function parses the file ftp://ftp.kbase.us/assets/KBase_Reference_Data/Biochemistry/compounds.xls. The first column contains the Seed ID; the 5th column contains the KEGG ID. This file is provided in /data/ as ‘compounds.xlsx’.
INPUT metabolite_structure metabolite structure
OUTPUT metabolite_structure Updated metabolite structure
Ines Thiele, 2020-2021
- parseBiggID4VMH(metabolite_structure, startSearch, endSearch, grebMoreIDs)
The problem is that, by chance, BiGG and VMH could use the same ID for different metabolites. I do not do any additional checks right now, which is dangerous (hence, I do not grab more IDs by default).
- parseBridgeDb(metabolite_structure, startSearch, endSearch)
function [metabolite_structure,IDsAdded,IdsMismatch] = parseBridgeDb(metabolite_structure)
This function takes existing database-dependent identifiers and searches BridgeDB (https://bridgedb.github.io/) via its web service for other database identifiers. These are added to the metabolite structure if the metabolite does not yet have the respective identifier. If the metabolite already has such an identifier and it does not match the one from BridgeDB, the mismatch is listed in ‘IdsMismatch’.
INPUT metabolite_structure metabolite structure
OUTPUT metabolite_structure updated metabolite structure
       IDsAdded List of added IDs from BridgeDB
       IdsMismatch List of mismatching IDs between VMH and BridgeDB
Ines Thiele October 2020
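The add-if-missing and report-if-different behaviour described for parseBridgeDb can be sketched as below. This is an illustration under assumptions, not the function's code: the field names `keggId` and `hmdb`, the ID values, and the row layout of the two output lists are hypothetical, and no web service is contacted.

```matlab
% Hypothetical record for one metabolite; an empty value means "no ID yet".
met = struct('keggId', 'C00031', 'hmdb', '');
% Hypothetical IDs returned by a lookup service for the same metabolite.
found = struct('keggId', 'C00221', 'hmdb', 'HMDB0000122');

IDsAdded = {};
IdsMismatch = {};
idFields = fieldnames(found);
for k = 1:numel(idFields)
    f = idFields{k};
    if isempty(met.(f))
        % Missing identifier: fill it in and record the addition.
        met.(f) = found.(f);
        IDsAdded(end+1, :) = {f, found.(f)}; %#ok<SAGROW>
    elseif ~strcmp(met.(f), found.(f))
        % Conflicting identifier: keep the existing one, report the conflict.
        IdsMismatch(end+1, :) = {f, met.(f), found.(f)}; %#ok<SAGROW>
    end
end
disp(IDsAdded)
disp(IdsMismatch)
```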
- parseCHOmineWebpage(metabolite_structure, startSearch, endSearch)
Tries to guess the CHOmine abbreviation based on the VMH ID.
- parseChemIDPlusWebpage(metabolite_structure, startSearch, endSearch)
Uses UNII IDs to parse the ChemIDplus webpage.
- parseDBCollection(metabolite_structure, startSearch, endSearch)
This function takes substantial time. Also note that the order matters; hence, some resources are parsed twice.
INPUT metabolite_structure metabolite structure
      startSearch position in the metabolite structure at which the search should start. Must be numeric (optional; default: all metabolites in the structure are searched)
      endSearch position in the metabolite structure at which the search should end. Must be numeric (optional; default: all metabolites in the structure are searched)
OUTPUT metabolite_structure Updated metabolite structure
Ines Thiele, 09/2021
- parseEPA4VMH(metabolite_structure, startSearch, endSearch)
Searches EPA CompTox using either the casRegistry or the inchiKey.
- parseMetaNetXWebpage(metabolite_structure, startSearch, endSearch)
function [metabolite_structure,IDsAdded,IDsSuggested] = parseMetaNetXWebpage(metabolite_structure)
This function first retrieves MetaNetX IDs based on existing IDs in the metabolite_structure (defined in queryFields). A MetaNetX ID is only added to the metabolite_structure (and to IDsAdded) if the MetaNetX inchiKey and the metabolite_structure inchiKey agree; otherwise, it is added to IDsSuggested. The function then takes all the MetaNetX IDs and retrieves further IDs to be added to the metabolite_structure. To do so, it first verifies each MetaNetX ID in the metabolite_structure by comparing the inchiKey in the metabolite_structure with the one from the MetaNetX ID. If they do not agree, the function tries to find the right ID based on the inchiKey in the metabolite structure. If unsuccessful, the MetaNetX ID is removed from the metabolite_structure and added to the IDsSuggested list. Further IDs are only retrieved for verified MetaNetX IDs.
INPUT metabolite_structure metabolite structure
      startSearch position in the metabolite structure at which the search should start. Must be numeric (optional; default: all metabolites in the structure are searched)
      endSearch position in the metabolite structure at which the search should end. Must be numeric (optional; default: all metabolites in the structure are searched)
OUTPUT metabolite_structure updated metabolite structure
       IDsAdded list of added IDs
       IDsSuggested list of suggested IDs
Ines Thiele 2020/2021
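The inchiKey check that decides between IDsAdded and IDsSuggested can be illustrated with a short MATLAB sketch. All values are placeholders, and the `candidates` struct array with its `metanetx` and `inchiKey` fields is a hypothetical stand-in for what the function retrieves from MetaNetX online.

```matlab
% Placeholder inchiKey stored for the metabolite in the local structure.
localInchiKey = 'KEY-ONE';

% Hypothetical candidate MetaNetX IDs with the inchiKey reported for each.
candidates = struct( ...
    'metanetx', {'MNXM_A', 'MNXM_B'}, ...
    'inchiKey', {'KEY-ONE', 'KEY-TWO'});

IDsAdded = {};
IDsSuggested = {};
for k = 1:numel(candidates)
    if strcmp(candidates(k).inchiKey, localInchiKey)
        % Keys agree: the ID counts as verified and is added.
        IDsAdded{end+1} = candidates(k).metanetx; %#ok<SAGROW>
    else
        % Keys disagree: the ID is only suggested for manual review.
        IDsSuggested{end+1} = candidates(k).metanetx; %#ok<SAGROW>
    end
end
disp(IDsAdded)
disp(IDsSuggested)
```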
- parseVMH4IDs(metabolite_structure, startSearch, endSearch)
INPUT metabolite_structure metabolite structure
      startSearch position in the metabolite structure at which the search should start. Must be numeric (optional; default: all metabolites in the structure are searched)
      endSearch position in the metabolite structure at which the search should end. Must be numeric (optional; default: all metabolites in the structure are searched)
OUTPUT metabolite_structure Updated metabolite structure
Ines Thiele, 09/2021
- parseWikipediaWebpage(metabolite_structure, startSearch, endSearch)
This function searches Wikipedia for identifiers. It either uses the Wikipedia IDs provided in the metabolite structure or tries to find perfect hits based on a metabolite name search.
INPUT metabolite_structure metabolite structure
      startSearch position in the metabolite structure at which the search should start. Must be numeric (optional; default: all metabolites in the structure are searched)
      endSearch position in the metabolite structure at which the search should end. Must be numeric (optional; default: all metabolites in the structure are searched)
OUTPUT metabolite_structure updated metabolite structure
Ines Thiele, 09/2021
- queryExposomeExplorer(metabolite_structure)
This function searches Exposome-Explorer for metabolite names, e.g., http://exposome-explorer.iarc.fr/search?utf8=%E2%9C%93&query=2-aminophenol+sulfate&button=
INPUT metabolite_structure metabolite structure
      startSearch position in the metabolite structure at which the search should start. Must be numeric (optional; default: all metabolites in the structure are searched)
      endSearch position in the metabolite structure at which the search should end. Must be numeric (optional; default: all metabolites in the structure are searched)
OUTPUT metabolite_structure updated metabolite structure
Ines Thiele, 09/2021
- queryLipidMaps(metabolite_structure, startSearch, endSearch)
This function searches LIPID MAPS for metabolite names, e.g., https://www.lipidmaps.org/search/quicksearch.php?Name=2-methyl-dodecanedioic+acid
INPUT metabolite_structure metabolite structure
      startSearch position in the metabolite structure at which the search should start. Must be numeric (optional; default: all metabolites in the structure are searched)
      endSearch position in the metabolite structure at which the search should end. Must be numeric (optional; default: all metabolites in the structure are searched)
OUTPUT metabolite_structure updated metabolite structure
Ines Thiele, 09/2021
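The example search URL for queryLipidMaps can be reproduced from a metabolite name as sketched below. The encoding is deliberately simplified (only spaces are replaced with '+'), and `metName` and `baseUrl` are hypothetical names; the toolbox function may build its query differently.

```matlab
metName = '2-methyl-dodecanedioic acid';
baseUrl = 'https://www.lipidmaps.org/search/quicksearch.php?Name=';

% Simplified encoding: only spaces are replaced; other special
% characters in a metabolite name are not handled here.
url = [baseUrl strrep(metName, ' ', '+')];
disp(url)
```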
- retrievePotHitsHMDB(met)
This function connects to HMDB and searches for the metabolite name. The first 10 hits are inspected, and the metabolite name is searched for in the traditional name, IUPAC name, synonyms, and common name. If one or more hits are found, the HMDB IDs are returned.
INPUT met Metabolite name
OUTPUT hmdb One or more HMDB IDs. If empty, no HMDB ID could be found.
       multipleHits This variable indicates whether there are multiple hits.
Ines Thiele, 09/2021
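The name matching described for retrievePotHitsHMDB can be sketched offline as follows. The `hits` struct array, its `id` and `names` fields, and the placeholder IDs are hypothetical; pooling all name types into one list and comparing case-insensitively are assumptions made for this illustration.

```matlab
met = 'D-Glucose';

% Hypothetical search hits; 'names' pools common name, IUPAC name,
% traditional name, and synonyms for each hit.
hits = struct( ...
    'id',    {'HMDB_X1', 'HMDB_X2'}, ...
    'names', {{'D-Glucose', 'Dextrose'}, {'L-Glucose'}});

maxHits = 10;  % only the first 10 hits are inspected
hmdb = {};
for k = 1:min(maxHits, numel(hits))
    % Case-insensitive exact match against any of the hit's names (assumption).
    if any(strcmpi(met, hits(k).names))
        hmdb{end+1} = hits(k).id; %#ok<SAGROW>
    end
end
multipleHits = numel(hmdb) > 1;
disp(hmdb)
```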
- searchMultipleUnknownMetOnline(metabolite_structure, metabolite_structure_rBioNet, metab_rBioNet_online, rxn_rBioNet_online, startSearch, endSearch)
INPUT metabolite_structure metabolite structure
      startSearch position in the metabolite structure at which the search should start. Must be numeric (optional; default: all metabolites in the structure are searched)
      endSearch position in the metabolite structure at which the search should end. Must be numeric (optional; default: all metabolites in the structure are searched)
OUTPUT metabolite_structure updated metabolite structure
Ines Thiele, 2020-2021
- searchUnknownMetOnline(met, VMHId, metabolite_structure_rBioNet, metab_rBioNet_online, rxn_rBioNet_online)
This function searches HMDB by name and returns a metabolite structure and the HMDB ID if the name appears in the common name, IUPAC name, synonyms, or traditional name.
INPUT met metabolite name (try to spell it correctly)
OUTPUT metabolite_structure metabolite structure
Ines Thiele, 09/2021