Analyse combinedModel, input to component contribution
Author: Ronan Fleming, NUI Galway, Leiden University
Reviewers:
INTRODUCTION
PROCEDURE
Configure the environment
All the installation instructions are in a separate .md file named vonBertalanffy.md in docs/source/installation
With all dependencies installed correctly, we configure our environment, verfy all dependencies, and add required fields and directories to the matlab path.
aPath = which('initVonBertalanffy');
basePath = strrep(aPath,'vonBertalanffy/initVonBertalanffy.m','');
addpath(genpath(basePath))
folderPattern=[filesep 'old'];
editCobraToolboxPath(basePath,folderPattern,method)
removing: /home/rfleming/work/sbgCloud/code/fork-cobratoolbox/src/analysis/thermo/componentContribution/old
removing: /home/rfleming/work/sbgCloud/code/fork-cobratoolbox/src/analysis/thermo/directionalityReport/old
removing: /home/rfleming/work/sbgCloud/code/fork-cobratoolbox/src/analysis/thermo/groupContribution/old
removing: /home/rfleming/work/sbgCloud/code/fork-cobratoolbox/src/analysis/thermo/inchi/old
removing: /home/rfleming/work/sbgCloud/code/fork-cobratoolbox/src/analysis/thermo/molFiles/old
removing: /home/rfleming/work/sbgCloud/code/fork-cobratoolbox/src/analysis/thermo/protons/old
removing: /home/rfleming/work/sbgCloud/code/fork-cobratoolbox/src/analysis/thermo/trainingModel/old
aPath = which('initVonBertalanffy');
basePath = strrep(aPath,'vonBertalanffy/initVonBertalanffy.m','');
addpath(genpath(basePath))
folderPattern=[filesep 'new'];
editCobraToolboxPath(basePath,folderPattern,method)
adding: /home/rfleming/work/sbgCloud/code/fork-cobratoolbox/src/analysis/thermo/componentContribution/new
adding: /home/rfleming/work/sbgCloud/code/fork-cobratoolbox/src/analysis/thermo/groupContribution/new
adding: /home/rfleming/work/sbgCloud/code/fork-cobratoolbox/src/analysis/thermo/inchi/new
adding: /home/rfleming/work/sbgCloud/code/fork-cobratoolbox/src/analysis/thermo/molFiles/new
adding: /home/rfleming/work/sbgCloud/code/fork-cobratoolbox/src/analysis/thermo/protons/new
adding: /home/rfleming/work/sbgCloud/code/fork-cobratoolbox/src/analysis/thermo/trainingModel/new
initVonBertalanffy
ChemAxon Marvin Beans is installed and working.
linux-vdso.so.1 (0x00007fff445d9000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f91d2049000)
libopenbabel.so.5 => /usr/lib/libopenbabel.so.5 (0x00007f91d1df9000)
libstdc++.so.6 => /usr/lib/x86_64-linux-gnu/libstdc++.so.6 (0x00007f91d1bdf000)
libgcc_s.so.1 => /usr/local/bin/MATLAB/R2021a/sys/os/glnxa64/libgcc_s.so.1 (0x00007f91d19c7000)
/lib64/ld-linux-x86-64.so.2 (0x00007f91d2269000)
libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007f91d19bf000)
libz.so.1 => /lib/x86_64-linux-gnu/libz.so.1 (0x00007f91d19a3000)
libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007f91d1854000)
libgomp.so.1 => /usr/lib/x86_64-linux-gnu/libgomp.so.1 (0x00007f91d180f000)
libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007f91d17ec000)
babel must depend on the system libstdc++.so.6 not the one from MATLAB
Trying to edit the 'LD_LIBRARY_PATH' to make sure that it has the correct system path before the Matlab path!
The solution will be arch dependent
Load combined model
load('data_prior_to_componentContribution')
Statistics on the combined model
fprintf('%u%s\n',nnz(combinedModel.trainingMetBool),' training metabolites')
fprintf('%u%s\n',nnz(combinedModel.trainingMetBool & combinedModel.groupDecomposableBool),' of which are Moiety decomposable.')
627 of which are Moiety decomposable.
fprintf('%u%s\n',nnz(combinedModel.trainingMetBool & ~combinedModel.inchiBool),' of which have no inchi.')
45 of which have no inchi.
fprintf('%u%s\n',nnz(combinedModel.trainingMetBool & combinedModel.inchiBool & ~combinedModel.groupDecomposableBool),' of which are not Moiety decomposable.')
0 of which are not Moiety decomposable.
fprintf('%u%s\n',nnz(combinedModel.testMetBool),' test metabolites')
fprintf('%u%s\n',nnz(combinedModel.testMetBool & combinedModel.groupDecomposableBool),' of which are Moiety decomposable.')
1997 of which are Moiety decomposable.
fprintf('%u%s\n',nnz(combinedModel.testMetBool & ~combinedModel.inchiBool),' of which have no inchi.')
994 of which have no inchi.
fprintf('%u%s\n',nnz(combinedModel.testMetBool & combinedModel.inchiBool & ~combinedModel.groupDecomposableBool),' ... of which are not Moiety decomposable.')
3 ... of which are not Moiety decomposable.
fprintf('%u%s\n',size(combinedModel.S,1),' combined model metabolites.')
3666 combined model metabolites.
fprintf('%u%s\n',nnz(combinedModel.trainingMetBool & ~combinedModel.testMetBool),' ... of which are exclusively training metabolites.')
672 ... of which are exclusively training metabolites.
fprintf('%u%s\n',nnz(combinedModel.trainingMetBool & combinedModel.testMetBool),' ... of which are both training and test metabolites.')
0 ... of which are both training and test metabolites.
fprintf('%u%s\n',nnz(~combinedModel.trainingMetBool & combinedModel.testMetBool),' ... of which are exclusively test metabolites.')
2994 ... of which are exclusively test metabolites.
Sparsity pattern of combinedModel.S
title([int2str(size(combinedModel.S,1)) ' x ' int2str(size(combinedModel.S,2))])
nReactionsPerMetabolite=full(sum(combinedModel.S~=0,2));
histogram(nReactionsPerMetabolite(nReactionsPerMetabolite~=0),'BinWidth',2)
title('Number of Reactions per metabolite')
ylabel('log(#Reactions)')
fprintf('%u%s\n',nnz(nReactionsPerMetabolite==0),' metabolites without reactions.')
2998 metabolites without reactions.
nMetabolitesPerReaction=full(sum(combinedModel.S~=0,1)');
histogram(nMetabolitesPerReaction)
title('Number of Metabolites per Reaction')
if any(nMetabolitesPerReaction==0)
error('combinedModel.S reaction without a metabolite')
dX = diag(combinedModel.S*combinedModel.S');
ylabel('log(stoichiometric degree)')
xlabel('metabolites sorted by stoichiometric degree')
Sparsity pattern of combinedModel.G
title([int2str(size(combinedModel.G,1)) ' x ' int2str(size(combinedModel.G,2))])
nGroupPerMetabolite = full(sum(combinedModel.G~=0,2));
histogram(nGroupPerMetabolite,'BinWidth',2)
xlabel('#Moieties per metabolite')
if any(nGroupPerMetabolite==0)
error('Metabolite without any Moiety')
nMetabolitePerGroup = sum(combinedModel.G~=0,1)';
histogram(nMetabolitePerGroup);
histogram(nMetabolitePerGroup);
xlabel('#Metabolites per Moiety')
if any(nMetabolitePerGroup==0)
error('Moiety without metabolite')
Sparsity pattern of combinedModel.StG
StG=combinedModel.S'*combinedModel.G;
title(['S''*G ' int2str(size(StG,1)) ' x ' int2str(size(StG,2))])
nReactingMoieties=full(sum(StG~=0,2));
histogram(nReactingMoieties,'BinWidth',2)
title('Number of Reacting Moieties per reaction')
xlabel('#Reacting moieties')
fprintf('%u%s\n',nnz(nReactingMoieties==0),' reactions without reacting moieties.')
19 reactions without reacting moieties.
printRxnFormula(combinedModel,combinedModel.rxns(nReactingMoieties==0));
TECRDB_373 C00002 + C00015 -> C00008 + C00075
TECRDB_764 C00004 + C00006 -> C00003 + C00005
TECRDB_765 C00004 + C00006 -> C00003 + C00005
TECRDB_766 C00004 + C00006 -> C00003 + C00005
TECRDB_767 C00004 + C00006 -> C00003 + C00005
TECRDB_768 C00004 + C00006 -> C00003 + C00005
TECRDB_769 C00004 + C00006 -> C00003 + C00005
TECRDB_770 C00004 + C00006 -> C00003 + C00005
TECRDB_771 C00004 + C00006 -> C00003 + C00005
TECRDB_772 C00004 + C00006 -> C00003 + C00005
TECRDB_773 C00004 + C00006 -> C00003 + C00005
TECRDB_822 C00003 + C00005 -> C00004 + C00006
TECRDB_823 C00003 + C00005 -> C00004 + C00006
TECRDB_824 C00003 + C00005 -> C00004 + C00006
TECRDB_2364 C00002 + C00015 -> C00008 + C00075
TECRDB_2521 C00002 + C00035 -> C00008 + C00044
TECRDB_2639 C00002 + C00104 -> C00008 + C00081
TECRDB_2870 C00166 -> C02763
TECRDB_3904 C00036 -> C03981
nReactionsPerMoiety=full(sum(StG~=0,1)');
histogram(nReactionsPerMoiety)
title('Number of Reactions per moiety')
fprintf('%u%s%u%s\n',nnz(nReactionsPerMoiety==0),' of the ', length(nReactionsPerMoiety), ' moieties do not react in any training reaction.')
3789 of the 5189 moieties do not react in any training reaction.