Title: | Disease-Drived Differential Proteins Co-Expression Network Analysis |
---|---|
Description: | Functions designed to connect disease-related differential proteins and co-expression networks. It provides basic statistical analysis, including t-tests and ANOVA. Network construction is not offered by the package; you can use the 'WGCNA' package, which is described in Langfelder and Horvath (2008) <doi:10.1186/1471-2105-9-559>. It also provides module analysis, including PCA, two enrichment analyses, planar maximally filtered graph (PMFG) extraction and hub analysis. |
Authors: | Kefu Liu [aut, cre] |
Maintainer: | Kefu Liu <[email protected]> |
License: | GPL-2 |
Version: | 0.3.0 |
Built: | 2025-03-09 06:14:38 UTC |
Source: | https://github.com/liukf10/ddpna |
Disease-drived protein-associated network in cross-species crosstalk.
The package is used to analyze differential proteomics consensus networks in two or more datasets.
The function Data_impute needs the impute package from Bioconductor; the functions ID_match and MaxQdataconvert need the Biostrings package from Bioconductor.
Package: | DDPNA |
Type: | Package |
Version: | 0.2.5 |
Created: | 2019-03-18 |
Date: | 2020-06-26 |
License: | GPL (>= 2) |
Kefu Liu
Maintainer: Kefu Liu <[email protected]>
ANOVA analysis of proteomic data.
anova_p(data, group)
data |
protein quantification data. Columns are samples; rows are protein IDs. |
group |
sample group information |
Kefu Liu
data(imputedData)
data <- imputedData
logD <- data$log2_value
rownames(logD) <- data$inf$ori.ID
group <- gsub("[0-9]+", "", colnames(logD))
anova_P <- anova_p(logD[1:100,], group)
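To illustrate what a per-protein ANOVA involves, here is a minimal base-R sketch that computes one one-way ANOVA p value per row. It is not the package's implementation: `row_anova_p` and the toy matrix are hypothetical names used only for illustration.

```r
# Hypothetical helper (not part of DDPNA): one-way ANOVA p value per row.
row_anova_p <- function(data, group) {
  group <- factor(group)
  apply(as.matrix(data), 1, function(x) {
    fit <- aov(x ~ group)
    summary(fit)[[1]][["Pr(>F)"]][1]   # p value of the group effect
  })
}

# Toy data: 5 proteins, 3 groups of 3 samples each.
set.seed(1)
toy <- matrix(rnorm(5 * 9), nrow = 5,
              dimnames = list(paste0("P", 1:5),
                              c(paste0("ctl", 1:3), paste0("asym", 1:3),
                                paste0("ad", 1:3))))
toy["P1", 7:9] <- toy["P1", 7:9] + 10   # make P1 clearly different in "ad"

# Same group extraction as in the example above.
group <- gsub("[0-9]+", "", colnames(toy))
p <- row_anova_p(toy, group)
```

The result is a named numeric vector with one p value per protein, so it can be filtered or adjusted with `p.adjust` afterwards.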
extract significant differential protein
changedID(relative_value, group, vs.set2, vs.set1 = "WT", rank = "none", anova = TRUE, anova.cutoff = 0.05, T.cutoff = 0.05, Padj = "fdr", cutoff = 1.5, datatype = c("none","log2"), fctype = "all",...)
relative_value |
protein quantification data |
group |
sample group information |
vs.set2 |
compared group 2 name |
vs.set1 |
compared group 1 name |
rank |
order by which type. This must be (an abbreviation of) one of the strings " |
anova |
a logical value indicating whether to perform ANOVA analysis. |
anova.cutoff |
a numeric value giving the upper limit of the ANOVA p value. |
T.cutoff |
a numeric value giving the upper limit of the t.test p value. |
Padj |
p adjust methods of multiple comparisons.
it can seen in |
cutoff |
a numeric value giving the lower limit of the fold change. |
datatype |
The quantification data is normal data or log2 data. |
fctype |
foldchange is ordered by up-regulated or down-regulated or changed |
... |
Other arguments. |
extract significant differential protein ID based on foldchange, t.test p value, anova p value.
a vector of protein ID information.
Kefu Liu
data(imputedData)
data <- imputedData
logD <- data$log2_value
rownames(logD) <- data$inf$ori.ID
group <- gsub("[0-9]+","", colnames(logD))
up <- changedID(logD[201:260,], group, vs.set2 = "ad", vs.set1 = "ctl",
                rank = "foldchange", anova = FALSE, Padj = "none",
                cutoff = 1, datatype = "log2", fctype = "up")
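The selection logic described above (fold change plus t-test p value) can be sketched in base R as follows. This is only an approximation under stated assumptions: `pick_up` is a hypothetical helper, the data are assumed to be log2 values, and the cutoff is interpreted on the linear fold-change scale, which may differ from how `changedID` treats `cutoff` internally.

```r
# Hypothetical sketch (not DDPNA code): pick up-regulated rows in log2 data.
pick_up <- function(logD, group, vs.set2, vs.set1, cutoff = 1.5, T.cutoff = 0.05) {
  x2 <- logD[, group == vs.set2, drop = FALSE]
  x1 <- logD[, group == vs.set1, drop = FALSE]
  log2fc <- rowMeans(x2) - rowMeans(x1)        # log2 fold change, set2 vs set1
  p <- sapply(seq_len(nrow(logD)),
              function(i) t.test(x2[i, ], x1[i, ])$p.value)
  # Assumption: cutoff is a linear fold change, so compare against log2(cutoff).
  keep <- which(log2fc > log2(cutoff) & p < T.cutoff)
  # Rank the surviving IDs by fold change, largest first.
  rownames(logD)[keep[order(log2fc[keep], decreasing = TRUE)]]
}

logD <- rbind(
  A = c(0.0, 0.1, -0.1, 2.0, 2.1, 1.9),   # strongly up in "ad"
  B = c(0.0, 0.1, -0.1, 0.1, 0.0, -0.1),  # unchanged
  C = c(0.0, 0.1, -0.1, 1.0, 1.1, 0.9))   # moderately up in "ad"
colnames(logD) <- c("ctl1", "ctl2", "ctl3", "ad1", "ad2", "ad3")
group <- gsub("[0-9]+", "", colnames(logD))

up <- pick_up(logD, group, vs.set2 = "ad", vs.set1 = "ctl", cutoff = 1.5)
```

In this toy data only proteins A and C pass both filters, and A is listed first because it has the larger fold change.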
Data cleaning process: detect and remove outlier samples and impute missing values. The process is as follows:
1. Remove genes whose proportion of missing values is larger than maxNAratio.
2. Detect outlier samples and remove them.
3. Repeat steps 1-2 until the iteration limit is reached or no outlier sample can be detected.
4. Impute the missing values.
The function can also perform only the gene filtering, the outlier removal, or the missing value imputation.
Data_impute(data, inf = "inf", intensity = "LFQ", miss.value = NA, splNExt = TRUE, maxNAratio = 0.5, removeOutlier = TRUE, outlierdata = "intensity", iteration = NA, sdout = 2, distmethod = "manhattan", A.IAC = FALSE, dohclust = FALSE, treelabels = NA, plot = TRUE, filename = NULL, text.cex = 0.7, text.col = "red", text.pos = 1, text.labels = NA, abline.col = "red", abline.lwd = 2, impute = TRUE, verbose = 1, ...)
data |
MaxQconvert data, or a list containing two data.frames: ID information and quantification data |
inf |
the name of the data.frame that contains protein ID information |
intensity |
the name of the data.frame that contains only quantification data |
miss.value |
the type of miss.value shown in quantification data.
The default value is |
splNExt |
a logical value indicating whether to extract sample names (suited for MaxQuant quantification data). |
maxNAratio |
The maximum proportion of missing data allowed in any row (default 50%). Rows with more than maxNAratio missing values will be deleted. |
removeOutlier |
a logical value indicating whether to remove outlier samples. |
outlierdata |
The value is deprecated.
Which data will be used for outlier sample detection. This must be (an abbreviation of) one of the strings " |
iteration |
a numeric value indicating how many times to go through the outlier sample detect-and-remove loop. |
sdout |
a numeric value indicating the threshold used to judge outlier samples. The default |
distmethod |
The distance measure to be used. This must be (an abbreviation of) one of the strings " |
A.IAC |
a logical value indicated whether decreasing |
dohclust |
a logical value indicating whether to do hierarchical clustering and plot dendrograms. |
treelabels |
labels of dendrograms |
plot |
a logical value indicating whether to plot numbersd scatter diagrams. |
filename |
the filename of plot. The number and plot type information will be added automatically. The default value is |
text.cex |
outlier sample annotation text size(scatter diagrams parameters) |
text.col |
outlier sample annotation color(scatter diagrams parameters) |
text.pos |
outlier sample annotation position(scatter diagrams parameters) |
text.labels |
outlier sample annotation (scatter diagrams parameters) |
abline.col |
the threshold line color (scatter diagrams parameters) |
abline.lwd |
the threshold line width (scatter diagrams parameters) |
impute |
a logical value indicating whether to do knn imputation. |
verbose |
integer level of verbosity. Zero means silent, 1 means some diagnostic messages are shown. |
... |
Other arguments. |
detect and remove outlier sample and impute missing value.
a list of proteomic data.
inf |
Protein information, including protein IDs and other information. |
intensity |
Quantification information. |
relative_value |
intensity divided by geometric mean |
log2_value |
log2 of relative_value |
Kefu Liu
data(Dforimpute)
data <- Data_impute(Dforimpute, distmethod = "manhattan")
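Steps 1 and 2 of the cleaning process can be sketched in base R as shown below. This is a rough illustration, not the package's algorithm: `filter_rows` and `flag_outliers` are hypothetical helpers, and the outlier rule used here (a sample whose mean distance to the other samples lies more than `sdout` standard deviations above the average) is an assumption about how an sd-based threshold could work. The kNN imputation step, which relies on the Bioconductor impute package, is not shown.

```r
# Hypothetical sketch of steps 1-2 (not the DDPNA implementation).

# Step 1: drop rows whose proportion of missing values exceeds maxNAratio.
filter_rows <- function(x, maxNAratio = 0.5) {
  x[rowMeans(is.na(x)) <= maxNAratio, , drop = FALSE]
}

# Step 2 (assumed rule): flag samples whose mean distance to the other
# samples is more than sdout standard deviations above the average.
flag_outliers <- function(x, sdout = 2, distmethod = "manhattan") {
  d <- as.matrix(dist(t(x), method = distmethod))  # sample-by-sample distances
  meand <- rowSums(d) / (ncol(d) - 1)              # mean distance to other samples
  z <- (meand - mean(meand)) / sd(meand)
  names(z)[z > sdout]
}

x <- matrix(1, nrow = 4, ncol = 8,
            dimnames = list(paste0("P", 1:4), paste0("S", 1:8)))
x[, "S8"] <- 5          # S8 is far from every other sample
x["P4", 1:5] <- NA      # P4 is missing in 5 of 8 samples (62.5%)

kept <- filter_rows(x, maxNAratio = 0.5)
out  <- flag_outliers(kept, sdout = 2)
```

Here P4 is removed by the missing-value filter and S8 is flagged as the only outlier sample; in an iterative setting, the two steps would be repeated on the data with S8 removed.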
Summarize the statistical information of the data.
dataStatInf(prodata, group, intensity = "intensity", Egrp = NULL, Cgrp = "ctl", meanmethod = "mean", datatype = c("none", "log2"), anova = TRUE, T.test = c("pairwise", "student", "none"), Aadj = "none", Tadj = "none", cutoff = FALSE, ...)
prodata |
proteome data: a list containing two data.frames, ID information and quantification data |
intensity |
the name of the data.frame that contains only quantification data |
group |
sample group information |
Egrp |
experiment group name. It must be assigned when using the Student t-test. |
Cgrp |
control group name. It must be assigned. The default value is "ctl". |
meanmethod |
Arithmetic mean of sample group or median of sample group.
This must be (an abbreviation of) one of the strings " |
datatype |
The quantification data is normal data or log2 data. |
anova |
a logical value indicating whether to perform ANOVA analysis. |
T.test |
T.test method. "none" means t.test is not run.
"pairwise" means pairwise comparisons between group levels are calculated, with corrections for multiple testing.
"student" means Student's t-test.
This must be (an abbreviation of) one of the strings " |
Aadj |
anova P value adjust methods. it can seen in |
Tadj |
T test P value adjust methods. it can seen in |
cutoff |
a logical value or a numeric value. The default value is FALSE, which means no P value is removed. If the value is TRUE, P values > 0.05 are removed and shown as NA in the result. If the value is numeric, P values larger than that number are removed and shown as NA in the result. |
... |
Other arguments. |
a data.frame of protein ID and Statistics information.
Kefu Liu
data(imputedData)
group <- gsub("[0-9]+","", colnames(imputedData$intensity))
data <- imputedData
data$inf <- data$inf[1:100,]
data$intensity <- data$intensity[1:100,]
stat <- dataStatInf(data, group, meanmethod = "median",
                    T.test = "pairwise", Aadj = "fdr",
                    Tadj = "fdr", cutoff = FALSE)
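The behaviour of the `cutoff` argument described above can be mimicked with a few lines of base R. The helper `mask_p` is hypothetical, and applying the cutoff after the p value adjustment is an assumption made for this sketch rather than a statement about the package internals.

```r
# Hypothetical helper (not DDPNA code): adjust p values, then replace the
# ones above the cutoff with NA, following the description of `cutoff`.
mask_p <- function(p, adj = "none", cutoff = FALSE) {
  p <- p.adjust(p, method = adj)
  if (isTRUE(cutoff)) cutoff <- 0.05          # TRUE means a 0.05 threshold
  if (is.numeric(cutoff)) p[p > cutoff] <- NA # FALSE leaves everything as is
  p
}

p <- c(0.001, 0.02, 0.04, 0.5)
unmasked <- mask_p(p, adj = "none", cutoff = FALSE)
raw_05   <- mask_p(p, adj = "none", cutoff = TRUE)
fdr_05   <- mask_p(p, adj = "fdr",  cutoff = 0.05)
```

With no adjustment only the last value is masked; with FDR adjustment the third value rises above 0.05 as well, so two values become NA.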
Get the DEP enrichment fold in each module and plot a heatmap.
DEP_Mod_HeatMap(DEP_Mod, xlab = "DEP", filter = c("p","p.adj"), cutoff = 0.05, filename = NULL, ...)
DEP_Mod |
a list of DEP_Mod enrichment information. data.frame in list is get from |
xlab |
it indicate x value in heatmap. it must be a value between " |
filter |
p value or p.adjust value used to filter the enrich significant module. |
cutoff |
a numeric value giving the p value cutoff. Modules with a value larger than the cutoff are not shown in the plot. |
filename |
plot filename. If filename is NULL, the plot is printed. |
... |
other argument. |
a list of enrich fold heatmap information.
enrichFold |
enrichFold of DEP in Modules. |
textMatrix |
significant enrichment module information. |
Kefu Liu
data(net)
data(imputedData)
data <- imputedData
logD <- data$log2_value
rownames(logD) <- data$inf$ori.ID
group <- gsub("[0-9]+","", colnames(logD))
Module <- Module_inf(net, data$inf)
# define 2 DEP ID data: a and b
a <- Module$ori.ID[1:100]
b <- Module$ori.ID[50:100]
a <- Module_Enrich(Module, a, coln="ori.ID", enrichtype = "ORA")
b <- Module_Enrich(Module, b, coln="ori.ID", enrichtype = "ORA")
rowname <- a$module.name
a <- data.frame(Counts = a$Counts, module.size = a$module.size,
                precent = a$precent, p = a$p, p.adj = a$p.adj,
                Z.score = a$Z.score, stringsAsFactors = FALSE)
rownames(a) <- rowname
rowname <- b$module.name
b <- data.frame(Counts = b$Counts, module.size = b$module.size,
                precent = b$precent, p = b$p, p.adj = b$p.adj,
                Z.score = b$Z.score, stringsAsFactors = FALSE)
rownames(b) <- rowname
DEP_Mod <- list(a = a , b = b)
heatMapInf <- DEP_Mod_HeatMap(DEP_Mod)
Remove hubs that are not in the IDsets and replot the PFG network.
DEP_Mod_net_plot(ModNet, IDsets = NULL, data = NULL, module = NULL, plot = TRUE, filename = NULL, filetype = "pdf", OnlyPlotLast = TRUE, BranchCut = TRUE, reconstructNet = TRUE, iteration = Inf, label.hubs.only = TRUE, node.default.color = "grey", hubLabel.col = "black", ...)
ModNet |
data contains network information which get from |
IDsets |
ID sets information which get from |
data |
the value should be defined only when |
module |
the value should be defined only when |
plot |
a logical value indicating whether to plot a picture. |
filename |
the filename of plot. The default value is |
filetype |
the file type of plot. the type should be one of "eps", "ps", "tex" (pictex), "pdf", "jpeg", "tiff", "png", "bmp", "svg" or "wmf" (windows only). |
OnlyPlotLast |
a logical value indicating whether to plot the final network. |
BranchCut |
a logical value indicating whether to remove non-hub proteins which have no connection to DEPs. |
reconstructNet |
a logical value indicating whether to reconstruct the network. |
iteration |
number of iterations when reconstructing the network. |
label.hubs.only |
a logical value indicating whether to show labels for hubs only. |
node.default.color |
Default node colors for those that do not intersect with signatures in gene.set. |
hubLabel.col |
Label color for hubs. |
... |
a list containing network information
netgene |
all IDs in network. |
hub |
hub IDs |
PMFG |
PMFG graph data frame information |
Kefu Liu
data(net)
data(imputedData)
Module <- Module_inf(net, imputedData$inf)
group <- gsub("[0-9]+","", colnames(imputedData$intensity))
data <- imputedData
data$inf <- data$inf[1:100,]
data$intensity <- data$intensity[1:100,]
stat <- dataStatInf(data, group, meanmethod = "median",
                    T.test = "pairwise", Aadj = "fdr",
                    Tadj = "fdr", cutoff = FALSE)
stat1 <- stat$ori.ID[stat$ad > 1]
stat2 <- stat$ori.ID[stat$asym > 1]
datalist <- list(stat1 = stat1, stat2 = stat2)
sets <- DEPsets(datalist)
logD <- imputedData$log2_value
rownames(logD) <- imputedData$inf$ori.ID
Mod3 <- getmoduleHub(logD, Module, 3, coln = "ori.ID", adjustp = FALSE)
newnet <- DEP_Mod_net_plot(Mod3, sets, data = logD, module = Module,
                           plot = FALSE, filename = NULL, filetype = "pdf",
                           OnlyPlotLast = FALSE, reconstructNet = FALSE)
Extract the intersection sets and complementary sets of two or more ID sets and define their colors.
DEPsets(datalist, colors = c("red", "green", "blue"))
datalist |
a list containing two or more ID sets. |
colors |
the color of each ID set. |
a list containing the intersection sets, the complementary sets and their colors.
gene.set |
a list of each set ID information. |
color.code |
the colors of each set |
Kefu Liu
data(net)
data(imputedData)
Module <- Module_inf(net, imputedData$inf)
group <- gsub("[0-9]+","", colnames(imputedData$intensity))
data <- imputedData
data$inf <- data$inf[1:100,]
data$intensity <- data$intensity[1:100,]
stat <- dataStatInf(data, group, meanmethod = "median",
                    T.test = "pairwise", Aadj = "fdr",
                    Tadj = "fdr", cutoff = FALSE)
stat <- rename_dupnewID(stat, Module, DEPfromMod = TRUE)
stat1 <- stat$new.ID[stat$ad > 1]
stat2 <- stat$new.ID[stat$asym > 1]
datalist <- list(stat1 = stat1, stat2 = stat2)
sets <- DEPsets(datalist)
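The idea of splitting several ID sets into intersection and complementary sets can be sketched in base R as follows. `region_sets` is a hypothetical helper, and the `&`-joined region names are an illustrative convention rather than the naming used by `DEPsets`; color assignment is left out.

```r
# Hypothetical sketch (not DDPNA code): split IDs into mutually exclusive
# regions according to which of the input sets they belong to.
region_sets <- function(datalist) {
  ids <- unique(unlist(datalist))
  # Membership label for each ID, e.g. "stat1" or "stat1&stat2".
  label <- sapply(ids, function(id) {
    member <- sapply(datalist, function(s) id %in% s)
    paste(names(datalist)[member], collapse = "&")
  })
  split(ids, label)
}

datalist <- list(stat1 = c("A", "B", "C"),
                 stat2 = c("B", "C", "D"))
sets <- region_sets(datalist)
```

For these two toy sets the result has three regions: IDs found only in `stat1`, IDs found only in `stat2`, and IDs shared by both.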
Pick up proteins based on foldchange and return proteins position in data.
fc.pos(fc, vs.set2, vs.set1 = "WT", cutoff = 1, datatype = c("none", "log2"), fctype = "all", order = TRUE)
fc |
proteomic data of mean value in groups. |
vs.set2 |
compared group 2 name |
vs.set1 |
compared group 1 name |
cutoff |
a numeric value giving the foldchange threshold. |
datatype |
The quantification data is normal data or log2 data.
This must be (an abbreviation of) one of the strings " |
fctype |
foldchange is ordered by up-regulated or down-regulated or changed |
order |
a logical value indicating whether the result is ordered by foldchange. |
Kefu Liu
data(imputedData)
data <- imputedData
relative <- data$relative_value
rownames(relative) <- data$inf$ori.ID
group <- gsub("[0-9]+", "", colnames(relative))
datamean <- groupmean(relative, group, name = FALSE)
fc_1vs2 <- fc.pos(datamean, vs.set2 = "ad", vs.set1 = "ctl",
                  cutoff = 1, datatype = "none",
                  fctype = "up", order = TRUE)
fc_ID <- rownames(relative)[fc_1vs2]
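As a rough illustration of selecting row positions by fold change, the sketch below works on linear-scale group means and handles only the up-regulated case. `fc_positions` is a hypothetical helper written for this example; it does not reproduce the `datatype` or `fctype` handling of `fc.pos`.

```r
# Hypothetical sketch (not DDPNA code): positions of rows whose fold change
# (vs.set2 over vs.set1) exceeds the cutoff, on linear-scale data.
fc_positions <- function(fc, vs.set2, vs.set1, cutoff = 1, order = TRUE) {
  ratio <- fc[, vs.set2] / fc[, vs.set1]
  pos <- which(ratio > cutoff)                       # up-regulated rows only
  if (order) pos <- pos[order(ratio[pos], decreasing = TRUE)]
  unname(pos)
}

# Toy group means: one column per group, one row per protein.
fc <- cbind(ctl = c(1, 2, 1, 4),
            ad  = c(3, 1, 1.5, 4))
pos <- fc_positions(fc, vs.set2 = "ad", vs.set1 = "ctl", cutoff = 1)
```

Rows 1 and 3 have ratios of 3 and 1.5, so they are returned in that order; the positions can then index the row names of the original data, as in the example above.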
plot of FCS enrichment analysis
FCSenrichplot(FCSenrich, count = 1, p = 0.05, filter = "p", plot = TRUE, filename = NULL,filetype = "pdf", ...)
FCSenrich |
FCS enrichment information which is obtained from |
count |
a numeric value. A module is chosen when its count number is larger than the count value. |
p |
a numeric value. A module is chosen when any Fisher's exact test p value is less than the p value. |
filter |
filter methods.
This must be (an abbreviation of) one of the strings " |
plot |
a logical value indicating whether draw enrichment variation trend plot. |
filename |
the filename of plot. The default value is |
filetype |
the file type of plot. the type should be one of "eps", "ps", "tex" (pictex), "pdf", "jpeg", "tiff", "png", "bmp", "svg" or "wmf" (windows only). |
... |
Other arguments. |
Kefu Liu
data(imputedData)
data(net)
data <- imputedData
logD <- data$log2_value
rownames(logD) <- data$inf$ori.ID
group <- gsub("[0-9]+","", colnames(logD))
Module <- Module_inf(net, data$inf)
pos <- which(Module$moduleNum %in% c(11:13))
up <- changedID(logD[pos,], group, vs.set2 = "ad", vs.set1 = "ctl",
                rank = "foldchange", anova = FALSE, Padj = "none",
                cutoff = 1, datatype = "log2", fctype = "up")
FCSenrich <- Module_Enrich(Module[pos,], up, coln="ori.ID")
FCSenrich <- FCSenrichplot(FCSenrich)
extract PMFG information and get Module hub proteins.
getmoduleHub(data, module, mod_num, coln = "new.ID", cor.sig = 0.05, cor.r = 0, cor.adj="none", adjustp = TRUE, hub.p = 0.05)
data |
proteomic quantification data. |
module |
module information which is obtained from |
mod_num |
the name of the module to be calculated. |
coln |
the name of the column of module that contains protein IDs. It could be matched with " |
cor.sig |
a numeric value: correlations with a p value less than cor.sig will be picked. |
cor.r |
a numeric value: correlations with an r value larger than cor.r will be picked. |
cor.adj |
P value correction method. Method information can be seen in |
adjustp |
a logical value indicating whether to pick hub proteins by the FDR method. |
hub.p |
a numeric value: hub proteins are those with a p value less than hub.p. |
a list containing PMFG network information: list(hub = hubgene, degreeStat = Stat, graph = g, PMFG = gg)
hub |
hub information. |
degreeStat |
degree statistics information |
graph |
the original graph data frame |
PMFG |
PMFG graph data frame |
Kefu Liu
data(net)
data(imputedData)
data <- imputedData
logD <- data$log2_value
rownames(logD) <- data$inf$ori.ID
group <- gsub("[0-9]+","", colnames(logD))
Module <- Module_inf(net, data$inf)
Mod10 <- getmoduleHub(logD, Module, 10, coln = "ori.ID", adjustp = FALSE)
if (requireNamespace("MEGENA", quietly = TRUE)) {
  library(MEGENA)
  plot_subgraph(module = Mod10$degreeStat$gene,
                hub = Mod10$hub, PFN = Mod10$PMFG,
                node.default.color = "black",
                gene.set = NULL, color.code = c("grey"), show.legend = TRUE,
                label.hubs.only = TRUE, hubLabel.col = "red", hubLabel.sizeProp = 0.5,
                show.topn.hubs = 10, node.sizeProp = 13, label.sizeProp = 13,
                label.scaleFactor = 10, layout = "kamada.kawai")
}
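The role of `cor.sig`, `cor.r` and `cor.adj` in choosing which protein pairs are kept can be illustrated with a small base-R sketch. `cor_edges` is a hypothetical helper that only shows the correlation filtering step; the PMFG construction and the hub detection performed by `getmoduleHub` are not reproduced here.

```r
# Hypothetical sketch (not DDPNA code): keep protein pairs whose correlation
# passes both the p value threshold and the r threshold.
cor_edges <- function(data, cor.sig = 0.05, cor.r = 0, cor.adj = "none") {
  pairs <- t(combn(rownames(data), 2))          # all unordered row pairs
  stat <- t(apply(pairs, 1, function(p) {
    ct <- cor.test(data[p[1], ], data[p[2], ])  # Pearson correlation test
    c(r = unname(ct$estimate), p = ct$p.value)
  }))
  edges <- data.frame(from = pairs[, 1], to = pairs[, 2], stat,
                      stringsAsFactors = FALSE)
  edges$p <- p.adjust(edges$p, method = cor.adj)
  edges[edges$p < cor.sig & edges$r > cor.r, ]
}

x <- 1:6
data <- rbind(A = x,
              B = 2 * x + c(0.1, -0.1, 0.1, -0.1, 0.1, -0.1),  # tracks A closely
              C = c(2, 5, 1, 6, 3, 4))                          # unrelated to A and B
e <- cor_edges(data, cor.sig = 0.05, cor.r = 0)
```

Only the A-B pair survives in this toy data, because the correlations involving C are weak and not significant with six samples.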
mean of sample group
groupmean(data, group, method = c("mean", "median"), name = TRUE)
data |
protein quantification data. column is sample. row is protein ID. |
group |
sample group information |
method |
Arithmetic mean of sample group or median of sample group.
This must be (an abbreviation of) one of the strings " |
name |
a logical value indicating whether to add "mean" or "median" to the sample group name. |
Kefu Liu
data(imputedData)
data <- imputedData
logD <- data$log2_value
group <- gsub("[0-9]+","", colnames(logD))
datamean <- groupmean(logD, group, name = FALSE)
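A per-group mean or median of the kind described above can be computed in base R along these lines. `group_summary` is a hypothetical stand-in for illustration, and the `_mean` / `_median` column suffix used when `name = TRUE` is an assumption about the naming, not the package's documented output.

```r
# Hypothetical sketch (not DDPNA code): mean or median of each sample group,
# computed row by row.
group_summary <- function(data, group, method = c("mean", "median"), name = TRUE) {
  method <- match.arg(method)
  f <- if (method == "mean") mean else median
  out <- sapply(unique(group), function(g) {
    apply(data[, group == g, drop = FALSE], 1, f)
  })
  # Assumed naming convention: append the method to the group name.
  if (name) colnames(out) <- paste(colnames(out), method, sep = "_")
  out
}

x <- rbind(P1 = c(1, 3, 10, 30),
           P2 = c(2, 4, 6, 8))
colnames(x) <- c("ctl1", "ctl2", "ad1", "ad2")
group <- gsub("[0-9]+", "", colnames(x))

m  <- group_summary(x, group, name = FALSE)
m2 <- group_summary(x, group, method = "median", name = TRUE)
```

The result has one row per protein and one column per group, which is the shape expected by a fold-change step that compares two group columns.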
homolog protein Uniprot ID match
ID_match(data, db1.path = NULL, db2.path = NULL,out.folder = NULL, blast.path = NULL,evalue = 0.1, verbose = 1)
data |
dataset of protein information. Column names should contain "ori.ID" and "ENTRY.NAME". "ori.ID" is the Uniprot ID. |
db1.path |
fasta file, database of the transferred species |
db2.path |
fasta file, database of the original species |
out.folder |
blast result output folder. The folder should be the same as the folder of db1.path. |
blast.path |
blast+ software install path |
evalue |
blast threshold; a lower value is more rigorous |
verbose |
integer level of verbosity. Zero means silent, 1 means diagnostic messages are shown. |
Homolog protein Uniprot ID matching is based on the ENTRY.NAME, gene name and sequence homology between two different species or different versions of a database.
a data.frame with 4 columns: ori.ID, ENTRY.NAME, new.ID, match.type.
This function requires the 'blast+' software, version 2.7.1. 'blast+' download website: https://ftp.ncbi.nlm.nih.gov/blast/executables/blast+/LATEST/ If 'blast+' is not installed, an R function is used instead, but it will take a lot of time. db1.path, db2.path and out.folder all need the complete path. out.folder and db1.path should be in the same folder. Paths should contain no special characters. data should have the column names ori.ID and ENTRY.NAME.
Kefu Liu
# suggested to install blast+ software
# it will take a long time without blast+ software
data(Sample_ID_data)
if(requireNamespace("Biostrings", quietly = TRUE)){
  out.folder = tempdir();
  write.table(Sample_ID_data$db1, file.path(out.folder,"db1.fasta"),
              quote = FALSE, row.names = FALSE, col.names = FALSE);
  write.table(Sample_ID_data$db2, file.path(out.folder,"db2.fasta"),
              quote = FALSE, row.names = FALSE, col.names = FALSE);
  data <- ID_match(Sample_ID_data$ID_match_data,
                   db1.path = file.path(out.folder,"db1.fasta"),
                   db2.path = file.path(out.folder,"db2.fasta"),
                   out.folder = out.folder,
                   blast.path = NULL,
                   evalue = 0.1, verbose = 1)
  file.remove( file.path(out.folder,"db1.fasta"),
               file.path(out.folder,"db2.fasta"))
}
'Maxquant' quantification data extract and homolog protein Uniprot ID match.
MaxQdataconvert(pgfilename, IDname = "Majority.protein.IDs", IDtype = c("MaxQ","none"), CONremove = TRUE, justID = TRUE, status1 = TRUE, ENTRY1 = TRUE, db1.path = NULL, db2.path = NULL, out.folder = NULL, blast.path = NULL, savecsvpath = NULL, csvfilename = NULL, verbose = 1, ...)
pgfilename |
'MaxQuant' quantification file "proteinGroups.txt" |
IDname |
The column name of uniprot ID. The default value is " |
IDtype |
" |
CONremove |
a logical value indicating whether to remove contaminant IDs. When IDtype is "none", IDs that do not match database2 are removed. |
justID |
a logical value indicating whether to extract only the ID when IDtype is "MaxQ". |
status1 |
a logical value indicating whether to extract the first ID status when IDtype is "MaxQ". |
ENTRY1 |
a logical value indicating whether to extract the first ID ENTRY NAME when IDtype is "MaxQ". |
db1.path |
fasta file, database of the transferred species |
db2.path |
fasta file, database of the original species |
out.folder |
blast result output folder. The folder should be the same as the folder of db1.path. |
blast.path |
blast+ software install path |
savecsvpath |
the output path of the csv file. The default value means the csv file is not saved. |
csvfilename |
the name of the csv file to which the data are written. The default value means the csv file is not saved. |
verbose |
integer level of verbosity. Zero means silent, higher values make the output progressively more verbose. |
... |
Other arguments. |
One-step extraction and conversion of MaxQuant or other quantification data. The function includes the ID_match function.
a list of proteomic information.
protein_IDs |
Protein IDs which is |
intensity |
Quantification intensity information. When |
iBAQ |
Quantification iBAQ intensity information. (only for |
LFQ |
Quantification LFQ intensity information. (only for |
The function requires the 'blast+' software, version 2.7.1. 'blast+' download website: https://ftp.ncbi.nlm.nih.gov/blast/executables/blast+/LATEST/ db1.path, db2.path and out.folder all need the complete path. out.folder and db1.path should be in the same folder. Paths should contain no special characters.
Kefu Liu
# suggested to install blast+ software
# it will take a long time without blast+ software
data(Sample_ID_data)
if(requireNamespace("Biostrings", quietly = TRUE)){
  out.folder = tempdir();
  write.table(Sample_ID_data$db1, file.path(out.folder,"db1.fasta"),
              quote = FALSE, row.names = FALSE, col.names = FALSE);
  write.table(Sample_ID_data$db2, file.path(out.folder,"db2.fasta"),
              quote = FALSE, row.names = FALSE, col.names = FALSE);
  write.table(Sample_ID_data$pginf,
              file = file.path(out.folder,"proteingroups.txt"),
              quote = FALSE, sep = "\t", dec = ".",
              row.names = FALSE, col.names = TRUE )
  Maxdata <- MaxQdataconvert(file.path(out.folder,"proteingroups.txt"),
                             IDtype = "MaxQ",
                             db1.path = file.path(out.folder,"db1.fasta"),
                             db2.path = file.path(out.folder,"db2.fasta"),
                             out.folder = out.folder,
                             blast.path = NULL)
  file.remove( file.path(out.folder,"db1.fasta"),
               file.path(out.folder,"db2.fasta"),
               file.path(out.folder,"proteingroups.txt") )
}
The function separates the data into 4 parts: protein information, intensity, iBAQ and LFQ (iBAQ and LFQ only apply to 'MaxQuant' software results). For MaxQ data, it can remove contaminant and reverse proteins.
MaxQprotein(proteinGroups, IDname = "Majority.protein.IDs", IDtype = "MaxQ", remove = TRUE, QuanCol = NULL, verbose = 1)
proteinGroups |
the proteomic quantification data |
IDname |
The column name of the UniProt ID. The default value is " |
IDtype |
" |
remove |
a logical value indicating whether to remove contaminant and reverse IDs. |
QuanCol |
The quantification data columns. It's only needed when |
verbose |
integer level of verbosity. Zero means silent; 1 prints diagnostic messages. |
a list of proteomic information.
protein_IDs |
Protein IDs which is |
intensity |
Quantification intensity information. When |
iBAQ |
Quantification iBAQ intensity information (only for |
LFQ |
Quantification LFQ intensity information (only for |
Kefu Liu
data(ProteomicData)
# example for MaxQ data
MaxQdata <- MaxQprotein(ProteomicData$MaxQ)
# example for other types of data
otherdata <- MaxQprotein(ProteomicData$none, IDname = "Protein",
                         IDtype = "none", QuanCol = 2:9)
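The splitting step described for MaxQprotein can be pictured with a small base R sketch. It is not the package's implementation: the toy table, the column names (Potential.contaminant, Reverse and the Intensity., iBAQ. and LFQ.intensity. prefixes) and the helper split_cols are assumptions chosen to resemble a typical 'MaxQuant' proteinGroups table.

```r
# Toy stand-in for a MaxQuant proteinGroups table (all values are made up).
pg <- data.frame(
  Majority.protein.IDs  = c("P1", "CON__P2", "REV__P3", "P4"),
  Potential.contaminant = c("", "+", "", ""),
  Reverse               = c("", "", "+", ""),
  Intensity.s1          = c(10, 20, 30, 40),
  Intensity.s2          = c(11, 21, 31, 41),
  iBAQ.s1               = c(1, 2, 3, 4),
  iBAQ.s2               = c(1.5, 2.5, 3.5, 4.5),
  LFQ.intensity.s1      = c(5, 6, 7, 8),
  LFQ.intensity.s2      = c(5.5, 6.5, 7.5, 8.5),
  stringsAsFactors = FALSE
)

# Drop rows flagged as contaminant or reverse hits (assumed "+" flag).
keep <- pg$Potential.contaminant != "+" & pg$Reverse != "+"
pg <- pg[keep, ]

# Hypothetical helper: select the columns whose names start with a prefix.
split_cols <- function(df, prefix) {
  df[, grepl(paste0("^", prefix), colnames(df)), drop = FALSE]
}

parts <- list(
  protein_IDs = pg$Majority.protein.IDs,
  intensity   = split_cols(pg, "Intensity\\."),
  iBAQ        = split_cols(pg, "iBAQ\\."),
  LFQ         = split_cols(pg, "LFQ\\.intensity\\.")
)
str(parts)
```

The list names mirror the components documented above, so the sketch can be compared with the real return value of MaxQprotein.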
put sample names as rownames in WGCNA module eigenvalue data.frame.
ME_inf(MEs, data, intensity.type = "LFQ", rowname = NULL)
MEs |
module eigenvalues calculated by the WGCNA package. |
data |
protein quantification data. Columns are samples; rows are protein IDs. |
intensity.type |
quantification data type, which can help extract sample name.
This must be (an abbreviation of) one of the strings " |
rowname |
sample names when " |
Kefu Liu
data(net)
data(imputedData)
data <- imputedData
logD <- data$log2_value
MEs <- ME_inf(net$MEs, logD)
Extract module PCA components.
modpcomp(data, colors, nPC = 2, plot = FALSE, filename = NULL, group = NULL)
data |
protein quantification data. Columns are samples; rows are protein IDs. |
colors |
protein and module information, which is calculated in the WGCNA package. |
nPC |
how many PCA components will be saved. |
plot |
a logical value indicating whether to draw a PCA plot. This requires loading ggfortify first. |
filename |
The filename of the plot. The default value is |
group |
sample group information. |
Kefu Liu
data(net)
data(imputedData)
data <- imputedData
logD <- data$log2_value
rownames(logD) <- data$inf$ori.ID
Module_PCA <- modpcomp(logD, net$colors)
# to draw the PCA plot for module 6
group <- gsub("[0-9]+", "", colnames(logD))
pos <- which(net$colors == 6)
if (requireNamespace("ggfortify", quietly = TRUE)) {
  require("ggfortify")
  Module_PCA <- modpcomp(logD[pos, ], net$colors[pos],
                         plot = TRUE, group = group)
}
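For readers who want to see what a per-module PCA involves, the following base R sketch runs prcomp on each module separately and keeps the first nPC components. It only illustrates the idea and may differ from what modpcomp does internally; the toy matrix, the module labels and the helper module_pcs are hypothetical.

```r
set.seed(1)
# Toy data: 12 proteins (rows) x 6 samples (columns); values are random.
expr <- matrix(rnorm(12 * 6), nrow = 12,
               dimnames = list(paste0("prot", 1:12), paste0("s", 1:6)))
# Hypothetical module labels, one per protein (like net$colors from WGCNA).
colors <- rep(c(1, 2, 3), each = 4)

# Hypothetical helper: PCA scores of the samples, computed module by module.
module_pcs <- function(expr, colors, nPC = 2) {
  lapply(split(seq_len(nrow(expr)), colors), function(idx) {
    # prcomp expects samples in rows, so transpose the protein-by-sample block.
    pca <- prcomp(t(expr[idx, , drop = FALSE]), center = TRUE, scale. = TRUE)
    pca$x[, seq_len(nPC), drop = FALSE]
  })
}

pcs <- module_pcs(expr, colors, nPC = 2)
dim(pcs[["1"]])  # samples x components for module 1
```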
Enrichment analysis of a set of proteins in all modules. The function offers two enrichment methods: ORA and FCS.
Module_Enrich(module, classifiedID, enrichtype = "FCS", coln = "new.ID", datainf = NULL, p.adj.method = "BH")
module |
module information which is obtained from |
classifiedID |
a set of protein IDs ordered by change value, p value and so on. |
enrichtype |
enrichment method.
This must be (an abbreviation of) one of the strings " |
coln |
column name of module that contains the protein IDs. It could be matched with " |
datainf |
proteomic data protein ID information.
The default value is " |
p.adj.method |
p adjust methods of multiple comparisons.
It can be seen in |
a list containing classifiedID enrichment information.
Counts |
the counts of classifiedID in module. |
module.size |
the number of module ID |
module.name |
module name |
precent |
counts divided by module.size |
p |
enrichment p value in each module |
p.adj |
enrichment p.adj value in each module |
Z.score |
Z score is -log2 P value. |
Kefu Liu
data(net)
data(imputedData)
data <- imputedData
logD <- data$log2_value
rownames(logD) <- data$inf$ori.ID
group <- gsub("[0-9]+", "", colnames(logD))
Module <- Module_inf(net, data$inf)
up <- changedID(logD, group, vs.set2 = "ad", vs.set1 = "ctl",
                rank = "foldchange", anova = FALSE, Padj = "none",
                cutoff = 1, datatype = "log2", fctype = "up")
FCSenrich <- Module_Enrich(Module, up, coln = "ori.ID")
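As an illustration of the ORA idea only, over-representation of a protein list in each module is commonly tested with a hypergeometric tail probability. The base R sketch below uses phyper for that purpose; whether Module_Enrich uses exactly this test is an assumption, and the toy module table with its ID and moduleNum columns is hypothetical.

```r
# Toy module assignment for 20 proteins (hypothetical IDs and labels).
module <- data.frame(ID = paste0("P", 1:20),
                     moduleNum = rep(1:4, each = 5),
                     stringsAsFactors = FALSE)
# Hypothetical list of differential proteins.
classifiedID <- c("P1", "P2", "P3", "P4", "P11")

N <- nrow(module)                        # all proteins in the network
K <- sum(module$ID %in% classifiedID)    # differential proteins in the network

ora <- do.call(rbind, lapply(split(module$ID, module$moduleNum), function(ids) {
  n <- length(ids)                       # module size
  k <- sum(ids %in% classifiedID)        # overlap with the differential list
  # P(overlap >= k) when drawing n proteins without replacement.
  p <- phyper(k - 1, K, N - K, n, lower.tail = FALSE)
  data.frame(Counts = k, module.size = n, p = p)
}))
ora$p.adj <- p.adjust(ora$p, method = "BH")
ora
```

The Counts, module.size, p and p.adj columns follow the names of the documented return value; the remaining components are left out of the sketch.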
module and protein information match
Module_inf(net, inf, inftype = "Convert", IDname = NULL, ...)
net |
module network which is calculated in WGCNA package. |
inf |
proteome quantification data information which contains protein IDs. |
inftype |
data information type.
This must be (an abbreviation of) one of the strings " |
IDname |
IDname is " |
... |
other argument. |
Kefu Liu
data(net)
data(imputedData)
data <- imputedData
Module <- Module_inf(net, data$inf)
Extract the IDs shared between a dataset and one module.
moduleID(inf, module, num, coln = "new.ID")
inf |
dataset protein ID information. a vector of protein IDs. |
module |
module information which is obtained from |
num |
module number to extract and compare with the dataset ID information. |
coln |
column names of module protein IDs. |
The coln column of module for the module whose number is num, intersected with inf.
Kefu Liu
data(net)
data(imputedData)
data <- imputedData
logD <- data$log2_value
rownames(logD) <- data$inf$ori.ID
group <- gsub("[0-9]+", "", colnames(logD))
Module <- Module_inf(net, data$inf)
up <- changedID(logD, group, vs.set2 = "ad", vs.set1 = "ctl",
                rank = "foldchange", anova = FALSE, Padj = "none",
                cutoff = 1, datatype = "log2", fctype = "up")
intersection <- moduleID(up, Module, 5, coln = "ori.ID")
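The operation described for moduleID amounts to a filtered set intersection. A minimal base R sketch follows, assuming the module table has one row per protein and stores the module number in a column called moduleNum (a hypothetical name; the actual layout returned by Module_inf may differ).

```r
# Toy module table (hypothetical layout and IDs).
module <- data.frame(ori.ID = paste0("P", 1:10),
                     moduleNum = rep(c(4, 5), each = 5),
                     stringsAsFactors = FALSE)
inf <- c("P3", "P6", "P9", "P20")   # hypothetical dataset protein IDs

num  <- 5
coln <- "ori.ID"
# IDs of the chosen module, then the part shared with the dataset.
ids_in_module <- module[module$moduleNum == num, coln]
shared <- intersect(inf, ids_in_module)
shared
```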
Multiple-comparison t test that selects significant proteins in proteomic data.
multi.t.test(data, group, sig = 0.05, Adj.sig = TRUE, grpAdj = "bonferroni", geneAdj = "fdr", ...)
data |
protein quantification data. column is sample. row is protein ID. |
group |
sample group information |
sig |
significant P value threshold. The default is 0.05. |
Adj.sig |
a logical value indicating whether to adjust P-values for multiple protein comparisons between each pair of groups. |
grpAdj |
P-value adjustment for multiple group comparisons between each pair of groups. The default is |
geneAdj |
P-value adjustment for multiple protein comparisons in each group. The default is |
... |
Other arguments. |
Kefu Liu
data(imputedData)
data <- imputedData
logD <- data$log2_value
rownames(logD) <- data$inf$ori.ID
group <- gsub("[0-9]+", "", colnames(logD))
Tsig_P <- multi.t.test(logD[1:100, ], group, Adj.sig = FALSE, geneAdj = "fdr")
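The two levels of adjustment named by grpAdj and geneAdj can be sketched in base R with t.test and p.adjust. This is an illustration under assumptions, not the code of multi.t.test: the toy data, the group names and in particular the order in which the two adjustments are applied are choices made for the example.

```r
set.seed(42)
# Toy log2 data: 50 proteins x 9 samples in three hypothetical groups.
group <- rep(c("ctl", "ad", "pd"), each = 3)
logD <- matrix(rnorm(50 * 9), nrow = 50,
               dimnames = list(paste0("prot", 1:50), paste0(group, 1:3)))
logD[1:5, group == "ad"] <- logD[1:5, group == "ad"] + 6  # artificial signal

# One Welch t test per protein for every pair of groups.
pairs <- combn(unique(group), 2, simplify = FALSE)
pmat <- sapply(pairs, function(g) {
  apply(logD, 1, function(x) t.test(x[group == g[1]], x[group == g[2]])$p.value)
})
colnames(pmat) <- sapply(pairs, paste, collapse = "-")

# Assumed order: adjust across group pairs within each protein first,
# then across proteins within each pair.
p_grp  <- t(apply(pmat, 1, p.adjust, method = "bonferroni"))
p_gene <- apply(p_grp, 2, p.adjust, method = "fdr")
colnames(p_gene) <- colnames(pmat)

sig <- p_gene < 0.05
colSums(sig)  # number of significant proteins per comparison
```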
Extract UniProt ID, ENTRYNAME and status information (only applicable to 'MaxQuant' data).
P.G.extract(inf, ncol = 4, justID = FALSE, status1 = FALSE, ENTRY1 = FALSE, verbose = 0)
inf |
protein groups IDs information. |
ncol |
column numbers of output result. |
justID |
a logical value indicating whether to extract only the UniProt ID. |
status1 |
a logical value indicating whether to extract the status of the first ID. |
ENTRY1 |
a logical value indicating whether to extract the ENTRY NAME of the first ID. |
verbose |
integer level of verbosity. Zero means silent; 1 prints diagnostic messages. |
Kefu Liu
data(ProteomicData)
MaxQdata <- MaxQprotein(ProteomicData$MaxQ)
inf <- P.G.extract(MaxQdata$protein_IDs, justID = TRUE,
                   status1 = TRUE, ENTRY1 = TRUE)
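To make the three pieces of information concrete, the sketch below parses protein-group strings written in the UniProt FASTA-header style (status|accession|entry name, with group members separated by semicolons). The input strings and the helper parse_group are hypothetical, and P.G.extract may handle more formats than this.

```r
# Hypothetical protein-group strings in UniProt FASTA-header style.
inf <- c("sp|P05067|A4_HUMAN;tr|A0A0A0MRG2|A0A0A0MRG2_HUMAN",
         "sp|P10636|TAU_HUMAN")

# Hypothetical helper: one row per member of a protein group.
parse_group <- function(x) {
  entries <- strsplit(x, ";", fixed = TRUE)[[1]]
  fields  <- do.call(rbind, strsplit(entries, "|", fixed = TRUE))
  data.frame(status = fields[, 1],   # "sp" (reviewed) or "tr" (unreviewed)
             ID     = fields[, 2],   # UniProt accession
             ENTRY  = fields[, 3],   # entry name
             stringsAsFactors = FALSE)
}

parsed <- lapply(inf, parse_group)
# First accession of every protein group.
first_ID <- vapply(parsed, function(d) d$ID[1], character(1))
first_ID
```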
rename the duplicated newID in moduleinf and renew the ID in DEPstat
rename_dupnewID(DEPstat, moduleinf, DEPfromMod = FALSE)
DEPstat |
a data frame containing the columns "new.ID" and "ori.ID". It can be obtained from |
moduleinf |
a data frame containing the columns "new.ID" and "ori.ID". It can be obtained from |
DEPfromMod |
a logical value indicating whether DEPstat and moduleinf are obtained from the same dataset. The default value is FALSE. |
a data.frame containing the DEPstat information with a renewed new.ID column.
Kefu Liu
data(net)
data(imputedData)
Module <- Module_inf(net, imputedData$inf)
group <- gsub("[0-9]+", "", colnames(imputedData$intensity))
data <- imputedData
data$inf <- data$inf[1:100, ]
data$intensity <- data$intensity[1:100, ]
stat <- dataStatInf(data, group, meanmethod = "median",
                    T.test = "pairwise", Aadj = "fdr",
                    Tadj = "fdr", cutoff = FALSE)
stat <- rename_dupnewID(stat, Module, DEPfromMod = TRUE)
FCS enrichment analysis of a set of proteins in one module.
single_mod_enrichplot(module, Mod_Nam, classifiedID, coln = "new.ID", datainf = NULL, plot = TRUE, filename = NULL, ...)
module |
module information which is obtained from |
Mod_Nam |
the name of the module to be analyzed. |
classifiedID |
a set of protein IDs ordered by change value, p value and so on. |
coln |
column name of module that contains the protein IDs. It could be matched with " |
datainf |
proteomic data protein ID information.
The default value is " |
plot |
a logical value indicating whether to draw the enrichment variation trend plot. |
filename |
the filename of the plot. The default value is |
... |
Other arguments. |
Kefu Liu
data(net)
data(imputedData)
data <- imputedData
logD <- data$log2_value
rownames(logD) <- data$inf$ori.ID
group <- gsub("[0-9]+", "", colnames(logD))
Module <- Module_inf(net, data$inf)
up <- changedID(logD, group, vs.set2 = "ad", vs.set1 = "ctl",
                rank = "foldchange", anova = FALSE, Padj = "none",
                cutoff = 1, datatype = "log2", fctype = "up")
m5enrich <- single_mod_enrichplot(Module, 5, up, coln = "ori.ID")
pick soft thresholding powers for WGCNA analysis and plot
SoftThresholdScaleGraph(data, xlab = "Soft Threshold (power)", ylab = "Scale Free Topology Model Fit, signed R^2", main = "Scale independence", filename = NULL)
data |
protein quantification data. Rows are samples; columns are protein IDs. |
xlab |
x axis label |
ylab |
y axis label |
main |
plot title |
filename |
the filename of the plot. The default value is |
Pick soft-thresholding powers for WGCNA analysis and plot them. The function can also be replaced by the "pickSoftThreshold" function in the WGCNA package.
A list with the following components:
powerEstimate |
the lowest power fit for scale free topology. |
fitIndices |
a data frame containing the fit indices for scale free topology. |
Kefu Liu
pickSoftThreshold in the WGCNA package.
# this will take some time
data(imputedData)
data <- imputedData
logD <- data$log2_value
rownames(logD) <- data$inf$ori.ID
if (requireNamespace("WGCNA", quietly = TRUE))
  sft <- SoftThresholdScaleGraph(t(logD))
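The quantity plotted on the y axis, a signed R^2 of the scale-free topology fit, can be approximated in base R as shown below. The sketch follows the general idea behind pickSoftThreshold (regress log10 of the connectivity frequency on log10 of the connectivity) but is a simplified stand-in, not the WGCNA or DDPNA implementation; the random data, the binning and the helper scale_free_fit are assumptions.

```r
set.seed(7)
# Toy data: 20 samples (rows) x 60 proteins (columns); values are random.
datExpr <- matrix(rnorm(20 * 60), nrow = 20)

# Hypothetical helper: signed R^2 of a log-log fit of the connectivity distribution.
scale_free_fit <- function(datExpr, power, nBreaks = 10) {
  adj <- abs(cor(datExpr))^power        # unsigned adjacency
  diag(adj) <- 0
  k <- rowSums(adj)                     # connectivity of each protein
  h <- hist(k, breaks = nBreaks, plot = FALSE)
  ok <- h$counts > 0                    # skip empty bins before taking logs
  logk <- log10(h$mids[ok])
  logp <- log10(h$counts[ok] / sum(h$counts))
  fit <- lm(logp ~ logk)
  # Positive when the slope is negative, as expected for scale-free topology.
  unname(-sign(coef(fit)[2]) * summary(fit)$r.squared)
}

powers <- 1:10
fit <- data.frame(power = powers,
                  signedR2 = sapply(powers, function(p) scale_free_fit(datExpr, p)))
fit
```

With random data no power is expected to reach a good fit; on real data one would look for the lowest power at which the signed R^2 levels off.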
Optimization of the major parameters of the function blockwiseModules in the WGCNA package. The function runs a series of network constructions, varying the parameters of blockwiseModules, and records the results. (It will take a long time.)
wgcnatest(data, power = NULL, TOMType = "unsigned", detectCutHeight = NULL, maxBlockSize = 5000, deepSplit = TRUE, minModSize = TRUE, pamRespectsDendro = FALSE, minKMEtoStay = TRUE, minCoreKME = FALSE, reassignThreshold = FALSE, mergeCutHeight = FALSE, maxModNum = 30, minModNum = 8, MaxMod0ratio = 0.3)
data |
protein quantification data used in network construction. Rows are samples; columns are protein IDs.
More information can be found in |
power |
Soft-thresholding power for network construction. The default value is NULL. It will run |
TOMType |
one of "none", "unsigned", "signed".
More information can be found in |
detectCutHeight |
dendrogram cut height for module detection.
The default value is NULL, which means the cut height is calculated from the correlation r at which the p value is 0.05. When the value is larger than 0.995, it will be set to detectCutHeight or 0.995.
More information can be found in |
maxBlockSize |
integer giving the maximum block size for module detection.
More information can be found in |
deepSplit |
Integer value between 0 and 4. The default value is TRUE, which means the function will test deepSplit from 0 to 4. If the value is FALSE, deepSplit is 2. You can also set an integer value between 0 and 4 yourself.
More information can be found in |
minModSize |
minimum module size for module detection.
The default value is TRUE, which means the function will test 15, 20, 30 and 50. If the value is FALSE, minModSize is 20. You can also set an integer value yourself.
More information can be found in |
pamRespectsDendro |
a logical value indicating whether to do pamStage or not.
More information can be found in |
minKMEtoStay |
Value between 0 and 1. The default value is TRUE, which means the function will test 0.1, 0.2 and 0.3. If the value is FALSE, minKMEtoStay is 0.3. You can also set a value yourself.
More information can be found in |
minCoreKME |
Value between 0 and 1. The default value is FALSE, which means minCoreKME is 0.5. If the value is TRUE, the function will test 0.4 and 0.5. You can also set a value yourself.
More information can be found in |
reassignThreshold |
p-value ratio threshold for reassigning genes between modules.
The default value is FALSE, which means reassignThreshold is 1e-6. If the value is TRUE, the function will test 0.01 and 0.05. You can also set a value yourself.
More information can be found in |
mergeCutHeight |
dendrogram cut height for module merging.
The default value is FALSE, which means mergeCutHeight is 0.15. If the value is TRUE, the function will test 0.15, 0.3 and 0.45. You can also set a value yourself.
More information can be found in |
maxModNum |
The maximum module number. If network construction produces more than maxModNum modules, the result will not be recorded. |
minModNum |
The minimum module number. If network construction produces fewer than minModNum modules, the result will not be recorded. |
MaxMod0ratio |
The maximum ratio of module 0 proteins to total proteins. If network construction places more than MaxMod0ratio of the proteins in module 0, the result will not be recorded. |
More information can be found in blockwiseModules in the WGCNA package.
a data.frame containing the protein number in each module and the parameter information.
Kefu Liu
data(imputedData)
wgcnadata <- t(imputedData$intensity)
sft <- SoftThresholdScaleGraph(wgcnadata)
# this will take a lot of time
if (requireNamespace("WGCNA", quietly = TRUE)) {
  require("WGCNA")
  WGCNAadjust <- wgcnatest(wgcnadata, power = sft$powerEstimate)
}
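The size of the search performed with the default arguments, and the filtering described for maxModNum, minModNum and MaxMod0ratio, can be pictured with a short base R sketch. The grid values are taken from the argument descriptions above; the helper keep_result is hypothetical, and it assumes that module 0 holds the unassigned proteins and is not counted as a module.

```r
# Values tested when deepSplit, minModSize and minKMEtoStay are left at TRUE.
grid <- expand.grid(deepSplit    = 0:4,
                    minModSize   = c(15, 20, 30, 50),
                    minKMEtoStay = c(0.1, 0.2, 0.3))
nrow(grid)  # 5 * 4 * 3 parameter combinations

# Hypothetical filter applied to the module labels of one run (0 = unassigned).
keep_result <- function(colors, maxModNum = 30, minModNum = 8,
                        MaxMod0ratio = 0.3) {
  nMod <- length(setdiff(unique(colors), 0))
  nMod >= minModNum && nMod <= maxModNum && mean(colors == 0) <= MaxMod0ratio
}

ok_run  <- c(rep(0, 10), rep(1:10, each = 9))  # 10 modules, 10% unassigned
bad_run <- c(rep(0, 50), rep(1:10, each = 5))  # 10 modules, 50% unassigned
keep_result(ok_run)
keep_result(bad_run)
```

Each grid row corresponds to one blockwiseModules run, which is why the function takes a long time on real data.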