Warning in as.POSIXlt.POSIXct(Sys.time()): unknown timezone 'zone/tz/2018c.
1.0/zoneinfo/America/Chicago'

Last updated: 2018-04-22

Code version: b535190


Summary

We preprocessed the data to remove technical biases due to PCR amplification and sequencing run. The analysis proceeds in two sequential steps:

  1. Normalization: apply Cumulative Sum Scaling as described in Paulson et al, 2015.

  2. Batch correction: fit a linear model to each OTU at the speciees level, then estimate and remove batch effect due to sequencin run for each OTU.

After aggregating reads at the species level, we are left with 553 features in 197 samples.

\(~\)


Package

library(knitr)
library(kableExtra)
library(dplyr)
library(metagenomeSeq)
library(CountClust)

Loading data

Read in filtered data.

MRobj = readRDS("../data/nasal_filtered.rds")
MRobj
MRexperiment (storageMode: environment)
assayData: 4423 features, 197 samples 
  element names: counts 
protocolData: none
phenoData
  sampleNames: EM0042 EM0047 ... E0168 (197 total)
  varLabels: StudyID age ... infnone (94 total)
  varMetadata: labelDescription
featureData
  featureNames: denovo6 denovo17 ... denovo42992 (4423 total)
  fvarLabels: Kingdom Phylum ... Species (7 total)
  fvarMetadata: labelDescription
experimentData: use 'experimentData(object)'
Annotation:  

\(~\)


Normalization

We calculate the normalization scaling factors using cumulative sum scaling, see Paulson et al for more information.

MRobj = cumNorm(MRobj)

\(~\)


Batch correction

Batch correction for sequencing run effect is performed for each taxonomy level. Here, we focus on species level analysis and perform batch correcdtion on the raw counts at teh species level. Details for batch correction are described in the manuscript.

obj <- aggTax(MRobj,lvl="Species")
obj@expSummary$expSummary$normFactors <- normFactors(MRobj)

counts <- t(MRcounts(obj,norm=FALSE,log=FALSE))
batchcorrect_counts <- BatchCorrectedCounts(counts,
                                         batch_lab = pData(MRobj)$seqrun,
                                         use_parallel=FALSE)
counts = t(batchcorrect_counts)

assayData(obj)[["counts"]] = counts

obj

Save normalized, batch corrected counts.

saveRDS(obj, file = "../data/nasal_filtered_normed_batchcorrected.rds")

\(~\)


Session Info

R version 3.4.1 (2017-06-30)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS High Sierra 10.13

Matrix products: default
BLAS: /Library/Frameworks/R.framework/Versions/3.4/Resources/lib/libRblas.0.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.4/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] parallel  stats     graphics  grDevices utils     datasets  methods  
[8] base     

other attached packages:
 [1] CountClust_1.4.1     ggplot2_2.2.1        metagenomeSeq_1.20.1
 [4] RColorBrewer_1.1-2   glmnet_2.0-13        foreach_1.4.4       
 [7] Matrix_1.2-12        limma_3.34.9         Biobase_2.38.0      
[10] BiocGenerics_0.24.0  dplyr_0.7.4          kableExtra_0.7.0    
[13] knitr_1.20          

loaded via a namespace (and not attached):
 [1] maptpx_1.9-3       Rcpp_0.12.15       ape_5.0           
 [4] lattice_0.20-35    gtools_3.5.0       assertthat_0.2.0  
 [7] rprojroot_1.3-2    digest_0.6.15      slam_0.1-42       
[10] R6_2.2.2           plyr_1.8.4         backports_1.1.2   
[13] stats4_3.4.1       evaluate_0.10.1    httr_1.3.1        
[16] pillar_1.2.1       gplots_3.0.1       rlang_0.2.0       
[19] lazyeval_0.2.1     gdata_2.18.0       vegan_2.4-6       
[22] rmarkdown_1.9      readr_1.1.1        stringr_1.3.0     
[25] munsell_0.4.3      compiler_3.4.1     pkgconfig_2.0.1   
[28] SQUAREM_2017.10-1  mgcv_1.8-23        htmltools_0.3.6   
[31] nnet_7.3-12        tibble_1.4.2       codetools_0.2-15  
[34] matrixStats_0.53.1 permute_0.9-4      viridisLite_0.3.0 
[37] MASS_7.3-49        bitops_1.0-6       grid_3.4.1        
[40] nlme_3.1-131.1     gtable_0.2.0       git2r_0.21.0      
[43] magrittr_1.5       scales_0.5.0       KernSmooth_2.23-15
[46] stringi_1.1.6      reshape2_1.4.3     flexmix_2.3-14    
[49] bindrcpp_0.2       xml2_1.2.0         boot_1.3-20       
[52] cowplot_0.9.2      iterators_1.0.9    tools_3.4.1       
[55] picante_1.6-2      glue_1.2.0         hms_0.4.1         
[58] yaml_2.1.18        colorspace_1.3-2   cluster_2.0.6     
[61] caTools_1.17.1     rvest_0.3.2        bindr_0.1         
[64] modeltools_0.2-21 

This R Markdown site was created with workflowr