Common Bioconductor Methods and Classes
We strongly recommend reusing existing methods for importing data and established classes for representing it. Below are suggested import functions for common file types, followed by commonly used Bioconductor classes. For more classes and functionality, also try searching BiocViews for your data type.
Importing
- GTF, GFF, BED, BigWig, etc. – rtracklayer::import()
- VCF – VariantAnnotation::readVcf()
- SAM / BAM – Rsamtools::scanBam(), GenomicAlignments::readGAlignment*()
- FASTA – Biostrings::readDNAStringSet()
- FASTQ – ShortRead::readFastq()
- MS data (XML-based and mgf formats) – Spectra::Spectra(), Spectra::Spectra(source = MsBackendMgf::MsBackendMgf())
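
As a rough illustration of two of the importers listed above, the sketch below writes a tiny BED file and a tiny FASTA file to temporary paths and reads them back. The file contents, feature names, and sequence names are made up for the example, and it assumes rtracklayer and Biostrings are installed.

```r
library(rtracklayer)
library(Biostrings)

# Hypothetical two-feature BED file (BED uses 0-based, half-open coordinates).
bed_path <- tempfile(fileext = ".bed")
writeLines(c("chr1\t0\t100\tfeatA", "chr1\t200\t350\tfeatB"), bed_path)

# import() chooses the parser from the file extension and returns a GRanges,
# converting the BED starts to 1-based, closed intervals.
bed <- import(bed_path)
start(bed)
end(bed)

# Hypothetical two-record FASTA file.
fa_path <- tempfile(fileext = ".fa")
writeLines(c(">seq1", "ACGTACGT", ">seq2", "GGCC"), fa_path)

# readDNAStringSet() returns a DNAStringSet named after the FASTA headers.
dna <- readDNAStringSet(fa_path)
names(dna)
width(dna)
```

Because both functions return standard Bioconductor objects, a package that accepts a GRanges or a DNAStringSet does not need its own file parser.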
Common Classes
- Rectangular feature x sample data – SummarizedExperiment::SummarizedExperiment() (RNAseq count matrix, microarray, quantitative proteomics, …)
- Genomic coordinates – GenomicRanges::GRanges() (1-based, closed interval)
- Genomic coordinates from multiple samples – GenomicRanges::GRangesList()
- Ragged genomic coordinates – RaggedExperiment::RaggedExperiment()
- DNA / RNA / AA sequences – Biostrings::*StringSet()
- Gene sets – BiocSet::BiocSet(), GSEABase::GeneSet(), GSEABase::GeneSetCollection()
- Multi-omics data – MultiAssayExperiment::MultiAssayExperiment()
- Single cell data – SingleCellExperiment::SingleCellExperiment()
- Mass spec data – Spectra::Spectra()
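
To make two of the classes above concrete, here is a minimal sketch that builds a GRanges and wraps it, together with a count matrix, in a SummarizedExperiment. The coordinates, gene names, sample names, and counts are invented for illustration, and the example assumes GenomicRanges and SummarizedExperiment are installed.

```r
library(GenomicRanges)
library(SummarizedExperiment)

# Three hypothetical features; GRanges uses 1-based, closed intervals,
# so start = 101 and end = 200 describes a range of width 100.
features <- GRanges(
  seqnames = c("chr1", "chr1", "chr2"),
  ranges = IRanges(start = c(101, 501, 1001), end = c(200, 650, 1100)),
  strand = c("+", "-", "+")
)
names(features) <- c("geneA", "geneB", "geneC")

# Made-up count matrix: rows are features, columns are samples.
counts <- matrix(
  c(10L, 0L, 5L, 12L, 3L, 7L),
  nrow = 3,
  dimnames = list(names(features), c("sample1", "sample2"))
)

# Sample-level annotation, one row per column of the count matrix.
samples <- DataFrame(
  condition = c("control", "treated"),
  row.names = colnames(counts)
)

se <- SummarizedExperiment(
  assays = list(counts = counts),
  rowRanges = features,
  colData = samples
)

dim(se)                                   # features x samples
width(rowRanges(se))                      # range widths, end - start + 1
assay(se, "counts")["geneB", "sample2"]   # one count, looked up by name
```

Keeping the assay, the feature ranges, and the sample annotation in one object means that subsetting, for example se[, se$condition == "treated"], keeps all three in sync.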