Biomedical Data Science Core (BDSC)

 

The Biomedical Data Science Core (BDSC) is a core of the Center for Molecular Studies in Digestive and Liver Diseases, directed by Hongzhe Li, PhD, with James Lewis, MD, MSCE, as co-director. The Core offers fee-for-service study design consultation and study implementation for Center investigators. Areas of assistance include data analysis, logistical planning, sample collection methods and tools, case report form templates, recruitment methods, and budget development.

 

The Core provides the following services:

A) Basic Analyses (1 free meeting with Dr. Hongzhe Li; Cost: $150 per hour for staff analysis supervised by Dr. Li): The Core provides the following basic biostatistics and bioinformatics analyses for omics data, performed by MS-level statisticians/bioinformaticians under the supervision of the Core PI.

 - Basic statistical tests, regression analysis, etc.
 - Data processing, quality control, and batch-effect adjustment for high-throughput genomics,
   metagenomics, epigenomics, and metabolomics data.
 - Exploratory analysis to identify clusters and patterns in the data sets using methods such as
   PCA, MDS, t-SNE, etc.
 - Differential expression and differential abundance analysis based on omics data, including
   differential gene expression analysis based on RNA-seq data and differential abundance
   analysis based on shotgun metagenomics data.
 - Machine learning methods, such as random forests, to build predictive models for various
   clinical outcomes using high-dimensional omics data.
 - Support for fully collaborative, grant-funded investigations. This includes preliminary data
   development, hypothesis formulation, grant narrative development, data analysis and
   biological inference, custom software development, and co-authored dissemination of findings.

B) Customized Advanced Analyses (Cost: Discuss with Dr. Hongzhe Li): The Core can also provide more advanced statistical modeling, which may involve developing new statistical methods for more complex problems or for new data types. The Core PI will work closely with Center investigators and MS statisticians to develop and evaluate these methods. Some examples include:

 - Integrative analysis and causal inference of multiple omics data sets in order to gain 
   mechanistic insights into diseases and biological processes.
 - Integrative network and pathways analysis for omics data.
 - Analysis of single-cell genomics data, including scRNA-seq data, using state-of-the-art
   methods.
 - Careful evaluation of new data types and development of computational and statistical
   tools in response to new data types and technologies, including methods for analyzing
   data from Hyperion 2-D mass cytometry and nanospray desorption ionization mass
   spectrometry (DESI-MS).

 


 

About Us

We are interested in statistical inference methods for big data in health science research.

Publications

We publish in both top statistical journals, such as JASA, JRSS-B, Biometrika, and Annals of Applied Statistics, and top subject-area journals, such as Science, Nature, and Nature Genetics.

  

Contact Us

Assistant: Janine M. Pritchard

Tel:   (215) 573-4045

Email: jpritcha@pennmedicine.upenn.edu