Umetrics Suite Blog

Learn how the Omics skin in SIMCA can improve biomarker analysis and detection

February 8, 2018

In this blog post, we’ll take a closer look at a feature of the SIMCA data analytics software called the Omics skin. So what exactly is an “omics” skin?

An Omics skin is a customized view within SIMCA designed to help people who typically work in various biological fields such as proteomics, genomics, metabolomics or transcriptomics. The Omics skin graphical user interface (GUI) is specifically designed to help with the complex analysis of biological or gene data obtained through methods such as mass spectrometry.

The Omics skin (GUI) in SIMCA can improve analysis and detection of discriminating variables and putative biomarkers.

An Omics skin, or GUI view, may be useful to scientists, engineers and technicians in both industry and academia, including those working in pharmaceutical or agricultural businesses.

Easy for data analysis novices and pros alike

If you’re a biologist, an analytical chemist, or another type of researcher, the Omics skin may be just the solution you need to gain meaningful insights from your carefully collected data. If you’re used to working with high-tech microarray instruments and gathering large amounts of intricate data, you may have the sort of complex analytics needs that an Omics skin can meet more effectively.

The key benefit of using an Omics skin is that with only a minimum of training, and only a basic understanding of multivariate data analysis (MVDA), you can swiftly turn your data set into a list of discriminating variables that helps you further your research.

In addition to reliable data analytics, you’ll more easily be able to identify a short list of potential biomarkers or discriminating variables that separate the groups of samples in a way that is meaningful.

Starting with a wizard

One of the beneficial elements of the Omics skin is that it includes a wizard. The analysis wizard uses a workflow that can guide even an inexperienced user all the way from data import, through the key analysis steps, to a report of the most interesting findings. The wizard and skin were developed so that you do not necessarily need to know a lot about multivariate data analysis to use them successfully.

SIMCA actually offers three different skins: the default skin you normally see, the Omics skin and a spectroscopy skin. You start on the home tab, and the wizard guides you through importing the data and selecting the right type of data from a list of five predetermined types (such as spectrometry data or chromatographic data). It then suggests how the data should be prepared for analysis, for example whether scaling or centering is appropriate. You can, of course, overrule the suggestions made by the wizard.
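To make the scaling and centering step more concrete, here is a minimal sketch in Python of mean-centering each variable and scaling it to unit variance, a common preprocessing choice in multivariate data analysis. This is not SIMCA's implementation; the function name `autoscale` and the small data table are made up for illustration.

```python
from statistics import mean, stdev

def autoscale(rows):
    """Mean-center each column and scale it to unit variance."""
    columns = list(zip(*rows))
    means = [mean(col) for col in columns]
    stds = [stdev(col) for col in columns]
    scaled = []
    for row in rows:
        scaled.append([
            (value - m) / s if s > 0 else 0.0  # constant columns become 0
            for value, m, s in zip(row, means, stds)
        ])
    return scaled, means, stds

# Hypothetical data: 4 samples (rows) x 2 variables (columns)
# measured on very different numerical ranges.
data = [
    [1.0, 100.0],
    [2.0, 300.0],
    [3.0, 500.0],
    [4.0, 700.0],
]
scaled, means, stds = autoscale(data)
for row in scaled:
    print([round(value, 3) for value in row])
```

After this step both variables have the same average and spread, so a variable with large raw numbers does not dominate the model simply because of its units.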

Next you will specify the objectives for your data analysis. These will typically be either:

After importing the data and selecting your objectives, you will be able to analyze your data using common methods of multivariate data analysis. The first step is to do an analysis using PCA to gain an overview. In the example below, one deviator is diagnosed, highlighted in red.

The PCA model gives an overview of the data structure (groups, trends, outliers, etc.). In this case, a deviator is diagnosed because it is located outside (above) the model border indicated by the dashed red line in the bar chart.
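The idea of a sample lying outside the model border can be sketched in a few lines of Python. The example below fits a one-component PCA model by power iteration and measures how far each sample lies from that model. It is only an illustration: the data are invented, the function names are hypothetical, and the limit of twice the median distance is a stand-in assumption, not the critical limit that SIMCA calculates.

```python
import math
from statistics import mean, median

def center(rows):
    """Subtract each column's mean so the model describes variation around the average."""
    means = [mean(col) for col in zip(*rows)]
    return [[v - m for v, m in zip(row, means)] for row in rows]

def first_loading(rows, iterations=200):
    """Estimate the first principal component direction by power iteration."""
    n_vars = len(rows[0])
    p = [1.0 / math.sqrt(n_vars)] * n_vars  # unit-length starting guess
    for _ in range(iterations):
        # Scores: projection of every sample onto the current direction.
        t = [sum(x * w for x, w in zip(row, p)) for row in rows]
        # New direction: X^T t, rescaled to unit length.
        new_p = [sum(row[j] * t[i] for i, row in enumerate(rows))
                 for j in range(n_vars)]
        norm = math.sqrt(sum(v * v for v in new_p))
        p = [v / norm for v in new_p]
    return p

def residual_distances(rows, p):
    """Length of the part of each sample that the one-component model leaves unexplained."""
    distances = []
    for row in rows:
        t = sum(x * w for x, w in zip(row, p))
        residual = [x - t * w for x, w in zip(row, p)]
        distances.append(math.sqrt(sum(e * e for e in residual)))
    return distances

# Hypothetical data: five samples follow a common pattern, the last one does not.
data = [
    [1.0, 1.1, 0.9],
    [2.0, 2.1, 1.9],
    [3.0, 2.9, 3.1],
    [4.0, 4.1, 3.9],
    [5.0, 4.9, 5.1],
    [3.0, 5.0, 1.0],
]
centered = center(data)
p = first_loading(centered)
distances = residual_distances(centered, p)

limit = 2 * median(distances)  # assumed stand-in for a model border
flagged = [i for i, d in enumerate(distances) if d > limit]
print("samples outside the assumed border:", flagged)
```

The last sample does not follow the pattern shared by the others, so the model explains little of it and its distance to the model is much larger than the rest.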
 

Interpreting deviators in the data

Next you will interpret deviators, if any. If a sample deviates outside the model border, you can click on it to create a contribution plot. The plot is simple to interpret: you get one bar for every column in your data table, and the larger the magnitude of a bar, the more strongly the sample deviates in that variable.

A positive bar means that the deviating sample has a larger numerical value for that variable than the average sample. A negative bar means that it has a smaller numerical value than the group average.

The contribution plot, to the right, is used to interpret why a deviator is different. A positive bar for a variable means that the deviating sample has a larger numerical value than the average sample.
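As a rough illustration of how to read such bars, the sketch below computes a simplified contribution for each variable: the difference between the deviating sample and the average of a reference group, divided by that group's standard deviation. SIMCA's own contribution calculation may differ (this post does not give the formula), and the variable names and numbers are invented.

```python
from statistics import mean, stdev

def contributions(sample, reference_rows):
    """Scaled difference between one sample and the reference average, per variable."""
    result = []
    for value, column in zip(sample, zip(*reference_rows)):
        spread = stdev(column)
        result.append((value - mean(column)) / spread if spread > 0 else 0.0)
    return result

variables = ["var_a", "var_b", "var_c"]  # hypothetical column names
reference = [                            # assumed "normal" samples
    [1.0, 10.0, 5.0],
    [2.0, 12.0, 5.5],
    [3.0, 14.0, 4.5],
]
deviator = [2.0, 20.0, 2.0]

bars = contributions(deviator, reference)
for name, bar in zip(variables, bars):
    print(f"{name}: {bar:+.1f}")
```

Here `var_b` gets a positive bar because the deviator's value is above the reference average, `var_c` gets a negative bar because it is below, and `var_a` stays at zero because the deviator matches the average for that variable.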
 

After diagnosing and interpreting any deviators (aka ‘outliers’), you can also exclude those observations or rows in your data table using the exclude tool. Then you can update your model with more representative and relevant data before moving on.

The second step in the wizard is to do a data analysis using OPLS-DA, where the goal is to separate the two groups of samples. The initial view of the OPLS-DA model is similar to that of the PCA model, meaning you have a score plot and a distance to model plot side by side.

The OPLS-DA model is used to separate (‘discriminate’) the two groups of samples in the dataset. In the scatter plot, left above, one group of samples is given a green color and the other group a blue color. Since the green samples lie to the left and the blue samples to the right, the inference is that the two groups of samples are separable.
 

Drill down further

Using the Omics skin it’s possible to investigate every single variable by clicking. You can click on a variable to judge its discriminating power for separating the two groups of samples (the green cluster is one group and the blue cluster another group). A very strong discriminating variable will have no overlap between the two groups. 
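The "no overlap" criterion is easy to express in code. The sketch below checks, for each variable, whether the range of values in one group overlaps the range in the other group. It only illustrates the idea and is not how SIMCA ranks variables; the group names, variable names and values are invented.

```python
def ranges_overlap(group_a, group_b):
    """True if the value ranges of the two groups share at least one point."""
    return min(group_a) <= max(group_b) and min(group_b) <= max(group_a)

# Hypothetical measurements per variable for the two groups of samples.
green = {
    "metabolite_1": [1.0, 1.2, 0.9],
    "metabolite_2": [3.0, 3.5, 4.1],
}
blue = {
    "metabolite_1": [2.5, 2.8, 3.1],
    "metabolite_2": [3.8, 4.4, 3.2],
}

strong = [name for name in green
          if not ranges_overlap(green[name], blue[name])]
print("variables with no overlap between groups:", strong)
```

In this made-up example only `metabolite_1` separates the groups completely, so it would be the stronger candidate for a discriminating variable.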

The Omics skin allows you to investigate differences between variables by drilling down into details.

 

The custom-built wizard and Omics skin let you focus on the data that is useful in evaluating biological samples, such as MS, NMR, identified metabolites and chromatographic data, although any data type can be analyzed. If you are working in an omics field, the SIMCA Omics skin could be the solution you need to get the right information from your data.

View the demo

Want to know more? View the online video demo of the Omics skin now.

 

Topics: Multivariate Data Analysis, SIMCA, Omics Data Analytics

Lennart Eriksson

Written by Lennart Eriksson

Sr Lecturer and Principal Data Scientist at Sartorius Stedim Data Analytics