Umetrics Suite Blog

Using data analytics to optimize design space and setpoint conditions for bioreactors

May 18, 2018

At the heart of any process used to manufacture biological products is a bioreactor setup that supports a stable and reproducible biologically active environment. The bioreactor provides a controlled environment to achieve optimal growth for the particular cell cultures being used.


Explore how a factorial design using data analytics can be used to develop the optimal bioreactor conditions.

Depending on the application, different data analytics and design of experiments (DOE) tools can be used to achieve stable and reproducible bioprocess conditions. In early-stage development, advanced data analytics and DOE are used to compare batches, for example to work out how to modify cell expansion steps so that they lead to higher cell densities and product titers. In late-stage development, they are used when scaling up manufacturing processes to verify comparable performance at different scales. And in full production, advanced data analytics is used for real-time bioprocess monitoring and early detection of batches that deviate from good, normal operating conditions.

A primary objective in bioprocess development is to understand which critical process parameters (CPPs; ‘factors’ below) affect the critical quality attributes (CQAs; ‘responses’ below). In this respect, design of experiments is an indispensable tool. To explain how this works, we can review the results of a feasibility study in which DOE is used to explore the possibilities of defining a design space for a particular bioprocess under development.

DOE for mixture formulations

As you will see, a systematic factorial design of experiments approach can produce optimal results in bioreactor process development. In a previous blog post, we discussed how a cell culture formulation was optimized through data analytics of systematic changes in four base media mixtures.

Based on the results, three mixture formulations were identified:

  • One formulation was optimized for titer production
  • One formulation was optimized for doubling time
  • One formulation was optimized for titer production, doubling time and VCD weighted equally

The relevance of the three chosen formulations was further scrutinized in a second-phase design of experiments exploring how changes in temperature, pH and media composition affect critical cell culture responses. Let’s take a look at how the second DOE was conducted using Sartorius-Stedim Biotech ambr®15 bioreactors and CHOptimizer®: Design Space Estimation.

Sample exercise: Creating a factorial design

This exercise investigates how to use factorial design to encode systematic alterations in reactor environments and use the data to define the appropriate design space and robust set-point conditions.

We have created an exercise to take you through the process. In it, you will:
  • Learn how to generate a factorial design using MODDE and its Design Wizard
  • Analyze DOE data using the Analysis Wizard of the MODDE software
  • Understand how changes in factors correlate with the responses
  • Use response contour plots for model visualization and interpretation
  • Use SweetSpot and Design space plots to propose suitable operating conditions where specifications on the responses are fulfilled.

Defining the factors

The factorial design exercise explores the influence of three factors on the design space. The first is a qualitative factor with three settings, representing the three optimized media defined in the first exercise. The other two factors are quantitative: Temperature and pH.



Three responses relating to titer, viable cell count (VCC) and percentage viability were measured. All three should be maximized and the relevant settings for minimum and target values are given below.



As you complete the exercise, you’ll be taken through a series of tasks to help you better understand the steps in each process.

The first task is to create a MODDE project and reproduce the experimental design that was used. The basic design is a Full Factorial Mixed design in 12 runs (3 × 2 × 2: three media, two temperature levels, two pH levels) without center points.
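To make the run count concrete, here is a minimal Python sketch of how a 3 × 2 × 2 full factorial can be enumerated. It is not MODDE's Design Wizard, only an illustration. The labels "Mix 2" and "Mix 3" are assumed names (only Mix 1 is named in this post), and the quantitative factors are given in coded units because the actual low and high settings are not reproduced here.

```python
from itertools import product

# Qualitative factor: the three optimized media. Only "Mix 1" is named in the
# post; "Mix 2" and "Mix 3" are assumed labels.
media = ["Mix 1", "Mix 2", "Mix 3"]

# Quantitative factors in coded units (-1 = low level, +1 = high level).
temperature = [-1, +1]
ph = [-1, +1]


def full_factorial(*factor_levels):
    """Return every combination of the supplied factor levels."""
    return list(product(*factor_levels))


design = full_factorial(media, temperature, ph)

for run_no, (mix, temp, ph_level) in enumerate(design, start=1):
    print(f"Run {run_no:2d}: {mix}, temperature {temp:+d}, pH {ph_level:+d}")
```

In practice the run order of such a design would normally be randomized before the experiments are carried out.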

In the second task, you will assess, for each response, the quality of the replicate experiments, whether the response is approximately normally distributed, the quality of the model, and so on. For example:

  • Are there any deviating experiments (outliers)?
  • Which factors have the highest / lowest influence?
  • Are the investigated factors influencing the three responses in the same way?
  • Which factor setting is favorable for maximizing titer? Maximizing VCC? Maximizing viability?

Following this, you’ll use contour plots to visualize how changes in the factors correlate with changes in titer, VCC and viability.

Then, you’ll use SweetSpot and Design space plots to investigate if there exists a region in factor space in which all goals for the responses are met.
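The idea behind a sweet spot search can be sketched in a few lines of Python: scan a grid over the factor space and count how many response specifications are met at each point. This is only an illustration under stated assumptions, not MODDE's SweetSpot plot. The predictor `toy_model` and all of its coefficients are hypothetical and use coded factor units; only the lower specification limits are taken from the figures quoted later in this post.

```python
# Lower specification limits quoted later in this post.
SPECS = {"titer": 350.0, "vcc": 4.0, "viability": 70.0}


def toy_model(temp, ph):
    """HYPOTHETICAL predictor in coded units (-1 = low, +1 = high).

    The coefficients are invented for illustration and are not the fitted
    models from the study.
    """
    return {
        "titer": 340.0 - 40.0 * temp + 10.0 * ph,
        "vcc": 4.0 - 1.0 * temp,
        "viability": 72.0 - 6.0 * temp,
    }


def specs_met(predicted, specs):
    """Count how many responses exceed their lower specification limit."""
    return sum(predicted[name] > limit for name, limit in specs.items())


def sweet_spot_map(predict, specs, steps=5):
    """Evaluate the number of fulfilled specs on a steps x steps grid."""
    levels = [-1 + 2 * i / (steps - 1) for i in range(steps)]
    return [[specs_met(predict(t, p), specs) for t in levels] for p in levels]


grid = sweet_spot_map(toy_model, SPECS)
for row in reversed(grid):  # print the highest pH level on the top row
    print(" ".join(str(count) for count in row))
```

Grid cells where the count equals the number of responses form the candidate sweet spot. A map like this says nothing about how likely those predictions are to hold, which is why the more rigorous design space calculation follows.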

Finally, you’ll use the Optimizer to make more rigorous design space calculations and to search for a robust setpoint.

After this, you’ll see a series of solutions for each step.

Example plots

The plots of the Analysis Wizard arising from calculating the initial interaction model for the titer response are seen below.



The replicate plot shows fairly small variability among the replicates. The histogram indicates that the response does not need a transformation and can be analyzed in its untransformed metric. According to the coefficient plot, temperature is the factor with the strongest impact on titer. Running at low temperature with Mix 1 yields the highest titer. The summary of fit plot, the normal probability plot of residuals and the observed versus predicted plot all point to a very good model for the titer response.

In order to optimize the model we used the One-click analysis functionality. The results of the optimized model are shown below. The final model consists of the three main effects and the interaction term between Temperature and pH. It is a very good model with high performance statistics.
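As a rough illustration of what a model with three main effects plus the Temperature × pH interaction looks like, the Python sketch below builds the corresponding model matrix for the 12 design runs. It assumes simple dummy coding for the media factor with Mix 1 as the reference level and coded units for the quantitative factors. MODDE's internal coding of qualitative factors may differ, and the labels "Mix 2" and "Mix 3" are assumed.

```python
from itertools import product

import numpy as np

media = ["Mix 1", "Mix 2", "Mix 3"]  # "Mix 2" and "Mix 3" are assumed labels
design = list(product(media, [-1, 1], [-1, 1]))  # coded temperature and pH


def model_row(mix, temp, ph):
    """One row of the model matrix: main effects plus Temperature*pH."""
    return [
        1.0,                             # constant term
        1.0 if mix == "Mix 2" else 0.0,  # media dummy (Mix 1 is the reference)
        1.0 if mix == "Mix 3" else 0.0,  # media dummy
        float(temp),                     # temperature main effect
        float(ph),                       # pH main effect
        float(temp * ph),                # Temperature*pH interaction
    ]


X = np.array([model_row(*run) for run in design])

n_runs, n_terms = X.shape
print("runs:", n_runs)
print("model terms:", n_terms)
print("rank of model matrix:", np.linalg.matrix_rank(X))
print("residual degrees of freedom:", n_runs - n_terms)
```

With this coding, the reduced model has fewer terms than there are design runs, so the design runs alone leave degrees of freedom for estimating residual variation.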



Factorial design for robust bioreactor performance

By completing this exercise, you’ll learn how factorial design can be used to study the influence of temperature, pH and cell culture media on important responses such as titer, cell count and viability. The conclusion from the current set of experiments is that temperature is by far the most influential factor. Running at low temperature is beneficial for all three responses.

Overall, very good regression models were obtained, indicating that high-quality data underpin this DOE exercise. The high performance statistics of the three models imply high model reliability and strong connections between the three factors and the three responses.

Because the models were strong, the Optimizer was used to search for a robust setpoint, and such a point did exist. Using the visual capabilities of the Design Space Explorer tool, it was concluded that the settings:

  • Mix 1
  • Temperature 32.3 ± 0.3 °C
  • pH 6.81 ± 0.01

correspond to cell culture conditions within which there is less than a 1% risk of failing to comply with the response specifications (i.e., titer > 350 mg/l, VCC > 4 and viability > 70%).

In this scenario, pH needs to be tightly controlled. If the proposed range is too narrow for a real situation, specifications on the responses have to be relaxed or a higher risk level (e.g. 5% or 10% risk) must be accepted.
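The trade-off between tolerance width and risk can be illustrated with a small Monte Carlo sketch in Python. It samples factor settings uniformly within the setpoint tolerances and counts how often a predicted response misses its specification. This is not the calculation performed by the Optimizer or the Design Space Explorer: the predictor `toy_model` and its coefficients are hypothetical, the uniform sampling is an assumption, and the sketch ignores model uncertainty, which a full design space calculation would need to include. Only the setpoint, tolerances and specification limits are taken from the text above.

```python
import random

# Lower specification limits and setpoint quoted in the text above.
SPECS = {"titer": 350.0, "vcc": 4.0, "viability": 70.0}
SETPOINT = {"temp": 32.3, "ph": 6.81}
TOLERANCE = {"temp": 0.3, "ph": 0.01}


def toy_model(temp, ph):
    """HYPOTHETICAL predictor; coefficients are invented for illustration."""
    d_temp = temp - SETPOINT["temp"]
    d_ph = ph - SETPOINT["ph"]
    return {
        "titer": 380.0 - 40.0 * d_temp - 1500.0 * d_ph,
        "vcc": 5.0 - 0.5 * d_temp,
        "viability": 80.0 - 5.0 * d_temp,
    }


def failure_risk(predict, setpoint, tolerance, specs, n_samples=10_000, seed=0):
    """Estimate the fraction of sampled settings that miss any specification."""
    rng = random.Random(seed)
    failures = 0
    for _ in range(n_samples):
        # Assumption: settings vary uniformly within setpoint +/- tolerance.
        temp = rng.uniform(setpoint["temp"] - tolerance["temp"],
                           setpoint["temp"] + tolerance["temp"])
        ph = rng.uniform(setpoint["ph"] - tolerance["ph"],
                         setpoint["ph"] + tolerance["ph"])
        predicted = predict(temp, ph)
        if any(predicted[name] <= limit for name, limit in specs.items()):
            failures += 1
    return failures / n_samples


tight = failure_risk(toy_model, SETPOINT, TOLERANCE, SPECS)
wide = failure_risk(toy_model, SETPOINT, {"temp": 0.3, "ph": 0.05}, SPECS)
print(f"estimated risk with pH tolerance 0.01: {tight:.3f}")
print(f"estimated risk with pH tolerance 0.05: {wide:.3f}")
```

In this toy setting, widening the pH tolerance raises the estimated risk, which mirrors the choice described above: either keep the factor tightly controlled, relax the specifications, or accept a higher risk level.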

Download the full exercise

Want to try it yourself? Download the step-by-step instructions for this exercise.

CHOptimizer: Design Space Estimation

Read about the first step of this DOE exercise in this blog: How to optimize cell culture media to speed biopharma development or download the published poster.


Get the software

To run the full exercises, get a free trial of MODDE now.





Topics: Data Analytics, Design of Experiments (DOE), MODDE

Written by Lennart Eriksson

Sr Lecturer and Principal Data Scientist at Sartorius Stedim Data Analytics
