Umetrics Suite Blog

How One Company Integrated Data to Implement Real-Time Bioprocess Monitoring

September 4, 2019

Most biopharma manufacturing companies are keen to adopt new methods that would streamline production, reduce errors and ensure product quality. That was the goal of Bristol-Myers Squibb when they implemented a complex real-time process monitoring system that involved integrating data from a number of different technologies, systems and vendors to gain greater control over complex batch processes.

The ultimate goal of real-time monitoring is to reduce the risk of batch loss by uncovering process deviations and detecting faults early.

To provide their operations team with a proactive, multivariate tool to monitor batch manufacturing processes in real time, Bristol-Myers Squibb had to overcome some challenges and develop clever workarounds for certain integration limitations between the various third-party applications used to collect and store data. In the end, they used data gathered at the site level, incorporating data tools and systems from vendors such as OSIsoft PI, Seeq, BIOVIA Discoverant, and Sartorius Stedim (SIMCA and SIMCA-online), among others, to create a successful end-to-end multivariate process monitoring system for their company.

Implementing multivendor real-time process monitoring

The ultimate goal of real-time monitoring for Bristol-Myers Squibb was to gain early fault detection that would reduce the risk of batch loss. Batches that deviate from approved process parameters risk contaminating downstream batches and causing exponentially expensive losses later in the process pipeline.

One of the challenges to deploying multivariate monitoring in biopharma manufacturing is analyzing the required batch context and continuous data in real-time. Typically, offline multivariate models use data sets that incorporate batch time context and time series data, which offers a historical perspective, not a current view.

Transitioning to online monitoring requires providing the same batch time context and time series data solely from real-time data sources such as PI. In the case of Bristol-Myers Squibb, this required the implementation team to develop innovative design choices to circumvent certain limitations of the vendor-supplied PI integration package and constraints imposed by their own network infrastructure.

Reducing batch loss and deviations

Bioprocess performance is affected by the compounding effects of multiple variables: pH, carbon dioxide, temperature, air, and other concentrations at various stages of the process all influence the final output. So it’s not just about measuring one or more individual variables, but about understanding and predicting the impact they all have on each other.

Bristol-Myers Squibb spent several years optimizing and automating their data infrastructure at every stage of the process before they were ready to implement real-time monitoring.

“We spent a number of years getting our DeltaV systems mature at our sites and getting our recipe models established for different processes,” said Matthew Morrow, Team Leader for Global Product Development and Supply IT at Bristol-Myers Squibb. “We also spent a lot of time in our OSI PI system aggregating the data from all our PI sites to roll them up to our enterprise PI system,” he said at an OSI World conference.

Morrow explained how the process monitoring implementation included integrating data from various production systems such as OSIsoft PI, SAP, IMPS NES and Seeq with BIOVIA Discoverant, which sits on top to link the data systems together, and then using SIMCA and SIMCA-online for process monitoring.

During their journey, he said Bristol-Myers Squibb uncovered two key hurdles in their data that they had to overcome to allow effective real-time monitoring:

  1. Data latency
  2. Lack of data context

This is because the archived data was a snapshot from a past point in time and, to eliminate noise in the signal, PI data compression doesn’t keep every single data point. It was necessary to pull data closer to the source and use multivariate data monitoring to get a clear picture.

What is multivariate data monitoring?

Morrow explains it well: “Although it is very complex math, what we are doing with multivariate data analytics (MVDA) is looking at a highly correlated set of process variables and trying to compress that information into a small number of uncorrelated (or synthetic) variables. In mathematical speak, these are your ‘principal components.’”

By looking at these principal components, which are really decompositions of all your variables (for example, with a bioreactor, these might be temperature, pressure, pH, etc.), you can better understand your process. So rather than having twelve highly correlated variables, you might be able to get this down to two or three relevant variables. This way, you get better meaning from your data by focusing on fewer signals for your process.
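To make the idea of compressing correlated variables more concrete, here is a minimal Python sketch. It is not how SIMCA computes its models; it simply finds the first principal component of a small synthetic data set using power iteration. The data, the variable names and the helper functions are all assumptions made up for illustration.

```python
import math
import random

def autoscale(rows):
    """Mean-center every column and scale it to unit variance."""
    n, m = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(m)]
    stds = [math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / (n - 1))
            for j in range(m)]
    return [[(r[j] - means[j]) / stds[j] for j in range(m)] for r in rows]

def covariance(rows):
    """Covariance matrix of data that is already centered."""
    n, m = len(rows), len(rows[0])
    return [[sum(r[i] * r[j] for r in rows) / (n - 1) for j in range(m)]
            for i in range(m)]

def first_component(cov, iterations=200):
    """Power iteration: loading vector and variance of the first component."""
    m = len(cov)
    v = [1.0 / math.sqrt(m)] * m
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(m)) for i in range(m)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    variance = sum(v[i] * cov[i][j] * v[j] for i in range(m) for j in range(m))
    return v, variance

# Hypothetical data: three signals that all follow one hidden process driver.
rng = random.Random(0)
raw = []
for _ in range(50):
    driver = rng.gauss(0.0, 1.0)
    raw.append([37.0 + 0.5 * driver + rng.gauss(0.0, 0.05),   # temperature
                1.0 + 0.2 * driver + rng.gauss(0.0, 0.02),    # pressure
                7.0 - 0.1 * driver + rng.gauss(0.0, 0.01)])   # pH

scaled = autoscale(raw)
cov = covariance(scaled)
loadings, variance = first_component(cov)
# After autoscaling, the total variance equals the number of variables.
explained = variance / len(cov)
# One score per observation: the three signals collapsed into a single value.
scores = [sum(x * w for x, w in zip(row, loadings)) for row in scaled]
print(f"first component explains {explained:.0%} of the variance")
```

Because the three synthetic signals move together, a single score per observation carries most of the information, which is the same effect Morrow describes for a real bioreactor.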


Why is multivariate monitoring useful?

Multivariate monitoring helps put the emphasis on context and causation. In the old world of univariate monitoring, you would be looking at one signal at a time. As Morrow explained, if operators are seeing several alarms on a DeltaV system, they won’t know which alarm they should really focus on to troubleshoot and correct the process. However, by looking at one or two principal components with multivariate monitoring, the operator can tell which of the key variables are causing a deviation in the process.

In addition, Morrow said: “SIMCA-online is very good at using PAST experience to define a multivariate batch tunnel to start to predict within a confidence interval of high and low limits where your NORMAL process operation should reside, so whenever there is an excursion, it can send an alarm to the operators, and the operators can then drill in and [see] specifically where in the process they should be focusing their energy.”
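The batch tunnel idea can be illustrated with a small Python sketch. The article does not describe how SIMCA-online calculates its limits, so this example assumes a simple rule of the historical mean plus or minus three standard deviations at each time point. The batch values and function names are hypothetical.

```python
from statistics import mean, stdev

def batch_tunnel(historical, k=3.0):
    """Per-time-point (low, high) limits built from good historical batches."""
    limits = []
    for values in zip(*historical):  # all batches' values at one time point
        mu, sigma = mean(values), stdev(values)
        limits.append((mu - k * sigma, mu + k * sigma))
    return limits

def excursions(new_batch, limits):
    """Indices of the time points where the new batch leaves the tunnel."""
    return [i for i, (x, (low, high)) in enumerate(zip(new_batch, limits))
            if not low <= x <= high]

# Hypothetical score trajectories of four good batches (one value per time point).
good_batches = [
    [0.0, 1.0, 2.0, 3.0],
    [0.2, 1.1, 2.1, 2.9],
    [-0.1, 0.9, 1.9, 3.1],
    [0.1, 1.0, 2.0, 3.0],
]
tunnel = batch_tunnel(good_batches)

# A running batch that drifts away at the third time point.
running_batch = [0.0, 1.0, 3.5, 3.0]
alarms = excursions(running_batch, tunnel)
print(alarms)  # [2]
```

In a live system the alarm would be raised as soon as the excursion appears rather than after the batch has finished; the list of indices here only shows where the tunnel was left.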

How are multivariate models developed and deployed?

Bristol-Myers Squibb started by retrieving historical data from Seeq, OSIsoft PI and Discoverant. Data from target batches was then imported into SIMCA for analysis, a batch evolution model was created and a multivariate “golden batch” model was identified. From there, the model was validated to establish the tolerance around normal operation, which allows the system to identify faults.

Next, a batch level model was created from the data. This model uses historical data to compare each new batch to previous (good) batches to see differences. Each model was validated with a representative and an unrepresentative batch. Once all models were ready, they were configured in SIMCA-online and connections were built between OSIsoft PI and SIMCA-online to manage data flow. In building these connections, Bristol-Myers Squibb had to overcome some challenges.

Contributing data sets for real-time MVDA monitoring

Let’s take a look at how the models created above are used for real-time MVDA monitoring. There are several charts that are used in SIMCA-online to make this work. The first is the score plot (below). It shows the control limits (orange and red dashed lines) and active batches (green lines). Any deviation on the score plot can indicate a potential process upset.




The gray area in the illustration shows a warning. From there you can drill down to see a contribution chart.


The contribution chart helps to determine which parameters may be contributing to an alarm, and shows how each variable contributes to the deviation.

Each bar in the contribution chart represents a variable. The direction of the bar indicates whether the effect is positive or negative compared to the historical mean, and its length shows how far the variable is from that mean.

As you can see highlighted in the gray area (above), this variable is quite high. If you double-click this bar in SIMCA-online, it will bring you to the univariate plot of that specific variable so you can see what’s going on with it.
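As a rough illustration of what such bars express, the Python sketch below computes, for each variable, the signed distance of the current value from its historical mean in units of standard deviation. This is a simplification and an assumption: contributions in a multivariate model are normally also weighted by the model itself, which is left out here. The variable names and values are hypothetical.

```python
from statistics import mean, stdev

def contributions(history, observation):
    """Signed distance of each variable from its historical mean,
    expressed in historical standard deviations."""
    bars = {}
    for name, past_values in history.items():
        bars[name] = (observation[name] - mean(past_values)) / stdev(past_values)
    return bars

# Hypothetical values from previous good batches at the same point in the process.
history = {
    "temperature": [36.9, 37.0, 37.1, 37.0],
    "pH": [7.0, 7.1, 6.9, 7.0],
    "pressure": [1.0, 1.1, 0.9, 1.0],
}
# Hypothetical current observation that triggered the warning.
observation = {"temperature": 37.0, "pH": 7.0, "pressure": 1.5}

bars = contributions(history, observation)
# The variable with the longest bar is the first candidate to investigate.
suspect = max(bars, key=lambda name: abs(bars[name]))
print(suspect)  # pressure
```

A positive bar means the variable is above its historical mean and a negative bar means it is below, which matches the reading of the chart described above.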


Solving a challenge

With batch process modeling, time is implicitly built into the data sets (events happened in the past), so the challenge is what to do with this data when you want real-time monitoring. In this case, they created a workaround that avoids sourcing the data from the archive and uses data snapshots instead.

The big challenge is: how do we source the batch context in the real-time data feeds for real-time monitoring?

For Bristol-Myers Squibb, this was solved by creating a set of rules that might seem very simple, but which effectively solve the challenge of representing data in real time rather than relying on delayed archived data. In this case, that means the model uses a sleep condition, which was set up very simply using a Boolean to account for two states: active phase or process phase. And to provide additional uniqueness, it was necessary to look at the batch ID tag as well.

The solution looks like this:

Real-time batch context triggers in SIMCA-online

  • Phase execution condition: phase active = 1
  • Sleep condition: process phase active and running = 0
  • Batch identifier tag: Batch ID

The sleep condition helps define when to monitor the process phase. What was of interest for monitoring was not only the automation recipe phase, but really the subphase that defines a specific part of the process, so it would be possible to monitor or not monitor that part.
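The three triggers can be read as simple Boolean logic. The Python sketch below is one possible interpretation, not the actual SIMCA-online or PI configuration: it assumes monitoring runs while the phase is active (value 1), pauses while the "process phase active and running" tag is 0, and requires a batch ID to be present. The class, field and function names are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Snapshot:
    """One real-time reading of the three context tags (hypothetical names)."""
    batch_id: str
    phase_active: int            # phase execution condition: 1 while the phase executes
    process_phase_running: int   # sleep condition: 0 while the subphase is idle

def should_monitor(snap: Snapshot) -> bool:
    """Monitor only when a batch is identified, the phase executes,
    and the sleep condition is not met."""
    executing = snap.phase_active == 1
    sleeping = snap.process_phase_running == 0
    return bool(snap.batch_id) and executing and not sleeping

# Hypothetical real-time feed.
feed = [
    Snapshot("B-001", 0, 0),  # phase not started yet
    Snapshot("B-001", 1, 0),  # phase active, but the subphase is asleep
    Snapshot("B-001", 1, 1),  # monitored
    Snapshot("B-002", 1, 1),  # monitored, different batch
    Snapshot("", 1, 1),       # no batch ID, so no batch context
]
monitored = [snap.batch_id for snap in feed if should_monitor(snap)]
print(monitored)  # ['B-001', 'B-002']
```

Keeping the batch ID in the rule is what separates one batch from the next when the phase flags alone look identical.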

This solution was used to align the equipment model (from the DeltaV control system) with everything seen in AF. That alignment lets both the PI interface and the AF attributes write data to the same containers, that is, to the same units and the same locations.

After passing data from their enterprise system to SIMCA-online, however, the implementation team noticed inherent data latency: the tags were not written fast enough to enable real-time monitoring. Using PI Asset Analytics, and working with the site automation team, they were able to source signals close to where the data originates (specifically the batch ID and phase state tags) and send these directly from the site PI system instead of the enterprise system.

Best practices to implement real-time monitoring for batch processes

In summary, Morrow provides this set of best practices when implementing real-time data monitoring for bioprocesses using OSIsoft PI and SIMCA-online:

1. Use PI Asset Analytics equations to define real-time batch context. Bristol-Myers Squibb found it to be critical for their situation (to solve data latency).

2. Look for alternate signals to source your batch ID and your phase start and end times. Do that from historical DeltaV tags by way of the control module.

3. Look for ways to simplify your equations. Sourcing data from equipment modules rather than using empirical equations may help simplify a lot of the conditional logic and bulky error checking (such as for process phase active and running).

4. Build your system at the site PI system (if mature), not the enterprise system. The site system is closer to your data source and you’ll thereby enable the real-time capability.

5. Make sure you source your real-time continuous data from snapshot data, which comes in from your real-time feed, and not from your archive, which has an unpredictable delay.

Some specific examples

Bristol-Myers Squibb also shares several case examples of how implementing real-time data analytics helps provide tangible, immediate business benefits.

These include:

1. Preventing potential process deviations downstream through early detection of a tight membrane, and immediate adjustments to reduce the TMP within IPC limits.

2. Early fault detection in a cell culture valve reactor indicated by abnormally high oxygen score plots that allowed a leak to be fixed promptly – preventing waste of utility.

3. Preventing process impact from a leaking valve reactor, when a contribution plot indicated a high volume as a result of steam in a bioreactor.

4. Process change that streamlined the recipe execution and led to a new policy of more frequent pH probe replacement when a score plot showed drifting pH during chromatography.

Get the details on these examples and the steps Bristol-Myers Squibb used to implement real-time data monitoring using OSIsoft PI and SIMCA-online along with other data collection tools.

Watch the video from OSI World or download the presentation.





Topics: Real Time Process Monitoring, Pharmaceutical manufacturing, Batch processes, Continuous Process Monitoring

Written by Jonas Elfving

Product Manager at Sartorius Stedim Data Analytics
