Transforming Downstream Bioprocesses With Real-Time Monitoring
App Note / Case Study
Published: October 27, 2025
Credit: Thermo Fisher Scientific
Proteins and excipients are key components in the production of therapeutics. Accurate monitoring of their concentration and structural integrity is essential to ensure product quality and safety.
Buffer matrix interference can compromise traditional analytical methods. However, real-time, reliable monitoring with Raman spectroscopy can eliminate the need for extensive laboratory analytics, reduce batch variability and ensure uniformity of product quality.
This app note presents a comprehensive overview of Raman spectroscopy applications for real-time, in-line monitoring of downstream bioprocesses, exploring methods for accurate protein quantification and structural analysis and buffer assessment.
Download this app note to discover:
- Cutting-edge approaches for real-time process monitoring, supporting high-quality biomanufacturing
- Methods for accurate quantification of proteins, excipients and buffer components across complex matrices
- Strategies to reduce reliance on offline analytics, enabling faster, cost-effective and scalable downstream operations
Enhancing Downstream
Bioprocesses with
Raman Spectroscopy
Compendium of application notes
The accurate measurement of protein and excipient
concentration in biomanufacturing processes is
essential to production of therapeutics like monoclonal
antibodies (mAbs), nucleic acids and other important
drug products. Raman spectroscopy, a Process
Analytical Technology (PAT) based on specific molecular
signatures, is particularly suitable for monitoring
complex biological processes characterized by
multi-component interactions and dynamic changes.
In this compendium, we present a collection of studies
demonstrating the application of Raman spectroscopy
towards monitoring downstream bioproduction
processes. When deployed as a PAT, Raman
spectroscopy provides both qualitative and quantitative
feedback, in real time, of proteins, amino acids, and
even buffer quality. As the application notes contained
herein show, Raman spectroscopy’s intrinsic capabilities
allow it to be highly useful in bioprocess monitoring
even when the presence of complex biomolecules or
interferents like clarified harvest might render other
analytical methods ineffective.
Table of Contents
- Using a Process Raman Analyzer as an In-line Tool for Accurate Protein Quantification in Downstream Processes
- A Classical Least Squares (CLS) Approach for Protein Quantification in Downstream Processing Using Raman Spectroscopy
- Direct in-line quantification of titer in clarified harvest using Raman spectroscopy
- Raman-based Accurate Protein Quantification in a Matrix that Interferes with UV-Vis Measurement
- Process Raman as a comprehensive solution for downstream buffer workflow
- Protein secondary structure analysis using in-line process Raman and offline SERS surface
Using a Process Raman Analyzer as an In-line Tool for
Accurate Protein Quantification in Downstream Processes
Application note
Authors
Michelle Nolasco¹, Kristina Pleitt¹, Nimesh Khadka²
¹ BioProduction Group, Thermo Fisher Scientific,
St. Louis, Missouri USA
² Analytical Instrument Group, Thermo Fisher Scientific,
Tewksbury, Massachusetts, USA
Significance
Accurate in-line, real-time measurement of protein
(monoclonal antibody, or mAb) concentration has been
demonstrated over a wide dynamic range (0-135 g/L)
using the process Raman analyzer during ultrafiltration/
diafiltration (UF/DF), as shown in Figure 1.
The methodology described in the paper offers actionable
results for downstream processing, providing valuable
insights for monitoring and controlling protein concentration.
Additionally, we have demonstrated that the Raman models
developed for measuring mAb concentration are transferable
to a similar class of monoclonal antibodies.
Introduction
Therapeutic proteins (e.g., monoclonal antibodies, insulin,
Fc-fusion proteins, antibody-drug conjugates, hormones) are
used to treat a variety of conditions and diseases such as
cancer, immunologic diseases, communicable diseases like
COVID-19, and many others.¹ The efficacy and functionality of
a therapeutic protein are dependent on dosage and its physical
state and structure. Abrupt changes in the environment of
the protein, such as shifts in pH, temperature, shear force, or
chemical modification, can induce conformational changes
within the protein. The structural changes may lead to protein
denaturation, aggregation, and even degradation—all of which
can lower the efficacy of the therapeutic protein or trigger
negative health consequences for the patient. Thus, ensuring
the quantity and quality of therapeutic proteins throughout the
manufacturing process, from production through the fill and
finish stage and during storage, is of paramount importance.²
Figure 1. Diagram showing integration of process Raman in a
UF/DF process using a flowcell probe for in-line estimation of
accurate protein concentration. Labeled components: buffer,
pumps, membrane, retentate, permeate, outlet, FlowCell, and
process Raman.
Therapeutic proteins like monoclonal antibodies are produced
in a bioreactor where cells are maintained at optimum
conditions by controlling nutrition and physical factors like
pH, temperature, and dissolved oxygen. The bioreactor’s
controlled environment leads to improved monoclonal antibody
production. Following the upstream process, operations like
centrifugation, filtration, chromatography, viral inactivation,
concentration, buffer exchange, formulation, and fill and
finish are performed to purify the monoclonal antibodies and
formulate them into drug products. These operations are
collectively categorized as downstream processes (Figure 2).
In this study, we placed a Thermo Scientific™ MarqMetrix™ All-In-One Process Raman Analyzer in a UF/DF run, as described
in Figure 1, for use as an in-line process analytical technology
(PAT) solution for measuring mAb concentration. In-line PAT
enables real-time monitoring and control of critical process
parameters that are key to reducing batch-to-batch process
variability and ensuring uniformity of products. A well-controlled
process also implies high efficiency, product quality, and
minimized manufacturing costs.³ Currently, ultraviolet-visible
(UV-Vis) spectrometry is the standard technology for in-line
quantification of protein concentration during downstream
purification.⁴ However, in recent years, process Raman has
gained popularity as a complementary technology for in-line
monitoring of protein concentration with its additional benefits
of being able to measure excipient concentrations, buffer
components, and critical quality attributes (CQAs) of products.⁵
Here, we demonstrate the use of process Raman as a viable
PAT solution for accurate and real-time quantification of mAb
concentration during downstream processing. In addition,
we also show the model’s transferability to a different
monoclonal antibody.
Experimental design and data analysis
Calibration model development
Partial least squares (PLS) calibration models for mAb
quantification were developed using the calibration samples of
product A ranging in concentration from 0 to 135 mg/mL. The
samples were passed through the FlowCell probe integrated
with process Raman at a 100 mL/min flow rate. The Raman
spectra were acquired using a 785 nm laser with the following
acquisition parameters: laser power 450 mW; integration time
3000 ms; average 3 (i.e., a single spectrum per 18 s); and ten
replicates per concentration.
PLS models were developed for mAb quantification using two
spectral regions. The “Amide I PLS model” utilized the spectral
region of approximately 1550 to 1850 cm⁻¹, while the “Extended
Region PLS model” utilized a broader spectral region from
approximately 850 to 1850 cm⁻¹. The terminal region of the
spectra (~3098 to 3230 cm⁻¹), which corresponds to water band
vibrations, was also included for use in normalization. The
models were built after the following data preprocessing steps:
1) normalization to the water band in every spectrum using the
infinity norm; 2) Savitzky-Golay filter, 1st derivative (smoothing
window 13 and polynomial order 2); and 3) mean centering. Both
models were internally validated using a leave-one-out
cross-validation (LOOCV) strategy (contiguous block of 10).
Initially, both models were built using latent variables
ranging from 1 through 20. The root mean square error of
calibration (RMSEC) and the root mean square error of
cross-validation (RMSECV) were calculated for each model. Finally,
the optimum latent variable PLS model was selected using
the following criteria: 1) adding more latent variables did not
significantly improve the RMSECV; and 2) the values of RMSEC
and RMSECV were similar. To evaluate model specificity,
the variable importance in the projection (VIP) scores were
calculated for the above two models. All chemometric work
was performed using Solo 9.3 (2024) software, Eigenvector
Research, Inc., Manson, WA, USA 98831.
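The three preprocessing steps listed above can be sketched in Python as follows. This is a minimal illustration and not the Solo implementation used in the study: the function names, the NumPy-only Savitzky-Golay weights, the choice to trim the window edges, and taking the derivative per channel step (rather than per cm⁻¹) are all assumptions made for the example.

```python
import numpy as np
from math import factorial


def savgol_coefficients(window, polyorder, deriv):
    """Least-squares (Savitzky-Golay) weights for the center point of a window."""
    half = window // 2
    offsets = np.arange(-half, half + 1)
    # Columns are offsets**0, offsets**1, ..., offsets**polyorder.
    design = np.vander(offsets, polyorder + 1, increasing=True)
    # Row `deriv` of the pseudo-inverse yields the fitted polynomial coefficient
    # of that order; multiplying by deriv! turns it into the derivative at 0.
    return np.linalg.pinv(design)[deriv] * factorial(deriv)


def preprocess(spectra, shifts, water_band=(3098.0, 3230.0), window=13, polyorder=2):
    """Water-band normalization, 1st-derivative filter, then mean centering."""
    spectra = np.asarray(spectra, dtype=float)
    shifts = np.asarray(shifts, dtype=float)
    in_water = (shifts >= water_band[0]) & (shifts <= water_band[1])
    # 1) Divide each spectrum by the infinity norm (largest absolute value)
    #    of its own water band.
    scale = np.max(np.abs(spectra[:, in_water]), axis=1, keepdims=True)
    normalized = spectra / scale
    # 2) First derivative per channel step; "valid" mode drops window // 2
    #    points at each end instead of extrapolating.
    weights = savgol_coefficients(window, polyorder, deriv=1)
    derivative = np.array(
        [np.convolve(row, weights[::-1], mode="valid") for row in normalized]
    )
    # 3) Mean centering: subtract the column-wise mean over all spectra.
    return derivative - derivative.mean(axis=0)
```

Because each spectrum is divided by its own water-band maximum, two spectra that differ only by a multiplicative path-length factor become identical after step 1, which is the purpose of the normalization.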
Figure 2. Upstream and downstream workflow for production of monoclonal antibodies.
Monoclonal
Antibody
Production Centrifugation →
depth filtration
Upstream Process Downstream Processes
Cells Monoclonal
Antibody
Bioreactor
Capture
chromatography
Viral
inactivation
Viral
filtration
Intermediate and polishing
chromatography
Monoclonal Antibody Purification and Formulation
Ultrafiltration/
Diafiltration
Bulk fill Fill and Finish
Validation of model performance
The Raman spectra for the test samples were collected for
two different monoclonal antibodies (products A and B) using
the procedure described for the training data. The formulation
buffers for products A and B were different during the
diafiltration steps. These acquired spectra were fed into the
models to get the real-time prediction of protein concentration.
The predicted concentrations were then compared with the
reference in-line and offline UV-Vis values to estimate the
prediction error.
Results
The correlation plot between measured and predicted (during
cross-validation) protein concentrations for the Amide I PLS
model is shown in Figure 3A. The Amide I PLS model was
developed using four latent variables, as the RMSECV did not
improve by adding more latent variables (data not shown). The
RMSEC and RMSECV for the Amide I PLS model were 0.526
mg/mL and 0.607 mg/mL, respectively. The ratio of RMSEC to
RMSECV being close to 1 suggests that the model is not
overfitted. Similarly, the R² of CV close to 1 and the negligible CV
bias indicate that the model fits the training data well. The
model statistics are summarized in Table 1.
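The two selection criteria (RMSECV stops improving, and RMSEC is close to RMSECV) can be expressed as a small Python sketch. The 5% improvement threshold and the RMSECV curve in the usage lines are hypothetical values chosen for illustration; the source does not state a numerical cutoff. Only the Amide I RMSEC and RMSECV values are taken from the text.

```python
def select_latent_variables(rmsecv, rel_improvement=0.05):
    """Smallest number of latent variables after which adding one more lowers
    RMSECV by less than `rel_improvement` (a fraction, assumed here)."""
    for n_lv in range(1, len(rmsecv)):
        current, following = rmsecv[n_lv - 1], rmsecv[n_lv]
        if current == 0 or (current - following) / current < rel_improvement:
            return n_lv
    return len(rmsecv)


def calibration_to_cv_ratio(rmsec, rmsecv):
    """Ratio near 1 suggests calibration and cross-validation errors agree."""
    return rmsec / rmsecv


# Hypothetical RMSECV values for 1 to 6 latent variables (not from the study).
example_curve = [5.0, 2.0, 0.9, 0.61, 0.60, 0.60]
chosen = select_latent_variables(example_curve)

# RMSEC and RMSECV reported for the Amide I PLS model.
amide_ratio = calibration_to_cv_ratio(0.526, 0.607)
```

With the reported Amide I values the ratio is about 0.87, which is the sense in which the text calls it close to 1.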
Figure 3. Correlation plots of measured versus cross-validation-predicted mAb concentration (mg/mL) for the Amide I PLS model (A) and the Extended Region PLS model (B).
Inset statistics, panel A: 4 latent variables; RMSEC = 0.52639; RMSECV = 0.60764; calibration bias = 0; CV bias = -0.03626; R² (Cal, CV) = 1.000, 1.000.
Inset statistics, panel B: 4 latent variables; RMSEC = 0.2379; RMSECV = 0.31917; calibration bias = -3.5527e-15; CV bias = -0.014102; R² (Cal, CV) = 1.000, 1.000.
Table 1. Statistics for the Amide I and Extended Region PLS models.
PLS model | Latent variables | RMSEC (mg/mL) | RMSECV (mg/mL) | CV bias | R² CV
Amide I PLS model | 4 | 0.526 | 0.607 | -0.036 | ~1
Extended Region PLS model | 4 | 0.238 | 0.319 | -0.014 | ~1
The Amide I model is primarily based on the Raman signature
of the carbonyl group (–C=O) of the peptide bond (–CO–NH–)
of the mAb. The carbonyl group in different secondary structures
has different Raman shifts, ranging from 1600 to 1750 cm⁻¹. The
mAb secondary structure is primarily β-sheet. The
carbonyl in the β-sheet secondary structure has a Raman peak
at ~1670 cm⁻¹. Thus, it can be hypothesized that the Raman
shift at ~1670 cm⁻¹ should influence the Amide I PLS model.
The VIP scores were calculated to assess the influence of each
Raman shift on the model. Raman shifts with VIP scores
above 1 are considered significant for the model. Figure 4A
shows the VIP scores plot for the Amide I PLS model. The red
dotted line represents the threshold of a VIP score equal to 1.
The region from 1640 to 1700 cm⁻¹ has VIP scores above 1
and is influential to the model. The region around
1670 cm⁻¹ has the highest VIP score in the Amide I PLS model.
This indicates that the carbonyl Raman signature of the β-sheet
secondary structure dominates the Amide I PLS model.
In other words, the VIP scores plot for the Amide I PLS model
indicates that the model has high specificity for mAb.
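For readers who want to reproduce a VIP plot outside a commercial package, the commonly used VIP definition for a single-response PLS model can be sketched as below. This is an assumption about the formula, not a description of how Solo computes it; the argument names (`weights`, `scores`, `y_loadings`) are illustrative and would come from whatever PLS implementation is used.

```python
import numpy as np


def vip_scores(weights, scores, y_loadings):
    """VIP per variable for a single-response PLS model.

    weights:    (n_variables, n_latent) PLS weight matrix W
    scores:     (n_samples, n_latent) X-score matrix T
    y_loadings: (n_latent,) y-loading vector q
    """
    W = np.asarray(weights, dtype=float)
    T = np.asarray(scores, dtype=float)
    q = np.asarray(y_loadings, dtype=float).ravel()
    n_variables = W.shape[0]
    # Sum of squares of y captured by each latent variable.
    explained = (q ** 2) * np.sum(T ** 2, axis=0)
    # Squared weights, with each latent variable's weight vector scaled to unit length.
    w_norm_sq = (W / np.linalg.norm(W, axis=0)) ** 2
    return np.sqrt(n_variables * (w_norm_sq @ explained) / explained.sum())
```

Under this definition the mean of the squared VIP scores equals 1, which is why a threshold of 1 is the usual cutoff for calling a Raman shift influential.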
Based on published work, the Amide I region of mAb is
unique. It has a distinct Raman signature compared to other
molecules (excipients and buffers) commonly used in
downstream processes. The absence of spectral interference in the
Amide I region suggests that the Amide I PLS model is transferable
across different matrices. In addition, most mAbs have a similar mass
(and thus a similar number of carbonyl groups) and secondary structure.
Thus, the Amide I PLS model is transferable across multiple
mAbs within the same class. This was the basis of our intent
to develop the Amide I PLS model.
The correlation plot for the four-latent-variable Extended
Region PLS model is shown in Figure 3B, and the model
statistics are summarized in Table 1. Like the Amide I PLS
model, the Extended Region PLS model’s excellent statistics
suggest that it adequately captures the spectral information
and correlates it with the measured concentration. The VIP
scores plot for the Extended Region PLS model is shown in
Figure 4B. Besides the Amide I region, the Raman shifts that
correspond to the CH deformation (~1450 cm⁻¹), the breathing
mode of the phenylalanine ring (~1005 cm⁻¹), and the tyrosine
vibration mode (~850 cm⁻¹) are influential in this model. Thus,
the VIP scores plot for the Extended Region PLS model
indicates the specificity of the model for mAb.
The RMSEC and RMSECV for the Amide I PLS model are
higher than those for the Extended Region PLS model, as shown
in Table 1, which suggests that the former model is less
accurate than the latter. The Extended Region PLS model
leverages additional variables (Raman shifts) in explaining the
information of the training dataset. Although the Extended
Region PLS model provides more accuracy, in some cases, the
transferability of this model across different matrices may result
in higher prediction error due to spectral overlap. Thus, both
models have pros and cons, and depending on the need, one
model might be a better choice.
Figure 4. The VIP scores plots for the Amide I PLS model (plot A) and the Extended Region PLS model (plot B). Axes in both plots: VIP score versus Raman shift (cm⁻¹).
Performance of models on the UF/DF run using
product A (mAb)
The real-time prediction of protein concentration during the
UF/DF process using data from in-line UV-Vis (grey trace), the
Amide I PLS model (blue trace), and the Extended Region PLS model
(orange trace) is shown in Figure 5. All three traces overlaid
each other, demonstrating excellent agreement. Minute
discrepancies between the in-line UV-Vis and Raman data
were observed. The Raman data were acquired
every 18 s, while the in-line UV-Vis data were acquired
every 12 s. Since the UF/DF process is highly dynamic,
the difference in acquisition times for the in-line UV-Vis
and Raman instruments explains the discrepancies. Such
discrepancies can be easily overcome with proper control of
the process dynamics or acquisition settings. Finally, when the
Raman predictions from both models were compared with the
offline UV-Vis reference values, the absolute prediction errors
were below 5% throughout the process.
Transferability of models on product B (mAb)
UF/DF run
The Amide I and Extended Region PLS models were developed
using the Product A data set. To validate model performance
with another monoclonal antibody, a UF/DF run was carried out
with a different protein (Product B) in a different formulation
matrix. Figure 6A shows the real-time monitoring and
prediction of Product B concentration during the UF/DF
process using the Amide I and Extended Region PLS models. The
predictions from the Amide I model (blue trace) and the Extended
Region model (orange trace) overlay each other closely.
Figure 6B shows the absolute prediction error between
the predicted (Raman) and reference (offline UV-Vis) values
as a function of protein concentration. The blue and orange
bars represent the prediction errors from the Amide I PLS
and Extended Region PLS models, respectively. The overall
prediction errors were below 3% across the concentration
range tested.
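The percent prediction errors quoted for the UF/DF runs are taken here to mean the absolute difference between the Raman prediction and the offline UV-Vis reference, relative to the reference. That reading is an assumption, since the text does not spell out the formula, and the concentration pairs below are hypothetical values used only to show the arithmetic.

```python
def absolute_percent_error(predicted, reference):
    """Absolute prediction error as a percentage of the reference value."""
    return abs(predicted - reference) / reference * 100.0


def all_within(pairs, limit_percent):
    """True if every (predicted, reference) pair is within the error limit."""
    return all(absolute_percent_error(p, r) < limit_percent for p, r in pairs)


# Hypothetical (Raman prediction, offline UV-Vis reference) pairs in mg/mL.
example_pairs = [(35.6, 35.0), (41.5, 42.1), (95.9, 94.5)]
errors = [absolute_percent_error(p, r) for p, r in example_pairs]
```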
Figure 5. Real-time monitoring of the UF/DF run of product A, showing
excellent correlation between in-line UV-Vis, the Amide I PLS model,
and the Extended Region PLS model. Axes: mAb concentration (mg/mL)
versus time (min).
Figure 6. Data demonstrate excellent model transferability
between proteins with different formulation matrices. The
calibration models for predicting protein concentration were
built with Product A and applied to Product B, which has a different
formulation buffer. The absolute prediction error was < 3%.
(A) Protein concentration (g/L) versus spectrum number for the
Amide I and Extended Region models. (B) % error versus protein
concentration (g/L) for the Amide I and Extended Region models.
Learn more at thermofisher.com/marqmetrixAIO
For research use only. Not for use in diagnostic procedures. For current certifications, visit thermofisher.com/certifications
© 2024 Thermo Fisher Scientific Inc. All rights reserved. All trademarks are the property of Thermo Fisher Scientific
and its subsidiaries unless otherwise specified. MCS-AN1145-EN 9/25
Conclusions
- We demonstrated process Raman to be a rapid, reliable,
and easy-to-use PAT solution for in-line monitoring of
protein concentration during downstream processes, with
an accuracy comparable to the reference values from in-line
and offline UV-Vis instrumentation.
- In this study, we outlined two strategies for developing
the chemometric models for the quantification of mAb based on specific Raman signatures. Because of the high
specificity of both models for mAb, we also demonstrated
their excellent transferability to a different mAb in a UF/DF
run, regardless of differences in buffer formulations.
- Based on works published in the literature, process Raman
is a broader PAT solution for downstream applications
because of the additional benefits it brings regarding
measurement of buffer components, critical quality
attributes of proteins (aggregation, secondary structure,
disulfide region), protein modification (antibody-drug
conjugates), formulation components, and more.⁶⁻⁸
References
1. Leader, B.; Baca, Q. J.; Golan, D. E. Protein Therapeutics: A Summary and Pharmacological Classification. Nat Rev Drug Discov 2008, 7 (1), 21–39. https://doi.org/10.1038/nrd2399.
2. Alt, N.; Zhang, T. Y.; Motchnik, P.; Taticek, R.; Quarmby, V.; Schlothauer, T.; Beck, H.; Emrich, T.; Harris, R. J. Determination of Critical Quality Attributes for Monoclonal Antibodies Using Quality by Design Principles. Biologicals 2016, 44 (5), 291–305. https://doi.org/10.1016/j.biologicals.2016.06.005.
3. Center for Drug Evaluation and Research. PAT - A Framework for Innovative Pharmaceutical Development, Manufacturing, and Quality Assurance. U.S. Food and Drug Administration. https://www.fda.gov/regulatory-information/search-fda-guidance-documents/pat-framework-innovative-pharmaceutical-development-manufacturing-and-quality-assurance (accessed 2023-03-21).
4. Rolinger, L.; Rüdt, M.; Hubbuch, J. Comparison of UV- and Raman-Based Monitoring of the Protein A Load Phase and Evaluation of Data Fusion by PLS Models and CNNs. Biotechnol Bioeng 2021, 118 (11), 4255–4268. https://doi.org/10.1002/bit.27894.
5. Rolinger, L.; Rüdt, M.; Hubbuch, J. A Critical Review of Recent Trends, and a Future Perspective of Optical Spectroscopy as PAT in Biopharmaceutical Downstream Processing. Anal Bioanal Chem 2020, 412 (9), 2047–2064. https://doi.org/10.1007/s00216-020-02407-z.
6. Sukumaran, S. Protein Secondary Structure Elucidation Using FTIR Spectroscopy. 4.
7. Dolui, S.; Mondal, A.; Roy, A.; Pal, U.; Das, S.; Saha, A.; Maiti, N. C. Order, Disorder, and Reorder State of Lysozyme: Aggregation Mechanism by Raman Spectroscopy. J. Phys. Chem. B 2020, 124 (1), 50–60. https://doi.org/10.1021/acs.jpcb.9b09139.
8. Hernández, B.; Pflüger, F.; López-Tobar, E.; Kruglik, S. G.; Garcia-Ramos, J. V.; Sanchez-Cortes, S.; Ghomi, M. Disulfide Linkage Raman Markers: A Reconsideration Attempt: Disulfide Raman Markers. J. Raman Spectrosc. 2014, 45 (8), 657–664. https://doi.org/10.1002/jrs.4521.
A Classical Least Squares (CLS) Approach for
Protein Quantification in Downstream Processing
Using Raman Spectroscopy
Technical note
Authors
Nimesh Khadka¹, Ph.D.
Kristina Pleitt², Ph.D.
Michelle Nolasco²
¹Analytical Instrument Group, Thermo Fisher Scientific,
Tewksbury, Massachusetts USA
²Bioproduction Group, Thermo Fisher Scientific,
St. Louis, Missouri USA
Industry/Application
Biopharma PAT / Downstream processing
Products used
Thermo Scientific™ MarqMetrix™ All-In-One Process
Raman Analyzer, Thermo Scientific™ MarqMetrix™ FlowCell™
Sampling Optic
Goals
Demonstrate quick and easy chemometric strategies
to develop an accurate protein quantification model for
downstream applications using only a single spectrum of
known concentration. Highlight the developed strategy’s
performance and transferability across quantification of
other types of monoclonal antibodies (mAbs), different
matrices, and processes.
Key analytes
Protein (mAb) quantification
Key benefits
• Cost and time savings from eliminating the
need for extensive training data collection and
laboratory analytics
• Enable immediate deployment into process monitoring
and control, allowing customers to leverage the
numerous benefits of process Raman spectroscopy
The successful deployment of Raman spectroscopy to
monitor complex processes relies on the robustness of
chemometric models, which typically require large datasets
and sophisticated algorithms. The time and cost associated
with generating extensive and reliable training datasets often
hinder the adoption and integration of Raman technology in
biopharmaceutical applications. In this work, we introduce a
chemical information-based approach to develop a robust and
accurate chemometric model with a minimal training dataset.
We employed the classical least squares (CLS) algorithm
using a selected region of a single spectrum of a protein with
a known concentration to develop a protein quantification
model for monitoring ultrafiltration/diafiltration (UF/DF) in
downstream processing. The performance of the CLS model
and its transferability across different monoclonal antibodies
are discussed. This approach can also be leveraged to build
quick and accurate chemometric models for various other
applications, making Raman technology more accessible for
adoption and utilization.
Data acquisition and chemometric modelling
A monoclonal antibody (mAb; protein) at a concentration
of 93.56 mg/mL in a formulation aqueous buffer containing
histidine, arginine, and sucrose was passed through a Thermo
Scientific™ MarqMetrix™ FlowCell Sampling Optic at a flow rate
of 100 mL/min. Raman data were acquired during the dynamic
flow using a Thermo Scientific™ MarqMetrix™ All-In-One Process
Raman Analyzer with acquisition parameters of 450 mW
power, 3000 ms integration time, and 3 averages. Ten spectra
were acquired and preprocessed to remove any cosmic ray
interference. The ten spectra were then averaged into a single
spectrum that was used to build the CLS model.
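The note does not say how cosmic ray interference was removed before averaging. One common approach, sketched here purely as an assumption, is to compare each replicate with the channel-wise median across replicates and replace positive outliers before taking the mean. The function name and the cutoff of 5 scaled median absolute deviations are illustrative choices.

```python
import numpy as np


def despike_and_average(replicates, n_mad=5.0):
    """Replace cosmic-ray spikes with the channel median, then average.

    replicates: (n_replicates, n_channels) array of repeated spectra.
    """
    X = np.asarray(replicates, dtype=float)
    median = np.median(X, axis=0)
    # Median absolute deviation per channel; 1.4826 * MAD approximates the
    # standard deviation for normally distributed noise.
    mad = np.median(np.abs(X - median), axis=0)
    threshold = n_mad * 1.4826 * mad + 1e-12
    # Cosmic rays only add intensity, so only positive deviations are flagged.
    spikes = (X - median) > threshold
    cleaned = np.where(spikes, median, X)
    return cleaned.mean(axis=0)
```

With ten replicates per sample, as acquired here, the median is insensitive to a single spiked spectrum, so the averaged spectrum is not pulled upward at the affected channel.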
Two spectral regions were selected before developing the CLS
model: 3100 to 3230 cm⁻¹ (water band) and 1570 to 1750 cm⁻¹
(protein amide I region). The infinity norm for the 3100 to
3230 cm⁻¹ spectral region was calculated and used as a weight
to normalize the entire spectrum, correcting for optical path length
differences. Baseline features were removed by applying the
Savitzky-Golay filter (2nd derivative, polynomial order = 2, window
width = 13). The derivatized spectrum was then used to develop
the CLS model, which in essence is a ratiometric model that
translates the ratio of the Raman intensity of the amide I band to the
water band into predicted concentrations.
Demonstrating a simpler chemometric approach for quantifying protein concentration in downstream ultrafiltration/diafiltration
(UF/DF) processing using process Raman. The overview shows a UF/DF schematic (buffer, pumps, membrane, retentate, permeate, outlet) with a FlowCell and process Raman, test data in the amide I region (Raman intensity versus Raman shift) passed to the CLS model, and the resulting single-spectrum predictions (predicted versus measured concentration, mg/mL; RMSEC = 0.74464, RMSEP = 2.2493, prediction bias = 0.55971, R² (Pred) = 0.997).
Following the CLS model development, validation data
were acquired using the same parameters over the mAb
concentration range of 0 to 155 mg/mL in the same formulation
buffer. The model’s performance was further validated by
applying it to the UF/DF process with different mAbs and
by including tryptophan in the formulation buffer alongside
histidine, arginine, and sucrose. A brief comparison of CLS and
partial least squares (PLS) models was also performed.
All data management, cosmic ray removal, averaging, and
timestamp alignment were performed in the Python™ programming
language. The data were then processed in a commercially
available chemometric package. All chemometric work
was performed using the software package SOLO 9.3.1 (2024),
Eigenvector Research, Inc., Manson, WA, USA 98831.
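The timestamp alignment step is not described in detail. A minimal sketch, assuming that alignment means interpolating a reference signal (for example, in-line UV-Vis readings) onto the Raman acquisition times, is shown below. The function name is hypothetical, and linear interpolation with `numpy.interp` is one of several reasonable choices rather than the authors' method.

```python
import numpy as np


def align_to_raman(raman_times_s, reference_times_s, reference_values):
    """Linearly interpolate a reference signal onto the Raman timestamps.

    reference_times_s must be increasing. Raman timestamps outside the
    reference time range receive the nearest end value (numpy.interp behavior).
    """
    return np.interp(raman_times_s, reference_times_s, reference_values)
```

For example, a reference series sampled every 12 s can be resampled at Raman timestamps spaced 18 s apart, so that each Raman prediction is compared with a reference value at the same moment in a dynamic process.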
Results and discussion
Standard normal variate (SNV) is a widely used normalization
technique for spectroscopic data to correct for path length
differences.¹ SNV is effective when the variation between spectra
is primarily noise and the spectra share the same overall signal,
but it may lead to non-linear responses if the overall signal
changes significantly between samples.² This may be especially
true in downstream processes, where concentrations are
high and dynamic changes result in rapid changes in spectral
features and intensities. In this study, we opted to use the
water band as an internal standard to normalize the spectra, as
the water concentration remains relatively constant throughout
the bioprocesses.³⁻⁵
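The difference between the two normalizations can be illustrated with synthetic data. The sketch below is not from the study: the Gaussian band shapes, channel positions, and the 0.02 intensity per mg/mL scaling are invented so that the protein band is comparable in size to the water band, which is the situation where SNV is expected to respond non-linearly.

```python
import numpy as np


def snv(spectrum):
    """Standard normal variate: center and scale one spectrum by its own statistics."""
    x = np.asarray(spectrum, dtype=float)
    return (x - x.mean()) / x.std()


def water_normalize(spectrum, water_slice):
    """Divide a spectrum by the infinity norm of its water-band channels."""
    x = np.asarray(spectrum, dtype=float)
    return x / np.max(np.abs(x[water_slice]))


# Synthetic bands on an arbitrary channel axis (hypothetical shapes).
channels = np.arange(400)
amide = np.exp(-0.5 * ((channels - 100) / 10.0) ** 2)
water = np.exp(-0.5 * ((channels - 320) / 15.0) ** 2)
water_slice = slice(300, 341)


def synthetic_spectrum(conc_mg_ml, path_factor):
    """Constant water band plus an amide band proportional to concentration."""
    return path_factor * (water + 0.02 * conc_mg_ml * amide)


low = synthetic_spectrum(50.0, path_factor=1.0)
high = synthetic_spectrum(100.0, path_factor=0.8)

# Ratio of the amide peak height (channel 100) after each normalization.
ratio_water = water_normalize(high, water_slice)[100] / water_normalize(low, water_slice)[100]
ratio_snv = snv(high)[100] / snv(low)[100]
```

In this synthetic case, doubling the concentration doubles the water-normalized amide peak despite the different path factor, while the SNV-scaled peak grows by noticeably less than a factor of two.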
The region from 3100 to 3230 cm⁻¹ includes the Raman band
attributed to the symmetric stretching of the O-H vibrational
bond in water molecules. The O-H stretching vibrational
band is susceptible to changes in pH, ionic strength, and
temperature; nevertheless, these parameters are well-defined
and controlled during process development. This ensures the
reliability of this region for spectral normalization. Additionally,
unlike the symmetric bending vibration of water molecules at
1640 cm⁻¹, the 3100 to 3230 cm⁻¹ spectral region has minimal
spectral overlap from the constituents commonly used in
the downstream processes. This makes it a viable option for
spectral normalization.⁶
One concern is the low quantum efficiency of silicon optical
sensors in this high Raman shift region. We, and others, have
previously used this region for modelling, and the region’s
performance has proven to be acceptable, likely because the
high concentration of water (~55.5 M) compensates for the
limitation in efficiency.³⁻⁵
Figure 1a shows the average spectrum used for building the
training model, while Figure 1b presents the 2nd derivative plot
of the Raman spectrum in the amide I region of the mAb. The amide I
region spans the Raman shift from approximately 1630 to
1700 cm⁻¹ and is primarily influenced by variations in the energy
of C=O symmetric stretching vibrations in the different secondary
structures of the mAb.⁷ Our previous work has demonstrated that
the amide I region is free of spectral interferences and can be
effectively utilized to develop models with high specificity for
mAb in downstream processes.³,⁵ Consequently, the amide I
region was selected for the CLS model. The Savitzky-Golay
filter (2nd derivative) was used in the preprocessing step to
remove the baseline shifts as well as the broad water band,
ensuring that the Raman information from the mAb is used to
train the model.
Figure 1. (a) Selected regions used for chemometric model
development. The amide I region provides specificity
to the model for mAb, while the water band is used for
correction of spectral path length differences. (b) The
2nd derivative preprocessed spectrum of the amide I region
used for model development. Axes in both panels: Raman
intensity (a.u.) versus Raman shift (cm⁻¹).
The CLS model is a quantitative analytical method that explains
the observed spectrum of a given sample by using the linear
combination of the spectra of the pure components present
in the sample.⁸ For the CLS model to perform accurately, it is
essential to acquire the pure spectrum for each component of
the mixture, which is practically challenging or impossible for a
complex bioprocess. This problem is addressed in this study
by using the mAb-specific amide I region that is free of spectral
overlap. When a larger region was used to create the model, the CLS
model showed a decrease in performance (data not shown).
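With a single component, the CLS idea described above reduces to a one-parameter least-squares fit. The sketch below assumes the spectra passed in are already preprocessed (water-band normalized and derivatized) and restricted to the amide I region; the function names are illustrative and the synthetic band in the usage lines is invented. Only the 93.56 mg/mL reference concentration comes from the text.

```python
import numpy as np


def fit_single_component_cls(reference_spectrum, reference_conc):
    """Estimate the pure-component response per unit concentration
    from one preprocessed spectrum of known concentration."""
    return np.asarray(reference_spectrum, dtype=float) / reference_conc


def predict_cls(spectrum, pure_response):
    """Least-squares concentration c that best explains spectrum ~ c * pure_response."""
    x = np.asarray(spectrum, dtype=float)
    return float(np.dot(x, pure_response) / np.dot(pure_response, pure_response))


# Hypothetical preprocessed amide I feature, used only to show the mechanics.
band = np.exp(-0.5 * ((np.arange(50) - 25) / 5.0) ** 2)
k = fit_single_component_cls(93.56 * band, 93.56)
estimate = predict_cls(40.0 * band, k)
```

This also shows why the model depends on an interference-free region: any other component contributing signal in the selected window would be projected onto the protein response and bias the estimate.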
The loading for the CLS model is shown in Figure 2. The model
has high influence from the Raman shift at approximately
1670 cm⁻¹ in the amide I region, which is assigned to the
symmetric C=O stretching of the β-sheet secondary structure
of the mAb, thus providing specificity to the model. As expected
for the single-component CLS model, the loading and the
preprocessed spectrum are similar (compare Figures 1b and 2).
The CLS model was then applied to the validation dataset
(shown in Figure 3a) across a wide concentration range in
different buffer matrices: 0 mg/mL in water; 1 to 33 mg/mL
in tris buffer; and 33 to 155 mg/mL in histidine, arginine,
and sucrose buffer. Across the concentration range in
diverse buffer components, the root mean square error of
prediction (RMSEP) was approximately 2.25 mg/mL as shown
in the correlation plot of Figure 3b. The low RMSEP for the
concentration range of 0 to 155 mg/mL demonstrates excellent
model performance and transferability across buffer matrices.
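The validation statistics reported in the correlation plot (RMSEP, prediction bias, R²) can be computed from paired measured and predicted concentrations as sketched below. The sign convention for bias (predicted minus measured) and the use of the squared correlation coefficient for R² are assumptions; chemometric packages differ on both. The example values are hypothetical.

```python
import numpy as np


def prediction_statistics(measured, predicted):
    """Return (RMSEP, bias, R^2) for paired measured and predicted values."""
    y = np.asarray(measured, dtype=float)
    p = np.asarray(predicted, dtype=float)
    residuals = p - y
    rmsep = float(np.sqrt(np.mean(residuals ** 2)))
    bias = float(np.mean(residuals))
    r_squared = float(np.corrcoef(y, p)[0, 1] ** 2)
    return rmsep, bias, r_squared


# Hypothetical values in mg/mL with a constant +1 mg/mL offset.
rmsep, bias, r2 = prediction_statistics([0, 50, 100, 150], [1, 51, 101, 151])
```

A constant offset, as in the example, shows up in both RMSEP and bias while leaving the correlation-based R² at 1, which is why the three statistics are reported together.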
Note that the spectra shown in Figure 3a exhibit differences
in water band intensities at approximately 3240 cm⁻¹. Because the
water concentration is relatively constant across the samples,
these differences in intensities within the water band indicate
variations in optical path length during data acquisition,
caused by turbidity and occasionally small air bubbles trapped
inside the MarqMetrix FlowCell probe. The inclusion of water
band normalization in the CLS model appropriately corrects for
the path length differences and improves prediction accuracy.
Additionally, no baseline was removed before water band
normalization. Although the data is not shown, the CLS model
demonstrated similar performance with or without baseline
removal (using automatic Whittaker filter and automatic
weighted least squares) before water band normalization.
Figure 2. The regression vector for the CLS model demonstrating
the influence of the approximately 1670 cm-1 Raman shift
in the amide I region, which is assigned to the symmetric C=O
stretching of the β-sheet secondary structure of the mAb.
[Figure 2 plot: Variable/Loadings (x 10⁵) vs. Raman shift (cm-1), 1,580 to 1,740 cm-1; the β-sheet band is annotated.]
Figure 3. (a) Spectral data used for validation of the CLS model.
Also, highlighted are the differences in the intensities of water
band across spectra which substantiate the need of inclusion
of the water band normalization step in the model. (b) Correlation
plot for measured vs. predicted concentration along with
performance statistics shown as inset.
[Figure 3 plots. (a) Raman intensity (a.u., x 10⁴) vs. Raman shift (cm-1), 500 to 3,000 cm-1, with the water band annotated. (b) Predicted concentration (mg/mL) vs. measured concentration (mg/mL), 0 to 160 mg/mL. Inset statistics: RMSEC = 0.74464; RMSEP = 2.2493; Prediction Bias = 0.55971; R² (Pred) = 0.997.]
To validate the performance of the CLS model, the training
and validation data used to develop the CLS model were
combined into a single dataset. The combined data were
fed into the PLS algorithm using the same spectral region
and preprocessing as the CLS model. A one-latent-variable
PLS model was selected based on a leave-one-out
cross-validation (LOOCV) strategy, in which each class was left
out once. The root mean square error of cross-validation
(RMSECV) was calculated, as shown in the inset of Figure 4.
The RMSECV for the PLS model was 2.89 mg/mL, while the
RMSEP for the CLS model was 2.25 mg/mL. The RMSEP of the
CLS model and the RMSECV of the PLS model are not directly
comparable; nonetheless, to a first approximation, these
results indicate that the CLS model performed comparably
to the PLS model.
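The figures of merit quoted in the insets (RMSEC, RMSECV, RMSEP, bias, and R²) share the same arithmetic and differ only in which samples are compared. A minimal sketch follows; the sign convention for bias (predicted minus reference) and the coefficient-of-determination form of R² are assumptions, since the note does not define them, and the example values are hypothetical.

```python
import math

def rmse(reference, predicted):
    """Root mean square error between reference and predicted values."""
    n = len(reference)
    return math.sqrt(sum((p - r) ** 2 for r, p in zip(reference, predicted)) / n)

def bias(reference, predicted):
    """Mean signed error (assumed convention: predicted minus reference)."""
    return sum(p - r for r, p in zip(reference, predicted)) / len(reference)

def r_squared(reference, predicted):
    """Coefficient of determination: 1 - SS_residual / SS_total (assumed form)."""
    mean_ref = sum(reference) / len(reference)
    ss_res = sum((r - p) ** 2 for r, p in zip(reference, predicted))
    ss_tot = sum((r - mean_ref) ** 2 for r in reference)
    return 1.0 - ss_res / ss_tot

# Hypothetical reference and predicted concentrations (mg/mL)
reference = [0.0, 50.0, 100.0, 150.0]
predicted = [1.0, 49.0, 102.0, 148.0]
rmsep = rmse(reference, predicted)
prediction_bias = bias(reference, predicted)
r2_pred = r_squared(reference, predicted)
```

Applied to calibration samples, the same `rmse` function gives RMSEC; applied to held-out cross-validation predictions, it gives RMSECV; applied to an independent test set, it gives RMSEP.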
To further test the model’s performance, scalability, and
transferability, we applied the lab-based PLS and CLS models
to Raman data collected in-line during a UF/DF pilot run. This
run used a different type of mAb and a formulation buffer. The
buffer included tryptophan, histidine, arginine, and sucrose
that were added during diafiltration. We have described the
relevance of this experiment before.³ The predicted protein
concentrations from the CLS and PLS models showed a high
correlation. This is shown in Figure 5 with the orange and
blue traces. The pooled samples (marked by red stars) were
measured using HPLC and UV-Vis spectroscopy. The absolute
prediction errors for both models are shown in Table 1.
The predictions from the CLS model exhibited lower errors
than those from the PLS model. However, this does not mean
that CLS is superior to PLS, because a single dataset (n = 1) is
not a statistically meaningful sample. In this study, the CLS
model was built by selecting the amide I region, which has high
specificity for the mAb and minimal spectral interference. If
the entire spectral region were used, with its complex spectral
overlap, or in cases of an ill-conditioned matrix and
multicollinearity, PLS or other regression models would be
better choices, with a multitude of other advantages.
Additionally, the performance of the PLS model in the above
case might be further improved by using different spectral
regions and other combinations of preprocessing. Here, PLS
serves only as a reference, not as the basis for a rigorous
comparison.
Figure 4. PLS model shown with model statistics in inset. The
training and test dataset used for the CLS model were combined
to develop the PLS model. The RMSECV of the PLS model
calculated using leave-one-out cross validation is close to the
RMSEP for the CLS model, indicating similar model performance.
[Figure 5 plot axes: predicted concentrations (mg/mL) vs. linear index.]
Figure 5. This plot shows the agreement in prediction of protein
concentration from the PLS (orange) and CLS (blue) models for
the UF/DF run.
Table 1. Performance of CLS model.
Reference concentration (mg/mL) | PLS model predictions (mg/mL) | CLS model prediction (mg/mL) | PLS model average abs. % error | CLS model average abs. % error
30.72 | 30.6 ± 0.4 | 30.8 ± 0.1 | 1.12 | 0.36
155.9 | 146 ± 1 | 154 ± 1 | 6.15 | 0.67
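The "average abs. % error" columns can be read as the mean of the per-spectrum absolute percent errors against the reference value. A minimal sketch under that assumption is shown below; the replicate predictions are hypothetical and are not the data behind Table 1, so the result is not expected to reproduce the tabulated values.

```python
def mean_abs_percent_error(reference, predictions):
    """Average of |prediction - reference| / reference, in percent."""
    if reference <= 0:
        raise ValueError("reference concentration must be positive")
    errors = [abs(p - reference) / reference * 100.0 for p in predictions]
    return sum(errors) / len(errors)

# Hypothetical replicate predictions (mg/mL) for a 30.72 mg/mL pooled sample
replicates = [30.6, 30.9, 30.8]
error_pct = mean_abs_percent_error(30.72, replicates)
```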
[Figure 4 plot: predicted concentration (mg/mL) vs. measured concentration (mg/mL), 0 to 150 mg/mL. Inset: 1 latent variable; RMSEC = 2.2415; RMSECV = 2.8879.]
[Figure 5 plot: "Protein concentration predictions in UF/DF"; legend: CLS Model Predictions, PLS Model Predictions; annotated phases: protein concentration, buffer exchange, protein concentration; sampling points marked.]
Conclusion
This study demonstrated an alternative approach, a
single-spectrum-based CLS model, for protein quantification
in downstream bioprocesses. The CLS model exhibited
prediction errors below the broadly accepted tolerance of
5 to 10% for process monitoring. The study also showed that
the single-spectrum-based CLS model scales from lab
to pilot scale and transfers across different monoclonal
antibodies and buffer matrices.
A key factor in the successful implementation of the CLS
model was appropriate region selection. We found that the
unique Raman signature of the amide I region of monoclonal
antibodies has minimal spectral interference from other
constituents commonly used in downstream processes.³,⁵
This allowed us to develop a mAb-specific CLS model using
the Raman intensity of the amide I region, which scales linearly
with the concentration. Identifying similarly unique regions in
other applications can provide a rapid and straightforward
method to build robust and accurate chemometric models. In
addition, augmenting the CLS model with more data, especially
at the upper and lower ends of the concentration range, will
further improve the model.
Another important aspect involves normalizing the spectra
using the O-H symmetric stretching Raman band of water
molecules. Recent literature has also utilized a similar
normalization strategy for developing chemometric models
for upstream bioreactor monitoring.⁴ This strategy appears to
work across all modalities.
Finally, the ability to build the chemometric model using a
minimal dataset not only facilitates the adoption of Raman
technology but also broadens its applicability.
References:
1. Barnes, R. J.; Dhanoa, M. S.; Lister, S. J. Standard Normal Variate Transformation
and De-Trending of Near-Infrared Diffuse Reflectance Spectra. Appl Spectrosc 1989,
43 (5), 772–777. https://doi.org/10.1366/0003702894202201.
2. Advanced Preprocessing: Sample Normalization - Eigenvector Research
Documentation Wiki. https://www.wiki.eigenvector.com/index.php?title=Advanced_Preprocessing:_Sample_Normalization.
3. Nolasco, M.; Pleitt, K.; Khadka, N. Using a Process Raman Analyzer as an In-Line
Tool for Accurate Protein Quantification in Downstream Processes.
4. Pétillot, L.; Pewny, F.; Wolf, M.; Sanchez, C.; Thomas, F.; Sarrazin, J.; Fauland, K.;
Katinger, H.; Javalet, C.; Bonneville, C. Calibration Transfer for Bioprocess Raman
Monitoring Using Kennard Stone Piecewise Direct Standardization and Multivariate
Algorithms. Engineering Reports 2020, 2 (11), e12230. https://doi.org/10.1002/eng2.12230.
5. Nolasco, M.; Pleitt, K.; Khadka, N. Raman-Based Accurate Protein Quantification in a
Matrix That Interferes with UV-Vis Measurement.
6. Palacký, J.; Mojzeš, P.; Bok, J. SVD-Based Method for Intensity Normalization,
Background Correction and Solvent Subtraction in Raman Spectroscopy Exploiting
the Properties of Water Stretching Vibrations. Journal of Raman Spectroscopy 2011,
42 (7), 1528–1539. https://doi.org/10.1002/jrs.2896.
7. Peters, J.; Park, E.; Kalyanaraman, R.; Luczak, A.; Ganesh, V. Protein Secondary
Structure Determination Using Drop Coat Deposition Confocal Raman Spectroscopy.
2016, 31, 31–39.
8. Lackey, H. E.; Sell, R. L.; Nelson, G. L.; Bryan, T. A.; Lines, A. M.; Bryan, S. A.
Practical Guide to Chemometric Analysis of Optical Spectroscopic Data. J. Chem.
Educ. 2023, 100 (7), 2608–2626. https://doi.org/10.1021/acs.jchemed.2c01112.
Learn more at thermofisher.com/marqmetrix
For Research Use Only. Not for use in diagnostic procedures. © 2025 Thermo Fisher Scientific Inc. All rights reserved. Python is a
trademark of Python Software Foundation. SOLO is a trademark of Eigenvector Research. Inc. Manson, WA USA. All other trademarks are the
property of Thermo Fisher Scientific and its subsidiaries unless otherwise specified. MCS-TN1384-EN 4/25
Direct in-line quantification of titer in clarified harvest
using Raman spectroscopy
Application note
Authors
Nimesh Khadka¹, Ph.D.
Lin Zhang¹, Ph.D.
Michelle Nolasco²
¹Analytical Instrument Group, Thermo Fisher Scientific,
Tewksbury, Massachusetts USA
²BioProduction Group, Thermo Fisher Scientific,
St. Louis, Missouri USA
Industry/Application:
Biopharma PAT / Downstream
Products used:
Thermo Scientific™ MarqMetrix™ All-In-One Process Raman
Analyzer, Thermo Scientific™ MarqMetrix™ Performance
BallProbe™ Sampling Optic
Goals:
Eliminating offline analytics with real time titer quantification
using process Raman in a clarified harvest
Key Analytes:
Titer; monoclonal antibody; clarified harvest
Key Benefits:
• Raman analysis streamlines downstream biopharma
processes with cost and time benefits by eliminating
the need for offline chromatographic analysis when
calculating loading volume.
• This methodology paves a path toward automation and
continuous manufacturing by coupling upstream and
downstream processes.
Introduction
Monoclonal antibodies (mAbs) are manufactured in two stages:
upstream and downstream processes, as shown in Figure 1.
In the upstream process, cells are cultured to produce mAbs.
These mAbs are then purified in the downstream process
through a series of chromatography and filtration operations.
Before chromatographic purification, cells and debris are
removed by centrifugation and/or depth filtration in a process
called clarification. The resulting supernatant after clarification,
known as clarified harvest, contains the desired product
(mAb), along with soluble metabolites and waste products.
The clarified harvest is then subjected to further downstream
processing to isolate and purify the target mAb.
Knowing the concentration of the mAb (titer) in the
clarified harvest before loading onto an affinity (capture)
chromatography column (e.g., protein A) is essential for several
reasons. It allows the loading volume to be determined,
ensuring the column is loaded with the right amount of clarified
harvest to make full use of the resin binding capacity. Overloading
the column can lead to incomplete binding of the mAb, product
loss, and, thereby, reduced purification efficiency. Because
Raman spectroscopy provides real-time measurements, it
facilitates continuous manufacturing and automation from the
upstream bioreactor run through the clarification step to
downstream purification, eliminating the need for conventional
laboratory analytics and significantly reducing the time
otherwise required for HPLC to measure titer. In this study, we
demonstrate the capability of process Raman to directly
quantify the titer in the clarified harvest without
sample preparation.
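The loading-volume calculation that a titer measurement enables is simple arithmetic: the mass of mAb the column should receive, divided by the titer. The sketch below illustrates it; the function name, the column volume, and the target loading per litre of resin are hypothetical process parameters that the note does not specify.

```python
def loading_volume_l(titer_g_per_l, resin_volume_l, target_load_g_per_l_resin):
    """Volume of clarified harvest (L) that delivers the target mAb mass.

    target mass (g) = resin volume (L) * target loading (g mAb per L resin)
    loading volume (L) = target mass (g) / titer (g/L)
    """
    if titer_g_per_l <= 0:
        raise ValueError("titer must be positive to compute a loading volume")
    return resin_volume_l * target_load_g_per_l_resin / titer_g_per_l

# Hypothetical example: 2 L column, 30 g/L resin target, Raman-predicted titer 5 g/L
volume = loading_volume_l(titer_g_per_l=5.0,
                          resin_volume_l=2.0,
                          target_load_g_per_l_resin=30.0)
```

An error in the titer propagates inversely into the loading volume, which is why an underestimated titer risks overloading the column.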
Experimental details
Data collection
Calibration samples with known mAb concentration were
obtained from multiple bioreactor conditions. The titer in
clarified samples ranged between 0 and 9 g/L. The samples
were prepared by mixing titer-free clarified harvest (filtrate
collected during the protein A affinity column step) with
purified mAb to create training samples of different
concentrations using a design of experiment (DoE) approach
based on the Uniform Design (UD) algorithm.¹ Each sample was
scanned with a
Thermo Scientific™ MarqMetrix™ Performance BallProbe™
Sampling Optic integrated with the Thermo Scientific™
MarqMetrix™ All-In-One Process Raman Analyzer. The
acquisition parameters were set to a power of 450 mW, an
integration time of 5000 ms, and an average of 10 spectra,
resulting in one spectrum every 2 minutes.
Chemometric model development
The spectral region 775 to 1920 cm-1 was selected to
develop the partial least squares (PLS) model for titer
quantification (Figure 2A). The baseline was removed from each
spectrum using an automatic Whittaker filter with the asymmetry
and lambda parameters set to 0.001 and 1000, respectively.
Each spectrum was then normalized using the L1 norm
calculated for the region 1590 to 1655 cm-1. To improve
model performance, the first derivative of each spectrum was
taken using a Savitzky-Golay filter (1st derivative; order = 2;
window width = 9), followed by mean centering.
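The preprocessing chain described above can be sketched with NumPy alone. This is an approximation under stated assumptions, not the SOLO implementation: a generic asymmetric least squares smoother stands in for the automatic Whittaker filter, the Savitzky-Golay derivative is expressed per data point rather than per cm-1, the convolution zero-pads the spectrum edges, and all function names are hypothetical.

```python
import numpy as np

def asls_baseline(y, lam=1000.0, p=0.001, n_iter=10):
    """Asymmetric least squares (Whittaker-type) baseline for one spectrum.
    lam is the smoothness penalty and p the asymmetry weight."""
    n = y.size
    D = np.diff(np.eye(n), n=2, axis=0)       # second-difference operator
    penalty = lam * D.T @ D
    w = np.ones(n)
    for _ in range(n_iter):
        z = np.linalg.solve(np.diag(w) + penalty, w * y)
        w = np.where(y > z, p, 1.0 - p)       # points above the fit get low weight
    return z

def l1_normalize(spectra, shifts, region=(1590.0, 1655.0)):
    """Divide each spectrum by the L1 norm of the chosen region."""
    mask = (shifts >= region[0]) & (shifts <= region[1])
    return spectra / np.abs(spectra[:, mask]).sum(axis=1, keepdims=True)

def savgol_first_derivative(spectra, window=9, order=2):
    """Savitzky-Golay first derivative (per point); edges are zero-padded."""
    half = window // 2
    offsets = np.arange(-half, half + 1, dtype=float)
    design = np.vander(offsets, order + 1, increasing=True)
    coeffs = np.linalg.pinv(design)[1]        # weights giving the fitted slope
    return np.apply_along_axis(
        lambda row: np.convolve(row, coeffs[::-1], mode="same"), 1, spectra)

def preprocess(spectra, shifts):
    """Region selection, baseline removal, L1 normalization,
    first derivative, and mean centering, in that order."""
    keep = (shifts >= 775.0) & (shifts <= 1920.0)
    x, s = spectra[:, keep], shifts[keep]
    x = np.array([row - asls_baseline(row) for row in x])
    x = l1_normalize(x, s)
    x = savgol_first_derivative(x)
    return x - x.mean(axis=0), s
```

In a deployed model the column means computed on the training set would be stored and reused for new spectra, rather than recomputed as in this sketch.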
Figure 1. The workflow for the manufacturing of monoclonal antibodies. Also highlighted is the clarification step by the blue dotted box
where in-line process Raman was used for real time titer concentration measurement.
[Figure 1 diagram: upstream process (monoclonal antibody production: cells, bioreactor) followed by downstream processes (monoclonal antibody purification and formulation: centrifugation → depth filtration, capture chromatography, viral inactivation, viral filtration, intermediate and polishing chromatography, ultrafiltration/diafiltration, bulk fill, fill and finish). Raman-based protein concentration measurement is marked on the diagram.]
[Figure 2 plots, panels A and D. A: Raman intensity (a.u., x10⁴) vs. Raman shift (cm-1), 800 to 1,800 cm-1. D: predicted vs. measured titer concentration (mg/mL), 0 to 10 mg/mL, with calibration and test points, fit line (slope = 0.9985), and 1:1 line. Inset: 5 latent variables; RMSEC = 0.11456; RMSECV = 0.23534; RMSEP = 0.3575; Calibration Bias = -4.4409e-16; CV Bias = -0.048638; Prediction Bias = -0.12852; R² (Cal, CV) = 0.998, 0.994; R² (Pred) = 0.978.]
Figure 2. The spectral region used to develop the titer PLS model is shown in plot A. The selectivity ratio (SR) plot is shown in B and
indicates the Amide I spectral region (1650 to 1700 cm-1) has strong influence in the model performance and specificity. Plot C and D
show the predictive performance of the model for the DoE samples and different batches of clarified harvest, respectively. The gray
filled circle are training data and red diamond are test data.
[Figure 2 plots, panels B and C. B: selectivity ratio for titer prediction vs. Raman shift (cm-1), 800 to 1,800 cm-1. C: predicted vs. measured titer concentration (mg/mL), 0 to 10 mg/mL.]
The number of latent variables (LVs) for the PLS model was selected using leave-one-out
cross-validation (LOOCV). During this process, each unique concentration block
was left out of model training and used for prediction exactly once, and all replicates
for a given concentration were treated as a single block to prevent data leakage. The
optimal number of LVs was determined by minimizing the root mean square errors of
calibration and cross-validation while keeping their ratio close to 1.
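The block-wise cross-validation described above can be sketched as follows. The PLS1 routine is a plain NIPALS implementation written for illustration rather than the SOLO algorithm, and the function names are hypothetical. The key point is that all replicates sharing a concentration are held out together.

```python
import numpy as np

def pls1_fit(X, y, n_lv):
    """NIPALS PLS1; returns (x_mean, y_mean, regression vector)."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xr, yr = X - x_mean, y - y_mean
    W, P, q = [], [], []
    for _ in range(n_lv):
        w = Xr.T @ yr
        w /= np.linalg.norm(w)
        t = Xr @ w
        tt = t @ t
        p = Xr.T @ t / tt
        q_a = yr @ t / tt
        Xr = Xr - np.outer(t, p)          # deflate X
        yr = yr - q_a * t                 # deflate y
        W.append(w)
        P.append(p)
        q.append(q_a)
    W, P, q = np.column_stack(W), np.column_stack(P), np.array(q)
    coef = W @ np.linalg.solve(P.T @ W, q)
    return x_mean, y_mean, coef

def pls1_predict(model, X):
    x_mean, y_mean, coef = model
    return (X - x_mean) @ coef + y_mean

def rmse(a, b):
    return float(np.sqrt(np.mean((np.asarray(a) - np.asarray(b)) ** 2)))

def blockwise_cv(X, y, groups, max_lv):
    """Hold out each block once; return (n_lv, RMSEC, RMSECV) per LV count."""
    results = []
    for n_lv in range(1, max_lv + 1):
        cv_pred = np.empty_like(y, dtype=float)
        for g in np.unique(groups):
            test = groups == g            # every replicate of this block
            model = pls1_fit(X[~test], y[~test], n_lv)
            cv_pred[test] = pls1_predict(model, X[test])
        cal_pred = pls1_predict(pls1_fit(X, y, n_lv), X)
        results.append((n_lv, rmse(y, cal_pred), rmse(y, cv_pred)))
    return results
```

Passing the concentration values themselves as `groups` reproduces the rule that replicates of one concentration form a single block; the number of LVs would then be chosen from the returned RMSEC and RMSECV values according to the criterion stated above.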
The developed PLS model for the clarified harvest titer was evaluated using a
validation set prepared with the DoE approach and by applying it to Raman data
acquired from samples collected from different clarified harvests.
All data management, cosmic ray removal, averaging, and timestamp alignment were
performed in Python. All chemometric work was performed using the software package
SOLO 9.3.1 (2024), Eigenvector Research, Inc., Manson, WA USA 98831.
[Figure 2, panel C inset: 5 latent variables; RMSEC = 0.11456; RMSECV = 0.23534; RMSEP = 0.19075; Calibration Bias = -4.4409e-16; CV Bias = -0.048638; Prediction Bias = 0.057436; R² (Cal, CV) = 0.998, 0.994; R² (Pred) = 0.996; fit line (slope = 1.0070) and 1:1 line; calibration and test points.]
Results and discussion
The PLS model was developed using five latent variables
(LVs). The root mean square error of calibration (RMSEC) and
cross-validation (RMSECV) were 0.114 mg/mL and 0.235 mg/
mL, respectively. The specificity of the model was evaluated
by calculating the selectivity ratio (SR), as shown in Figure
2B. The SR is the ratio of the explained variance to residual
variance for each Raman shift.² A higher SR for a given
Raman shift indicates its greater importance for the model,
forming the basis for model specificity. The Raman shifts
between 1650 and 1700 cm-1 for the developed PLS model had
high SR. This region, known as the Amide I region, primarily
includes the Raman signature associated with the symmetrical
stretching of the carbonyl group of the amide (peptide) linkage.³
Depending on the location of the carbonyl group in different
secondary structures, they experience different electronic
environments and thus have different energies associated with
the symmetrical stretching of the carbonyl group. The Amide
I region provides molecular information on the secondary
structure of the protein, where its total area is proportional
to the total amount of carbonyl functional groups present in
the protein, and its features or peak positions depend on the
presence of different secondary structures. The titer (mAb) is
a globular protein, and its secondary structure is dominated
by β-sheet structure. As shown in Figure 2B, the SR
at ~1670 cm-1 is the most prominent; this can be assigned to
the symmetrical stretching of the carbonyl group in the β-sheet
structure. Thus, the Amide I region contributes to the specificity
for titer quantification in the clarified harvest. Similarly, the CH
deformation (~1440 cm-1), Amide III region (~1230 cm-1;
symmetric C-N stretching (ν(C-N)) and N-H bending (δ(N-H))),
symmetric C-C stretch (~1130 cm-1), phenylalanine ring
breathing mode (~1005 cm-1), and tyrosine doublet (~830
and 850 cm-1), which arises from Fermi resonance between the
in-plane breathing mode of the phenol ring and an overtone of
the out-of-plane deformation mode, are other Raman features of
titer that are influential in the model.⁴ All these features collectively
provide specificity for the model to quantify titer against the
matrix of host cell proteins (HCPs), metabolites, and other
waste products. The model performance when applied to the
independent validation sets is shown in Figures 2C and 2D.
Initially, the model was applied to the validation set samples
prepared using the Uniform Design. The root mean square
error of prediction (RMSEP) was 0.19 mg/mL across the
concentration range of 0 to 9 mg/mL. Similarly, when the model
was applied to different batches of clarified harvest samples,
the average RMSEP was 0.36 mg/mL. The offline analysis on
the clarified harvest samples revealed that different batches
of clarified harvest had varying matrices, including differences
in the concentration and composition of HCPs, metabolites,
and other molecules. After preprocessing, overlaying, and
color-coding the training and clarified harvest datasets with
titer concentration, a clear correlation was observed between
Raman intensity and concentration in the spectral regions
around 1670 cm-1, 1440 cm-1, 1005 cm-1, 830 cm-1, and 850
cm-1 as shown in Figure 3. However, other spectral regions
exhibited strong interference from the Raman signatures
of the matrices, likely from Raman signals from the HCPs,
resulting in a lack of correlation between Raman intensity and
titer concentration in these spectral regions. These findings
validated the model’s specificity, as shown in the selectivity
ratio plot (Figure 2B), and provided insight into the value of
multivariate chemometrics in extracting useful information from
complex spectra. Since HCPs vary between batches, it is
recommended to augment the titer model with process data from
multiple batches of clarified harvest. This approach could
further optimize the model by capturing process variations,
thereby improving performance and lowering the RMSEP.
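The selectivity ratio used in Figure 2B, the ratio of explained to residual variance at each Raman shift, can be sketched with the target projection approach. The function name and inputs are hypothetical: `X` is assumed to be the mean-centred, preprocessed spectra and `coef` the regression vector of a fitted model. This is an illustrative sketch rather than the SOLO calculation.

```python
import numpy as np

def selectivity_ratio(X, coef):
    """Selectivity ratio per variable via target projection.

    X    : (n_samples, n_variables) mean-centred spectra
    coef : (n_variables,) regression vector of the fitted model
    """
    w_tp = coef / np.linalg.norm(coef)     # normalized regression vector
    t_tp = X @ w_tp                        # target-projected scores
    p_tp = X.T @ t_tp / (t_tp @ t_tp)      # target-projected loadings
    X_tp = np.outer(t_tp, p_tp)            # part of X explained by the target
    residual = X - X_tp
    explained_var = (X_tp ** 2).sum(axis=0)
    residual_var = (residual ** 2).sum(axis=0)
    return explained_var / residual_var
```

Variables whose intensity follows the predicted response closely leave little residual variance and therefore receive a high ratio, which is why the amide I band stands out in the plot. With noise-free data the residual variance can reach zero, so a small guard term may be needed in practice.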
Figure 3. Spectral overlay of the preprocessed training and test
datasets (different batches of clarified harvest). The spectra are
color-coded by titer concentrations, as indicated by the vertical
bar. The red arrow points to the spectral regions that visually
correlate with titer concentrations. Uncorrelated spectral regions
are mainly due to Raman signals from host cell proteins.
[Figure 3 plot: preprocessed Raman intensity (a.u., x10⁴) vs. Raman shift (cm-1), 800 to 1,600 cm-1; color bar for titer concentration, 1 to 9.]
Conclusion
In this study, we demonstrated real-time quantification of titer in the complex
matrix of clarified harvest without any sample preparation. The results indicate
that by implementing in-line process Raman in the workflow, users can eliminate
the need for conventional offline HPLC analysis to quantify titer in clarified harvest.
Thus, users can directly proceed to the downstream purification step by reliably
calculating the loading volumes for purification columns. This result not only
demonstrates the capability of Raman spectroscopy for complex mixture analysis
by leveraging unique molecular Raman signatures, but also provides a practical
solution with time and cost benefits to the user. We and others have previously
demonstrated Raman as a reliable tool for monitoring and feedback control of
upstream and downstream processes.⁵–⁹ The results shown here directly bridge our
previous works by coupling upstream with downstream processes and establish
process Raman as a single sensor with wide applications for biomanufacturing, as
well as achieving automated continuous manufacturing.
References
1. Zhang, L.; Liang, Y.-Z.; Jiang, J.-H.; Yu, R.-Q.; Fang, K.-T. Uniform Design Applied to Nonlinear Multivariate
Calibration by ANN. Anal. Chim. Acta 1998, 370 (1), 65–77. https://doi.org/10.1016/S0003-2670(98)00256-6.
2. Kvalheim, O. M. Variable Importance: Comparison of Selectivity Ratio and Significance Multivariate Correlation for
Interpretation of Latent-Variable Regression Models. J. Chemom. 2020, 34 (4), e3211. https://doi.org/10.1002/cem.3211.
3. Peters, J.; Park, E.; Kalyanaraman, R.; Luczak, A.; Ganesh, V. Protein Secondary Structure Determination Using
Drop Coat Deposition Confocal Raman Spectroscopy. 2016, 31, 31–39.
4. Dolui, S.; Mondal, A.; Roy, A.; Pal, U.; Das, S.; Saha, A.; Maiti, N. C. Order, Disorder, and Reorder State of
Lysozyme: Aggregation Mechanism by Raman Spectroscopy. J. Phys. Chem. B 2020, 124 (1), 50–60. https://doi.org/10.1021/acs.jpcb.9b09139.
5. Abu-Absi, N. R.; Kenty, B. M.; Cuellar, M. E.; Borys, M. C.; Sakhamuri, S.; Strachan, D. J.; Hausladen, M. C.; Li,
Z. J. Real Time Monitoring of Multiple Parameters in Mammalian Cell Culture Bioreactors Using an In-Line Raman
Spectroscopy Probe. Biotechnol. Bioeng. 2011, 108 (5), 1215–1221. https://doi.org/10.1002/bit.23023.
6. Villa, J.; Zustiak, M.; Kuntz, D.; Zhang, L.; Woods, S.; Scientific, F. Real Time Metabolite Monitoring Using the
MarqMetrix All-In-One Process Raman Analyzer and the 500L HyPerforma Dynadrive Single-Use Bioreactor
(S.U.B.).
7. Villa, J.; Zustiak, M.; Kuntz, D.; Zhang, L.; Khadka, N.; Broadbelt, K.; Woods, S. Use of Lykos and TruBio Software
Programs for Automated Feedback Control to Monitor and Maintain Glucose Concentrations in Real Time.
8. Nolasco, M.; Pleitt, K.; Khadka, N. Using a Process Raman Analyzer as an In-Line Tool for Accurate Protein
Quantification in Downstream Processes.
9. Nolasco, M.; Pleitt, K.; Khadka, N. Raman-Based Accurate Protein Quantification in a Matrix That Interferes with
UV-Vis Measurement.
Learn more at thermofisher.com
For research use only. Not for use in diagnostic procedures. For current certifications, visit thermofisher.com/certifications
© 2025 Thermo Fisher Scientific Inc. All rights reserved. All trademarks are the property of Thermo Fisher Scientific
and its subsidiaries unless otherwise specified. MCS-AN1439-EN 06/25
Raman-based Accurate Protein Quantification in a
Matrix that Interferes with UV-Vis Measurement
Application note
Authors
Michelle Nolasco¹, Kristina Pleitt¹,
Nimesh Khadka²
¹BioProduction Group,
Thermo Fisher Scientific,
St. Louis, Missouri USA
²Analytical Instrument Group,
Thermo Fisher Scientific,
Tewksbury, Massachusetts, USA
Significance
Ultraviolet-Visible spectroscopy (UV-Vis) is widely used for accurate quantification
of purified protein concentrations. When the matrix strongly
absorbs photons at the key wavelength (280 nm), UV-Vis quantifies the protein
concentration inaccurately. The issue is further complicated if the matrix
is dynamic, as this adds potential error to the mathematical operations applied to remove
the background information. This work presents a case study of an ultrafiltration/
diafiltration (UF/DF) experiment with a matrix containing tryptophan. As shown in
Figure 1, both in-line UV-Vis and in-line Raman were used to monitor the process.
Our findings demonstrate that Raman-based estimation of protein concentration
proves to be more accurate and reliable when matrix interference is present. These
results serve as a paradigm, reinforcing that Raman spectroscopy, based on specific
molecular signatures, is particularly suitable for monitoring complex biological
processes characterized by multi-component interactions and dynamic changes.
[Figure 1 diagram: UF/DF loop with buffer, pumps, membrane, retentate, and permeate outlet; in-line Raman yields an accurate protein concentration.]
Figure 1. Advantage of Raman based estimation of protein concentration in a UF/DF
run in presence of interfering background due to matrix which varies with progress of
the process.
[Figure 1 diagram, continued: in-line UV-Vis yields an inaccurate protein concentration because the matrix absorbs 280 nm light.]
Introduction
The industrial biomanufacturing of therapeutic monoclonal
antibodies (mAbs) involves upstream and downstream
processes,¹,² as depicted in Figure 2. In the upstream process,
cells are cultivated in bioreactors to produce the mAb. The
mAb then undergoes a series of downstream operations,
including clarification, chromatography, filtration,
viral inactivation, viral filtration, ultrafiltration/diafiltration (UF/DF),
final filtration, and fill and finish steps. These operations are
collectively categorized as downstream processing and are
crucial for ensuring the purity, stability, and efficacy of the mAb.
Accurate quantification of the protein at different stages is
important for downstream processes.³,⁴ It enables real-time
decision-making. For example, accurate quantification of the
target protein after the clarification step provides actionable results
such as yield estimation, GO/NO GO decisions (e.g., discontinuing
a batch if the product yield falls below the threshold to save costs,
time, and resources), and selection of the appropriate number
of chromatography cycles required to maintain the process
within expected resin loadings. Accurate protein quantification
is essential during UF/DF to monitor and control the protein
concentration to a predefined target. Similarly, precise protein
quantification in fill and finish products ensures correct dosages
are delivered to patients. Thus, accurate protein quantification is
a critical process parameter for downstream processing.
The widely accepted standard method for estimating total protein
concentration is at-line or in-line UV-Vis absorption spectroscopy
using a light source at approximately 280 nm.⁵,⁶ This method relies
on the absorption of aromatic amino acid residues (tyrosine and
tryptophan) and disulfide bonds, which act as chromophores
contributing to the UV-Vis absorption.⁷ In UV-Vis spectroscopy,
the absorbance is linearly correlated with concentration as defined
by the Beer-Lambert Law, in which the concentration equals the
absorbance divided by the product of the path length and the
molar extinction coefficient.
In cases where the matrix absorbs photons at 280 nm,
the total absorbance (matrix + mAb) will be higher than the
absorbance of the pure mAb for a given concentration, resulting
in overprediction. If the absorbance of the matrix is low and
constant, a simple correction factor can be calculated and
subtracted from the total absorbance to obtain an accurate mAb
concentration. However, estimating the correction factor can be
challenging when the matrix is dominant and dynamic.
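The overprediction described above follows directly from the Beer-Lambert relation. The sketch below illustrates it; the extinction coefficient, path length, and absorbance values are hypothetical, chosen only to show the effect and the constant-background correction.

```python
def concentration_from_absorbance(absorbance, path_length_cm, ext_coeff):
    """Beer-Lambert law: concentration = absorbance / (path length * extinction coefficient)."""
    return absorbance / (path_length_cm * ext_coeff)

# Hypothetical values for illustration only
EXT_COEFF = 1.4        # mL/(mg*cm), assumed mass extinction coefficient of the mAb
PATH_LENGTH = 0.1      # cm, assumed flow-cell path length
true_conc = 20.0       # mg/mL

a_mab = true_conc * PATH_LENGTH * EXT_COEFF   # absorbance from the mAb alone
a_matrix = 0.5                                # extra absorbance from the matrix at 280 nm

# Without correction, the matrix absorbance is attributed to the mAb
apparent = concentration_from_absorbance(a_mab + a_matrix, PATH_LENGTH, EXT_COEFF)

# With a known, constant matrix absorbance, subtraction recovers the true value
corrected = concentration_from_absorbance(a_mab + a_matrix - a_matrix,
                                          PATH_LENGTH, EXT_COEFF)
```

When the matrix absorbance changes during the run, as in a diafiltration step, `a_matrix` is no longer a single known number and the subtraction becomes unreliable.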
One example of complex matrix interference in downstream
processing is the clarified harvest. The clarified harvest pool
contains interfering components of various concentrations from
the upstream process, such as host cell proteins, amino acids,
and other biomolecules. This prevents the use of
UV-Vis to estimate mAb concentration. Another example of
matrix interference is diafiltration buffer excipients that absorb
UV-Vis photons, leading to overestimating protein concentration
during UF/DF. In both these examples, the mAb is first purified using
affinity chromatography (Protein A). Then, the concentration is
determined with an at-line UV-Vis instrument
(i.e., an HPLC titer assay). These additional sample preparation
and analytical steps increase the cost and the process time, from a
couple of hours to potentially days, depending on the workload.
An alternative technology that can be used instead of UV-Vis for
downstream monitoring, particularly in situations where there
is a matrix effect, is process Raman. Raman spectroscopy is
an optical technique that involves measuring the vibrational
modes of molecules by observing the inelastic scattering of
photons (known as Raman scattering) after interacting with a
light wave.⁸ It is highly specific for the target molecules and, thus,
widely used for identification and quantification. In this study, we
present a case study where we used process Raman and in-line
UV-Vis to monitor a UF/DF run with a UV-Vis interfering matrix
and demonstrate the advantage of Raman for such applications.
[Figure 2 diagram: upstream process (monoclonal antibody production: cells, bioreactor) followed by downstream processes (monoclonal antibody purification and formulation: centrifugation → depth filtration, capture chromatography, viral inactivation, sterile filtration, intermediate and polishing chromatography, ultrafiltration/diafiltration, bulk fill, fill and finish).]
Figure 2. Upstream and downstream processes for monoclonal antibody production.
Beer-Lambert Law:
$$\text{Concentration} = \frac{\text{Absorbance}}{\text{path length} \times \text{molar extinction coefficient}}$$
Experimental details
Calibration model development
The protein quantification partial least squares (PLS) models
for Raman were developed using the spectra collected on a
single monoclonal antibody (concentration range
0 to 157 mg/mL, measured at-line using a UV-Vis instrument) in
various buffer and excipient-containing solutions. The samples
were passed through the flowcell probe integrated with
Thermo Scientific™ MarqMetrix™ All-In-One Process Raman
Analyzer at a 100 mL/min flow rate. The Raman spectra were
acquired using a 785 nm laser with the following acquisition
parameters: laser power 450 mW, integration time 3,000 ms,
an average of 3 (i.e., a single spectrum per 18 s), and ten
replicates per concentration.
As preprocessing steps, each spectrum was normalized by
dividing it by the infinity norm, which was calculated using the
water band of each spectrum using region 3,098 to 3,230 cm-1.
The normalized spectra were further processed with the
SavGol filter (1st derivative, order = 2, window width = 13) and
mean centering. The PLS model was developed using the
spectral region 1,600 to 1,750 cm-1 and water band region
3,098 to 3,230 cm-1. The Raman spectral region 1,600 to
1,750 cm-1 of mAb is assigned to the vibration of the carbonyl
group in the amide bond (-CO-NH-) located at different
secondary structures like α-helix, β-sheet, turns, and random
coils.⁹ Thus, the model is termed the “Amide I model.” Another
PLS model was developed using the same preprocessing but
utilizing the spectral region 800 to 1,750 cm-1 and water band
region 3,098 to 3,230 cm-1, which we termed the “Extended
Region model.” The models were internally validated using
a leave-one-out cross-validation strategy. The variable
importance in projection (VIP) scores were calculated for each
model to assess the importance of each Raman shift for
the development of the model.¹⁰ This also allows us to validate
the statistical model with chemical information. The details of the
strategies for model development, validation, and VIP analysis
were discussed previously. All chemometric work was
performed using SOLO 9.3.1 (2024), Eigenvector Research, Inc.,
Manson, WA USA 98831.
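The VIP scores mentioned above can be sketched from the weights, scores, and y-loadings of a PLS1 model. The NIPALS routine below is a plain illustrative implementation rather than the SOLO algorithm, and the function names are hypothetical. The VIP expression is the commonly used one, in which each variable's squared normalized weight is weighted by the y-variance explained by that latent variable.

```python
import numpy as np

def pls1_components(X, y, n_lv):
    """NIPALS PLS1 on mean-centred copies; returns weights W, scores T, y-loadings q."""
    Xr = X - X.mean(axis=0)
    yr = y - y.mean()
    W, T, q = [], [], []
    for _ in range(n_lv):
        w = Xr.T @ yr
        w /= np.linalg.norm(w)
        t = Xr @ w
        tt = t @ t
        p = Xr.T @ t / tt
        q_a = yr @ t / tt
        Xr = Xr - np.outer(t, p)          # deflate X
        yr = yr - q_a * t                 # deflate y
        W.append(w)
        T.append(t)
        q.append(q_a)
    return np.column_stack(W), np.column_stack(T), np.array(q)

def vip_scores(W, T, q):
    """Variable importance in projection for each variable (Raman shift)."""
    n_vars = W.shape[0]
    ss = q ** 2 * (T ** 2).sum(axis=0)            # y-variance explained per LV
    w_norm = W / np.linalg.norm(W, axis=0)        # unit-length weight vectors
    return np.sqrt(n_vars * ((w_norm ** 2) @ ss) / ss.sum())
```

By construction the mean of the squared VIP scores is 1, so Raman shifts with VIP above 1 are often read as contributing more than average to the model.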
Tryptophan as an excipient in the ultrafiltration/
diafiltration (UF/DF)
Ultrafiltration/diafiltration (UF/DF) is a common unit operation in
downstream bioprocessing that uses a tangential flow filtration
membrane of known pore size. It is used to exchange the buffer and
concentrate the desired biomolecules to prepare the product for
final formulation. In this study, we started the experiment with
~10 mg/mL mAb in tris buffer and concentrated it to ~23 mg/mL in
tris buffer. At this point, we initiated diafiltration to exchange
the tris buffer with the excipient buffer. One of the components of
the excipient buffer was tryptophan (20 mM). The mAb concentration
throughout the UF/DF run was monitored using in-line process Raman
and in-line UV-Vis spectrometers, as shown in Figure 1. The
reference mAb concentration at the different stages of the
UF/DF run was quantified by analytical protein A titer assay.
The in-line Raman predicted mAb concentrations and the at-line
HPLC-titer measured mAb concentrations were used to
calculate the root mean square error of prediction (RMSEP).
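For reference, RMSEP and the absolute percent prediction error used in the Results can be computed as below. This is a minimal sketch; the function names are illustrative and the numbers are hypothetical, not data from this study.

```python
import numpy as np

def rmsep(predicted, reference):
    """Root mean square error of prediction, in the units of the inputs."""
    predicted = np.asarray(predicted, dtype=float)
    reference = np.asarray(reference, dtype=float)
    return float(np.sqrt(np.mean((predicted - reference) ** 2)))

def absolute_percent_error(predicted, reference):
    """Absolute prediction error as a percentage of the reference value."""
    predicted = np.asarray(predicted, dtype=float)
    reference = np.asarray(reference, dtype=float)
    return 100.0 * np.abs(predicted - reference) / np.abs(reference)

# Hypothetical example values (mg/mL), for illustration only.
reference = np.array([10.0, 20.0, 30.0])
predicted = np.array([10.5, 19.0, 30.0])

print(rmsep(predicted, reference))
print(absolute_percent_error(predicted, reference))
```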
Test of model transferability
The Amide I and Extended Region calibration models were
developed without incorporating any training dataset on
tryptophan. Instead, the training dataset for both models
consisted of mAb (IgG4) samples in various buffers, including
tris, histidine, arginine, sucrose, and polysorbate. As a result,
the predictive performance of these models also assesses their
transferability across different processes.
Results
The results of the PLS calibration models for predicting protein concentration
based on Raman data are shown in Figure 3. The details of the strategy for model
development, validation, and the VIP plot have been discussed in previous work. Two
latent variables were selected to build each of the Amide I and Extended Region PLS
models. The root mean square error of cross-validation (RMSECV) of between
approximately 0.63 and 0.73 mg/mL for the two models (Figure 3A and 3C insets) across
the training concentration range of 0 to 157 mg/mL mAb indicates that both models
have high accuracy. The R² for cross-validation being close to 1 for both models
indicates that the models explain the variance in the training data related to mAb
very well. The VIP score plot shows the importance of each Raman shift for the model.
Here, a VIP score of one is defined as the threshold (red dotted line in Figure 3B
and 3D). Any Raman shift with a VIP score greater than one is considered important
for the model, and the higher the score, the more important the Raman shift is.
For the Amide I model, the Raman shift at ~1,670 cm-1 is influential (Figure 3B).
In the Extended Region model, by contrast, the amide region (~1,670 cm-1), CH
deformation (~1,450 cm-1), the breathing mode of phenylalanine (~1,005 cm-1), and the
Fermi doublet of tyrosine (~830-850 cm-1) are dominant (Figure 3D).
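The VIP threshold of one follows from how the scores are scaled. The sketch below assumes the commonly used VIP definition for a single-response PLS model (as in reference 10); the exact implementation in the chemometrics software may differ. The argument names and the random arrays are hypothetical stand-ins for the X-weights W, scores T, and y-loadings q of a fitted model.

```python
import numpy as np

def vip_scores(weights, scores, y_loadings):
    """VIP per variable for a single-response PLS model.

    weights:    (n_variables, n_latent)  X-weights W
    scores:     (n_samples,  n_latent)   X-scores T
    y_loadings: (n_latent,)              y-loadings q
    """
    weights = np.asarray(weights, dtype=float)
    scores = np.asarray(scores, dtype=float)
    y_loadings = np.asarray(y_loadings, dtype=float)
    n_variables = weights.shape[0]
    # y-variance captured by each latent variable: q_a^2 * (t_a . t_a)
    explained = y_loadings ** 2 * np.sum(scores ** 2, axis=0)
    # Squared weights, normalized so each latent variable sums to one.
    normalized_sq = (weights / np.linalg.norm(weights, axis=0)) ** 2
    return np.sqrt(n_variables * (normalized_sq @ explained) / explained.sum())

# Hypothetical model matrices for illustration (2 latent variables).
rng = np.random.default_rng(1)
W = rng.standard_normal((200, 2))
T = rng.standard_normal((30, 2))
q = np.array([1.5, 0.4])

vip = vip_scores(W, T, q)
important = vip > 1.0  # Raman shifts above the threshold
```

With this scaling the mean of the squared VIP scores equals one, which is why one serves as the cutoff between above-average and below-average contribution.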
Figure 3. Plots A and B show the Amide I calibration model and its VIP score plot, respectively. Similarly, plots C and D show the
Extended Region calibration model and its VIP score plot. The model statistics are shown in the inset of plots A and C.
Figure 3 panels: A, Amide I Model, CV predicted vs. measured titer concentration (mg/mL); B, VIP scores for Amide I Model vs. Raman shift (cm-1); C, Extended Region Model, CV predicted vs. measured titer concentration (mg/mL); D, VIP scores for Extended Region Model vs. Raman shift (cm-1).
Figure 3 inset statistics:
2 Latent Variables, RMSEC = 0.49881, RMSECV = 0.6318, Calibration Bias = 0, CV Bias = NaN, R² (Cal, CV) = 1.000, 1.000
2 Latent Variables, RMSEC = 0.75816, RMSECV = 0.72856, Calibration Bias = 3.5527e-15, CV Bias = NaN, R² (Cal, CV) = 0.999, 1.000
The results for the UF/DF run are shown in Figure 4. The
process was initiated with 10 mg/mL mAb in tris buffer and
concentrated to 23 mg/mL in tris buffer. In this concentration
step, the prediction of mAb concentration from in-line UV-Vis
and in-line Raman (for both models) showed excellent agreement
with an overall absolute prediction error of less than 5%. In
the diafiltration step, the mAb in the tris buffer was exchanged
with the buffer containing tryptophan. As the tryptophan was
introduced into the system, the prediction of mAb concentration
from in-line UV-Vis had an absolute prediction error of 20 to
70% compared to the HPLC-titer (Figure 4). In contrast, the
real-time predictions from in-line process Raman using the Amide I
and Extended Region models were more accurate. The absolute
prediction error for the Amide I model was below 5%, while for
the Extended Region model, it was below 10% (Figure 5).
Why are Raman predictions accurate?
The accuracy of Raman predictions can be explained based on
molecular specificity, which forms the fundamental basis of all
Raman applications. Raman is based on the unique signature
exhibited by molecules with different atomic compositions
or molecular bonds. In this study, the molecular specificity
for mAb is provided by the Raman signature of the Amide I
region that extends from ~1,640 to 1,700 cm-1. Amide bonds
(-CO-NH-) are absent from tryptophan and the other buffer
components in this study. This is evident in the second-derivative
spectra of mAb (~31 mg/mL) in tris buffer (red) and in excipient
buffer with tryptophan (green), shown in Figure 6A. The Amide I
region with and without tryptophan is nearly identical in intensity
(at ~1,670 cm-1) and peak features, indicating no spectral
interference. As shown
in the VIP plot for the Amide I model (Figure 3B), the 1,670 cm-1
region is dominant in the calibration model and crucial in making
predictions. This unique Raman signature of the Amide I region
of mAb explains the accurate transferability of the Amide I model
with an absolute prediction error of less than 5% (Figure 5).
Tryptophan, an amino acid that is also a constituent of the mAb,
has a strong Raman signal that overlaps the protein spectrum
outside the Amide I region, as shown in Figure 6B. The Raman peaks
from tryptophan at ~1,623 cm-1 and ~1,555 cm-1 are assigned to the
stretching vibration of the benzene ring; the peak at ~1,436 cm-1
to the stretching vibration of the pyrrole ring; and the peak at
~882 cm-1 to the skeletal vibration, with a significant contribution
from the pyrrole NH in-plane deformation.¹¹ Despite this overlap,
the prediction errors from the Extended Region model were less than
10% (Figure 5) owing to the molecular specificity provided by
including the Amide I region in the calibration model. The slightly
higher prediction error of the Extended Region model compared to
the Amide I model is mainly due to spectral overlap with tryptophan.
Augmenting the Extended Region model with an appropriate
tryptophan dataset, using a design of experiments approach,
could reduce the prediction error further.
Figure 4. Overlay of mAb predictions from in-line UV-Vis (grey)
and in-line process Raman (blue and orange) in the UF/DF run.
The blue and orange traces represent the Raman predictions
of the mAb concentration using the Amide I and Extended Region
models, respectively. The excipient buffer containing tryptophan
was introduced into the system at ~120 min. The mAb
predictions from in-line UV-Vis and in-line Raman were accurate
and similar (<5% difference) before tryptophan was introduced
(0 to 120 min). After tryptophan was introduced, the prediction
error from in-line UV-Vis was between 20 and 70%.
Figure 5. Absolute error for mAb predictions
from the Amide I and Extended Region models at reference
mAb concentrations of 30.72 and 155.9 mg/mL in the
presence of 20 mM tryptophan. Note that no tryptophan
data were included in the training set for either model.
Figure 4 axes: Concentration (g/L) vs. Time (min); traces: Amide I, Extended Region, UV-Vis.
Figure 5 axes: Absolute Prediction Error % vs. Reference concentration (mg/mL) at 30.72 and 155.9; series: Amide I Prediction Error, Extended Region Prediction Error.
Learn more at thermofisher.com/marqmetrixAIO
For research use only. Not for use in diagnostic procedures. For current certifications, visit thermofisher.com/certifications
© 2024 Thermo Fisher Scientific Inc. All rights reserved. All trademarks are the property of Thermo Fisher Scientific
and its subsidiaries unless otherwise specified. MCS-AN1163-EN 8/24
Conclusion
• We presented a case study demonstrating the advantage
of in-line process Raman for accurately estimating
mAb concentration in downstream processes, specifically
UF/DF, where tryptophan interfered with direct in-line
UV-Vis analysis. Raman spectroscopy overcame
this interference and provided reliable and real-time mAb
concentration predictions.
• Our study showcased the transferability of Raman models
across processes with high accuracy. This transferability
is attributed to the fact that Raman spectroscopy relies
on specific molecular signatures, making it highly specific
for the target analytes. This characteristic of Raman
spectroscopy makes it particularly suitable for monitoring
and controlling complex and dynamically changing
biological processes.
• In downstream processing for mAb production, the analytes
are typically present at high concentrations and purity
levels. This makes Raman spectroscopy an ideal tool
for monitoring and controlling these processes. Not only
can Raman quantify different analytes accurately, it also
provides valuable insights into critical quality attributes.
The unique advantage of process Raman is its ability to
provide real-time and actionable results by measuring
multiple analytes in a single scan. This capability aligns with
the objectives outlined in the FDA guidance for Process
Analytical Technology (PAT). It accelerates progress towards
reducing batch-to-batch variability and ensuring uniformity
in the quality of products.
References
1. Bergemann, K.; Eckermann, C.; Garidel, P.; Grammatikos, S.; Jacobi, A.; Kaufmann,
H.; Kempken, R.; Pisch-Heberle, S. Production and Downstream Processing. In
Handbook of Therapeutic Antibodies; John Wiley & Sons, Ltd, 2007; pp 199–237.
https://doi.org/10.1002/9783527619740.ch9.
2. Gronemeyer, P.; Ditz, R.; Strube, J. Trends in Upstream and Downstream Process
Development for Antibody Manufacturing. Bioengineering 2014, 1 (4), 188–212.
https://doi.org/10.3390/bioengineering1040188.
3. Rolinger, L.; Rüdt, M.; Diehm, J.; Chow-Hubbertz, J.; Heitmann, M.; Schleper, S.;
Hubbuch, J. Multi-Attribute PAT for UF/DF of Proteins—Monitoring Concentration,
Particle Sizes, and Buffer Exchange. Anal. Bioanal. Chem. 2020, 412 (9),
2123–2136. https://doi.org/10.1007/s00216-019-02318-8.
4. Rathore, A. S.; Bhambure, R.; Ghare, V. Process Analytical Technology (PAT)
for Biopharmaceutical Products. Anal. Bioanal. Chem. 2010, 398 (1), 137–154.
https://doi.org/10.1007/s00216-010-3781-x.
5. Brestrich, N.; Rüdt, M.; Büchler, D.; Hubbuch, J. Selective Protein Quantification for
Preparative Chromatography Using Variable Pathlength UV/Vis Spectroscopy and
Partial Least Squares Regression. Chem. Eng. Sci. 2018, 176, 157–164.
https://doi.org/10.1016/j.ces.2017.10.030.
6. McKechnie, W. S.; Tugcu, N.; Kandula, S. Accurate and Rapid Protein Concentration
Measurement of In-Process, High Concentration Protein Pools. Biotechnol. Prog.
2018, 34 (5), 1234–1241. https://doi.org/10.1002/btpr.2695.
7. Biter, A. B.; Pollet, J.; Chen, W.-H.; Strych, U.; Hotez, P. J.; Bottazzi, M. E. A Method
to Probe Protein Structure from UV Absorbance Spectra. Anal. Biochem. 2019, 587,
113450. https://doi.org/10.1016/j.ab.2019.113450.
8. Mulvaney, S. P.; Keating, C. D. Raman Spectroscopy. Anal. Chem. 2000, 72 (12),
145–158. https://doi.org/10.1021/a10000155.
9. Dolui, S.; Mondal, A.; Roy, A.; Pal, U.; Das, S.; Saha, A.; Maiti, N. C. Order, Disorder,
and Reorder State of Lysozyme: Aggregation Mechanism by Raman Spectroscopy.
J. Phys. Chem. B 2020, 124 (1), 50–60. https://doi.org/10.1021/acs.jpcb.9b09139.
10. Chong, I.-G.; Jun, C.-H. Performance of Some Variable Selection Methods When
Multicollinearity Is Present. Chemom. Intell. Lab. Syst. 2005, 78 (1), 103–112.
https://doi.org/10.1016/j.chemolab.2004.12.011.
11. Hirakawa. Characterization of a Few Raman Lines of Tryptophan. Journal of Raman
Spectroscopy 1978.
https://analyticalsciencejournals.onlinelibrary.wiley.com/doi/abs/10.1002/jrs.1250070511 (accessed 2024-06-30).
Figure 6. Spectral overlap of mAb in tris (red) and tryptophan containing buffer (green) after applying SavGol filter (2nd derivative).
The Amide I region is free of spectral overlap, thus providing specificity to the models.
Figure 6 panels: A, Amide I Region, preprocessed Raman spectra vs. Raman shift (cm-1), 1,600 to 1,750 cm-1; B, Extended Region, preprocessed Raman spectra vs. Raman shift (cm-1), 800 to 1,800 cm-1. Legend: mAb in tris buffer; mAb in buffer with tryptophan.
Process Raman as a comprehensive solution for
downstream buffer workflow
Application note
Authors
Michelle Nolasco¹
Andrew Siemers¹
Kristina Pleitt¹, Ph.D.
Nimesh Khadka², Ph.D.
¹BioProduction Group, Thermo Fisher
Scientific, St. Louis, Missouri, USA
²Analytical Instrument Group,
Thermo Fisher Scientific, Tewksbury,
Massachusetts, USA
Industry/Application:
Biopharma PAT / Downstream
Products used:
Thermo Scientific™ MarqMetrix™ All-In-One Process Raman Analyzer, Thermo
Scientific™ MarqMetrix™ FlowCell Sampling Optic, Thermo Scientific™ MarqMetrix™
BallProbe™ Sampling Optic
Goals:
Enabling real-time excipient quantification and quality assessment using process Raman
Key Analytes:
Arginine, Histidine, Sucrose
Key Benefits:
• Enables real-time excipient quantification, with cost and time savings from
eliminating the need for laboratory analytics.
• Provides a platform for making actionable decisions using real-time data rather
than theoretical estimation.
• Demonstrates the potential of process Raman as a PAT tool for automating
ultrafiltration/diafiltration (UF/DF) and other downstream processes through
simultaneous, real-time monitoring and quality assessment, enabling
multimodal feedback control.
Introduction
Raman technology is rapidly gaining interest as a promising Process Analytical
Technology (PAT) solution for real-time, non-invasive monitoring and control of
downstream biopharma processes, especially for therapeutics like monoclonal
antibodies (mAbs) and nucleic acids. Raman measurement, based on the vibration
of molecular bonds, is highly specific for identification and quantification, even in
complex or interfering matrices.
As an in-line PAT tool, Raman spectroscopy offers direct and rapid measurement
in aqueous phases without sample preparation. These features make it ideal for
monitoring and controlling dynamic processes such as downstream processing.
This study demonstrates a real-time methodology for accurately quantifying
formulation excipients in the dynamic ultrafiltration/diafiltration (UF/DF) process using
the Thermo Scientific™ MarqMetrix™ All-In-One Process Raman Analyzer (Figure 1). In
addition, this study also illustrates a case where process Raman was able to provide
real-time information on buffer quality.
To generalize the excipient models (see Excipient quantification
models under Experimental details) for use in the
ultrafiltration/diafiltration (UF/DF) process for IgG1 mAb,
Raman spectra of IgG1 mAb (5 to 150 mg/mL in various
matrices) were added to the training dataset, and new models
were developed. Adding these protein spectra allowed
the resulting PLS models to better distinguish the Raman signals
of L-arginine, L-histidine, and protein.
Ultrafiltration/Diafiltration process
The Raman Process Analyzer with FlowCell Probe was
integrated in-line to monitor a UF/DF process (Figure 2). A
PES membrane was equilibrated with tris buffer at pH 7.0 before
a purified IgG1 mAb at 10 g/L was fed to a target loading
of 500 g/m². In the first ultrafiltration (UF) step, the mAb was
concentrated at a feed rate of 300 L/m²/hr, and the transmembrane
pressure (TMP) was maintained between 10 and 15 psi via a manual
flow restrictor. The mAb was then buffer exchanged into the final
formulation matrix containing L-histidine, L-arginine, and sucrose
by manually feeding the diafiltration (DF) buffer into the
recirculation tank to maintain constant volume. After buffer
exchange, the mAb was further concentrated to the desired final
concentration in the second UF step.
Figure 2. Ultrafiltration/Diafiltration (UF/DF) Process Diagram.
Figure 2 diagram labels: Recirculation Tank, TMP Control Valve, Ramina Process Analyzer, Feed Pressure, Permeate Pressure, Permeate Waste, Retentate Pressure, Diafiltration Buffer, Product Feed.
Experimental details
Excipient quantification models
Calibration samples with defined concentrations of L-histidine,
L-arginine, and sucrose were prepared using a design of
experiments (DoE) approach called Uniform Design (UD), which is
derived from number theory.¹ UD significantly reduces the total
number of experiments while optimally spanning the whole process
space for model building and validation. These excipients were
chosen for their relevance in high-concentration monoclonal
antibody (mAb) formulations. The analyte concentrations in the
mixtures were designed with UD over ranges of 0-15 mg/mL for
L-histidine, 0-40 mg/mL for L-arginine, and 0-200 mg/mL for
sucrose. Each sample was passed in randomized
order through a FlowCell Probe integrated with the MarqMetrix
All-In-One Process Raman Analyzer at a flow rate of 100 mL/min.
The acquisition parameters were set to a laser power of 450 mW,
an integration time of 3000 ms, and an average of 3 spectra,
resulting in an 18 second total collection time per spectrum.
A Partial Least Squares (PLS) chemometric model was
developed using the spectral range of 800 to 3235 cm-1 Raman
shift. The spectra were normalized using infinity norm calculated
in the spectral region of 2900 to 3230 cm-1 and preprocessed
with a Savitzky-Golay (SavGol) filter (1st derivative, polynomial
order = 2, window width = 13) and mean-centered. Overfitting
was minimized by selecting appropriate latent variables using
a leave-one-out cross-validation (LOOCV) strategy. To initially
validate the model performance, seven different validation
samples were measured on three different instruments using the
same acquisition parameters as for the training data.
Figure 1. Thermo Scientific MarqMetrix All-In-One Process
Raman Analyzer, Thermo Scientific MarqMetrix FlowCell
Sampling Optic.
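Selecting the number of latent variables by leave-one-out cross-validation, as described in the experimental details above, can be illustrated with a small single-response PLS (NIPALS) routine. This is a sketch under stated assumptions: the functions are illustrative, not the software used in the study, and the two-component synthetic "spectra" are hypothetical.

```python
import numpy as np

def pls1_fit(X, y, n_lv):
    """Fit single-response PLS by NIPALS; returns (x_mean, y_mean, coefficients)."""
    X, y = np.asarray(X, dtype=float), np.asarray(y, dtype=float)
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xr, yr = X - x_mean, y - y_mean
    W, P, q = [], [], []
    for _ in range(n_lv):
        w = Xr.T @ yr
        w /= np.linalg.norm(w)        # unit-length weight vector
        t = Xr @ w                    # scores
        tt = t @ t
        p = Xr.T @ t / tt             # X-loadings
        c = (yr @ t) / tt             # y-loading
        Xr = Xr - np.outer(t, p)      # deflate X
        yr = yr - c * t               # deflate y
        W.append(w); P.append(p); q.append(c)
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    coef = W @ np.linalg.solve(P.T @ W, q)
    return x_mean, y_mean, coef

def pls1_predict(model, X):
    x_mean, y_mean, coef = model
    return y_mean + (np.asarray(X, dtype=float) - x_mean) @ coef

def rmsecv_loo(X, y, n_lv):
    """Leave each sample out once, refit, and predict the held-out sample."""
    errors = []
    for i in range(len(y)):
        keep = np.arange(len(y)) != i
        model = pls1_fit(X[keep], y[keep], n_lv)
        errors.append(pls1_predict(model, X[i:i + 1])[0] - y[i])
    return float(np.sqrt(np.mean(np.square(errors))))

# Hypothetical mixtures of two overlapping bands plus small noise.
rng = np.random.default_rng(0)
axis = np.arange(60)
band1 = np.exp(-0.5 * ((axis - 20) / 5.0) ** 2)
band2 = np.exp(-0.5 * ((axis - 28) / 5.0) ** 2)
c1, c2 = rng.uniform(0, 10, 20), rng.uniform(0, 10, 20)
X = np.outer(c1, band1) + np.outer(c2, band2) + 0.001 * rng.standard_normal((20, 60))

rmsecv = {k: rmsecv_loo(X, c1, k) for k in (1, 2, 3)}
best_lv = min(rmsecv, key=rmsecv.get)
```

In practice the number of latent variables is usually taken where RMSECV stops improving meaningfully rather than at the strict minimum, which guards against fitting noise.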
Buffer stability
Buffers in downstream processes are often prepared in
advance and typically used based on prior knowledge of their
stability, rather than by confirming stability analytically
before each UF/DF run. In two of our downstream
runs, we noticed that process Raman
predicted a lower-than-expected sucrose concentration in
the excipient buffer. To mitigate potential risks, we discarded
the previously prepared buffer and made a fresh batch. The
predictions from the process Raman analyzer on the new buffer
were much closer to the reference values obtained from
HPLC (high performance liquid chromatography). To further
validate this capability in a controlled experiment, we measured
the Raman spectra of the excipient buffer, which contains
L-histidine, L-arginine, and sucrose, at room temperature
for 15 days. The buffer was monitored through a Thermo
Scientific MarqMetrix BallProbe Sampling Optic integrated
with the MarqMetrix All-In-One Process Raman Analyzer. The
acquisition parameters were set to a laser power of 450 mW,
an integration time of 3000 ms, and an average of 3 spectra,
resulting in an 18 second total collection time per spectrum.
Additionally, we collected data on pH, osmolarity, and
performed HPLC analysis at various time intervals.
Results
The Partial Least Squares (PLS) models for L-histidine, L-arginine,
and sucrose were initially tested using seven independent
samples collected on three different process Raman analyzers.
Before the models were applied, the data were mathematically
processed to standardize the spectra across instruments. All
spectra were interpolated onto a common x-axis of 2048 equally
spaced points spanning 60 to 3250 cm-1 Raman shift, followed
by relative y-axis (intensity) standardization using the SRM
fluorescence data, as described in the NIST standardization protocol.²
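The two standardization steps can be sketched as follows. Linear interpolation is an assumption here (the interpolation method is not stated), the function names are illustrative, and the correction curve is written as the ratio of a reference SRM curve to the SRM spectrum measured on the instrument, with both arrays hypothetical inputs already on the common axis.

```python
import numpy as np

# Common x-axis: 2048 equally spaced points from 60 to 3250 cm-1.
COMMON_AXIS = np.linspace(60.0, 3250.0, 2048)

def to_common_axis(shifts, intensities, common_axis=COMMON_AXIS):
    """Linearly interpolate one spectrum onto the shared Raman-shift axis."""
    shifts = np.asarray(shifts, dtype=float)
    intensities = np.asarray(intensities, dtype=float)
    order = np.argsort(shifts)  # np.interp requires increasing x values
    return np.interp(common_axis, shifts[order], intensities[order])

def relative_intensity_correction(measured_srm, reference_srm):
    """Per-point correction curve: reference SRM shape / measured SRM spectrum."""
    return np.asarray(reference_srm, dtype=float) / np.asarray(measured_srm, dtype=float)

def standardize(shifts, intensities, correction_curve):
    """Interpolate to the common axis, then apply the intensity correction."""
    return to_common_axis(shifts, intensities) * correction_curve

# Hypothetical instrument axis and a simple linear test spectrum.
instrument_shifts = np.linspace(50.0, 3300.0, 2000)
test_spectrum = 0.01 * instrument_shifts + 3.0
on_common = to_common_axis(instrument_shifts, test_spectrum)
```

Once every instrument's spectra share the same axis and relative intensity response, a calibration model built on one analyzer can be applied to spectra from another.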
Figure 3 shows the correlation plot of predicted versus reference
values for L-histidine, L-arginine, and sucrose for the validation
samples. A correlation coefficient of over 95% and a root mean
square error (RMSE) of less than 5% of the reference value
for calibration, cross-validation, and prediction across three
instruments demonstrate that process Raman reliably and
accurately predicts the concentrations of these excipients and
that the models transfer easily between instruments.
Figure 3. Correlation plot of predicted vs reference values for L-histidine, L-arginine, and sucrose across three different instruments.
Figure 4. Raman concentration predictions during bench-scale UF/DF of IgG1 mAb for L-arginine (A), L-histidine (B), and sucrose (C);
target end conc. shown with light blue dashed line.
Figure 3 panels: predicted vs. measured concentration (mg/mL) for L-arginine, sucrose, and L-histidine HCl; legend: Fit, 1:1, BOX 1, BOX 2, BOX 3.
Figure 3 inset statistics:
RMSEC = 0.44192, RMSECV = 0.44823, RMSEP = 1.1258, Calibration Bias = 3.5527e-15, CV Bias = -0.0005461, Prediction Bias = 0.17372, R² (Cal, CV) = 0.99, 0.999, R² (Pred) = 0.995
RMSEC = 0.90539, RMSECV = 2.3907, RMSEP = 3.5746, Calibration Bias = -7.1054e-15, CV Bias = NaN, Prediction Bias = 1.0537, R² (Cal, CV) = 1.000, 0.998, R² (Pred) = 0.997
RMSEC = 0.19718, RMSECV = 0.99342, RMSEP = 0.66593, Calibration Bias = 0, CV Bias = NaN, Prediction Bias = 0.015517, R² (Cal, CV) = 0.998, 0.931, R² (Pred) = 0.981
Figure 4 panels: A, L-arginine conc. (mg/mL) vs. Time (min), phases UF1 and DF; B, L-histidine conc. (mg/mL) vs. Time (min), phases UF1 and DF; C, Sucrose conc. (mg/mL) vs. Time (min), phases UF1, DF, and UF2.
The L-histidine, L-arginine, and sucrose models were then
applied to the data acquired during the ultrafiltration/diafiltration
(UF/DF) process. The predicted values for L-histidine, L-arginine,
and sucrose are shown in Figures 4B, 4A, and 4C, respectively.
The average predicted concentrations of L-histidine, L-arginine,
and sucrose at the end of buffer exchange were compared to
the reference values, giving prediction errors of less than
5% of the reference values (Table 3). This illustrates the
capability of process Raman analyzers to monitor and quantify
excipients in real time.
The real-time prediction of sucrose concentrations over 15
days at room temperature in a briefly air-exposed formulation
buffer, containing L-histidine, L-arginine, and sucrose, is shown
in Figure 5A. This scenario mimics what may occur during
the storage of the formulation buffer. Initially, the predicted
sucrose concentration was 86 mg/mL and remained stable for
5 days, but then steadily decreased to 57 mg/mL by day 15.
Predictions from the arginine and histidine PLS models behaved
similarly to those from the sucrose model, showing accurate
and stable values up to day 5, but then steadily increasing until
day 15 (data not shown). Raman spectral analysis revealed that
the decrease in the sucrose peak was accompanied by the
appearance of glucose and fructose Raman peaks (Figures 5B
and 5C). HPLC analysis confirmed the intactness of arginine
and histidine for all 15 days, and the hydrolysis of sucrose into
glucose and fructose (Figures 5D and 5E).
Figure 5. Sucrose prediction over 14 days (A). The predicted sucrose concentration was initially 86 mg/mL and decreased to about
57 mg/mL over the span of 14 days. The decrease in the sucrose prediction is reflected in the decreasing intensity of the
sucrose-specific band at 835 cm-1, assigned to twisting (τ(CH₂)) with some contribution from the symmetric stretching (ν(CC))
vibrational mode (B). In panel C, the decrease in the sucrose-specific band (red arrow; ~550 cm-1, assigned to in-plane bending
(β(OCO))) is followed by an increase in the glucose-specific (blue; ~525 cm-1) and fructose-specific (green; ~640 cm-1) Raman
bands, which are assigned mainly to the deformation of CCC, CCO, and OCO bonds. In panels D and E, the results for day 1
(Before; blue) and day 15 (After; cyan) are compared: arginine and histidine remained unchanged, while sucrose hydrolyzed to
glucose and fructose.
Table 3. Prediction Error Calculation.
Excipient     Reference concentration (g/L)   Predicted concentration (g/L)   % absolute error
L-histidine   4.2                             4.1                             1.0
L-arginine    7.0                             7.1                             1.4
Sucrose       92.4                            95.6                            3.4
Figure 5 panels: A, Predicted sucrose mg/mL vs. Linear index (x10⁴); B and C, Raman intensity (a.u.) vs. Raman shift (cm-1), with Sucrose, Glucose, and Fructose bands labeled; D and E, HPLC analysis, Concentration g/L for Histidine, Arginine, Sucrose, Glucose, and Fructose, Before vs. After.
Hydrolysis
Acidic hydrolysis of sucrose is well documented in the
literature.³ To investigate whether the root cause was pH-related,
we examined the osmolarity and pH profiles during the
experiment (Figures 6A and 6B). The onset of sucrose hydrolysis
coincided with an increase in osmolarity and a decrease in
pH, suggesting that the hydrolysis of sucrose into glucose and
fructose was most likely driven by the lower pH. Since no acid was
added to the system, the pH decrease was likely due to external
factors. Although we did not identify the exact cause of the pH
decrease, potential factors in practice include bacterial
growth, improper pH adjustment during buffer preparation, and
dissolution of carbon dioxide or other acid-producing gases.
Note that the sucrose PLS model was developed using a
mixture of L-histidine, L-arginine, sucrose, and protein, and
lacks any spectral information from glucose and fructose. Since
glucose and fructose have significant overlaps in a wide spectral
region, this explains why the Raman predictions were higher
compared to the reference HPLC values.⁴ The same reasoning
applies to the discrepancies observed in Raman predictions
for L-arginine and L-histidine (data not shown) when compared
to the HPLC values (Figure 5D). In both cases, predictions can
be improved by augmenting the model with additional training
data that includes glucose and fructose spectral information.
However, this was beyond the scope of the current work.
Not including glucose and fructose spectral information in
the models is advantageous for monitoring buffer quality. As
glucose and fructose are produced by sucrose hydrolysis, new
spectral features appear that were not present in the training
dataset. The Q residual is one of the model statistics calculated
using the residual spectra remaining after projecting the original
spectra into the model space.⁵ As the spectral information of
glucose and fructose increases with the progress of sucrose
hydrolysis, the magnitude of the Q residual increases over time,
as shown in Figure 7. Users can leverage this information to
design quality control measures based on the reduced Q vs.
T² plot to assess buffer quality.⁵ For instance, in this study, a
mean value of 0.15 for the reduced Hotelling T² and 0.5 for the
reduced Q residual, with 95% confidence intervals for upper
and lower limits (red dotted oval in Figure 7), can be used as
quality control thresholds. Any buffer with reduced Q and T²
values beyond these limits is deemed to fail quality control. All
spectra after day 5 had low reduced T² and high reduced Q
values, thus failing the quality control.
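The idea behind the Q residual and Hotelling T² check can be sketched with a simple subspace model. Several assumptions are made for self-containment: a PCA subspace stands in for the PLS model space, "reduced" is taken to mean the statistic divided by its limit, the limit is an empirical 95th percentile of the training values rather than a distribution-based confidence limit, and all spectra are synthetic.

```python
import numpy as np

def fit_subspace(X_train, n_components):
    """Mean-center the training spectra and keep the leading loadings."""
    X_train = np.asarray(X_train, dtype=float)
    mean = X_train.mean(axis=0)
    _, s, Vt = np.linalg.svd(X_train - mean, full_matrices=False)
    loadings = Vt[:n_components].T                    # (n_variables, n_components)
    score_var = s[:n_components] ** 2 / (len(X_train) - 1)
    return mean, loadings, score_var

def q_and_t2(X, model):
    """Q = squared residual outside the model; T2 = scaled distance inside it."""
    mean, loadings, score_var = model
    centered = np.atleast_2d(np.asarray(X, dtype=float)) - mean
    scores = centered @ loadings
    residual = centered - scores @ loadings.T
    q = np.sum(residual ** 2, axis=1)
    t2 = np.sum(scores ** 2 / score_var, axis=1)
    return q, t2

# Hypothetical training spectra: mixtures of two known components.
rng = np.random.default_rng(0)
axis = np.arange(100)
comp1 = np.exp(-0.5 * ((axis - 20) / 3.0) ** 2)
comp2 = np.exp(-0.5 * ((axis - 50) / 3.0) ** 2)
new_species = np.exp(-0.5 * ((axis - 80) / 3.0) ** 2)  # absent from training
amounts = rng.uniform(1.0, 2.0, size=(30, 2))
train = amounts @ np.vstack([comp1, comp2]) + 0.01 * rng.standard_normal((30, 100))

model = fit_subspace(train, n_components=2)
q_train, t2_train = q_and_t2(train, model)
q_limit = np.percentile(q_train, 95)    # empirical stand-in for the 95% limit
t2_limit = np.percentile(t2_train, 95)

# A "degraded" spectrum carrying a band the model has never seen.
degraded = 1.5 * comp1 + 1.5 * comp2 + 0.5 * new_species
q_new, t2_new = q_and_t2(degraded, model)
reduced_q, reduced_t2 = q_new[0] / q_limit, t2_new[0] / t2_limit
fails_qc = reduced_q > 1.0 or reduced_t2 > 1.0
```

The new band lies outside the model space, so it shows up almost entirely in Q while T² stays within its usual range, which mirrors the low reduced T² and high reduced Q pattern reported for the spectra after day 5.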
Figure 7. Increase in the Q residual with sucrose hydrolysis,
calculated by projecting the Raman data into the PLS sucrose
model. The red dotted oval shows one possible set of control
limits for quality assessment.
Figure 7 axes: Q Residuals Reduced (p=0.950) (3.06%) vs. Hotelling T² Reduced (p=0.950) (96.94%).
Figure 6. Increase in osmolarity by 40% (A) and decrease in pH by 1 unit (B) during the hold period.
Figure 6 axes: A, Osmolality (mOsm/kg) vs. Hold Time (days); B, pH vs. Hold Time (days).
Learn more at thermofisher.com
For research use only. Not for use in diagnostic procedures. For current certifications, visit thermofisher.com/certifications
© 2025 Thermo Fisher Scientific Inc. All rights reserved. All trademarks are the property of Thermo Fisher Scientific
and its subsidiaries unless otherwise specified. MCS-AN1440-EN 06/25
Conclusion
1. Real-time quantification of L-arginine, L-histidine, and
sucrose was demonstrated with an absolute error of
< 5% in a UF/DF process using a process Raman analyzer
configured with a FlowCell probe. In the absence of in-line
PAT tools to quantify these excipients, the volume needed
for diafiltration (Vdf) is calculated from the following
mathematical expression:
To balance the osmotic pressure difference, sucrose is
excluded along with water.⁶ This effect was accurately captured
in the real-time predicted data from the process Raman
analyzer. Thus, process Raman provides unique capabilities to
ensure product quality by offering real-time data, rather than
relying on empirical hypotheses.
2. This work, combined with our previous demonstrations
of accurate in-line protein quantification during UF/DF
processes,⁷,⁸ clearly highlights the value of process Raman
for downstream process monitoring. Raman spectroscopy
allows for the simultaneous measurement of multiple
critical process parameters (CPPs) with a single scan.
These findings establish process Raman as a PAT tool with
unparalleled benefits compared to other analytical methods.
3. Simultaneous measurement of protein and excipient
concentrations not only allows tighter process
control but also opens opportunities for automating
downstream processing.
4. The ability of process Raman to provide real-time
insights into buffer quality before its use in UF/DF runs
offers substantial value by preventing batch failures.
This capability enhances quality control, making Raman
spectroscopy an essential tool for integration as an
in-line sensor to improve downstream process monitoring,
control, and automation.
References
1. Zhang, L.; Liang, Y.-Z.; Jiang, J.-H.; Yu, R.-Q.; Fang, K.-T. Uniform Design Applied
to Nonlinear Multivariate Calibration by ANN. Analytica Chimica Acta 1998, 370 (1),
65–77. https://doi.org/10.1016/S0003-2670(98)00256-6.
2. Choquette, S. J.; Etz, E. S.; Hurst, W. S.; Blackburn, D. H.; Leigh, S. D. Relative
Intensity Correction of Raman Spectrometers: NIST SRMs 2241 through 2243 for
785 nm, 532 nm, and 488 nm/514.5 nm Excitation. Appl. Spectrosc. 2007, 61 (2),
117–129. https://doi.org/10.1366/000370207779947585.
3. Torres, A. P.; Oliveira, F. A. R.; Silva, C. L. M.; Fortuna, S. P. The Influence of pH
on the Kinetics of Acid Hydrolysis of Sucrose. Journal of Food Process
Engineering 1994, 17 (2), 191–208. https://doi.org/10.1111/j.1745-4530.1994.tb00335.x.
4. Wiercigroch, E.; Szafraniec, E.; Czamara, K.; Pacia, M. Z.; Majzner, K.; Kochan,
K.; Kaczor, A.; Baranska, M.; Malek, K. Raman and Infrared Spectroscopy of
Carbohydrates: A Review. Spectrochimica Acta Part A: Molecular and Biomolecular
Spectroscopy 2017, 185, 317–335. https://doi.org/10.1016/j.saa.2017.05.045.
5. Kumar, S.; Martin, E. B.; Morris, J. Detection of Process Model Change in
PLS Based Performance Monitoring. IFAC Proceedings Volumes 2002, 35
(1), 125–130. https://doi.org/10.3182/20020721-6-ES-1901.00752.
6. Agrawal, P.; Wilkstein, K.; Guinn, E.; Mason, M.; Serrano Martinez, C. I.; Saylae, J.
A Review of Tangential Flow Filtration: Process Development and Applications in the
Pharmaceutical Industry. Org. Process Res. Dev. 2023, 27 (4), 571–591.
https://doi.org/10.1021/acs.oprd.2c00291.
7. Nolasco, M.; Pleitt, K.; Khadka, N. Using a Process Raman Analyzer as an In-Line
Tool for Accurate Protein Quantification in Downstream Processes.
8. Nolasco, M.; Pleitt, K.; Khadka, N. Raman-Based Accurate Protein Quantification in a
Matrix That Interferes with UV-Vis Measurement.
where, in Equation 1:
• V0 is the initial volume of the solution.
• D is the number of diavolumes required.
• R is the retention factor.
• C0 is the initial concentration of the solute.
• Cf is the final concentration of the solute.
In practice, the retention factor for excipients is typically
assumed to be 0, as the pore size of the diafiltration membrane
is significantly larger than the hydrodynamic size of excipients.
However, charge buildup across the membrane creates an
electrochemical potential that hinders the free mobility of
charged excipients, thereby increasing their retention factor
above 0. This effect is known as the Gibbs-Donnan effect.⁶ In
such scenarios, the empirically calculated volume needed for
diafiltration (Vdf) can result in incomplete buffer exchange,
which may affect the functionality and stability of monoclonal
antibodies (mAbs). An in-line process Raman analyzer addresses
this issue by monitoring excipient concentrations in real time.
This enables tighter process control and allows the diafiltration
process to be adjusted immediately, preventing incomplete buffer
exchange and maintaining the stability and functionality of the
therapeutic product.
Similarly, in Figure 4C, the sucrose concentration in the
retentate decreased during UF2, as confirmed by offline HPLC
analysis. Given the hydrodynamic size of sucrose relative to
the pore size of the membrane, sucrose should theoretically
exchange freely between the retentate and filtrate, resulting
in equal concentrations in both. However, as the protein
concentration increases, the osmotic pressure also rises,
making the exclusion of water thermodynamically unfavorable.
Equation 1.
$$V_{df} = V_0 \cdot D = V_0 \cdot \frac{\ln(C_0/C_f)}{1 - R}$$
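As a rough illustration of how the retention factor changes the diafiltration volume, the sketch below assumes the standard constant-volume diafiltration mass balance, Cf = C0 · exp(−D · (1 − R)), so that D = ln(C0/Cf) / (1 − R). The function names and the example values (a 100-fold excipient reduction, R = 0.2, 1 L of retentate) are hypothetical and are not taken from the study.

```python
import math


def diavolumes(c0, cf, retention=0.0):
    """Diavolumes D needed to bring a solute from c0 down to cf.

    Assumes constant-volume diafiltration: cf = c0 * exp(-D * (1 - R)).
    """
    if c0 <= 0 or cf <= 0:
        raise ValueError("concentrations must be positive")
    if not 0.0 <= retention < 1.0:
        raise ValueError("retention factor must satisfy 0 <= R < 1")
    return math.log(c0 / cf) / (1.0 - retention)


def diafiltration_volume(v0, c0, cf, retention=0.0):
    """Buffer volume Vdf = V0 * D, in the same units as v0."""
    return v0 * diavolumes(c0, cf, retention)


def residual_concentration(c0, n_diavolumes, retention):
    """Solute concentration left after n_diavolumes at a given retention factor."""
    return c0 * math.exp(-n_diavolumes * (1.0 - retention))


# Hypothetical case: 1 L of retentate, 100-fold reduction of an excipient.
v_ideal = diafiltration_volume(1.0, 100.0, 1.0, retention=0.0)   # assumes R = 0
v_actual = diafiltration_volume(1.0, 100.0, 1.0, retention=0.2)  # partial retention
# Concentration actually reached if the R = 0 volume is used while R is really 0.2.
c_left = residual_concentration(100.0, v_ideal / 1.0, retention=0.2)

print(f"Vdf (R=0): {v_ideal:.2f} L, Vdf (R=0.2): {v_actual:.2f} L, residual: {c_left:.2f}")
```

Under these assumed numbers, a volume sized for R = 0 leaves the excipient at roughly 2.5 times the target concentration when the true retention factor is 0.2, which is the kind of incomplete exchange that in-line monitoring is meant to catch.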
Protein secondary structure analysis using in-line process
Raman and offline SERS surface
Application note
Authors
Nimesh Khadka, Ph.D.
Bin Han, Ph.D.
Analytical Instrument Group, Thermo Fisher Scientific,
Tewksbury, Massachusetts USA
Industry/Application:
Biopharma PAT / R&D laboratories / Downstream
Products used:
Thermo Scientific™ MarqMetrix™ All-In-One Process
Raman Analyzer, Thermo Scientific™ MarqMetrix™ FlowCell
Sampling Optic, Thermo Scientific™ MarqMetrix™ BallProbe™
Sampling Optic, Thermo Scientific™ OMNIC™ Software Suite
Goals:
• Demonstrate the capability of the MarqMetrix All-In-One
Process Raman analyzer to acquire high-quality data for
in-line protein secondary structure determination.
• Highlight the easy probe swap feature, facilitated
by an optical fiber cable, which allows for seamless
deployment for offline uses. In this case, SERS
measurements to determine protein structure in
low-concentration samples are featured.
Key Analytes:
Lysozyme, protein secondary structure
Key Benefits:
• Process Raman provides real-time insight into protein
secondary structure during downstream purification,
helping to ensure product quality.
• Process Raman’s ability to provide simultaneous
information on protein concentration and quality
makes it a valuable process analytical technology
(PAT) tool.
• Use of process Raman for offline SERS measurements
enables users to leverage the MarqMetrix All-In-One Process
Raman analyzer for all modes of application: in-line,
at-line, online, or offline.
A. Thermo Scientific MarqMetrix All-In-One Process
Raman Analyzer with Thermo Scientific MarqMetrix FlowCell
Sampling Optic.
B. Thermo Scientific MarqMetrix BallProbe Sampling Optic.
C. Easily swappable fiber head and other sampling probes.
Introduction
Proteins are made up of amino acids linked together by peptide
bonds (amide bonds; -CO-(NH)-), which are formed when the
α-carboxyl group of one amino acid reacts with the α-amino
group of another, releasing a water molecule in the process.¹
This sequence of amino acids is called the primary structure.
The primary structure can fold spontaneously, or with the help
of molecular chaperones, into various secondary structures
such as α-helices, β-sheets, β-turns, and 3₁₀ helices. These
secondary structures then interact in three-dimensional
space to form the protein’s tertiary structure. For proteins
with multiple subunits, the individual units come together
through various interactions to form a multimeric
three-dimensional complex, known as the quaternary structure.
Protein secondary structures are crucial for determining the
functions of proteins because they maintain the structural
integrity of the molecules. Research has shown that tracking
changes in these structures can reveal functional losses
due to conformational changes, degradation, denaturation,
and aggregation of proteins. Although X-ray crystallography
and cryo-electron microscopy (cryo-EM) provide accurate
information on protein secondary structures, they are
impractical for routine use in protein biomanufacturing
due to their high costs, lengthy processing times, and
resource demands.² A more practical and efficient solution
is optical analysis. Techniques like Fourier transform infrared
spectroscopy (FTIR) and Raman spectroscopy have proven
effective for quickly analyzing protein secondary structures
in offline operation mode.³ For routine assessments, the
spectroscopic data are processed using multivariate analyses
to create chemometric models. These models are validated by
comparing their results with established secondary structure
information from X-ray crystallography or cryo-EM. Once
confirmed for accuracy, the models can be used for rapid
analysis of future batches of protein samples, enabling users to
evaluate protein secondary structure in a laboratory setting.⁴
This study presents the feasibility of using the in-line Thermo
Scientific™ MarqMetrix™ All-In-One Process Raman Analyzer
for real-time protein secondary structure analysis during
downstream processing. Our previous work has already
demonstrated accurate protein quantification in clarified
harvest and ultrafiltration/diafiltration (UF/DF), as well as
excipient quantification in the downstream workflow.⁵–⁷ Thus,
this work complements our previous findings and establishes
process Raman as a reliable PAT tool for multicomponent
monitoring and multi-modal feedback to enable tighter process
control. Additionally, this report discusses the integration of
the process Raman analyzer with an internally developed
Surface-Enhanced Raman Spectroscopy (SERS) solid substrate as
an offline alternative for determining secondary structures in
samples with low protein concentrations.
Experimental Details
Data collection for initial proof of concept (PoC)
A 17 mg/mL lysozyme solution was prepared in water.
Approximately 1 mL of this solution was flushed through the
Thermo Scientific™ MarqMetrix™ FlowCell Sampling Optic,
which has a sampling volume of approximately 180 μL. The
residual lysozyme solution inside the FlowCell after flushing
was used to collect the Raman spectrum. The FlowCell was
connected to the MarqMetrix All-In-One Process Raman
analyzer via an optical fiber. Raman data were acquired using
a laser power of 450 mW, an integration time of 5000 ms, and
an average of 10 scans. Three acquired spectra were further
averaged to improve the signal-to-noise ratio (SNR). The same
strategy and acquisition parameters were used to acquire
Raman spectra of water.
Cation exchange purification
A buffered solution of 3.95 mg/mL lysozyme in 20 mM MES
buffer, pH 6.2, was loaded at 0.83 ml/mL into a column packed
with POROS™ 50 HS Strong Cation Exchange Resin. After
loading, the column was washed with 5 column volumes of
20 mM MES buffer, pH 6.2. Finally, the bound lysozyme was
eluted at a flow rate of 3 mL/min using elution buffer (20 mM
MES buffer, 1 M NaCl, pH 6.2). The entire chromatographic run
was monitored with the UV-Vis detector of the AKTA Pure (300
nm wavelength) and the MarqMetrix All-In-One Process Raman
analyzer integrated with a FlowCell probe. The Raman spectra
were acquired with a laser power of 450 mW, an integration
time of 3000 ms, and an average of 3 scans. During the
elution step, the lysozyme is concentrated and eluted from
the column. When the concentrated lysozyme reached the
FlowCell cavity, the flow was paused and the Raman spectrum
was acquired with a laser power of 450 mW, an integration
time of 5000 ms, and an average of 10 scans. Three of these
spectra were averaged for further analysis.
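The two acquisition settings above trade measurement time against signal-to-noise ratio. The sketch below works through that arithmetic under two assumptions that are not stated in the study: total exposure is simply integration time × scans × averaged spectra (readout and software overhead are ignored), and noise is uncorrelated between exposures, so SNR grows with the square root of the number of exposures averaged. Function names are illustrative.

```python
import math


def exposure_seconds(integration_ms, scans, spectra=1):
    """Total laser exposure in seconds, ignoring readout and software overhead."""
    return integration_ms * scans * spectra / 1000.0


def snr_gain(n_exposures):
    """Expected SNR gain from averaging n exposures with uncorrelated noise."""
    return math.sqrt(n_exposures)


# Continuous monitoring during the run: 3000 ms integration, average of 3.
monitoring_s = exposure_seconds(3000, 3)
# Paused-flow measurement: 5000 ms integration, average of 10, three spectra averaged.
paused_s = exposure_seconds(5000, 10, spectra=3)
gain = snr_gain(10 * 3)  # relative to a single 5000 ms exposure

print(f"monitoring: {monitoring_s:.0f} s per spectrum")
print(f"paused flow: {paused_s:.0f} s total, SNR gain about {gain:.1f}x")
```

On these assumptions the monitoring setting returns a spectrum about every 9 s, while the paused-flow setting spends about 150 s of exposure to gain SNR for peak deconvolution.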
SERS data collection
A 5 μL aliquot of 0.1 mg/mL lysozyme in water was deposited on
the SERS surface. The sample was allowed to concentrate
through evaporation for approximately 10 min. This step
provides enhancement due to both the increased concentration
and the true SERS effect. Raman spectra were collected using
a MarqMetrix All-In-One Process Raman analyzer integrated
with the Thermo Scientific™ MarqMetrix™ Proximal BallProbe™
Sampling Optic via optical fiber. The Raman data were
acquired using a laser power of 200 mW, an integration time of
5000 ms, and an average of 10 scans after optimizing the focal
distance with an XYZ three-axis micrometer stage to obtain the
maximum signal.
Data analysis and peak deconvolution
For the initial proof-of-concept (PoC) study, the Raman spectra
of water and a 17 mg/mL lysozyme solution in water were
smoothed using a Savitzky-Golay filter (window width = 7,
order = 2). Each spectrum was then normalized by its infinity
norm calculated over the region of 3000 to 3240 cm-1, which
corresponds to the symmetric stretching of O-H bonds in water
molecules. This region was chosen for normalization because it
has minimal spectral interference from the Raman signatures of
lysozyme or buffer analytes, and thus can correct for any path
length differences.
After normalization, the spectral region from 1400 to 1800 cm-1
was selected for each spectrum. The baseline was removed
from each spectrum using an Automatic Whittaker filter
(lambda = 5000, asymmetry (p) = 0.001). The baseline-removed
water Raman spectrum was then subtracted from the
baseline-removed lysozyme solution Raman spectrum to
isolate the pure lysozyme spectrum.
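The normalization and background-subtraction steps can be sketched as follows. This is a minimal illustration on synthetic toy spectra, not the SOLO implementation: the band shapes, the 1.3 path-length factor, and the function names are assumptions, and the smoothing and Whittaker baseline steps are left out for brevity.

```python
import math


def gaussian(x, center, sigma, height):
    """Toy band shape used only to build synthetic spectra."""
    return height * math.exp(-0.5 * ((x - center) / sigma) ** 2)


def infinity_norm_scale(shift, intensity, lo=3000.0, hi=3240.0):
    """Scale a spectrum by its largest absolute intensity between lo and hi (cm-1)."""
    reference = max(abs(v) for s, v in zip(shift, intensity) if lo <= s <= hi)
    return [v / reference for v in intensity]


def subtract(sample, background):
    """Point-by-point difference of two spectra on the same Raman shift axis."""
    return [a - b for a, b in zip(sample, background)]


shift = list(range(1400, 3301, 10))  # Raman shift axis, cm-1

# Synthetic water: a strong O-H band plus a weak band near 1640 cm-1.
water = [gaussian(s, 3200, 120, 1.0) + gaussian(s, 1640, 40, 0.05) for s in shift]
# Synthetic sample: water plus a protein band at 1660 cm-1, seen through a
# 30 % longer effective path length (hypothetical factor).
sample = [1.3 * (w + gaussian(s, 1660, 15, 0.02)) for s, w in zip(shift, water)]

water_n = infinity_norm_scale(shift, water)
sample_n = infinity_norm_scale(shift, sample)
protein = subtract(sample_n, water_n)

peak_shift = shift[protein.index(max(protein))]
print(f"residual band maximum at {peak_shift} cm-1")
```

Because both spectra are scaled to the same water band, the path-length factor cancels and only the protein band survives the subtraction.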
This pure lysozyme Raman spectrum was preprocessed using
a Savitzky-Golay filter (window width = 9, order = 2, second
derivative) to identify the number and positions of peaks. After
the peaks were identified, the pure lysozyme Raman spectrum
was exported as a .spc file. All data analysis described above
was performed using the SOLO 9.3.1 software package (2024,
Eigenvector Research, Inc., Manson, WA, USA 98831).
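To show how a second-derivative filter exposes overlapping bands, the sketch below applies a 9-point, second-order Savitzky-Golay second derivative to a synthetic Amide I envelope and reports the local minima. It is an illustration only: the Gaussian band shape, the 4 cm-1 band width, the 1 cm-1 point spacing, and the 10 % depth threshold are assumptions, and the band centers and relative heights are borrowed from Table 1 purely as test input.

```python
import math


def savgol_second_derivative(y, half_width=4, dx=1.0):
    """Second derivative from a quadratic least-squares fit in a sliding window.

    half_width=4 gives a 9-point window; edge points without a full window stay at 0.
    """
    offsets = range(-half_width, half_width + 1)
    n = 2 * half_width + 1
    s2 = sum(i ** 2 for i in offsets)
    s4 = sum(i ** 4 for i in offsets)
    denom = s4 - s2 ** 2 / n
    # Closed-form Savitzky-Golay weights for the second derivative (polynomial order 2).
    weights = [2.0 * (i ** 2 - s2 / n) / denom / dx ** 2 for i in offsets]
    d2 = [0.0] * len(y)
    for k in range(half_width, len(y) - half_width):
        window = y[k - half_width:k + half_width + 1]
        d2[k] = sum(w * v for w, v in zip(weights, window))
    return d2


def peak_positions(x, d2, fraction=0.1):
    """Local minima of the second derivative deeper than fraction * global minimum."""
    threshold = fraction * min(d2)
    return [x[k] for k in range(1, len(d2) - 1)
            if d2[k] < threshold and d2[k] < d2[k - 1] and d2[k] < d2[k + 1]]


def gaussian(x, center, sigma, height):
    """Toy band shape used only to build the synthetic envelope."""
    return height * math.exp(-0.5 * ((x - center) / sigma) ** 2)


x = list(range(1600, 1731))  # Raman shift axis, 1 cm-1 spacing
bands = [(1640, 24.0), (1657, 42.0), (1672, 23.0), (1688, 11.0)]  # (center, height)
y = [sum(gaussian(v, c, 4.0, h) for c, h in bands) for v in x]

found = peak_positions(x, savgol_second_derivative(y))
print("second-derivative minima (cm-1):", found)
```

Real spectra carry noise, so the threshold and window width would need tuning before the minima could be trusted as initial guesses for peak fitting.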
The exported .spc file was loaded into Thermo Scientific™
OMNIC™ Software for peak deconvolution in the spectral region
of approximately 1500 to 1800 cm-1. The Voigt function was
selected for peak fitting. The initial guesses for the number of
peaks and their positions were defined based on the previously
calculated second-derivative data. The initial guess for the
full width at half maximum (FWHM) of the Voigt peaks was
set at 8 cm-1, considering that the resolution of the instrument is
about 6 cm-1. The noise level was calculated using the spectral
range of 1750 to 1780 cm-1. With these initial guesses, a
global optimization was performed in the OMNIC software to
optimize the number of peaks, peak positions, and peak
widths. Convergence was attained by minimizing the residual
between the observed spectrum and the fitted spectrum.
Finally, the positions and percentage contributions of the fitted
peaks within the spectral region of 1630 to 1700 cm-1 were
obtained to assign the types of secondary structures and their
respective amounts.
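The final step, turning fitted peaks into percentage contributions, can be sketched as below. This is not the OMNIC procedure: it swaps in the pseudo-Voigt approximation (a weighted sum of a Gaussian and a Lorentzian of equal width) because its area has a simple closed form, and every peak parameter in the example is hypothetical.

```python
import math

# Area of a unit-height, unit-FWHM Gaussian and Lorentzian, respectively.
GAUSS_AREA = math.sqrt(math.pi / (4.0 * math.log(2.0)))
LORENTZ_AREA = math.pi / 2.0


def pseudo_voigt_area(height, fwhm, eta):
    """Integrated area of a pseudo-Voigt band (eta = Lorentzian fraction, 0 to 1)."""
    return height * fwhm * (eta * LORENTZ_AREA + (1.0 - eta) * GAUSS_AREA)


def percent_areas(peaks, lo=1630.0, hi=1700.0):
    """Percentage area of each fitted peak whose center lies within [lo, hi] cm-1."""
    selected = [p for p in peaks if lo <= p["center"] <= hi]
    areas = [pseudo_voigt_area(p["height"], p["fwhm"], p["eta"]) for p in selected]
    total = sum(areas)
    return {p["center"]: 100.0 * a / total for p, a in zip(selected, areas)}


# Hypothetical fit output (center in cm-1, arbitrary height, FWHM in cm-1).
fitted = [
    {"center": 1610.0, "height": 0.20, "fwhm": 10.0, "eta": 0.5},  # outside the window
    {"center": 1640.0, "height": 0.30, "fwhm": 16.0, "eta": 0.5},
    {"center": 1657.0, "height": 0.50, "fwhm": 17.0, "eta": 0.5},
    {"center": 1672.0, "height": 0.30, "fwhm": 15.0, "eta": 0.5},
    {"center": 1688.0, "height": 0.15, "fwhm": 14.0, "eta": 0.5},
]

result = percent_areas(fitted)
for center, pct in result.items():
    print(f"{center:.0f} cm-1: {pct:.1f} %")
```

Peaks outside the 1630 to 1700 cm-1 window are dropped before the total is computed, so the reported percentages always sum to 100 within that window.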
The Raman spectra collected on lysozyme during cation
exchange purification were analyzed similarly, except that
the Raman spectrum of 20 mM MES buffer with 1 M NaCl
at pH 6.2, instead of pure water, was used for background
subtraction.
For SERS data analysis, the background spectrum was
collected using water. The rest of the data analysis was the
same as explained above.
Figure 1. Spectral overlay of 17.5 mg/mL lysozyme in water (red) and water (green) is shown in plot A. The fingerprint region between
800 and 1800 cm-1 Raman shift is shown in plot B.
[Figure 1 plots: Raman intensity (a.u.) versus Raman shift (cm-1); A, full spectrum; B, 800 to 1,800 cm-1. Traces: lysozyme in water; water.]
Results and Discussions
Initial PoC study
The spectrum of 17.5 mg/mL lysozyme is shown in Figure 1.
In Figure 1A, the spectra are normalized using the infinity norm
calculated in the region of 3000 to 3240 cm-1. Distinct Raman
features of lysozyme in the fingerprint region are shown in
Figure 1B; these match reference values reported in the
literature.⁸ The spectral overlay after baseline removal for the
Raman shift region of 1500 to 1800 cm-1 is shown in Figure 2.
The pure lysozyme spectrum obtained after subtracting the water
background is shown in Figure 3A. The second
derivative of Figure 3A is shown in Figure 3B, which
illustrates the presence of four peaks (blue arrows) within the
region of 1630 to 1700 cm-1.
The result of peak deconvolution of Figure 3A using the Voigt
function is shown in Figure 4. The Voigt function was selected
based on the recommendation from the literature for aqueous
biological samples.⁹ The spectral region of 1620 to 1750 cm-1
of a protein spectrum is called the Amide I region. The peaks in
the Amide I region are attributed to the symmetric stretching of
the carbonyl functional group in the peptide bond. The carbonyl
groups in different secondary structures experience different
electronic environments, leading to different vibrational energies
that appear as peaks at different positions. Thus, the Amide I
region provides information on the secondary structure of proteins.⁹
The symmetric stretching of the carbonyl group in the peptide
bond is both Raman and infrared (IR) active because this
vibrational mode is associated with a change in both polarizability
and dipole moment, making both Raman and IR
spectroscopic techniques suitable for secondary structure
analysis. In the literature, offline FTIR, FT-Raman, and drop coat
deposition Raman (DCDR) have been reported for the study
of native protein secondary structures as well as for monitoring
structural changes during protein degradation, denaturation,
aggregation, and chemical modification. However, the
possibility of elucidating protein secondary structure in the
aqueous phase using in-line Raman has not been reported,
which sets the foundation for this study.
Figure 2. Spectral overlay of the selected region after baseline removal.
The lysozyme peak is distinct at approximately 1660 cm-1.
[Figure 2 plot: Raman intensity (a.u.) versus Raman shift (cm-1). Traces: lysozyme in water; water.]
Figure 3. The water background-subtracted lysozyme spectrum is
shown in plot A and its second derivative is shown in plot B. The blue
arrows in plot B show four peaks in the spectral region of
1630 to 1700 cm-1.
[Figure 3 plots: Raman intensity (a.u.) versus Raman shift (cm-1); A, lysozyme spectrum; B, second derivative (intensity x10-3).]
The peak positions and peak contributions in the Amide I
region of lysozyme from Figure 4 are summarized in Table 1.
The results from process Raman showed four peaks centered at
Raman shifts of 1640, 1657, 1672, and 1688 cm-1, which were
assigned to the extended conformation (random coils),
α-helices, β-sheets, and extended/PPII (β-turns), respectively.
The assignments were adopted from published work.⁸ The peak
positions and their relative contributions to the Amide I region
were in close agreement with published work using FTIR,
FT-Raman, X-ray crystallography, and DCDR. These results
demonstrate that the data quality from the MarqMetrix™
All-In-One analyzer supports direct analysis
of the secondary structure of proteins in the aqueous phase
without any sample preparation.
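One simple way to put a number on "close agreement" is the mean absolute difference between the percentage-area profiles in Table 1. The sketch below uses the Table 1 values as given; the choice of metric is an assumption made here for illustration and is not part of the original analysis.

```python
# Percentage areas from Table 1, in row order: solvent-exposed extended
# conformation, alpha-helix, beta-sheet, extended and PPII.
percent_area = {
    "Process Raman": [24, 42, 23, 11],
    "FT-Raman": [27, 42, 27, 4],
    "FT-IR": [19, 40, 20, 21],
    "X-ray": [19, 45, 23, 13],
}


def mean_abs_difference(a, b):
    """Mean absolute difference, in percentage points, between two area profiles."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


reference = percent_area["Process Raman"]
differences = {
    method: mean_abs_difference(reference, values)
    for method, values in percent_area.items()
    if method != "Process Raman"
}

for method, diff in differences.items():
    print(f"Process Raman vs {method}: {diff:.1f} percentage points")
```

By this measure the process Raman profile differs from the X-ray values by 2.5 percentage points on average, from FT-Raman by 3.5, and from FT-IR by 5.0.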
Cation exchange purification
After initial validation of the possibility of in-line protein
secondary structure analysis, the strategy was applied to
the in-line process data collected during the elution of
lysozyme from a column packed with POROS™ 50 HS Strong
Cation Exchange Resin. The eluted lysozyme concentration
was approximately 25 mg/mL, as confirmed by offline analysis.
The result of peak deconvolution of the Amide I region of
lysozyme (1630 to 1700 cm-1) from the in-line process data is
shown in Figure 5. A small peak at approximately 1700 cm-1
was also observed and was initially assumed to be associated
with aggregated lysozyme. However, a literature review of
Raman-based studies of lysozyme aggregation revealed that
the peak at 1700 cm-1 was not associated with lysozyme.
Thus, only the four peaks centered at 1640, 1656, 1670, and
1684 cm-1 were considered Amide I peaks. The percentage area
contributions for these selected peaks, shown in Figure 5,
were in close agreement with the values shown in Table 1. Note
that even if the uncertain 1700 cm-1 peak were taken into
consideration, it would have affected the results insignificantly.
These results demonstrate that process Raman is a valuable in-line
process analytical technology (PAT) that can provide real-time
insights into protein secondary structure as the downstream
process is happening.
Table 1. Results from peak deconvolution using process Raman, compared with published results from FT-Raman, FT-IR, and X-ray
crystallography. Peak centers are in cm-1.

Secondary structure                     Process Raman     FT-Raman          FT-IR             X-ray
                                        Center   % area   Center   % area   Center   % area   Center   % area
Solvent-exposed extended conformation   1640     24       1637     27       1645     19       NA       19
α-Helix                                 1657     42       1655     42       1654     40       NA       45
β-Sheet                                 1672     23       1673     27       1670     20       NA       23
Extended and PPII                       1688     11       1685     4        1683     21       NA       13

Figure 4. Result of peak deconvolution of the lysozyme spectrum in
Figure 3A. The four peaks in the spectral region of
1620 to 1750 cm-1 correspond to four different secondary
structures of lysozyme.
[Figure 4 plot: Raman intensity versus Raman shift (cm-1).]

Figure 5. Result of peak deconvolution performed on in-line process data using process Raman.
[Figure 5 plot: Raman intensity versus Raman shift (cm-1).]

Secondary structure (in-line Raman)     Peak center (cm-1)   % area
Solvent-exposed extended conformation   1640.803             22.59
α-Helix                                 1656.157             41.93
β-Sheet                                 1670.16              24.47
Extended and PPII                       1684.597             11.01
SERS
SERS is often used as a direct or indirect sensor to identify and quantify biological or
non-biological analytes because of its high sensitivity. In some cases, the detection limit is
close to a single molecule.¹⁰ Several types of SERS substrates exist, including colloidal solutions,
nanoparticles suspended in solution, nanoparticles immobilized on a solid surface,
nanostructures fabricated on a solid surface, and metallic nanowires or foils. Because
of its high stability, sensitivity, reproducibility, low background signal, and
homogeneous signal enhancement across the surface, the internally developed SERS
substrate was evaluated in this study for analyzing protein secondary structure. The
experimental setup and the results of lysozyme secondary structure analysis using
the SERS substrate are shown in Figure 6. The laser spot size of ~500 μm allows sampling
from a larger area of the SERS substrate, while the optical fiber provides easy connectivity
and sample handling during SERS measurement using the MarqMetrix All-In-One
analyzer coupled with the Thermo Scientific™ MarqMetrix™ Proximal BallProbe™
Sampling Optic. A non-contact probe was used to avoid any physical disturbance
to the dried protein sample on the SERS substrate and to prevent contamination
or degradation that could interfere with the signal. This setup also enabled easier
handling of delicate low-concentration samples. The peak deconvolution results for
the Amide I region of lysozyme from SERS were similar to those shown in Table 1. Thus, SERS is a viable
option for studying protein secondary structure when the sample is at low concentration.
Figure 6. Workflow for SERS measurement using the MarqMetrix All-In-One Process Raman analyzer and the output
of peak deconvolution for the Amide I region of lysozyme.
[Figure 6 schematic labels: Thermo Scientific™ MarqMetrix™ All-In-One Process Raman Analyzer; fiber optics; Thermo Scientific™ MarqMetrix™ Proximal BallProbe™ Sampling Optic; ~8 mm; optimize distance; SERS chip; XYZ micrometer stage. Plot: Raman intensity versus Raman shift (cm-1).]

Secondary structure (SERS)              Peak center (cm-1)   % area
Solvent-exposed extended conformation   1644                 23
α-Helix                                 1659                 41
β-Sheet                                 1675                 24
Extended and PPII                       1691                 12
As shown in Figure 7, the Raman spectra of 20 mg/mL
lysozyme (red) and 0.1 mg/mL lysozyme on the SERS surface
(green) have similar overall peak profiles, but minute differences
were observed at 1358 cm-1 (CH deformation),
1210 cm-1 (aromatic residue), and 935 cm-1 (C-C stretch), and in
the ratio of the 508 cm-1, 523 cm-1, and 540 cm-1 bands
(disulfide region). This is expected for larger molecules, as not all
sections of lysozyme experience a homogeneous
electro-chemical SERS enhancement effect.¹¹ Nonetheless,
the result shown in Figure 7 demonstrates that these SERS
substrates are homogeneous and can be used for protein
analysis. To test the enhancement efficiency of the SERS substrate,
data were collected with the same methodology on
a non-SERS substrate for 0.1 mg/mL lysozyme, but only
a weak spectrum was observed (data not shown) under
the experimental conditions. Thus, a significant portion of the
enhancement was attributed to the SERS effect rather than to the
increase in concentration due to drying, as is the case in DCDR.
Another key aspect of SERS measurement is the enhancement
of the Raman signal relative to fluorescence. The SERS spectrum of
lysozyme (green) in Figure 7 is shown without baseline removal,
while the aqueous 20 mg/mL lysozyme spectrum (red) is plotted
after baseline removal. Thus, SERS can be advantageous for samples
that have high fluorescence. Although the details are beyond
the scope of this work, a plot in the supplementary
information illustrates the enhancement of Raman signals
even with a highly fluorescent background (Figure S1).
Conclusion
In this study, the feasibility of using the Thermo Scientific
MarqMetrix All-In-One Process Raman Analyzer for analyzing
the secondary structure of lysozyme is demonstrated. The
lysozyme secondary structures that were determined were
comparable to those obtained using FTIR, FT-Raman, DCDR,
and X-ray crystallography. This method’s accuracy and the lack
of need for sample preparation offer significant advantages for
probing protein secondary structure in its native aqueous state.
This capability can be leveraged to determine protein secondary
structure during real-time downstream processing (e.g., UF/
DF), where protein concentrations are relatively high, aiding in
the acquisition of high SNR spectra. This ensures the quality
of proteins during purification, as published work has shown
changes in secondary structure with protein denaturation,
aggregation, or degradation. Thus, this capability helps users
make actionable decisions to proceed with further downstream
processes confidently or, in some cases, scrap the run when
quality is compromised. In conjunction with previous work, this
study establishes process Raman as a PAT tool for both
real-time quantification and quality assessment.
SERS-based protein secondary structure analysis is also
demonstrated in a sample with low concentration lysozyme.
The protein secondary structures determined from SERS
and aqueous solutions were similar, and the accuracy
was comparable to FTIR, FT-Raman, DCDR, and X-ray
crystallography. The SERS measurement was performed using
the MarqMetrix All-In-One Process Raman analyzer, leveraging
its features to easily swap probe types and use optical fibers for
ease of use in laboratory settings.
References
1. Branden, C. I.; Tooze, J. Introduction to Protein Structure, 2nd ed.; Garland Science: New York,
2012. https://doi.org/10.1201/9781136969898.
2. Wang, H.-W.; Wang, J.-W. How Cryo-Electron Microscopy and X-Ray Crystallography
Complement Each Other. Protein Sci. 2017, 26 (1), 32–39. https://doi.org/10.1002/pro.3022.
3. Pelton, J. T.; McLean, L. R. Spectroscopic Methods for Analysis of Protein Secondary
Structure. Anal. Biochem. 2000, 277 (2), 167–176. https://doi.org/10.1006/abio.1999.4320.
4. Peters, J.; Jin, C.; Luczak, A.; Lyons, B.; Kalyanaraman, R. Machine Learning Enabled Protein
Secondary Structure Characterization Using Drop-Coating Deposition Raman Spectroscopy.
J. Pharm. Biomed. Anal. 2025, 259, 116762. https://doi.org/10.1016/j.jpba.2025.116762.
5. Khadka, N.; Pleitt, K.; Nolasco, M. A Classical Least Squares (CLS) Approach for Protein
Quantification in Downstream Processing Using Raman Spectroscopy.
6. Nolasco, M.; Pleitt, K.; Khadka, N. Using a Process Raman Analyzer as an In-Line Tool for
Accurate Protein Quantification in Downstream Processes.
7. Nolasco, M.; Pleitt, K.; Khadka, N. Raman-Based Accurate Protein Quantification in a Matrix
That Interferes with UV-Vis Measurement.
8. Dolui, S.; Mondal, A.; Roy, A.; Pal, U.; Das, S.; Saha, A.; Maiti, N. C. Order, Disorder, and
Reorder State of Lysozyme: Aggregation Mechanism by Raman Spectroscopy. J. Phys. Chem.
B 2020, 124 (1), 50–60. https://doi.org/10.1021/acs.jpcb.9b09139.
9. Peters, J.; Park, E.; Kalyanaraman, R.; Luczak, A.; Ganesh, V. Protein Secondary Structure
Determination Using Drop Coat Deposition Confocal Raman Spectroscopy. 2016, 31, 31–39.
10. Schlücker, S. Surface-Enhanced Raman Spectroscopy: Concepts and Chemical
Applications. Angew. Chem. Int. Ed. 2014, 53 (19), 4756–4795. https://doi.org/10.1002/anie.201205748.
11. Pérez-Jiménez, A. I.; Lyu, D.; Lu, Z.; Liu, G.; Ren, B. Surface-Enhanced Raman Spectroscopy:
Benefits, Trade-Offs and Future Developments. Chem. Sci. 2020, 11 (18), 4563–4577.
https://doi.org/10.1039/D0SC00809E.
Figure 7. Similarity of the Raman spectra
of aqueous 20 mg/mL lysozyme (baseline corrected and water
background subtracted) and 0.1 mg/mL lysozyme (without
baseline correction) on the SERS substrate.
[Figure 7 plot: Raman intensity (a.u.) versus Raman shift (cm-1), 600 to 1,600 cm-1. Traces: 0.1 mg/mL lysozyme SERS; 20 mg/mL lysozyme in water.]
Learn more at thermofisher.com
For research use only. Not for use in diagnostic procedures. For current certifications, visit thermofisher.com/certifications
© 2025 Thermo Fisher Scientific Inc. All rights reserved. All trademarks are the property of Thermo Fisher Scientific
and its subsidiaries unless otherwise specified. MCS-AN1443-EN 06/25
Supplementary information
Figure S1. SERS substrate enhancement of the Raman signal even in the
presence of high fluorescence. Both spectra were collected from liquid samples using the same
acquisition parameters, demonstrating use of the SERS substrate for both liquid and solid samples.
[Figure S1 plot: Data (x10-4) versus Raman shift (cm-1), 500 to 3,000 cm-1. Traces: Rhodamine droplet SERS; Rhodamine droplet hydrophobic surface.]
© 2025 Thermo Fisher Scientific Inc. All rights reserved. All trademarks are the property of Thermo Fisher Scientific
and its subsidiaries unless otherwise specified. MCS-CM1559-EN 9/25