The Power of Nanopores for Understanding Proteins - Part 1
Article Jun 26, 2018
Nanopore technology is providing science with a cheap and accessible method to sequence DNA. Its compact nature also means sequencing is no longer restricted to the lab, with applications ranging from combating wildlife crime in remote regions to use in space. Now, with ground-breaking research from the University of Groningen, the secrets hidden within proteins are also being revealed with the help of nanopores. We spoke to Professor Giovanni Maglia (GM), group leader in chemical biology at the University of Groningen, about the progress he and his team have been making.
KS: How did you come to be working in this field of research?
GM: When I was in Oxford, during my post-doctoral research, I was exposed to biological nanopores. At that time, I was working on a project with DNA sequencing using nanopores. During the four and a half years that I was working there, we made a lot of progress in the field. It was quite amazing for me how you can get so much information from ionic currents from nanopores, and how you can study single molecules. They're really very promising tools, because they can be incorporated into portable devices and still look at the single molecule level.
When I finished my position in Oxford I started my own laboratory at the University of Leuven in Belgium, where I started to look at proteins using biological nanopores. I moved to looking at proteins because when I left it became quite clear that most of the advances in the DNA field were done, or about to be done, and a company moved into the field with a lot of resources. So, I felt that DNA sequencing was a mature field already.
KS: What are the major stumbling blocks you have had to overcome whilst developing this technology?
GM: Proteins are different to DNA and it’s not straightforward to translate DNA sequencing to protein sequencing. What we knew about DNA sequencing was that you need to stretch the DNA inside the pores, so you can access the different bases. Also, it became quite obvious that you needed to have a nanomachine that could ratchet the DNA base by base, but neither of these things is possible with proteins, or at least it’s not obvious how to do it. So I started to work on a different aspect, which was using bigger biological pores to study folded proteins rather than unfolding and sequencing them.
Meanwhile, we came across a different pore called FraC which looked pretty good for protein sequencing. The crystal structure was published in 2015.
For sequencing DNA, you read about five bases at a time, but having very precise DNA movement inside the pore, facilitated by polymerases or helicases that ratchet DNA very accurately base by base, allows you to get away with a signal made by more than one base. Furthermore, you only have four bases in DNA if you discount modifications, so recognition at single-unit resolution is intrinsically easier.
The real challenge was that proteins have 20 amino acids, so intrinsically you would have too many amino acids in the reading area, making the signal very complex. So, you really need to have a pore that will allow much narrower recognition.
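A rough way to see the scale of this problem is to count how many distinct residue combinations could sit in the reading area at once. The sketch below assumes, purely for illustration, that a protein pore would read five residues at a time, as described above for DNA; the function name `window_combinations` is hypothetical and the numbers are simple counting, not measured signal levels.

```python
def window_combinations(alphabet_size: int, window: int) -> int:
    """Count the distinct sequences of `window` residues from an alphabet."""
    return alphabet_size ** window


# DNA: 4 bases, about five read at a time (figure quoted in the interview).
dna_signals = window_combinations(4, 5)

# Proteins: 20 amino acids; the five-residue window is an assumption.
protein_signals = window_combinations(20, 5)

# A pore with much narrower recognition (one residue at a time).
protein_single = window_combinations(20, 1)

print(dna_signals, protein_signals, protein_single)
```

With the same window, the number of possible combinations grows from about a thousand for DNA to several million for proteins, which is why a narrower reading area matters.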
With FraC, there were however quite a few challenges because in order to assemble the nanopore you need to have a special lipid composition, the expression level is very low, and so on and so forth. However, we managed to overcome these issues and showed, actually with DNA because initially it was easier, in 2016 that we can use this pore to sequence polymers.
Then we said, all right, so now we have a pore, let’s see what we can do in the protein sequencing field. We knew a bit about our proteins and about large pores with folded proteins, so we set out to see what the limitations were of the size of the pore to study proteins.
So, we now have two approaches to identify proteins, the sequencing that I just described, and the other one is to recognise the protein itself by nanopore currents, no matter if it’s unfolded and transported across the pore, or folded, partially folded, or just inside the pore rather than translocating through the pore. The two approaches are very different, and they do require different pores and present different challenges.
The challenge in protein recognition with nanopores is that there are about 25,000 proteins in the proteome plus all the modifications of those proteins. So, effectively, to be able to recognise a protein, every protein would have to have an individual current signature, and that’s difficult for the bandwidth that we have for the nanopores. 25,000 individual signals is quite a challenge!
Therefore, to recognise a protein we think we need to have a pre-purification step, in which you isolate a group of proteins that you want to study. You could do this with antibodies, or simply by size, or you could use the chemical properties of that molecule. After you have narrowed down these thousands of proteins, you can then recognise them with your nanopore.
For protein sequencing with nanopores, you need to unfold the protein and transport it across the nanopore at a constant potential and at a constant speed. Those are the two main challenges.
So, you can see that protein identification is a little bit easier than protein sequencing: you just need to recognise a group of proteins and tell the difference between very similar proteins. In protein sequencing, however, you actually need to control a single molecule at the nanoscale, and the transport of the unfolded polypeptide across the nanopore.
KS: How do you see that the techniques you've been developing compare to the more traditional approaches for looking at proteins and protein sequencing?
GM: Protein sequencing now is done by tandem mass spectrometry especially, in which you just take a protein, chop it into pieces and then read the mass of every different piece. Mass spectrometry has been done for the past hundred or more years, so it works pretty well, and the accuracy of the measurement is pretty amazing. You can read a fraction of molecular weight, so it's really good, but at the same time, it uses a very complex and expensive machine, and it's very hard to make it portable.
There is a lot of work towards making it more portable, just making it small, but you really need to apply a vacuum and you need to have strong magnets. If you talk to experts in the field, they say it's going to be extremely hard to make it more portable. There are efforts towards that direction, but it is to be seen how good they are.
So, on one hand, they're extremely precise and work very well. On the other, they're very expensive and not really portable. Another problem they have is that they can only read mass, or charge. There are peptides that have the same mass; the famous example is isoleucine and leucine, and you can't really distinguish between them. So, there are limitations, and that is what we tackle with nanopores. If you could use nanopores to sequence proteins – just sequence them – then it would be a portable device that everybody could use, and it would be very cheap. This could also do things that a mass spectrometer could not do. For example, because it works on single molecules, it could differentiate peptides that have the same mass.
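The leucine and isoleucine example can be checked with a few lines of arithmetic. The sketch below is an illustration, not part of the interview: it relies on the fact that both amino acids share the elemental formula C6H13NO2 and uses standard monoisotopic atomic masses; the names `MONOISOTOPIC_MASS` and `monoisotopic_mass` are hypothetical.

```python
# Monoisotopic masses of the most abundant isotopes, in daltons.
MONOISOTOPIC_MASS = {
    "C": 12.0,
    "H": 1.00782503,
    "N": 14.0030740,
    "O": 15.9949146,
}

# Leucine and isoleucine are structural isomers: same atoms, different layout.
LEUCINE = {"C": 6, "H": 13, "N": 1, "O": 2}
ISOLEUCINE = {"C": 6, "H": 13, "N": 1, "O": 2}


def monoisotopic_mass(formula: dict) -> float:
    """Sum the atomic masses for an elemental formula such as {'C': 6, ...}."""
    return sum(MONOISOTOPIC_MASS[element] * count
               for element, count in formula.items())


leu = monoisotopic_mass(LEUCINE)
ile = monoisotopic_mass(ISOLEUCINE)
print(f"leucine: {leu:.4f} Da, isoleucine: {ile:.4f} Da")
```

Because the two formulas are identical, a measurement of mass alone returns the same value for both residues.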
There are two main problems that nanopore protein sequencing addresses. One is detecting low abundance proteins. As mass spec is an ensemble technique, you only see the more abundant proteins in your sample. You therefore need to purify and concentrate, but that's sometimes difficult because there can be very low amounts of proteins in your biological sample. Nanopores, being single molecules themselves, intrinsically have a sensitivity of single molecules. Of course, to be able to actually observe and analyse you need to capture the molecule, and that's a challenge. If you have one molecule in one millilitre of blood, the chance that that molecule can be captured by your nanopore is virtually zero. However, what you can do is drive the molecule to the nanopore. There's been work done on the way that you can concentrate molecules, for example around the membrane, and this could dramatically increase the concentration of your analyte. Or you can just attract it towards the mouth of the nanopore. For example, by attaching binders to the nanopore itself, they can funnel the molecule across the nanopore.
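To put the "one molecule in one millilitre" example into more familiar units, the short calculation below converts it to a molar concentration. It is only a unit conversion using the Avogadro constant; the function name `molar_concentration` is hypothetical and nothing here models the capture process itself.

```python
AVOGADRO = 6.02214076e23  # molecules per mole (exact SI value)


def molar_concentration(molecules: float, volume_litres: float) -> float:
    """Convert a molecule count in a given volume to mol/L."""
    return molecules / AVOGADRO / volume_litres


# One molecule in one millilitre (1e-3 L), as in the example above.
single_molecule = molar_concentration(1, 1e-3)
print(f"{single_molecule:.2e} mol/L")
```

The result is on the order of 10^-21 mol/L, which illustrates why the analyte has to be concentrated or drawn towards the pore before it can be detected.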
The second thing is that you can actually study post-translational modifications. It appears that many, if not most, of the proteins in our blood are chemically modified. They're made as the protein that we study in textbooks, but as soon as they see the cell you have all sorts of reactions that might happen at the surface, and a lot of amino acids get chemically modified.
A lot of these modifications might not be important, but some of them at least are known to be very important for the function of the protein, and for the interaction between proteins and so on. This process of modification is very hard to study with an ensemble technique because if the modification is heterogeneous, you cannot easily see it. If all proteins are modified, or a large amount of protein is modified, you could see it. However, if the modifications can happen in many different ways (glycosylation is a famous example), every molecule is slightly different to the others, so a single molecule technique is required to study it. At the moment, the single molecule techniques to sequence proteins just don't exist. So, developing the first single molecule technique that allows you to really see how the molecules are is an important step.
Professor Giovanni Maglia was speaking to Dr Karen Steward, Science Writer for Technology Networks.