Late last year it was announced that DeepMind, the UK-founded and headquartered AI company now owned by Alphabet, Google’s holding company, had designed an algorithmic program able to accurately predict the three-dimensional shape of proteins from their chemical composition.
The announcement was hailed as a potentially revolutionary breakthrough for science, opening a whole new vista of knowledge on the building blocks of biological life, and scientists were excited about what the tool would mean for fields from drug discovery to materials science and biotechnology.
Several months on, details of the first large-scale output of the AI program, named AlphaFold, have been revealed, and they have provoked a new wave of excitement in the scientific community.
The program has spent several months teaching itself the ‘rules’ for how proteins fold by studying the structures of the comparatively small number of proteins for which that information was known. It has also now reached the stage where it can assign a score indicating how confident it is that each of its protein shape predictions is accurate.
The fruit of that progress is that the DeepMind program recently predicted the shape of 350,000 different proteins in a single day. That’s as many as were previously known to science, meaning the number catalogued was doubled in just twenty-four hours. DeepMind now intends to generate predictions for the shape of millions more proteins and to work on improving existing models.
Edith Heard, director-general of the European Molecular Biology Laboratory, worked with DeepMind on the project, advising the AI experts on how to make the program’s predictions accessible to the scientific community. Her close involvement has convinced her that the protein-mapping work being carried out by AlphaFold will lead to a “revolution” in life sciences over the coming years.
She explains:
“Proteins represent the fundamental building blocks that living organisms are made of. Accurately predicting their structures has a huge range of scientific applications, from developing new drugs and treatments for disease right through to designing future crops that can withstand climate change or enzymes that can degrade plastics. So the applications are actually limited only by our imagination.”
Why is understanding proteins better so important?
Every individual process in biology, including the processes in our cells, is driven by proteins. They are the molecular machines that underpin life at an atomic level. Being able to reliably predict not just their chemical composition but what they actually look like will give us a far deeper understanding of biology at its most fundamental levels. Heard is convinced it “will be transformative for our understanding of how life works”.
Scientists are already benefiting from AlphaFold’s insights into protein structures
DeepMind has already given early access to the powerful AI tool to some researchers working in fields that stand to benefit from AlphaFold’s protein-mapping predictions. One is Professor John McGeehan of the University of Portsmouth, who leads a team working on plastic-digesting enzymes.
He says access to the AlphaFold proteins catalogue meant his team’s project had jumped “at least a year ahead” and that the painstaking process that “took us months and years to do, AlphaFold was able to do in a weekend”.
The AlphaFold proteins library
The library of protein structures AlphaFold has so far compiled is described in the scientific journal Nature as including structure predictions for 98.5% of the proteins found in human biology. Until AlphaFold started work just a few months ago, we only knew the structure of about 17% of those proteins.
DeepMind believes 36% of AlphaFold’s predictions are as accurate as the results of the lengthy and painstaking experimental methods used until now. For another 58%, the accuracy is expected to be close enough to be scientifically useful and to be quickly improved upon as the algorithms continue to learn.
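As a quick check on the figures quoted above, the two accuracy bands can be added up to see what share of predictions falls outside both. The snippet below is a minimal sketch in Python that uses only the percentages stated in this article; the variable names are illustrative.

```python
# Percentages as quoted in the article (share of AlphaFold's predictions).
experimental_grade = 36      # as accurate as experimental methods
scientifically_useful = 58   # close enough to be scientifically useful

# Share of predictions that fall into one of the two bands.
covered = experimental_grade + scientifically_useful
# Share that falls into neither band.
remaining = 100 - covered

print(f"Covered by the two bands: {covered}%")
print(f"Outside both bands: {remaining}%")
```

On those numbers, the two bands together account for 94% of the predictions, leaving 6% that the article does not characterise.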
Why is it so hard to map protein structures?
The difficulty lies in the fact that proteins are both tiny and extremely complex in structure. Prior to AlphaFold’s development, the only really viable way to map their structure was to crystallise them before trying to painstakingly figure out the position of each atom using X-rays.
It’s a process that can take months and has been compared to trying to work out the finer details of the interior of a skyscraper by peering through a single window.
AlphaFold was given access to a database of all the proteins whose structures scientists had succeeded in slowly mapping until now. Machine learning was then used to comb through those known structures and pick out patterns in how the different atoms in proteins fold and weave into each other.
The accuracy of the technique was then benchmarked against recently analysed proteins whose structure was known but not yet published. It proved itself by comfortably winning an international competition to predict the shape of new proteins.
Why is understanding protein shape so important to scientists?
We’ve been able to accurately map the chemical composition of individual proteins for a while because the DNA that encodes them gives us the code for the amino acids they are built from. Unfortunately, it is the three-dimensional shape formed by the chemicals that make up a protein, rather than the chemicals themselves, that determines the function of a protein.
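To illustrate why the chemical composition is the easy part, reading amino acids off DNA is essentially a table lookup, three letters at a time. The sketch below is a simplified illustration, not part of AlphaFold: it uses a made-up DNA fragment and only a handful of entries from the standard genetic code, and the function name is hypothetical.

```python
# A few entries from the standard genetic code (DNA codon -> amino acid).
# The full table has 64 codons; this subset is enough for the example.
CODON_TABLE = {
    "ATG": "Met",  # also the usual start codon
    "TTT": "Phe",
    "GGC": "Gly",
    "AAA": "Lys",
    "TGG": "Trp",
    "TAA": None,   # stop codon: translation ends here
}


def translate(dna):
    """Translate a DNA coding sequence into a list of amino acid names."""
    amino_acids = []
    # Step through the sequence one codon (three letters) at a time.
    for i in range(0, len(dna) - 2, 3):
        codon = dna[i:i + 3]
        if codon not in CODON_TABLE:
            raise KeyError(f"codon {codon!r} is not in this reduced table")
        residue = CODON_TABLE[codon]
        if residue is None:
            break  # stop codon reached
        amino_acids.append(residue)
    return amino_acids


fragment = "ATGTTTGGCAAATGGTAA"  # made-up fragment for illustration only
print("-".join(translate(fragment)))
```

The output is just an ordered chain of amino acid names. Nothing in it says how that chain folds in three dimensions, which is the gap AlphaFold’s predictions aim to fill.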
What will knowing the 3D structure of proteins allow us to do?
The simple answer is ‘potentially everything’. We’re still at the early stages of better understanding and then mastering proteins, but knowing which protein structures have which functions, and the exact shape that allows for each function, gives us the basic building blocks of biology.
Biotech scientists could eventually learn to build their own molecular machines and to engineer and tweak biological processes. That should give us the ability to create new drugs and other therapies able to tackle diseases and illnesses at their root causes in basic biology.
Outside of medicine, we may soon be able to synthetically recreate biological structures or processes, or develop entirely new ones that don’t occur in nature. Experts believe the knowledge of proteins that AlphaFold will give us over the coming years will have applications in almost every discipline across biology and chemistry.
There is still a lot of work to be done before the results of AlphaFold’s work in predicting protein structures directly lead to major medical and other scientific breakthroughs. Scientists will need to test the predictions through experiments to verify their accuracy before relying upon them, at least until AlphaFold’s machine learning demonstrably reaches the point of almost 100% accuracy.
There is also an even more complex field that many of the most interesting biological processes rely on – how proteins interact with each other. That will be a future application of the AI but it may take longer to make major breakthroughs in that direction.
But scientists expect the DeepMind tool to have a major impact even in the short term by accelerating important research projects. Nobel Prize-winning biologist Sir Paul Nurse, director of London’s Francis Crick Institute, is one of the scientists to have been granted an early peek at the proteins library compiled by AlphaFold. He is in little doubt as to its significance, saying:
“This is a very major contribution, very major. What this provides is a much simpler and quicker way of getting that information. There’s a huge amount of data here and what interests me is how you turn that data into knowledge.”
Ewan Birney of the European Molecular Biology Laboratory adds:
“I know there’s going to be thousands of scientists tomorrow delightfully clicking through and looking at different structures and immediately having ideas, ideas about how that works and ideas about the next experiment. I keep pinching myself a bit about it. A whole vista of this science is opening up.”
As with any new technology, it will take time for scientists and researchers to learn how to leverage AlphaFold’s capabilities. But little by little, a trickle of positive outcomes can be expected to grow to a torrent over the coming decades.
The few exceptional minds able to fully appreciate the significance of what AlphaFold’s protein-mapping capabilities will bring to the evolution of life sciences are those most excited about it. That should be exciting for the rest of us.

