Artificial Intelligence and Machine Learning Have Taken Centre Stage – Here’s Why
Article Oct 12, 2017 | By Mahesh Pancholi, Research Computing and Life Sciences Specialist at High Performance Computing, Storage and Data Analytics Integrator, OCF.
We’ve reached a point where interest in Artificial Intelligence (AI), machine learning and deep learning has gained huge traction. Why? We are moving into an era where science fiction is becoming fact.
AI and machine learning are not new concepts; Greek mythology is littered with references to giant automata such as Talos of Crete and the bronze robot of Hephaestus. However, the ‘modern AI’ idea of thinking machines that we have all come to understand was founded as a field of research in 1956, at a workshop held at Dartmouth College.
Since the 1950s, numerous studies, programmes and projects into AI have been launched and funded to the tune of billions, and the field has been through numerous hype cycles. But it’s only in the past 5–10 years that the prospect of AI becoming a reality has really taken hold.
The rise of research computing
Research computing has been synonymous with High Performance Computing (HPC) for more than twenty years, with HPC the tool of choice for fields such as astrophysics. But over the last two decades, many other areas of scientific research that started needing computational power found that their workloads fell outside what traditional HPC systems were built for.
Bioinformatics, for example, a field of study that aims to develop methods and software tools for understanding biological data such as human genomes, needed greater computational horsepower but had very different requirements from those that many existing HPC systems were designed to meet. The fastest way to a result was still to cram onto these systems, even though the existing HPC just wasn’t fit for purpose.
That is where research computing was born. You couldn’t just have one system for all research types; you needed to diversify and provide a service or platform. From there, HPC systems started to be built to meet varied workload demands, such as the high-memory nodes needed to handle and analyse large, complex biological data.
Even so, scientific researchers are very good at exhausting the available resources of a supercomputer. It’s rare to find an HPC system that ever sits idle, or that has spare capacity for more research projects.
With the want, and need, for ever larger systems, universities started to look towards cloud platforms to help with scientific research. That’s one of the reasons why cloud technologies such as OpenStack have started to gain a foothold within higher education.
You can build supercomputers on commodity hardware, which is affordable, easy to obtain, broadly compatible with a wide range of technologies and able to function on a plug-and-play basis, and use them for day-to-day research. The cloud aspect can then enable organisations to ‘burst out’ to the public cloud for jobs that are too complex or large for the commodity HPC systems.
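As a rough illustration of the ‘burst out’ idea, the sketch below places each job on the local cluster when it fits in the free capacity and sends it to the public cloud otherwise. The `Job` and `Cluster` types, the `place_job` function and the job sizes are all hypothetical, invented for this example; a real scheduler would also weigh queue times, data movement and cost.

```python
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    cores: int
    memory_gb: int


@dataclass
class Cluster:
    free_cores: int
    free_memory_gb: int


def place_job(job: Job, local: Cluster) -> str:
    """Return 'local' if the job fits in the free local capacity, else 'cloud'."""
    if job.cores <= local.free_cores and job.memory_gb <= local.free_memory_gb:
        # Reserve the local resources so later jobs see the reduced capacity.
        local.free_cores -= job.cores
        local.free_memory_gb -= job.memory_gb
        return "local"
    # Too large for what is left on the commodity system: burst out.
    return "cloud"


# Hypothetical local cluster and workload, for illustration only.
local = Cluster(free_cores=64, free_memory_gb=256)
jobs = [
    Job("alignment", cores=32, memory_gb=128),
    Job("assembly", cores=48, memory_gb=512),
    Job("qc", cores=16, memory_gb=64),
]
placements = {job.name: place_job(job, local) for job in jobs}
```

With these made-up numbers, the high-memory `assembly` job is the one that gets sent to the cloud, while the other two run locally.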
Public cloud providers have very quickly spotted this opportunity, which is why we see the likes of Amazon and Microsoft putting a lot of work into building HPC-type infrastructures that now incorporate Graphics Processing Units (GPUs) and InfiniBand connectivity.
The growth in larger HPC systems and the ability to take advantage of public cloud infrastructure have enabled research to become far more computationally intensive. The growing use of GPUs has helped too, essentially supercharging scientific research.
This combination of advancing technologies has allowed people to conduct deep learning and machine learning, the precursors to modern AI systems, in a more meaningful sense. Although deep learning and machine learning algorithms have existed for many years, the compute power wasn’t available to process large datasets in parallel in any sort of useful timeframe.
You can now have multiple GPUs in a clustered system that can tackle huge amounts of data using massively complex algorithms. And they can do this in timeframes that now make deep learning and machine learning projects financially viable.
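One common way to spread training across several GPUs is data parallelism: each device computes gradients on its own shard of the data, and the results are combined into a single model update. The sketch below mimics that pattern in plain Python, with list shards standing in for GPUs. The toy model (fitting `y = w * x`), the function names and the learning rate are assumptions made for illustration, not a description of any particular framework.

```python
def shard_gradient(w, shard):
    """Sum of squared-error gradients for y ~ w * x over one shard, plus the shard size."""
    grad_sum = sum(2 * (w * x - y) * x for x, y in shard)
    return grad_sum, len(shard)


def data_parallel_step(w, shards, lr):
    """One gradient-descent step using gradients computed shard by shard."""
    # In a real cluster each shard would be processed on its own GPU at the same time.
    results = [shard_gradient(w, shard) for shard in shards]
    total_grad = sum(grad for grad, _ in results)
    total_count = sum(count for _, count in results)
    # Combining the per-shard sums gives the same mean gradient as a single device would.
    return w - lr * total_grad / total_count


# Toy dataset generated from y = 3x, split across four simulated devices.
data = [(float(x), 3.0 * x) for x in range(1, 9)]
shards = [data[i::4] for i in range(4)]

w = 0.0
for _ in range(200):
    w = data_parallel_step(w, shards, lr=0.01)
```

Because the per-shard gradients are summed before the update, splitting the data changes where the work happens but not the result, which is what lets extra GPUs shorten the time to an answer.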
It’s this research computing heritage, and the widening of the research computing platform to incorporate the public cloud and the progression of GPUs, that is enabling AI. There’s huge interest in AI, and most cutting-edge researchers are trying to understand how cognitive computing can be applied to their research to provide a competitive edge.
There isn’t an area that can’t benefit from artificial intelligence
Where there is data, there is potential to benefit from AI. If you already have datasets that you use to produce insights or outcomes (which is pretty much everyone doing anything!) then you already have a training dataset that can teach your algorithm of choice how to assist you. There are many possible benefits to using AI in this way:
- Providing assistance to human decision making. The algorithm makes a suggestion and provides reasoning that allows a human to accept or reject the recommendation. This would be relevant to areas such as medicine, where your doctor can provide an AI-assisted diagnosis.
- Making life more interesting. Any task that a human can achieve in under a second has the potential to be performed by AI. Why should a person be stuck with a repetitive task when AI can learn to do it, giving people the freedom to take on more complex tasks and have more free time?
- Increasing efficiency without making compromises. The Square Kilometre Array (SKA) is a great example of a project that generates far more data than humans are physically able to look at and evaluate, so a huge proportion of the data is thrown away before it has been analysed. Whilst a lot of data from SKA may just be ‘noise’ or temporary files, it’s still worth considering what might be in the data that has been discarded. If you apply AI ‘on the wire’ to that data and analyse it on the fly, you gain much greater clarity and certainty about the data you’re keeping and the data you’re discarding.
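To make the ‘on the wire’ idea in the last point concrete, here is a minimal sketch of filtering a stream as it arrives, keeping only the samples that stand out from the background. A simple running-statistics threshold stands in for a trained model here; the `StreamingFilter` class, its parameters and the sample values are hypothetical and are not taken from SKA or any real pipeline.

```python
import math


class StreamingFilter:
    """Flags samples that deviate strongly from the running mean of the background."""

    def __init__(self, threshold=3.0, warmup=10):
        self.threshold = threshold  # number of standard deviations that counts as unusual
        self.warmup = warmup        # samples to observe before flagging anything
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0               # running sum of squared deviations (Welford's method)

    def keep(self, x):
        """Return True if x looks unusual enough to be worth storing."""
        unusual = False
        if self.n >= self.warmup:
            std = math.sqrt(self.m2 / (self.n - 1))
            unusual = std > 0 and abs(x - self.mean) > self.threshold * std
        if not unusual:
            # Only background samples update the noise model, one value at a time,
            # so the stream never has to be held in memory.
            self.n += 1
            delta = x - self.mean
            self.mean += delta / self.n
            self.m2 += delta * (x - self.mean)
        return unusual


# Made-up stream: repetitive background noise with one strong signal in the middle.
noise = [1.0, -1.0, 0.5, -0.5]
stream = noise * 5 + [25.0] + noise * 2

stream_filter = StreamingFilter()
kept = [x for x in stream if stream_filter.keep(x)]
```

Each sample is judged once, as it passes, and the decision to keep or discard is made against an explicit model of the background rather than by throwing data away unseen.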
The future of research computing and artificial intelligence
It has always been compute power that has held AI back, along with the ability to connect computing, networking and storage together. Previously, the connections between computing and storage weren’t fast enough, but vendors such as Mellanox and Intel are now pushing the boundaries of InfiniBand and Ethernet speeds.
There’s a move towards alternative architectures too. Intel has traditionally dominated research computing, but ARM is becoming increasingly competitive and OpenPOWER is also pushing novel technology.
This increased competition means we will see new combinations and mixes of technologies and vendors in the same systems, which will undoubtedly benefit the research computing discipline.
Longer term, the real success of true AI will be in connecting disparate HPC systems and frameworks together. Figuring out how to connect them all is incredibly complex and presents a true challenge, but when this issue is solved, we’ll surely be entering the golden age of research computing.