Machine learning isn’t new. Expert systems were a strong research topic in the 1970s and 1980s and often embodied machine learning approaches. Machine learning is a subset of predictive analytics, one that is highly automated, embedded, and self-modifying. Enthusiasm for machine learning is currently seeing a strong resurgence, with two factors driving that renewed interest:
Plentiful data. It’s a popular adage among machine learning experts: in the long run, a weaker algorithm with lots of training data will outperform a stronger algorithm with less training data. That’s because machine learning algorithms naturally adapt to produce better results based on the data they are fed and the feedback they receive. And clearly, industry is entering an era of plentiful data; data generated by the Industrial Internet of Things (IIoT) will ensure that. On the personal / consumer side of things, however, that era has already arrived. For example, in 2012 Google trained a machine learning algorithm to recognize cats by feeding it ten million still images taken from YouTube videos. Today it’s relatively easy to find vast numbers of images, but in the 1980s who had access to such an image library? Beyond perhaps a few shady government organizations, nobody. By contrast, eighteen months ago Facebook reported that users were uploading 350 million images every day. (Yes, you read that correctly: over a third of a billion images every day.) Consequently, finding enough relevant training data is no longer a concern for many applications. In fact, the concern may rapidly switch to how to find the right, or best, training data, but that’s another story…
Lower Barriers to Entry. The landscape of commercial software and solutions has been changed permanently by two major factors in the last decade or so: open source and the cloud. Red Hat – twenty-two years old and counting – was the first company to provide enterprise software using an open source business model. Other companies have followed Red Hat’s lead, although none has been as commercially successful. Typically, the enterprise commercial open source business model revolves around a no-fee version of a core software product – the Linux operating system in the case of Red Hat. This is fully functional software, not a time-limited trial. Although the core product is free, revenue is generated from a number of optional services and potential product enhancements. The key point of the open source model is this: it makes evaluation and experimentation so much easier. Literally anyone with an internet connection can download the product and start to use it. This makes it easy to evaluate, distribute and propagate the software throughout the organization as desired.
Use of the cloud also significantly lowers the barriers to entry for anyone looking to explore machine learning. As with the open source model, cloud-based solutions are very easy for potential customers to explore. Typically, this just involves registering for a free account on the provider’s website and then starting to develop and evaluate applications. Online training and educational materials are usually provided too. The exact amount of “free” resources available varies by vendor. Some limit free evaluation to a certain period, such as thirty days. Others limit the number of machine learning models that can be built, or how many times they can be executed, for free. At the extreme, though, some providers offer a limited form of machine learning capacity free of charge, forever.
Like open source solutions, cloud-based solutions also make it easier – and less risky – for organizations to get started with machine learning applications. Just show up at the vendor’s website, register, and get started. Compare both the cloud and open source to the traditionally licensed software product installed on-premises. There, a purchase needs to be made, a license obtained, and the software downloaded and installed: a process that could, in many corporations, take weeks, and one that may need to be repeated every time the machine learning application is deployed in a production environment…
My upcoming strategy report on machine learning will review a number of the horizontal machine learning tools and platforms available. If you can’t wait for that, simply type “machine learning” into your search engine of choice and you’re just five minutes away from getting started.
(Originally published on industrial-iot.com, a blog by ARC Advisory Group analysts)