Category Archives: Cloud

Machine learning for IIoT: 4 tips to get you started

The Industrial Internet of Things (IIoT) will bring large volumes of fast-moving data.  This brings both challenges and opportunities.  At the risk of stating the obvious, one challenge is making sense of large, complex data sets.  Machine learning approaches can help here, so I’ve got four tips for getting you started with machine learning:

1.  Forget some of what you know about analytics

If you plan to deal with IIoT data, you may need to refresh your thinking about analytics.  Historically, analytics have been a relatively simple and sedate affair.  For example, analysis was often performed on historical data at some point in time after it was generated.  In addition – for better or worse – analytics often mirrored the siloed nature of data.  That is, the integration of data was minimal.  Industrial IoT will bring more data, faster, from a greater variety of sources.  Managing this data complexity to be able to respond to events in a timely way will require a much more automated and frictionless approach to the analytics value chain.  Machine learning is one way to achieve that.  It can be especially powerful with complex data, where patterns are not obvious and it’s difficult – nay, impossible – for humans to formulate and code rules.  Unfortunately, the lack of transparent logic in machine learning can be an obstacle that some people must overcome.  Many engineers just aren’t comfortable with black-box solutions.  Tough, get over it.
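
To make the rules-versus-learning point concrete, here is a minimal sketch in Python.  The vibration readings and failure labels are invented purely for illustration, and learn_threshold is a hypothetical helper, not part of any vendor’s tool.  Instead of an engineer hand-coding a cut-off, it picks one from labelled examples, which is about the simplest thing that could be called machine learning.

```python
# Toy readings, made up for illustration: (vibration level, failed soon after?)
readings = [
    (0.8, False), (1.1, False), (1.4, False), (1.9, False),
    (2.3, True), (2.2, False), (2.9, True), (3.4, True),
]


def learn_threshold(samples):
    """Return the cut-off that misclassifies the fewest samples.

    Candidate cut-offs are the midpoints between neighbouring sorted values.
    A reading above the cut-off is predicted to fail.
    """
    values = sorted(value for value, _ in samples)
    candidates = [(a + b) / 2 for a, b in zip(values, values[1:])]

    def errors(cut):
        # Count samples where the prediction disagrees with the label.
        return sum((value > cut) != failed for value, failed in samples)

    return min(candidates, key=errors)


threshold = learn_threshold(readings)
print(f"Learned cut-off: {threshold:.2f}")
```

With one signal, a human could eyeball the cut-off.  With hundreds of interacting signals that stops being possible, and that is where letting the data set the rules earns its keep.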

2.  Explore machine learning as a technology

The cloud changes everything.  In this particular case, it demolishes the barriers to entry for machine learning.  A new generation of machine learning tools (from BigML, Microsoft, Amazon, and IBM, for example) are cloud-based products.  Most offer a free trial, some for an indefinite time period.  They also offer a much more guided, tutorial-style development experience than previous generations of software.  So what’s the cost to learn more and experiment with it…?  It’s your time.  At this point, extensive investigation of machine learning tools prior to selection isn’t strictly necessary.

Here’s how the evaluation process can work:

  • Pick a cloud-based machine learning tool; any one, it doesn’t really matter.
  • Spend a day or two playing with it.
  • If you like it, play some more.
  • If you don’t like it, pick another tool and start over using the experience you’ve already gained.

3.  Don’t be fooled – successful machine learning isn’t all data science

True enough, at a technical level, machine learning can appear enigmatic.  Seemingly without rules or logic, it can be daunting to try and understand the details.  But, that’s what IT professionals, analysts, and data scientists are for.  Like all successful IT projects, successful machine learning projects do not start and end with IT.  Business and domain expertise are crucial to success.  Consider the application of machine learning to maintenance.  Domain expertise is necessary to identify potential source data to feed the algorithms.  Further, domain expertise is required to interpret and provide context to the output of machine learning.  Like all successful IT projects, machine learning applications require a collaborative cross-functional team.

4.  Consider prescriptive maintenance applications

Many enterprises will be breaking new ground with IIoT applications.  It’s critical that the first wave of IIoT applications deliver a tangible and measurable return on investment.  Re-inventing the approach to asset maintenance provides a clear path to measurable benefits.  Research by ARC’s Ralph Rio shows that the most common approach to maintenance is still simple preventative maintenance.  And yet, as the same ARC research also shows, that is not the optimal approach for the majority of assets.  Maintenance applications that incorporate machine learning are a promising approach for capitalizing on Industrial IoT data.  The potential return on investment (ROI) in predictive maintenance is real, tangible, and relatively immediate – all good things you need in a beachhead project.

So those are my four tips – consider it my Christmas gift to you.  And no, you can’t take them back to the store for a refund if you don’t like them…

Shameless plug alert:  This and more in my exceedingly good value research report on machine learning for Industrial IoT.

(Originally published on a blog by ARC Advisory Group analysts)


Let’s play Clue: Who really killed EMC?

I used to love the board game Clue as a kid (or Cluedo as it’s called back home).  Often when you won, you knew with 100% certainty the who, what, and where for the murder before you made your bold pronouncement.  But sometimes, if you thought someone else was close to solving the murder, you had to take an early best guess with a little less certainty.  And that’s a bit like where I am with EMC.  Do I know for sure who killed EMC..?  No.  But I’m willing to go out on a bit of a limb – I think I can guess who killed EMC, where, and with what weapon.

Since the acquisition of EMC by Dell was announced, there’s been a bit of a kerfuffle in the Bay State.  There’s much hand-wringing that another Boston tech giant is, well, no longer a Boston tech giant.  (EMC is relocating its HQ to Texas.)  People have long memories, and the ghost of DEC is apparently still haunting my neighbors as we approach Halloween.  Truthfully, I’m a bit shocked that Dell is being cast in a bad light – a bit of a party crasher, a vulture, a bit of an Ebenezer Scrooge.  So let me put that straight – who really killed EMC?

It was Amazon, in the cloud, with a commodity disk drive.  Truthfully, most of my evidence is circumstantial, but when did that ever hold back a lawyer?

  • The amount of data is growing by about 40% a year – or doubling every two years.  In an ironic twist, I’ll cite numbers from IDC in research bought and paid for by EMC.  To counter this somewhat, the cost per byte of raw disk storage seems to be halving roughly every three years at the moment.  Bottom line, money is still being spent on storage.
  • The storage hardware segment of EMC’s business (Information Storage) has struggled for growth.  From EMC’s public financials, from 2012 to 2013, revenues grew 4%.  But, from 2013 to 2014, the growth rate for this business slowed to only 2%.  And if this data from IDC is accurate (and I have no reason to think that it’s not), EMC lost market share and saw revenues decline early this year – particularly in the lucrative storage systems business.
  • Amazon is building out a colossal computing infrastructure using commodity hardware.  James Hamilton notes this in his excellent presentation from re:Invent 2014:  Amazon saw 132% year-over-year growth in data transferred in its S3 storage solution, and has over one million customers active on AWS.  Every day Amazon adds enough capacity to AWS to support a $7bn ecommerce operation – effectively all of Amazon’s business back in 2004 when it was a $7bn company.  How much capacity is that?  I’m not sure to be honest, but if Amazon’s average sale in 2004 was $30, that’s over 233m sales transactions that need to be recorded, processed and supported.  Sounds like a lot of storage to me…  And I very much doubt Amazon uses EMC’s premium products for that.  As James notes, Amazon typically designs its own servers and storage racks.
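
For anyone who wants to check my sums, the arithmetic behind the bullets above can be sketched in a few lines of Python.  The inputs are the figures quoted in this post.  The one thing added here is the assumption that storage spend scales simply as bytes stored times cost per byte, which is a simplification.

```python
# Back-of-envelope check of the storage figures quoted in the bullets above.

DATA_GROWTH_PER_YEAR = 0.40    # "about 40% a year"
COST_HALVING_YEARS = 3         # cost per byte "halving roughly every three years"

# 40% compound growth over two years: close to doubling.
two_year_data_multiple = (1 + DATA_GROWTH_PER_YEAR) ** 2

# Yearly multiplier on cost per byte if that cost halves every three years.
yearly_cost_multiple = 0.5 ** (1 / COST_HALVING_YEARS)

# Assumption: spend = bytes stored x cost per byte, with nothing else changing.
yearly_spend_multiple = (1 + DATA_GROWTH_PER_YEAR) * yearly_cost_multiple

# Amazon in 2004: $7bn of revenue at the post's "if" figure of $30 per sale.
REVENUE_2004 = 7_000_000_000
AVERAGE_SALE = 30
transactions_millions = REVENUE_2004 / AVERAGE_SALE / 1_000_000

print(f"Data volume after two years: {two_year_data_multiple:.2f}x")
print(f"Cost per byte, year on year: {yearly_cost_multiple:.3f}x")
print(f"Storage spend, year on year: {yearly_spend_multiple:.3f}x")
print(f"Sales transactions in 2004:  {transactions_millions:.1f}m")
```

On those assumptions, data volume slightly outpaces the falling cost per byte, so total spend on storage still grows by roughly 11% a year.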

So, I rest my case, your honor.  What used to be stored on EMC systems in corporate data centers is now being stored on cheap disks in Amazon’s cloud.  Amazon did it, Amazon killed EMC.

(Originally published on a blog by ARC Advisory Group analysts)

Two reasons machine learning is warming up for industrial companies

Machine learning isn’t new.  Expert systems were a strong research topic in the 1970s and 1980s and often embodied machine learning approaches.  Machine learning is a subset of predictive analytics, a subset that is highly automated, embedded, and self-modifying.  Currently, enthusiasm for machine learning is seeing a strong resurgence, with two factors driving that renewed interest:

Plentiful data.  It’s a popular adage with machine learning experts:  In the long run, a weaker algorithm with lots of training data will outperform a stronger algorithm with less training data.  That’s because machine learning algorithms naturally adapt to produce better results based on the data they are fed, and the feedback they receive.  And clearly, industry is entering an era of plentiful data.  Data generated by the Industrial Internet of Things (IIoT) will ensure that.  However, on the personal / consumer side of things, that era has already arrived.  For example, in 2012 Google trained a machine learning algorithm to recognize cats by feeding it ten million images of cats.  Today it’s relatively easy to find vast numbers of images, but in the 1980s who had access to such an image library…?  Beyond perhaps a few shady government organizations, nobody.  For example, eighteen months ago Facebook reported that users were uploading 350 million images every day.  (Yes, you read that correctly, over a third of a billion images every day.)  Consequently, the ability to find enough relevant training data for many applications is no longer a concern.  In fact, the concern may rapidly switch to how you find the right, or best, training data – but that’s another story…

Lower Barriers to Entry.  The landscape of commercial software and solutions has been changed permanently by two major factors in the last decade or so:  Open source and the cloud.  Red Hat – twenty-two years old and counting – is the first company that provided enterprise software using an open source business model.  Other companies have followed Red Hat’s lead, although none have been as commercially successful.  Typically, the enterprise commercial open source business model revolves around a no-fee version of a core software product – the Linux operating system in the case of Red Hat.  This is fully functional software, not a time-limited trial, for example.  However, although the core product is free, revenue is generated from a number of optional services, and potential product enhancements.  The key point of the open source model is this:  It makes evaluation and experimentation so much easier.  Literally anyone with an internet connection can download the product and start to use it.  This makes it easy to evaluate, distribute and propagate the software throughout the organization as desired.

Use of the cloud also significantly lowers the barriers to entry for anyone looking to explore machine learning.  In a similar way to the open source model, cloud-based solutions are very easy for potential customers to explore.  Typically, this would just involve registering to create a free account on the provider’s website, and then starting to develop and evaluate applications.  Usually, online training and educational materials are provided too.  The exact amount of “free” resources available varies depending on the vendor.  Some may limit free evaluation to a certain period, such as thirty days.  Others may limit the number of machine learning models built, or how many times they can be executed, for free.  At the extreme though, some providers offer a limited form of machine learning capacity, free of charge, forever.

Like open source solutions, cloud-based solutions also make it easier – and reduce the risk – for organizations to get started with machine learning applications.  Just show up at the vendor’s website, register, and get started.  Compare both the cloud and open source to the traditionally licensed, on-premise installed software product.  In this case, the purchase needs to be made, a license obtained, software downloaded and installed.  A process that could, in many corporations, take weeks to achieve.  A process that may need to be repeated every time the machine learning application is deployed in a production environment…

My upcoming strategy report on machine learning will review a number of the horizontal machine learning tools and platforms available.  If you can’t wait for that to get started, simply type “machine learning” into your search engine of choice and you’re just 5 minutes away from getting started.

(Originally published on a blog by ARC Advisory Group analysts)

The cloud changes everything – if you’ll let it…

In my experience, technologies are rarely adopted by corporations as rapidly as expected – or maybe it’s just as rapidly as vendors would like them to be…

Often, the challenge of organizational culture is overlooked.  Few people enthusiastically embrace change, and we’ve surely all experienced new releases or upgrades that really were detrimental.  Change will surely be more difficult for some companies than others when it comes to the Industrial Internet of Things and the adoption of the cloud that will come with that.

But, change is inevitable, like it or not.  At this point, I’m thinking there are really only two types of companies when it comes to cloud adoption:

  1. Companies that have officially blessed putting some data and applications in the cloud, and created policies around that.
  2. Companies that have policies explicitly forbidding use of the cloud – but whose employees are secretly using the cloud anyway!

In the second case, why would those employees commit what, in many cases, is technically a dismissible offense?  It’s usually because some cloud service makes their job much easier to do, whether it’s mere cloud storage or a more sophisticated Software-as-a-Service application.  It’s that simple.

The more I learn and think about the cloud, the more convinced I am that it’s a game changer:

  • A year ago I wrote about how solutions like Amazon’s Redshift had the potential to completely change how business analysts, data warehouse engineers, and even progressive CIOs conceive, design, and execute business intelligence and analytics projects (The Disposable Data Warehouse:  How Will You Use Yours?).
  • At SAP SAPPHIRE NOW in May this year I learned how the cloud helped T-Mobile to complete a proof-of-concept in two weeks, instead of waiting 4 months just to procure the hardware to run the same proof-of-concept on-premise.  In this example, the cloud fosters agility and can help to cut the time needed to bring new products and services to market.
  • In one of my current research projects I’m taking a deeper look at the red-hot world of machine learning.  (So red-hot there are more than 700 startups apparently…).  In this instance, I’m realizing how the cloud can completely change the way enterprises choose software solutions.  Many of the machine learning startups are cloud-based.  That is, users develop, test and deploy their machine learning applications in the cloud.  These solutions typically provide a robust framework to help users get started with their applications quickly.  In this way, the cloud can make the evaluation cycle so much faster for potential buyers:  Pick a cloud-based solution, and try it out for a couple of days.  If you like it, move towards a production application (or a more fully-fledged prototype).  If you don’t like it, just move on – pick another cloud-based machine learning tool and start over…

(Originally published on a blog by ARC Advisory Group analysts)