Machine learning for IIoT: 4 tips to get you started

The Industrial Internet of Things (IIoT) will bring large volumes of fast-moving data.  This brings both challenges and opportunities.  At the risk of stating the obvious, one challenge is making sense of large, complex data sets.  Machine learning approaches can help here, so I’ve got four tips for getting you started with machine learning:

1.  Forget some of what you know about analytics

If you plan to deal with IIoT data, you may need to refresh your thinking about analytics.  Historically, analytics have been a relatively simple and sedate affair.  For example, analysis was often performed on historical data at some point in time after it was generated.  In addition – for better or worse – analytics often mirrored the siloed nature of data.  That is, the integration of data was minimal.  Industrial IoT will bring more data, faster, from a greater variety of sources.  Managing this data complexity to be able to respond to events in a timely way will require a much more automated and frictionless approach to the analytics value chain.  Machine learning is one way to achieve that.  It can be especially powerful with complex data, where patterns are not obvious and it’s difficult – nay, impossible – for humans to formulate and code rules.  Unfortunately, the lack of transparent logic in machine learning can be an obstacle that some people must overcome.  Many engineers just aren’t comfortable with black-box solutions.  Tough, get over it.

2.  Explore machine learning as a technology

The cloud changes everything.  In this particular case, it demolishes the barriers to entry for machine learning.  A new generation of machine learning tools (from BigML, Microsoft, Amazon, and IBM, for example) is delivered as cloud-based products.  Most offer a free trial, some for an indefinite time period.  They also offer a much more guided, tutorial-style development experience than the previous generations of software.  So what’s the cost to learn more and experiment with it…?  It’s your time.  At this point, extensive investigation of machine learning tools prior to selection isn’t strictly necessary.

Here’s how the evaluation process can work:

  • Pick a cloud-based machine learning tool; any one, it doesn’t really matter.
  • Spend a day or two playing with it.
  • If you like it, play some more.
  • If you don’t like it, pick another tool and start over using the experience you’ve already gained.

3.  Don’t be fooled – successful machine learning isn’t all data science

True enough, at a technical level, machine learning can appear enigmatic.  Seemingly without rules or logic, it can be daunting to try and understand the details.  But, that’s what IT professionals, analysts, and data scientists are for.  Like all successful IT projects, successful machine learning projects do not start and end with IT.  Business and domain expertise are crucial to success.  Consider the application of machine learning to maintenance.  Domain expertise is necessary to identify potential source data to feed the algorithms.  Further, domain expertise is required to interpret and provide context to the output of machine learning.  Like all successful IT projects, machine learning applications require a collaborative cross-functional team.

4.  Consider prescriptive maintenance applications

Many enterprises will be breaking new ground with IIoT applications.  It’s critical that the first wave of IIoT applications deliver a tangible and measurable return on investment.  Re-inventing the approach to asset maintenance provides a clear path to measurable benefits.  Research by ARC’s Ralph Rio shows that the most common approach to maintenance is still simple preventative maintenance.  And yet, as the same ARC research also shows, that is not the optimal approach for the majority of assets.  Maintenance applications that incorporate machine learning are a promising approach for capitalizing on Industrial IoT data.  The potential return on investment (ROI) in predictive maintenance is real, tangible, and relatively immediate – all good things you need in a beachhead project.

So those are my four tips – consider it my Christmas gift to you.  And no, you can’t take them back to the store for a refund if you don’t like them…

Shameless plug alert:  This and more in my exceedingly good value research report on machine learning for Industrial IoT.

(Originally published on industrial-iot.com, a blog by ARC Advisory Group analysts)

Two reasons machine learning is warming up for industrial companies

Machine learning isn’t new.  Expert systems were a strong research topic in the 1970’s and 1980’s and often embodied machine learning approaches.  Machine learning is a subset of predictive analytics, a subset that is highly automated, embedded, and self-modifying.  Currently, enthusiasm for machine learning is seeing a strong resurgence, with two factors driving that renewed interest:

Plentiful data.  It’s a popular adage with machine learning experts:  In the long run, a weaker algorithm with lots of training data will outperform a stronger algorithm with less training data.  That’s because machine learning algorithms naturally adapt to produce better results based on the data they are fed, and the feedback they receive.  And clearly, industry is entering an era of plentiful data.  Data generated by the Industrial Internet of Things (IIoT) will ensure that.  However, on the personal / consumer side of things, that era has already arrived.  For example, in 2012 Google trained a machine learning algorithm to recognize cats by feeding it ten million images of cats.  Today it’s relatively easy to find vast numbers of images, but in the 1980’s who had access to such an image library…?  Beyond perhaps a few shady government organizations, nobody.  For example, eighteen months ago Facebook reported that users were uploading 350 million images every day.  (Yes, you read that correctly, over a third of a billion images every day).  Consequently, the ability to find enough relevant training data for many applications is no longer a concern.  In fact, the concern may rapidly switch to how do you find the right, or best, training data – but that’s another story…

Lower Barriers to Entry.  The landscape of commercial software and solutions has been changed permanently by two major factors in the last decade or so:  Open source and the cloud.  Red Hat – twenty-two years old and counting – is the first company that provided enterprise software using an open source business model.  Other companies have followed Red Hat’s lead, although none have been as commercially successful.  Typically, the enterprise commercial open source business model revolves around a no-fee version of a core software product – the Linux operating system in the case of Red Hat.  This is fully functional software, not a time-limited trial, for example.  However, although the core product is free, revenue is generated from a number of optional services, and potential product enhancements.  The key point of the open source model is this:  It makes evaluation and experimentation so much easier.  Literally anyone with an internet connection can download the product and start to use it.  This makes it easy to evaluate, distribute and propagate the software throughout the organization as desired.

Use of the cloud also significantly lowers the barriers to entry for anyone looking to explore machine learning.  In a similar way to the open source model, cloud-based solutions are very easy for potential customers to explore. Typically, this would just involve registering to create a free account on the provider’s website, and then starting to develop and evaluate applications. Usually, online training and educational materials are provided too.  The exact amount of “free” resources available varies depending on the vendor. Some may limit free evaluation to a certain period, such as thirty days.  Others may limit the number of machine learning models built, or how many times they can be executed, for free. At the extreme though, some providers will provide some limited form of machine learning capacity, free of charge, forever.

Like open source solutions, cloud-based solutions also make it easier – and reduce the risk – for organizations to get started with machine learning applications.  Just show up at the vendor’s website, register, and get started.  Compare both the cloud and open source to the traditionally licensed, on-premise installed software product.  In this case, the purchase needs to be made, a license obtained, software downloaded and installed.  A process that could, in many corporations, take weeks to achieve.  A process that may need to be repeated every time the machine learning application is deployed in a production environment…

My upcoming strategy report on machine learning will review a number of the horizontal machine learning tools and platforms available.  If you can’t wait for that to get started, simply type “machine learning” into your search engine of choice and you’re just 5 minutes away from getting started.

(Originally published on industrial-iot.com, a blog by ARC Advisory Group analysts)

Re-inventing Healthcare: Cutting Re-admission rates with predictive analytics

Managing unplanned re-admissions is a persistent and enduring problem for healthcare providers.  Analysis of Medicare claims from over a decade ago showed that over 19% of beneficiaries were re-admitted within 30 days.  Attention on this measure increased when the Affordable Care Act introduced penalties for excessive re-admits.  However, many hospitals – including those in South Florida and Texas – are losing millions in revenue because of their inability to meet performance targets.

Carolinas HealthCare System has applied predictive analytics to the problem, using Predixion Software and Premier Inc.  Essentially, by using patient and population data, Carolinas is able to calculate a more timely, more accurate assessment of the re-admit risk.  The hospital can then put in place a post-acute care plan to try and minimize the risk of re-admission.  You can find a brief ten-minute webinar presented by the hospital here.  But, from an analytics, information management and decision making perspective, here are the key points:

  • The risk assessment for readmission is now done before the patient examination, not after it. Making that assessment early means there is more time to plan for the most appropriate care after discharge.
  • The risk assessment is now more precise, accurate, and consistent.  In the past, the hospital just categorized patients into two buckets – high risk and low risk.  There are now four bands of risk so the care team can make a more nuanced assessment of risk and plan accordingly.  Further, the use of Predixion’s predictive analytics software means that far more variables can be considered to make the determination of risk.  We puny humans can only realistically work with a few variables well to make a decision.  Predictive analytics allowed more than 40 data points from the EMR, ED etc. to be used to make a more accurate assessment of risk.  Finally, calculating the risk using software meant that Carolinas could avoid any variability introduced by case managers with different experience and skills.
  • The risk assessment is constantly updated.  In practice, the re-admission risk for any individual patient is going to change throughout the care process in the hospital.  So, a patient’s re-admission risk is now recalculated and updated hourly – not just once at the time of admission, which was the situation in the past.
  • The overall accuracy of risk assessment gets better over time.  A software-centered approach means that suggested intervention plans can be built in – so again reducing variability in the quality of care.  But, the data-centric approach means that the efficacy of treatment plans can also be easily measured and adjusted over the long-term.
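
For the technically curious, here is a minimal Python sketch of the banding idea in the points above.  To be clear, this is not Predixion’s software or Carolinas’ actual logic: the cut-off values, the band names and the function name risk_band are all assumptions I’ve made up for illustration.  The only facts taken from the case are that there are four bands and that the score is refreshed hourly.

```python
# Hypothetical cut-offs and labels: the case only states that four bands exist.
BANDS = [
    (0.10, "low"),
    (0.25, "moderate"),
    (0.50, "high"),
]
TOP_BAND = "very high"


def risk_band(probability: float) -> str:
    """Map a predicted re-admission probability (0 to 1) to one of four bands."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    for upper_limit, label in BANDS:
        if probability < upper_limit:
            return label
    return TOP_BAND


# Illustrative (invented) scores for one patient, as if recalculated each hour.
hourly_scores = [0.08, 0.22, 0.31, 0.55]
print([risk_band(score) for score in hourly_scores])
```

In a real deployment the probability would come from a predictive model fed by the 40-plus data points mentioned above; here it is simply typed in, so the sketch only shows how a continuously updated score can move a patient between bands during a stay.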

Overall, this data-driven approach to care is a win-win.  It results in higher care quality and better outcomes for the patient.  And Carolinas HealthCare System improves its financial performance too.  This is all possible because more of the risk assessment is now based on hard data, not intuition.

Visual Data Discovery: Eat Lunch, or Be Lunch…?

It’s time.  Already.

Monumental shifts in the software industry often follow a three-phase pattern that inevitably leaves blood on the floor when the dust has settled:

  1. Cheeky young upstart enters the market with a great new idea
  2. Cheeky young upstart starts to rake in serious sales revenue
  3. Established vendors react to nullify the threat and protect their own revenues

Think Netscape and Microsoft. Or MySQL and Oracle – there are plenty of examples.

It’s almost hard to believe, but the still fledgling visual data discovery market is already entering stage 3.  A shake out is inevitable, and inevitably there will be blood on the floor.  The only question is, whose blood?

Of course, if I actually knew the answer to that I’d be a wealthy man. I don’t, and I’m not. But, there are definitely some interesting angles to explore and I’ll be doing that in a series of blogs over the next few months. For example:

  • Is Qliktech, one of the pioneering visual data discovery vendors, struggling, or merely consolidating before it pushes on to bigger and better things? Notably, in Q3 last year, Qlik grew its maintenance revenues by almost three times as much as licence revenues (33% vs. 12%).  The full year financial report is on February 20th, so I’ll be trying to get more insight from that.
  • Tableau are reporting their latest financials on February 4th. I love Tableau as a product, it’s just such fun to use. But as a company there are surely challenges ahead. Excellent though Tableau is at visual data discovery, it has no ambitions that I know of to provide a full portfolio of BI solutions. That will become a problem (see below).
  • And then, there are the older, long established BI vendors that have been in the reporting and/or dashboard game for many years:  SAP, Oracle, IBM Cognos, MicroStrategy and Information Builders, just to name the biggest and most well known.  Now that vendors such as Qliktech, Tableau and TIBCO Spotfire have clearly shown the potential (measured in dollars) of a new class of BI tool, the established vendors all want a piece of the action too.  Hence the introduction of SAP Lumira, MicroStrategy Analytics Desktop etc. over the last 18 months.  The key question here is: when will “Free and good enough” trump “License fee for best in class”?

Although still nascent, this market will start to go through some serious upheaval that will play out over the next two or three years.  I’m going to enjoy watching it and I’d like to invite you along for the ride.  Stay tuned!