Saturday, January 21, 2012

I'm A Data Detective

I'm a data detective.  In most of my professional work customers pay me very well (gobs) to come into a situation with which I'm generally, and often utterly, unfamiliar.  Over 20 years that has meant making toilet paper better, estimating production of an oil and gas platform in the North Sea, forecasting the future relative value of thousands of financial securities, and diagnosing cardiac problems; the list is endless.  Each situation is quite different, and sleep apnea is too, but there is some commonality in the approach.  So, what is my process?  Well, the first thing is to do some thinking...

  1. What are you after?  What are your goals?  Can you measure them numerically?  No?  Find a way. There may be many goals.  Rank them by importance.  Go after the most important one first.
  2. What factors, in general terms, are likely to influence your goals?  Which ones are "controllable"?  That is, which factors can you change, which ones can you not?  If you have no controllable factors, you are in a bit of trouble, as you are operating just at the whim of your environment.  That would not be good.
  3. What data do you have about those factors?  Where is it?  Can we get it?
  4. Look at the data by eye, in relation to your goals and to each other.  How do the factors appear to interact?  How "noisy" is the data?  How does it move in relation to your goals?  This is more of an art that I'm trying to turn into "science" (automated).
  5. Do some preliminary data modeling (that's mathematical) using our software to try to estimate the quantified goals.  Our software is extremely good at sniffing out relations between things even in very noisy data.  Look at which driving factors influence the goals, and how (sensitivity analysis and 3D response curves and surfaces).  Hopefully some of your controllable factors have a significant influence on the results you want, else you are "driving a car with a loose steering wheel".  You make changes but little or nothing happens.  In that case, again, you are operating somewhat at the whim of your environment.
  6. For the factors that are in your control and that the preliminary modeling indicates have some influence on your goals, run what is called a "designed experiment": change those factors as much as possible, up and down, keeping within reasonable limits.  This data is golden, as it will give you a clearer picture of just how much you can control your goals.  If you jig-jag them around a lot and not much happens... oopsie. Loose steering wheel again.
  7. Remodel using the newly generated data. Estimate the current or future values of the goals using the models.  Do these estimates usefully approximate the actual values of the goals?  "Usefully" is important.  You may not be very accurate, but if the result is useful, has value, to the final user, that is goodness.  If the data arrives in real time, you might put these models online to estimate things and maybe alert you to problems.  Or it might be possible before sleeping, given what you have done during the day, to tell you what your sleep quality might be.  Are these estimates useful?  Do they seem to be right?
  8. Lastly, optimize.  Using the models, manipulate the controllable factors (synthetically, in the data) to seek the set of values that maximizes your good goals and minimizes the bad ones.  This is a complex multivariable optimization of multiple objectives, perhaps within constraints.  Try what it tells you to do.
This process is general and gets adapted to each situation, but it is roughly what I do to help people and companies (and in this case myself) achieve higher performance. The steps are somewhat in order of progression, but in practice I jump around a lot.  For example, if you think you have factors that influence your goals, but later find out you were wrong, you go back and look around.  Or maybe you did not state your goals quite right, and you learned something that changed how you view and quantify them.
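
To make steps 6 through 8 a little more concrete, here is a minimal sketch in Python. It is not the software mentioned above, just a textbook two-level full factorial experiment with a main-effects analysis. The factor names (caffeine, bedtime, exercise), the simulated sleep score, and the 1.0 threshold are all made up for illustration; real responses would come from actually running the experiment.

```python
from itertools import product
from statistics import mean

def full_factorial(factors):
    """Every combination of low/high settings, coded as -1 (low) and +1 (high)."""
    return [dict(zip(factors, levels))
            for levels in product((-1, 1), repeat=len(factors))]

def main_effects(runs, responses):
    """Main effect of a factor: mean response at +1 minus mean response at -1."""
    effects = {}
    for name in runs[0]:
        high = [y for run, y in zip(runs, responses) if run[name] == 1]
        low = [y for run, y in zip(runs, responses) if run[name] == -1]
        effects[name] = mean(high) - mean(low)
    return effects

# Hypothetical controllable factors (assumptions, not from real data).
factors = ["caffeine", "bedtime", "exercise"]
runs = full_factorial(factors)

def simulated_sleep_score(run):
    """Made-up stand-in for a measured outcome; replace with real observations."""
    return 70 - 6 * run["caffeine"] - 3 * run["bedtime"] + 0.1 * run["exercise"]

responses = [simulated_sleep_score(run) for run in runs]

# Step 6: how much does each factor move the goal when pushed low to high?
effects = main_effects(runs, responses)

# "Loose steering wheel" check: factors whose effect is below an arbitrary threshold.
threshold = 1.0
loose = [name for name, effect in effects.items() if abs(effect) < threshold]

# Step 7: a simple main-effects model built from the experiment.
grand_mean = mean(responses)

def predict(run):
    return grand_mean + sum(effects[name] / 2 * run[name] for name in run)

# Step 8: search the candidate settings for the best predicted outcome.
best = max(runs, key=predict)

for name, effect in effects.items():
    print(f"{name}: effect {effect:+.1f}")
print("loose steering wheel:", loose)
print("best settings:", best, "predicted score:", round(predict(best), 1))
```

With only one goal and three two-level factors, the "optimization" is just a search over eight candidate settings. With several goals you would need to combine them somehow, for example as a weighted score, and that weighting is itself a judgment call.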

We'll see how it goes.
