Wednesday, February 22, 2012

Clustering Zeo Brainwaves

One reason "sleep stages" exist is so one can quantify how much time was spent in each kind of sleep. It is a multi-level discrete classification of what is in reality a continuous process, but nonetheless the concept of "sleep stages" is useful.  I would like to identify the nature of my sleep and relate how I feel the next day to how much of each kind of sleep I had.  If I can identify the different kinds of sleep and measure how long I spend in each, I might be able to correlate that with how I feel, and so get a handle on sleep quality.  People say the more "Deep" and "REM" you get the better, but I'm guessing there's more to that story.

The challenge is that I only have a Zeo, a non-medical, 3-lead frontal lobe EEG device, and I get a bit confused about how sleep is categorized based on the dominance of this wave or that.  This is compounded by each person being different, and I read today that people with Obstructive Sleep Apnea (OSA) have somewhat abnormal sleep as well.  I could use Zeo's classification of sleep stages (I might, since I have them in real time) or I could come up with my own scheme.  In this adventure, I'm going to make my own.

I have Zeo brainwaves (Delta, Alpha, Betas, Theta, Gamma) and sleep stages (Wake, REM, Light, Deep) gathered via a serial port from the Zeo Bedside unit and archived in real time to a historian.  I can extract the data between any two dates/times, such as a night's sleep.  Last night's data looks like this:


Now, I'm no expert in sleep stages, but I am one in data analysis and I see a few different forms of sleep.  For this adventure I'm not going to try to identify the sleep stages using the established medical definitions (if it is even possible with the Zeo), but I'm going to let the data itself decide what it sees.  How am I going to do this?

Clustering
Clustering is a data analysis method that separates data into "piles" (clusters) based on their similarities and differences.  In this case the software looks at each moment in time and decides whether the Delta, Alpha, Betas, Theta, Gamma wave values (amplitudes) are similar to those of all the other moments.  In this exercise I took noise out of the data by taking 30-45 second averages of the data (just a double-click in our software).  Then, using a "Self Organizing Map" (SOM), I grouped the data based on its similarity.  SOMs use a process that moves similar data towards each other, and dissimilar data away from each other, on a two dimensional map.  That way I can take the 7 wave forms (7 dimensional data) and project them onto a 2D surface, like moving chips around on a table, putting similar ones together automatically.  We humans can think in 2D and 3D pretty well.  7D is very difficult for us mere mortals. What you get after the SOM groups the data is something that looks like this:


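As a rough sketch of the averaging and SOM steps in Python (this is a generic textbook-style SOM, not the software used for this post): the function names, the 4x4 map size, the learning-rate and radius schedules, and the synthetic 7-value rows are all assumptions made for illustration.

```python
import math
import random

def block_average(rows, block):
    """Average consecutive groups of `block` rows, column by column."""
    out = []
    for i in range(0, len(rows) - block + 1, block):
        chunk = rows[i:i + block]
        out.append([sum(col) / block for col in zip(*chunk)])
    return out

def best_unit(weights, row):
    """Return the (x, y) map node whose weight vector is closest to `row`."""
    return min(weights, key=lambda node: math.dist(weights[node], row))

def train_som(data, width, height, epochs=20, seed=0):
    """Train a small rectangular SOM; returns weight vectors keyed by (x, y)."""
    rng = random.Random(seed)
    dim = len(data[0])
    weights = {(x, y): [rng.random() for _ in range(dim)]
               for x in range(width) for y in range(height)}
    total = epochs * len(data)
    step = 0
    for _ in range(epochs):
        for row in rng.sample(data, len(data)):      # shuffled pass over the data
            frac = step / total
            rate = 0.5 * (1.0 - frac)                # learning rate decays to 0
            radius = max(1.0, (max(width, height) / 2) * (1.0 - frac))
            bx, by = best_unit(weights, row)
            for (x, y), w in weights.items():
                grid_d2 = (x - bx) ** 2 + (y - by) ** 2
                # Nodes near the winner on the 2D map are pulled hardest.
                pull = rate * math.exp(-grid_d2 / (2 * radius ** 2))
                for k in range(dim):
                    w[k] += pull * (row[k] - w[k])
            step += 1
    return weights

# Synthetic stand-in for two very different kinds of 7-band rows (made-up values).
rng = random.Random(1)
quiet = [[0.1 + rng.uniform(-0.03, 0.03) for _ in range(7)] for _ in range(20)]
active = [[0.9 + rng.uniform(-0.03, 0.03) for _ in range(7)] for _ in range(20)]
weights = train_som(quiet + active, width=4, height=4)
print(best_unit(weights, quiet[0]), best_unit(weights, active[0]))
```

With real data, each row would first go through `block_average` and would probably need scaling so that no single band dominates the distance calculation.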
Very pretty.  Rather impressive.  Good to have on your desktop when the boss walks by if you are a data analysis person (wink).  Each square has very similar data within it and neighboring squares also have data with similar characteristics.  The closer the squares, the closer the values.  Adjacent red regions contain very similar values.  The green, yellow and blue indicate larger differences, like mountains or valleys between the red regions.  I can group these squares (and the data within them) into clusters of similar data by setting a "data distance" (dissimilarity) criterion.  Below you can see 5 different gray regions (clusters), each of which contains similar data.


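One way to picture the "data distance" grouping is as a flood fill over the map: neighboring squares are merged into one cluster whenever their values are closer than a chosen threshold. The sketch below assumes that interpretation; the actual rule used by the software is not described here, and `cluster_nodes`, the tiny hand-made map, and the 0.2 threshold are hypothetical.

```python
import math

def cluster_nodes(weights, max_distance):
    """Give adjacent map nodes the same label when their weights are close.

    `weights` maps (x, y) grid positions to weight vectors. Two nodes that
    touch horizontally or vertically are merged if the Euclidean distance
    between their vectors is at most `max_distance`.
    """
    labels = {}
    next_label = 0
    for start in sorted(weights):
        if start in labels:
            continue
        next_label += 1
        labels[start] = next_label
        stack = [start]
        while stack:                       # flood fill from the start node
            x, y = stack.pop()
            for nb in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
                if (nb in weights and nb not in labels and
                        math.dist(weights[(x, y)], weights[nb]) <= max_distance):
                    labels[nb] = next_label
                    stack.append(nb)
    return labels

# A made-up 3x2 map: the two left columns are similar, the right one is not.
demo_weights = {
    (0, 0): [0.10], (0, 1): [0.10],
    (1, 0): [0.15], (1, 1): [0.15],
    (2, 0): [0.90], (2, 1): [0.90],
}
demo_labels = cluster_nodes(demo_weights, max_distance=0.2)
print(demo_labels)
```

Raising the threshold merges more squares into fewer, larger clusters; lowering it splits the map into many small ones.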
OK, so what, you ask?  Well, I can then export the members (rows of data) of those clusters along with their cluster number.  That cluster number is very similar in concept to a sleep stage, and look at what it shows us...


At the very bottom you can see the blue line that is our "home-made" Sleep Stages, in a different numbering scheme, independent of any medical professional and based on my personal data.  Comparing the cluster number to the Zeo's sleep stages, I can see it is somewhat similar to what the Zeo indicates, but I also see that my scheme identifies the character of the data better, because the Zeo is trying to conform to medical definitions where mine conforms to reality.

The blue line is actually smoothed with a simple algorithm that says if the current row of data is uncategorized (not in a cluster) then presume it to be in the just-prior cluster.  That's not perfect, but it's a pretty good assumption for a real-time continuous process like the brain.  Doing this type of post-processing helps reduce the noise in our new "Sleep Stage".
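The carry-forward rule described above is short enough to sketch in Python. Here `None` stands for an uncategorized row; that convention and the name `carry_forward` are assumptions for the example.

```python
def carry_forward(labels):
    """Replace each uncategorized row (None) with the most recent cluster."""
    filled = []
    previous = None
    for label in labels:
        if label is not None:
            previous = label
        filled.append(previous)
    return filled

print(carry_forward([None, 2, None, None, 3, None, 1]))
# [None, 2, 2, 2, 3, 3, 1]
```

Rows before the first categorized row have nothing to inherit, so they stay uncategorized.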

Next Steps
I can run this clustering "model" in real time, with the wave data coming in and being auto-assigned to my custom sleep stages.  Then, either in real time or after I wake up, I can count how much time I slept in each type of sleep.  I can use this, or the Zeo's sleep stages, or both to hopefully correlate that with how I feel the next day, and also correlate it with what I did beforehand to get such sleep.  We'll see about that in future posts.
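Counting time per custom stage is then a simple tally. A minimal sketch, assuming one cluster label per averaged row and 30 seconds per row (the low end of the averaging window mentioned earlier); `minutes_per_stage` is a hypothetical helper.

```python
from collections import Counter

def minutes_per_stage(labels, seconds_per_row=30):
    """Total minutes spent in each cluster, ignoring uncategorized rows."""
    counts = Counter(label for label in labels if label is not None)
    return {stage: n * seconds_per_row / 60 for stage, n in counts.items()}

night = [1, 1, 2, 2, 2, 2, 3, None]
print(minutes_per_stage(night))
# {1: 1.0, 2: 2.0, 3: 0.5}
```

The same tally works for the Zeo's own stages, which makes it easy to compare the two schemes night by night.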
