I'm working on it, but it is taking time to work through methodically. I have a couple of ideas, though. Maybe I'll PM you, Deagol, if I think I have cracked it, rather than spoiling things for others.
Here is what my start looks like:
I reversed the entire audio from the video, then re-reversed the voice portion (now at the left side of the image) so it plays correctly. The guitar portion sounds like 26 clusters of two or three notes (I have put boxes around them), so I am assuming these need to decode into text, hex, ASCII, or something like that, possibly giving a URL (where, I presume, I will find the next part that Deagol is working on).
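For anyone who wants to reproduce the reversal step, here is a minimal Python sketch of the idea. It assumes the audio is already loaded as a plain list of samples and that the voice sits at the end of the original clip; the function name and the `voice_len` parameter are made up for illustration, and the same thing can be done by hand in any audio editor.

```python
def reverse_then_restore_voice(samples, voice_len):
    """Reverse the whole clip, then re-reverse the voice so it plays forward.

    samples   -- list of audio samples (hypothetical: guitar first, voice last)
    voice_len -- number of samples in the voice portion (an assumption)
    """
    # Step 1: reverse everything; the voice is now at the start (left side).
    reversed_all = samples[::-1]
    # Step 2: flip only the voice portion back so it plays correctly.
    voice = reversed_all[:voice_len][::-1]
    # The guitar portion stays reversed.
    guitar = reversed_all[voice_len:]
    return voice + guitar


# Toy example: 1-4 stand in for guitar samples, 5-7 for voice samples.
clip = [1, 2, 3, 4, 5, 6, 7]
result = reverse_then_restore_voice(clip, voice_len=3)
```

In the toy example the voice samples come out in their original order at the front, followed by the guitar samples in reverse order.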
I don't think that much info spoils anything, but if I figure out more, I'll hold off posting the solution. Deagol can have the honor since it was his Q & A.
