Sunday, 4 December 2011

How to solve the GCHQ challenge



NEWS AGENCIES: Click here for press release text.

Below are three videos demonstrating how to solve the GCHQ challenge by Dr Gareth Owen at the School of Engineering, University of Greenwich, England.

Stage 1 is arguably the most difficult, followed by stage 3 and finally stage 2 as the easiest.

Stage 1



To enlarge videos, click play and then press the Youtube button at bottom of video.






Files to download:
p1-complete.asm (this one prints the decrypted data to the screen - no need to use debugger)
PNG Inspector - my code to check PNG image for comments and stenographic content

Stage 2




Files to download:
PHP VM Implementation (by me)
Explanation of VM code (by me)
Conversion of VM code to C (by me)

There isn't anything further hidden in Stage 2 - GCHQ have confirmed to me. Despite the appearance in the second decrypter (the erroneous jmp); allegedly this is a left over relic because they simplified the puzzle for fear it was too difficult.

Stage 3



GCHQ kindly wrote to me to say the fscanf bug was deliberate - so that you could override the crypt check; seems I took a short cut!

Files to download:
C representation of executable


Press release text

Please feel free to use or modify the following text in your story.

The British spy agency GCHQ recently published a puzzle on www.canyoucrackit.co.uk, just a few days later Dr Gareth Owen, an academic at the University of Greenwich in England has posted a full video explanation of the puzzle. The puzzle has three stages and is not at all simple — likely to challenge even the best computer science graduates.

The first stage is to convert the code on the screen to computer code, which turns out to be a decryption algorithm. The data to be decrypted is hidden in the image on the web site (the image of the numbers).

The second stage asks you to build a virtual computer to run a series of codes - which when run produce the link to the third stage.

The third stage gives you a program to run which requires a licence key - the problem is finding the licence key which requires decoding the program and seeing how it works. Then you have to find three hidden numbers from the first two stages and plug them in to get the web address for the final answer.

There has been some speculation that there is a fourth stage hidden in stage 2, although GCHQ have contacted Gareth and guaranteed there isn't.

Click here for solution videos

34 comments:

  1. This comment has been removed by the author.

    ReplyDelete
  2. The password that generates the hash for stage3 is:

    cyberwin

    Quite simple to find with John the Ripper given the weak DES hashing algorithm used. Hence you can construct the license.txt file:

    "gchqcyberwin" followed by the dwords from the previos stages.

    Obviously this is not really needed, but it's nice to know :)

    ReplyDelete
  3. Such a shame it's written in the wrong assembler language for me. S/370 would have been much more fun.

    I must get round to learning 586/686 assembler some time.

    Thanks for the explanation.

    ReplyDelete
  4. I'm more at home with ARM than x86 but are you sure the "sub esp, 0x100 ; 4096 bytes" is correct? Is it not gaining 256 bytes ready for the following loop to fill with [0, 255]?

    ReplyDelete
  5. Hi Gareth,

    Great explanation and videos. Are you getting your undergrads to do this exercise? I was wondering how many of them would understand it all?

    ReplyDelete
  6. Ralph you're correct - someone else spotted the mistake too and I've amended the video.

    David - I won't be giving it to the undergrads no as we don't teach low level assembly in our department but I'll no doubt show them the videos.

    ReplyDelete
  7. What an amazing presentation, thank you very much.

    ReplyDelete
  8. This was pretty epic, I would think a lot of computer science students would find this extremely challenging, however it's a good eye opening puzzle. I have to say I would of taken a long time to even clue into the fact that the 0xEF was a jump instruction but after you pointed it out it made sense.

    ReplyDelete
  9. it's all about assembly?

    i wonder what exactly your resources in order to understand all this stuffs, another meaning, what books you read to understand x86 assembly? could you recommend some?

    Thanks for the great demonstration

    ReplyDelete
  10. Well done and thanks for the brilliant explanation. However, like you say, a rather disappointing end for quite a lot of work!

    ReplyDelete
  11. Dr Owen,

    As a CIT undergrad in the US, I found this extremely challenging compared to traditional war-games such as smash the stack, over the wire, etc... Thanks for the contribution!

    ReplyDelete
  12. Nicely presented exposition of the challenge.

    The problem with the challenge itself is that it contains no real fundamental test of the would-be solver's raw logico-algorithmic thinking capacity or originality. Instead, it concentrates on a mechanical familiarity with interrelationships of various mid-to-low-level hardware and software frameworks. In that sense, it is more of an orienteering challenge than of cryptanalytic insightfulness.

    ReplyDelete
  13. Hello Gareth!

    How many more students do you think these videos have recruited? If this doesn't act as a good advert for you department ( http://www.gre.ac.uk/schools/engineering/departments/c_and_c ) what does?

    ReplyDelete
  14. All that brain power and the salary is £31k!!

    ReplyDelete
  15. Thank you for the videos. I work in IT but never finished my degree this just motivated me in to getting it done so i would be able to do alot of this sort of thing my self as i find it very intresting

    ReplyDelete
  16. Excellent videos. The skills that you've demonstrated are sadly very rare these days - I know many IT professionals who wouldn't have a clue what you did here, and (correct me if i wrong) i doubt they even teach this stuff any more at University.

    Sadly computers have become like cars - 99% of users have no idea about the inner workings, yet they drive them every day :(

    ReplyDelete
  17. That was an amazing piece of detective work!

    As I'm unfamiliar with that particular machine code, I wouldn't have recognised it as code.
    Nor would I have recognised the BASE64 encoding.

    Well done.

    ReplyDelete
  18. Thanks Gareth, Bit disappointed at the challenge, in a way. Was hoping it might offer a number of different routes, with differing resultant keywords, to make it more inter-disciplinary and to sort out the high fliers from the 99.9% of candidates who -- like me -- were "also rans". Was hoping that the exe itself would manipulate the "supposed" keyword sent in the clear ... Nada.

    ReplyDelete
  19. Etienne, there's speculation there's more to the puzzle than meets the eye :-)

    ReplyDelete
  20. Hope so, Gareth. I don't know why I hadn't even thought of writing my own VM in php or js, and not at all happy about using other folks' code. Will have a go in php and see if I can get a better understanding of what's going on. Regards ~ ET.

    ReplyDelete
  21. Great work, and a very clear and entertaining explanation of the solution. Many thanks for sharing it Gareth.

    ReplyDelete
  22. Your PHP vm does not calculate instruction length correctly for untaken jmpes -- you were lucky the code only uses short ones.

    ReplyDelete
  23. // *jmpe r1*
    // => if (fl == 0) jmp r1
    // else nop

    Also $gregs[DS] = $op2; instead of [CS].

    ReplyDelete
  24. Thanks guys - I'll try to do a fix later this evening.

    ReplyDelete
  25. Thanks folks - made those two corrections.

    ReplyDelete
  26. I've also uploaded a new C file implementing the VM code - much cleaner and clearer.

    ReplyDelete
  27. Many thanks, Gareth. I know that it's non-standard or heretical usage, but personally I find that by avoiding single line unbraced conditional statements; by verbosely spelling out all non-trivial else conditions; by placing matching braces directly under one-another (rather than at the end of the line and below); and by commenting distant end braces, it makes debugging and understanding a whole lot easier -- especially if you have really deeply nested and complex, "evolving" code.

    ReplyDelete
  28. Yes please don't take any code here as an example how to code - it was typed purely with the aim of solving the puzzle quickly. The VM code is particularly poorly written - I wrote it side by side against the spec giving no real consideration to readability or good structure - writing it quickly was the only goal.

    ReplyDelete
  29. First step's to see if GHCQ have been lazy... which they have. Checking for a sitemap via google:

    site:www.canyoucrackit.co.uk

    The text file containing the keyword comes up. Disappointing.

    ReplyDelete
  30. (But all the same, awesome solution Dr Owen)

    ReplyDelete
  31. Great job dr. gareth i enjoyed your presentation. I liked how you cut the asm as well. keep smashingthestack.

    ReplyDelete
  32. You really make it seem so easy with your presentation but I find this topic to be really something that I think I would never understand. It seems too complex and very broad for me. I'm looking forward for your next post, I’ll try to get the hang of it!
    Michigan Printed Sheet Set Full -Solid - Michigan Wolverines

    ReplyDelete
  33. I agree with all of the points about save our website of gchq challenge.Thanks for sharing this.

    Windows Thin Client & Citrix Thin Client

    ReplyDelete