Redacted generator

3/16/2023

The algorithm is pretty simple: you divide up your image into a grid of a given block size (the example above is 8x8). Then for each block, you set the redacted image’s color equal to the average color of the original for that same area. That’s it, just a pixel average for each block. The effect sort of “smears” the information of the image out across each block. But while some information is lost in the process, plenty still leaks through. And it’s this leaked information that we’ll be using to our advantage.

Notably, this algorithm is widely standardized since it’s so simple. So, no matter whether you do this redacting in GIMP, Photoshop, or basically any other tool, the redaction will turn out the same.

For our proof-of-concept, let’s assume that the attacker knows:

That the redaction is of text to begin with.

These are fairly reasonable assumptions, I would assert, since the attacker in a realistic scenario would likely have received a full report, with just one piece redacted out. In our challenge text, you can see a few words right above the pixelated text that give us this information.

The Many Problems to Beating the Redaction

The key thing we’re focusing on is that the redaction process is inherently local. In cryptographic terms, we’d say it has no diffusion. A change of one pixel somewhere in the original image ONLY impacts the redacted block it belongs to, meaning that we can (mostly) guess the image character by character. We’ll do a recursive depth-first search on each character, scoring each guess by how well it marginally matches up to the redacted text.

Basically, we guess the letter “a”, pixelate that letter, and see how well it matches up to our redacted image. Doesn’t sound so hard, right? Well, there’s still a bunch of logistical issues to overcome that might not be so obvious at first! Let’s dig into those further.

The first problem we immediately encountered is that the characters of our text don’t line up 1:1 with the blocks of the redaction. This means that a given correct guess might actually have some wrong blocks on the right-most edge. To see what I mean, check out this example:

You can see that the letters “t” and “h” share a column of blocks. So, if we try to make a guess for the letter “t”, the left-most column of blocks turns out correct, but the right-most ones are a bit wrong.

[Figure: Correct Pixels vs. …]

The reason the second column is wrong is that the letter “h” is there messing things up. If we just looked at this alone, we might conclude that the letter “t” was an incorrect first letter, since it gets almost half of the blocks totally wrong.

The first thing we tried was to avoid counting the right-most block of any guess. It’s the column that will have the most bleed-over and can have quite a bit of error. The problem with this was that, in practice, it reduced the total size of our guess by so much that we started receiving false positives. There’s always a chance that a letter will accidentally line up and produce a match by pure chance, and this chance goes way up when there are fewer blocks to consider.

So instead, what we did was to chop off the comparison block at the boundary of the letter itself. That is, we chop the comparison off at the edge of where the “t” ends:

You can see the quality of our match went way up, since we’re including less of that incorrect area on the right. The benefit to doing it this way is that the more our guess character extends into the block, the more likely the block is to be a good guess, and so we keep more of the block. So, it will automatically chop off most of the block when the guess is bad and keep most of the block when it’s good.

A specific subset of the character bleed-over problem is that whitespace tends to break a few of our assumptions on how character guessing works. Inherent to this whole problem is the assumption that when we guess a correct character, we expect the resulting pixelated version of it to mostly resemble the challenge image. However, this isn’t always true when the character we guess is whitespace. When that happens, the pixelated blocks will be completely overtaken by the next character.

Take this example, making the guess “this is ” (with a trailing space):

This is then pixelated like below, with a trailing blank column as you’d expect:

The problem is that in the solution image, there is another character after the space. It bleeds over so badly that our correct guess looks to be completely wrong!

There’s more than one way to tackle this problem. The most obvious is to never guess whitespace on its own, and instead pair it up with some other non-whitespace character. That way we can control the character that bleeds over.
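The block averaging described above, and the fact that a one-pixel change only affects the block it belongs to, can be sketched in a few lines of Python. This is a minimal illustration and not the tooling used for the proof-of-concept: the function names `pixelate` and `changed_blocks`, the grayscale list-of-lists image format, and the 16x16 test image are all assumptions made for the example.

```python
from typing import List, Set, Tuple

Image = List[List[float]]  # grayscale rows, 0.0 = black, 1.0 = white (assumed format)


def pixelate(img: Image, block: int = 8) -> Image:
    """Replace every block x block cell with the mean of its pixels."""
    height, width = len(img), len(img[0])
    out = [row[:] for row in img]
    for top in range(0, height, block):
        for left in range(0, width, block):
            # Clamp to the image bounds so partial edge blocks still work.
            rows = range(top, min(top + block, height))
            cols = range(left, min(left + block, width))
            mean = sum(img[y][x] for y in rows for x in cols) / (len(rows) * len(cols))
            for y in rows:
                for x in cols:
                    out[y][x] = mean
    return out


def changed_blocks(a: Image, b: Image, block: int = 8) -> Set[Tuple[int, int]]:
    """Return the (block_row, block_col) cells in which two images differ."""
    return {
        (y // block, x // block)
        for y, row in enumerate(a)
        for x, value in enumerate(row)
        if value != b[y][x]
    }


# A 16x16 white image is a 2x2 grid of 8x8 blocks.
original = [[1.0] * 16 for _ in range(16)]
edited = [row[:] for row in original]
edited[3][12] = 0.0  # darken a single pixel inside the top-right block

diff = changed_blocks(pixelate(original), pixelate(edited))
print(diff)  # prints {(0, 1)}: only the block containing the edited pixel changed
```

Because only one block changes, a guess can be scored one character at a time against the blocks it covers, which is what makes a character-by-character search feasible.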
