On Ruby and ꝩduЯ, or
How Scary are Trojan Source Attacks

https://www.sw.it.aoyama.ac.jp/2023/pub/RubyꝩduЯ/

Ruby Kaigi 2023, Matsumoto City, Japan, May 12, 2023

Martin J. DÜRST (please call me Martin)

duerst@it.aoyama.ac.jp, Aoyama Gakuin University


© 2023 Martin J. Dürst, Aoyama Gakuin University

Abstract

A bit over a year ago, the Trojan Source attacks (https://trojansource.codes) created quite a bit of a scare. This talk looks at what has already been done, and what can and should be done, for Ruby.

Ruby has embraced Unicode in the form of UTF-8 for source code so that identifiers as well as comments can use non-ASCII characters. This can be very convenient but also may be dangerous.

We will explain the dangers: Bidirectional attacks can use special Unicode formatting characters to regroup source text so that it looks like it does something, but actually does something else. Homoglyph attacks can use lookalike characters to confuse code reviewers. Invisible characters and special spaces can be even more difficult to detect.

Remedies include better Ruby parsing, new checks in editors, IDEs, and code management sites such as GitHub, and stronger linters such as RuboCop. We will discuss what has already been done, what still needs to be done, and how to use the various tools together.

For Best Viewing

This web page can be viewed with any browser, but has been created for projection as slides with Opera (≤12.17 Windows/Mac/Linux; use F11 to switch to projection mode). Texts in gray, like this one, are comments/notes which do not appear on the slides. Please note that depending on the browser and OS you use, some rare characters or special character combinations may not display as intended, but e.g. as empty boxes, question marks, or apart rather than composed.

 

 

Overview

 

Speaker Self-Intro

 

Contributions to Ruby

Mainly in the following areas:

 

Trojan Source: Invisible Vulnerabilities

 

Who Wrote this Paper?

Nicholas Boucher, University of Cambridge

Ross Anderson: well-known security expert at the University of Cambridge and the University of Edinburgh

Author of Security Engineering (1182 pages, 1.887 kg)

 

Trojan Source: Imagine you are ...

 

Unpredictable Programs: Example 1

Final position of a game of Go

white = 40
black = 45
whіtе = white + 6.5 # komi
score = white - black
if score>0 then puts 'white wins'
else            puts 'black wins'
end

Who wins?

 

Example 1

white = 40
black = 45
whіtе = white + 6.5 # komi
score = white - black
if score>0 then puts 'white wins'
else            puts 'black wins'
end

Please raise your hand:

White wins: Left hand
Black wins: Right hand

 

Example 2

white = 40
black = 45
white​ = white + 6.5 # komi
score = white - black
if score>0 then puts 'white wins'
else            puts 'black wins'
end

Who wins?

 

Example 2

white = 40
black = 45
white​ = white + 6.5 # komi
score = white - black
if score>0 then puts 'white wins'
else            puts 'black wins'
end

Please raise your hand:

White wins: Left hand
Black wins: Right hand

 

Example 3

50 => ‪white‬
40 => ‮‪black‬
score = ‮‪black‬ - ‪white‬
if score>0 then puts 'white wins'
else            puts 'black wins'
end

Who wins?

 

Example 3

50 => ‪white‬
40 => ‮‪black‬
score = ‮‪black‬ - ‪white‬
if score>0 then puts 'white wins'
else            puts 'black wins'
end

Please raise your hand:

White wins: Left hand
Black wins: Right hand

 

Example 4

 = 50
​  = 40
if ​- > 0 then puts 'white wins'
else           puts 'black wins'
end

Please raise your hand:

White wins: Left hand
Black wins: Right hand

 

My Guess of Your Guesses

           White wins        Black wins
Example 1  ✋✋✋✋✋✋✋✋
Example 2  ✋✋✋✋✋✋✋✋
Example 3  ✋✋✋✋✋✋✋✋
Example 4  ✋✋✋✋          ✋✋✋✋

 

The Real Results

Example 1: black wins
Example 2: black wins
Example 3: black wins
Example 4: black wins

(source)

 

Who Actually Won

           White wins        Black wins
Example 1  ✋✋✋✋✋✋✋✋  ✓ OK
Example 2  ✋✋✋✋✋✋✋✋  ✓ OK
Example 3  ✋✋✋✋✋✋✋✋  ✓ OK
Example 4  ✋✋✋✋          ✓ OK ✋✋✋✋

Don't be disappointed: It's my fault that you guessed wrong.

 

Where is the Problem in Example 1?

white = 40
black = 45
whіtе = white + 6.5 # komi
score = white - black
if score>0 then puts 'white wins'
else            puts 'black wins'
end

whіtе => wh\u0456t\u0435

 

Example 1: Details

whіtе = white + 6.5 # komi

whіtе => wh\u0456t\u0435

і: U+0456: Cyrillic Small Letter Byelorussian-Ukrainian I (Russian i: и)

е: U+0435: Cyrillic Small Letter IE

The 6.5 points get assigned to the fake whіtе, so the real white loses

This is a homoglyph attack
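
One way to make such a lookalike visible is to list the code points of an identifier and to check whether it mixes scripts. The following is only a minimal sketch, not code from the talk: the helper names `codepoint_list` and `mixed_script?` are made up for illustration, and the check only distinguishes Latin from Cyrillic.

```ruby
# Hypothetical helper (not from the talk): list each character
# of a name as a U+XXXX code point.
def codepoint_list(name)
  name.each_char.map { |c| format("U+%04X", c.ord) }
end

# Hypothetical helper: true if a name mixes Latin and Cyrillic
# letters, which is a typical sign of a homoglyph attack.
def mixed_script?(name)
  name.match?(/\p{Latin}/) && name.match?(/\p{Cyrillic}/)
end

real = "white"
fake = "wh\u0456t\u0435" # Cyrillic U+0456 and U+0435 instead of i and e

puts codepoint_list(fake).join(" ") # U+0077 U+0068 U+0456 U+0074 U+0435
puts mixed_script?(real)            # false
puts mixed_script?(fake)            # true
```

A real check would need to cover more scripts and would have to accept identifiers written entirely in one non-Latin script.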

 

Homoglyph Attacks

 

What's Going On in Example 2

white = 40
black = 45
white<U+200B> = white + 6.5 # komi
score = white - black
if score>0 then puts 'white wins'
else            puts 'black wins'
end

U+200B: ZERO WIDTH SPACE

Ruby allows all non-ASCII spaces in identifiers!

Let's call this an invisible space attack
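
A scan for such characters can be sketched in a few lines. This is an illustration under assumptions, not a tool from the talk: the helper name `suspicious_chars` is made up, and it simply reports every space separator (category Zs) other than the ASCII space and every format character (category Cf, which includes U+200B).

```ruby
# Hypothetical helper (not from the talk): report the position and
# code point of non-ASCII spaces and invisible format characters.
def suspicious_chars(source)
  found = []
  source.each_line.with_index(1) do |line, lineno|
    line.each_char.with_index(1) do |char, column|
      next if char == " " # the ordinary ASCII space is fine
      next unless char.match?(/[\p{Zs}\p{Cf}]/)
      found << [lineno, column, format("U+%04X", char.ord)]
    end
  end
  found
end

source = "white = 40\nwhite\u200B = white + 6.5 # komi\n"
suspicious_chars(source).each do |lineno, column, code|
  puts "line #{lineno}, column #{column}: #{code}"
end
# prints: line 2, column 6: U+200B
```

Because the sketch looks at the whole source text, it also flags such characters inside strings and comments, where they may be legitimate.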

 

Invisible Space Attack

 

What's Going On in Example 4

<U+3000> = 50
<U+200B> = 40
if <U+200B> - <U+3000> > 0 then puts 'white wins'
else           puts 'black wins'
end

U+200B: ZERO WIDTH SPACE

U+3000: IDEOGRAPHIC SPACE (fullwidth space, 全角スペース)

Ruby allows variable/method names consisting only of spaces!

(not really an attack)

 

Example 3: Actual Code

50 => LREwhitePDF
40 => RLOLREblackPDF
score = RLOLREblackPDF - LREwhitePDF
if score > 0 then puts 'white wins'
else              puts 'black wins'
end

RLO, LRE, PDF: Unicode bidirectional (bidi) formatting characters

Allowed as part of variable names in Ruby!

 

Example 3: Core

score = RLOLREblackPDF - LREwhitePDF

is shown as

score = white - black

This is a bidirectional reordering attack
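
The notation with abbreviations used above can be produced mechanically. The following is a minimal sketch, not code from the talk: the names `BIDI_NAMES` and `reveal_bidi` are made up, and only the three formatting characters that occur in Example 3 are handled.

```ruby
# Assumption for illustration: only the three bidi formatting
# characters from Example 3 are mapped to their abbreviations.
BIDI_NAMES = {
  "\u202A" => "LRE", # LEFT-TO-RIGHT EMBEDDING
  "\u202C" => "PDF", # POP DIRECTIONAL FORMATTING
  "\u202E" => "RLO", # RIGHT-TO-LEFT OVERRIDE
}.freeze

# Hypothetical helper: replace each invisible formatting character
# by its visible abbreviation.
def reveal_bidi(text)
  text.gsub(/[\u202A\u202C\u202E]/, BIDI_NAMES)
end

line = "score = \u202E\u202Ablack\u202C - \u202Awhite\u202C"
puts reveal_bidi(line)
# prints: score = RLOLREblackPDF - LREwhitePDF
```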

 

Example 3: Characters Involved

 

What are RLO, LRE, PDF ?

 

Bidirectional Text

 

Bidi Formatting Characters

Abbreviation  Direction     Generation  Closed by  Influence
LRM           LTR           1           -          local
RLM           RTL           1           -          local
ALM           RTL           1           -          local
LRE           LTR           1           PDF        general flow
RLE           RTL           1           PDF        general flow
LRO           LTR           1           PDF        forcing
RLO           RTL           1           PDF        forcing
PDF                         1           -
LRI           LTR           2           PDI        general flow
RLI           RTL           2           PDI        general flow
FSI           first strong  2           PDI        general flow
PDI                         2           -
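
The table can be turned into a small lookup for experiments. This is a sketch under assumptions, not code from the talk: the code points are those assigned by the Unicode Standard, the names `BIDI_CONTROLS`, `bidi_controls_in`, and `unclosed_bidi` are made up, and the pairing logic is a strong simplification of the actual Unicode Bidirectional Algorithm.

```ruby
# Code points of the bidi formatting characters in the table.
BIDI_CONTROLS = {
  LRM: 0x200E, RLM: 0x200F, ALM: 0x061C,
  LRE: 0x202A, RLE: 0x202B, LRO: 0x202D, RLO: 0x202E, PDF: 0x202C,
  LRI: 0x2066, RLI: 0x2067, FSI: 0x2068, PDI: 0x2069,
}.freeze

BIDI_PATTERN =
  Regexp.union(BIDI_CONTROLS.values.map { |cp| cp.chr("UTF-8") })

# "Closed by" column: which character closes which opener.
BIDI_CLOSERS = {
  LRE: :PDF, RLE: :PDF, LRO: :PDF, RLO: :PDF,
  LRI: :PDI, RLI: :PDI, FSI: :PDI,
}.freeze

# Hypothetical helper: abbreviations of all bidi formatting
# characters in text, in order of appearance.
def bidi_controls_in(text)
  text.scan(BIDI_PATTERN).map { |c| BIDI_CONTROLS.key(c.ord) }
end

# Hypothetical helper (simplified pairing): openers that are
# still open at the end of text.
def unclosed_bidi(text)
  stack = []
  bidi_controls_in(text).each do |name|
    if BIDI_CLOSERS.key?(name)
      stack.push(name)
    elsif stack.any? && BIDI_CLOSERS[stack.last] == name
      stack.pop
    end
  end
  stack
end

line = "40 => \u202E\u202Ablack\u202C"
p bidi_controls_in(line) # [:RLO, :LRE, :PDF]
p unclosed_bidi(line)    # [:RLO]
```

Applied to the second line of Example 3, the sketch shows that the RLO is never closed by a PDF.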

 

Example 3: The Whys

 

Bidi Reordering Attacks

 

Background: Vulnerabilities

 

Whose Fault Is It?

Finger pointing doesn't solve any problems

 

The Baby and the Bathwater

Why not simply disallow non-ASCII characters in code?

 

Why Not to (Completely) Lose Your Sleep

(over Trojan source attacks, that is)

 

Defense in Depth

 

Editors: Current Status

 

The Ideal Editor

 

Next Steps for Ruby

 

Next Steps: Implementation

 

Recommendations (Summary)

 

Acknowledgments

 

Q & A

Send questions and comments to Martin Dürst
(mailto:duerst@it.aoyama.ac.jp)
or open/contribute to a bug report or feature request



The latest version of this presentation is available at:

https://www.sw.it.aoyama.ac.jp/2023/pub/RubyꝩduЯ/