Mon, Nov 30, 2020
I recently received a rather suspicious E-Book in epub format. Like PDFs, these can also contain malicious code. I wanted to open it, but I wasn’t sure if I could trust it.
In the past, I would have deleted it straight away. But this time I wanted to actually know if there is malicious code inside this file. So, how do I find that out?
First, I figured I needed to understand what an .epub is and what it contains.
I found this W3C specification, which I of course read in every detail, because I absolutely want to implement an .epub reader with every detail.
However, I noticed two main points:
- .epub files can contain SVG, XML, HTML, and CSS.
- The specification suggests that several files are packaged into the .epub.
So I tried the old ‘rename it to .zip and open it’ trick and voilà, it worked:
$ tree
.
├── META-INF
│ └── container.xml
├── content.opf
├── cover.jpeg
├── images
│ ├── 00002.jpeg
│ ├── ...
│ └── 00025.jpeg
├── mimetype
├── page_styles.css
├── stylesheet.css
├── text
│ ├── part0000.html
│ ├── part0001.html
│ ├── ...
│ └── part0031.html
├── titlepage.xhtml
└── toc.ncx
Lots of HTML, images and CSS. When you open the .ncx and .opf files, it turns out they are XML.
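The rename step is only a convenience: since the archive opens as a ZIP, the same listing can be produced without unpacking anything. Below is a minimal sketch using Python's standard zipfile module. The function name list_epub_members is a placeholder chosen for illustration, not something from the specification.

```python
import zipfile


def list_epub_members(source):
    """Return the member names of an .epub.

    `source` may be a file path or a binary file-like object,
    because zipfile.ZipFile accepts both.
    """
    with zipfile.ZipFile(source) as archive:
        # namelist() only reads the archive directory; nothing is extracted.
        return archive.namelist()
```

Calling list_epub_members on the path of a downloaded book should give roughly the same names as the tree output above, just as a flat list.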
I had only known that .epub files could contain malicious code, but not how this code could be executed.
The specification states that scripting with JavaScript is allowed. An attacker could use it to read files from the disk or the contents of the clipboard. But: your e-book reader has to be vulnerable to this.
So we have to look for things like this:
<script>...</script>
or this:
<a href="javascript:...">Link</a>
There is also a Security Stack Exchange question, which hints that someone could also attempt an XXE.
Such an XML external entity (XXE) injection can look like this:
<!DOCTYPE foo [ <!ENTITY entity_name SYSTEM "file:///etc/passwd"> ]>
I am going to stay ignorant of how exactly these attacks work. My goal is to recognize the patterns of malicious code.
For those interested, look at the Web Security Academy from PortSwigger.
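For illustration, the three patterns shown above can be written down as regular expressions. This is a rough sketch, not a real malware scanner: the pattern set and the name find_suspicious are assumptions made for this example. It covers only the snippets shown here (inline event handlers, for instance, are not covered), and obfuscated code would slip past it easily.

```python
import re

# Hypothetical pattern set covering only the three examples shown above.
SUSPICIOUS_PATTERNS = {
    # <script> ... </script>
    "script tag": re.compile(r"<script\b", re.IGNORECASE),
    # href="javascript:..." or src="javascript:..."
    "javascript: URL": re.compile(
        r"""(?:href|src)\s*=\s*["']?\s*javascript:""", re.IGNORECASE
    ),
    # <!ENTITY name SYSTEM "..."> inside a DOCTYPE
    "external entity": re.compile(r"<!ENTITY\s+[^>]*\bSYSTEM\b", re.IGNORECASE),
}


def find_suspicious(text):
    """Return the names of all patterns that occur in the given text."""
    return [
        name
        for name, pattern in SUSPICIOUS_PATTERNS.items()
        if pattern.search(text)
    ]
```

A hit only means "take a closer look at this file"; an empty result does not prove that the file is harmless.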
I tested a few .epub files and, depending on the file, there can be 30 or so HTML files in there.
At this point, I could write some fancy regex-based automated malware scanner for .epub files. Or I could use the best neural network in the world, the one every human carries on their shoulders: my brain.
The thing is, I don’t want to manually open each of those files and close each file. I want to look at them, but fast and easy.
Luckily there is a text editor which many are obsessed with closing: VIM.
This is easily one of the best and simplest solutions. With that said, let’s add something to my .vimrc configuration file.
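" Preview mode: only active when Vim is started with VIMENV=prev
if $VIMENV == 'prev'
  " Space: go to the next file in the argument list
  noremap <Space> :n<CR>
  " Backspace: go to the previous file in the argument list
  noremap <Backspace> :N<CR>
  set noswapfile
endif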
This tells Vim that, if VIMENV is set to prev, Space runs :n to move to the next file in the argument list and Backspace runs :N to move to the previous one. Swap files are also turned off.
Note that this trick/idea is from the amazing Tomnomnom. I recommend his YouTube videos for more Vim magic and bug bounty stuff.
Great, how do I use that configuration?
$ VIMENV=prev vim file.txt
To make that even easier 😉, I add an alias to my .zshrc file:
alias vimprev="VIMENV=prev vim"
After this, I simply cd to my epub dir and execute:
vimprev $(find . -type f)
This opens every file below the current directory in Vim's argument list. I can step through them with Space and Backspace.
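Note that find . -type f also hands the JPEGs to Vim. If only the markup is of interest, the file list could be narrowed first. Here is a small sketch, assuming a hand-picked set of text-like suffixes. Both the suffix set and the name reviewable_files are assumptions for illustration and do not come from the EPUB specification.

```python
from pathlib import Path

# Assumed set of text-like suffixes worth reading; adjust to the book at hand.
TEXT_SUFFIXES = {".html", ".xhtml", ".htm", ".xml", ".opf", ".ncx", ".css", ".svg"}


def reviewable_files(root="."):
    """Return sorted paths of text-like files below root, skipping the rest."""
    return sorted(
        path
        for path in Path(root).rglob("*")
        if path.is_file() and path.suffix.lower() in TEXT_SUFFIXES
    )
```

Files without a matching suffix, such as the images or the mimetype file, are simply left out of the result.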
This took me about 5 minutes, because our brains are normally extremely fast at pattern recognition and anomaly detection. If something looks fishy, I take a second look.
You are probably wondering how to exit vim:
:qall!
Interested in trying it yourself? Or do you want to read some copyright-free books (at least in the US)? Look at Project Gutenberg for royalty-free e-books.
If you have any questions, let me know on Twitter (my DMs are open).