TruePolyglot is a polyglot file generator. A polyglot file is valid in several file formats at once: for example, the same file can be opened both as a ZIP archive and as a PDF document. The idea for this project comes from the work of [Ange Albertini](https://github.com/corkami), the [International Journal of Proof-of-Concept or Get The Fuck Out](https://www.alchemistowl.org/pocorgtfo/pocorgtfo07.pdf) and [Julia Wolf](https://www.troopers.de/wp-content/uploads/2011/04/TR11_Wolf_OMG_PDF.pdf), which explain how to build a polyglot file.\
Polyglot files are tedious to build by hand, even more so if you want to respect each file format correctly.\
That's why I decided to build a tool to generate them.\
My main motivation was the technical challenge.

## Features and versions ##

| Description | Version |
| ----------- | ------- |
| Build a polyglot file valid as PDF and ZIP format and that can be opened with 7Zip and Windows Explorer | POC |
| Add a stream object in the PDF part | POC |
| Polyglot file checked without warning with [pdftocairo](https://poppler.freedesktop.org/) | >= 1.0 |
| Polyglot file checked without warning with [caradoc](https://github.com/ANSSI-FR/caradoc) | >= 1.0 |
| Rebuild the PDF Xref Table | >= 1.0 |
| Stream object with the correct length header value | >= 1.0 |
| Add the "zippdf" format, a file without offset after the ZIP data | >= 1.1 |
| Polyglot file keeps the original PDF version | >= 1.1.1 |
| Add the "szippdf" format without offset before and after the Zip data | >= 1.2 |
| Fix /Length stream object value and the PDF offset for the szippdf format | >= 1.2.1 |
| PDF object numbers are reordered after insertion | >= 1.3 |
| Add the "pdfany" format, a valid PDF with custom payload content in the first and the last object | >= 1.5.2 |
| Add the "acrobat-compatibility" option to allow szippdf to be read with Acrobat Reader (thanks to Ange Albertini) | >= 1.5.3 |
| Add the "zipany" format, a valid ZIP with custom payload content at the start and between LFH and CD | >= 1.6 |

## Polyglot file compatibility ##

| Software | Formats | Status |
| -------- | ------- | ------ |
| Acrobat Reader | pdfzip, zippdf, szippdf, pdfany | OK |
| Sumatra PDF | pdfzip, zippdf, szippdf, pdfany | OK |
| Foxit PDF Reader | pdfzip, zippdf, szippdf, pdfany | OK |
| Edge | pdfzip, zippdf, szippdf, pdfany | OK |
| Firefox | pdfzip, zippdf, szippdf, pdfany | OK |
| 7zip | pdfzip, zippdf, zipany | OK with warning |
| 7zip | szippdf | OK |
| Windows Explorer | pdfzip, zippdf, szippdf, pdfany, zipany | OK |
| Info-ZIP (unzip) | pdfzip, zippdf, szippdf, pdfany, zipany | OK |
| Evince | pdfzip, zippdf, szippdf, pdfany | OK |
| pdftocairo -pdf | pdfzip, zippdf, szippdf, pdfany | OK |
| caradoc stats | pdfzip, pdfany | OK |
| java -jar | szippdf | OK |

## Examples ##

| First input file | Second input file | Format | Polyglot | Comment |
| ---------------- | ----------------- | ------ | -------- | ------- |
| [doc.pdf](https://truepolyglot.hackade.org/samples/pdfzip/poc1/doc.pdf) | [archive.zip](https://truepolyglot.hackade.org/samples/pdfzip/poc1/archive.zip) | pdfzip | [polyglot.pdf](https://truepolyglot.hackade.org/samples/pdfzip/poc1/polyglot.pdf) | PDF/ZIP polyglot - 122 KB |
| [orwell\_1984.pdf](https://truepolyglot.hackade.org/samples/pdfzip/poc2/orwell_1984.pdf) | [file-FILE5\_32.zip](https://truepolyglot.hackade.org/samples/pdfzip/poc2/file-FILE5_32.zip) | pdfzip | [polyglot.pdf](https://truepolyglot.hackade.org/samples/pdfzip/poc2/polyglot.pdf) | PDF/ZIP polyglot - 1.3 MB |
| [x86asm.pdf](https://truepolyglot.hackade.org/samples/pdfzip/poc3/x86asm.pdf) | [fasmw17304.zip](https://truepolyglot.hackade.org/samples/pdfzip/poc3/fasmw17304.zip) | pdfzip | [polyglot.pdf](https://truepolyglot.hackade.org/samples/pdfzip/poc3/polyglot.pdf) | PDF/ZIP polyglot - 1.8 MB |
| [doc.pdf](/samples/zippdf/poc4/doc.pdf) | [archive.zip](/samples/zippdf/poc4/archive.zip) | zippdf | [polyglot.pdf](/samples/zippdf/poc4/polyglot.pdf) | PDF/ZIP polyglot - 112 KB |
| [electronics.pdf](https://truepolyglot.hackade.org/samples/szippdf/poc5/electronics.pdf) | [hello\_world.jar](https://truepolyglot.hackade.org/samples/szippdf/poc5/hello_world.jar) | szippdf | [polyglot.pdf](https://truepolyglot.hackade.org/samples/szippdf/poc5/polyglot.pdf) | PDF/JAR polyglot - 778 KB |
| [hexinator.pdf](https://truepolyglot.hackade.org/samples/pdfzip/poc6/hexinator.pdf) | [eicar.zip](https://truepolyglot.hackade.org/samples/pdfzip/poc6/eicar.zip) ([scan virustotal.com](https://www.virustotal.com/#/file/2174e17e6b03bb398666c128e6ab0a27d4ad6f7d7922127fe828e07aa94ab79d/detection)) | pdfzip | [polyglot.pdf](https://truepolyglot.hackade.org/samples/pdfzip/poc6/polyglot.pdf) ([scan virustotal.com](https://www.virustotal.com/#/file/f6fef31e3b03164bb3bdf35af0521f9fc0c518a9e0f1aa9f8b60ac936201591a/detection)) | PDF/ZIP polyglot with the EICAR test file in the ZIP - 2.9 MB |
| [doc.pdf](https://truepolyglot.hackade.org/samples/pdfany/poc7/doc.pdf) | [page.html](https://truepolyglot.hackade.org/samples/pdfany/poc7/page.html) | pdfany | [polyglot.pdf](https://truepolyglot.hackade.org/samples/pdfany/poc7/polyglot.pdf) | PDF/HTML polyglot - 26 KB |
| [logo.zip](https://truepolyglot.hackade.org/samples/zipany/poc8/logo.zip) | [nc.exe](https://truepolyglot.hackade.org/samples/zipany/poc8/nc.exe) | zipany | [polyglot.zip](https://truepolyglot.hackade.org/samples/zipany/poc8/polyglot.zip) | ZIP/PE polyglot - 96 KB |

## Usage ##

```
usage: truepolyglot format [options] output-file

Generate a polyglot file.

Formats availables:
* pdfzip: Generate a file valid as PDF and ZIP. The format is closest to PDF.
* zippdf: Generate a file valid as ZIP and PDF. The format is closest to ZIP.
* szippdf: Generate a file valid as ZIP and PDF. The format is strictly a ZIP. Archive is modified.
* pdfany: Generate a valid PDF file with payload1 file content as the first object or/and payload2 file content as the last object.
* zipany: Generate a valid ZIP file with payload1 file content at the start of the file or/and payload2 file content between LFH and CD.

positional arguments:
  {pdfzip,zippdf,szippdf,pdfany,zipany}
                        Output polyglot format
  output_file           Output polyglot file path

optional arguments:
  -h, --help            show this help message and exit
  --pdffile PDFFILE     PDF input file
  --zipfile ZIPFILE     ZIP input file
  --payload1file PAYLOAD1FILE
                        Payload 1 input file
  --payload2file PAYLOAD2FILE
                        Payload 2 input file
  --acrobat-compatibility
                        Add a byte at the start for Acrobat Reader
                        compatibility with the szippdf format
  --verbose {none,error,info,debug}
                        Verbosity level (default: info)

TruePolyglot v1.6.1
```
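
As an illustration of the usage text above, here is a minimal Python sketch that assembles a command line from the documented options. The helper name `build_command` and the file names are hypothetical, and the sketch assumes the tool is reachable as `truepolyglot` on your `PATH`; it only builds and prints the command, it does not run it.

```python
import shlex


def build_command(fmt, output_file, pdf_file=None, zip_file=None,
                  acrobat_compatibility=False):
    """Build an argument list following 'truepolyglot format [options] output-file'.

    Hypothetical helper: only flags listed in the help text are used.
    """
    cmd = ["truepolyglot", fmt]  # assumes the executable is on PATH
    if pdf_file is not None:
        cmd += ["--pdffile", pdf_file]
    if zip_file is not None:
        cmd += ["--zipfile", zip_file]
    if acrobat_compatibility:
        # Documented as relevant to the szippdf format only.
        cmd.append("--acrobat-compatibility")
    cmd.append(output_file)
    return cmd


# Example file names are placeholders.
cmd = build_command("pdfzip", "polyglot.pdf",
                    pdf_file="doc.pdf", zip_file="archive.zip")
print(shlex.join(cmd))
```

To actually run the tool, the returned list could be passed to `subprocess.run(cmd, check=True)`.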

## Code ##

```
git clone https://git.hackade.org/truepolyglot.git/
```

or download [truepolyglot-1.6.1.tar.gz](https://git.hackade.org/truepolyglot.git/snapshot/truepolyglot-1.6.1.tar.gz)

## How to detect a polyglot file? ##

You can run [binwalk](https://github.com/ReFirmLabs/binwalk) on a file to see whether it is composed of multiple files.
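
For the PDF/ZIP case specifically, a rough check can also be sketched in a few lines of Python. This is not how binwalk works and it is not part of TruePolyglot; it is a simple heuristic that assumes a PDF header (`%PDF-`) appears within the first 1024 bytes and that a ZIP end-of-central-directory record (`PK\x05\x06`) sits near the end of the file. All function names are hypothetical.

```python
PDF_MAGIC = b"%PDF-"
ZIP_EOCD_MAGIC = b"PK\x05\x06"   # end-of-central-directory signature
EOCD_MIN_SIZE = 22               # fixed part of the EOCD record
EOCD_MAX_COMMENT = 0xFFFF        # maximum length of the trailing ZIP comment


def looks_like_pdf(data):
    # Heuristic: accept a PDF header anywhere in the first 1024 bytes.
    return PDF_MAGIC in data[:1024]


def looks_like_zip(data):
    # Search the last possible EOCD position range from the end of the file.
    tail_start = max(0, len(data) - (EOCD_MIN_SIZE + EOCD_MAX_COMMENT))
    pos = data.rfind(ZIP_EOCD_MAGIC, tail_start)
    return pos != -1 and len(data) - pos >= EOCD_MIN_SIZE


def detect_formats(data):
    """Return the list of formats whose signatures were found."""
    formats = []
    if looks_like_pdf(data):
        formats.append("pdf")
    if looks_like_zip(data):
        formats.append("zip")
    return formats


# Toy input: a PDF header followed by an empty ZIP end record.
# This is only a signature test, not a valid PDF.
sample = b"%PDF-1.7\n" + ZIP_EOCD_MAGIC + bytes(18)
print(detect_formats(sample))
```

A file that reports both formats is a candidate polyglot; signature matches alone can give false positives, so a tool such as binwalk remains the better option for real analysis.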

## Contact ##

[truepolyglot@hackade.org](mailto:truepolyglot@hackade.org)

## Credits ##

Copyright © 2018-2019 ben@hackade.org

TruePolyglot is released under the [Unlicense](https://unlicense.org/), except for the following libraries:

* [PyPDF2](https://github.com/mstamy2/PyPDF2/blob/master/LICENSE)
* [zipfile.py (cpython)](https://github.com/python/cpython/blob/master/LICENSE)

