Ambiguous zip parsing allows hiding add-on files from linter and reviewers

Bugzilla #1534483
ambiguous_addon_source.zip, source code for proof of concept

The writeup here fixes minor errors in the Bugzilla ticket.

The add-ons submission and reviewing pipeline makes use of four zip parsers, each of which implements details of the zip format differently:

Node.js yauzl, used by addons-linter
Python zipfile, used by addons-server
nsZipArchive, used by Firefox
Go archive/zip, used by autograph's /sign/file endpoint

The parsing differences make it possible to to construct an ambiguous add-on xpi that contains different files depending on what parser is in use. The effect is that you can submit an add-on to addons.mozilla.org that appears benign at review time, but becomes malicious after being signed and made available for download.

The attached file ambiguous_addon_source.zip contains ambiguous_addon.xpi, a proof of concept. Its xpi zip container is specially crafted to present different files to different parsers. It passes addon-linter with no warning or errors. The reviewer tools say the add-on contains only three harmless files. After being signed, the add-on contains additional, malicious files (in the demo they don't actually do anything bad). The "Step-by-step instructions" show how to try the proof of concept against a local installation of addons-server. The Appendix explains how it works, and the attached program make-ambiguous-addon shows how it was made.

Attack scenario

The attack requires an attacker to upload a malicious add-on and get a user to install it. The advantage afforded by the vulnerability is that the attacker can hide the add-on's malicious nature from reviewers, making it more likely to be accepted and signed.

One avenue of attack would be to create a new add-on and convince people to install it. However it would probably be more effective to take advantage of an existing user base. Suppose an attacker compromises the Firefox account of the developer of a popular add-on. Then they could make a trivial update and submit an upgrade, along with arbitrary malicious files. The add-on's existing users would be infected once they upgraded.

The effect of the attack is anything harmful that an extension can do—code execution with extension-level privileges. This could include, for example, running a cryptominer, clicking on ads, or modifying proxy settings to monitor a user's browsing; but would not include, for example, directly modifying the filesystem. In order to use privileged JavaScript APIs, the extension will have to declare permissions in its manifest, but the manifest that is seen by reviewers see may differ from the manifest that gets signed. That is, a reviewer may see "no permissions," and the add-on's page at addons.mozilla.org may say "no permissions," but on installation the end user will still see a dialog declaring all the permissions the add-on actually has. (I'm not sure how this last point interacts with automatic upgrades.)

It's likely that the attack will be detected if the signed xpi is examined, because autograph re-packs the files into a new zip container and removes all the parsing tricks of the original. The proof of concept makes the malicious nature of the re-packed add-on apparent, but an attacker could of course be more or less artful in disguising it.

The proof of concept needs the autograph /sign/file endpoint to be enabled (as opposed to /sign/data). /sign/data ultimately uses the same Python zip parser as addons-server itself (via signing_clients.apps.JarExtractor), so the files that get signed are the same as the files that get reviewed. Parsing ambiguity still exists, but offhand I couldn't think of a way to make it into as effective an attack. I understand from #9308 that /sign/file is meant to become the default endpoint.

Step-by-step instructions

You start in the role of sysadmin. Check out the source code from https://github.com/mozilla/addons-server. Before starting it, we'll make a couple of small changes:
- In call_signing in src/olympia/lib/crypto/signing.py, add an or True to the waffle.sample_is_active('activate-autograph-file-signing') condition. This is to force use of the autograph /sign/file endpoint.
- In _queue in src/olympia/reviewers/views.py, comment out the qs.filter on files.is_webextension. This is to make WebExtensions show up in the reviewer tools. The auto_approve cron job wasn't working for me, and anyway we want to see what a reviewer would see if the add-on were not auto-approved.
Start addons-server according to the install docs. This process was finicky for me. The Appendix has some suggestions of things to try if it's not working.
At the make initialize_docker step, create a superuser that you can remember for later. I used username user and email user@localhost.
Now you are an attacker in the role of add-on developer. Log in with the superuser account. (You could use a different account for each role, but for simplicity I used the superuser for everything.) I didn't know the "right" way to log in, so I did it like this:
1. Hack wrapper inside login_required in src/olympia/amo/decorators.py, and add the lines:
```
from django.contrib.auth import login
from django.db.models import Q
from olympia.users.models import UserProfile
login(request, UserProfile.objects.get(Q(email="user@localhost")))
```
2. Visit a URL that requires login, like http://olympia.test/un-US/developers/addon/validate.
3. Remove the hack from src/olympia/amo/decorators.py. (You can't just leave it in, because it will interfere with CSRF tokens.)
4. Now you are logged in.
Go to http://olympia.test/en-US/developers/ and click "Submit Your First Add-on".
Agree to the Distribution Agreement and Firefox Policies and Rules.
"How to Distribute this Version?" On this site.
Select ambiguous_addon.xpi. Notice it is linted with no errors or warnings. Click "Continue".
"Do You Need to Submit Source Code?" No.
At "Describe Add-on", notice the "good aspect" in the summary. Select any 2 categories and a license. Click "Submit Version".
Now you are a reviewer. Go to http://olympia.test/en-US/reviewers/queue/new and click the new add-on. If you don't see it, go back and check the qs.filter note from step 1.
Notice "Permissions: None". Click "Validation" and notice there are no warnings or errors.
Click "Contents" and notice that there are only 3 files:
```
manifest.json
background.js
good.html
```
Right-click on "All Platforms" and download ambiguous_add_on-1.0-fx.xpi. Go to about:debugging and "Load Temporary Add-on". Notice the "Being a good add-on" popup. Remove the temporary extension.
Click "Approve", type some text in the Comments box, and click "Save".
Now you are a user. Go to http://olympia.test/en-US/firefox/addon/ambiguous-add-on/. Notice the summary says "good aspect." Notice that no permissions are declared. Right-click "Add to Firefox" and save ambiguous_add_on-1.0-fx.xpi.

Notice the file is signed and contains files that were not visible to the reviewer and would not have passed review.

manifest.json
background.js
bad.html
bad.js
badwords.js
coinhive.js
snoop.js
META-INF/cose.manifest
META-INF/cose.sig
META-INF/manifest.mf
META-INF/mozilla.sf
META-INF/mozilla.rsa

If the add-on were signed by the real addons.mozilla.org key, you would be able to install it directly. But it's not, so you have to load it temporarily. Go to about:debugging and load it using "Load Temporary Add-on". Notice the "Being a bad add-on" popup. In the browser console (Ctrl+Shift+J), notice the messages (it's not really doing these things):
```
saying bad words...
mining cryptocoins...
installing a proxy to monitor all your traffic...
```
Go to about:addons and notice you have installed the "bad aspect."

Remediation

I recommend applying a zip normalization step at the beginning of the pipeline. Before the submitted xpi is sent to addons-linter or anything else, extract and re-pack it using any zip library (doesn't matter which one). Don't use the original submitted file for anything after the normalization step. Normalization will enforce consistent interpretation of the zip container by all steps in the pipeline, modulo other parser bugs.

Consider not allowing the optimized jar layout (with a Central Directory at offset 4) when parsing add-on files in nsZipArchive.

You could try re-linting or re-reviewing add-ons (or a random sample) after they have been signed. Maybe you already do this, I don't know.

Thanks

Thanks to @mat on #amo IRC for helping me get a local test environment running.

The WebExtension documentation on MDN is beyond amazing.

Appendix: summary of parsing differences

These are some of the differences in zip parsing logic that give rise to the vulnerability. The yauzl parser, with its strict EOCD comment length check, is the MVP: without it, exploitation of the vulnerability would be much less complicated. You would only need to put two EOCDs at the end of the file: one with a comment length greater than the number of bytes remaining (which Python will accept and Go will ignore), and another, earlier in the file, with a comment length less than or equal to the number of bytes remaining (Go will accept and Python will never reach). As it is, yauzl puts constraints on the exploit, namely that the files addons-linter sees has to be a subset of the files autograph sees, and so that subset cannot be fully malicious on its own.

EOCD search

yauzl: Search backwards for PK\x05\x06. Consider only the first match and assert that the comment length is exactly equal to the number of bytes after the EOCD.
Python zipfile: First, do a special-case check for an EOCD with a zero-length comment at the end of the file. Second, search backwards for PK\x05\x06 in the final 64×1024 bytes and take the first match, without any check on the comment length.
nsZipArchive: First, check for a Central Directory signature near the beginning of the file, at offset 4—this is the optimized jar layout from #559961/#589368 that doesn't have an EOCD. If that fails, then search backwards for PK\x05\x06 and take the first match, without any check on the comment length.
Go archive/zip: Search backwards for PK\x05\x06 in all but the final 65×1024 bytes. Take the first match whose comment length does not exceed the number of bytes after the EOCD.

Central Directory size

The EOCD contains fields specifying cd_num_entries, the number of entries in the Central Directory; and cd_size, the size in bytes of the Central Directory.

yauzl: Ignore cd_size and read exactly cd_num_entries entries.
Python zipfile: Ignore cd_num_entries and read exactly cd_size bytes.
nsZipArchive: Ignore both cd_num_entries and cd_size. Continue reading Central Directory Headers as long as the PK\x01\x02 signature is found. Expect a PK\x05\x06 EOCD signature at the end.
Go archive/zip: Ignore cd_size. Continue reading Central Directory Headers as long as the PK\x01\x02 signature is found. Once finished, assert that the number of headers found is equal to cd_num_entries modulo 65536.

Base file offset

yauzl: Take offsets relative to the start of the file.
Python zipfile: Assume that the Central Directory always immediately precedes the EOCD, and adjusts the implied start of the file to make the EOCD's specified offset to the Central Directory consistent. For example, if the file is 2000 bytes long, and the EOCD says that the Central Directory is located at offset 800 and has size 200, assume that there is garbage at the beginning of the file and the zip actually starts at absolute offset 2000−22−800−200 = 978. Look for the Central Directory at offset 1778 (i.e., immediately preceding the EOCD), not at offset 800. Similarly adjust the offsets to Local File Headers within the Central Directory.
nsZipArchive: Take offsets relative to the start of the file.
Go archive/zip: Take offsets relative to the start of the file.

Info-ZIP unzip seems to use the Python algorithm initially (assume adjacent Central Directory and EOCD), and warn about how many bytes it is ignoring. But if it doesn't find a Central Directory Header signature there, it falls back to absolute offsets.

Appendix: how the malicious add-on is constructed

The attached source code bundle has tools to show how the proof of concept xpi is parsed by yauzl, Python, and Go.

$ npm install yauzl
$ node ziplist-yauzl.js ambiguous_addon.xpi
446	manifest.json
58	background.js
227	bad.html
$ python3 ziplist-python ambiguous_addon.xpi
328	manifest.json
59	background.js
160	good.html
$ go run ziplist-go.go ambiguous_addon.xpi
446	manifest.json
58	background.js
227	bad.html
415	bad.js
99	badwords.js
131	coinhive.js
66	snoop.js
0	META-INF/manifest.mf
...65531 more copies of META-INF/manifest.mf...

The zip contains three separate Central Directories:

One for nsZipArchive, located at the beginning of the file, using the optimized jar layout that is peculiar to nsZipArchive.
One for Python, located just before the EOCD at the end of the file, where Python will always look for it no matter what offset is given in the EOCD.
One for yauzl and Go, located at the absolute offset given by the EOCD. This could be considered two overlapping Central Directories, because yauzl reads 3 entries from it and Go reads 65536+3 entries from it.

In order to make the number of Central Directory entries come out right modulo 65536, we have to pad the Go Central Directory with dummy Central Directory Headers. There are many padding Central Directory Headers, but to save space, they all point to a single Local File Header. We use the filename "META-INF/manifest.mf" for the padding files, because files with that name are stripped from the input before signing. If we used a different name, autograph would not strip the padding files and would switch to ZIP64 format in order to properly represent all 65536+ files, thereby producing a ZIP64 file that nsZipArchive cannot parse.

The nsZipArchive and Python Central Directories could potentially point to different sets of files, but we don't need to make use of that degree of flexibility.

The nsZipArchive optimized jar layout doesn't even look at the EOCD. The EOCD contains a single static value for the Central Directory offset, but it is interpreted differently by Python and yauzl/Go (see "Base file offset" above).

The "semi-malicious files" are not seen by human review and only have to avoid raising linter errors. The "fully benign files" are exposed to human review. The "fully malicious files" are only seen by the end user.

    4 bytes of padding (required for nsZipArchive layout)
   ╭nsZipArchive Central Directory starts here
   │ Central Directory Header "manifest.json" ─────────────────┐
   │ Central Directory Header "background.js" ─────────────────│┐
   │ Central Directory Header "good.html" ─────────────────────││┐
   ╰nsZipArchive Central Directory ends here                   │││
   "PK\x05\x06" (required for nsZipArchive layout)             │││
   ╭semi-malicious files start here                            │││
   │ Local File "manifest.json" <────────────────────┐         │││
   │ Local File "background.js" <────────────────────│┐        │││
   │ Local File "bad.html" <─────────────────────────││┐       │││
   ╰semi-malicious files end here                    │││       │││
   ╭fully malicious files start here                 │││       │││
   │ Local File "bad.js" <──────────────────────┐    │││       │││
   │ Local File "badwords.js" <─────────────────│┐   │││       │││
   │ Local File "coinhive.js" <─────────────────││┐  │││       │││
   │ Local File "snoop.js" <────────────────────│││┐ │││       │││
   ╰fully malicious files end here              ││││ │││       │││
   Local File "META-INF"/manifest.mf" <─────────││││─│││─────┐ │││
╔═>╭╭Go and yauzl Central Directory starts here ││││ │││     │ │││
║  ││ Central Directory Header "manifest.json" ──────┘││     │ │││
║  ││ Central Directory Header "background.js" ───────┘│     │ │││
║  ││ Central Directory Header "bad.html" ─────────────┘     │ │││
║  │╰yauzl Central Directory ends here          ││││         │ │││
║  │ Central Directory Header "bad.js" ─────────┘│││         │ │││
║  │ Central Directory Header "badwords.js" ─────┘││         │ │││
║  │ Central Directory Header "coinhive.js" ──────┘│         │ │││
║  │ Central Directory Header "snoop.js" ──────────┘         │ │││
║  │ 65532 × Central Directory Header "META-INF/manifest.mf" ┘ │││
║  ╰Go Central Directory ends here                             │││
║  ╭fully benign files start here                              │││
║  │ Local File "manifest.json" <──────────────┬───────────────┘││
║  │ Local File "background.js" <──────────────│┬───────────────┘│
║  │ Local File "good.html" <──────────────────││┬───────────────┘
║  ╰fully benign files end here                │││
╠═>╭Python Central Directory starts here       │││
║  │ Central Directory Header "manifest.json" ─┘││
║  │ Central Directory Header "background.js" ──┘│
║  │ Central Directory Header "good.html" ───────┘
║  ╰Python Central Directory ends here
╚══EOCD ambiguously pointing to two Central Directories

Appendix: local addons-server troubleshooting

The addons-server install docs didn't work for me directly. I get the impression, from conversation on IRC, that part of the problem is out-of-date prebuilt Docker containers, so YMMV. Here are some of the things I did to get it working.

After running docker-compose up -d, wait a few seconds and run it again. Often the nginx container crashes the first time and needs to be restarted.
Run docker-compose exec web bash and then chmod a+rwx /code/storage/files/temp inside the container, to prevent "permission denied" errors on upload.
Check logs/docker-celery.log for error messages. If it's hung up, you likely need to restart the worker container and run make initialize_docker again.
Run docker-compose exec worker bash and run python manage.py check. If it gives errors about missing modules, run pip install geoip2 django-extended-choices python-magic. Then exit the container and docker-compose restart worker.
- This is what to do if the initial Running all indexation tasks never finishes. logs/docker-celery.log will have an error about AppRegistryNotReady("Models aren't loaded yet."). You need to install the missing Python modules and restart worker. Then stop the hung make initialize_docker and run docker-compose exec web make populate_data.
Repeat the python manage.py check step with web in place of worker.
In src/olympia/lib/settings_base.py, change multidb.PinningReplicaRouter to multidb.PinningMasterSlaveRouter and add a new line SLAVE_DATABASES = REPLICA_DATABASES after the definition of REPLICA_DATABASES.

If you're having trouble with redirect loops, try rebuilding a fresh addons-frontend image.

addons-frontend$ docker build . -t addons/addons-frontend:latest
addons-server$ docker-compose stop addons-frontend
addons-server$ docker-compose up -d addons-frontend

For trailing-slash redirect loops, you may have to edit some of the routes. I had to add a /? to the ^(?P<type>fragment|file)/(?P<key>.*)$ route in src/olympia/files/urls.py.