David Fifield <david@bamsoftware.com>
The add-ons submission and reviewing pipeline makes use of four zip parsers, each of which implements details of the zip format differently:
The parsing differences make it possible to to construct an ambiguous add-on xpi that contains different files depending on what parser is in use. The effect is that you can submit an add-on to addons.mozilla.org that appears benign at review time, but becomes malicious after being signed and made available for download.
The attached file ambiguous_addon_source.zip contains ambiguous_addon.xpi, a proof of concept. Its xpi zip container is specially crafted to present different files to different parsers. It passes addon-linter with no warning or errors. The reviewer tools say the add-on contains only three harmless files. After being signed, the add-on contains additional, malicious files (in the demo they don't actually do anything bad). The "Step-by-step instructions" show how to try the proof of concept against a local installation of addons-server. The Appendix explains how it works, and the attached program make-ambiguous-addon shows how it was made.
The attack requires an attacker to upload a malicious add-on and get a user to install it. The advantage afforded by the vulnerability is that the attacker can hide the add-on's malicious nature from reviewers, making it more likely to be accepted and signed.
One avenue of attack would be to create a new add-on and convince people to install it. However it would probably be more effective to take advantage of an existing user base. Suppose an attacker compromises the Firefox account of the developer of a popular add-on. Then they could make a trivial update and submit an upgrade, along with arbitrary malicious files. The add-on's existing users would be infected once they upgraded.
The effect of the attack is anything harmful that an extension can do—code execution with extension-level privileges. This could include, for example, running a cryptominer, clicking on ads, or modifying proxy settings to monitor a user's browsing; but would not include, for example, directly modifying the filesystem. In order to use privileged JavaScript APIs, the extension will have to declare permissions in its manifest, but the manifest that is seen by reviewers see may differ from the manifest that gets signed. That is, a reviewer may see "no permissions," and the add-on's page at addons.mozilla.org may say "no permissions," but on installation the end user will still see a dialog declaring all the permissions the add-on actually has. (I'm not sure how this last point interacts with automatic upgrades.)
It's likely that the attack will be detected if the signed xpi is examined, because autograph re-packs the files into a new zip container and removes all the parsing tricks of the original. The proof of concept makes the malicious nature of the re-packed add-on apparent, but an attacker could of course be more or less artful in disguising it.
The proof of concept needs the autograph /sign/file endpoint to be enabled (as opposed to /sign/data). /sign/data ultimately uses the same Python zip parser as addons-server itself (via signing_clients.apps.JarExtractor
), so the files that get signed are the same as the files that get reviewed. Parsing ambiguity still exists, but offhand I couldn't think of a way to make it into as effective an attack. I understand from #9308 that /sign/file is meant to become the default endpoint.
call_signing
in src/olympia/lib/crypto/signing.py, add an or True
to the waffle.sample_is_active('activate-autograph-file-signing')
condition. This is to force use of the autograph /sign/file endpoint._queue
in src/olympia/reviewers/views.py, comment out the qs.filter
on files.is_webextension
. This is to make WebExtensions show up in the reviewer tools. The auto_approve
cron job wasn't working for me, and anyway we want to see what a reviewer would see if the add-on were not auto-approved.make initialize_docker
step, create a superuser that you can remember for later. I used username user
and email user@localhost
.wrapper
inside login_required
in src/olympia/amo/decorators.py, and add the lines:
from django.contrib.auth import login from django.db.models import Q from olympia.users.models import UserProfile login(request, UserProfile.objects.get(Q(email="user@localhost")))
qs.filter
note from step 1.manifest.json background.js good.html
manifest.json background.js bad.html bad.js badwords.js coinhive.js snoop.js META-INF/cose.manifest META-INF/cose.sig META-INF/manifest.mf META-INF/mozilla.sf META-INF/mozilla.rsa
saying bad words... mining cryptocoins... installing a proxy to monitor all your traffic...
I recommend applying a zip normalization step at the beginning of the pipeline. Before the submitted xpi is sent to addons-linter or anything else, extract and re-pack it using any zip library (doesn't matter which one). Don't use the original submitted file for anything after the normalization step. Normalization will enforce consistent interpretation of the zip container by all steps in the pipeline, modulo other parser bugs.
Consider not allowing the optimized jar layout (with a Central Directory at offset 4) when parsing add-on files in nsZipArchive.
You could try re-linting or re-reviewing add-ons (or a random sample) after they have been signed. Maybe you already do this, I don't know.
Thanks to @mat on #amo IRC for helping me get a local test environment running.
The WebExtension documentation on MDN is beyond amazing.
These are some of the differences in zip parsing logic that give rise to the vulnerability. The yauzl parser, with its strict EOCD comment length check, is the MVP: without it, exploitation of the vulnerability would be much less complicated. You would only need to put two EOCDs at the end of the file: one with a comment length greater than the number of bytes remaining (which Python will accept and Go will ignore), and another, earlier in the file, with a comment length less than or equal to the number of bytes remaining (Go will accept and Python will never reach). As it is, yauzl puts constraints on the exploit, namely that the files addons-linter sees has to be a subset of the files autograph sees, and so that subset cannot be fully malicious on its own.
Search backwards for PK\x05\x06
. Consider only the first match and assert that the comment length is exactly equal to the number of bytes after the EOCD.
First, do a special-case check for an EOCD with a zero-length comment at the end of the file. Second, search backwards for PK\x05\x06
in the final 64×1024 bytes and take the first match, without any check on the comment length.
First, check for a Central Directory signature near the beginning of the file, at offset 4—this is the optimized jar layout from #559961/#589368 that doesn't have an EOCD. If that fails, then search backwards for PK\x05\x06
and take the first match, without any check on the comment length.
Search backwards for PK\x05\x06
in all but the final 65×1024 bytes. Take the first match whose comment length does not exceed the number of bytes after the EOCD.
The EOCD contains fields specifying cd_num_entries
, the number of entries in the Central Directory; and cd_size
, the size in bytes of the Central Directory.
Ignore cd_size
and read exactly cd_num_entries
entries.
Ignore cd_num_entries
and read exactly cd_size
bytes.
Ignore both cd_num_entries
and cd_size
. Continue reading Central Directory Headers as long as the PK\x01\x02
signature is found. Expect a PK\x05\x06
EOCD signature at the end.
Ignore cd_size
. Continue reading Central Directory Headers as long as the PK\x01\x02
signature is found. Once finished, assert that the number of headers found is equal to cd_num_entries
modulo 65536.
Take offsets relative to the start of the file.
Assume that the Central Directory always immediately precedes the EOCD, and adjusts the implied start of the file to make the EOCD's specified offset to the Central Directory consistent. For example, if the file is 2000 bytes long, and the EOCD says that the Central Directory is located at offset 800 and has size 200, assume that there is garbage at the beginning of the file and the zip actually starts at absolute offset 2000−22−800−200 = 978. Look for the Central Directory at offset 1778 (i.e., immediately preceding the EOCD), not at offset 800. Similarly adjust the offsets to Local File Headers within the Central Directory.
Take offsets relative to the start of the file.
Take offsets relative to the start of the file.
Info-ZIP unzip seems to use the Python algorithm initially (assume adjacent Central Directory and EOCD), and warn about how many bytes it is ignoring. But if it doesn't find a Central Directory Header signature there, it falls back to absolute offsets.
The attached source code bundle has tools to show how the proof of concept xpi is parsed by yauzl, Python, and Go.
$ npm install yauzl $ node ziplist-yauzl.js ambiguous_addon.xpi 446 manifest.json 58 background.js 227 bad.html $ python3 ziplist-python ambiguous_addon.xpi 328 manifest.json 59 background.js 160 good.html $ go run ziplist-go.go ambiguous_addon.xpi 446 manifest.json 58 background.js 227 bad.html 415 bad.js 99 badwords.js 131 coinhive.js 66 snoop.js 0 META-INF/manifest.mf ...65531 more copies of META-INF/manifest.mf...
The zip contains three separate Central Directories:
In order to make the number of Central Directory entries come out right modulo 65536, we have to pad the Go Central Directory with dummy Central Directory Headers. There are many padding Central Directory Headers, but to save space, they all point to a single Local File Header. We use the filename "META-INF/manifest.mf" for the padding files, because files with that name are stripped from the input before signing. If we used a different name, autograph would not strip the padding files and would switch to ZIP64 format in order to properly represent all 65536+ files, thereby producing a ZIP64 file that nsZipArchive cannot parse.
The nsZipArchive and Python Central Directories could potentially point to different sets of files, but we don't need to make use of that degree of flexibility.
The nsZipArchive optimized jar layout doesn't even look at the EOCD. The EOCD contains a single static value for the Central Directory offset, but it is interpreted differently by Python and yauzl/Go (see "Base file offset" above).
The "semi-malicious files" are not seen by human review and only have to avoid raising linter errors. The "fully benign files" are exposed to human review. The "fully malicious files" are only seen by the end user.
4 bytes of padding (required for nsZipArchive layout) ╭nsZipArchive Central Directory starts here │ Central Directory Header "manifest.json" ─────────────────┐ │ Central Directory Header "background.js" ─────────────────│┐ │ Central Directory Header "good.html" ─────────────────────││┐ ╰nsZipArchive Central Directory ends here │││ "PK\x05\x06" (required for nsZipArchive layout) │││ ╭semi-malicious files start here │││ │ Local File "manifest.json" <────────────────────┐ │││ │ Local File "background.js" <────────────────────│┐ │││ │ Local File "bad.html" <─────────────────────────││┐ │││ ╰semi-malicious files end here │││ │││ ╭fully malicious files start here │││ │││ │ Local File "bad.js" <──────────────────────┐ │││ │││ │ Local File "badwords.js" <─────────────────│┐ │││ │││ │ Local File "coinhive.js" <─────────────────││┐ │││ │││ │ Local File "snoop.js" <────────────────────│││┐ │││ │││ ╰fully malicious files end here ││││ │││ │││ Local File "META-INF"/manifest.mf" <─────────││││─│││─────┐ │││ ╔═>╭╭Go and yauzl Central Directory starts here ││││ │││ │ │││ ║ ││ Central Directory Header "manifest.json" ──────┘││ │ │││ ║ ││ Central Directory Header "background.js" ───────┘│ │ │││ ║ ││ Central Directory Header "bad.html" ─────────────┘ │ │││ ║ │╰yauzl Central Directory ends here ││││ │ │││ ║ │ Central Directory Header "bad.js" ─────────┘│││ │ │││ ║ │ Central Directory Header "badwords.js" ─────┘││ │ │││ ║ │ Central Directory Header "coinhive.js" ──────┘│ │ │││ ║ │ Central Directory Header "snoop.js" ──────────┘ │ │││ ║ │ 65532 × Central Directory Header "META-INF/manifest.mf" ┘ │││ ║ ╰Go Central Directory ends here │││ ║ ╭fully benign files start here │││ ║ │ Local File "manifest.json" <──────────────┬───────────────┘││ ║ │ Local File "background.js" <──────────────│┬───────────────┘│ ║ │ Local File "good.html" <──────────────────││┬───────────────┘ ║ ╰fully benign files end here │││ ╠═>╭Python Central Directory starts here │││ ║ │ Central Directory Header "manifest.json" ─┘││ ║ │ Central Directory Header "background.js" ──┘│ ║ │ Central Directory Header "good.html" ───────┘ ║ ╰Python Central Directory ends here ╚══EOCD ambiguously pointing to two Central Directories
The addons-server install docs didn't work for me directly. I get the impression, from conversation on IRC, that part of the problem is out-of-date prebuilt Docker containers, so YMMV. Here are some of the things I did to get it working.
docker-compose up -d
, wait a few seconds and run it again. Often the nginx container crashes the first time and needs to be restarted.docker-compose exec web bash
and then chmod a+rwx /code/storage/files/temp
inside the container, to prevent "permission denied" errors on upload.make initialize_docker
again.docker-compose exec worker bash
and run python manage.py check
. If it gives errors about missing modules, run pip install geoip2 django-extended-choices python-magic
. Then exit the container and docker-compose restart worker
.
Running all indexation tasks
never finishes. logs/docker-celery.log will have an error about AppRegistryNotReady("Models aren't loaded yet.")
. You need to install the missing Python modules and restart worker
. Then stop the hung make initialize_docker
and run docker-compose exec web make populate_data
.python manage.py check
step with web
in place of worker
.multidb.PinningReplicaRouter
to multidb.PinningMasterSlaveRouter
and add a new line SLAVE_DATABASES = REPLICA_DATABASES
after the definition of REPLICA_DATABASES
.addons-frontend$ docker build . -t addons/addons-frontend:latest addons-server$ docker-compose stop addons-frontend addons-server$ docker-compose up -d addons-frontend
/?
to the ^(?P<type>fragment|file)/(?P<key>.*)$
route in src/olympia/files/urls.py.