ooni-sync

Fast downloader of OONI reports using the OONI API. It works by downloading an index of available files, and only downloading the files that are not already present locally. You can run it again and again to keep a local directory up to date with newly published reports.

How to get and build:

git clone https://www.bamsoftware.com/git/ooni-sync.git
cd ooni-sync
go build

Example usage:

ooni-sync -xz -directory reports test_name=tcp_connect

This command creates the directory reports if it doesn't exist, downloads all tcp_connect reports that are not already present in the directory, and compresses the downloaded reports with xz.

Other examples:

ooni-sync -xz -directory reports.is probe_cc=IS since=2017-01-01
ooni-sync -xz -directory reports.as25 probe_asn=AS25
ooni-sync -xz -directory reports.tor-turkey test_name=vanilla_tor probe_cc=TR
ooni-sync -xz -directory reports.web_connectivity test_name=web_connectivity since=2017-01-01 until=2017-01-02

The query is composed of name=value pairs. Possible query parameters include:

test_name=name
probe_cc=cc
probe_asn=ASnum
since=yyyy-mm-dd
until=yyyy-mm-dd

The value of test_name can be tcp_connect, web_connectivity, ndt, etc. There's a list of test names at https://ooni.torproject.org/nettest/. The query parameters order_by, order, offset, and limit are used internally by the program and will be overridden if present. More documentation on the API is available from https://measurements.ooni.torproject.org/api/.
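
As a rough illustration of how name=value arguments can become a query string, with internally managed parameters taking precedence, here is a minimal Python sketch. It is not the program's actual code (ooni-sync is written in Go); the function name build_query and the example values for limit and offset are assumptions made for illustration.

```python
from urllib.parse import urlencode

def build_query(args, internal):
    """Turn ["name=value", ...] into a URL query string.

    `internal` holds the parameters managed by the program itself
    (order_by, order, offset, limit); the values used in the usage
    note are invented for this example.
    """
    params = {}
    for arg in args:
        name, sep, value = arg.partition("=")
        if not sep or not name:
            raise ValueError("expected name=value, got %r" % arg)
        params[name] = value
    # Applied last, so they replace anything the user supplied.
    params.update(internal)
    return urlencode(params)
```

For example, build_query(["test_name=vanilla_tor", "probe_cc=TR", "limit=5"], {"limit": 1000, "offset": 0}) returns "test_name=vanilla_tor&probe_cc=TR&limit=1000&offset=0": the user-supplied limit is replaced.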

Because the program only uses filename comparison to know whether a report file has already been downloaded, it is careful not to store a file under its final filename until it has been completely downloaded. It is safe to interrupt the program or run two copies simultaneously.
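
The general technique described above (write to a temporary name, then rename once the download is complete) can be sketched in Python as follows. This illustrates the idea only and is not taken from ooni-sync's source; store_if_missing and the fetch callback are hypothetical names.

```python
import os
import tempfile

def store_if_missing(directory, filename, fetch):
    """Store a report as directory/filename unless it already exists.

    `fetch` is a hypothetical callable that writes the report's bytes
    to an open binary file object. Returns True if a file was stored.
    """
    os.makedirs(directory, exist_ok=True)
    final_path = os.path.join(directory, filename)
    if os.path.exists(final_path):
        return False  # filename comparison only; contents are not checked
    # Create the temporary file in the same directory so that the final
    # rename stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(prefix=".tmp-", dir=directory)
    try:
        with os.fdopen(fd, "wb") as f:
            fetch(f)
        # The final name appears only after the download has finished.
        os.replace(tmp_path, final_path)
    except BaseException:
        os.unlink(tmp_path)  # leave no partial file behind
        raise
    return True
```

If two copies run at once under this scheme, each writes to its own temporary file, and whichever renames last simply replaces one complete file with another.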

To make your Python scripts capable of processing xz-compressed files, call this function instead of open(filename):

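import subprocess

def open_magic(filename):
    # Decompress .xz files through an external xz process; open others directly.
    if filename.endswith(".xz"):
        # "--" keeps a filename starting with "-" from being read as an option.
        # universal_newlines=True makes the pipe text-mode, like open(filename).
        p = subprocess.Popen(["xz", "-d", "-c", "--", filename],
                             stdout=subprocess.PIPE, universal_newlines=True)
        return p.stdout
    else:
        return open(filename)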
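
If an external xz binary is not available, Python 3's standard lzma module can read the same files. The following is an alternative sketch, not part of ooni-sync; the name open_magic_lzma and the choice of UTF-8 as the text encoding are assumptions.

```python
import lzma

def open_magic_lzma(filename):
    """Open a possibly xz-compressed file for reading as text."""
    if filename.endswith(".xz"):
        # "rt" decompresses on the fly and decodes to str.
        return lzma.open(filename, "rt", encoding="utf-8")
    return open(filename, encoding="utf-8")
```

Both branches return a text-mode file object, so calling code can iterate over lines the same way in either case.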

ooni-sync is in the public domain.

Original mailing list announcement: [ooni-dev] Turbo OONI report downloader (2017-02-22)