My blog
[[!tag firefox mozilla testing machine-learning book-club]]
For our book club this time, Daniel, Mark and I were joined by Vince to discuss an article by Andrew Halberstadt and Marco Castelluccio about using machine learning to test Firefox more efficiently. A short summary follows. I should note that part of the article dives into maths that is beyond my understanding, but we tried to understand what Mozilla is trying to achieve, even if we glossed over the details of how.
Mozilla has roughly 85 thousand tests for Firefox, which is built for about 90 configurations (combinations of target operating system, build flags, and so on), and they get around 300 pushes to version control per day. That's about 2.3 billion individual tests to run every day.
Running that many tests takes a lot of hardware resources, and so it has a real cost. Beyond the financial aspect, running all tests for every configuration for every push takes time and causes friction in the development process: other developers have to wait for your push to finish testing.
The goal of the work reported in the article is to reduce the number of tests that are run, without sacrificing quality, using machine-learning technologies. Mozilla already prioritises some configurations over others, running a smaller set of tests for some, or running tests less frequently. They also integrate changes on a dedicated integration branch and merge from that into the main branch manually, relying on dedicated people, "code sheriffs", to make informed decisions.
The machine-learning approach is based on the realisation that patterns can be deduced from historical data: if this module changes, then if any tests fail, it's probably this set of tests. Thus, when a new change comes in, analysing what has changed can inform which tests to run, and in which order to run them. For CI, if the automated tests are going to fail, it's better that they fail fast. Not only do developers get feedback sooner, and can start fixing things earlier, but the remaining tests may not need to be run at all.
The book club group had a long discussion about the article. Overall, we found Mozilla's approach to testing the Firefox browser impressive. We certainly understand the motivation to reduce the number of tests run without compromising on quality. It seems to be an issue that many large software projects have to face: one more thing to balance among various conflicting needs and requirements.
Large code bases are different from small ones. The sheer scale of things brings problems rarely seen in smaller code bases, and exposes more problems in compilers, operating systems, and even hardware, than most small projects do. Large projects also tend to have more flaky tests (tests that sometimes fail for no obvious reason, but sometimes pass). Large projects may also have to make different kinds of careful compromises when tests fail, in order to maintain forward momentum.
We each had stories about how machine learning can fail amusingly, but it seems Mozilla is doing things in ways that should avoid most of the inexplicable failures. We had some thoughts about how Mozilla might use their historical data and machine learning even more. For example, perhaps ML could identify flaky tests automatically: tests that fail unusually often, even when nothing seemingly related to them has changed? Or perhaps it could identify tests that are becoming flaky?
Maybe ML could identify parts of code that could do with quality improvement: specific code modules that result in test failures unusually often? Or identify hidden, undeclared coupling between parts of the code: if this code module is changed, then tests for this other code module fail unusually often?
Overall, we liked the article. It reports on work in progress, and we look forward to learning more.
[[!tag fling network bandwidth]]
I was curious about how fast I can transfer data in various contexts. I did some informal benchmarks: these are not scientific, and your results might be different.
The data transferred is a one terabyte sparse file:
$ truncate --size 1T terabyte
First, plain copying of files on a local disk: from the file to /dev/null.
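The commands were roughly of this form (the dd block size varied as shown in the table below):
$ cat terabyte > /dev/null
$ dd if=terabyte of=/dev/null
$ dd if=terabyte of=/dev/null bs=1M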
software | sender | receiver | time (s) | speed (MB/s) |
---|---|---|---|---|
cat | exolobe1 | exolobe1 | 274 | 3826 |
dd | exolobe1 | exolobe1 | 2729 | 384 |
dd bs=1M | exolobe1 | exolobe1 | 304 | 3449 |
dd bs=16M | exolobe1 | exolobe1 | 314 | 3338 |
dd bs=1G | exolobe1 | exolobe1 | 328 | 3196 |
Conclusion: cat is quite good, and setting almost any block size on dd is a win over the default. (dd has other options that cat lacks, such as oflag=direct, which means it's still interesting to me, despite being a little slower than cat.)
Then network transfers. I use the fling software, which is quite efficient when security is not needed. It does not handle sparse files specially.
The source of the transfer is my laptop (exolobe1); the target is the same machine over localhost, a VM under libvirt on the same laptop, a VM under qemu-system on the same laptop, my server (exolobe2), or a VM on the server (holywood2).
$ fling -p terabyte receiver 8888    # on the sending host
$ fling -r 8888 > /dev/null          # on the receiving host
For extra kicks, I also ran transfers between nested VMs inside a VM on the laptop, where the outer VM runs under either libvirt or qemu-system. The nested VMs always run under libvirt.
software | sender | receiver | time (s) | speed (MB/s) |
---|---|---|---|---|
fling | exolobe1 | exolobe1 | 395 | 2655 |
fling | exolobe1 | libvirt guest | 613 | 1710 |
fling | exolobe1 | qemu-system-x86_64 | 4653 | 225 |
fling | exolobe1 | exolobe2 | too long | 111 |
fling | exolobe1 | holywood2 | too long | 78 |
fling | nested libvirt-guest | other nested guest | too long | 280 |
fling | nested qemu-system | other such | too long | 12 |
The "too long" results is because I got impatient. The speeds in those cases are from fling -p output.
Conclusion: localhost is fast. libvirt networking inside the same physical machine is fast. Every other case is slow.
I hope that's useful to someone.
[[!tag book-club git]]
We had our third book club meeting yesterday. Daniel posted his summary of our discussion.
[[!tag rant]]
I don't like the "infinite scroll" or "never-ending stream" type of communication application. I prefer to have a clear "inbox", where messages go, and from where I can remove them when I'm done with them. The reason I like inboxes is that they make it easier to keep track of things and harder to miss or forget about things. With the stream, I have to capture the message into my GTD system or I'll miss it. Obviously I can do that, and I do, but it's less convenient.
(Note that for the purposes of this rant, whether all your incoming messages go into one inbox or are automatically filtered into many is irrelevant.)
Another reason is that in the streaming model it's harder to look at only the messages that have arrived since I last looked, say overnight while I slept, or while I was busy doing other things. This makes it even easier to accidentally miss important things.
Examples of inboxes: email, SMS.
Examples of streams: Twitter, Mastodon, IRC, Matrix, Telegram, RSS feeds.
You may notice that all "modern" applications tend to be streams. In this, too, I feel like an old man shouting at clouds.
[[!tag email idea]]
I am tired of the existing Internet email system: as a sender of email, as a recipient, and as an operator of an email server.
I've been thinking about ways to solve this and have written an essay about it, though I have no plans to work on implementing a solution; I'm mainly interested in inspiring more discussion. See the HTML and PDF versions. It's a bit long, and so far addresses only the spam problem.
TL;DR: I don't think the current email system can be improved enough to be worthwhile, but a new system could use digital signatures and digital stamps (tokens) so that recipients only accept mail from specific senders.
[[!tag book-club email encryption]]
Three friends sat down and discussed a book, er, blog post that they'd read: Daniel, Mark, and myself. We live in different countries, so we did this over video conferencing. This is a summary of the discussion.
The article: https://latacora.singles/2020/02/19/stop-using-encrypted.html published in February this year. The title is "Stop Using Encrypted Email". It's not long.
To start with, we all agree that using encryption with the current Internet email system is far from ideal. The blog post correctly points out problems:
- email metadata (headers, routing) is public, even on encrypted messages
- it's easy to reply to an encrypted email in cleartext
- PGP is far from ideal
- PGP users tend to have long-lived encryption keys, and if and when a key is broken or leaks, the security of every message encrypted to it is at risk
- personal email archives can leak an encrypted message long after it was sent
However, we think the blog post argues too strongly that encrypted email is pointless.
Most importantly, they claim that encrypted email should not be used by anyone, ever, for anything. We find this too strong, if understandable. They don't describe an actual threat model, though they give some examples; they seem to concentrate mostly on a threat where a very powerful adversary with pervasive surveillance capabilities is trying to catch individuals in order to punish them, possibly kill them, possibly long after the communication happens. That is certainly a threat model where current encrypted email fails.
However, we claim there are situations where encrypted email works well enough. For example, password reset emails that are encrypted to the PGP public key registered with the service: the value of the email disappears minutes after it's sent.
Or emails preparing a surprise party for someone's spouse. If the messages leak, it's a bummer, but it's not a big problem, especially after the party is over.
Thus we feel that rather than telling people to not use encrypted email at all, for anything, ever, a more sensible and useful approach is to discuss the risks and give people tools to decide for themselves. Accurate information is more valuable than overblown rhetoric, whether it's for or against email encryption.
We agree that the secure messaging systems they promote are good, but we don't agree that they're as good as the article implies. Signal, for example, routes all traffic through its own servers. A very powerful adversary with pervasive surveillance capabilities can deduce much from traffic patterns. This has already been used against Tor users (see for example 1 and 2).
We're also not entirely happy with messaging systems that require the use of phone numbers. Signal is one of these. Signal is also problematic when changing phones or phone numbers, as all trust relationships within it have to be re-established.
Messaging systems are also meant for use cases that aren't all the same as email's: offline use and long-form messages, for example. We see messaging systems and email as complementary rather than competing.
We also do not agree that improving email security is as hopeless as the blog post claims. Much could be done just by improving email client software. That said, we repeat that we agree that it's not going to be good enough against their implied threat model.
For example, email clients and servers could refuse to send or accept email except over encrypted, verified channels, or could refuse emails that are themselves unencrypted. This wouldn't help, say, gmail users, but we would not expect people with the blog post's implied threat model to use gmail. Or email at all.
In summary, we do think the email system could be improved. We just don't think it and its encryption are as useless as the blog post claims, and we don't think the blog post is making things better.
[[!tag announcement contractor security programming]]
TL;DR: I wrote a little program to build and test software in a pair of nested virtual machines, to reduce the risk of bugs or malware in dependencies doing bad things. It's called the Contractor and it's just barely usable now. Feedback welcome.
Software development is a security risk.
Building software from source code and running it is a core activity of software development. Software developers do it on the machine they use for other things too. The process is roughly as follows:
- install any dependencies
- build the software
- run the software, perhaps as part of unit testing
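As a concrete, if simplified, illustration, the steps for a typical project might look like the following; the project URL and the libfoo-dev package are made up:
$ git clone https://git.example.com/someproject.git
$ cd someproject
$ sudo apt install build-essential libfoo-dev   # install dependencies
$ make                                          # build the software
$ make check                                    # run the unit tests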
When the software is run, even if only a small unit of it, it can do anything that the person running the build can do, in principle:
- delete files
- modify files
- log into remote hosts using SSH
- decrypt or sign files with PGP
- send email
- delete email
- commit to version control repositories
- do anything with the browser that the person could do
- run things as sudo
- in general, cause mayhem and chaos
Normally, a software developer can assume that the code they wrote themselves doesn't do any of that. They can even assume that people they work with don't do any of that. In both cases, they may be wrong: mistakes happen. It's a well-guarded secret among programmers that they sometimes, even if rarely, make catastrophic mistakes.
Accidents aside, mayhem and chaos may be intentional. Your own project may not have malware, and you may have vetted all your dependencies, and you trust them. But your dependencies have dependencies, which have further dependencies, which have dependencies of their own. You'd need to vet the whole dependency tree. Even decades ago, in the 1990s, this could easily be hundreds of thousands of lines of code, and modern systems are much larger. Note that build tools are themselves dependencies, as is the whole operating system. Any code that is used in the build process is a dependency.
How certain are you that you can spot malicious code that's intentionally hidden and obfuscated?
Are you prepared to vet any changes to any transitive dependencies?
Does this really matter? Maybe it doesn't. If you can't ever do anything on your computer that would affect you or anyone else in a negative way, it probably doesn't matter. Most software developers are not in that position.
This risk affects every operating system and every programming language. The degree to which it exists varies a lot. Some programming language ecosystems seem more vulnerable than others: the nodejs/npm one, for example, values tiny and highly focused packages, which leads to immense dependency trees. The more direct or indirect dependencies there are, the higher the chance that one of them turns out to be bad.
The risk also exists for more traditional languages, such as C. Few C programs have no dependencies. They all need a C compiler, which in turn requires an operating system, at least.
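To get a rough sense of the scale, one can count transitive dependencies; these commands are only illustrative, and the exact numbers will vary by system and project:
$ apt-cache depends --recurse gcc | grep -c 'Depends:'   # Debian: packages pulled in by the C compiler
$ npm ls --all | wc -l                                   # nodejs project: lines in the full dependency tree (npm 7 or newer)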
The risk is there for both free software systems and non-free ones. As an example, the Debian system is entirely free software, but it's huge: the Debian 10 (buster) release has tens of thousands of software packages, maintained by thousands of people. While it's probable that none of those packages contains actual malware, it's not certain. Even if everyone who helps maintain Debian is completely trustworthy, the amount of software in it is much too large for all code to be comprehensively reviewed. Also, no amount of review will catch all bugs.
This is true for all operating systems that are not mere toys.
The conclusion here is that to build software securely, we can't assume that all the code involved in the build is trustworthy. We need something more secure. The Contractor aims to be a possible solution.
See the README for instructions how to try it. See the subplot document for more about the architecture and so on for how it works.
The Contractor has only just reached a state where it can build and test some of my other projects. It's ugly, buggy, and awkward, but I expect to have much fun using and improving it in the future. Maybe you'd like to join the adventure?
[[!tag godwin wikipedia empowering]]
Mike Godwin in an essay on slate.com:
That’s the biggest thing I learned at the Wikimedia Foundation: When ordinary people are empowered to come together and work on a common, humanity-benefiting project like Wikipedia, unexpectedly great and positive things can happen. Wikipedia is not the anomaly my journalist friend thinks it is. Instead, it’s a promise of the good works that ordinary people freed by the internet can create. I no longer argue primarily that the explosion of freedom of expression and diverse voices, facilitated by the internet, is simply a burden we dutifully have to bear. Now, more than I ever did 30 years ago, I argue that it’s the solution.
I thought that was well said.
[[!tag email]]
I asked a couple of weeks ago what people like or hate about email. Here's a summary of the responses. I admit the summary may be tainted by my current thinking about re-inventing email.
Like
It's not real time. Sender and recipient do not need to be participating in the communication at the same time. The sender can take their time to craft their message, and the recipient can take their time to ponder the message and how to respond.
It's established, ubiquitous.
It's de-centralized.
It's built on top of well-known data formats and protocols, and data can be stored locally under user control, and is highly portable. There is a variety of client software to choose from.
Separate discussions are kept separate.
Formatting, attachments, and length are flexible.
Mailing lists can be archived publicly.
One can have many accounts, and people comprehend this.
Subject lines.
Email providers are neutral, commodity entities. Choosing one doesn't imply membership in a community.
Not like
Unreliable for communication, often due to bad anti-spam.
People sending one-line replies that don't add actual value or that miss the point entirely.
Encryption, security, privacy, rich media content, formatted messages, etc, are all built on top of older protocols, often resulting in unfortunate consequences.
Top quoting.
De-facto oligopoly.
Spam.
Abuse.
Configuring and administering email servers is complex.
Filtering and organising email is often difficult. The tools provided are not always well suited for the task.
Threading is unreliable.
Email addresses are too tightly tied to your identity.
Searching is often inadequate.
[[!tag ]]
A friend expressed interest in how I keep my journal, so I set up a demo site. In short:
- markdown files (mostly) in git
- ikiwiki renders to HTML (locally or via CI to a website)
- ikiwiki's inline directive helps collect journal entries (see the example at the end of this post)
- tags and topic pages help collect related entries
- I have a "topic" for each person
- jt helps make entries and maintain the journal (not necessary, but relieves some tedium)
- source: http://git.liw.fi/demo-journal
- rendered: http://demo-journal.liw.fi/
- instructions: http://git.liw.fi/demo-journal/tree/README.md
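For the curious, the collecting is done with ikiwiki's inline directive. A journal front page can pull in entries with something like the following; the PageSpec here is only an example, and the demo site's actual setup may differ:
[[!inline pages="journal/* and !*/Discussion" show="10"]]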