Date: 2023-01-22

This week, we discuss small mistakes and what it takes to take down a large system, fairness in multi-tenant systems, and the best GPUs to get for machine learning at home.

Articles

When was the last time the FAA suffered a catastrophic outage? You need to understand how the system actually works in order to make sense of how a large-scale failure can happen. A small mistake with a file transfer is a hopelessly incomplete explanation for how the FAA system actually failed.

Crane: Uber’s Next-Gen Infrastructure Stack. Very informative write-up from Uber.

Fairness in multi-tenant systems. Interesting article on how Amazon ensures fairness in multi-tenant environments.

