A new major version of Flysystem

Written by Frank de Jonge / Original link on Dec. 21, 2020

For those who missed it, a new major version of Flysystem was released on the 24th of November. A new major version allows you to break with the past for the sake of the future, which is exactly what I've done.

For the second version of Flysystem I went back to the drawing board. Many of the library's core design elements have been brushed up and improved. The API is reduced while keeping the same functionality. Error handling is now purely exception based, and directory listings are now backed by generators. Although there are many changes, this version of Flysystem is true to its roots. Let's dive in to find out more!

Exceptions for failures

In V1, error cases were modelled closely after PHP's own filesystem functions, which use false to indicate an unsuccessful operation. While this worked, it added complexity to Flysystem and to the code consuming it, and at times it lost information about what went wrong. For unexpected errors, exceptions were still used. This design created two different paths for dealing with errors, which led to code like this:

try {
  $success = $filesystem->write($path, $contents);

  if ($success === false) {
    // handle error
  }
} catch (Throwable $exception) {
  // handle an exception
}

While this is not a huge problem, when you depend on Flysystem this code is duplicated at every integration point. The library internals suffered from the same kind of duplication, which caused unwanted complexity.

For V2, all errors are represented as exceptions. Every filesystem operation has a corresponding exception class. Each exception class explains what the originating operation was and what went wrong. If an underlying SDK or client library throws an exception, Flysystem will make sure to wrap it in a Flysystem-specific exception, keeping the stack trace intact. This allows you to handle exceptions in a uniform way, while retaining all the information you need to debug any issues.

try {
  $filesystem->write($path, $contents);
} catch (UnableToWriteFile $exception) {
  // handle the error
}
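
To illustrate how wrapping can keep the original failure available, here is a minimal, self-contained sketch. It does not use Flysystem itself: UnableToWriteExample, writeExample and the failing client are hypothetical names made up for this example, and the real library's classes may differ in detail. The only mechanism relied upon is PHP's built-in "previous exception" argument.

```php
<?php

declare(strict_types=1);

// Hypothetical stand-in for a Flysystem-style operation exception.
final class UnableToWriteExample extends RuntimeException
{
    public static function atLocation(string $location, Throwable $previous): self
    {
        // Passing $previous keeps the original exception (and its stack trace) reachable.
        return new self("Unable to write file at location: {$location}.", 0, $previous);
    }
}

// Hypothetical adapter method: any client failure is wrapped in one exception type.
function writeExample(string $path, callable $client): void
{
    try {
        $client($path);
    } catch (Throwable $exception) {
        throw UnableToWriteExample::atLocation($path, $exception);
    }
}

// Simulates an SDK that throws its own, unrelated exception type.
$failingClient = function (string $path): void {
    throw new LogicException('SDK says: access denied');
};

$caught = null;

try {
    writeExample('reports/q4.csv', $failingClient);
} catch (UnableToWriteExample $exception) {
    $caught = $exception;
}
```

The consuming code only needs to know about one exception type per operation, while getPrevious() still leads back to whatever the underlying client threw.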

Deterministic filesystem operations

In V1, there were two ways to write a file. The write and writeStream functions allowed you to write new files, while update and updateStream were used to update existing ones. These methods performed a file existence check to guard against overwriting files or updating a non-existing file. Although this seemed nice at the time, it became a big frustration for me, as it resulted in a lot of conditionals in the consuming code. It also caused unnecessary checks, often resulting in expensive calls over the network.

if ($filesystem->has($file)) {
  $filesystem->update($file, $contents);
} else {
  $filesystem->write($file, $contents);
}

If AWS S3 was used in the example above, this block of code would always cause 3 HTTP requests.

For V2, all writes and deletes are deterministic. This means writing a file will always result in the file with the provided content being written. For deletes, it means the delete operation is successful even if the file didn't exist; the outcome is always "there is now no longer a file". This eliminates a lot of chances of race conditions caused by the library itself and makes consuming code simpler. This behaviour eliminated the need to expose an update function, so update and updateStream are now removed.

$filesystem->write($path, $contents);

// if you REALLY need to check if a file exists
$fileExists = $filesystem->fileExists($pathToFile);

In comparison with V1, only 1 HTTP call would be needed to write or update a file on S3.
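
The semantics described above can be sketched with a tiny in-memory store. This is not Flysystem code: InMemoryStoreExample is a hypothetical class written only to show what "deterministic" means here, namely that write never checks for existence first and delete succeeds whether or not the file was there.

```php
<?php

declare(strict_types=1);

// Hypothetical in-memory store that mimics the write/delete semantics described above.
final class InMemoryStoreExample
{
    /** @var array<string, string> */
    private array $files = [];

    public function write(string $path, string $contents): void
    {
        // No existence check: creating and overwriting are the same operation.
        $this->files[$path] = $contents;
    }

    public function delete(string $path): void
    {
        // unset() on a missing key is a no-op, so deleting twice is not an error.
        unset($this->files[$path]);
    }

    public function fileExists(string $path): bool
    {
        return array_key_exists($path, $this->files);
    }

    public function read(string $path): string
    {
        if (!$this->fileExists($path)) {
            throw new RuntimeException("Unable to read file at location: {$path}.");
        }

        return $this->files[$path];
    }
}

$store = new InMemoryStoreExample();
$store->write('notes.txt', 'first');
$store->write('notes.txt', 'second'); // overwrites, no update() needed
$afterSecondWrite = $store->read('notes.txt');

$store->delete('notes.txt');
$store->delete('notes.txt'); // still succeeds; the file is simply not there
```

Because neither operation depends on the previous state, the caller never has to branch on whether the file exists.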

Content listings with generators

In V1, directory listing responses were represented as an array of associative arrays with relatively well standardised properties/keys. This solution was very pragmatic but had its limitations. For one, consuming code needed to know which keys were available on the arrays and there was no IDE hinting. Additionally, the entire listing was fully fetched before returning it. For large directory listings, this could (and often did) result in processes running out of memory.

For V2, directory listings are backed by generators. This allows for a much more memory efficient way of listing directory contents. In addition, the Filesystem returns a DirectoryListing object, which adds some convenience methods such as filter and map.

Let's compare a case where we filter and map a directory listing. Here is the V1 usage:

$listing = $filesystem->listContents('path/to/dir');

$files = array_filter(
  $listing,
  fn ($i) => $i['type'] === 'file',
);

$paths = array_map(
  fn ($i) => $i['path'],
  $files,
);

And this is what it looks like in V2:

$paths = $filesystem->listContents('path/to/dir')
  ->filter(fn (StorageAttributes $i) => $i->isFile())
  ->map(fn (StorageAttributes $i) => $i->path())
  ->toArray();
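
To show why a generator-backed listing is lighter on memory, here is a simplified, self-contained sketch. LazyListingExample is a hypothetical class and not Flysystem's actual DirectoryListing; it only assumes that filter and map can be built on top of generators, so that no entry is fetched until the result is iterated.

```php
<?php

declare(strict_types=1);

// Hypothetical, simplified listing class; not Flysystem's actual DirectoryListing.
final class LazyListingExample implements IteratorAggregate
{
    private iterable $items;

    public function __construct(iterable $items)
    {
        $this->items = $items;
    }

    public function filter(callable $keep): self
    {
        // The generator body does not run until something iterates over it.
        $generator = (static function (iterable $items) use ($keep): Generator {
            foreach ($items as $item) {
                if ($keep($item)) {
                    yield $item;
                }
            }
        })($this->items);

        return new self($generator);
    }

    public function map(callable $transform): self
    {
        $generator = (static function (iterable $items) use ($transform): Generator {
            foreach ($items as $item) {
                yield $transform($item);
            }
        })($this->items);

        return new self($generator);
    }

    public function getIterator(): Traversable
    {
        yield from $this->items;
    }

    public function toArray(): array
    {
        // Ignore keys so the result is a plain list.
        return iterator_to_array($this, false);
    }
}

$fetched = 0;

// Stand-in for an adapter that fetches entries one at a time.
$source = (function () use (&$fetched): Generator {
    foreach (['a.txt', 'dir', 'b.txt'] as $name) {
        $fetched++;
        yield ['type' => $name === 'dir' ? 'dir' : 'file', 'path' => $name];
    }
})();

$listing = (new LazyListingExample($source))
    ->filter(fn (array $item): bool => $item['type'] === 'file')
    ->map(fn (array $item): string => $item['path']);

$fetchedBeforeIteration = $fetched; // nothing has been consumed yet
$paths = $listing->toArray();       // entries are pulled through the chain here
```

Entries flow through the filter and map steps one at a time, so only the final array of paths is held in memory rather than the full listing.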

Go have a look at V2!

I believe Flysystem is ready for the upcoming years of filesystem abstraction needs, and I hope you are excited to start using it. Check out the documentation for more information.
