Modify the Reader classes to support getting files from somewhere other than a local disk

The classes in `reader` take a path to a file on disk, read that file and then parse the contents. For example:

```java
public final class KeyValueReader {
  /**
   * Generic method to read key value pairs from the bagit files, like bagit.txt or bag-info.txt
   * 
   * @param file the file to read
   * @param splitRegex how to split the key from the value
   * @param charset the encoding of the file
   * 
   * @return a list of key value pairs
   */
  public static List<SimpleImmutableEntry<String, String>> readKeyValuesFromFile(final Path file, final String splitRegex, final Charset charset) throws IOException, InvalidBagMetadataException{
    final List<SimpleImmutableEntry<String, String>> keyValues = new ArrayList<>();
    
    try(final BufferedReader reader = Files.newBufferedReader(file, charset)){
       ...
    }

    return keyValues;
  }
}
```

For the Wellcome storage service (https://github.com/wellcometrust/storage-service), we aren’t keeping bags on the local disk, but in S3. If we want to read a file, we make a GetObject call to the S3 SDK, which returns an `InputStream`.

We could download the bag files to disk and read them from there, but that seems a bit icky – would you be open to some pull requests that allow parsing files even if they aren’t local files? Something like:

```java
public final class KeyValueReader {
  public static List<SimpleImmutableEntry<String, String>> readKeyValuesFromReader(
    final BufferedReader reader,
    final String splitRegex) throws IOException, InvalidBagMetadataException{
    final List<SimpleImmutableEntry<String, String>> keyValues = new ArrayList<>();

    ...    

    return keyValues;
  }

  public static List<SimpleImmutableEntry<String, String>> readKeyValuesFromFile(
    final Path file,
    final String splitRegex,
    final Charset charset) throws IOException, InvalidBagMetadataException{
    try(final BufferedReader reader = Files.newBufferedReader(file, charset)){
       return readKeyValuesFromReader(reader, splitRegex);
    }
  }
}
```

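So the existing API is preserved: `readKeyValuesFromFile` still opens the file, then delegates to the new method, which takes any `BufferedReader`. The charset is decided by whoever constructs the reader, so the new method doesn't need it as a parameter. We can then call the new method directly rather than round-tripping to the filesystem first.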
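To make the calling side concrete, here is a rough sketch of how we might feed an `InputStream` into a reader-based method. Everything in it is an assumption for illustration: `StreamReadSketch` and `readKeyValuesFromStream` are hypothetical names, the parsing loop is a simplified stand-in rather than the library's real logic, and a `ByteArrayInputStream` stands in for the stream returned by S3.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.ArrayList;
import java.util.List;

// Hypothetical class, not part of the library.
public final class StreamReadSketch {

  // Simplified stand-in for the proposed readKeyValuesFromReader:
  // splits each line once on splitRegex and skips lines that do not split.
  // The real parser's behaviour is not reproduced here.
  public static List<SimpleImmutableEntry<String, String>> readKeyValuesFromReader(
      final BufferedReader reader, final String splitRegex) throws IOException {
    final List<SimpleImmutableEntry<String, String>> keyValues = new ArrayList<>();
    String line;
    while ((line = reader.readLine()) != null) {
      final String[] parts = line.split(splitRegex, 2);
      if (parts.length == 2) {
        keyValues.add(new SimpleImmutableEntry<>(parts[0].trim(), parts[1].trim()));
      }
    }
    return keyValues;
  }

  // Hypothetical convenience wrapper: decode any InputStream with the given
  // charset and hand it to the reader-based method. Closing the reader also
  // closes the underlying stream.
  public static List<SimpleImmutableEntry<String, String>> readKeyValuesFromStream(
      final InputStream stream, final String splitRegex, final Charset charset)
      throws IOException {
    try (BufferedReader reader =
        new BufferedReader(new InputStreamReader(stream, charset))) {
      return readKeyValuesFromReader(reader, splitRegex);
    }
  }

  public static void main(final String[] args) throws IOException {
    // In-memory bytes standing in for the body of an S3 GetObject response.
    final InputStream fakeS3Body = new ByteArrayInputStream(
        "BagIt-Version: 0.97\nTag-File-Character-Encoding: UTF-8\n"
            .getBytes(StandardCharsets.UTF_8));
    System.out.println(readKeyValuesFromStream(fakeS3Body, ":", StandardCharsets.UTF_8));
  }
}
```

With something like this, the caller owns both the stream and the charset choice, and nothing is written to local disk.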

Thoughts?
