Which tool to load and share metadata? #13

@wehrad

We initially explored datalad, but other options are very interesting too:

datalad

Very powerful because it is built directly on git-annex, but I still haven't fully understood how to use it properly and efficiently.
Datalad is a data management system, and only that (to my knowledge). It is very efficient because it concentrates on this one task, but that somewhat limits our application, or calls for combining it with other tools, which might be just fine.

intake

A simple but powerful set of tools. Because it is simple, the community could easily contribute new catalog entries (through yaml files).

  • Allows for local file caching

  • Dask capabilities for big data

  • Cloud access support

  • Possibility for a simple GUI

  • Storing catalog metadata in files makes the structure of our portal efficient and very easy to understand.

  • The yaml format makes community contributions easier, even from non-coders (json, and xml even more so, can be intimidating to people not used to coding at all).

Intake is more than just a data management tool. It streamlines not only the data download step but also reading the data, through the many drivers available (and new ones are easy to implement).
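
To make the yaml-catalog idea concrete, here is a minimal sketch of what a contributed entry could look like and how it might be opened. It assumes intake's classic yaml catalog layout (`sources:` / `driver:` / `args:`); the entry name `example_stations`, the CSV file, the contact address and the helper `write_example_catalog` are all made up for illustration and are not part of our portal.

```python
from pathlib import Path
import tempfile

# Hypothetical catalog entry: the source name, description, contact and CSV
# path are invented for this sketch.
CATALOG_YAML = """\
sources:
  example_stations:
    description: Hypothetical table of station positions
    driver: csv
    args:
      urlpath: "{{ CATALOG_DIR }}/stations.csv"
    metadata:
      contact: someone@example.org
"""


def write_example_catalog(directory):
    """Write the catalog and a tiny CSV file next to it; return the catalog path."""
    directory = Path(directory)
    (directory / "stations.csv").write_text("name,lat,lon\nA,67.0,-50.0\n")
    catalog_path = directory / "catalog.yml"
    catalog_path.write_text(CATALOG_YAML)
    return catalog_path


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        catalog_path = write_example_catalog(tmp)
        try:
            import intake  # third-party package, may not be installed
        except ImportError:
            print("intake is not installed; wrote", catalog_path.name)
        else:
            catalog = intake.open_catalog(str(catalog_path))
            print(list(catalog))  # names of the entries found in the catalog
```

With the classic API, an entry would then typically be read with something like `catalog.example_stations.read()`, but the exact behaviour depends on the intake version and the drivers installed, so treat that call as an assumption to check.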

pooch

Simple and similar to intake, except that data sources are not really organized as catalogs. It was developed to download test data for libraries, so we might run into some limitations for our metadata portal.
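
For comparison, a rough sketch of how the same file could be handled with pooch: instead of a catalog, there is a registry that maps file names to hashes. The base URL, the cache name `metadata-portal-example` and the helpers `sha256_entry` and `build_fetcher` are hypothetical, chosen only for this example.

```python
import hashlib
from pathlib import Path


def sha256_entry(path):
    """Return a 'sha256:<hex>' string, the form used for pooch registry hashes."""
    digest = hashlib.sha256(Path(path).read_bytes()).hexdigest()
    return f"sha256:{digest}"


def build_fetcher(registry):
    """Create a pooch fetcher for a hypothetical data server.

    `registry` maps file names to hash strings, e.g. {"stations.csv": "sha256:..."}.
    """
    import pooch  # third-party package, imported lazily so the helper above works without it

    return pooch.create(
        path=pooch.os_cache("metadata-portal-example"),  # local cache directory
        base_url="https://example.org/data/",            # placeholder URL
        registry=registry,
    )
```

Calling `build_fetcher(registry).fetch("stations.csv")` would download the file on first use, verify its hash, and return the path of the cached copy. Note that the registry only carries file names and hashes, so descriptive metadata would have to live somewhere else.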

This comparison will be further modified/refined.
