Challenges posed by ZIM files with a huge number of entries #1054

@veloman-yunkan

Recently there have been a couple of issues spawned by the attempts to convert (geographic) digital maps to ZIM format:

The problem boils down to the fact that the number of entries in such a ZIM file created from full-world data exceeds the number of entries in our largest ZIM files so far by an order of magnitude (hundreds of millions vs tens of millions in wikipedia_en_all_maxi). Such a leap pushes the current implementation of libzim, if not the ZIM file format spec itself, beyond limits that were implicitly assumed during design.

The huge number of entries leads to the following issues/problems/challenges:

  1. ZIM file size (inefficient use of ZIM file space by the entry listing, which has always been assumed to be much smaller than the space taken by the entry data).
  2. Memory consumption (during both ZIM file creation and consumption)
  3. Performance (during both ZIM file creation and consumption)
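
To give a feel for point 1, here is a rough back-of-envelope sketch of how the entry listing grows with the entry count. Every number in it is an assumption made for illustration: the function name `estimate_listing_bytes`, the per-entry byte sizes (a 16-byte fixed part per directory entry, zero-terminated path and title strings, an 8-byte and a 4-byte pointer per entry), the average path length, and the two entry counts, which merely stand in for "tens of millions" and "hundreds of millions". It does not measure any real ZIM file.

```python
def estimate_listing_bytes(entry_count, avg_path_len=20, avg_title_len=0,
                           dirent_fixed=16, path_ptr=8, title_ptr=4):
    """Rough size of the entry listing in bytes.

    All defaults are illustrative assumptions, not values read from libzim
    or from an actual ZIM file.
    """
    # Path and title are each assumed to be stored with a terminating zero byte.
    dirent = dirent_fixed + (avg_path_len + 1) + (avg_title_len + 1)
    # One directory entry plus one pointer in each of the two pointer lists.
    return entry_count * (dirent + path_ptr + title_ptr)


# Hypothetical entry counts, chosen only to contrast the two scales.
for count in (20_000_000, 300_000_000):
    gib = estimate_listing_bytes(count) / 2**30
    print(f"{count:>11,} entries -> ~{gib:.1f} GiB of entry listing")
```

Under these assumptions the listing grows linearly with the number of entries, so it stops being negligible once individual entries (for example, small map tiles) are themselves only a few hundred bytes or less.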
