Smarter bulk & unordered scans #43

Description

@ashvardanian

Currently ukv_scan only works for fully consistent, sorted exports of keys from collections.
With the bulk flag we allow prioritizing throughput over consistency, but a case can be made that ML-like pipelines don't need any dependency between operations whatsoever. Instead, they may use scans to sample entries uniformly at random, which would in turn require a full scan of keys. If the user leaves start_key unset, we can perform the bulk sampling behind the scenes ourselves.
It will make the interface uglier by making one function dual-use, but will keep the interface short. Worth considering.
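
To make the sampling idea concrete, here is a minimal sketch of how a single unordered pass over all keys could yield a uniform random sample, using reservoir sampling (Algorithm R). This is not the ukv_scan implementation or its API; the function name sample_keys, its parameters, and the plain Python iterable standing in for a full key scan are all assumptions made for illustration.

```python
import random
from typing import Iterable, List, Optional


def sample_keys(
    keys: Iterable[int],
    count: int,
    rng: Optional[random.Random] = None,
) -> List[int]:
    """Uniformly sample up to `count` keys from one pass over `keys`.

    Hypothetical helper: `keys` stands in for an unordered full scan of a
    collection. The order in which keys arrive does not affect uniformity.
    """
    rng = rng or random.Random()
    reservoir: List[int] = []
    for seen, key in enumerate(keys):
        if seen < count:
            # Fill the reservoir with the first `count` keys.
            reservoir.append(key)
        else:
            # Keep the new key with probability count / (seen + 1),
            # evicting a uniformly chosen earlier key.
            slot = rng.randint(0, seen)  # inclusive on both ends
            if slot < count:
                reservoir[slot] = key
    return reservoir


if __name__ == "__main__":
    # Stand-in for a full scan of one million keys.
    print(sample_keys(range(1_000_000), 5, random.Random(42)))
```

The sketch needs only O(count) memory and never relies on key order, which is why a throughput-oriented bulk scan would be enough to drive it.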
