bitemyapp/bloodhound

Bloodhound

Elasticsearch and OpenSearch client and query DSL for Haskell

Why?

Search doesn't have to be hard. Let the dog do it.

Endorsements

"Bloodhound makes Elasticsearch almost tolerable!" - Almost-gruntled user

"ES is a nightmare but Bloodhound at least makes it tolerable." - Same user, later opinion.

Version compatibility

Bloodhound supports and is continuously tested against the following Elasticsearch and OpenSearch releases (see the compat workflow and the CI runs):

Backend          CI-tested version
Elasticsearch    7.17.25
Elasticsearch    8.19.16
Elasticsearch    9.4.2
OpenSearch       1.3.19
OpenSearch       2.19.5
OpenSearch       3.7.0

Backends

Each backend is exposed as its own type-safe module set, so only the endpoints that actually exist on that version are available:

  • Database.Bloodhound.ElasticSearch7.*
  • Database.Bloodhound.ElasticSearch8.*
  • Database.Bloodhound.ElasticSearch9.*
  • Database.Bloodhound.OpenSearch1.*
  • Database.Bloodhound.OpenSearch2.*
  • Database.Bloodhound.OpenSearch3.*

A shared Database.Bloodhound.Common.* layer carries the cross-backend surface, and Database.Bloodhound.Dynamic.* dispatches to the right backend at runtime (detected from the server). Use StaticBH backend m a for type-safe, backend-specific code.
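To illustrate the idea of tagging code with a backend at the type level, here is a minimal sketch using only base. It is not Bloodhound's actual implementation: the Backend kind, the Request type, the endpoint paths, and every function name below are hypothetical and chosen only for illustration.

```haskell
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE KindSignatures #-}

module Main where

-- Hypothetical backend tags; these are NOT Bloodhound's real definitions.
data Backend = ElasticSearch7 | OpenSearch2

-- A request tagged with the backend it is valid for. The 'backend'
-- parameter is a phantom type: it exists only at the type level.
newtype Request (backend :: Backend) = Request {requestPath :: String}
  deriving (Show)

-- Hypothetical endpoint that every backend offers, so it stays polymorphic.
clusterHealth :: Request backend
clusterHealth = Request "/_cluster/health"

-- Hypothetical endpoint that exists on a single backend only.
openSearchOnly :: Request 'OpenSearch2
openSearchOnly = Request "/_plugins/_example"

-- A runner that accepts only requests valid for the OpenSearch2 tag.
runOnOpenSearch2 :: Request 'OpenSearch2 -> String
runOnOpenSearch2 (Request path) = "GET " ++ path

-- A runner that accepts only requests valid for the ElasticSearch7 tag.
runOnElasticSearch7 :: Request 'ElasticSearch7 -> String
runOnElasticSearch7 (Request path) = "GET " ++ path

main :: IO ()
main = do
  putStrLn (runOnOpenSearch2 clusterHealth)
  putStrLn (runOnOpenSearch2 openSearchOnly)
  putStrLn (runOnElasticSearch7 clusterHealth)
  -- The next line would be rejected by the type checker, because
  -- 'openSearchOnly' is tagged with a different backend:
  -- putStrLn (runOnElasticSearch7 openSearchOnly)
```

In this sketch the compiler, rather than a runtime check, rejects a request sent to a backend that does not offer it. That is the general benefit of a backend-indexed type such as StaticBH backend m a.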

Stability

Bloodhound is stable for production use. I will strive to avoid breaking API compatibility going forward, but major features such as a type-safe, fully integrated mapping API may require breaking changes in the future.

Testing

The Bloodhound project uses GitHub workflows with Cabal to test for regressions and compatibility. Nix and a Makefile provide a convenient development environment, though the project can be built with Cabal alone.

To run the tests:

  1. Get into the Nix environment by running nix develop (or nix-shell for a non-flake setup)
  2. Start the Elasticsearch instance defined in docker-compose.yml: make compose
  3. Run the tests with Cabal: cabal test

The second step can be skipped if Elasticsearch (or OpenSearch) is started manually.

Contributions

Contributions are welcome. For consistency, code is formatted with ormolu.

Hackage page and Haddock documentation

http://hackage.haskell.org/package/bloodhound

Elasticsearch Tutorial

It does not use Bloodhound, but if you need an introduction to or overview of Elasticsearch and how to use it, you can use this screencast.

Examples

See the examples directory for example code.

Index a document

indexDocument testIndex defaultIndexDocumentSettings exampleTweet (DocId "1")
{-
IndexedDocument
  { idxDocIndex = "twitter"
  , idxDocType = "_doc"
  , idxDocId = "1"
  , idxDocVersion = 3
  , idxDocResult = "updated"
  , idxDocShards =
      ShardResult
        { shardTotal = 1
        , shardsSuccessful = 1
        , shardsSkipped = 0
        , shardsFailed = 0
        }
  , idxDocSeqNo = 2
  , idxDocPrimaryTerm = 1
  }
-}
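The call above assumes that a Tweet type with JSON instances and an exampleTweet value are in scope. A minimal sketch of what they might look like follows, with field names and values taken from the search output in this README. The Location type is a hypothetical stand-in for the LatLon type shown in that output, so that the sketch depends only on aeson, text, and time. The testIndex value is left out because how an index name is constructed depends on the Bloodhound version.

```haskell
{-# LANGUAGE DeriveGeneric #-}
{-# LANGUAGE OverloadedStrings #-}

module Main where

import Data.Aeson (FromJSON, ToJSON, decode, encode)
import Data.Text (Text)
import Data.Time.Calendar (fromGregorian)
import Data.Time.Clock (UTCTime (..), secondsToDiffTime)
import GHC.Generics (Generic)

-- Hypothetical stand-in for the LatLon type shown in the search output.
data Location = Location
  { lat :: Double
  , lon :: Double
  }
  deriving (Eq, Show, Generic)

instance ToJSON Location

instance FromJSON Location

-- Document type; field names follow the output shown in this README.
data Tweet = Tweet
  { user :: Text
  , postDate :: UTCTime
  , message :: Text
  , age :: Int
  , location :: Location
  }
  deriving (Eq, Show, Generic)

-- Generic-derived instances: the JSON keys match the record field names.
instance ToJSON Tweet

instance FromJSON Tweet

exampleTweet :: Tweet
exampleTweet =
  Tweet
    { user = "bitemyapp"
    , postDate = UTCTime (fromGregorian 2009 6 18) (secondsToDiffTime 10)
    , message = "Use haskell!"
    , age = 10000
    , location = Location 40.12 (-71.3)
    }

-- Checks that the document survives a JSON encode/decode round trip,
-- which is what indexing and then fetching it relies on.
main :: IO ()
main = print (decode (encode exampleTweet) == Just exampleTweet)
```

Deriving the instances through Generic keeps the JSON keys identical to the record field names, which is why a term query on "user" matches the user field of the indexed document.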

Fetch documents

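-- `boost` is not defined in the original snippet; Nothing is assumed here.
-- It serves as "no boost" for the term query and "no filter" for mkSearch.
let boost = Nothing
let query = TermQuery (Term "user" "bitemyapp") boost
let search = mkSearch (Just query) boost
searchByIndex @_ @Tweet testIndex search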
{-
SearchResult
    { took = 1
    , timedOut = False
    , shards =
            ShardResult
                { shardTotal = 1
                , shardsSuccessful = 1
                , shardsSkipped = 0
                , shardsFailed = 0
                }
    , searchHits =
            SearchHits
                { hitsTotal = HitsTotal { value = 2 , relation = HTR_EQ }
                , maxScore = Just 0.18232156
                , hits =
                        [ Hit
                                { hitIndex = IndexName "twitter"
                                , hitDocId = DocId "1"
                                , hitScore = Just 0.18232156
                                , hitSource =
                                        Just
                                            Tweet
                                                { user = "bitemyapp"
                                                , postDate = 2009-06-18 00:00:10 UTC
                                                , message = "Use haskell!"
                                                , age = 10000
                                                , location = LatLon { lat = 40.12 , lon = -71.3 }
                                                }
                                , hitSort = Nothing
                                , hitFields = Nothing
                                , hitHighlight = Nothing
                                , hitInnerHits = Nothing
                                }
                        , Hit
                                { hitIndex = IndexName "twitter"
                                , hitDocId = DocId "2"
                                , hitScore = Just 0.18232156
                                , hitSource =
                                        Just
                                            Tweet
                                                { user = "bitemyapp"
                                                , postDate = 2009-06-18 00:00:10 UTC
                                                , message = "Use haskell!"
                                                , age = 10000
                                                , location = LatLon { lat = 40.12 , lon = -71.3 }
                                                }
                                , hitSort = Nothing
                                , hitFields = Nothing
                                , hitHighlight = Nothing
                                , hitInnerHits = Nothing
                                }
                        ]
                }
    , aggregations = Nothing
    , scrollId = Nothing
    , suggest = Nothing
    , pitId = Nothing
    }
-}

Possible future functionality

Span Queries

Beginning here: https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-span-first-query.html

Node discovery and failover

Might require TCP support.

Support for TCP access to Elasticsearch

Pretend to be a transport client?

Bulk cluster-join merge

Might require making a lucene index on disk with the appropriate format.

GeoShapeQuery

https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-geo-shape-query.html

Script based sorting

https://www.elastic.co/guide/en/elasticsearch/reference/current/search-request-sort.html#_script_based_sorting

Runtime checking for cycles in data structures

Check for n > 1 occurrences in a DFS:

http://hackage.haskell.org/package/stable-maps-0.0.5/docs/System-Mem-StableName-Dynamic.html

http://hackage.haskell.org/package/stable-maps-0.0.5/docs/System-Mem-StableName-Dynamic-Map.html

Photo Origin

Photo from HA! Designs: https://www.flickr.com/photos/hadesigns/
