    Marty Schoch
    @mschoch
    bleve appears to perform the entire search from scratch
    yes it does, the size/skip literally does just that, runs the entire search, and skips over results
    there is no easy way to cache this in some useful way to save work getting the second page
    alternatively we have a different method, it is most useful to allow for "deep pagination" but it may suit your use-case as well
    you can read more about that feature here: blevesearch/bleve#1182
    essentially, you run search for page 1 results as usual, next, if you want to access page 2, you don't use skip=10, you pass search_after and use the sort key from the last result of page 1
    this has a different set of trade-offs, as you cannot jump to an arbitrary page, just first/last/next/prev
    there are unit tests showing how it can be used
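To make the difference between the two paging styles concrete, here is a minimal sketch in plain Go. It does not call bleve at all: the `hit` type and the `pageBySkip` and `pageAfter` helpers are hypothetical stand-ins, and the in-memory slice stands in for a result set that is already sorted by a unique sort key. The real request syntax is described in blevesearch/bleve#1182 and the unit tests mentioned above.

```go
package main

import (
	"fmt"
	"sort"
)

// hit is a hypothetical stand-in for a search result. SortKey plays the
// role of the sort key that comes back with each hit.
type hit struct {
	ID      string
	SortKey string
}

// makeHits builds n hits that are already ordered by a unique SortKey.
func makeHits(n int) []hit {
	hits := make([]hit, n)
	for i := range hits {
		id := fmt.Sprintf("doc-%02d", i)
		hits[i] = hit{ID: id, SortKey: id}
	}
	return hits
}

// pageBySkip models size/skip paging: the caller names an absolute offset,
// so any page can be requested, but the results before it are produced
// and then thrown away.
func pageBySkip(sorted []hit, size, skip int) []hit {
	if skip >= len(sorted) {
		return nil
	}
	end := skip + size
	if end > len(sorted) {
		end = len(sorted)
	}
	return sorted[skip:end]
}

// pageAfter models search_after paging: it seeks to the first hit whose
// sort key is strictly greater than after and returns up to size hits.
// An empty after means "start at the first page" in this sketch.
func pageAfter(sorted []hit, size int, after string) []hit {
	start := 0
	if after != "" {
		start = sort.Search(len(sorted), func(i int) bool {
			return sorted[i].SortKey > after
		})
	}
	end := start + size
	if end > len(sorted) {
		end = len(sorted)
	}
	return sorted[start:end]
}

func main() {
	hits := makeHits(25)
	after := ""
	for page := 1; ; page++ {
		results := pageAfter(hits, 10, after)
		if len(results) == 0 {
			break
		}
		fmt.Printf("page %d: %s .. %s\n", page, results[0].ID, results[len(results)-1].ID)
		// The sort key of the last hit becomes the cursor for the next page.
		after = results[len(results)-1].SortKey
	}
}
```

The cursor style only works if the sort order is total, so a unique tie-breaker (for example the document ID) is usually appended to the sort.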
    Marty Schoch
    @mschoch
    today we do not support any changes to the index mapping once the index is created, however I know some users have found a way to add new fields, so it is possible, we just don't support it
    Amnon
    @amnonbc

    Thanks for the answers.

    I'll look at the deep pagination feature.

    Korede Oluwafemi
    @Koredeoluwafemi
    Hi everyone, is there a way bleve search results can be converted to structs?
    Korede Oluwafemi
    @Koredeoluwafemi
    @mschoch
    Korede Oluwafemi
    @Koredeoluwafemi
    okay, thanks @mschoch, I was actually looking for a way to convert the indexed object back into a struct
    Korede Oluwafemi
    @Koredeoluwafemi
    @amnonbc
    ABHAY MANIYAR
    @abhaymaniyar

    @mschoch Hello Marty,
    Bleve is a fantastic search library. Kudos to you and all the contributors.

    I have a use-case that I want to use bleve for. I have a list of objects which I want to index but don't want the index data store files to be saved on server instance. Do we have any cloud support available for bleve? Or can we convert the data store files into a serialized form to save it on cloud or S3 and fetch them before use?

    Marty Schoch
    @mschoch
    @Koredeoluwafemi the index is a flat list of fields, if you want to convert this back to a struct, it is up to your application to do that
    @abhaymaniyar unfortunately in Bleve it's pretty hard-coded that the segment files are on disk locally.
    I have a newer library bluge (https://github.com/blugelabs/bluge) which is an experimental fork of bleve. It has support for a pluggable "directory" interface which removes this limitation. There has been interest expressed in the slack channel to add s3 support to bluge, but no work has started yet.
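As an illustration of the "up to your application" part of the reply to @Koredeoluwafemi, here is one possible sketch in plain Go. It assumes the stored fields of a hit arrive as a flat `map[string]interface{}` keyed by dotted field paths such as `brewery.city`. The `Beer` and `Brewery` types and the `unflatten` and `decode` helpers are hypothetical, and the JSON round trip is just one convenient way to do the conversion, not something bleve provides.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// Brewery and Beer are hypothetical application types.
type Brewery struct {
	Name string `json:"name"`
	City string `json:"city"`
}

type Beer struct {
	Name    string  `json:"name"`
	ABV     float64 `json:"abv"`
	Brewery Brewery `json:"brewery"`
}

// unflatten turns dotted field names ("brewery.city") into nested maps.
// It does not handle a name that is both a value and a prefix of another name.
func unflatten(fields map[string]interface{}) map[string]interface{} {
	root := map[string]interface{}{}
	for name, value := range fields {
		parts := strings.Split(name, ".")
		node := root
		for _, p := range parts[:len(parts)-1] {
			child, ok := node[p].(map[string]interface{})
			if !ok {
				child = map[string]interface{}{}
				node[p] = child
			}
			node = child
		}
		node[parts[len(parts)-1]] = value
	}
	return root
}

// decode rebuilds a struct from flat fields by round-tripping through JSON.
func decode(fields map[string]interface{}, out interface{}) error {
	raw, err := json.Marshal(unflatten(fields))
	if err != nil {
		return err
	}
	return json.Unmarshal(raw, out)
}

func main() {
	// Assumed shape of the stored fields returned with one hit.
	fields := map[string]interface{}{
		"name":         "Pale Ale",
		"abv":          5.2,
		"brewery.name": "Acme Brewing",
		"brewery.city": "Leeds",
	}
	var beer Beer
	if err := decode(fields, &beer); err != nil {
		panic(err)
	}
	fmt.Printf("%+v\n", beer)
}
```

Only fields that were stored in the index and requested with the search can be recovered this way; anything else has to come from the application's own data store.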
    ABHAY MANIYAR
    @abhaymaniyar
    @mschoch Can we serialize the datastore in bluge?
    Marty Schoch
    @mschoch
    @abhaymaniyar I don't know what "serialize the datastore" means
    Johann Tanzer
    @tulpenhaendler
    Hi all, I just found bleve and bluge a few days ago and started to experiment with it, so far looks very good, great work @mschoch!
    What I am a bit confused about right now is how "production ready" bluge is, or whether I should use bleve or bluge for a new project. Generally speaking I am leaning towards bluge, just because bleve seems a bit bloated with different indexes and query parsers and all. I would prefer the much slimmer bluge right now, but I'm not sure if that's a good idea
    Johann Tanzer
    @tulpenhaendler
    on a side note, I actually have a similar use case to @abhaymaniyar in terms of S3 storage and was super happy to see the Directory Interface in bluge, but i would just use bluge -> Directory Interface -> afero( mix of local disk and s3 )
    Marty Schoch
    @mschoch
    welcome @tulpenhaendler in my opinion, bluge is still only at developer preview release quality. it works for my use cases, but like bleve, it has unit tests, and some basic full-stack tests, but really lacks something more rigorous. in bleve, we have gotten by because Couchbase has invested in a considerable test suite for their product, which while not perfect, has functioned as a sort of proxy for that.
    i will probably be making some announcements about bluge in the near future, but one thing is clear, it will need help from the community to become production ready, it is not something i will be able to do myself.
    i appreciate you liking bluge being much leaner, that was one of my core goals when i started it
    regarding the directory interface, it should be possible to do something with s3, but i suspect as currently implemented, you'll need some sort of local caching layer, and i'm not sure that bluge's use of segments will make that easy (not impossible either though)
    Marty Schoch
    @mschoch
    to me a longer-term interesting idea would be to use s3 lambda, to push-down search of a segment into s3, and only return the relevant matches/meta-data, not even having to download all or part of the raw segment from s3
    Johann Tanzer
    @tulpenhaendler
    thanks for your answer, i am going to look into bluge more and i plan to write some benchmarks, will share that of course
    Johann Tanzer
    @tulpenhaendler
    for s3, i want to have lots of indexes sharded among multiple instances, and they would just fetch the entire index once if they don't already have it locally and upload snapshots periodically (so i don't actually need bluge to do anything s3 related). technically i guess s3 supports Range queries, so you might even be able to implement something like ReadAt(offset,len), but imo it would be a strange use case where you need to use s3 like a filesystem... playing around on the aws cost calculator: just the PUT requests for 5 writes/second are about the same price as a 500gb ebs drive per month...
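The cost comparison above can be worked through as simple arithmetic. The sketch below is in Go, and both prices are assumptions chosen for illustration (0.005 USD per 1,000 PUT requests and 0.10 USD per GB-month of block storage); they are not figures from this conversation, so check the current AWS price list before relying on them.

```go
package main

import "fmt"

const (
	// Assumed list prices, for illustration only.
	putPricePer1000    = 0.005 // USD per 1,000 PUT requests (assumption)
	ebsPricePerGBMonth = 0.10  // USD per GB-month (assumption)

	// A 30-day month expressed in seconds.
	secondsPerMonth = 30 * 24 * 60 * 60
)

// monthlyPutCost returns the request cost of a steady write rate.
func monthlyPutCost(writesPerSecond float64) float64 {
	requests := writesPerSecond * secondsPerMonth
	return requests / 1000 * putPricePer1000
}

// monthlyEBSCost returns the storage cost of a volume of the given size.
func monthlyEBSCost(sizeGB float64) float64 {
	return sizeGB * ebsPricePerGBMonth
}

func main() {
	fmt.Printf("PUT requests at 5 writes/s: %.2f USD/month\n", monthlyPutCost(5))
	fmt.Printf("500 GB volume:              %.2f USD/month\n", monthlyEBSCost(500))
}
```

Under these assumed prices, 5 writes/second is 12,960,000 requests per month, which comes to roughly 65 USD against roughly 50 USD for the volume, so the two are in the same range.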
    Marty Schoch
    @mschoch
    Ah ok, I guess we just have different use cases. The indexes I work with are hundreds of GB or larger, so even individual segments are quite large. The latency implied by downloading a segment you don't have yet would be unusable.
    ged
    @gedw99
    I am adding a GIOUI (golang) GUI to beer search, and was wondering whether you want it upstream or whether I should keep it in my own repo.
    I am talking about this repo: https://github.com/blugelabs/beer-search
    here is a kitchen sink demo of GIOUI: https://gioui.org/files/wasm/kitchen/index.html
    Because beer search expects a FS, i will be using the golang FS wrapper that compiles to WASM and normal GO.
    Marty Schoch
    @mschoch
    @gedw99 in general I try to keep the examples focused on the bleve/bluge aspects of the application. No matter which JS library we choose, or even choosing to go without one, it still gets in the way when we build web-based examples. Using a Go UI library may help some because the code is in Go, but it still hurts because it isn't how most apps would actually be built (today). So, for now I encourage you to build this if it is of interest to you, but I cannot say whether or not it would be accepted upstream. When you have something working that we can look at/use, please share it here again.
    Scott Cotton
    @wsc0
    Hi, I am wondering if there is an easy way to selectively index : for example to provide a database of words not to index at index creation time in order to speed it up and focus it on specific application needs. I can code it if need be, but would be interested in some pointers...
    Marty Schoch
    @mschoch
    @wsc0 custom words not to index is already supported, it is called a stop word token filter
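For readers unfamiliar with the term, this is a minimal sketch of what a stop word token filter does, written in plain Go. It is not bleve's implementation: the `stopFilter` type, `newStopFilter`, and the whitespace `tokenize` helper are hypothetical, and a real analyzer would tokenize more carefully.

```go
package main

import (
	"fmt"
	"strings"
)

// stopFilter drops tokens that appear in a caller-supplied stop word set.
type stopFilter struct {
	stop map[string]struct{}
}

// newStopFilter builds a filter from a list of words, lowercased so that
// matching is case-insensitive once tokens are lowercased too.
func newStopFilter(words ...string) stopFilter {
	set := make(map[string]struct{}, len(words))
	for _, w := range words {
		set[strings.ToLower(w)] = struct{}{}
	}
	return stopFilter{stop: set}
}

// filter returns the tokens that are not stop words, preserving order.
func (f stopFilter) filter(tokens []string) []string {
	kept := make([]string, 0, len(tokens))
	for _, t := range tokens {
		if _, isStop := f.stop[t]; !isStop {
			kept = append(kept, t)
		}
	}
	return kept
}

// tokenize is a deliberately naive tokenizer: lowercase, split on whitespace.
func tokenize(text string) []string {
	return strings.Fields(strings.ToLower(text))
}

func main() {
	f := newStopFilter("the", "a", "over")
	tokens := f.filter(tokenize("The quick fox jumps over a log"))
	// Only the surviving tokens would be written to the index.
	fmt.Println(tokens)
}
```

Because the filtered tokens never reach the index, the same filter normally has to run at query time as well, otherwise queries containing stop words would look for terms that were never indexed.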
    Marty Schoch
    @mschoch
    Hey Everyone, at the end of August I will be stepping back from the Bleve project. I am not sure if any of the core developers will monitor conversations here, consider the Gophers slack if you need assistance: https://groups.google.com/g/bleve/c/jghqnwh_VjQ/m/766ZGYqSAQAJ
    Panagiotis Koursaris
    @panakour
    Hi Marty, thank you for letting us know. What about bluge ?
    Marty Schoch
    @mschoch
    @panakour I'd like to see Bluge continue, but it will require a group of interested people to make that happen. I hope to share more news about that soon.
    Johann Tanzer
    @tulpenhaendler
    I am interested.
    Panagiotis Koursaris
    @panakour
    I am also interested, but I don't know how the core of bluge works, or in general the algorithms you need in order to develop/maintain an indexing library
    James Mills
    @prologic
    Hey all 👋 Wondering how I go about indexing social-media style #hashtag(s) and being able to do interesting things with them. Is this within bleve's scope?
    Is this what's called a Facet?
    James Mills
    @prologic
    Not sure how long I should hang around for a response 🤔
    ged
    @gedw99
    Did you see the beer search demo? There you can see what a facet is
    For beer the facets are alcohol %
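To connect the hashtag question with facets: a facet reports how many matching documents fall into each term or range of a field, such as the alcohol % buckets in the beer search demo. One possible approach for hashtags, sketched below in plain Go without bleve, is to extract the tags into their own field at indexing time and then count documents per tag. The `extractHashtags` and `facetCounts` helpers and the `termCount` type are hypothetical names used only for this illustration.

```go
package main

import (
	"fmt"
	"regexp"
	"sort"
	"strings"
)

// hashtagRE matches a '#' followed by ASCII letters, digits, or underscores.
var hashtagRE = regexp.MustCompile(`#(\w+)`)

// extractHashtags returns the distinct lowercased hashtags found in text,
// in order of first appearance and without the leading '#'.
func extractHashtags(text string) []string {
	seen := map[string]bool{}
	var tags []string
	for _, m := range hashtagRE.FindAllStringSubmatch(text, -1) {
		tag := strings.ToLower(m[1])
		if !seen[tag] {
			seen[tag] = true
			tags = append(tags, tag)
		}
	}
	return tags
}

// termCount is one facet bucket: a term and the number of posts containing it.
type termCount struct {
	Term  string
	Count int
}

// facetCounts counts posts per hashtag and returns the top size buckets,
// ordered by count (descending) and then by term (ascending).
func facetCounts(posts []string, size int) []termCount {
	counts := map[string]int{}
	for _, post := range posts {
		for _, tag := range extractHashtags(post) {
			counts[tag]++
		}
	}
	buckets := make([]termCount, 0, len(counts))
	for term, n := range counts {
		buckets = append(buckets, termCount{Term: term, Count: n})
	}
	sort.Slice(buckets, func(i, j int) bool {
		if buckets[i].Count != buckets[j].Count {
			return buckets[i].Count > buckets[j].Count
		}
		return buckets[i].Term < buckets[j].Term
	})
	if len(buckets) > size {
		buckets = buckets[:size]
	}
	return buckets
}

func main() {
	posts := []string{
		"Loving #golang and #search today",
		"#Golang tips for beginners",
		"No tags here",
	}
	for _, b := range facetCounts(posts, 5) {
		fmt.Printf("#%s: %d\n", b.Term, b.Count)
	}
}
```

In a search library the counting would be done by the engine over the documents that match a query; the sketch only shows what the resulting buckets represent.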