    Andrea Di Cesare
    @ujibang
    @Babamon_gitlab regarding your question about the pagesize: it depends on the size of the documents. MongoDB allows documents up to 16MB. If you have large documents, a big pagesize can result in significant network and BSON-to-JSON conversion overhead, so you need to adjust the pagesize according to your use case.
    In the configuration file you'll find a few options that allow you to tune the read performance
    ## Read Performance
    
    # default-pagesize is the number of documents returned when the pagesize query
    # parameter is not specified
    # see https://restheart.org/docs/mongodb-rest/read-docs#paging
    default-pagesize: 100
    
    # max-pagesize sets the maximum allowed value of the pagesize query parameter
    # generally, the greater the pagesize, the more JSON serialization overhead occurs
    # the rule of thumb is not exceeding 1000
    max-pagesize: 1000
    
    # cursor-batch-size sets the mongodb cursor batchSize
    # see https://docs.mongodb.com/manual/reference/method/cursor.batchSize/
    # cursor-batch-size should be smaller than or equal to the max-pagesize
    # the rule of thumb is setting cursor-batch-size equal to max-pagesize
    # a small cursor-batch-size (e.g. 101, the default mongodb batchSize)
    # speeds up requests with small pagesize
    cursor-batch-size: 1000
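    For example, a client can override the default pagesize on a single request like this (the mydb/sensors collection is just an illustration; remember that pagesize cannot exceed max-pagesize):
    # ask for 500 documents per page from a hypothetical sensors collection
    http -a admin:secret GET 'http://localhost:8080/mydb/sensors?pagesize=500&page=1'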
    A Sadeghioon
    @Babamon_gitlab
    @ujibang Thank you, I would really like to support the project as I think it's great. Is there a link where I can make a payment towards the project?
    Andrea Di Cesare
    @ujibang
    wow @Babamon_gitlab thank you. We are actually enabling GitHub Sponsors. I'll let you know when it is active for RESTHeart!
    A Sadeghioon
    @Babamon_gitlab
    @ujibang is there any way to pass allowDiskUse:true for a request? I have a very large collection; the documents are fairly small but there are millions of them (they are sensor data). When I run a request with a hint I get an out-of-memory error (for sorting)
    ERROR o.r.mongodb.handlers.ErrorHandler - Error handling the request com.mongodb.MongoQueryException: Query failed with error code 292 and error message 'Executor error during find command :: caused by :: Sort exceeded memory limit of 104857600 bytes, but did not opt in to external sorting. Aborting operation. Pass allowDiskUse:true to opt in.' on server
    A Sadeghioon
    @Babamon_gitlab
    I found some information on aggregations, but I mean: is it possible to pass it as a query parameter? If yes, how does it need to be formatted?
    A Sadeghioon
    @Babamon_gitlab
    I also have a problem when I am reading a collection with a filter. Let's say the result is 1000 pages long; as I request the pages one by one, the query gets slower and slower. Is this normal?
    A Sadeghioon
    @Babamon_gitlab
    It's almost the same performance (good) for the first 200K documents and then it hits the wall, regardless of page size
    Andrea Di Cesare
    @ujibang
    Hi @Babamon_gitlab, in this use case you need to use aggregations https://restheart.org/docs/mongodb-rest/aggregations/ which support allowDiskUse
    the aggregation is defined in the collection metadata and can use parameters passed via ?avars={"var1":1, "var2": {"an": "object"}}
    regarding the degrading performance with far pages, this is normal and it depends on how MongoDB works. Skipping many results in a find (this is how paging is implemented) on a large collection is not very performant
    in this case you need to define an aggregation that paginates the results on some query condition, for instance on a time-based interval. This aggregation can also use the ?page and ?pagesize values as parameters, see https://restheart.org/docs/mongodb-rest/aggregations/#predefined-variables
    This approach is referred to as "range queries", see in MongoDB documentation https://docs.mongodb.com/manual/reference/method/cursor.skip/#using-range-queries
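    Just as a sketch, such an aggregation definition could look like this (the /sensors URI and the ts field are hypothetical; see the linked docs for the exact allowDiskUse and variable syntax):
    PATCH /sensors HTTP/1.1
    
    {
      "aggrs": [
        {
          "type": "pipeline",
          "uri": "by-time-range",
          "allowDiskUse": true,
          "stages": [
            { "$match": { "ts": { "$gte": { "$var": "from" }, "$lt": { "$var": "to" } } } },
            { "$sort": { "ts": 1 } }
          ]
        }
      ]
    }
    You would then call it with something like GET /sensors/_aggrs/by-time-range?avars={"from":{"$date":"2022-01-01T00:00:00Z"},"to":{"$date":"2022-01-02T00:00:00Z"}} and move the time window forward instead of skipping documents.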
    Andrea Di Cesare
    @ujibang
    Range queries can use indexes to avoid scanning unwanted documents, typically yielding better performance as the offset grows compared to using skip() for pagination.
    Andrea Di Cesare
    @ujibang
    🔥 We just enabled github sponsors for RESTHeart https://github.com/sponsors/SoftInstigate, any help to improve our beloved piece of code would be much appreciated! @Babamon_gitlab
    A Sadeghioon
    @Babamon_gitlab
    Thank you @ujibang, I am glad to be the first sponsor. I also noticed that the problem only happens when the query is requesting the last page: the performance is significantly worse there. No matter how many pages in total, the performance suddenly drops
    Andrea Di Cesare
    @ujibang
    Thanks @Babamon_gitlab for your sponsorship! Much appreciated
    Have you tried defining an aggregation with a range query? If needed, I can assist you with it
    Andrewzz
    @Andrewzz
    Hey team! Quick question. Is restheart compatible with swagger-ui? Or is there any way to create a swagger file from restheart natively?
    Andrea Di Cesare
    @ujibang
    Hi @Andrewzz, of course you can create a swagger file for the RESTHeart API. I personally used it several times; it is a matter of defining a yml file as in https://editor.swagger.io
    Maybe I didn't get your question… please elaborate
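    For reference, a minimal hand-written OpenAPI definition for a RESTHeart collection endpoint could start like this (the path and titles are just illustrative):
    openapi: 3.0.3
    info:
      title: My RESTHeart API
      version: "1.0"
    servers:
      - url: http://localhost:8080
    paths:
      /mydb/mycollection:
        get:
          summary: List documents in mycollection
          parameters:
            - name: page
              in: query
              schema: { type: integer }
            - name: pagesize
              in: query
              schema: { type: integer }
            - name: filter
              in: query
              schema: { type: string }
          responses:
            "200":
              description: An array of documents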
    Hussam Qasem
    @hussam-qasem
    I uploaded many files into a file bucket with PUT. I would like to clone the bucket into a different MongoDB server. What's the easiest way to accomplish the task? An intelligent HTTPie Script? MongoDB dump & restore? Thank you!
    Andrea Di Cesare
    @ujibang
    I would say mongodump/restore
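    Roughly something like this, assuming your bucket is called mybucket in database mydb (GridFS stores it as two collections, mybucket.files and mybucket.chunks):
    # dump the two GridFS collections from the source server
    mongodump --uri="mongodb://source-host:27017" --db=mydb --collection=mybucket.files --out=dump
    mongodump --uri="mongodb://source-host:27017" --db=mydb --collection=mybucket.chunks --out=dump
    # restore them into the target server
    mongorestore --uri="mongodb://target-host:27017" dump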
    The first Docker Community All Hands of 2022 is coming up this Thursday! I’ll be speaking about “Running RESTHeart with Docker”. Sign up here for JavaScript, Python, and Java tracks; workshops, Docker news and updates; and more: dockr.ly/3D9HTjr
    Hussam Qasem
    @hussam-qasem
    Thank you @ujibang for the prompt response. One more question. When I do a GET on a bucket (e.g. http://localhost:8080/mybucket.files), with or without a filter, is there a way to limit (or make unlimited) the number of returned documents? I noticed that it returns only 100. How do I get all? Or limit to 10 only? I know RESTHeart supports pagination, but I can't figure out how to use it. Thank you!!
    Andrea Di Cesare
    @ujibang
    you use ?pagesize=n to ask for n documents. However n has a limit, default 1000 (in the conf file you have max-pagesize: 1000)
    then you use ?page=x to ask for page number x
    so GET /mybucket.files?pagesize=100&page=3 will give you the third page of 100 files
    and yes ?filter={ <mongo query> } is the way to limit the result set.
    If you call GET /mybucket.files/_size?filter={ <mongo query> } you'll get the count of the files that match the query
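    Putting it together, something like this (the metadata.type field is just an example) returns the second page of 10 matching files:
    http -a admin:secret GET 'http://localhost:8080/mybucket.files?filter={"metadata.type":"invoice"}&pagesize=10&page=2'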
    Hussam Qasem
    @hussam-qasem
    Thank you Andrea. Much appreciated. Have a wonderful day!
    Hussam Qasem
    @hussam-qasem

    Greetings! I am testing retrieving a binary file from a bucket, but realized that many of the files were empty, and RESTHeart returns a 500 HTTP status code:

    % http --verify=no -a admin:secret -f GET https://localhost/storage/mybucket.files/myfile.jpg/binary
    
    HTTP/1.1 500 Internal Server Error
    Access-Control-Allow-Credentials: true
    Access-Control-Allow-Origin: *
    Access-Control-Expose-Headers: Location, ETag, X-Powered-By, Auth-Token, Auth-Token-Valid-Until, Auth-Token-Location
    Auth-Token: 3ixg98kbwzxso77wqpwt11y8z65a08icn27ssncbs2nlm085i0
    Auth-Token-Location: /tokens/admin
    Auth-Token-Valid-Until: 2022-04-04T18:28:26.530537652Z
    Connection: close
    Content-Disposition: inline; filename="file"
    Content-Length: 0
    Content-Transfer-Encoding: binary
    Content-Type: image/jpeg
    Date: Mon, 04 Apr 2022 18:13:26 GMT
    ETag: 6204a40e9bf8cb3fb5a0a642
    Server: Apache
    Set-Cookie: ROUTEID=.route1; path=/
    X-Powered-By: restheart.org

    Meanwhile, RESTHeart logs print:

    18:13:26.533 [XNIO-1 task-3] ERROR org.restheart.handlers.ErrorHandler - Error handling the request
     com.mongodb.MongoGridFSException: Unexpected Exception when reading GridFS and writing to the Stream
        at com.mongodb.client.gridfs.GridFSBucketImpl.downloadToStream(GridFSBucketImpl.java:578)
    Caused by: com.mongodb.MongoGridFSException: Could not find file chunk for file_id: BsonString{value='myfile.jpg'} at chunk index 0.
        at com.mongodb.client.gridfs.GridFSDownloadStreamImpl.getBufferFromChunk(GridFSDownloadStreamImpl.java:246)
    
    18:13:26.535 [XNIO-1 task-3] ERROR io.undertow.request - UT005071: Undertow request failed HttpServerExchange{ GET /mybucket.files/myfile.jpg/binary}
     com.mongodb.MongoGridFSException: Unexpected Exception when reading GridFS and writing to the Stream
        at com.mongodb.client.gridfs.GridFSBucketImpl.downloadToStream(GridFSBucketImpl.java:578)
    Caused by: com.mongodb.MongoGridFSException: Could not find file chunk for file_id: BsonString{value='myfile.jpg'} at chunk index 0.
        at com.mongodb.client.gridfs.GridFSDownloadStreamImpl.getBufferFromChunk(GridFSDownloadStreamImpl.java:246)
    
    18:13:26.537 [XNIO-1 task-3] INFO  org.restheart.handlers.RequestLogger - GET http://localhost/mybucket.files/myfile.jpg/binary from /127.0.0.1:34524 => status=500 elapsed=10ms contentLength=0 username=admin roles=[admin]

    Would you kindly help me decode the message and suggest how to solve it?

    1) Retrieving myfile.jpg metadata (without /binary) works fine

    2) I did delete a few documents from the mybucket.files collection using MongoDB Compass and didn't delete the corresponding documents in mybucket.chunks. I'm assuming MongoDB Compass does that automatically, or that it doesn't really matter.

    Andrea Di Cesare
    @ujibang

    From https://www.mongodb.com/docs/manual/core/gridfs/

    GridFS uses two collections to store files. One collection stores the file chunks, and the other stores file metadata. The section GridFS Collections describes each collection in detail.

    You should access your files via the GridFS API

    To store and retrieve files using GridFS, use either of the following:

    A MongoDB driver. See the drivers documentation for information on using GridFS with your driver.
    The mongofiles command-line tool. See the mongofiles reference for documentation.

    As far as I understand, you deleted data from one collection, so your bucket data is not consistent.

    That's the reason why you get the error from RESTHeart

    The mongo driver finds the metadata (stored in mybucket.files) but not the chunks (stored in mybucket.chunks)

    To fix the state of the bucket, you should make sure that all the documents in mybucket.files have the corresponding documents in mybucket.chunks
    Hussam Qasem
    @hussam-qasem
    Thank you @ujibang. In my case, I didn't use the GridFS API. It is my mistake; I thought MongoDB Compass was smart enough to detect that.
    Andrewzz
    @Andrewzz
    Hello team. Any word on the Spring4Shell vulnerabilities? Is restheart affected by any chance?
    Andrea Di Cesare
    @ujibang
    Hi @Andrewzz, RESTHeart does not use Spring at all. It is also continuously checked by Sonatype Lift, and we have 0 threats. See https://sbom.lift.sonatype.com/report/T1-0ff0976f7f21c391f20f-5fd315625ad1b2-1646908735-d19a2c6273764f4eb2775bee5c3499cc
    samharry
    @samharry
    Has anyone here connected DocumentDB via restheart?
    The post is quite old, but RESTHeart does work with DocumentDB. Of course some features of MongoDB are not supported by DocumentDB (such as transactions and change streams, I think) but most of the API works
    Maurizio Turatti
    @mkjsix

    The 6.3.0 release introduces a few bug fixes and some important security enhancements:

    ✅ Add new security interceptor bruteForceAttackGuard
    (defends against brute force attacks by returning "429 Too Many Requests" when more than 50% of the auth attempts from the same IP in the last 10 seconds have failed)
    ✅ Upgrade undertow to v2.2.16.Final
    ✅ Add WildcardInterceptor that allows intercepting requests to any service
    ✅ MongoRealmAuthenticator can check the password field on user document updates and reject it when it is too weak
    ✅ Ensure that the defined auth mechanisms are executed in the correct order
    ✅ filterOperatorsBlacklist is now enabled by default with blacklist = [ "$where" ] (prevents code injections at the database level)
    ✅ Fix error message in case of var not bound in aggregation and MongoRequest.getAggregationVars() method name
    ✅ Fix CORS headers for request OPTIONS /bucket.files/_size
    ✅ Set default MongoDB connections minSize=0
    ✅ Allow specifying ReadConcern, WriteConcern and ReadPreference at the request level

    TommyK100
    @TommyK100
    Hello
    Andrea Di Cesare
    @ujibang
    Hello @TommyK100
    Agent Smith
    @DRN88
    Hi. I'm having difficulties using aggregations. Where exactly do I need to create my aggregations?
    I have a database with my normal documents: myProdDB.Orders. So an aggregate query would look like: myProdDB.Orders.aggregate([])
    Now, where do I create the RESTHeart aggregations? The RESTHeart documentation says: GET /coll/_meta. What's coll, what's _meta? Where are these in relation to myProdDB.Orders?
    https://restheart.org/docs/mongodb-rest/aggregations/
    And later on there is a PUT /coll HTTP/1.1 in the Examples. What's coll here? In which db is it?
    My mounts:
    mongo-mounts:
      - what: myProdDB/Orders
        where: /prod/orders
    Andrea Di Cesare
    @ujibang
    your collection is bound to the URI /prod/orders. So you need to add the aggregation to the collection properties, and you do it with
    PATCH /prod/orders
    
    {
      "aggrs": [
        {
          "stages": [
            { "$match": { "name": { "$var": "n" } } },
            { "$group": { "_id": "$name", "avg_age": { "$avg": "$age" } } }
          ],
          "type": "pipeline",
          "uri": "example-pipeline"
        }]
    }
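    then you can execute it with something like (the value of the n variable is just an example):
    GET /prod/orders/_aggrs/example-pipeline?avars={"n":"John"} HTTP/1.1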