    Dr. Di Prodi
    @robomotic
    maybe it is only available in some regions?
    I don't think I can add a screenshot, but basically it says "not found" in 4 regions
    well, mystery solved: it seems to be available only in the North American region
    Dr. Di Prodi
    @robomotic
    that worked! How do I allow other people on my dev team to access it via HTTP? I have added a rule to the security group, but it doesn't seem to work (I can of course see it via SSH tunneling)
    jraby
    @jraby
    @robomotic the AMI id is only available in us-east, but the same AMI should be available in all regions (under another AMI id)
    I just double checked and it is available in Tokyo, Seoul, Oregon, N. California
    jraby
    @jraby
    @robomotic , to access MLDB directly, you will have to update /etc/init/mldb.conf to change the port mapping from -p 127.0.0.1:80:80 to -p 80:80
    make sure you have proper security group rules in place if you do so.
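For reference, the change amounts to editing the -p flag on the docker run line inside /etc/init/mldb.conf (a sketch; the rest of that file's contents may differ on your install):

```
# before: MLDB only listens on the loopback interface
-p 127.0.0.1:80:80
# after: MLDB listens on all interfaces -- protect it with security group rules
-p 80:80
```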
    Dr. Di Prodi
    @robomotic
    interesting. Do you have ACLs at the database level, though? I don't see anything relevant in the docs
    so in other words, I can only rely on network authentication to the host, but can't filter at the application level?
    halidm
    @halidm
    Hi, we have an old server with an Intel Xeon 54xx that only supports SSE4.1. Is there any way to compile and run MLDB on it?
    halidm
    @halidm
    Hi again, we ended up getting a server on DigitalOcean with 32 GB RAM and 12 cores, but now whenever we try to import a larger dataset (5 GB) or run a bigger process, MLDB simply exits and all processes die. We are running MLDB through Docker.
    Jeremy Barnes
    @jeremybarnes
    @halidm It is possible to run MLDB on an older server, but you will need to recompile it and modify the Makefile here https://github.com/mldbai/mldb/blob/master/ext/tensorflow.mk#L397 to a) add an sse41 mode, and b) set up the compile flags just below for it
    As for MLDB exiting, I'd say that you have system or docker limits set up in such a way that it runs out of memory.
    Siva Ramamurthy
    @SivaRamamurthy1_twitter
    @jeremybarnes @mailletf I like MLDB; I tested the enterprise version. Can I get the pricing?
    Jeremy Barnes
    @jeremybarnes
    @SivaRamamurthy1_twitter please reach out on info@mldb.ai
    François Maillet
    @mailletf
    @robomotic exactly. We decided to leave ACLs out when building MLDB. We typically put another very thin service/firewall in front to deal with authentication when required
    datascienceit
    @datascienceit
    Has anyone had this:
    Aborting boot: directory mapped to /mldb_data owned by root
    Expected owner uid from MLDB_IDS is: 1003
    If the directory mapped to /mldb_data did not exist before launching MLDB,
    it is automatically created and owned by root. If this is what happened,
    please change the owner of the directory and relaunch.
    Jeremy Barnes
    @jeremybarnes
    @datascienceit In order to avoid problems with files being owned by root, MLDB needs the mldb_data directory to be owned by the current user. You can normally fix it with sudo chown $UID:$UID mldb_data in the current directory
    datascienceit
    @datascienceit
    Thanks
    sniper0110
    @sniper0110
    Hello guys,
    So I have a question to ask. I trained a model using Inception from TensorFlow (to be able to classify food from images); at the end I have a graph file (.pb) and I want to use that graph to predict food from images, but I want to make it into an API. How can I do that exactly? I saw the example from MLDB where they used the Inception model as-is to predict things (they had a URL of the Inception model and pointed to the graph file in the Inception folder), but I want to use my own graph, which is a retrained model, to predict food. How can I turn that into an API that can be used by, let's say, websites to predict images? I am also wondering whether the graph file will affect my API: it is relatively big (~80 MB), so will this matter, since the API needs to use the graph each time it wants to predict food from an image? I am sorry if this question is silly; I am not really a software engineer, I do computer vision stuff and I need to make this API as part of an assignment.
    Any help is greatly appreciated :)
    Thanks
    Jeremy Barnes
    @jeremybarnes
    It's very similar. The easiest way is to follow the same structure as in the example here: https://docs.mldb.ai/doc/#/v1/plugins/tensorflow/doc/TensorflowGraph.md.html. Create a zip file with the model.pb and a text file with the label names, then just change the name of the file to match the path of the zip file, and the other filenames to match those inside it. The graph will only be loaded once, and will then stay in memory every time the API is called. If you follow that example, you can get a prediction out by calling the /v1/query route just like in that example. Note that making an API is something you kind of need to be a developer for, as you need to know about REST, URL encoding, and all that stuff.
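The packaging step above can be sketched in a few lines of Python. The file names here (food_model.zip, model.pb, labels.txt) are placeholders; what matters is that the names inside the archive match whatever your MLDB function configuration refers to:

```python
import zipfile
from pathlib import Path

# Placeholder artifacts -- in practice these are your retrained
# graph and its label list, one label per line.
Path("model.pb").write_bytes(b"\x00")
Path("labels.txt").write_text("pizza\nsushi\n")

# Bundle them the way the TensorflowGraph example expects:
# a zip containing the .pb graph plus the labels text file.
with zipfile.ZipFile("food_model.zip", "w", zipfile.ZIP_DEFLATED) as zf:
    zf.write("model.pb")
    zf.write("labels.txt")

print(zipfile.ZipFile("food_model.zip").namelist())  # ['model.pb', 'labels.txt']
```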
    sniper0110
    @sniper0110
    Thank you so much @jeremybarnes , I will definitely try this out!
    sniper0110
    @sniper0110

    Hello again
    So I went with your suggestion @jeremybarnes , but I am stuck at the part where they create a filename (from a URL) and call the function. When I do that in a notebook on my machine (not live) I get an error saying that Connection has no attributes called 'log' and 'sqlEscape'. I should mention that I started my code with:

    from pymldb import Connection
    mldb = Connection()

    What is the problem exactly?

    sniper0110
    @sniper0110
    So, all the code mentioned in that example is working except for that last part. The only thing I want to do is to call that function with a given URL and have the prediction values displayed as outputs.
    sniper0110
    @sniper0110

    So instead of using the last part of the code I decided to query the model directly using :

    mldb.query("SELECT imageEmbedding({url: '%s'}) as *" % filename)

    Where filename is a URL to a given image. I am having the following error :

    ResourceErrorTraceback (most recent call last)

    <ipython-input-25-b0780527bbb0> in <module>()
    ----> 1 mldb.query("SELECT imageEmbedding({url: '%s'}) as *" % filename)

    /usr/local/lib/python2.7/dist-packages/pymldb/__init__.pyc in query(self, sql, **kwargs)
    81 """
    82 if 'format' not in kwargs or kwargs['format'] == 'dataframe':
    ---> 83 resp = self.get('/v1/query', data={'q': sql, 'format': 'table'}).json()
    84 if len(resp) == 0:
    85 return pd.DataFrame()

    /usr/local/lib/python2.7/dist-packages/pymldb/__init__.pyc in inner(*args, **kwargs)
    21 result = add_repr_html_to_response(fn(*args, **kwargs))
    22 if result.status_code < 200 or result.status_code >= 400:
    ---> 23 raise ResourceError(result)
    24 return result
    25 return inner

    ResourceError: '400 Bad Request' response to 'GET http://localhost/v1/query'

    {
    "httpCode": 400,
    "error": "Cannot read column \"softmax\" with no FROM clause."
    }

    I'm guessing the problem is in my 'imageEmbedding' function, which has the softmax layer as output. Is this because I am not using the Inception model as-is, but rather a retrained version of it?
    Jeremy Barnes
    @jeremybarnes
    Yes, that's likely it. The "variables" in the tensorflow.model query need to correspond with layer names, otherwise MLDB will look in an outer scope for the variable and not find it. You can either dump your graph with Tensorflow to understand the layer names, or use GET /v1/functions/<tfmodel function>/details to have a JSON dump, and look up the name of your layer there.
    sniper0110
    @sniper0110

    So as you suggested @jeremybarnes , I used mldb.get('/v1/functions/imageEmbedding/details') and I got a huge chunk of details, but at the end there was:
    final_result = Softmax[T=DT_FLOAT, _device="/cpu:0"];

    So what I did was use 'final_result' as the output of my 'imageEmbedding' function, and I got something different (which is kind of good!). It was the following error:

    "httpCode": 400,
    "error": "Unable to run model: NodeDef mentions attr 'dct_method' not in Op<name=DecodeJpeg; signature=contents:string -> image:uint8; attr=channels:int,default=0; attr=ratio:int,default=1; attr=fancy_upscaling:bool,default=true; attr=try_recover_truncated:bool,default=false; attr=acceptable_fraction:float,default=1>; NodeDef: DecodeJpeg = DecodeJpegacceptable_fraction=1, channels=3, dct_method=\"\", fancy_upscaling=true, ratio=1, try_recover_truncated=false, _device=\"/job:localhost/replica:0/task:0/cpu:0\"\n\t [[Node: DecodeJpeg = DecodeJpegacceptable_fraction=1, channels=3, dct_method=\"\", fancy_upscaling=true, ratio=1, try_recover_truncated=false, _device=\"/job:localhost/replica:0/task:0/cpu:0\"]]"

    Any ideas on what this means exactly? It's very confusing

    sniper0110
    @sniper0110

    This is my code if anyone wants to take a look at it, maybe you'll have some ideas ;)

    retrained_model_food.ipynb

    Jeremy Barnes
    @jeremybarnes
    That error looks like it comes from a TensorFlow version mismatch between the model you trained and the version that's in MLDB.
    sniper0110
    @sniper0110
    Oh I see, which version is MLDB using exactly?
    Jeremy Barnes
    @jeremybarnes
    I think it's 0.10. We are moving to 1.1, but that work isn't finished yet.
    sniper0110
    @sniper0110
    What do you suggest @jeremybarnes ? I need to make an API before next Thursday, and MLDB seemed like the only relatively easy way to do it
    Jeremy Barnes
    @jeremybarnes
    You could either a) retrain your model using Tensorflow 0.10 (which should then be loadable by MLDB), or b) use a Python REST API framework to expose your model (which should be fine so long as performance isn't an issue)
    We won't have a release with Tensorflow 1.1 support before Thursday
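As a sketch of option (b): a minimal stdlib-only HTTP endpoint might look like the following. The predict() function here is a stub with made-up labels; in a real service it would feed the request body into your retrained TensorFlow graph, and you would likely prefer a proper framework such as Flask for anything beyond a demo:

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

def predict(image_bytes):
    # Stub: a real implementation would decode image_bytes and run the
    # retrained graph, returning label -> probability. The labels and
    # scores below are placeholders.
    return {"pizza": 0.9, "sushi": 0.1}

class PredictHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Read the raw image bytes from the request body.
        length = int(self.headers.get("Content-Length", 0))
        image_bytes = self.rfile.read(length)
        # Run the (stubbed) model and return the scores as JSON.
        body = json.dumps(predict(image_bytes)).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

# To serve for real:
#   HTTPServer(("", 8080), PredictHandler).serve_forever()
# and then POST an image, e.g.:
#   curl -X POST --data-binary @food.jpg http://localhost:8080/
```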
    sniper0110
    @sniper0110
    Ok I see, I think I will first explore the second option; if I get lost I'll go back to option (a). Do you have a link to something similar to what I am trying to do, a tutorial-like kind of thing?
    sniper0110
    @sniper0110
    Hello again, so I was working on both options (in parallel) suggested by @jeremybarnes and it finally paid off. I retrained my model using TensorFlow 0.10 and it's working perfectly :D Thank you again for the help @jeremybarnes ;) I will be posting some theoretical questions soon to understand the working principles behind all of this.
    sniper0110
    @sniper0110

    So I gathered some questions regarding the working principles of MLDB. I hope you can help me answer them :)

    1) What is (DecodeJpeg/contents) node?

    2) The fetcher function downloads an image from a URL and turns it into a blob. What is a blob exactly?

    3) The procedure 'imagenetLabels' reads the labels from a .txt file and puts them into a dataset. What kind of dataset is this?

    4) Is the function 'lookupLabels' only used to assign the probabilities of the predictions to the correct labels?

    5) In the main function 'imageEmbedding' there are two things that I don't quite understand:

    a) What does ('fetch({url})[content] AS "DecodeJpeg/contents"') mean exactly? I mean, we are turning an image into a blob using the fetch function, but what does the second part (AS "DecodeJpeg/contents") do exactly?

    b) In the output we are using a function called 'flatten', what is it doing?

    sniper0110
    @sniper0110
    Also, if someone wants to use this API to make predictions for types of food in a web application: a) how can he do that? b) Should I give him my retrained model or just the code?
    Jeremy Barnes
    @jeremybarnes
    1) The DecodeJpeg/contents node is the name of the variable that comes out of the JPEG decoder that's in the Tensorflow graph.
    2) The Fetcher function returns a blob, which is the full contents (HTTP body) of the URL that is fetched
    3) The dataset type can be interrogated with a GET /v1/datasets/imagenetLabels request to MLDB. The default dataset type is sparse.mutable, which is a very general purpose dataset that can hold any data type, but is not particularly efficient at any operation.
    4) The lookupLabels function, as you suggest, associates labels with predictions based upon the index in the prediction vector.
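Conceptually, pairing a prediction vector with a label list by position looks like this in plain Python (the labels and scores are made up; this illustrates the idea, not MLDB's implementation):

```python
# Hypothetical label list and matching prediction vector: the i-th
# probability belongs to the i-th label.
labels = ["pizza", "sushi", "salad"]
probs = [0.7, 0.2, 0.1]

# Associate each label with its score by index, then pick the best one.
named = dict(zip(labels, probs))
best = max(named, key=named.get)
print(named)  # {'pizza': 0.7, 'sushi': 0.2, 'salad': 0.1}
print(best)   # pizza
```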
    Jeremy Barnes
    @jeremybarnes
    5.a) When you call a Tensorflow graph, you provide inputs using the names in the Tensorflow graph. So in this case, we want to fetch the URL, take just the content (the binary blob with the JPEG encoded data), and feed it in to the DecodeJpeg/contents variable of the Tensorflow graph. In other words, we're taking something which looks like { content: <BLOB>, error: null } and turning it into { 'DecodeJpeg/contents': <BLOB> }. The AS operator is just like SQL: it renames columns from their default name into another name.
    5.b) The output of the Tensorflow function is a 1x1008 matrix. By flattening it, we remove the first dimension and turn it into a simple 1008-element vector.
    If you want to look at how to turn an API into a webapp, you could look at the DeepTeach plugin or the handwriting recognition demo in the MLDB tutorials. You would normally need to give a person the retrained model so that they could load it up as they deploy the API.
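The flatten step in 5.b), expressed in NumPy terms (the 1008 matches the softmax width mentioned above; the zeros stand in for actual model output):

```python
import numpy as np

# The graph emits a 1x1008 matrix (a batch dimension of 1);
# flattening drops that leading dimension.
out = np.zeros((1, 1008))
vec = out.flatten()
print(out.shape, "->", vec.shape)  # (1, 1008) -> (1008,)
```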
    sniper0110
    @sniper0110
    Hello again,
    So I've been asked to deploy my API to a remote machine; it's an Ubuntu machine that I have remote access to. How can I do that exactly? I need to deploy it so that someone else can test it.
    Since I am using Docker to run MLDB, can I somehow deploy the container? Sorry if this sounds stupid, because as I said before I'm completely new to these things.
    sniper0110
    @sniper0110
    Any hints guys? I'm kinda stuck