    pranaygp
    @pranaygp:beeperhq.com
    [m]
    Anyone have any good dashboard notebooks for inspecting runs in progress?
    I've just been using the one from the metaflow tutorial, but it's pretty barebones and I'm wondering if anyone came up with something nicer
    6 replies
    serj90
    @serj90
    Hi guys.
    I created a step function for my flow and specified CPU and RAM for the steps using the resources decorator, but after the run I noticed that it ignores values lower than the Metaflow defaults (cpu: 1, memory: 4000) and runs on the default container instead. The situation is different with the batch decorator, which does allow smaller containers. Is anyone else facing this issue, or does anyone have an idea why we can't specify smaller resources when running a flow as a step function?
    7 replies
    jamesbsilva
    @jamesbsilva

    Hey All, Quick question,

    Does anyone know if there are any PyCharm settings or plugins to have your IDE check that all the right dependencies have been loaded at the "@step" level?

    1 reply
    mkjacks5
    @mkjacks5

    I am trying to run metaflow inside a docker container, but I have to run it as a non-root user. When I try to import my config with

    "metaflow configure import metaflow_config/config.txt"

    I get "PermissionError: [Errno 13] Permission denied: '/.metaflowconfig'"

    I have tried changing the permissions with chown and chmod, currently set to

    drwxrwxrwx 2 1000 1000 6 Apr 5 19:01 .metaflowconfig

    But no luck. Can I run metaflow inside a docker container without being a root user?

    6 replies
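A note on the PermissionError above: the leading slash in '/.metaflowconfig' suggests $HOME is empty or unset for the non-root user, so Metaflow falls back to the filesystem root. A sketch of a workaround, assuming Metaflow honors the METAFLOW_HOME environment variable as an override for its config directory (paths below are illustrative):

```shell
# Point Metaflow's config directory at a path the non-root user can write,
# e.g. in the Dockerfile or the container entrypoint. Paths are illustrative.
mkdir -p /tmp/appuser/.metaflowconfig
export METAFLOW_HOME=/tmp/appuser/.metaflowconfig
# metaflow configure import metaflow_config/config.txt   # then re-run the import
echo "$METAFLOW_HOME"
```

Setting $HOME to a writable directory for the container user should have the same effect, since the default config location is ~/.metaflowconfig.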
    pranaygp
    @pranaygp:beeperhq.com
    [m]
    Can you inherit and extend a flow?
    I'd like to reuse the entire flow and just override one of the steps
    4 replies
    when trying to do this, metaflow complains about steps being missing whenever I try to "self.next" to a step that exists in the parent flow class
    grizzledmysticism
    @grizzledmysticism
    Has anyone been able to debug in Pycharm using the conda decorators, and not gotten a "No conda installation found. Install conda first" message? I followed the instructions in the documentation (using env var PATH with path to conda executable) but it still doesn't recognize it
    20 replies
    Corrie Bartelheimer
    @corriebar
    Hey everyone, one question regarding metaflow on step functions. Is it possible to see somewhere when a flow was last updated, i.e. redeployed?
    2 replies
    Sam Petulla
    @petulla1_gitlab
    Is it possible to remove any services from the cloudformation template? The services list is pretty extensive, wondering if any can be removed for cost-savings
    5 replies
    Sam Petulla
    @petulla1_gitlab

    Separately, anyone had this issue?

    I can run aws s3 cp (my file) (remote bucket) and it pulls from the $AWS_PROFILE variable correctly (with an alias set on the aws command to do so). However, running METAFLOW_PROFILE=personal python 05-helloaws/helloaws.py --datastore=s3 run I get a token-expired error, which I think is because it is using the wrong profile. Any tips on how to debug this without just switching profile names? I will need to use a named profile.

    7 replies
    Oleg Avdeev
    @oavdeev

    I've created Netflix/metaflow#473 to stop supporting Python 2.x from the next Metaflow release, mostly because it will allow type annotations from Python 3 and make the codebase more contributor-friendly.

    It seems like a pretty conservative move given that Python 2.7 was EOL'ed more than a year ago. But I'm curious: is anyone here still using Metaflow with Python 2.7 who would be affected by this change?

    mkjacks5
    @mkjacks5

    I am getting internal server errors when I add 'METAFLOW_DEFAULT_METADATA': 'service' to my config file. My config file contains METAFLOW_SERVICE_URL, METAFLOW_SERVICE_INTERNAL_URL and METAFLOW_SERVICE_AUTH_KEY and I have verified they match what is in the cloudformation stack output.

    when I try to run a script locally with

    python inference-flow.py --environment=conda --datastore=s3 run

    I get the following

    Bootstrapping conda environment...(this could take a few minutes)
        Metaflow service error:
        Metadata request (/flows/inference-flow) failed (code 500): {"message": "Internal server error"}

    If I try to run step functions create I get the following:

    Running pylint...
        Pylint is happy!
    Deploying inference_flow to AWS Step Functions...
        Internal error
    
    Traceback (most recent call last):
      File "/usr/local/lib/python3.7/site-packages/metaflow/cli.py", line 930, in main
        start(auto_envvar_prefix='METAFLOW', obj=state)
      File "/usr/local/lib/python3.7/site-packages/click/core.py", line 829, in __call__
        return self.main(*args, **kwargs)
      File "/usr/local/lib/python3.7/site-packages/click/core.py", line 782, in main
        rv = self.invoke(ctx)
      File "/usr/local/lib/python3.7/site-packages/click/core.py", line 1259, in invoke
        return _process_result(sub_ctx.command.invoke(sub_ctx))
      File "/usr/local/lib/python3.7/site-packages/click/core.py", line 1259, in invoke
        return _process_result(sub_ctx.command.invoke(sub_ctx))
      File "/usr/local/lib/python3.7/site-packages/click/core.py", line 1066, in invoke
        return ctx.invoke(self.callback, **ctx.params)
      File "/usr/local/lib/python3.7/site-packages/click/core.py", line 610, in invoke
        return callback(*args, **kwargs)
      File "/usr/local/lib/python3.7/site-packages/click/decorators.py", line 33, in new_func
        return f(get_current_context().obj, *args, **kwargs)
      File "/usr/local/lib/python3.7/site-packages/metaflow/plugins/aws/step_functions/step_functions_cli.py", line 88, in create
        check_metadata_service_version(obj)
      File "/usr/local/lib/python3.7/site-packages/metaflow/plugins/aws/step_functions/step_functions_cli.py", line 120, in check_metadata_service_version
        version = metadata.version()
      File "/usr/local/lib/python3.7/site-packages/metaflow/plugins/metadata/service.py", line 41, in version
        return self._version(self._monitor)
      File "/usr/local/lib/python3.7/site-packages/metaflow/plugins/metadata/service.py", line 288, in _version
        (path, resp.status_code, resp.text),
    NameError: name 'path' is not defined

    list(Metaflow()) gives the following

    Traceback (most recent call last):
      File "cleanup.py", line 14, in <module>
        print('list(Metaflow())', list(Metaflow()))
      File "/usr/local/lib/python3.7/site-packages/metaflow/client/core.py", line 245, in __iter__
        all_flows = self.metadata.get_object('root', 'flow')
      File "/usr/local/lib/python3.7/site-packages/metaflow/metadata/metadata.py", line 357, in get_object
        return cls._get_object_internal(obj_type, type_order, sub_type, sub_order, filters, *args)
      File "/usr/local/lib/python3.7/site-packages/metaflow/plugins/metadata/service.py", line 116, in _get_object_internal
        return MetadataProvider._apply_filter(cls._request(None, url), filters)
      File "/usr/local/lib/python3.7/site-packages/metaflow/plugins/metadata/service.py", line 247, in _request
        resp.text)
    metaflow.plugins.metadata.service.ServiceException: Metadata request (/flows) failed (code 500): {"message": "Internal server error"}
    script returned exit code 1

    Any ideas what might be happening?

    16 replies
    Mike Bentley Mills
    @mikejmills
    Hi, I'm wondering if there is a way of seeing the code packages that were saved? Specifically, can you look at them using the Metaflow Run/Task modules?
    8 replies
    Edvin Močibob
    @emocibob
    Hi, is there a way to inspect a past run and see the parameters (metaflow.Parameter) it was started with?
    2 replies
    pranaygp
    @pranaygp:beeperhq.com
    [m]
    This has probably been asked before, but can you do a conditional self.next?
    3 replies
    if x:
      self.next(...)
    else:
      self.next(...)
    sorry, that was a rhetorical question. I guess I meant: what's the recommended way to do something like that? cc @tuulos
    6 replies
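For context on the thread above: Metaflow's DAG must be declared statically, so self.next cannot be conditional. A sketch of the usual workaround, branching to both steps unconditionally and gating the real work inside each step. The flow fragment in the comments and the step names (fork, train_a, train_b) are illustrative, not from the thread; the gating condition itself is plain Python:

```python
def choose_branch(x):
    """The condition a conditional self.next() would have expressed."""
    return "train_a" if x else "train_b"

# Inside a flow (illustrative step names; @step and self.next are the real
# Metaflow primitives):
#
#     @step
#     def fork(self):
#         self.branch = choose_branch(self.x)
#         self.next(self.train_a, self.train_b)  # static branch to both
#
#     @step
#     def train_a(self):
#         if self.branch == "train_a":
#             ...  # real work; otherwise this step is a cheap no-op
#         self.next(self.join)
```

As the follow-up messages note, the downside is that the no-op branch still provisions compute when steps run on Batch.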
    pranaygp
    @pranaygp:beeperhq.com
    [m]
    Ah, yeah #3 is my best case. Would be good to have native support for it
    1 reply
    The problem with 1 and 2 is that they take a lot of time when the steps are on batch
    Since we need to provision a GPU instance
    Just for noop
    Kamil Bobrowski
    @kbobrowski

    Hi, a question about isolation of steps: I noticed that these two steps will be executed in the same conda environment:

    from metaflow import FlowSpec, step, conda
    
    class IsolationTest(FlowSpec):
    
        @conda(python="3.8.5")
        @step
        def start(self):
            import sys
            print(f"start executable: {sys.executable}")
            self.next(self.end)
    
        @conda(python="3.8.5")
        @step
        def end(self):
            import sys
            print(f"end executable: {sys.executable}")
    
    
    if __name__ == "__main__":
        IsolationTest()

    They will run in separate environments only if the Python version is different. Is there a way to ensure that separate environments are created? (Context: I need to install packages from pip, which results in heavy installing/rolling back of packages every time the flow is executed.)

    5 replies
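A simplified model of why the two steps above share one environment (an illustration, not Metaflow's actual code): cached conda environments are effectively keyed on the resolved dependency specification, not on the step name, so identical @conda specs map to the same environment. A workaround people use is to make each step's spec differ, e.g. by pinning a distinct (even unused) library per step:

```python
import hashlib
import json

def env_key(python, libraries=None):
    """Toy stand-in for environment caching: hash the dependency spec."""
    spec = {"python": python, "libraries": libraries or {}}
    return hashlib.sha1(json.dumps(spec, sort_keys=True).encode()).hexdigest()

# Identical specs -> identical key -> one shared environment:
shared = env_key("3.8.5") == env_key("3.8.5")

# Making the specs differ (e.g. a per-step library pin) forces distinct envs:
distinct = env_key("3.8.5", {"click": "7.1.2"}) != env_key("3.8.5")
print(shared, distinct)
```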
    Kelly Davis
    @kldavis4
    Question about step function IAM permissions. We have a step that does a next() with a foreach param, and when it runs as a step function we get an error that the AWS Batch execution role doesn't have permission to call PutItem on the step functions Dynamo table. This happens in step_functions_decorator.py in task_finished(), so it makes sense to me that the Batch execution role would need the Dynamo permissions, but when I look at the CloudFormation template in metaflow-tools, it doesn't seem to grant those permissions. Can someone confirm that the Batch execution role does need permissions on the step functions DynamoDB table?
    2 replies
    daavidstein
    @daavidstein

    Re (2) of issue #149:

    We are considering porting our data processing/ML pipelines to metaflow, but one thing that is holding us back is the lack of support for integration testing. For instance, we currently use kedro for our data pipelines. Kedro provides the ability to call and run a kedro pipeline from another python script (directly, without using subprocess.run) and additionally provides the ability to override the default datasets used in that pipeline at runtime. This is important, because some of our datasets are very large, so we naturally want to use subsetted versions of these datasets. In fact, some of the datasets we inject into the pipeline for the integration test are generated with Hypothesis which ensures that our pipelines are robust to unanticipated variations in the data.

    Furthermore, although it's not ideal for an integration test, there are some expensive functions, or functions that rely on a network connection, that we want to patch using unittest.mock.patch. This doesn't seem to be possible when running a Metaflow pipeline with subprocess.run.

    One solution could be to just to have a test flow inherit from the flow to be tested, and override the artifacts that way, ie:

    class TestPlaylistFlow(PlayListFlow):
        movie_data = test_data

    But as far as I understand it, the child flow will not inherit the step functions, which would force us to import them and manually specify them in the test flow.

    The only other solution I can think of at present is to define a boolean parameter test in the original flow and based on the value of that parameter assign different artifacts to the instance variables as necessary. Is there another option that can be implemented with the current version of metaflow==2.2.9?

    11 replies
    russellbrooks
    @russellbrooks

    hey guys, a teammate ran into what I believe is a bug in how Parameters are handled in Step Functions deployments. If there is a Parameter that defaults to None, like

    Parameter(name="test_param", type=int, default=None)

    it'll result in the following error when deploying with step-functions create.

    Flow failed:
        The value of parameter test_param is ambiguous. It does not have a default and it is not required.

    The same flow/parameter will run successfully locally or when submitted to batch without SFNs.

    1 reply
    pranaygp
    @pranaygp:beeperhq.com
    [m]
    is there a good way to "tag" runs?
    the auto-increment number is fine, but ideally I'd like to use a name when kicking off jobs from the command line, so it's easier to keep track of results from multiple runs and compare them
    2 replies
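For what it's worth, Metaflow's run command accepts a repeatable --tag option, and tags are exposed on runs through the client API, so tagged runs can be filtered afterwards. A sketch (the CLI line in the comment and the flow/run names are illustrative; the helper below is plain Python over objects exposing a tags set, mirroring the client's Run.tags):

```python
from collections import namedtuple

# e.g.  python myflow.py run --tag experiment-a --tag model:xgboost

def filter_by_tag(runs, tag):
    """Keep only runs carrying the given tag (runs expose a .tags set)."""
    return [r for r in runs if tag in r.tags]

# Stand-in for client Run objects, for illustration:
FakeRun = namedtuple("FakeRun", ["id", "tags"])
runs = [
    FakeRun("MyFlow/7", {"experiment-a"}),
    FakeRun("MyFlow/8", {"experiment-b"}),
]
print([r.id for r in filter_by_tag(runs, "experiment-a")])
```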
    vinod-rachala
    @vinod-rachala
    I am using Metaflow on ECR, and while executing the code through the step function I am getting this error; please let me know if there are any solutions. Error: Metaflow 2.2.8 executing preprocessflow unknown user: Metaflow could not determine your username based on environment variables ($USERNAME etc.)
    1 reply
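The "unknown user" error above means none of the username environment variables are set inside the container. A sketch of the failure mode and fix, assuming Metaflow consults variables like METAFLOW_USER / USERNAME / USER (the exact lookup order here is illustrative, mirroring the error message's "$USERNAME etc."):

```python
import os

def resolve_user(env):
    """Illustrative username lookup over environment variables."""
    for var in ("METAFLOW_USER", "USERNAME", "USER"):
        if env.get(var):
            return env[var]
    return None  # -> the "unknown user" error

# Fix: export a username in the container / Step Functions environment,
# e.g. in the entrypoint (the name below is illustrative):
os.environ.setdefault("METAFLOW_USER", "svc-preprocess")
print(resolve_user(os.environ))
```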
    Ayotomiwa Salau
    @AyonzOnTop
    Hello, I noticed that any time I join two DAG steps in a class, the flow loses its self.<attributes>; I can no longer access those attributes in the steps after the join.
    3 replies
    Kamil Bobrowski
    @kbobrowski
    Hi, the way metaflow executes locally (creating unique conda environments only for each set of unique @conda(...) decorators) makes things quite difficult when relying on pip-installed packages: if a step which requires a pip package is executed in parallel within a foreach, pip will fail because multiple threads try to install packages into the same environment. I'm thinking about possible solutions: a switch to force creation of a unique conda environment for each step, an option to run each step in a separate docker container, or proper support for pip through a @pip decorator. What do you think? I'd be happy to contribute
    7 replies
    Ayotomiwa Salau
    @AyonzOnTop
    I tried running a logistic regression on one branch of my DAG and a random forest regression on another branch. The random forest completed its task quickly, while the logistic regression kept on running for almost 3 hours. The data is about 500k rows and 1024 cols, which is not that much. Why is it taking so long?
    1 reply
    Richard Decal
    @crypdick
    @savingoyal @tuulos Hopefully this clarifies (sorry, couldn't attach to our thread) metaflow_testing.txt
    Kelly Davis
    @kldavis4
    Is there any way to "checkpoint" mid-step and allow resuming a run from that checkpoint?
    The use case is that we have two activities where the second depends on the first, but it doesn't necessarily make sense to split them into separate steps, and the first activity takes a while to complete. During development, say we are modifying the second activity: every time we run another test of our changes, we have to execute the first, long activity again. To speed things up during dev, we might split them into separate steps to allow resuming at the right place, but then we might combine them into a single step for higher efficiency in production
    3 replies
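On the question above: resume works at step granularity, so there is no built-in mid-step checkpoint. A common development workaround is to cache the expensive first activity's result yourself and reuse it across runs. A minimal local-disk sketch; the cache path and compute function are illustrative (in practice the cache could live in S3 next to the datastore):

```python
import json
import os
import tempfile

# Hypothetical cache location and expensive computation.
CACHE = os.path.join(tempfile.gettempdir(), "activity_one_cache.json")

def expensive_activity():
    # Stand-in for the slow first activity.
    return {"rows": 500_000}

def activity_one_cached():
    """Return the first activity's result, recomputing only on a cache miss."""
    if os.path.exists(CACHE):
        with open(CACHE) as f:
            return json.load(f)  # skip the slow recompute during dev
    result = expensive_activity()
    with open(CACHE, "w") as f:
        json.dump(result, f)
    return result

first = activity_one_cached()   # computes and writes the cache
second = activity_one_cached()  # served from the cache
print(first == second)
```

The same step can then call activity_one_cached() before the second activity, so iterating on the second activity no longer pays for the first.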
    David Vega Fontelos
    @repocho
    Hello!! I'm using a Metaflow metadata service + an on-premise object storage (S3) for the datastore. But I saw that the object storage is using more than 2TB, storing very old flows.
    Is there a procedure for cleaning old flows from the datastore?
    Thanks!!
    3 replies
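As far as I know there is no built-in garbage collector for the datastore. One pragmatic option, if the S3-compatible store supports lifecycle rules, is to expire old objects under the datastore prefix. A sketch only; the bucket name, prefix, and retention period are illustrative, and note that expiring datastore objects breaks inspecting or resuming those old runs:

```shell
# Expire datastore objects older than 365 days (values are illustrative).
aws s3api put-bucket-lifecycle-configuration \
  --bucket my-metaflow-datastore \
  --lifecycle-configuration '{
    "Rules": [{
      "ID": "expire-old-flow-data",
      "Filter": {"Prefix": "metaflow/"},
      "Status": "Enabled",
      "Expiration": {"Days": 365}
    }]
  }'
```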
    David Patschke
    @dpatschke

    @tuulos @savingoyal I'm seeing some strange behavior when running Flows via AWS Batch that is not consistently reproducible. The error I'm getting says:

    Task is starting.
    list index out of range
    Task failed.

    There is no hash present within square brackets in the Metaflow logging, so something tells me this might be an undetected AWS Batch issue (but I have no idea). It's like the task never even got a chance to start and hard-failed from there.
    What really stinks about this error is that the flow is run with --with retry, but a retry is not even attempted. This is the 2nd time this error has presented itself to me within a week. Both times I have been able to re-run the flow immediately afterwards and it completes successfully. FWIW, this last time it happened in the middle of a foreach fan-out and all the other fanned-out processes kept running. The flow failed at the end step because, I'm guessing, not all known tasks completed successfully and a check is made (good work on your end for having that).
    Are either of you aware of what may be causing this error and/or have any potential suggestions?
    Thanks!

    9 replies
    daavidstein
    @daavidstein
    We want to use a separate namespace for executing flows in a test environment. In particular, we want runs executed in the test namespace to not be visible in user namespaces, or at least to be clearly differentiated and easily filtered out. It seems that, the way namespaces are currently implemented, if we run a flow with python my_flow.py run --namespace test --tag test, the last_run of the flow in the user namespace is updated with the run from the test namespace. Furthermore, it doesn't seem to be possible to run flows as a different user using the --namespace flag:
    python my_flow.py run --namespace user:test
    ...
    2021-04-29 18:18:04.147 [1619709482612481/end/5 (pid 67190)] Task finished successfully.
    
    namespace("user:daavid")
    Metaflow().flows[0].latest_run
    
    >> Run('PlayListFlow/1619709482612481')
    
    Metaflow().flows[0].latest_run.tags
    
    >> frozenset({'date:2021-04-29',
               'metaflow_version:2.2.10',
               'python_version:3.8.5',
               'runtime:dev',
               'user:daavid'})
    3 replies
    ailishbyrne
    @ailishbyrne
    hi all. we've implemented a custom step decorator to instrument our tasks and are super happy with the results. thank you! the one thing i have been unsuccessful in doing is accessing the input to a foreach step in task_pre_step. if i even try to do anything with the input on the flow parameter to that method, whether calling flow.input or using getattr(flow, 'input'), the input is not only unavailable there but also no longer available when the step is executing. we are using a contextual json logger, and i am hoping to add that context to the logger centrally, rather than requiring the context be added in the foreach steps themselves. here is an example log statement: {"text": "task succeeded in 0.2 seconds", "log_level": "INFO", "filename": "decorators.py", "lineno": "46", "method_name": "task_post_step", "flow_name": "TestFlow", "run_id": "1451", "step_name": "forme", "task_id": "11051", "task_input": "1"}
    19 replies
    Kyle Smith
    @smith-kyle

    Hello, I've recently installed metaflow on AWS using a manual deployment. I'm running the helloaws.py flow and it's stuck in the RUNNING state. Looking at the log stream I see the code is downloaded, but again it's just stuck at Task is starting.

    Can someone please help me diagnose the problem?

    4 replies
    Ayotomiwa Salau
    @AyonzOnTop
    Hello guys, I just published a blog on building data science projects with Metaflow, using the MNIST dataset as a use case. Check it out; feedback and acknowledgements are welcome.
    https://ayotomiwasalau.medium.com/starting-your-data-science-project-with-metaflow-the-mnist-use-case-44e3b3ad6ec3
    5 replies
    Mike Bentley Mills
    @mikejmills
    Hi, I'm running metaflow but NOT using AWS Batch (yet). I've found that run.code & task.code are all None. Is there a way to tell Metaflow to always save the code?
    2 replies
    Christopher Wong
    @christopher-wong
    I’m probably missing something obvious in the docs, but how do you delete a scheduled metaflow step function without manually deleting all the components from the console?
    3 replies
    ailishbyrne
    @ailishbyrne
    Screen Shot 2021-05-05 at 7.21.31 PM.png
    Ahmad Houri
    @ahmad_hori_twitter
    I have a flow which is scheduled to run many times every day, but each time is for a different client. How can I tag each flow triggered via step functions with a client-specific tag?
    1 reply
    Kelly Davis
    @kldavis4
    HI all, we just published an article on our usage of Metaflow on CNN's Digital Intelligence team (w/ some highlight on the terraform PR we've been working on): https://medium.com/cnn-digital/accelerating-ml-within-cnn-983f6b7bd2eb
    5 replies
    baothienpp
    @baothienpp
    Hi everyone, is there any way to assign a namespace inside the flow code, rather than from the CLI?
    17 replies
    Samuel Than
    @samuelthan
    ML Platform Architecture.png
    17 replies

    Hi all, looking for some ideas from the community here. I would like you all to “roast” my potential ML platform architecture.

    The purpose is to provide a centralized place to handle the ML training pipelines. This is specific to using Metaflow.

    1. Users execute their Metaflow “jobs”
    2. Metaflow runs the training pipeline jobs
    3. Outputs a model packaged as a docker image, stored in AWS ECR
    4. Gets duplicated/pulled into users of different teams/business accounts.

    Has anyone taken this approach before? Or am I doing it wrong?

    Kha
    @nlhkh
    Hi guys. I have some ML flows that have steps in both Python and R (data fetching in Python, model training in R), due to previous development. I read that Metaflow supports both Python and R. I would like to know if I can mix a Python step and an R step within a flow.
    2 replies