    derrickk23
    @derrickk23
    Is there a way to dynamically specify memory requirements for the @batch decorator at runtime? My workflow has a fan-out with many parallel jobs, where each batch job uses between 10GB and 1TB of memory, and the amount of memory per job can be determined at runtime based on its input, which is created dynamically in a previous step. Obviously, I would like to avoid having to allocate an EC2 instance with 1TB of memory for each (small) job.
    2 replies
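    A workaround that comes up for this kind of fan-out is to bucket jobs by estimated memory and route each bucket through its own fixed-size @batch step, since per-task dynamic resources aren't expressible in the decorator. The sketch below is plain Python, not a Metaflow API; the bucket sizes and the `assign_bucket` helper are illustrative assumptions.

```python
# Sketch of a bucketing workaround (hypothetical helper, not a Metaflow API):
# estimate each job's memory up front and route it to one of a few
# pre-sized steps, e.g. @batch(memory=16*1024), @batch(memory=64*1024), ...

BUCKETS_GB = [16, 64, 256, 1024]  # example step sizes, smallest to largest

def assign_bucket(estimated_gb, buckets=BUCKETS_GB):
    """Return the smallest bucket that fits the job's estimated memory."""
    for size in buckets:
        if estimated_gb <= size:
            return size
    raise ValueError(f"job needs {estimated_gb} GB, larger than any bucket")
```

    A fan-out step could then group its inputs by `assign_bucket(...)` and fan out separately over the differently sized steps, so a 10GB job never lands on a 1TB instance.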
    Wooyoung Moon
    @wmoon5

    Hi, is there any way to run generic flows that read from a flow parameters JSON/YAML file? So that I can do something like this:

    python my_flow.py run --parameters my_flow_parameters.yml

    5 replies
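    Absent built-in support for a parameters file, a thin wrapper script can expand the file into ordinary CLI arguments and invoke the flow. The sketch below uses stdlib JSON only (YAML would need an extra dependency); `params_to_args` and `run_flow` are hypothetical names, not Metaflow APIs.

```python
import json
import subprocess
import sys

def params_to_args(params):
    """Expand {'alpha': 0.5, 'n': 3} into ['--alpha', '0.5', '--n', '3']."""
    args = []
    for key, value in params.items():
        args.extend([f"--{key}", str(value)])
    return args

def run_flow(flow_script, params_file):
    """Run a flow with parameters loaded from a JSON file."""
    with open(params_file) as f:
        params = json.load(f)
    cmd = [sys.executable, flow_script, "run"] + params_to_args(params)
    subprocess.run(cmd, check=True)
```

    Usage would then be `run_flow("my_flow.py", "my_flow_parameters.json")` instead of passing each parameter on the command line.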
    tkr789
    @tkr789
    Hi, what are the requirements for running a Docker image in @batch? I need to set up a container that has a Java process running in the background in addition to Python, so I need both Python and Java installed and running. In addition, I need some packages that are only pip- and git-installable (so no conda decorator). As far as I've been able to tell, the Java process can only run on one EC2 instance (can't thread), so I need to parallelize across multiple instances rather than multiple CPUs, hence the container.
    4 replies
    derrickk23
    @derrickk23

    Hello, is there a way to get the full S3 URL path of a Metaflow artifact that was stored in a step?

    I looked at Metaflow's DataArtifact class but didn't see an obvious s3 path property.

    5 replies
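    For illustration only: artifacts are stored content-addressed under the configured datastore root, but the exact key layout is an internal detail that may change between versions, so a URL built this way is an assumption, not a supported API. Every name below is hypothetical.

```python
# Hypothetical sketch: build a candidate S3 URL for a content-addressed
# artifact blob. The assumed layout (<root>/<flow>/data/<sha[:2]>/<sha>)
# is an internal detail of the datastore, not a documented contract.

def artifact_s3_url(datastore_root, flow_name, sha):
    """Assemble a candidate S3 URL from the datastore root and an artifact SHA."""
    return f"{datastore_root}/{flow_name}/data/{sha[:2]}/{sha}"
```

    If the layout ever changes, anything built on such a URL breaks silently, which is presumably why no public `s3_path` property exists on DataArtifact.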
    Joseph Bentivegna
    @jbentivegna15
    Hi all, has anyone experienced a situation where the flow runs to completion with no errors, but when trying to view outputs from the flow, certain pandas dataframes are accessible while others are not and throw the error: S3 datastore operation _get_s3_object failed (An error occurred (400) when calling the HeadObject operation: Bad Request). Retrying 7 more times..
    40 replies
    Bahattin Çiniç
    @bahattincinic

    Hi all, I have a question regarding inheritance of a Flow.
    We have an algorithm that has multiple versions (v1.0, v1.1, v1.2, etc.). Mostly these versions are similar to the primary version, so we only want to override some steps. When I checked whether Metaflow supports this, I found this ticket: Netflix/metaflow#245.

    It looks like we have 2 options to implement this;

    Which one do you think is the best choice in the Metaflow ecosystem? Or do you have different ideas about this?

    Thanks.

    3 replies
    Denis Maciel
    @denispmaciel_gitlab
    Hi, is there a way to add a custom step decorator without touching the source code?
    10 replies
    Malay Shah
    @malay95
    Hi, is there a way to tag the flow based on another parameter object in the start step?
    11 replies
    adKatta
    @adKatta
    Hi, I have deployed Metaflow on AWS and am trying to run a job, and I always get this error:
    Batch error: Task crashed due to OutOfMemoryError: Container killed due to memory usage. This could be a transient error. Use @retry to retry.
    How can I increase the memory availability in the CloudFormation template?
    adKatta
    @adKatta
    I have tried various combinations of @batch(memory=3072). I think it is an ECS capacity issue.
    12 replies
    Malay Shah
    @malay95
    Hello all, I am provisioning an ECS cluster for the batch instances, and I can monitor its metrics using memory utilization and memory reservation. When I run the Metaflow step on Batch I get an OutOfMemory error, but when I look at the utilization it's around 7% and the reservation is 86%. Where can I monitor the exact memory usage of the step, or find out the exact issue? When I run the same step locally, the memory usage is around 2GB, and I have set 4GB in the batch decorator. Thanks in advance.
    9 replies
    adKatta
    @adKatta

    Internally we export our flows to Meson (Netflix's workflow orchestrator) and shortly we are going to release a similar integration with AWS Step Functions - Netflix/metaflow#2

    Hi @savingoyal, when can we hope for this feature? I have seen the Google docs for this, and we are excited and keen to try this feature out. Also, is there a nightly build we can access?

    2 replies
    Peter Wilton
    @ammodramus

    Hi all, I am having some trouble with FlowSpec.merge_artifacts. I have a scatter-join sequence of steps that looks like this:

    @step
    def set_up(self):
        self.foreach_tuple = TUPLE_OF_VALUES_TO_PROCESS
        self.next(self.process, foreach='foreach_tuple')
    
    @step
    def process(self):
        self.value = self.input
        # (process self.value, not touching self.foreach_tuple)
        self.next(self.join_step)
    
    @step
    def join_step(self, inputs):
        self.merge_artifacts(inputs)

    Running my code results in a MergeArtifactsException because input.foreach_tuple has a different value in each input in inputs, despite the fact that I haven't touched foreach_tuple since assigning it in set_up.

    The odd thing is that when I look at inputs in FlowSpec.merge_artifacts, the values of input._datastore['foreach_tuple'] are the same for each input in inputs, as expected. However, the SHAs (as accessed via input._datastore.items()) are all different.

    Any idea about what may be causing the SHAs to all differ (triggering the MergeArtifactsException) while the values are all the same?

    Thanks in advance.

    13 replies
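    A toy mimic of the merge semantics described above can make the failure mode concrete: an artifact merges automatically only if every input carries the same version (hash) of it. This is an illustration, not Metaflow internals; the class and function here are stand-ins.

```python
# Toy mimic of merge_artifacts semantics (not Metaflow internals): merging
# succeeds only when every input agrees on an artifact's content hash.

class MergeArtifactsException(Exception):
    pass

def merge_artifacts(inputs, exclude=()):
    """inputs: list of dicts mapping artifact name -> content hash."""
    merged = {}
    for inp in inputs:
        for name, sha in inp.items():
            if name in exclude:
                continue  # explicitly skipped artifacts never conflict
            if name in merged and merged[name] != sha:
                raise MergeArtifactsException(name)
            merged[name] = sha
    return merged
```

    The real FlowSpec.merge_artifacts does accept an exclude list, so `self.merge_artifacts(inputs, exclude=['foreach_tuple'])` should sidestep the exception, though it leaves the underlying question of why the SHAs differ unanswered.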
    Martin Cheong
    @martincheong-myob

    Hi, we're running into an issue where the AWS Batch job will succeed but the flow won't proceed to subsequent steps. It looks like it's hanging waiting for the Batch job to complete even though it already has. Any ideas as to what might be going on here? We're just running the 00-helloworld flow with a @batch decorator for the hello step. The logs are as below:

    2020-07-21 11:36:24.118 Workflow starting (run-id 23):
    2020-07-21 11:36:24.182 [23/start/65 (pid 19754)] Task is starting.
    2020-07-21 11:36:25.281 [23/start/65 (pid 19754)] HelloFlow is starting.
    2020-07-21 11:36:25.609 [23/start/65 (pid 19754)] Task finished successfully.
    2020-07-21 11:36:25.847 [23/hello/66 (pid 19811)] Task is starting.
    2020-07-21 11:36:26.597 [23/hello/66 (pid 19811)] [5c92ef5f-6525-4b76-8e4e-96d75049d087] Task is starting (status SUBMITTED)...
    2020-07-21 11:36:29.747 [23/hello/66 (pid 19811)] [5c92ef5f-6525-4b76-8e4e-96d75049d087] Task is starting (status RUNNABLE)...
    2020-07-21 11:36:59.777 [23/hello/66 (pid 19811)] [5c92ef5f-6525-4b76-8e4e-96d75049d087] Task is starting (status RUNNABLE)...
    2020-07-21 11:37:29.831 [23/hello/66 (pid 19811)] [5c92ef5f-6525-4b76-8e4e-96d75049d087] Task is starting (status RUNNABLE)...
    2020-07-21 11:38:00.027 [23/hello/66 (pid 19811)] [5c92ef5f-6525-4b76-8e4e-96d75049d087] Task is starting (status RUNNABLE)...
    2020-07-21 11:38:04.489 [23/hello/66 (pid 19811)] [5c92ef5f-6525-4b76-8e4e-96d75049d087] Task is starting (status STARTING)...
    2020-07-21 11:38:30.738 [23/hello/66 (pid 19811)] [5c92ef5f-6525-4b76-8e4e-96d75049d087] Task is starting (status RUNNING)...
    2020-07-21 11:53:04.709 1 tasks are running: e.g. ....
    2020-07-21 11:53:04.709 0 tasks are waiting in the queue.
    2020-07-21 11:53:04.709 0 steps are pending: e.g. ....

    Any help would be appreciated. Thanks.

    23 replies
    Zhaozhufeng1
    @Zhaozhufeng1
    I have Metaflow configured with AWS and can successfully run the job without Batch on AWS. But when I try to run a job with Batch, it returns the error below:
    30 replies
    Screen Shot 2020-07-21 at 5.13.53 PM.png
    Savin
    @savingoyal
    Screenshot 2020-07-21 at 2.32.56 PM.png
    2 replies
    simon-lyons
    @simon-lyons
    Screenshot 2020-07-22 at 22.07.37.png

    Hi, I just had a quick question about debugging a metaflow run. It seems like there's some sort of buffering to stdout taking place. If I open a console in debug mode and execute print('foo\nbar'), the system will only print foo to the console. I'll have to print something else to see bar.

    Any idea how I might find a workaround for this issue? It gets triggered if you try to print the contents of a pandas DataFrame, which can be a real pain when you're debugging.

    4 replies
    Juan Daza
    @dazajuandaniel
    Hi,
    Sorry for the early enter press! I'm trying to use KMS for authentication in AWS for writing to S3. Is this supported by Metaflow? I couldn't find any documentation about this.
    5 replies
    Taleb Zeghmi
    @talebzeghmi
    10 replies
    Jack Wells
    @jackwellsxyz
    Screen Shot 2020-07-24 at 16.10.58.png
    Hi, I think the answer is no, but does Metaflow support a feature store like Airbnb's Zipline (closed source)? I'm really racking my brain for a solution that helps turn our database of customer information into something usable for modeling.
    5 replies
    Martin Cheong
    @martincheong-myob
    Hi. I'm trying to dockerise the Metaflow CLI to provide a consistent experience and abstract away the configuration. I've noticed the caching of the conda dependencies is no longer happening and was wondering if I can specify the cache path. I saw METAFLOW_CLIENT_CACHE_PATH in the codebase but providing that as an env var doesn't seem to be doing anything. Any suggestions? Thanks.
    9 replies
    Ji Xu
    @xujiboy
    Hi, may I know how I can access a data artifact without using self.?
    13 replies
    Ville Tuulos
    @tuulos
    wohoo! Step Functions (aka Production Scheduler) integration is out finally! https://netflixtechblog.com/unbundling-data-science-workflows-with-metaflow-and-aws-step-functions-d454780c6280

    also we published a new Administrators Guide to Metaflow, which should be of interest for many people here https://admin-docs.metaflow.org/

    The admin guide was inspired by your questions and comments here, so a huge thanks to everyone! I hope you will find it useful. Please give feedback especially if you notice something missing or misrepresented.

    russellbrooks
    @russellbrooks
    :heart: :clap: thank you all for the outstanding work and can't wait to try it out – really been looking forward to this integration!
    13 replies
    Kolja
    @koljamaier
    Hi @tuulos, thanks for the update, great to hear. Will internal Netflix teams also adopt Step Functions now instead of Meson?
    2 replies
    Kolja
    @koljamaier
    Reading through https://docs.metaflow.org/going-to-production-with-metaflow/scheduling-metaflow-flows it is not clear to me whether it is possible to set up external triggers (e.g. another SFN task finished loading our ETL data, so the Metaflow job/SF can be triggered). Is this possible, or are only time-based schedules supported?
    3 replies
    Hao Yuan
    @hyuan-integrate
    I have read this doc, https://docs.metaflow.org/metaflow/dependencies, but it does not mention private Python package repos. We have an internal PyPI repo hosting our internal Python packages. Is there a way to pull packages from this private Python repo?
    5 replies
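    One workaround that doesn't require any Metaflow feature is to install internal packages from the private index at the start of a step via pip's `--extra-index-url` flag. The helper below is a sketch; the index URL and function name are placeholders.

```python
import subprocess
import sys

# Sketch of a workaround (not a built-in Metaflow feature): install internal
# packages from a private index inside a step. The index URL is a placeholder.

def pip_install_cmd(packages, index_url):
    """Build a pip command that also consults a private package index."""
    return [sys.executable, "-m", "pip", "install",
            "--extra-index-url", index_url] + list(packages)

# e.g. at the top of a @step:
# subprocess.run(
#     pip_install_cmd(["mypkg==1.0"], "https://pypi.internal.example/simple"),
#     check=True)
```

    `--extra-index-url` keeps public PyPI as a fallback; use `--index-url` instead if the private index should be the only source.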
    Ritesh Agrawal
    @ragrawal
    Hi, I am starting to explore Metaflow for my use case. It's a standard machine learning problem: I would like to train a model on some data and use the trained model for making inferences. However, I am not able to find a simple example that shows the two processes (training and inference). I'd appreciate any pointer to a simple example.
    16 replies
    Ritesh Agrawal
    @ragrawal
    Hi, I just got set up with AWS and am running into challenges with some libraries, as they are not available as part of the Conda installation. Is there a way I can specify an environment.yml file instead and include both pip and non-pip packages?
    3 replies
    w76
    @w76

    Hi, I have a question about running a project using Metaflow on AWS. How does Metaflow on Batch handle module imports of other files that are required by the Python script being run with --with batch?
    If, for instance, in the tutorial example
    python 02-statistics/stats.py --with batch run --max-workers 4,
    "stats.py" had a module import such as 'import moduleX', and the functions in moduleX are used by stats.py, how does Batch handle the dependency on moduleX? It would typically be on my local filesystem, and --with batch only runs stats.py on Batch.

    In my project I have a library of modules and helper functions that are called within the script that is run as a Metaflow DAG. I'm unclear how I can make the other modules available when running on Batch when they are not in the file that defines the Metaflow DAG tasks.

    2 replies
    bishax
    @bishax

    Hi, I'm struggling to work out how to use images not on dockerhub, e.g. I'd like to use images from AWS ECR.

    metaflow configure aws hints that if METAFLOW_BATCH_CONTAINER_REGISTRY is not set then "https://hub.docker.com/" is used; however if I explicitly set it to that value then I get AWS Batch error:,CannotPullContainerError: invalid reference format This could be a transient error. Use @retry to retry.

    edit: everything works normally if I don't configure METAFLOW_BATCH_CONTAINER_REGISTRY

    2 replies
    Ritesh Agrawal
    @ragrawal
    When using InputFile, how do you convert it to a DataFrame?
    16 replies
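    Since file-style parameters hand you the file's contents as a string rather than a path, the usual trick is to wrap the contents in an in-memory file object before parsing. The sketch below uses only the stdlib; the helper name is illustrative, and the commented pandas line shows the equivalent DataFrame call.

```python
import csv
import io

# Sketch: a file parameter delivers the file's contents, not a path, so wrap
# them in a file-like object before parsing. The same StringIO trick feeds
# pandas.read_csv directly for a DataFrame.

def rows_from_contents(contents):
    """Parse CSV text (as delivered by a file parameter) into a list of dicts."""
    return list(csv.DictReader(io.StringIO(contents)))

# with pandas, inside a step (hypothetical artifact name):
# df = pandas.read_csv(io.StringIO(self.my_file))
```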
    Ritesh Agrawal
    @ragrawal
    Once I have configured AWS, is there a way to run things locally again? I am trying to debug my pipeline and don't want to connect to AWS. Is there some parameter I can set that forces Metaflow to use the local datastore?
    2 replies
    VahidTehrani
    @VahidTehrani
    @benjaminbluhm and @tuulos Hey! Did you guys figure out how to load Parquet files from S3? Any example? Thanks
    Max Nikolaus Pagel
    @moodgorning

    Hi, has anyone gotten the conda dependency management to work with PyCharm and been able to debug?
    I am running it with a Miniconda interpreter and it seems to start fine, but it never actually starts executing anything in the flow. However, I don't really get any error either. If I run it from my command line it works just fine. Running this on OSX.

    runfile('/Users/maxpagel/Developer/infrastructure/TimescaleDB/Scripts/metaFlowPlayground.py', args=['--environment=conda', 'run'], wdir='/Users/maxpagel/Developer/infrastructure/TimescaleDB/Scripts')
    Metaflow 2.1.1 executing BranchFlow for user:maxpagel
    Validating your flow...
    The graph looks good!
    Running pylint...
    Process finished with exit code 1

    Max Nikolaus Pagel
    @moodgorning
    Hmm, turns out it works if I use the system interpreter, but not if I use a conda environment as the interpreter. It must be something with the conda environment then.
    3 replies
    Javier
    @jagarcia29_gitlab
    Is there a way to dynamically request the CPUs and memory passed into the @batch decorator, e.g. (memory=mem_var, cpu=cpu_var)? I don't see anything like that in the documentation, but this would be super useful for our use case. Otherwise, it sounds like we need to wrap Metaflow in yet another script using Jinja templates or something.
    4 replies
    Greg Hilston
    @GregHilston

    Hey guys, I'm on a fresh Metaflow installation and can successfully run locally. Attempting to run the tutorial 05-helloaws, I was met with

    An error occurred (ClientException) when calling the SubmitJob operation: JobQueue [arn] not found

    Any advice on how this JobQueue could have failed to be stood up when I used the provided CloudFormation template?

    I can even see the CREATE_COMPLETE Status in the Resources tab in the CloudFormation>Stacks page on the AWS UI.

    6 replies
    Bahattin Çiniç
    @bahattincinic

    Hi all, I have a question regarding the Metaflow Parameter.

    I want to add a parameter like this;

    dry_run = Parameter('dry_run', default=False)

    When I add this parameter I need to pass it as --dry_run=False, but I want to use --dry-run=False.
    So when I changed the parameter name to dry-run, I realized that this doesn't work. (dry_run = Parameter('dry-run', default=False))

    Do you think this is a bug? Or is dry-run a reserved parameter name?

    Thanks.

    2 replies
    John Ritsema
    @jritsema
    Hi all. I'm new to Metaflow and I'm interested in possibly leveraging Metaflow's Python format for running on a separate runtime. What would be the best way to parse a Metaflow program? Is there a particular API that I could use to get an AST that I could inspect? Thanks!
    20 replies
    David Neuzerling
    @mdneuzerling
    Hi folks. Thanks for the useR workshop! I'm trying to convert the below pipeline into a Metaflow flow. I'm getting stuck at the joins. I've tried adding various join = TRUE arguments, but Metaflow complains about an incorrect number of steps. I feel like I'm missing something conceptual about how the branching works here. Can anyone help me? I'll put my attempt in a reply because it's quite lengthy.
    8 replies
    Savin
    @savingoyal
    Screen Shot 2020-08-07 at 6.58.52 PM.png
    6 replies
    Kent Johnson
    @kent37
    Hi, just getting started with the R tutorial, and 01-playlist does not work correctly. Assigning a data.frame to self$df loses the class attribute of the data.frame, which causes the subset operations in the pick_movie and bonus_movie steps to fail.
    7 replies