    Van Nghia
    how about bazel clean --expunge and retry one more time :/ ?
    Keqiu Hu
    perhaps it is because i didn't rerun ./configure.sh
    Keqiu Hu
    same error
    (p3) pi@pig:~/tf/io$ TFIO_DATAPATH=bazel-bin python3 -m pytest -s -v tests/test_pcap.py
    ================================================================== test session starts ===================================================================
    platform linux -- Python 3.8.5, pytest-6.2.2, py-1.10.0, pluggy-0.13.1 -- /home/pi/p3/bin/python3
    cachedir: .pytest_cache
    rootdir: /home/pi/tf/io
    collecting ... 2021-03-22 12:34:48.901457: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.11.0
    collected 1 item
    tests/test_pcap.py::test_pcap_input Testing PcapDataset
    ======================================================================== FAILURES ========================================================================
    ____________________________________________________________________ test_pcap_input _____________________________________________________________________
        def test_pcap_input():
            print("Testing PcapDataset")
            pcap_filename = os.path.join(
                os.path.dirname(os.path.abspath(__file__)), "test_pcap", "http.pcap"
            )
            file_url = "file://" + pcap_filename
    >       dataset = tfio.IODataset.from_pcap(file_url, capacity=5).batch(1)
    _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
    tensorflow_io/core/python/ops/io_dataset.py:309: in from_pcap
        return pcap_dataset_ops.PcapIODataset(filename, internal=True, **kwargs)
    tensorflow_io/core/python/ops/pcap_dataset_ops.py:36: in __init__
        resource = core_ops.io_pcap_readable_init(
    tensorflow_io/core/python/ops/__init__.py:88: in __getattr__
        return getattr(self._load(), attrb)
    tensorflow_io/core/python/ops/__init__.py:84: in _load
        self._mod = _load_library(self._library)
    _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
    filename = '/home/pi/tf/io/tensorflow_io/core/python/ops/__init__.py', lib = 'op'
        def _load_library(filename, lib="op"):
            f = inspect.getfile(sys._getframe(1))  # pylint: disable=protected-access
            # Construct filename
            f = os.path.join(os.path.dirname(f), filename)
            filenames = [f]
            # Add datapath to load if en var is set, used for running tests where shared
            # libraries are built in a different path
            datapath = os.environ.get("TFIO_DATAPATH")
            if datapath is not None:
                # Build filename from:
                # `datapath` + `tensorflow_io` + `package_name` + `relpath_to_library`
                rootpath = os.path.dirname(sys.modules["tensorflow_io"].__file__)
                filename = sys.modules[__name__].__file__
                f = os.path.join(
                    datapath,
                    "tensorflow_io",
                    os.path.relpath(os.path.dirname(filename), rootpath),
                    os.path.relpath(f, os.path.dirname(filename)),
                )
                filenames.append(f)
            # Function to load the library, return True if file system library is loaded
            if lib == "op":
                load_fn = tf.load_op_library
            elif lib == "dependency":
                load_fn = lambda f: ctypes.CDLL(f, mode=ctypes.RTLD_GLOBAL)
            elif lib == "fs":
                load_fn = lambda f: tf.experimental.register_filesystem_plugin(f) is None
            else:
                load_fn = lambda f: tf.compat.v1.load_file_system_library(f) is None
            # Try to load all paths for file, fail if none succeed
            errs = []
            for f in filenames:
                try:
                    l = load_fn(f)
                    if l is not None:
                        return l
                except (tf.errors.NotFoundError, OSError) as e:
                    errs.append(str(e))
    >       raise NotImplementedError(
                "unable to open file: "
                + "{}, from paths: {}\ncaused by: {}".format(filename, filename
    let me do the --expunge nuclear version
    Jason Zaman
    is that your machine or a CI worker? and do you have caching enabled? (either remote or --disk_cache=?)
    Keqiu Hu
    this is from my local machine
    no caching
    ci has a similar but different error
    Yong Tang
    @oliverhu The missing symbol issue is likely caused by some API change on tf-nightly. I will take a look.
    Keqiu Hu
    thanks! @yongtang
    yeah, it still breaks with the same error after expunge
    Yong Tang
    I am able to reproduce the issue in my local environment. I tend to believe the issue is that tensorflow/core/platform/cloud/gcs_file_system.cc was not picked up when tf-nightly was packaged, though I may need to take a further look to validate it.
    Yong Tang
    @oliverhu @vnvo2409 Added a PR tensorflow/io#1336 for the build fix.
    Keqiu Hu
    gonna +1..
    Keqiu Hu
    @yongtang it works now. validated
    Keqiu Hu
    @yongtang thoughts on tensorflow/io#1334 ? i think it was a mistake to call it a columnar.py
    Yong Tang
    @oliverhu columnar is meant to put avro into the same category as parquet/feather/csv, as they are essentially column data. The intention is to limit the number of top-level Python modules, as the current number is growing too big.
    Keqiu Hu
    @yongtang avro/csv are not columnar
    we can group them into a row-based category, and i don't think there are many row-based storage formats nowadays
    Keqiu Hu
    i'm having challenges building tf/io again.. if i use tf 2.4.1, I got this ./tensorflow_io/core/plugins/gs/expiring_lru_cache.h:27:10: fatal error: tensorflow/c/env.h: No such file or directory; if i use tf-nightly (2.6.0), it complains tf doesn't have sysconfig property
    ok, it seems we need tf 2.5.0rc
    Vignesh Kothapalli
    @oliverhu yes, you would need tf 2.5.0rc0 for building tfio. In case you encounter build issues, try doing a bazel clean --expunge, followed by ./configure.sh to remove old symbols and try building again.
    Simon Weiß

    Hi, I am looking for a way to build a tf.data.Dataset or tfio.IODataset from Apache Parquet files residing on S3. However, I cannot access the data on S3, e.g. tf.data.Dataset.list_files(s3uri + "/*", shuffle=True) gives the error InvalidArgumentError: Expected 'tf.Tensor(False, shape=(), dtype=bool)' to be true. Summarized data: b'No files matched pattern:....

    On Sagemaker Studio this worked out of the box but I assume they mount s3?
    Is there a good way to achieve this?

    Yong Tang
    @SimonCW Is the issue on Windows or Linux?
    Yong Tang
    @SimonCW Are you able to see the files with tf.io.gfile.listdir(s3uri), or does that also fail to list them?
    Simon Weiß
    No, I'm getting Could not find directory. Looking at the source, is this even supposed to work with S3 without mounting as a Filesystem?
    Yong Tang
    @SimonCW Yes, the s3 file system in tensorflow provides that support, so it is possible to access s3 files through s3://bucket/object without mounting. If it is not working, it could be related to configuration, as the s3 file system needs information about the AWS region and permissions, from either a config file or environment variables (see https://docs.aws.amazon.com/cli/latest/userguide/cli-configure-files.html)
    Yong Tang
    @SimonCW can you check to see if the permissions and region are configured correctly?
    @SimonCW the s3 file system in tensorflow uses the AWS C++ SDK, so the same configuration used for the AWS CLI will work for the s3 file system as well.
    Simon Weiß
    Mh, yes, everything is configured and I would expect it to ask me for MFA. Usually, I work with boto3 and assume_role to get a session that I pass to methods, but it seems there is no way to pass this session.
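One possible workaround for the missing session parameter, sketched under assumptions: the thread says the S3 file system uses the AWS C++ SDK with the same configuration as the AWS CLI, so temporary credentials from an assumed role could be exported through the standard AWS environment variables before TensorFlow touches S3. The helper name `export_temporary_credentials` and the placeholder values are hypothetical; the dictionary keys follow the `Credentials` entry that STS `assume_role` returns. Whether TensorFlow picks up the session token this way is not confirmed in this thread.

```python
import os


def export_temporary_credentials(credentials):
    """Copy STS-style temporary credentials into the standard AWS env vars.

    `credentials` is expected to look like the "Credentials" entry of a
    boto3 `sts.assume_role(...)` response.
    """
    os.environ["AWS_ACCESS_KEY_ID"] = credentials["AccessKeyId"]
    os.environ["AWS_SECRET_ACCESS_KEY"] = credentials["SecretAccessKey"]
    os.environ["AWS_SESSION_TOKEN"] = credentials["SessionToken"]


# Placeholder values for illustration only. Real ones would come from
# something like boto3.client("sts").assume_role(...)["Credentials"].
fake_credentials = {
    "AccessKeyId": "EXAMPLE_KEY_ID",
    "SecretAccessKey": "EXAMPLE_SECRET",
    "SessionToken": "EXAMPLE_SESSION_TOKEN",
}
export_temporary_credentials(fake_credentials)
```

The export would have to happen before the first S3 access in the process, and temporary credentials expire, so a long-running job would need to refresh them.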
    Victor Xie
    Hi, I was wondering whether someone could kindly give me some pointers on a strange problem I hit with Tensorflow accessing an s3 file from EC2. Basically, the test code is as simple as tf.io.read_file('s3://my_private_bucket/some/file'). The line runs successfully on my local machine, but somehow gets stuck (i.e. never returns) when I run it on an EC2 instance. Note that the EC2 instance has been provisioned with an appropriate AWS IAM role to access the s3 url (verified with the aws cli on the box). I tried both TF v2.5 and v2.4. Both have the same problem. Note that TF 2.6 does not have built-in S3 support, so I can't try with TF 2.6.
    Andrey Klochkov
    @xwk , I'd suggest doing something like faulthandler.register(signal.SIGUSR1) and then sending SIGUSR1 to the process to see the stacktrace.
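A minimal sketch of the suggestion above, using only the standard library. The helper name `install_stack_dump_handler` is hypothetical, and the demo sends the signal to its own process and writes to a temporary file only so the result can be inspected; for a hung job the signal would be sent from another shell with `kill -USR1 <pid>`. `faulthandler.register` is not available on Windows.

```python
import faulthandler
import os
import signal
import tempfile
import time


def install_stack_dump_handler(stream):
    """Dump the Python stack of every thread to `stream` on SIGUSR1."""
    faulthandler.register(signal.SIGUSR1, file=stream, all_threads=True)


# Demo: register the handler, signal ourselves, then read the dump back.
dump_file = tempfile.TemporaryFile(mode="w+")
install_stack_dump_handler(dump_file)
os.kill(os.getpid(), signal.SIGUSR1)
time.sleep(0.1)  # give the handler a moment to finish writing

dump_file.seek(0)
stack_dump = dump_file.read()
faulthandler.unregister(signal.SIGUSR1)
print(stack_dump)
```

The dump only covers Python frames, so a hang inside native code shows up as the last Python call that entered it.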
    Victor Xie

    @diggerk thanks for the suggestion. I got the stacktrace below, but it does not seem to be very informative.

    Current thread 0x00007fe3a832d740 (most recent call first):
    File "/home/victor.xie/.cache/pypoetry/virtualenvs/image-inference-pipeline-2_RwuX6D-py3.8/lib/python3.8/site-packages/tensorflow/python/eager/execute.py", line 59 in quick_execute
    File "/home/victor.xie/.cache/pypoetry/virtualenvs/image-inference-pipeline-2_RwuX6D-py3.8/lib/python3.8/site-packages/tensorflow/python/ops/gen_io_ops.py", line 596 in read_file_eager_fallback
    File "/home/victor.xie/.cache/pypoetry/virtualenvs/image-inference-pipeline-2_RwuX6D-py3.8/lib/python3.8/site-packages/tensorflow/python/ops/gen_io_ops.py", line 558 in read_file
    File "test/test_tf_s3_support.py", line 30 in <module>

    Victor Xie
    All right. I found a workaround by reading some comments in tensorflow/tensorflow#38054. Basically, I had to set these two env vars to make it work: AWS_REGION=<your_bucket_region> and S3_VERIFY_SSL=0.
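The workaround above could be applied from Python roughly as follows. This is a sketch: the helper name `configure_s3_env` and the region value are hypothetical, and it assumes the variables need to be in place before TensorFlow first accesses S3, which is why the TensorFlow lines come last and are left commented out.

```python
import os


def configure_s3_env(region, verify_ssl=False):
    """Set the two environment variables named in the workaround."""
    os.environ["AWS_REGION"] = region
    os.environ["S3_VERIFY_SSL"] = "1" if verify_ssl else "0"


configure_s3_env("us-west-2")  # hypothetical region; use the bucket's region

# Only after the environment is prepared:
# import tensorflow as tf
# data = tf.io.read_file("s3://my_private_bucket/some/file")
```

Setting S3_VERIFY_SSL=0 turns off certificate verification, so it may be worth checking whether AWS_REGION alone is enough before disabling it.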
    Kyle Prifogle
    I was attempting to use https://www.tensorflow.org/io/api_docs/python/tfio/experimental/serialization/decode_json to take a string tensor containing JSON and convert it to a series of feature tensors. However I noticed that it doesn't appear to support nested json. Can someone comment, is this something that shouldn't be attempted in tensorflow preprocessing?
    Kyle Prifogle
    Nevermind I can just call decode_json iteratively
    Andrey Klochkov
    @yongtang , how soon would a release compatible with TF 2.7.x be released? Thanks!
    Yong Tang
    @diggerk We are trying to release 0.22 as soon as possible. Currently we are trying to fix the issue in tensorflow/io#1546 . Once the issue is resolved we will release 0.22.0 (compatible with TF 2.7).
    Vaibhav Singh Thapli
    2021-11-12 19:19:30.562116: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library cudart64_110.dll
    Traceback (most recent call last):
    File "Tensorflow\models\research\object_detection\builders\model_builder_tf2_test.py", line 25, in <module>
    from object_detection.builders import model_builder
    File "C:\Users\vaibh\ANPR\anpr\lib\site-packages\object_detection-0.1-py3.8.egg\object_detection\builders\model_builder.py", line 37, in <module>
    from object_detection.meta_architectures import deepmac_meta_arch
    File "C:\Users\vaibh\ANPR\anpr\lib\site-packages\object_detection-0.1-py3.8.egg\object_detection\meta_architectures\deepmac_meta_arch.py", line 28, in <module>
    import tensorflow_io as tfio # pylint:disable=g-import-not-at-top
    File "C:\Users\vaibh\ANPR\anpr\lib\site-packages\tensorflow_io-0.22.0-py3.8-win-amd64.egg\tensorflow_io\__init__.py", line 17, in <module>
    from tensorflow_io.python.api import * # pylint: disable=wildcard-import
    File "C:\Users\vaibh\ANPR\anpr\lib\site-packages\tensorflow_io-0.22.0-py3.8-win-amd64.egg\tensorflow_io\python\api\__init__.py", line 19, in <module>
    from tensorflow_io.python.ops.io_dataset import IODataset
    File "C:\Users\vaibh\ANPR\anpr\lib\site-packages\tensorflow_io-0.22.0-py3.8-win-amd64.egg\tensorflow_io\python\ops\__init__.py", line 96, in <module>
    plugin_ops = _load_library("libtensorflow_io_plugins.so", "fs")
    File "C:\Users\vaibh\ANPR\anpr\lib\site-packages\tensorflow_io-0.22.0-py3.8-win-amd64.egg\tensorflow_io\python\ops\__init__.py", line 64, in _load_library
    l = load_fn(f)
    File "C:\Users\vaibh\ANPR\anpr\lib\site-packages\tensorflow_io-0.22.0-py3.8-win-amd64.egg\tensorflow_io\python\ops\__init__.py", line 56, in <lambda>
    load_fn = lambda f: tf.experimental.register_filesystem_plugin(f) is None
    File "C:\Users\vaibh\ANPR\anpr\lib\site-packages\tensorflow\python\framework\load_library.py", line 218, in register_filesystem_plugin
    tensorflow.python.framework.errors_impl.AlreadyExistsError: File system for s3 already registered
    I got this error when I trained my SSD MobileNet model.
    Hi everyone,
    I have an Apple M1 chip. For a project I need an environment with Python 3.8.3, Tensorflow 2.4.1 and Tensorflow-io 0.17.1. Python and Tensorflow already work. I installed Python with Rosetta 2 and pyenv and compiled Tensorflow from source. So far I haven't found a suitable solution for tensorflow-io.
    'python3 setup.py -q bdist_wheel' didn't work.
    Does anyone have an idea how I can install tensorflow-io?
    ranjeet gupta
    Hi all, does tensorflow 2.8.0 support fsspec-based file systems? I am trying to write tensorboard logs using the keras callback tf.keras.callbacks.TensorBoard("myfsspec:path"). I believe network IO operations get routed to the tensorflow io module, but I am not sure if tensorflow io is capable of supporting new fsspec-based implementations. Any pointers will be much appreciated. thanks!
    Junfan Zhang
    Hi, could anyone help me check this PR? tensorflow/io#1656
    Hi all, Arrow has released version 7.0 with a number of bug fixes and significant performance improvements. How about upgrading arrow to version 7.0?
    Austin Anderson
    @yongtang FYI, the GCP credentials job is broken due to an infrastructure problem; I'm working on fixing it now
    Yong Tang
    Thanks @angerson !
    Veeranjaneyulu Toka
    Hi all, I am able to install the TF OD API, but when I try to run it I get the error below. Has anybody faced this issue, and are there any workarounds? Thanks! File "C:\Users\Veeru\anaconda3\envs\tf41_py38\lib\site-packages\tensorflow\python\framework\load_library.py", line 178, in register_filesystem_plugin
    tensorflow.python.framework.errors_impl.AlreadyExistsError: File system for s3 already registered
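Neither report of this AlreadyExistsError gets an answer in the log. One plausible cause, stated here as an assumption rather than a confirmed diagnosis, is a version mismatch: the thread notes that TF 2.4 and 2.5 have built-in S3 support while tensorflow-io 0.22.0 targets TF 2.7, so both may try to register the s3 scheme. A first diagnostic step could be to print the installed versions. The helper name `installed_versions` is hypothetical.

```python
from importlib import metadata


def installed_versions(packages=("tensorflow", "tensorflow-io")):
    """Return {package name: installed version, or None if not installed}."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


print(installed_versions())
```

If the two versions do not belong together, installing a tensorflow-io release that matches the installed TensorFlow would be the thing to try.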