Joesan
@joesan
@kamilkloch You had a similar issue in the past. Would be interested to know how you solved it!
Joesan
@joesan
Looks like I have to use "me.shadaj" %% "scalapy-numpy" % "0.1.0+6-14ca0424",
but then I hit two other errors:
[error] Make sure that type Writer is in your classpath and check for conflicting dependencies with -Ylog-classpath.
[error] A full rebuild may help if 'NumPy.class' was compiled against an incompatible version of me.shadaj.scalapy.py.
[error] np.asarray(preProcessedImagesWithLabels)
[error] /home/joesan/Projects/Private/ml-projects/object-classifier/src/main/scala/com/bigelectrons/animalclassifier/ImageLoader.scala:22:17: could not find implicit value for parameter writer: me.shadaj.scalapy.py.Writer[org.bytedeco.opencv.opencv_core.Mat]
[error] np.asarray(preProcessedImagesWithLabels)
[error] ^
shadaj
@shadaj:matrix.org
scalapy-numpy is unfortunately quite a bit out of date, but there is active work that will bring back static typing for NumPy and TensorFlow soon! In the meantime, you'll have to use the dynamically-typed APIs or define your own facades
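For readers landing here, a minimal sketch of the dynamically-typed route described above, based on standard ScalaPy usage (the module and values are illustrative, and a local CPython installation is assumed):

```scala
import me.shadaj.scalapy.py
import me.shadaj.scalapy.py.SeqConverters

object DynamicNumPy {
  def main(args: Array[String]): Unit = {
    // py.module returns a dynamically-typed handle; attribute
    // access and calls are resolved against Python at runtime
    val np = py.module("numpy")

    // Scala collections are converted explicitly; toPythonCopy
    // produces a native Python list that np.asarray accepts
    val arr = np.asarray(Seq(1.0, 2.0, 3.0).toPythonCopy)
    println(arr.mean())
  }
}
```

There is no compile-time checking on this path: a typo in an attribute name surfaces as a runtime PythonException rather than a compile error, which is the trade-off against the typed facades.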
Joesan
@joesan
@shadaj:matrix.org Thanks for the reply. Could you point me to some examples?
bogorman
@bogorman_twitter

Hi. I know the facadeGen is alpha, but I just tried to run it and I get this error. Any suggestion on how to get it running? I just want to generate some facades for some modules and play with it. I am only trying to generate facades for the "builtins" module at the moment while testing. I can see in your scalacon-live folder that the facadeGen did work at that point in time. Thanks.

[info] running (fork) me.shadaj.scalapy.facadegen.Main
[error] Exception in thread "main" me.shadaj.scalapy.py.PythonException: <class 'TypeError'> list object expected; got SequenceProxy
[error] at me.shadaj.scalapy.interpreter.CPythonInterpreter$.$anonfun$throwErrorIfOccured$2(CPythonInterpreter.scala:328)
[error] at me.shadaj.scalapy.interpreter.Platform$.Zone(Platform.scala:10)
[error] at me.shadaj.scalapy.interpreter.CPythonInterpreter$.$anonfun$throwErrorIfOccured$1(CPythonInterpreter.scala:314)
[error] at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.scala:18)

bogorman
@bogorman_twitter
Changed the use of toPythonProxy to toPythonCopy and it seems to run.
Lorenzo Gabriele
@lolgab
Hi @shadaj:matrix.org,
I was playing with ScalaPy in Scala Native and whatever I do I leak memory (unless I use py.local).
I wanted to ask you why automatic freeing of memory can't be done in Scala Native.
Is it because Scala Native doesn't support finalize? It has been deprecated since Java 9, but maybe people continue to use it anyway?
If it is because you need to call malloc, there is a "better" (hackish) way to deal with that and have the memory GC-managed:
You can allocate an Array[Byte] of a certain size. Then arr.at(0) gives you a pointer to the start of the array's data section. When the original array is GC-collected, the memory is freed with it, so you don't have to use malloc.
import scala.scalanative.runtime.ByteArray

val arr = ByteArray.alloc(size) // GC-managed allocation
arr.at(0) // is your pointer
arr // keep a reference to the `ByteArray` object so it isn't collected ahead of time
ScalaPy could be the thing that allows us in Scala Native to have a proper ecosystem of utility libraries to draw from, while still using pure Scala for the main programs. It would be great if it worked out of the box without memory leaks, so you can call Python libraries and forget about it.
I'm particularly interested in the cloud native ecosystem, which is really thorough in Python!
Eric K Richardson
@ekrich
@lolgab Could you explain a bit more your take on ScalaPy and how this fits into Scala Native and Cloud Native Foundation tools?
Lorenzo Gabriele
@lolgab
@ekrich This answer from Odersky + the 2 replies explain very well why Scala Native would play nicely with Python: https://contributors.scala-lang.org/t/scala-native-next-steps/4216/75
About Cloud Native tools, nothing special about Python there; it is just that Cloud Native is the area I'm personally interested in, and official client libraries for famous clouds like AWS exist only for a few languages: C++, Go, Java, JavaScript, .NET, Node.js, PHP, Python and Ruby.
Scala Native can use C++ libraries if you write C++ glue code with extern "C" functions, but it is a bit of work, and if the Python library is fast enough and not what makes you slow, I don't see the reason to go that route.
Other dynamic languages like PHP, Ruby, etc. could probably be integrated with a ScalaPy-like library, but I think the Python ecosystem is bigger. And we already have ScalaPy!
Eric K Richardson
@ekrich
@lolgab I see, thanks, that makes total sense.
shadaj
@shadaj:matrix.org
@lolgab: the reason for needing py.local is finalizers; since we expose Python objects through nice Scala wrappers, we depend on the garbage collector telling us when the Scala wrappers are no longer needed, so that we can decrement the reference count for the Python value
@joesan: take a look at https://scalapy.dev/docs/static-types for some examples

@bogorman_twitter: this is a known limitation, since the proxy collections aren't exactly Python lists (but instead a sequence type), whereas copy collections are directly created as native lists

if the API you're using requires a list, copies are unfortunately the only way to go
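As a sketch of the distinction described above, using the converter names from ScalaPy's SeqConverters (behavior summarized from this thread; requires a local CPython installation):

```scala
import me.shadaj.scalapy.py
import me.shadaj.scalapy.py.SeqConverters

val values = Seq(1, 2, 3)

// Proxy: wraps the Scala Seq as a Python sequence type without
// copying; cheap, but rejected by APIs that check for a real `list`
val proxied = values.toPythonProxy

// Copy: eagerly materializes a native Python list, so list-typed
// APIs (like the facadeGen case above) accept it
val copied = values.toPythonCopy
```

The proxy keeps a live view onto the Scala collection, while the copy is a snapshot, so the copy also costs memory proportional to the collection size.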

Lorenzo Gabriele
@lolgab
@shadaj:matrix.org Any plan to migrate away from finalizers? They are flagged for removal in Java 17 and will never be supported in Scala Native.
Something like AutoCloseable would work, but at a developer-experience cost.
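The AutoCloseable alternative mentioned here might look roughly like the sketch below; PyHandle and decref are hypothetical names for illustration, not ScalaPy API:

```scala
import scala.util.Using

// Hypothetical explicit-release wrapper around a Python reference,
// replacing finalizer-driven cleanup with scope-based cleanup.
final class PyHandle(ref: Long) extends AutoCloseable {
  def close(): Unit = decref(ref)
  private def decref(r: Long): Unit = {
    // would decrement the CPython reference count here
  }
}

// Usage: deterministic cleanup, at the cost of explicit scoping
Using.resource(new PyHandle(0L)) { handle =>
  // ... use the Python value here ...
}
```

This is the developer-experience cost in question: every Python value would need an explicit scope instead of being cleaned up transparently by the GC.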
shadaj
@shadaj:matrix.org
@lolgab: I think the solution would be to switch to something like phantom references, with a separate thread responsible for clearing out freed Python values. But one way or another we need some support from Scala Native to have the GC notify ScalaPy of such changes
Dhruva Bharadwaj
@dhruva-clari
@shadaj:matrix.org I have Python 3.7 installed on my machine. I'm not using sbt but Gradle for the build. Our code is on Scala 2.11. I'm getting the following stack trace despite passing the JVM argument jna.library.path with Python's lib directory as its value:
java.lang.IllegalArgumentException: Can't determine class with native methods from the current context (class me.shadaj.scalapy.interpreter.CPythonAPIInterface$$anonfun$1)
    at com.sun.jna.Native.findDirectMappedClass(Native.java:1473)
    at com.sun.jna.Native.register(Native.java:1443)
    at me.shadaj.scalapy.interpreter.CPythonAPIInterface$$anonfun$1.apply(CPythonAPI.scala:20)
    at me.shadaj.scalapy.interpreter.CPythonAPIInterface$$anonfun$1.apply(CPythonAPI.scala:19)
    at scala.collection.immutable.Stream.map(Stream.scala:418)
    at me.shadaj.scalapy.interpreter.CPythonAPIInterface.<init>(CPythonAPI.scala:19)
    at me.shadaj.scalapy.interpreter.CPythonAPI$.<init>(CPythonAPI.scala:111)
    at me.shadaj.scalapy.interpreter.CPythonAPI$.<clinit>(CPythonAPI.scala)
    at me.shadaj.scalapy.interpreter.CPythonInterpreter$.<init>(CPythonInterpreter.scala:9)
    at me.shadaj.scalapy.interpreter.CPythonInterpreter$.<clinit>(CPythonInterpreter.scala)
    at me.shadaj.scalapy.py.package$.<init>(package.scala:15)
    at me.shadaj.scalapy.py.package$.<clinit>(package.scala)
shadaj
@shadaj:matrix.org
@dhruva-clari: ah, unfortunately ScalaPy only supports Scala 2.12 and up, so you'll likely need to upgrade to use ScalaPy
it's interesting that KaTeX showed up in the pasted trace somehow, I wonder why (or if) JNA is using that under the hood
Richard Lin
@ducky64

Does ScalaPy support Windows + Python 3.10? The manual sbt config seems to invoke python3-config which doesn't appear to be a thing on Windows, and python-native-libs seems to request sys.abiflags which gives an error.

Also, since it looks like configs are set during compile time instead of runtime, is it possible to distribute programs using ScalaPy as a JAR (including cross-platform support) or must users compile from source?

shadaj
@shadaj:matrix.org

@ducky64: I haven't tested ScalaPy myself with Windows, but in theory everything should just work (as in there is nothing hardcoded for *nix). You might want to try using python-native-libs as described in https://scalapy.dev/docs/, or otherwise you may need to hardcode the dependency.

You can set the system property scalapy.python.library to point to a specific Python dependency at runtime (as long as you do this before calling any ScalaPy APIs). We don't have built-in support for automatic configuration, but it may be interesting to use python-native-libs at runtime to power cross-platform discovery.

Python 3.10 isn't officially supported yet, and you'll need to set the scalapy.python.library property or the SCALAPY_PYTHON_LIBRARY environment variable manually to try using it. But I'll look into testing with that in CI and making support official!
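Concretely, the runtime override described above might look like this; the property and environment variable names come from the message, while the library name is a placeholder for your local CPython shared library:

```scala
object Main {
  def main(args: Array[String]): Unit = {
    // Must happen before the first ScalaPy call, since the
    // interpreter is initialized on first use
    System.setProperty("scalapy.python.library", "python3.10")

    // Only now touch ScalaPy-backed code
    val sys = me.shadaj.scalapy.py.module("sys")
    println(sys.version)
  }
}
```

Setting SCALAPY_PYTHON_LIBRARY in the environment instead avoids the ordering concern entirely, since it is read without any code running first.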

Darsh Selarka
@darshselarka1497

Hey @shadaj:matrix.org, firstly, I really appreciate the work you have put into building ScalaPy! After using it extensively in day-to-day tasks, it has proven to be a great asset in reducing manual conversion effort from Pandas code to Scala.

I have been trying to run a custom PySpark script from a Scala-based notebook on Databricks, but I am facing an issue when I try to pass a Spark session to a function in a custom Python package. ScalaPy throws a type mismatch error. I have attached the error below for your reference.

Importing required libraries using ScalaPy

val pd = py.module("pandas")

val s3fs = py.module("s3fs")

val py_spark_sql = py.module("pyspark.sql")

val pyspark_package = py.module("pyspark_tier2_test.pyspark_tier2_test")

This is the error I get when I pass the spark session to the function in my custom package:

val result_df = pyspark_package.py_driver_func(py_spark_sql.SparkSession)

command-3273291744808514:1: error: type mismatch;
 found   : org.apache.spark.SparkSession
 required: me.shadaj.scalapy.py.Any

Pandas works perfectly with ScalaPy, but I have a requirement to make PySpark scripts run with ScalaPy in order to make things more scalable and distributed!
Can you please suggest a fix or point me in the right direction? Any help will be much appreciated!

shadaj
@shadaj:matrix.org
@darshselarka1497: hmm, this is odd; py_spark_sql.SparkSession should automatically be py.Any since it's just a member of another Python module. I wonder if the Databricks notebook environment is doing something funky. Could you try printing out the type of py_spark_sql.SparkSession (py_spark_sql.SparkSession.getClass)?
Darsh Selarka
@darshselarka1497
@shadaj:matrix.org This is the output for the class type
class me.shadaj.scalapy.py.AnyDynamics$$anon$15$$anon$16
shadaj
@shadaj:matrix.org
@darshselarka1497: hmm, that's weird, in theory things should compile then; maybe you can try storing py_spark_sql.SparkSession in a variable before using it?
Pascal Méheut
@pascal_meheut_gitlab
Hi. I'm running ScalaPy with Python 3.9 in Anaconda on a Mac. It works fine; I just had to configure jna.library.path manually because python3-config returns something wrong. But some modules cannot be imported: numpy and pandas work fine, but when I try to import feather or xgboost, I get the message
"Exception in thread "main" me.shadaj.scalapy.py.PythonException: <class 'ModuleNotFoundError'> No module named 'xgboost'"
shadaj
@shadaj:matrix.org
@pascal_meheut_gitlab: hmm, these packages are installed in your Anaconda environment as well? those should work out of the box; if you're using a virtualenv you need to follow the instructions at https://scalapy.dev/docs/#virtualenv though
Pascal Méheut
@pascal_meheut_gitlab
Yes, these packages are installed: they are my bread and butter. I'm not using virtualenv at all, just Anaconda.
shadaj
@shadaj:matrix.org
that's surprising, is the installation of xgboost right next to numpy and the modules that do work?
Pascal Méheut
@pascal_meheut_gitlab
Yes. Everything is in $HOME/opt/anaconda3/envs/lbo/lib/python3.9/site-packages
lbo being my environment name. I'll test on another Mac and on Linux & Windows tomorrow.
Pascal Méheut
@pascal_meheut_gitlab
Ok, this was a problem with my installation. I removed Anaconda, reinstalled it, recreated the environment and now it works. Thanks.
Pascal Méheut
@pascal_meheut_gitlab
Another question: has anybody written a facade, or explained how to use a Pandas dataframe?
mn98
@mn98
Hi all, I'm trying to get started with ScalaPy and am experiencing issues similar to others' with respect to libraries not being found.
I've tried a few of the proposed solutions but have been miserably unsuccessful in getting it to work. What's slightly different about my setup is that I've used pyenv to install Python 3.9.9 and anaconda3-2021.11. Has anyone had experience with this approach who could share any pointers? Many thanks in advance!
mn98
@mn98

To make this slightly easier, I've removed pyenv from the equation and pushed this skeleton example to github.
At this point, upon executing runMain hello in the sbt shell the error begins with:

java.lang.UnsatisfiedLinkError: Unable to load library 'python3':
dlopen(libpython3.dylib, 0x0009): tried: '/Applications/IntelliJ IDEA CE.app/Contents/jbr/Contents/Home/bin/../lib/jli/libpython3.dylib' (no such file) ...

And it's correct, that file doesn't exist; it's actually /Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9.dylib. But how do I get this to be picked up?

I get the same link error when running either in the sbt shell or within IntelliJ IDEA, so I don't think it's an IDE issue.
mn98
@mn98
I'd be curious to know if what I've pushed to github works out of the box for people.
Kien Dang
@kiendang
add fork := true to your build.sbt and things should work fine
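For anyone else hitting this, the relevant build.sbt fragment might look like the sketch below; only fork := true comes from this thread, and the jna.library.path value is a placeholder for your local Python lib directory:

```scala
// build.sbt
fork := true  // run the program in a forked JVM so the native Python library loads cleanly

// optional: point JNA at your Python installation's lib directory,
// as mentioned elsewhere in this room
javaOptions += "-Djna.library.path=/path/to/python/lib"
```

Forking matters because sbt's own JVM is already running with its own library path; a fresh JVM picks up the javaOptions above.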
mn98
@mn98
@kiendang thank you very much!!
mn98
@mn98
I updated my minimal example on github, which may be useful for others trying to get off the ground.
mn98
@mn98

Next, I've switched to my local install of anaconda by changing the path to python in build.sbt:

lazy val python = Python("/opt/anaconda3/bin/python3.9")

and the existing example works fine.
However, when I then try to experiment with numpy at runtime a particular library can't be loaded:

[info] INTEL MKL ERROR: dlopen(/opt/anaconda3/lib/libmkl_intel_thread.1.dylib, 0x0009): Library not loaded: @rpath/libiomp5.dylib

I notice that /opt/anaconda3/lib/libiomp5.dylib does exist, although /opt/anaconda3/lib/libmkl_intel_thread.1.dylib does not.
Has anyone experienced a similar problem?

mn98
@mn98
Correction, both libraries are present under /opt/anaconda3/lib yet they are not loaded at runtime.
mn98
@mn98
I tried a few of these suggestions in the anaconda docs but unfortunately they haven't resolved my issue.
The full error message reads:
[info] INTEL MKL ERROR: dlopen(/opt/anaconda3/lib/libmkl_intel_thread.1.dylib, 0x0009): Library not loaded: @rpath/libiomp5.dylib
[info]   Referenced from: /opt/anaconda3/lib/libmkl_intel_thread.1.dylib
[info]   Reason: tried: '/Applications/IntelliJ IDEA CE.app/Contents/jbr/Contents/Home/bin/../lib/jli/libiomp5.dylib' (no such file), '/usr/lib/libiomp5.dylib' (no such file).
[info] Intel MKL FATAL ERROR: Cannot load libmkl_intel_thread.1.dylib.
mn98
@mn98

On the executable /opt/anaconda3/bin/python3.9 it would appear (from using otool) that LC_RPATH is correct:

Load command 14
          cmd LC_RPATH
      cmdsize 272
         path /opt/anaconda3/lib (offset 12)

and in /opt/anaconda3/lib/libmkl_intel_thread.1.dylib itself, I see:

Load command 10
          cmd LC_LOAD_DYLIB
      cmdsize 48
         name @rpath/libiomp5.dylib (offset 24)

I'm in a world of macOS/rpath pain now and well out of my depth, but none of the above looks incorrect to me.
Would anyone care to venture why it doesn't pick up @rpath/libiomp5.dylib from /opt/anaconda3/lib?

mn98
@mn98
I put the minimal anaconda/numpy example on this branch. Again, I'd be curious to know if that just works out of the box for folks with a local anaconda install on MacOS.
alicebrb
@alicebrb
Hello, I'm trying to use ScalaPy to integrate the Python ARIMA library with the Scala code from the rest of my project. It is working fine, but when I try to integrate with the Jenkins pipeline, I'm getting an error with the Sonar analysis.
I have a class in my project that contains the ScalaPy code; all other classes contain Scala code.
I tried to ignore this specific file, but with no success, and the goal would be to analyse all Scala and ScalaPy code in SonarQube.