Bastian Zimmermann
@BastianZim
Ahh you're right.
Hmm, that's a problem if a feedstock has multiple outputs.
Ok, I just linted a recipe for requests that I generated with Grayskull and the linter didn't complain about duplicates.
So that's not an option.
Bastian Zimmermann
@BastianZim
@marcelotrevisani Do you have access to the conda-smithy logs?
Because I would then suggest that we try to create a feedstock again for which we know there is a duplicate that isn't being detected and then we see if we can find something in the logs.

@BastianZim, so either it fails to run, or it runs but fails silently, and we don't currently know which?

For here

Ben Mares
@bmares_gitlab

Cool, thanks! I am learning a lot...

Ok, so the mapping is not meant to be complete, it's meant to be an approximation for disambiguation. And in such a mapping, particular choices must be made. That's quite a mess, since imports themselves are so ambiguous. For example, import snappy can refer to either the PyPI package python-snappy, which is a compression library by Google, or it can refer to the PyPI package snappy for hyperbolic geometry of 3-manifolds. I happen to be a mathematician interested in manifolds, so I use the latter, but not the former. Probably you want to maximize the likelihood of correctness by choosing the most popular PyPI package corresponding to a given import...

In the case of detecting duplicates I think it would be really helpful to maintain an up-to-date package.name -> source.url mapping.

Jaime Rodríguez-Guerra
@jaimergp
all feedstocks in conda-forge: https://github.com/regro/cf-graph-countyfair/blob/master/all_feedstocks.json
(also not useful due to multiple outputs)
Bastian Zimmermann
@BastianZim
@bmares_gitlab Yes, that all makes sense. I guess in the end conda-forge core would have to write a CFEP standardising package names and where they are stored (with their mappings).
Jaime Rodríguez-Guerra
@jaimergp
@bmares_gitlab source.url can change though... I've seen recipes move from PyPI to GH because the maintainer stopped releasing
what about about.home, about.doc_url or about.dev_url?
I'd say there's a higher chance of finding matches there
but that's no longer a 1-to-1 comparison
there are several fields in the metadata that can be matching in that case
Ben Mares
@bmares_gitlab

@jaimergp yes, that is a very good point. We'd probably want a mapping including all of those fields.

And then if we get really carried away, for each referenced PyPI package, we could generate another mapping which pulls in all that metadata too.
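A rough sketch of the multi-field comparison discussed above, for illustration only. It assumes each recipe is already parsed into a plain dict with single `source` and `about` sections (real recipes can have a list of sources, which this skips). The helper names `normalize_url`, `recipe_urls` and `possible_duplicates` are hypothetical, not part of Grayskull or conda-smithy, and the URLs in the usage note are made up.

```python
from __future__ import annotations

from urllib.parse import urlsplit

# Metadata fields mentioned in the discussion as candidates for matching.
URL_FIELDS = (
    ("source", "url"),
    ("about", "home"),
    ("about", "doc_url"),
    ("about", "dev_url"),
)


def normalize_url(url: str) -> str:
    """Reduce a URL to lowercase host + path, ignoring scheme, 'www.' and a trailing slash."""
    parts = urlsplit(url.strip())
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[4:]
    return host + parts.path.rstrip("/").lower()


def recipe_urls(recipe: dict) -> set[str]:
    """Collect the normalized URLs found in the fields listed in URL_FIELDS."""
    urls = set()
    for section, key in URL_FIELDS:
        section_data = recipe.get(section)
        if not isinstance(section_data, dict):
            continue  # assumption: list-valued sections are skipped in this sketch
        value = section_data.get(key)
        if isinstance(value, str) and value:
            urls.add(normalize_url(value))
    return urls


def possible_duplicates(new_recipe: dict, existing: dict[str, dict]) -> list[str]:
    """Return names of existing recipes that share at least one URL with new_recipe."""
    new_urls = recipe_urls(new_recipe)
    return sorted(
        name for name, recipe in existing.items() if new_urls & recipe_urls(recipe)
    )
```

For example, a new recipe whose about.home is `https://github.com/example/snappy/` would be flagged against an existing recipe whose about.dev_url is `http://www.github.com/Example/snappy`. Because any shared field counts, this is a many-to-many match rather than a 1-to-1 comparison, so a hit is only a hint for a human reviewer.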

Jaime Rodríguez-Guerra
@jaimergp
btw, @ForgottenProgramme , we are getting out of scope here, but for your first effort I'd say that hyphen-canonicalization for the package name is enough for now!
Bastian Zimmermann
@BastianZim
One other note: I wouldn't necessarily put too much emphasis on the import name. Most often the reqs.txt file uses the PyPI names for specifying dependencies, so that's what should be used for disambiguation.
Ben Mares
@bmares_gitlab

we are getting out of scope here

@jaimergp Haha, ya, sorry about that. Much better to have an initial solution which works most of the time than constructing a perfect solution.

Jaime Rodríguez-Guerra
@jaimergp
Agreed. The import name will be useful when there's support for non-pypi sources
but for PyPI stuff, PyPI name it is
Ben Mares
@bmares_gitlab
And PyPI means _ → -?
Jaime Rodríguez-Guerra
@jaimergp
Yes.
Hyphens all the way
At least that will reduce the surprise factor long term
Checking for existing duplicates can be tackled after that
And something to discuss with CF, in my opinion
Because right now their linting is only partially functional in this regard
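A minimal sketch of the hyphen canonicalization agreed on above, assuming the PyPI name normalization rule from PEP 503 (runs of `-`, `_` and `.` collapse to a single hyphen, and the result is lowercased). Whether the recipe name should also be lowercased was not settled in this thread, and `same_package` is a hypothetical helper added for illustration.

```python
import re


def canonicalize_name(name: str) -> str:
    """Apply PEP 503 normalization: collapse runs of '-', '_' and '.' to '-', then lowercase."""
    return re.sub(r"[-_.]+", "-", name).lower()


def same_package(a: str, b: str) -> bool:
    """True if two spellings refer to the same PyPI project after normalization."""
    return canonicalize_name(a) == canonicalize_name(b)
```

With this rule, `python_snappy` and `python-snappy` compare equal, while `snappy` and `python-snappy` stay distinct. The `packaging` library ships `packaging.utils.canonicalize_name`, which implements the same rule and may be preferable to a hand-rolled regex.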
Marcelo Duarte Trevisani
@marcelotrevisani
@BastianZim which logs?
Mahe Iram Khan
@ForgottenProgramme
Grayskull currently names recipes exactly as the user types the name in. We want to change that so that the generated recipes use the PyPI name.
Right?
Marcelo Duarte Trevisani
@marcelotrevisani
sorry, I might take a while to answer because I still need to reconcile this with my day job :grin:
correct
Bastian Zimmermann
@BastianZim

And something to discuss with CF, in my opinion

Yes, agreed.

@BastianZim which logs?

The conda-smithy webservices log.

It would just be nice to know why the linter sometimes doesn't pick up duplicates.

Right?

Generally agreed, but what about duplicates, where the package already exists in bioconda?

Marcelo Duarte Trevisani
@marcelotrevisani
you will need to ask Matthew (beckermr) for that, maybe @ocefpaf knows how to get it
Bastian Zimmermann
@BastianZim
Will do, thanks!
Filipe
@ocefpaf
In theory we do check for dups but that is broken lately. Not sure why.
Note that the user always has the choice to rename. The reviewer may accept it or not. For example, the PyPI pkg build is named python-build to avoid taking a really generic name.
That also happens if a Python pkg has the same name as the C lib it wraps.
TL;DR we need a "sane" default but that isn't necessarily the final name the pkg will take.
The _ vs - is another, and more complex, problem.
I commented on the PR about it.
Ben Mares
@bmares_gitlab
@ocefpaf which PR contains your comments about _ vs -? (Sorry, I'm a bit lost with all the links)
Marcelo Duarte Trevisani
@marcelotrevisani
@bmares_gitlab it is this one
conda-incubator/grayskull#230
Filipe
@ocefpaf
To summarize it here: for PyPI the _ and - are interchangeable, so most pkg authors don't really care. Conda is not a Python-only package manager, so we do care! That means, when we can get a name from the metadata, we should use it. When the name is confusing, we should "guess" based on a sane standard, which we do not have yet. Last but not least, sometimes we require both names to satisfy all use cases, and we can achieve that with multiple outputs, but that is not a unanimous solution yet. Some people do not like it.