Ben Stewart
@bastewart
I thought there must be a reason for it
Thanks both!
felher
@felher
np :)
Rafi Baker
@BakerRafi_twitter
Hi, I want to wrap a Ref[F, Boolean] in a Resource and use it to prevent multiple consumers from running a process simultaneously. Basically, the acquire sets the Ref to false and release resets it to true. Is this a valid use case?
Arnau Abella
@monadplus
I think so. You can also use Semaphore(1) and .withPermit
Rafi Baker
@BakerRafi_twitter
Thanks! let me try that.
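
For reference, a minimal sketch (assuming the cats-effect 2 Semaphore API) of the Semaphore(1) suggestion above:

import cats.effect.{ContextShift, IO}
import cats.effect.concurrent.Semaphore
import scala.concurrent.ExecutionContext

object Exclusive {
  implicit val cs: ContextShift[IO] = IO.contextShift(ExecutionContext.global)

  // withPermit runs `process` while holding the single permit, so only one
  // consumer can be inside it at a time; the permit is released even if
  // `process` fails or is canceled.
  def guarded[A](process: IO[A]): IO[IO[A]] =
    Semaphore[IO](1).map(sem => sem.withPermit(process))
}

The outer IO allocates the semaphore once; the inner IO[A] is the guarded action to hand out to the consumers.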
Gabriel Volpe
@gvolpe

Tracing is directly tied to runtime: you want a tracing id for each fiber on the system. You can hack it into Kleisli or one of the reader-like MTL type classes, but that's just a hack.

@jdegoes I'm curious, if using Kleisli or a reader-like MTL typeclass is a "hack", how does this get done in pure functional languages like Haskell? I've only seen implementations based on ReaderT and MonadReader, but my knowledge is limited :)
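
For reference, a minimal sketch of the Kleisli / ReaderT approach being called a "hack" here; TraceId, Traced, and the helpers below are hypothetical names used purely for illustration, not cats-effect API:

import cats.data.Kleisli
import cats.effect.IO
import cats.implicits._

object KleisliTracing {
  // Hypothetical names: a trace id carried through a reader environment.
  final case class TraceId(value: String)
  type Traced[A] = Kleisli[IO, TraceId, A]

  def log(msg: String): Traced[Unit] =
    Kleisli(id => IO(println(s"[${id.value}] $msg")))

  // Every step implicitly carries the trace id through the environment.
  val program: Traced[Unit] =
    for {
      _ <- log("calling external service")
      _ <- log("writing to db")
    } yield ()

  // program.run(TraceId("req-42")).unsafeRunSync()
}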

John A. De Goes
@jdegoes
@gvolpe It's a hack there too. :) With tracing, what you really want to get for free is the execution graph — main thread spawns thread A, B, and C, thread C spawns thread D, E, and F, and so forth; for at least some history; and then ideally across processes and even networks. This lets you debug, profile, and diagnose large, complex systems. Even TaskLocal / FiberLocal is not quite right, but it's at least directionally correct.
Reference equality is a huge hack in Haskell, too (and some problems benefit greatly from it, e.g. serialization, detecting cycles in graphs, etc.). Basically anything connected to how something is implemented at the system level is a hack in Haskell.
Gabriel Volpe
@gvolpe
Thanks John, that's an interesting point of view. I guess there's another use case that we care more about at work: the execution graph of the business components (say, an external web service call, access to disk, access to the db, etc.). We get the thread / fiber information if we decide to attach it to the trace call, but we are only interested in knowing which parts of the business logic have executed so far at a given point. When processing payments I can't stress enough how useful this is, especially in failure scenarios.
John A. De Goes
@jdegoes
@gvolpe I think that's common (to want to attach metadata), for example on a web server wanting all logging to be tagged by the request id of the initiating request. But you still want to track the thread / fiber graph because you need to know how that 1 request is being translated into parallel / sequential tasks that themselves perform additional IO.
How do you tell now (without thread information) which parts are executing in parallel, and which sequentially, and which are primary versus secondary tasks, etc?
Fabio Labella
@SystemFw
heh, I find those to be different concerns: if I'm tracing e.g. for profiling, yeah, something baked into the runtime would be preferable, since I don't want to "dirty" my code with that. If I'm doing tracing at the semantic level, I'd rather it be built compositionally (since I see it as a full-blown feature of my code) than built by "abusing" the runtime
This comes from having seen both approaches on a large scale, with ThreadLocal
ThreadLocal for semantic info was a disaster
Gabriel Volpe
@gvolpe
@jdegoes We know exactly how things run because parallelism and concurrency are explicit with cats-effect. We don't use tracing for that, we use it primarily to know when a piece of business logic has begun / ended / failed / etc.
And of course we want to identify each piece of business logic with a particular http request, hence the trace-id in use.
John A. De Goes
@jdegoes
@SystemFw Why not both: I want business component/task-level metadata (e.g. this part here is DB access, etc), but I don't want to have to work to take that metadata and see it in a graph that clearly depicts parallel and sequential operations; and which business components were initiated from which other ones. For big picture information—for debugging, reducing latency, identifying bottlenecks, etc.—nothing beats the execution graph (appropriately labeled with metadata)
@SystemFw ZIO FiberLocal has no runtime support; what has runtime support is that every fiber has a unique id, which is used to implement FiberLocal. I think you might appreciate that approach more. Once you have an id it becomes natural to talk about parents, which (if implemented in a sane fashion) gives you raw graph data you can use to build really nice tracing. The section (semantic) metadata can just go in a Ref, but the other stuff requires runtime support.
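
A tiny sketch (cats-effect 2 API assumed) of the "semantic metadata can just go in a Ref" idea: each business step records a label in a shared Ref before running. Fiber identity and the parent/child graph would still need runtime support, as discussed above.

import cats.effect.IO
import cats.effect.concurrent.Ref

object SemanticTrace {
  // Prepend a label to the shared trace log, then run the step.
  def traced[A](log: Ref[IO, List[String]], label: String)(step: IO[A]): IO[A] =
    log.update(label :: _).flatMap(_ => step)
}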
John A. De Goes
@jdegoes
@gvolpe I think that's a luxury of a relatively small project. When there are 3 million lines of code and hundreds of services / business components, knowing "exactly" how things run becomes intractable due to time constraints. But product will tell you that a latency of 2s is unacceptable and you need to get it down to 500ms, and you can either study tens of thousands of lines of ever-changing code for days or you can look at the execution graph. I agree though that it's not necessary for sane-sized projects.
Fabio Labella
@SystemFw
right, I think my underlying thought is that it's tempting to add more and more features to the runtime (in general, not just for this case), but I'd like to keep things compositional if at all possible; I find it works better in the long run, sometimes in subtle ways (e.g. when upgrading)
John A. De Goes
@jdegoes
@SystemFw I agree. But rather than adding features to the runtime, exposing its guts in a sane principled way such that you can modularly snap in features like the full graph tracing (not just fiber identity) could make a lot more sense.
Fabio Labella
@SystemFw
yeah, I agree with that
John A. De Goes
@jdegoes
@SystemFw Do you have any ideas on what it could look like (aside from a “hook”-based approach)?
Fabio Labella
@SystemFw
no, all my cats-effect thoughts are being spent on what a sane model for interruption could look like (I do have some abstract ideas about that API, but the implementation strikes me as hard and I haven't tried anything yet)
John A. De Goes
@jdegoes
What are your ideas on that?
Btw we generalized Managed (Resource) in ZIO. It’s now possible to express interruptible acquisition for concurrent data structures, as well as bracket-style acquisition for foreign resources, in the same abstraction. It’s less code too.
Fabio Labella
@SystemFw

so, it could be all wrong, but basically I think the current model (interruptible flatMaps, interrupt on async boundaries) is ok except when you are trying to write concurrent code, because then you'd like to know where interruption could happen.
So my idea is a (possibly unimplementable :P) model based on 3 primitives: guaranteeCase, mask, and poll.
guaranteeCase is the same as now.
mask is the same as uninterruptible, except when poll appears.
poll makes an F wrapped in an uninterruptible (mask) block interruptible again.

So, the typical case where you want uninterruptible flatMaps locally, but interruption on an async thing, for example when using semaphore, could be:

mask {
    poll { sem.acquire }.flatMap(yourThing)
}
if poll encapsulates the flatMap as well, then you can interrupt that as well
you can then guaranteeCase in the places it makes sense too
the main issue I'm trying to think about is that this gives you (for now) the ability to "catch" interruption, which is inadvisable, as well as requiring some way of "rethrowing" it. All in all, it seems like a closer/better fit for an async exception model, which I'm not sure is desirable
the other problem is implementation complexity: even though you can understand interruption as an async exception conceptually, in practice interruption at async boundaries and interruption between flatMaps are implemented rather differently
and this interface would require you to mask/unmask either kind transparently
any links about where to look for the new zio.Resource? (apart from its definition)
the nice thing about this model though is its compositionality
Fabio Labella
@SystemFw
if in a bigger section you poll an action containing the snippet above, its semantics are preserved, and it only gets interrupted where the author deemed it safe. OTOH, most code keeps the property of not needing any explicit "I want it to be interrupted here" marker that you get with a cancelBoundary sort of model (which is also weird because it only applies to canceling flatMaps, not async actions)
@jdegoes
I'm also not entirely sure if an uninterruptible is needed, which would override any mask/poll thing where you truly want to say that this can never be interrupted in any case
John A. De Goes
@jdegoes
At the gym now but this is really interesting...let me digest and get back.
Fabio Labella
@SystemFw
:+1:
John A. De Goes
@jdegoes
@SystemFw I kind of like it. Flip-flopping between interruptible and uninterruptible sections. The meaning of uninterruptible (mask) changes to "I don't want what I know about to be interruptible." But things you don't know about (e.g. acquire) may independently decide they are ok with interruption. I think users could be trained to deal with that interpretation. The main drawback I see is that, if you say something's not interruptible (mask) just so you don't have to deal with the possibility of interruption, you may still have to add a guaranteeCase to deal with it anyway, because whether or not something is truly interruptible depends on details you'll never know; they're buried deeper. "Innermost interruptible wins".
In implementing bracket and many other things, you need uninterruptible. If you've handled errors and you make something uninterruptible, you have the strong guarantees necessary to implement higher-level semantics like bracket or Resource. Now you have to think carefully because masking something doesn't guarantee it won't be interrupted unless whatever actions you run do not use poll. Which you may not know (you definitely won't know in the case of combinators).
I think it doesn't so much eliminate the tension as transform it into a different one (which is maybe more manageable, I'm not sure yet).
Fabio Labella
@SystemFw

@jdegoes

I've reread
https://haskell-lang.org/tutorial/exception-safety
http://www.well-typed.com/blog/97/

in detail. These are my takeaways:

  • There is a fundamental tradeoff between safety and deadlock prevention
  • Allowing some actions to be interruptible even in a masked scenario is essential for deadlock prevention
  • Being able to override the above is necessary for absolute safety

Overall, what I proposed above is basically how the Haskell model works, if you dive deep enough, but condensed to 4 primitives (which are higher level than Haskell's, none of the following are primitives there).
The primitives are: mask, poll, uninterruptible, guaranteeCase.
bracket can be defined as:

mask {
  acquire.flatMap { r =>
    poll { use(r) }.guaranteeCase { _ => release.uninterruptible }
  }
}

Consensus seems to be that it's not necessary to make acquire uninterruptible, it can take care of itself. In any case the model allows both.
There is debate on release, but overall it's preferred to make it uninterruptible to err on the side of caution.
Also note that if poll is defined as taking interruption and raising something akin to InterruptedException, guarantee is not needed, it becomes handleErrorWith.

Now, it's true that you need to know which actions are "pollable" (basically things like Semaphore.acquire), but that's an unavoidable limitation due to the tension mentioned above:
to avoid deadlock you need to trust that the implementation of the things you rely on is interruptible (e.g. using bracket with a semaphore), but to guarantee safety in all cases you need to never trust the implementations and make them all uninterruptible (e.g. using bracket when opening a file).
Therefore, a sane model needs to allow both, or it will be broken in the other case.
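
To make the shape of those four primitives concrete, here is a hypothetical sketch of them as a Scala interface; none of these names exist in cats-effect at the time of this discussion, and ExitCase is borrowed from cats-effect 2 purely for illustration.

import cats.effect.ExitCase

// Hypothetical interface for the proposed model; an illustration only.
trait Interruption[F[_]] {
  // uninterruptible, except where poll re-enables interruption
  def mask[A](fa: F[A]): F[A]
  // inside mask, makes fa interruptible again
  def poll[A](fa: F[A]): F[A]
  // never interruptible, overrides any enclosing poll
  def uninterruptible[A](fa: F[A]): F[A]
  // run a finalizer that observes how fa terminated
  def guaranteeCase[A](fa: F[A])(fin: ExitCase[Throwable] => F[Unit]): F[A]
}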

nashid
@nashid

I am getting this error:

value foreach is not a member of cats.effect.IO[String]

Code:

val content = for {
  content <- fetchUrl(url)
} content

def fetchUrl(url: String): IO[String] = {
  httpClient.expectString
}

Gabriel Volpe
@gvolpe
You're missing the yield in your for :)
@nashid
nashid
@nashid
ops
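
For reference, the fix Gabriel points out is just the missing yield; url, fetchUrl, and httpClient are as in the snippet above:

// Corrected for comprehension: the missing `yield` was the only problem.
val content: IO[String] =
  for {
    body <- fetchUrl(url)
  } yield body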
John A. De Goes
@jdegoes

Acquire in general cannot be interruptible. openFile <* IO.unit cannot leak resources just because of the unit at the end. Similarly for release, otherwise it will leak resources. Concurrent data structures are a special exception because you can divide the acquisition into a non-interruptible section, followed by an async interruptible section.

The difficulty I have with this new model (which I otherwise quite like) is that if someone gives me an io that I have to run, and my acquire looks like openFile <* io, then the safety of my code depends not on what I write ((openFile <* io).bracket(...)(...)), but on what io I was passed. In other words, local reasoning about resource safety breaks down. It's now necessary to do whole-program analysis to figure out if your code is resource safe.

The current model can be reasoned about locally. That is, you can know just looking locally whether or not you can leak resources. Whole program analysis is not necessary. Of course, you don't know if the program will deadlock but that's beyond the realm of most static type systems anyway and there are lots of ways to deadlock (MVar, Queue, etc.).

Now maybe bracket would be implemented with uninterruptible and not mask so you wouldn't have this problem. But then you're back in "deadlock" land with semaphore.acquire.bracket(_ => semaphore.release)(...).
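
A minimal sketch, assuming the cats-effect 2 API, of the tension described above: bracket keeps the acquire step uncancelable, so the code is resource safe, but a fiber blocked on sem.acquire can no longer be interrupted.

import cats.effect.{ContextShift, IO}
import cats.effect.concurrent.Semaphore
import scala.concurrent.ExecutionContext

object BracketTension {
  implicit val cs: ContextShift[IO] = IO.contextShift(ExecutionContext.global)

  // Resource-safe: the permit is always released. But because bracket's
  // acquire step is uncancelable, a fiber stuck waiting in sem.acquire
  // cannot be canceled -- the "deadlock land" referred to above.
  def guarded[A](sem: Semaphore[IO], use: IO[A]): IO[A] =
    sem.acquire.bracket(_ => use)(_ => sem.release)
}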