These are chat archives for uwhpsc-2016/uwhpsc-2016

2nd Jun 2016
Chris Kang
@shinwookang
Jun 02 2016 00:51
Hi @mvelegar . I had some office hours time with Chris today, but it seems like I need some more help. I will send you a private chat when you're available for the 6PM office hours. Thank you.
mvelegar
@mvelegar
Jun 02 2016 00:52

Starting Office Hours

bmva
@bmva
Jun 02 2016 00:58
Hi @mvelegar...there was some discussion above regarding using MPI_Wait...is this only necessary if you are using MPI_Isend AND MPI_Irecv?
or I suppose just MPI_Irecv?
burnhamdr
@burnhamdr
Jun 02 2016 01:00
@mvelegar, I posted an issue in my private repository, if you could take a look at the plots I posted there I would appreciate any insight you could provide.
mvelegar
@mvelegar
Jun 02 2016 01:02
@bmva you are next in line, am helping one student right now. @burnhamdr you are third in line. will be getting back to you shortly
natwall27
@natwall27
Jun 02 2016 01:07
@mvelegar , can I be in the line up, please? I posted my question on the issues page, and I will try to post the plot of the solution in a private chat.
mvelegar
@mvelegar
Jun 02 2016 01:07
@bmva MPI_Wait is needed in the asynchronous version of the MPI implementation
so whenever you use MPI_Isend
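(A minimal sketch of the point above, for anyone following along: every MPI_Isend hands back an MPI_Request that should eventually be completed with MPI_Wait before the send buffer is reused. This is generic illustration code, not the homework solution; the ranks, tag, and value are arbitrary.)

#include <mpi.h>
#include <stdio.h>

int main(int argc, char** argv)
{
  MPI_Init(&argc, &argv);

  int rank, size;
  MPI_Comm_rank(MPI_COMM_WORLD, &rank);
  MPI_Comm_size(MPI_COMM_WORLD, &size);

  double value = 3.14;
  if (rank == 0 && size > 1) {
    MPI_Request req;
    /* non-blocking send: returns immediately, the transfer may still be in flight */
    MPI_Isend(&value, 1, MPI_DOUBLE, 1, 0, MPI_COMM_WORLD, &req);
    /* ...useful work can overlap with the communication here... */
    /* only after MPI_Wait is it safe to reuse (or free) the send buffer */
    MPI_Wait(&req, MPI_STATUS_IGNORE);
  } else if (rank == 1) {
    double received;
    MPI_Recv(&received, 1, MPI_DOUBLE, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
    printf("rank 1 received %f\n", received);
  }

  MPI_Finalize();
  return 0;
}
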
jdstead
@jdstead
Jun 02 2016 01:08
@mvelegar , hi i also have a couple questions once you get through with these other ones. let me know when you get time
bmva
@bmva
Jun 02 2016 01:09
okay, that's what I thought, so if I use Isend and Receive, there's no reason to use a wait, correct?
mvelegar
@mvelegar
Jun 02 2016 01:09
@bmva it is "necessary" as long as you are using Isend
bmva
@bmva
Jun 02 2016 01:10
so if I am using Isend and Receive without a wait, but consistently getting the matching answer, then it might be a fluke?
mvelegar
@mvelegar
Jun 02 2016 01:11
some machines are pretty lenient when it comes to the behavior of MPI_Send and MPI_Recv. On my machine for example, I can get away without MPI_Wait
bmva
@bmva
Jun 02 2016 01:12
so maybe I could elaborate a bit without saying too much about my exact code
mvelegar
@mvelegar
Jun 02 2016 01:12
But it's good practice to make sure you use Wait
sure @bmva, it might be helpful to everyone
bmva
@bmva
Jun 02 2016 01:13
but if I have an MPI_Isend followed directly by MPI_Recv, doesn't it inherently have a wait with the MPI_Recv
mvelegar
@mvelegar
Jun 02 2016 01:13
this has already been discussed btw, make sure you have scrolled up to see Sean's OH yesterday and Chris's from earlier today
bmva
@bmva
Jun 02 2016 01:14
yep, I saw Chris's and perhaps just didn't fully understand it...I will scroll up to read Sean's as well though
Hugh Krogh-Freeman
@hughkf
Jun 02 2016 01:14
@mvelegar Are grades posted for homework #3? I don't see mine
mvelegar
@mvelegar
Jun 02 2016 01:16
@hughkf it's still muted, @quantheory graded this one and I am not sure if he is done yet. He will publish them as soon as he is done
@bmva there is a little confusion in your understanding
Hugh Krogh-Freeman
@hughkf
Jun 02 2016 01:16
Thanks!
mvelegar
@mvelegar
Jun 02 2016 01:17
so when there is interleaved communication between multiple MPI processes and
the send is asynchronous: we send data as soon as possible so no other task has to wait for it. So we can:
(1) put in the MPI_Isend calls
(2) go about computing the Euler step for the data we already have
(3) put in MPI_Recv requests for the data we don't have, and compute those Euler steps after the receives
bmva
@bmva
Jun 02 2016 01:19
Ah, so that would be hiding the latency...I think I've coded mine poorly such that there is no latency hiding, just straight send receive, then calculate
in which case I have all the data I need at the start of my calculations
mvelegar
@mvelegar
Jun 02 2016 01:20
(4) put in the MPI_Wait calls (as late as possible, say just before updating the ghost cells)
that is what I am getting at: there is no exposed latency in the way I have outlined the steps above
In other words, you can group the send and receive requests separately
bmva
@bmva
Jun 02 2016 01:21
gotcha...yea I unfortunately totally forgot about hiding the latency...so I think mine is working properly and without any flukes...but it's just inefficient in terms of speed...is that what you've gathered?
mvelegar
@mvelegar
Jun 02 2016 01:21
there is no "matching" happening in real time
@bmva I am sure your code is working fine without the wait.
but like I said, it is recommended to put in waits so there are no leaks/inconsistencies etc
bmva
@bmva
Jun 02 2016 01:23
awesome, thanks for the help! I'll work on hiding the latency and check the run time improvements
mvelegar
@mvelegar
Jun 02 2016 01:24
ok @burnhamdr you are next
burnhamdr
@burnhamdr
Jun 02 2016 01:24
Great, thanks!
mvelegar
@mvelegar
Jun 02 2016 01:27
@burnhamdr (1) you need to update your ghost cells every time you update the ukt values
lines 117 and 118 should be inside the time loop
natwall27
@natwall27
Jun 02 2016 01:28
@mvelegar , please take me off the list. I figured it out.
burnhamdr
@burnhamdr
Jun 02 2016 01:28
Ah ok, so reinitializing those variables each time is better than initializing outside the loop and then updating them at the end of each iteration?
mvelegar
@mvelegar
Jun 02 2016 01:29
Yay @natwall27 ! Would you like to share with all of us your issue and solution if you can stick around?
(2) Well that is the other issue, keep 2 right_ghost cells and 2 left ones
since we are using asynchronous communication, your logic will be modified as:
for {time loop}
actually let me take a step back
@shinwookang this applies to you as well
Chris Kang
@shinwookang
Jun 02 2016 01:32
okay!
mvelegar
@mvelegar
Jun 02 2016 01:32
please modify this line: MPI_Request req;
to MPI_Request req_left, req_right;
then for clarity, let's declare the ghost cells as:
double left_ghost[2];
and same for right_ghost
jdstead
@jdstead
Jun 02 2016 01:34
so you're making it an array with length 2? that's brilliant
mvelegar
@mvelegar
Jun 02 2016 01:35
one of these values in each array (say left_ghost[0] and right_ghost[0]) is the one we'll update on each pass through the time loop.
in the time loop:
jdstead
@jdstead
Jun 02 2016 01:35
then you just specify when to send one and when to use the other to write into the next iteration?
mvelegar
@mvelegar
Jun 02 2016 01:36
// set and communicate boundary data
left_ghost[0] = ukt[0];
right_ghost[0] = ukt[Nx-1];
now we send it out using MPI_Isend as soon as possible so that no other task has to wait for the data
jdstead
@jdstead
Jun 02 2016 01:37
that is the piece that i have been missing.
mvelegar
@mvelegar
Jun 02 2016 01:38
MPI_Isend(&left_ghost[0], 1, MPI_DOUBLE, left_proc, 0, comm, &req_left); for example
and same for right_ghost[0]
everyone with me so far?
burnhamdr
@burnhamdr
Jun 02 2016 01:38
yes!
Chris Kang
@shinwookang
Jun 02 2016 01:38
Are you MPI_Isending successively?
so, MPI_Isend(left ghost) followed by MPI_Isend(right_ghost)?
mvelegar
@mvelegar
Jun 02 2016 01:40
@shinwookang or the other way round, it doesn't matter, we are just doing the data send first as soon as we update it so that no other process is waiting
then next step would be to either (1) put in your Recv calls
Chris Kang
@shinwookang
Jun 02 2016 01:40
okay
mvelegar
@mvelegar
Jun 02 2016 01:41
so at left boundary for example MPI_Recv(&left_ghost[1], 1, MPI_DOUBLE, left_proc, 0, comm, MPI_STATUS_IGNORE);
and similar for right
burnhamdr
@burnhamdr
Jun 02 2016 01:41
or do forward Euler for the points we have local data for?
mvelegar
@mvelegar
Jun 02 2016 01:42
@burnhamdr YES! so note here that I will be consistently sending the left/right [0] values and receiving data in the left/right[1] values
burnhamdr
@burnhamdr
Jun 02 2016 01:42
ohhh ok so you send to the other entry in the array and then at the end you would update the first entry?
oh ok!
mvelegar
@mvelegar
Jun 02 2016 01:43
@burnhamdr Right again! note that we need the Recv commands before computing our boundary values at left and right
now, at the very end of the time loop
we put in MPI_Wait to make sure there was no data leak etc
like so: MPI_Wait(&req_left, MPI_STATUS_IGNORE);
MPI_Wait(&req_right, MPI_STATUS_IGNORE);
Chris Kang
@shinwookang
Jun 02 2016 01:44
should there be two different MPI_Waits for the two different ones?
oh okay
mvelegar
@mvelegar
Jun 02 2016 01:45
yes, which is why I asked you to modify the req at the top to req_left and req_right
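(Putting the pieces of this walkthrough together, a sketch of the whole pattern might look like the following. It is only a sketch under assumptions: the ring-style left/right neighbors, the Heaviside-like initial condition, the coefficient r = dt/dx^2, and the use of two tags to keep the left- and right-going messages distinct are illustrative choices, not taken from the homework handout; adapt the names and boundary handling to your own code.)

#include <mpi.h>
#include <stdlib.h>
#include <string.h>

int main(int argc, char** argv)
{
  MPI_Init(&argc, &argv);

  MPI_Comm comm = MPI_COMM_WORLD;
  int rank, size;
  MPI_Comm_rank(comm, &rank);
  MPI_Comm_size(comm, &size);

  const size_t Nx = 16;     /* local chunk size (illustrative) */
  const size_t Nt = 100;    /* number of time steps (illustrative) */
  const double r  = 0.25;   /* dt/dx^2 (illustrative, stable value) */

  int left_proc  = (rank - 1 + size) % size;   /* ring neighbors, for brevity */
  int right_proc = (rank + 1) % size;

  double* ukt   = calloc(Nx, sizeof(double));
  double* uktp1 = calloc(Nx, sizeof(double));
  if (rank == 0)                               /* Heaviside-like initial data */
    for (size_t i = 0; i < Nx; ++i) ukt[i] = 1.0;

  /* [0] holds the value we send each step, [1] holds the value we receive */
  double left_ghost[2], right_ghost[2];
  MPI_Request req_left, req_right;

  for (size_t t = 0; t < Nt; ++t) {
    /* (1) set and send the boundary data as early as possible */
    left_ghost[0]  = ukt[0];
    right_ghost[0] = ukt[Nx-1];
    MPI_Isend(&left_ghost[0],  1, MPI_DOUBLE, left_proc,  0, comm, &req_left);
    MPI_Isend(&right_ghost[0], 1, MPI_DOUBLE, right_proc, 1, comm, &req_right);

    /* (2) Euler step for the interior points: all data already local */
    for (size_t i = 1; i < Nx-1; ++i)
      uktp1[i] = ukt[i] + r*(ukt[i-1] - 2.0*ukt[i] + ukt[i+1]);

    /* (3) receive the ghost values, then do the two boundary points */
    MPI_Recv(&left_ghost[1],  1, MPI_DOUBLE, left_proc,  1, comm, MPI_STATUS_IGNORE);
    MPI_Recv(&right_ghost[1], 1, MPI_DOUBLE, right_proc, 0, comm, MPI_STATUS_IGNORE);
    uktp1[0]    = ukt[0]    + r*(left_ghost[1] - 2.0*ukt[0]    + ukt[1]);
    uktp1[Nx-1] = ukt[Nx-1] + r*(ukt[Nx-2]     - 2.0*ukt[Nx-1] + right_ghost[1]);

    /* (4) complete the sends at the very end of the iteration */
    MPI_Wait(&req_left,  MPI_STATUS_IGNORE);
    MPI_Wait(&req_right, MPI_STATUS_IGNORE);

    /* the new values become the current values for the next step */
    memcpy(ukt, uktp1, Nx*sizeof(double));
  }

  free(ukt);
  free(uktp1);
  MPI_Finalize();
  return 0;
}
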
burnhamdr
@burnhamdr
Jun 02 2016 01:46
wow ok, that fixed my issue. Thanks so much!
mvelegar
@mvelegar
Jun 02 2016 01:46
no problem!
is any one waiting in line?
burnhamdr
@burnhamdr
Jun 02 2016 01:46
I had originally messed around with 4 ghost cells but didn't quite make the right jump. It makes sense to me now!
mvelegar
@mvelegar
Jun 02 2016 01:46
@shinwookang did that help?
@burnhamdr great!
Chris Kang
@shinwookang
Jun 02 2016 01:47
I'm making my fixes, but I will let you know!
Thank you so much for all the help, in advance
jdstead
@jdstead
Jun 02 2016 01:56
@mvelegar , so do we need to also change our request declaration to match our ghost cell if we are using Isend? i think i saw something like that above....
mvelegar
@mvelegar
Jun 02 2016 01:56
@jdstead I recommended that, you will find the changes if you scroll above
Good luck to everyone!

Office hours end

jdstead
@jdstead
Jun 02 2016 01:58
ok i thought i saw that but it didn't make sense to me at the time. i think it does now
jdstead
@jdstead
Jun 02 2016 18:46
@cswiercz , hi Chris, I have re-written my program based on our conversation yesterday, and my plots are generating very well, but my test is still failing (I got my norm delta down to about 10^-3) and I am still getting the BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES error. I think that I am missing something in the initial and boundary conditions for each process. When you get a chance, can you look at my code and see if there is something obvious I am missing?
gadamico
@gadamico
Jun 02 2016 19:03
Hi, Chris. Was hoping you could take a look at my code. Somehow I'm getting a spatially constant solution of u = 1 (for the parallel version).
Chris Swierczewski
@cswiercz
Jun 02 2016 19:04

(Running 5-10 minutes late. Be here soon.)

gadamico
@gadamico
Jun 02 2016 19:04
ok
gadamico
@gadamico
Jun 02 2016 19:09
@jdstead: I've been getting that "bad termination" error, too.
Yi Zheng
@qpskcn1
Jun 02 2016 19:10
Same here
Chris Swierczewski
@cswiercz
Jun 02 2016 19:10
Okay. I might be jumping in and out, here, but I'm ready to answer questions starting with @jdstead
gadamico
@gadamico
Jun 02 2016 19:10
cool
Chris Swierczewski
@cswiercz
Jun 02 2016 19:10
In general, I'm quite sure that your "BAD TERMINATION" errors stem from misunderstanding the role of uk in the function heat_parallel().
See the comments I made in the previous office hours above.
In short, you should only need to invoke MPI_Isend, MPI_Recv, MPI_Wait and / or MPI_Barrier.
That being said, I'll take a look at @jdstead 's code first.
@jdstead There are a number of issues, here, but you're getting close. First, you have some strange conditional statements starting at Line 126
Remember that the current process needs to pass its values for uk[0] and uk[Nx-1] to the left and right processes, respectively.
jdstead
@jdstead
Jun 02 2016 19:15
right but for first iteration, they have to have an initial condition
Chris Swierczewski
@cswiercz
Jun 02 2016 19:15
Furthermore, in Lines 147 and 150 you look like you want to use the boundary data but you haven't received it from any adjacent processes, yet.
The initial condition was already passed to the processes in the Python script.
jdstead
@jdstead
Jun 02 2016 19:15
but stored where?
Chris Swierczewski
@cswiercz
Jun 02 2016 19:15
In other words, Process 0 already has uk[i] == 1 for all i.
You can check for yourself!
(Via inspection of the values of uk for the rank == 0 process.)
Regardless, in Lines 147 and 150 you look like you want to use the boundary data but you haven't received it from any adjacent processes, yet.
jdstead
@jdstead
Jun 02 2016 19:17
everywhere else not in the domain Nx it should be ukt[i] = 0, right?
Chris Swierczewski
@cswiercz
Jun 02 2016 19:17
You want to receive those values from the left and right processes so you can actually compute uktp1[0] and uktp1[Nx-1], respectively.
No!
Process 0's uk "chunk" is all populated with zeros.
Sorry, ones
For all other processes, the value of uk is initially all zeros.
jdstead
@jdstead
Jun 02 2016 19:18
right that is what i am saying
Chris Swierczewski
@cswiercz
Jun 02 2016 19:18
Sorry, I misunderstood.
I see what you were saying, now.
jdstead
@jdstead
Jun 02 2016 19:19
so when rank 0 gets the conditions of the boundary next to it, they are 0
Chris Swierczewski
@cswiercz
Jun 02 2016 19:19
True, but what if I decided to give you a different problem where the initial condition was something else? Then the conditional in your code at Line 126 would make no sense.
I claim that each process needs to do the same thing when it comes to "ghost cells" before the time iteration begins.
jdstead
@jdstead
Jun 02 2016 19:20
ok. i suppose. but in that case, a new set of initial conditions should be provided too right?
but those values may not have been calculated yet, since they depend on the ones next to them. It makes a circular reference
Chris Swierczewski
@cswiercz
Jun 02 2016 19:23
When the function heat_parallel is called by the Python script, an initial condition is passed. (Just like in heat_serial.) The initial condition given in this particular problem is a Heaviside-like function.
The initial condition for heat_parallel is that process 0 receives ones and the other processes receive zeros.
However, it's bad programming practice to assume that this is all the case. That's why I'm getting up in your grill about the code in Lines 126-135: it's problem-specific. It's not extensible to other input.
Even with these lines aside you have problems at Lines 147 and 150. The data that you need here come from adjacent processes. However, your communication statements currently lie below these lines.
Make sense?
jdstead
@jdstead
Jun 02 2016 19:26
I have to revert back to see which ones you're talking about. I have been tweaking and re-tweaking to get the send/receive calls correct
Chris Swierczewski
@cswiercz
Jun 02 2016 19:26
Also, you have duplicate lines 163 and 164. I think you mean to wait for left and right.
I'm talking about the code that's currently on GitHub.
jdstead
@jdstead
Jun 02 2016 19:27
oh yeah that is true. both say right
Chris Swierczewski
@cswiercz
Jun 02 2016 19:27
Like I said, in order to compute uktp1[0] you need data from the left process, first.
I'll let you think about this a bit while I address other questions, if that's okay.
jdstead
@jdstead
Jun 02 2016 19:27
yeah i felt like that code was getting close
Chris Swierczewski
@cswiercz
Jun 02 2016 19:27
It is!
jdstead
@jdstead
Jun 02 2016 19:27
ok
thanks
Chris Swierczewski
@cswiercz
Jun 02 2016 19:27
You're getting there. The main issue is that some things are in the wrong order.
Okay, @gadamico . You're next. Let me take a look at your code.
gadamico
@gadamico
Jun 02 2016 19:28
Thanks.
I think I'm missing an MPI_Wait somewhere (given the conversation above).
Chris Swierczewski
@cswiercz
Jun 02 2016 19:29
I can already see the problem. You never "update" ukt with the "old values" from uktp1.
gadamico
@gadamico
Jun 02 2016 19:29
Oh, right....
Oops.
Chris Swierczewski
@cswiercz
Jun 02 2016 19:29
That is, with each iteration of the time loop you want to take the values from uktp1 and "copy them" to ukt. I do the equivalent of that in Lines 66-68.
(In the serial version.)
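(For reference, the "copy" being described is just a loop, or a memcpy, at the bottom of the time loop; a small sketch using the ukt/uktp1 names from this discussion:)

#include <stddef.h>

/* Sketch: at the bottom of each time step the newly computed values become
 * the "current" values, so the next forward Euler step starts from them.
 * (Names follow the chat; a memcpy would work just as well.) */
static void advance_state(double* ukt, const double* uktp1, size_t Nx)
{
  for (size_t i = 0; i < Nx; ++i)
    ukt[i] = uktp1[i];
}
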
gadamico
@gadamico
Jun 02 2016 19:29
Right. That makes sense.
Chris Swierczewski
@cswiercz
Jun 02 2016 19:29
That's why your solution never changes.
gadamico
@gadamico
Jun 02 2016 19:30
That also makes sense.
Geez.
Yi Zheng
@qpskcn1
Jun 02 2016 19:30
Can you look at my code plz.
gadamico
@gadamico
Jun 02 2016 19:30
Thanks, Chris. Not sure how I managed to leave that out.
jasheetz
@jasheetz
Jun 02 2016 19:34
Can I jump in after @qpskcn1 ?
Chris Swierczewski
@cswiercz
Jun 02 2016 19:41
Sorry for the brief outage.
I'm back.
@qpskcn1 -- you're next
What is your question?
Yi Zheng
@qpskcn1
Jun 02 2016 19:43
I am quite confused about updating the ghost cell
Chris Swierczewski
@cswiercz
Jun 02 2016 19:44
Anything in particular you can tell me? What have you tried?
Yi Zheng
@qpskcn1
Jun 02 2016 19:45
Also, I am getting the bad termination as well
Chris Swierczewski
@cswiercz
Jun 02 2016 19:46
Still, what in particular are you confused about with your use of ghost cells? I can see your code on GitHub but I want to make sure I'm answering the specific questions you have.
Yi Zheng
@qpskcn1
Jun 02 2016 19:46
I tried to update the ghost cell at the beginning of each time iteration
Chris Swierczewski
@cswiercz
Jun 02 2016 19:47
At Line 130 you indeed prep some data to be transferred to adjacent processes, yes.
Yi Zheng
@qpskcn1
Jun 02 2016 19:48
Do we still need to update the ghost cell after the MPI_Wait?
haiboqi
@haiboqi
Jun 02 2016 19:48
@cswiercz could you have a peek at my code after you're done with qpskcn1, Chris? I'm having the BAD TERMINATION too.
Chris Swierczewski
@cswiercz
Jun 02 2016 19:48
So from what I see it does appear that you're correctly passing information between the processes.
You want the current process to send its value of ukt[0] to the left process, for example, because the left process needs this data to compute its version of uktp1[Nx-1].
To rephrase: the left process needs the current process's uk[0] to compute its uktp1[Nx-1].
(A similar situation occurs on the right boundary.)
So...my question to you is: are you using the appropriate data when computing uktp1[Nx-1] and uktp1[0]?
(Hint: you're not.)
Yi Zheng
@qpskcn1
Jun 02 2016 19:51
That's what I am confused about as well
Chris Swierczewski
@cswiercz
Jun 02 2016 19:52
Finally, your MPI_Wait statements should occur after the code you have written at the end of the loop iteration. This is because you need to wait for the data to be sent before you can use it like you do.
Not to mention that Lines 152-153 are unnecessary because you already obtain the data that you want to send at the top of the loop.
What part of this is confusing in particular?
Yi Zheng
@qpskcn1
Jun 02 2016 19:54
OK! Thank you! I will try to fix the computation after receiving the data.
NVM, I think I got it.
Chris Swierczewski
@cswiercz
Jun 02 2016 19:54
@haiboqi What is your question?
haiboqi
@haiboqi
Jun 02 2016 19:55
I'm getting the BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
= PID 20763 RUNNING AT compute4-us
= EXIT CODE: 139
jasheetz
@jasheetz
Jun 02 2016 19:55
I am having those issues right now too. Doesn't seem to run anything. Or is this a deadlock condition?
haiboqi
@haiboqi
Jun 02 2016 19:55
after I tried to fix the problem of Nx not being divisible by size
Chris Swierczewski
@cswiercz
Jun 02 2016 19:56
@jasheetz @haiboqi Here's something I came up with when I took your errors and typed them into Google: https://wiki.mpich.org/mpich/index.php/Frequently_Asked_Questions#Q:_Why_did_my_application_exited_with_a_BAD_TERMINATION_error.3F
Just FYI. The internet is your friend!
Matt
@mostberg1
Jun 02 2016 19:56
Non-sequitur:
Chris Swierczewski
@cswiercz
Jun 02 2016 19:57
Now, let me take a look at your code
jasheetz
@jasheetz
Jun 02 2016 19:57
Just commenting that that was happening now.
Matt
@mostberg1
Jun 02 2016 19:57
When you get time, could you give some details about the timing/delivery of the final for remote students? Thank you
jasheetz
@jasheetz
Jun 02 2016 19:57
I tried to incorporate some changes from Sean, but still have trouble.
Chris Swierczewski
@cswiercz
Jun 02 2016 19:58

@haiboqi I have seen a lot of the following going on, which I talk about in my previous office hours:

size_t chunk_size = Nx / size;
if ((rank + 1) == size)
  chunk_size = Nx - chunk_size * (size - 1);

I claim that this code is unnecessary. Let me explain:

The Python script has already taken a "large" data array u representing the initial condition and broken it up into a bunch of chunks, uk. When heat_parallel is called by each process, the array uk already represents this chunk. So Process 0 will have its own version of uk, Process 1 will have its own version of uk, etc. So the code that you need to write in heat_parallel is entirely from the perspective of a single process. One consequence is that the data has already been chunked up and the only data you need to communicate to the "adjacent" processes are the boundary ukt data. Furthermore, each "chunk" is of length Nx.
You're receiving this "Bad Termination" error because of invalid memory access within your MPI program, a result of getting your indexing mixed up by interpreting uk as the "entire" data array instead of, correctly, as the chunk that has been assigned to the current process.
Make sense?
@mostberg1 I will send an email out later today. not much has changed since the first email I sent out plus the things I mentioned here: uwhpsc-2016/syllabus#37
haiboqi
@haiboqi
Jun 02 2016 20:02
what about Nt? are they the same as Nx?
Chris Swierczewski
@cswiercz
Jun 02 2016 20:02
Nt is just the number of time steps you take.
i.e. how many times the forward Euler iteration takes place.
haiboqi
@haiboqi
Jun 02 2016 20:03
So Nt is exactly the same for each process?
Matt
@mostberg1
Jun 02 2016 20:03
Thank you
Chris Swierczewski
@cswiercz
Jun 02 2016 20:03
Definitely. Take a closer look at the problem statement and the contents of test_homework4.py. You'll see that I pass the same exact data to all processes except for the initial condition data, which is already broken up into chunks.
haiboqi
@haiboqi
Jun 02 2016 20:05
ok, i kinda know what's wrong with my code now, thanks a lot!
Chris Swierczewski
@cswiercz
Jun 02 2016 20:05
Sure thing.
@jasheetz Let me see what you have now. Is it on GitHub?
jasheetz
@jasheetz
Jun 02 2016 20:05
yes, please!
I've tried many iterations.
Chris Swierczewski
@cswiercz
Jun 02 2016 20:06
Okay. First of all you have a little erroneous bit in Line 182. You definitely do not want to overwrite the previous / current iteration with the new data.
Right? Your rec_left/right data has just arrived from adjacent processes. You don't want to muck with the ukt data. You only want to update the uktp1 data appropriately.
Instead...
...you want to use the rec_left/right data directly in your expressions for computing the boundary values in Lines 185/186. However! Keep in mind which is which!
Sometimes bad variable notation / names can get you really confused.
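(A sketch of what "use the received data directly" can look like, assuming the usual explicit forward Euler update with coefficient r = dt/dx^2 and the rec_left/rec_right names from this exchange; the exact update formula in your code may differ:)

#include <stddef.h>

/* Sketch: boundary updates of the local chunk using the received ghost values
 * directly, without writing them into ukt first.  r = dt/(dx*dx) is assumed
 * to be the explicit Euler coefficient; Nx >= 2 is assumed. */
static void update_boundaries(double* uktp1, const double* ukt, size_t Nx,
                              double rec_left, double rec_right, double r)
{
  uktp1[0]    = ukt[0]    + r*(rec_left  - 2.0*ukt[0]    + ukt[1]);
  uktp1[Nx-1] = ukt[Nx-1] + r*(ukt[Nx-2] - 2.0*ukt[Nx-1] + rec_right);
}
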
jasheetz
@jasheetz
Jun 02 2016 20:08
so these should be after the two boundary calculations?
Chris Swierczewski
@cswiercz
Jun 02 2016 20:09
What do you mean when you say "these"?
jasheetz
@jasheetz
Jun 02 2016 20:09
I tried to keep it coming from the left side of the chunk. send_left/rec_left and coming from the right side send_right/rec_right.
by these, I guess I meant like lines 188-89 resetting the data.
Chris Swierczewski
@cswiercz
Jun 02 2016 20:10
Okay. So use these data directly in the computations for uktp1[0] and uktp1[Nx-1], respectively.
And do so without changing the values in ukt.
jasheetz
@jasheetz
Jun 02 2016 20:10
I tried that too, but wiped them out to go back to square one.
Chris Swierczewski
@cswiercz
Jun 02 2016 20:11
Because the values of ukt are needed for later.
No! Don't clear any data. You need to keep track of previous and current states with each iteration.
jasheetz
@jasheetz
Jun 02 2016 20:11
I meant went back to an earlier version of my heat.c
Chris Swierczewski
@cswiercz
Jun 02 2016 20:12
In a perhaps circuitous way I'm trying to explain that you just need to delete ukt[0] = gh_recleft;.
(And the corresponding rhs one.)
jasheetz
@jasheetz
Jun 02 2016 20:12
Understand.
I feel there is something else major going on though, with the sends and receives.
Chris Swierczewski
@cswiercz
Jun 02 2016 20:13
True.
Another thing that I noticed is that you're using the same variables req1 and req2 for all four non-blocking calls.
jasheetz
@jasheetz
Jun 02 2016 20:14
I thought they were supposed to be connected?
Chris Swierczewski
@cswiercz
Jun 02 2016 20:14
There are a couple of problems with this. However, it is best remedied by realizing that only the sends need to be non-blocking.
The corresponding receives can be blocking.
That is, I claim that you should use MPI_Recv instead of MPI_Irecv. If you draw the communication diagram (either on paper or in your head) you can see how this is sufficient to avoid deadlock.
That should be a big enough hint for now. I have to stop office hours here.
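(A minimal sketch of that hint, under the assumption of a simple pairwise exchange rather than the homework's ghost-cell traffic: because the non-blocking send is posted first on every rank, the blocking receive that follows always has a matching send already in flight, so no deadlock can occur.)

#include <mpi.h>
#include <stdio.h>

int main(int argc, char** argv)
{
  MPI_Init(&argc, &argv);

  int rank, size;
  MPI_Comm_rank(MPI_COMM_WORLD, &rank);
  MPI_Comm_size(MPI_COMM_WORLD, &size);

  /* pair ranks up: 0<->1, 2<->3, ...; an unpaired last rank just skips */
  int partner = (rank % 2 == 0) ? rank + 1 : rank - 1;

  if (partner >= 0 && partner < size) {
    double mine = (double) rank, theirs = -1.0;
    MPI_Request req;

    /* non-blocking send posted first: it returns immediately, so the
     * blocking receive below always finds a matching send in flight */
    MPI_Isend(&mine, 1, MPI_DOUBLE, partner, 0, MPI_COMM_WORLD, &req);
    MPI_Recv(&theirs, 1, MPI_DOUBLE, partner, 0, MPI_COMM_WORLD,
             MPI_STATUS_IGNORE);

    /* complete the send before the buffer goes out of scope */
    MPI_Wait(&req, MPI_STATUS_IGNORE);

    printf("rank %d got %f from rank %d\n", rank, theirs, partner);
  }

  MPI_Finalize();
  return 0;
}
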

Office Hours - End.

jasheetz
@jasheetz
Jun 02 2016 20:16
appreciate the help. Thanks!
Chris Swierczewski
@cswiercz
Jun 02 2016 20:16
no worries.
Yi Zheng
@qpskcn1
Jun 02 2016 21:10
Problem solved!! Thank you Chris!