These are chat archives for uwhpsc-2016/uwhpsc-2016
MPI_Request req_left, req_right;
left_ghost and right_ghost are the ones we'll update for each time loop.
// set and communicate boundary data
left_ghost = ukt[0];
right_ghost = ukt[Nx-1];
MPI_Isend(&left_ghost, 1, MPI_DOUBLE, left_proc, 0, comm, &req_left); for example
MPI_Recv(&left_ghost, 1, MPI_DOUBLE, left_proc, 0, comm, MPI_STATUS_IGNORE);
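Here is a minimal, self-contained sketch of that send/receive pattern. The neighbor setup (left_proc/right_proc with MPI_PROC_NULL at the ends), the dummy chunk data, and the choice to send the boundary entries of ukt directly are my own illustrative assumptions, not the required heat_parallel code:

/* ghost_exchange.c -- sketch of a 1D ghost-cell exchange.
   Compile: mpicc ghost_exchange.c -o ghost_exchange
   Run:     mpiexec -n 4 ./ghost_exchange                    */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);
    MPI_Comm comm = MPI_COMM_WORLD;

    int rank, size;
    MPI_Comm_rank(comm, &rank);
    MPI_Comm_size(comm, &size);

    /* neighbors; MPI_PROC_NULL turns the end-of-domain sends/receives into no-ops */
    int left_proc  = (rank == 0)        ? MPI_PROC_NULL : rank - 1;
    int right_proc = (rank == size - 1) ? MPI_PROC_NULL : rank + 1;

    /* a tiny stand-in for this process's chunk */
    const int Nx = 4;
    double ukt[4];
    for (int i = 0; i < Nx; ++i)
        ukt[i] = (double)rank;

    double left_ghost = 0.0, right_ghost = 0.0;
    MPI_Request req_left, req_right;

    /* non-blocking sends of our own boundary values ... */
    MPI_Isend(&ukt[0],      1, MPI_DOUBLE, left_proc,  0, comm, &req_left);
    MPI_Isend(&ukt[Nx - 1], 1, MPI_DOUBLE, right_proc, 0, comm, &req_right);

    /* ... blocking receives of the neighbors' boundary values into the ghosts ... */
    MPI_Recv(&left_ghost,  1, MPI_DOUBLE, left_proc,  0, comm, MPI_STATUS_IGNORE);
    MPI_Recv(&right_ghost, 1, MPI_DOUBLE, right_proc, 0, comm, MPI_STATUS_IGNORE);

    /* ... and complete the sends before ukt is overwritten later in a time loop */
    MPI_Wait(&req_left,  MPI_STATUS_IGNORE);
    MPI_Wait(&req_right, MPI_STATUS_IGNORE);

    printf("rank %d: left_ghost = %g, right_ghost = %g\n",
           rank, left_ghost, right_ghost);

    MPI_Finalize();
    return 0;
}

Because every process posts its non-blocking sends before entering the blocking receives, no process sits in MPI_Recv waiting on a send that was never started.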
Change req at the top to one request per non-blocking call, e.g. req_left and req_right.
uk in the function
MPI_Wait and/or MPI_Waitall.
Send uk[0] and uk[Nx-1] to the left and right processes, respectively.
uk[i] == 1 for all i. (This happens only on the rank == 0 process.)
uk"chunk" is all populated with zeros.
ukis initially all zeros.
When heat_parallel is called by the Python script an initial condition is passed. (Just like in heat_serial.) The initial condition given in this particular problem is a Heaviside-like function.
What this means for heat_parallel is that process 0 receives ones and the other processes receive zeros.
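Just as a sketch of what that means from inside heat_parallel (assuming rank, Nx, and the incoming chunk uk are in scope; this is illustrative, not something the assignment asks you to write):

#include <assert.h>

/* per-process view of the Heaviside-like initial condition described above:
   process 0's chunk is all ones, every other process's chunk is all zeros */
double expected = (rank == 0) ? 1.0 : 0.0;
for (size_t i = 0; i < Nx; ++i)
    assert(uk[i] == expected);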
To compute uktp1 you need data from the left process first.
At the end of each iteration you need to update ukt with the "old values" from uktp1, that is, "copy them" to ukt. I do the equivalent of that in Lines 66-68.
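In code, that update-then-copy step might look roughly like the following. The names ukt/uktp1/left_ghost/right_ghost follow the discussion above, but the coefficient r = kappa*dt/dx^2 and the exact stencil are my assumptions; use whatever formula the assignment specifies:

/* explicit finite-difference update of this process's chunk; the ghost
   values stand in for the neighbors' entries at i-1 and i+1 at the ends */
double r = kappa * dt / (dx * dx);
for (size_t i = 0; i < Nx; ++i) {
    double left  = (i == 0)      ? left_ghost  : ukt[i - 1];
    double right = (i == Nx - 1) ? right_ghost : ukt[i + 1];
    uktp1[i] = ukt[i] + r * (left - 2.0 * ukt[i] + right);
}

/* end of the time step: copy the new values into ukt so they become the
   "old values" for the next iteration (the copy described above) */
for (size_t i = 0; i < Nx; ++i)
    ukt[i] = uktp1[i];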
You send ukt boundary data to the left process, for example, because the left process needs this data to compute its version of uk, that is, to compute its uktp1.
The MPI_Wait statements should occur after the code you have written at the end of the loop iteration. This is because you need to wait for the data to be sent before you can use it the way you do.
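Putting the pieces together, one time-loop iteration could be ordered like this (Nt is assumed to be the number of time steps, and the other names follow the sketches above; exactly where the waits land in your code depends on which buffers your non-blocking calls reference, the rule is only that a request must complete before its buffer is reused or overwritten):

for (size_t k = 0; k < Nt; ++k) {
    /* 1. exchange boundary data with the neighbors */
    MPI_Isend(&ukt[0],      1, MPI_DOUBLE, left_proc,  0, comm, &req_left);
    MPI_Isend(&ukt[Nx - 1], 1, MPI_DOUBLE, right_proc, 0, comm, &req_right);
    MPI_Recv(&left_ghost,   1, MPI_DOUBLE, left_proc,  0, comm, MPI_STATUS_IGNORE);
    MPI_Recv(&right_ghost,  1, MPI_DOUBLE, right_proc, 0, comm, MPI_STATUS_IGNORE);

    /* 2. compute uktp1 from ukt and the ghost values (see the stencil sketch) */

    /* 3. complete the sends before step 4 overwrites ukt, the send buffer */
    MPI_Wait(&req_left,  MPI_STATUS_IGNORE);
    MPI_Wait(&req_right, MPI_STATUS_IGNORE);

    /* 4. copy uktp1 into ukt for the next iteration */
    for (size_t i = 0; i < Nx; ++i)
        ukt[i] = uktp1[i];
}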
@haiboqi I have seen a lot of the following going on, which I talked about in my previous office hours:
size_t chunk_size = Nx / size;
if ((rank + 1) == size)
    chunk_size = Nx - chunk_size * (size - 1);
I claim that this code is unnecessary. Let me explain:
I took the array u representing the initial condition and broke it up into a bunch of chunks, so that when heat_parallel is called by each process the array uk already represents this chunk. So Process 0 will have its own version of uk, Process 1 will have its own version of uk, etc. So the code that you need to write in heat_parallel is entirely from the perspective of a single process. One consequence is that the data has already been chunked up and the only data you need to communicate to the "adjacent" processes are the boundary ukt data. Furthermore, each "chunk" is of length Nx.
That code treats uk as the "entire" data array, not correctly as the chunk that has been assigned to Process 0.
Take a look at test_homework4.py. You'll see that I pass the exact same data to all processes except for the initial condition data, which is already broken up into chunks.
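To make the single-process perspective concrete, a hypothetical shape for the function might be the following; the real signature is whatever the assignment's header declares, the point is only that uk and Nx already describe this process's chunk:

/* hypothetical signature, for illustration only: by the time a process is
   inside this function, uk holds just that process's chunk and Nx is the
   length of that chunk, so no chunk-size arithmetic with rank/size is needed */
void heat_parallel(double *uk, size_t Nx, double dx, double dt, size_t Nt,
                   MPI_Comm comm);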
The rec_left/right data has just arrived from adjacent processes. You don't want to muck with the ukt data (as in ukt = gh_recleft;). You only want to update the ghost values.
It looks like you are reusing req2 for all four non-blocking calls; each non-blocking call needs its own MPI_Request (or its own slot in an array of requests).
MPI_Irecv. If you draw the communication diagram (either on paper or in your head) you can see how this is sufficient to avoid deadlock.
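One way to keep that request bookkeeping straight is to give every outstanding non-blocking call its own slot in a request array and complete them together. MPI_Waitall is standard MPI; using it instead of separate MPI_Wait calls is just one option, and the variable names again follow the sketches above:

MPI_Request reqs[2];

/* each non-blocking call gets its own request object */
MPI_Isend(&ukt[0],      1, MPI_DOUBLE, left_proc,  0, comm, &reqs[0]);
MPI_Isend(&ukt[Nx - 1], 1, MPI_DOUBLE, right_proc, 0, comm, &reqs[1]);

/* ... receives and the uktp1 computation go here ... */

/* complete both sends in one call before their buffers are reused */
MPI_Waitall(2, reqs, MPI_STATUSES_IGNORE);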