Trevor Keller
@tkphd
Thank you, Jim! The idea's been on the back burner for a while, but some recent activity in CTCMS moved it up front. Learning a lot as I go.
Trevor Keller
@tkphd
NERSC has almost 10,000 Xeon Phi nodes. Trying to figure out how to get time on one...
Trevor Keller
@tkphd
Whew! Got readthedocs to use the GitHub README as an index file. That took a lot longer than expected, but the results are worth it. The PDF build there looks nice, too.
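For anyone trying the same thing, one generic way to have Sphinx treat a Markdown README as the index is recommonmark in conf.py; this is a sketch, not necessarily how the hiperc docs are actually configured:

```python
# conf.py -- generic sketch, not necessarily hiperc's actual configuration.
# Assumes README.md is visible in the docs source directory (e.g., a symlink).
from recommonmark.parser import CommonMarkParser

source_parsers = {'.md': CommonMarkParser}  # teach Sphinx to parse Markdown
source_suffix = ['.rst', '.md']
master_doc = 'README'                       # README.md becomes the index page
```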
Daniel Wheeler
@wd15
The Readthedocs C API documentation looks really nice.
A. M. Jokisaari
@amjokisaari
ooh, that Readthedocs page is nice.
drjamesawarren
@drjamesawarren
Pretty!
Ian Bell
@ianhbell
Readthedocs is somewhat painful to get up and running, but worth it. I've moved to that for some projects from my former life (e.g., http://achp.readthedocs.io/en/latest/)
Ian Bell
@ianhbell
Not sure if I missed it, but is there any way to checksum the results so I can be sure that the methods are yielding the same numbers, as well as being faster/slower?
Trevor Keller
@tkphd
There's a ticket open for quantitative comparisons (#21). Checksums might work, but differences in the generated headers would mess it up. We've been discussing 2-point statistics for the CHiMaD phase-field benchmarks, which might also be good. I don't have a solution yet.
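For a flavor of what 2-point statistics could look like here, a rough FFT-based autocorrelation sketch (illustrative only; not a comparison tool the project has adopted):

```python
import numpy as np

def two_point_autocorrelation(field):
    """Periodic two-point autocorrelation via FFT (Wiener-Khinchin)."""
    f = np.fft.fftn(field)
    corr = np.fft.ifftn(f * np.conj(f)).real / field.size
    return np.fft.fftshift(corr)  # zero lag at the array center

# Hypothetical outputs of two implementations of the same solver:
a = np.random.rand(64, 64)
b = a + 1.0e-6 * np.random.randn(64, 64)

# Compare the statistics instead of pointwise values:
print(np.allclose(two_point_autocorrelation(a),
                  two_point_autocorrelation(b), rtol=1e-4))
```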
A. M. Jokisaari
@amjokisaari
Should the results be identical down to the very last decimal place in your double-precision data, given numerical noise and things of that ilk? If not, would a checksum actually be a useful tool? ...This is a naive question.
drjamesawarren
@drjamesawarren
Naive questions are the hard ones (usually)
Trevor Keller
@tkphd
Not down to the last decimal place, but the PNG images are binned into unsigned chars, so the data should be identical. However, since I'm using different compilers with different math libraries, and some hardware instructions are not available in every case, a statistical approach to verifying the results would be preferable. I have some lit review to do before implementing the comparison tools.
A. M. Jokisaari
@amjokisaari
ah right, I forgot that you want to do comparisons on the PNGs, not the raw data.
Trevor Keller
@tkphd
I mean, comparing the PNGs would be easier. A quick check shows that the SHA-256 checksums of PNGs produced every ten-thousandth timestep by the CPU-based codes, compiled with the same compiler on the same system, are indeed identical through t=100,000. I still suspect the comparison is platform-dependent, but I'll keep exploring the possibility. I'll also note that the goal is, at some point, to compare the raw data rather than the PNG output.
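The checksum pass itself is simple enough; a minimal sketch, with hypothetical names for the two output directories:

```python
import hashlib
from pathlib import Path

def sha256sum(path):
    """SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 16), b""):
            h.update(chunk)
    return h.hexdigest()

# Hypothetical output directories from two runs of the same code:
for png in sorted(p.name for p in Path("run-a").glob("diffusion.*.png")):
    match = sha256sum(Path("run-a") / png) == sha256sum(Path("run-b") / png)
    print(png, "OK" if match else "MISMATCH")
```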
Trevor Keller
@tkphd
Attention: I have renamed the repository from phasefield-accelerator-benchmarks to hiperc. This chatroom is in the process of migrating to a new location. I will invite each of you to join after the migration completes. Thank you for your input so far, and I'll see you on the other side.
Andrew Reid
@reid-a
In the OOF regression tests, we use a floating-point comparison with a tolerance, which might be a superior solution: exact comparisons of binned data can be problematic when one version lands just inside a bin and its counterpart just outside. How to pick the tolerance, and whether to make it relative or absolute (or first one, then the other), is a whole different can of worms.
Daniel Wheeler
@wd15
Maybe use skimage to read the PNG images as NumPy arrays, then use numpy.allclose with atol and rtol parameters.
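A sketch of that suggestion; the file paths are illustrative, and the tolerances are arbitrary starting points subject to the can of worms above:

```python
import numpy as np
from skimage import io  # scikit-image

# Illustrative paths: the same PNG written by two implementations.
ref = io.imread("reference/diffusion.0100000.png").astype(float)
new = io.imread("candidate/diffusion.0100000.png").astype(float)

# atol=2.0 reads as "within two 8-bit gray levels".
if np.allclose(new, ref, rtol=1e-5, atol=2.0):
    print("images agree within tolerance")
else:
    print("max abs difference:", np.abs(new - ref).max())
```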
Daniel Wheeler
@wd15
@reid-a, just increase rtol and atol until all the tests pass!
A. M. Jokisaari
@amjokisaari
:+1:
Andrew Reid
@reid-a
In the case of actual regression testing, that's not necessarily crazy, since we are looking for changes in behavior, not absolute correctness.
Adjust tolerances until the Linux test data passes on the Mac and vice versa.
A. M. Jokisaari
@amjokisaari
by the way, what's the reason for the name change?
"High Performance Computing Strategies for Boundary Value Problems"
Trevor Keller
@tkphd
"hiperc" is easier to type and remember than "phasefield-accelerator-benchmarks", and HPCS4BVP (sorry, on a phone) captures the goals and scope of the project. Phase field is my preferred application, but anybody solving diffusion-like equations will find this useful.
A. M. Jokisaari
@amjokisaari
yeah, gotcha. what is the pronunciation? "hyper-see" ? "hi-perk" ?
drjamesawarren
@drjamesawarren
Hype-rock
Trevor Keller
@tkphd
Hyper-see is my preference, yeah.
@reid-a, thanks for the regression suggestion. Is the technique documented in the OOF manual?
Andrew Reid
@reid-a
@tkphd Not really; the manual is mostly about how to use it. The critical function is "fp_file_compare" in the utils directory. You can drill down to it on the GitHub repo.
Dan Lewis
@lucentdan
Thanks for the invite. Looking into the phase field accelerator at this time.
Dan Lewis
@lucentdan
Think this will be useful for the new generation of HPC planned at RPI 2018+
Trevor Keller
@tkphd
No problem, @lucentdan. Welcome!
A. M. Jokisaari
@amjokisaari
ok. The diffusion code runs on KNL!

runlog.csv results:

```
iter    sim_time  wrss      conv_time  step_time  IO_time   soln_time  run_time
0       0         0         0          0.188137   0.057628  0          0.246949
10000   10000     0.000286  4.493929   1.365069   0.12165   0.005621   6.65671
20000   20000     0.000574  8.895637   2.39418    0.187032  0.006831   12.781632
30000   30000     0.000863  13.398053  3.401486   0.255395  0.008045   19.002456
40000   40000     0.001152  17.789476  4.41478    0.327418  0.009311   25.117928
50000   50000     0.001442  22.126154  5.438066   0.402769  0.012329   31.182672
60000   60000     0.001732  26.484548  6.458279   0.478117  0.013839   37.286988
70000   70000     0.002023  30.873932  7.447651   0.555889  0.015224   43.361118
80000   80000     0.002313  35.321395  8.449465   0.635728  0.016628   49.513117
90000   90000     0.002604  39.700359  9.443941   0.720135  0.018102   55.588724
100000  100000    0.002895  44.014251  10.427562  0.803771  0.019487   61.58906
```

diffusion.0100000.png (attached image)
that's the final result.

I'm watching the output of top while running this, and I'm getting

```
  PID USER      PR  NI    VIRT    RES   SHR S  %CPU %MEM     TIME+ COMMAND
22566 jokisaar  20   0 17.009g  35144  2232 R 25600  0.0 154:24.30 diffusion
```
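A quick way to digest a runlog like that (a sketch assuming the whitespace-delimited format pasted above; by those numbers the convolution accounts for roughly 71% of the wall time at the final step):

```python
import pandas as pd

# Sketch: summarize where the wall time goes. Assumes the
# whitespace-delimited runlog format pasted above.
log = pd.read_csv("runlog.csv", sep=r"\s+")

final = log.iloc[-1]
for phase in ("conv_time", "step_time", "IO_time", "soln_time"):
    print(f"{phase}: {100 * final[phase] / final['run_time']:.1f}% of run_time")
```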
Andrew Reid
@reid-a
Where's the +1 button on this thing?
+1!
Trevor Keller
@tkphd
```
  PID PR  NI    VIRT   RES  SHR S  %CPU %MEM     TIME+ COMMAND
22566 20   0 17.009g 35144 2232 R 25600  0.0 154:24.30 diffusion
```
@reid-a type :+1:
Andrew Reid
@reid-a
:+1:
A. M. Jokisaari
@amjokisaari
@tkphd , ooh formatting. I shall endeavor to do that the next time
Trevor Keller
@tkphd
:+1:
A. M. Jokisaari
@amjokisaari
dumb me, but how do I interpret that %CPU? That's indicating threaded running, right? (MPI would give multiple lines in top)
Trevor Keller
@tkphd
Thank you so much for the KNL time and data! Looks like my code is keeping all 256 hardware threads busy; how efficiently it uses them is the next question, so we'll iterate :smile:
A. M. Jokisaari
@amjokisaari
haha. you are welcome! looking forward to further testing and really seeing how the phase field benchmarks work too!
Trevor Keller
@tkphd
Yes: %CPU of 100% is one hardware thread, so 25.6e3 means 256 threads are busy. With four hyperthreads per core, that's every core on a 64-core KNL.
A. M. Jokisaari
@amjokisaari
so these KNL nodes have 64 cores.