Automated thermodynamic database development within the CALPHAD method using pycalphad
TRACE:root:Probability for initial parameters
/home/semikk/anaconda3/lib/python3.7/site-packages/pycalphad/core/lower_convex_hull.py:136: RuntimeWarning: invalid value encountered in double_scalars
result_array_GM_values[it.multi_index] = new_energy / molesum
TRACE:root:Likelihood - 6.71s - Thermochemical: -1004.646. ZPF: -3424.380. Activity: -14400.540. Total: -18829.566.
TRACE:root:Proposal - lnprior: 0.0000, lnlike: -18829.5658, lnprob: -18829.5658
Is the Total close to your -15,500 or -16,000? The -lnprob values correspond only to the accepted parameters that are in the Markov chains.
Since that coefficient multiplies T in the model, that's actually a really big change. A 40x change (-5 to -200) in this parameter would give a decrease in the energy of 40 kJ at 1000 K. Usually a good rule of thumb is that the b coefficient should be roughly 1/3000th of the a coefficient. So for a = -14000, b ~= -5.
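That rule of thumb is easy to check numerically (a toy sketch; the a + b*T parameter form and the 1/3000 ratio are taken from the discussion above, the helper name is made up):

```python
# Rule of thumb: for an excess parameter of the form a + b*T,
# |b| should be roughly |a| / 3000.
def suggested_b(a):
    """Suggested magnitude for the b coefficient, given the a coefficient."""
    return a / 3000.0

a = -14000.0
print(f"a = {a:.0f} -> b ~= {suggested_b(a):.1f}")  # prints: a = -14000 -> b ~= -4.7
```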
I'd double check your activity data, plotting with the pycalphad link I sent the other day, if possible.
I noticed in the files you sent me before that there are some unusual values, e.g. Si-acr-75cha_1.json has

"X_SI": [0.650, 0.600, 0.500, 0.399, 0.303, 0.200, 0.099, 0.790, 0.700]

and

"values": [[[0.597, 0.513, 0.356, 0.196, 0.104, 0.078, 0.033, 0.004, 0.080]]],
For a reference state of pure Si, the activities should follow the trend in the mole fraction of Si. The last two compositions are for X(SI)=0.790 and X(SI)=0.700, which don't match up well with the activities in Si-acr-75cha_2.json and Si-acr-75cha_3.json (maybe the activities are just out of order in this one?)
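The mismatch is easy to see with a few lines of plain Python over the numbers quoted above (a quick check written for this message, not part of ESPEI):

```python
# Activity of Si (reference state: pure Si) should increase with X(SI).
x_si = [0.650, 0.600, 0.500, 0.399, 0.303, 0.200, 0.099, 0.790, 0.700]
acr  = [0.597, 0.513, 0.356, 0.196, 0.104, 0.078, 0.033, 0.004, 0.080]

# Sort the (composition, activity) pairs by composition and flag any
# place where the activity decreases while X(SI) increases.
pairs = sorted(zip(x_si, acr))
violations = [(x1, a1, x2, a2)
              for (x1, a1), (x2, a2) in zip(pairs, pairs[1:])
              if a2 < a1]
for x1, a1, x2, a2 in violations:
    print(f"activity drops from {a1} at X(SI)={x1} to {a2} at X(SI)={x2}")
```

Running this flags exactly the last two compositions, X(SI)=0.700 and X(SI)=0.790, consistent with the activities being out of order.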
For the direct parameter generation, it's unlikely that it will be supported because the ionic liquid requires doing the energy minimization step to determine the site ratios of the cation and anion sublattices which allow the liquid to charge balance.
On the other hand, because the ionic liquid model is supported by pycalphad, you can use ESPEI to try to optimize the parameter values via MCMC if you supply a parameterized model with reasonable starting values.
pycalphad doesn't support the ionic solid model yet (we had a closed PR that attempted to address it here: pycalphad/pycalphad#222). We have some solver changes upcoming that might make it worth addressing again.
I think that it would be possible to support parameter generation for ionic solids, but that's not a high priority on our current roadmap for ESPEI and it would be pending support for ionic solids from pycalphad.
When I try to fit using MCMC I get an error. Here is how I started (datasets are stored in a folder called datasets):
where espei-in.yml looks like this:
system:
  phase_models: mgsnX_input.json
  datasets: datasets
generate_parameters:
  excess_model: linear
  ref_state: SGTE91
output:
  output_db: mgsn.tdb
and mcmc.yml looks like this:
system:
  phase_models: mgsnX_input.json
  datasets: datasets
mcmc:
  iterations: 1000
  input_db: mgsn.tdb
output:
  output_db: mgsn_mcmc.tdb
I get the following error:
/home/davidkleiven/.local/lib/python3.7/site-packages/pycalphad/codegen/callables.py:97: UserWarning: State variables in build_callables are not {N, P, T}, but {T, P}. This can lead to incorrectly calculated values if the state variables used to call the generated functions do not match the state variables used to create them. State variables can be added with the additional_statevars argument.
  "additional_statevars argument.".format(state_variables))
Traceback (most recent call last):
File "/home/davidkleiven/.local/bin/espei", line 11, in <module>
load_entry_point('espei', 'console_scripts', 'espei')()
File "/home/davidkleiven/Documents/ESPEI/espei/espei_script.py", line 311, in main
run_espei(input_settings)
File "/home/davidkleiven/Documents/ESPEI/espei/espei_script.py", line 260, in run_espei
approximate_equilibrium=approximate_equilibrium,
File "/home/davidkleiven/Documents/ESPEI/espei/optimizers/opt_base.py", line 36, in fit
node = self._fit(symbols, datasets, args, kwargs)
File "/home/davidkleiven/Documents/ESPEI/espei/optimizers/opt_mcmc.py", line 238, in _fit
self.predict(initial_guess, ctx)
File "/home/davidkleiven/Documents/ESPEI/espei/optimizers/opt_mcmc.py", line 305, in predict
non_eq_thermochemical_prob = calculate_non_equilibrium_thermochemical_probability(parameters=np.array(params), *non_equilibrium_thermochemical_kwargs)
File "/home/davidkleiven/Documents/ESPEI/espei/error_functions/non_equilibrium_thermochemical_error.py", line 271, in calculate_non_equilibrium_thermochemical_probability
points=data['calculate_dict']['points'])[output]
File "/home/davidkleiven/Documents/ESPEI/espei/shadowfunctions.py", line 55, in calculate
largest_energy=float(1e10), fake_points=fp)
File "/home/davidkleiven/.local/lib/python3.7/site-packages/pycalphad/core/calculate.py", line 190, in _compute_phase_values
param_symbols, parameter_array = extract_parameters(parameters)
File "/home/davidkleiven/.local/lib/python3.7/site-packages/pycalphad/core/utils.py", line 361, in extract_parameters
parameter_array_lengths = set(np.atleast_1d(val).size for val in parameters.values())
AttributeError: 'NoneType' object has no attribute 'values'
Has anyone seen this before, or can anyone point me in the right direction? Thanks in advance for your help.
@zhanghaihui_gitlab ESPEI computes the log-likelihood for every type of data by finding the value of the log probability density function of a normal distribution centered at zero with a standard deviation of σ, i.e. norm(loc=0, scale=σ).logpdf(error), where error is the difference between the expected and calculated value for a particular set of parameters.
Since we have multiple types of data, a default value of σ_initial exists for all supported data types to handle the different scales of error and to make the likelihoods comparable even though the errors are not. The data_weights setting modifies the σ used in the normal distribution by making σ = σ_initial/data_weight[data_type] for each data_type like ZPF or HM. By default, all data_weight values are 1.0, i.e. σ = σ_initial.
For example: consider that we have two experiments: one for enthalpy of formation data and one for entropy of formation data. If we calculate the error as the difference between the expected enthalpy of formation and the calculated enthalpy of formation, we might find the difference is 10 J/mol. If we do the same for the entropy of formation, we might find that the error is 5 J/K-mol.
Now we need to determine the likelihood for these two errors, one with a difference between the expected/calculated value of 10 and another of 5. Using our existing knowledge about the relative magnitudes of formation enthalpy vs. entropy, we know that an error of 10 J/mol in the enthalpy of formation is very close to the solution (i.e. high likelihood), while an error of 5 J/K-mol in the entropy of formation is probably not that close to the solution (i.e. low likelihood). The σ_initial values are set so that the likelihoods of these two cases are comparable, but you might not agree with my choice of σ_initial, or you might want to value a particular type of data more heavily than I do. data_weights lets you do that.
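For example, if you wanted ZPF errors to count twice as heavily, the mcmc section of the input YAML could look something like this (a sketch only; check the ESPEI input schema docs for the exact accepted keys and defaults):

```yaml
mcmc:
  iterations: 1000
  input_db: mgsn.tdb
  data_weights:
    ZPF: 2.0   # sigma_ZPF = sigma_initial / 2.0, stronger penalty on ZPF error
    HM: 1.0    # unchanged
```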
I can update the docs for data_weights with an edited version of this to help answer this question in the future.