I'm 1) measuring my current state and 2) sharing my status (to create a sense of accountability). My plan is to 3) revisit this metric and evaluate the rate of change.
I could potentially go back in my commit history and construct this measurement, but I didn't previously have the infrastructure in place to make the measurement easy.
I'll list the issues below. You're welcome to contribute by tackling any of them, but help with the Latex parsing and with formal proofs of the inference rules would be especially valuable, since those are areas where I have less skill.
Tasks associated with step validation
Currently the only way to edit the implementation of an inference rule is to edit https://github.com/allofphysicsgraph/proofofconcept/blob/gh-pages/v7_pickle_web_interface/flask/validate_steps_sympy.py directly. I'm not sure whether there's a clearer method of verifying the transformation of the input ASTs.
Also, applying the AST transformation to the inputs would save the user the work of specifying the output Latex/SymPy representations and would improve correctness. I haven't explored this yet.
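To make that idea concrete, here's a minimal sketch of deriving the output expression by applying an inference rule to the input, rather than asking the user to supply the output. The rule name and function signature are hypothetical illustrations on my part, not the current validate_steps_sympy.py API:

```python
from sympy import Eq, Symbol

# Hypothetical implementation of the "add X to both sides" inference rule
# as an AST transformation: given the input equation and the term to add,
# produce the output equation directly.
def add_x_to_both_sides(input_eq, x):
    return Eq(input_eq.lhs + x, input_eq.rhs + x)

a, b, c = Symbol('a'), Symbol('b'), Symbol('c')
output = add_x_to_both_sides(Eq(a, b), c)
print(output)  # Eq(a + c, b + c)
```

If the output is generated this way, validation reduces to checking that the user-supplied output (if any) matches the generated one.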
- replace string variables in the ASTs with their PDG symbol IDs in https://github.com/allofphysicsgraph/proofofconcept/blob/gh-pages/v7_pickle_web_interface/flask/data.json
- verify that the SymPy representations are consistent with the Latex strings. Printing the SymPy as Latex and comparing it with the stored Latex will require converting the PDG symbol IDs back to Latex strings
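The second task above can be sketched as a substitution pass before printing. The ID-to-name table here is invented for illustration; the real entries live in data.json:

```python
from sympy import Symbol, latex

# Hypothetical ID-to-Latex-name table (IDs and names invented for
# illustration; the real mapping is in data.json).
pdg_symbol_names = {'9495': 'x', '4931': 'y'}

def pdg_expr_to_latex(expr):
    """Replace PDG symbol-ID Symbols with their Latex names, then print."""
    replacements = {Symbol('pdg' + sym_id): Symbol(name)
                    for sym_id, name in pdg_symbol_names.items()}
    return latex(expr.subs(replacements))

expr = Symbol('pdg9495') + Symbol('pdg4931')
print(pdg_expr_to_latex(expr))  # x + y
```

The consistency check would then compare this printed string (modulo whitespace and formatting differences) against the stored Latex.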
I have a bunch of recorded talks and slides from the CICM 2020 meeting. I realized I could share them by running
python3 -m http.server 8000
(the Python 2 equivalent is python -m SimpleHTTPServer 8000) and coordinating with you regarding specific windows in which to download the content. The CICM materials total 10GB.
Let me know if and when you're interested in downloading the materials. Then I'll make sure the content is temporarily accessible via a web server (running on my DigitalOcean droplet).
In the process of fixing expressions and steps in the PDG database, I encountered a novel challenge.
I've assumed every step in Latex corresponds to a step that can be checked in SymPy.
https://physicsderivationgraph.blogspot.com/2020/09/evaluating-definite-integrals-for.html
I'm not sure what to do when that isn't the case. In that post I describe an intermediate step that is commonly included in definite integrals but not supported in SymPy.
Idea: apply mathematical operations on the IDs in the PDG.
If I have an ID for some value x, an ID for y, and an ID for +, then the ID of x + y should be some distinct relational mapping applied to those three IDs. I think this can be done with Gödel numbers.
The main benefit is improved confidence through a reduced number of string transformations. My guess is that doing operations on integer IDs will yield fewer errors in the long term.
To summarize, that would mean each expression has an ID, and each symbol in an expression has an ID, and permutations of symbols have IDs?
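Here's a toy sketch of the Gödel-numbering idea, using prime-power encoding so that distinct (operator, operands) combinations map to distinct integers. The IDs are made up for illustration; note that with the large IDs the PDG actually uses, these integers would get astronomically big, so a pairing function might scale better:

```python
from sympy import prime, factorint

def goedel_id(operator_id, operand_ids):
    """Encode an operator ID plus its operand IDs as one integer using
    prime-power (Goedel) encoding; distinct inputs give distinct outputs."""
    n = 2 ** operator_id
    for position, operand_id in enumerate(operand_ids):
        n *= prime(position + 2) ** operand_id  # primes 3, 5, 7, ...
    return n

def decode(n):
    """Recover (operator_id, operand_ids) by factoring the integer."""
    factors = factorint(n)
    operator_id = factors.pop(2, 0)
    operand_ids = [factors[p] for p in sorted(factors)]
    return operator_id, operand_ids

# Toy IDs: '+' = 3, 'x' = 2, 'y' = 4
n = goedel_id(3, [2, 4])
print(n)  # 2**3 * 3**2 * 5**4 = 45000
assert decode(n) == (3, [2, 4])
```

Because the encoding is invertible (via factorization), the combined ID carries the full structure of the expression, which is what would let the PDG operate on integers instead of strings.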
/change of topic/
I encountered a few issues with SymPy today. I'll explain what I found and what I plan to do about it:
I have an expression (1158485859) that, in Latex, is
\frac{-\hbar^2}{2m} \nabla^2 = {\cal H}
The \nabla^2
is represented in SymPy as Laplacian()
, though that operator requires an argument to act on.
My solution: leave the SymPy representation as Pow(Symbol('nabla'), Integer(2))
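As a sketch of that workaround, the expression can be built with the placeholder power standing in for the Laplacian (the variable names here are my own; only the structure matters):

```python
from sympy import Symbol, Integer, Pow, Eq

# Placeholder: represent nabla^2 as a plain symbolic power, since
# sympy.vector's Laplacian requires an operand to act on.
nabla_squared = Pow(Symbol('nabla'), Integer(2))

hbar, m, H = Symbol('hbar'), Symbol('m'), Symbol('H')
# Placeholder form of expression 1158485859: -hbar^2/(2m) nabla^2 = H
expr_1158485859 = Eq(-hbar**2 / (2 * m) * nabla_squared, H)
```

The trade-off is that the placeholder is not semantically a differential operator, so any step validation involving it would treat nabla as an ordinary symbol.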
Also on the topic of vectors, I encountered the Latex expression
\vec{p} \cdot \vec{F}(\vec{r}, t) = a
In order to convert that to SymPy, I'd need to specify a basis for \vec{p}
(e.g., from sympy.vector import CoordSys3D
). I know how to do that; for example,
>>> N = CoordSys3D('N')
>>> p = Symbol('p_x')*N.i + Symbol('p_y')*N.j + Symbol('p_z')*N.k
>>> F = Symbol('F_x')*N.i + Symbol('F_y')*N.j + Symbol('F_z')*N.k
>>> p.dot(F)
F_x*p_x + F_y*p_y + F_z*p_z
However, that doesn't seem to extend to functions:
>>> p.dot(Function(F)(Symbol('r'), Symbol('t')))
Traceback (most recent call last):
...
TypeError: expecting string or Symbol for name
My solution for now: leave the SymPy representation incorrect, using "multiplication" instead of "dot".
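One possible alternative (an assumption on my part, not what the PDG database currently stores) is to make each component of \vec{F} a scalar Function of (r, t); then the dot product is well-defined in SymPy:

```python
from sympy import Symbol, Function
from sympy.vector import CoordSys3D

N = CoordSys3D('N')
r, t = Symbol('r'), Symbol('t')
p = Symbol('p_x')*N.i + Symbol('p_y')*N.j + Symbol('p_z')*N.k

# Workaround: instead of Function(F)(r, t) on the whole vector, define
# each component of F as a scalar function of (r, t).
F = (Function('F_x')(r, t)*N.i +
     Function('F_y')(r, t)*N.j +
     Function('F_z')(r, t)*N.k)

result = p.dot(F)
```

The cost is that the componentwise form is more verbose and bakes the choice of basis into the stored representation.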
Regarding the time invested: it took 2 hours to finally work out the relevant details.
The good news is that I'm in the process of identifying systemic issues that are really design considerations, so once I decide what to do about them, they're "fixed" across the whole project.
For example, nabla^2
shows up 22 times in the current database. The observation about Laplacian applies to all of them at once, and the fix is a text-based find-and-replace.
On a broader note, I'm reconsidering having a data structure that assumes a 1-to-1 mapping of Latex to SymPy.
I'd like to talk about these issues with folks who know SymPy, Latex, and enough of the relevant physics math :)