I'll write up some more detailed instructions someplace else (GitHub maybe, or our docs), but in general, restoring sequencing files deleted from a sample involves the table sample_sequencingobject, which stores a single row recording which sample is associated with which sequencing object:
select * from sample_sequencingobject;
+----+---------------------+-----------+---------------------+
| id | created_date        | sample_id | sequencingobject_id |
+----+---------------------+-----------+---------------------+
|  1 | 2022-01-11 12:38:16 |         1 |                   1 |
+----+---------------------+-----------+---------------------+
The table sample_sequencingobject_AUD stores all modifications made to the sample_sequencingobject table:
select * from sample_sequencingobject_AUD;
+----+---------------------+-----------+---------------------+-----+---------+
| id | created_date        | sample_id | sequencingobject_id | REV | REVTYPE |
+----+---------------------+-----------+---------------------+-----+---------+
|  1 | 2022-01-11 12:38:16 |         1 |                   1 |   6 |       0 |
|  1 | 2022-01-11 12:38:16 |         1 |                   1 |  47 |       2 |
+----+---------------------+-----------+---------------------+-----+---------+
Here, REV is a unique identifier for every operation performed (e.g., create, update, delete). REVTYPE defines the specific operation performed (a REVTYPE of 2 means a deletion).
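For example, a minimal sketch (using only the sample_sequencingobject_AUD columns shown above) that lists the full change history for one sample/sequencing object pair, oldest change first; the sample_id and sequencingobject_id values of 1 are just the example values from this thread:
-- list every audited change for one sample/sequencing object link, in order
select id, created_date, sample_id, sequencingobject_id, REV, REVTYPE
from sample_sequencingobject_AUD
where sample_id = 1 and sequencingobject_id = 1
order by REV;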
So if you look at:
select * from sample_sequencingobject_AUD where REVTYPE = 2;
+----+---------------------+-----------+---------------------+-----+---------+
| id | created_date        | sample_id | sequencingobject_id | REV | REVTYPE |
+----+---------------------+-----------+---------------------+-----+---------+
|  1 | 2022-01-11 12:38:16 |         1 |                   1 |  47 |       2 |
+----+---------------------+-----------+---------------------+-----+---------+
You can see all delete operations on the sample_sequencingobject table. You can also see that the deleted information is still saved in this table (e.g., the specific sample_id=1 and sequencingobject_id=1 defining which sequencing object used to be linked with which sample).
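On a larger IRIDA instance you will probably want to narrow this down to the sample you care about. A hedged sketch using the same table and columns as above (sample_id = 1 is just the example value here):
-- find deleted sequencing object links for a single sample
select id, created_date, sample_id, sequencingobject_id, REV, REVTYPE
from sample_sequencingobject_AUD
where sample_id = 1 and REVTYPE = 2;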
To restore the link between the sample_id and sequencingobject_id, you can re-insert this data into the sample_sequencingobject table like so:
start transaction;

insert into sample_sequencingobject (id, created_date, sample_id, sequencingobject_id)
  select id, created_date, sample_id, sequencingobject_id
  from sample_sequencingobject_AUD
  where REV in (47) and REVTYPE = 2;

commit;
This will re-insert the entry into the sample_sequencingobject table, linking the sample and sequence data back up in IRIDA:
select * from sample_sequencingobject;
+----+---------------------+-----------+---------------------+
| id | created_date        | sample_id | sequencingobject_id |
+----+---------------------+-----------+---------------------+
|  1 | 2022-01-11 12:38:16 |         1 |                   1 |
+----+---------------------+-----------+---------------------+
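Because the rows are copied back with their original id values, it's worth double-checking that the REV/REVTYPE filter matches only the link you intend to restore before committing. A minimal preview sketch, using the same REV value 47 from this example:
-- preview exactly which audit rows the restore would copy back
select id, created_date, sample_id, sequencingobject_id
from sample_sequencingobject_AUD
where REV in (47) and REVTYPE = 2;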
The actual code that gets called when you remove sequence data from a sample simply removes the link between a sample and a sequencing object (that is, a row from the sample_sequencingobject table). So restoring the entry in this table restores the sample/sequence data link.
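To make that concrete, a hedged sketch of what the removal amounts to at the database level; IRIDA does this through Hibernate rather than raw SQL, and the IDs are the example values from above, so this is only for illustration:
-- the database-level effect of removing sequence data from a sample:
-- the join row is deleted, and Envers records the delete in
-- sample_sequencingobject_AUD with REVTYPE = 2 (don't run this by hand)
delete from sample_sequencingobject where sample_id = 1 and sequencingobject_id = 1;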
For reference, the REVTYPE values and what they represent (from https://docs.jboss.org/hibernate/orm/current/userguide/html_single/Hibernate_User_Guide.html#envers): 0 = ADD (the row was inserted), 1 = MOD (the row was updated), 2 = DEL (the row was deleted).
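A quick hedged sketch for seeing how many of each operation type the audit table has recorded, which can help confirm you are filtering on the intended REVTYPE:
-- count audited operations per type (0 = insert, 1 = update, 2 = delete)
select REVTYPE, count(*) as operations
from sample_sequencingobject_AUD
group by REVTYPE;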
@apetkau I am getting this error:
AnalysisSubmission [id=11, name=SISTR__1-11-2022_Test_1, submitter=admin, workflowId=b21ea62c-7916-4ca6-96ba-90c20177b70f, analysisState=PREPARING, analysisCleanedState=NOT_CLEANED] changing to state ERROR
ca.corefacility.bioinformatics.irida.exceptions.galaxy.WorkflowUploadException: Could not upload workflow from /workflows/1.1.1/irida_workflow_structure.ga
at ca.corefacility.bioinformatics.irida.pipeline.upload.galaxy.GalaxyWorkflowService.uploadGalaxyWorkflow(GalaxyWorkflowService.java:64) ~[classes/:?]
No new history is being created.
Is it something to do with file sharing, or something else?
Hi, I am having this issue whenever I try to run an analysis:
ca.corefacility.bioinformatics.irida.exceptions.galaxy.WorkflowUploadException: Could not upload workflow from /workflows/0.1.5/irida_workflow_structure.ga
The Galaxy logs show a uwsgi error; could they be related?
[uwsgi-http key: localhost:9090 client_addr: 127.0.0.1 client_port: 50636] hr_instance_read(): Connection reset by peer [plugins/http/http.c line 647]
The peer in this case could be IRIDA (if that message is from Galaxy), so it could have been produced during the error in uploading the workflow. But I'm not sure.
It's likely that some genomes either have very low coverage or are too different from the rest, so all SNV/SNP data is removed from the phylip alignment file. You can check the mapping quality file (https://snvphyl.readthedocs.io/en/latest/user/output/#mapping-quality) for information on which genomes have low coverage. You can check the core positions file to find the percent of the reference genome considered for data analysis.
If you find any genomes with coverage that is too low or that are too distantly related, you can remove them from the analysis and re-run SNVPhyl.