Removes in memory set from dead compaction detector #6283
keith-turner wants to merge 8 commits into apache:main
Removed the in-memory set of table ids from the dead compaction detector. The set tracked tables that might have compaction tmp files needing cleanup. It would be hard to maintain across multiple managers, and it could lose track of tables if the process died.

Replaced the in-memory set with a set in the metadata table. This set is populated directly by the split and merge fate operations, so there is no chance of losing track of things when a process dies. The set is also narrower: it allows looking for tmp files to clean up in a single tablet's dir rather than scanning an entire table's dir.

Also changed the order in which tmp files are deleted for failed compactions. They used to be deleted after the metadata for the compaction was cleaned up, which could lose track of the cleanup if the process died after deleting the metadata but before deleting the tmp file. Now the tmp files are deleted before the metadata entry, so the cleanup should no longer be lost when a process dies.

This change is needed by apache#6217
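To illustrate why the new deletion order survives a process death, here is a minimal sketch. It is not code from this PR: `CleanupOrderSketch`, its two sets, and the `crashAfterFirstStep` flag are hypothetical stand-ins for the metadata table, the filesystem, and a crash between the two steps.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical stand-ins for the metadata table and the filesystem; not Accumulo classes.
class CleanupOrderSketch {
  final Set<String> metadataEntries = new HashSet<>(); // compaction ids still tracked
  final Set<String> tmpFiles = new HashSet<>(); // tmp files on disk, keyed by compaction id

  // New order: delete the tmp file first, then the metadata entry that tracks it.
  // crashAfterFirstStep simulates the process dying between the two steps.
  void cleanup(String ecid, boolean crashAfterFirstStep) {
    tmpFiles.remove(ecid);
    if (crashAfterFirstStep) {
      return; // the metadata entry survives, so a later pass can retry
    }
    metadataEntries.remove(ecid);
  }

  // A retry pass only knows about compactions that still have a metadata entry.
  void retryAll() {
    for (String ecid : new HashSet<>(metadataEntries)) {
      cleanup(ecid, false);
    }
  }

  public static void main(String[] args) {
    CleanupOrderSketch s = new CleanupOrderSketch();
    s.metadataEntries.add("ECID-1");
    s.tmpFiles.add("ECID-1");
    s.cleanup("ECID-1", true); // crash between the two steps
    s.retryAll(); // after restart the entry is still present, so cleanup completes
    System.out.println(s.tmpFiles.isEmpty() && s.metadataEntries.isEmpty());
  }
}
```

With the old order, a crash between the steps would remove the metadata entry while leaving the tmp file behind, and nothing would remain to point a retry at it.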
final TabletMetadata tm = ctx.getAmple().readTablet(extent, ColumnType.DIR);
if (tm != null) {
  final Collection<Volume> vols = ctx.getVolumeManager().getVolumes();
  for (Volume vol : vols) {
This code was moved to FindCompactionTmpFiles so it could be called here and in the dead compaction detector.
ddanielr left a comment
Overall, I like moving this functionality into the metadata table.
I am curious how long those row entries become with the full UUID and tablet dir.
FindCompactionTmpFiles seems like it could use some refactoring, but that doesn't technically impact the work this PR is trying to accomplish.
}

// Finds any tmp files matching the given compaction ids in table dir and deletes them.
public static void deleteTmpFiles(ServerContext ctx, TableId tableId, String dirName,
Some of this code seems like it has overlap with findTempFiles
I copied some code for deleting a file from CompactionCoordinator into this class mostly as is. I did not look at the existing code when I copied it in; I will take a look at that.
Made a change in 4993169 to consolidate the delete code. Still needed a separate function to find tmp files.
…RemovedCompactionStoreImpl.java Co-authored-by: Daniel Roberts <ddanielr@gmail.com>
…ction/coordinator/DeadCompactionDetector.java Co-authored-by: Daniel Roberts <ddanielr@gmail.com>
interface RemovedCompactionStore {
  Stream<RemovedCompaction> list();

  void add(Collection<RemovedCompaction> removedCompactions);
It took me a few minutes to understand what a RemovedCompaction represents. My understanding of these changes is that when a tablet is merged or split, the ECID entries in the tablet metadata for the tablets involved are removed. However, the compaction is likely still running on a Compactor. Is that right?
I was having trouble understanding the context given the name. I wonder if a different name might better reflect the situation. The compaction itself is not removed, it's OBE. It's been orphaned from its parent tablet, or it's like a dangling ref in a database. Would OrphanedCompaction be a better name?
OrphanedCompaction is a much better name, will change to that. I used "removed" in the name because the compaction was removed from the metadata table.
Is that right?
Yes that is all correct.
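For readers following the naming discussion, here is a minimal sketch of what a store of orphaned compactions could look like after the rename. Only `list()` and `add(...)` appear in the diff above; the record fields, the `delete(...)` method, and the in-memory implementation are assumptions made for illustration. The PR keeps these entries in the metadata table, not in memory.

```java
import java.util.Collection;
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.stream.Stream;

class OrphanedCompactionSketch {
  // Hypothetical shape: a compaction whose ECID entry was dropped from tablet metadata by a
  // split or merge while it may still be running on a Compactor. Field names are assumptions.
  record OrphanedCompaction(String compactionId, String tableId, String tabletDir) {}

  interface OrphanedCompactionStore {
    Stream<OrphanedCompaction> list();

    void add(Collection<OrphanedCompaction> orphans);

    void delete(Collection<OrphanedCompaction> orphans); // hypothetical, not shown in the diff
  }

  // In-memory stand-in used only to show the contract; it does not survive a process death.
  static class InMemoryStore implements OrphanedCompactionStore {
    private final Set<OrphanedCompaction> entries = ConcurrentHashMap.newKeySet();

    @Override
    public Stream<OrphanedCompaction> list() {
      return entries.stream();
    }

    @Override
    public void add(Collection<OrphanedCompaction> orphans) {
      entries.addAll(orphans);
    }

    @Override
    public void delete(Collection<OrphanedCompaction> orphans) {
      entries.removeAll(orphans);
    }
  }

  public static void main(String[] args) {
    InMemoryStore store = new InMemoryStore();
    OrphanedCompaction oc = new OrphanedCompaction("ECID-1", "2a", "t-0001");
    store.add(List.of(oc)); // recorded when a split or merge drops the ECID entry
    store.list().forEach(o -> System.out.println(o.tabletDir())); // detector scans one tablet dir
    store.delete(List.of(oc)); // removed once its tmp files are cleaned up
  }
}
```

Keying each entry by tablet dir is what would let a detector look in a single tablet's dir instead of scanning the whole table's dir.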