Clean databases folder on startup #675
Ping! @github/docs-content-dsp This is not entirely a user-facing feature, and perhaps mentioning it in the changelog is good enough. But some users may want to be aware of this. From the changelog:
Thanks for letting us know 👍🏽 What does "cleaning up" the database actually mean in this case? As long as it doesn't break the database or do anything unexpected, we're probably fine with just a changelog entry 😊
Cleaning up means deleting unused databases. In this case, the user first imported a database as a zip file (or from a URL or LGTM), and it was placed inside the extension's storage area (a location that is not user-facing). Then the user removed this database from the extension. Most of the time, this means that the database is deleted from the file system as well, but sometimes (typically because Windows has not released the file system lock), the database cannot immediately be removed from disk. The database is then orphaned: it exists, but is no longer used anywhere. This change ensures that these orphaned databases are eventually removed. Note that this change does not affect databases added as a filesystem folder. We assume these databases are user-controlled even after being removed from the extension.
Great, thanks for clarifying! That all sounds sensible and pretty harmless. I don't think we need to document it.
    @@ -0,0 +1,22 @@
    import { fail } from 'assert';
Typo in filename: pures -> pure
    const dbRegeEx = /^db-(javascript|go|cpp|java|python|csharp|ruby)$/;
    function isLikelyDbFolder(dbPath: string) {
      return path.basename(dbPath).match(dbRegeEx);
    }

    async function isDatabaseDirectory(dir: string) {
      return (await fs.readdir(dir)).some(isLikelyDbFolder);
    }
Not very fond of having to maintain a list of language slugs.
Elsewhere, I expect we are checking for the existence of codeql-database.yml (or .dbinfo as a fallback) to determine whether a directory is a CodeQL database. Can we continue to use that here instead of relying on the dataset folder name?
I agree about the hard-coded languages, but I'm not quite sure what to do about them. Maybe there is some kind of way we can discover this by introspecting which standard libraries are installed, but that seems complex for now.
I'll be more precise and check for one of those two files.
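For readers following the thread, the marker-file check discussed above could look roughly like the sketch below. It is not the code that landed in the PR: `hasDatabaseMarker` and the injected `listDir` parameter are hypothetical names, and the only facts taken from the discussion are the two file names, `codeql-database.yml` and `.dbinfo`.

```typescript
// Marker files named in the review; finding either one in a directory
// suggests that the directory is a CodeQL database.
const DB_MARKER_FILES: string[] = ['codeql-database.yml', '.dbinfo'];

// Hypothetical helper: takes the names of a directory's entries and reports
// whether any of them is a known marker file.
function hasDatabaseMarker(entryNames: string[]): boolean {
  return entryNames.some(name => DB_MARKER_FILES.indexOf(name) !== -1);
}

// Hypothetical async wrapper. `listDir` is injected (an assumption) so the
// sketch does not depend on a particular file system library.
async function isDatabaseDirectory(
  dir: string,
  listDir: (dir: string) => Promise<string[]>
): Promise<boolean> {
  return hasDatabaseMarker(await listDir(dir));
}
```

This avoids maintaining a list of language slugs, since the check no longer depends on the dataset folder name.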
        }
      })
    );
    showAndLogErrorMessage(`Failed to delete orphaned databases:\n ${failures.join(' \n')}'. Must delete manually.`);
Nit: why leading space before the newline? And since this is user-facing, perhaps give them an action, e.g. 'To delete unused databases, please remove them manually from the workspace storage folder.'
Leading space is for indentation, to indent paths we could not remove. Though, the space should come after the newline.
Also, the failures should contain the full path to the database. Maybe I will use only the basename and include the storage folder elsewhere.
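One way to combine both points (indent after the newline, and give the user an action) is sketched below. `formatFailureMessage` is a hypothetical helper, and the choice to pass the storage folder once and list only base names is an assumption based on the reply above, not the final wording used in the extension.

```typescript
// Hypothetical helper: builds the user-facing message from the storage folder
// and the names of the databases that could not be deleted.
function formatFailureMessage(storagePath: string, failedNames: string[]): string {
  // The indent comes after each newline, so every entry sits on its own indented line.
  const list = failedNames.map(name => `\n  ${name}`).join('');
  return `Failed to delete unused databases from ${storagePath}:${list}\n` +
    'To delete unused databases, please remove them manually from the workspace storage folder.';
}
```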
      .filter(dirent => dirent.isDirectory())
      // get the full path
      .map(dirent => path.join(this.storagePath, dirent.name))
      // filter databases still in workspace
    - // filter databases still in workspace
    + // filter out databases still in workspace
      // filter databases still in workspace
      .filter(dbDir => {
        const dbUri = Uri.file(dbDir);
        return this.databaseManager.databaseItems.every(item => item.databaseUri.fsPath !== dbUri.fsPath);
I don't think we have to do this search for every identified directory in storage.
I suggest creating a Set up front containing all the fsPath values from this.databaseManager.databaseItems, and then this filter just becomes a set lookup.
My assumption here is that when this function is awaited, there is no way for the user to add databases to the workspace after the set is constructed but before cleanup completes (otherwise we have a race condition).
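A minimal sketch of the suggested set lookup follows. `DatabaseItemLike` and `findOrphanedDirs` are hypothetical stand-ins for the extension's own types and method, and the sketch assumes both sides of the comparison are already normalized the same way (the reviewed code does this by going through `Uri.file(...).fsPath`).

```typescript
// Hypothetical shape standing in for the extension's database items.
interface DatabaseItemLike {
  fsPath: string;
}

// Returns the storage directories that no database item refers to.
function findOrphanedDirs(storageDirs: string[], items: DatabaseItemLike[]): string[] {
  // Build the set once, up front, instead of scanning all items per directory.
  const inUse = new Set<string>(items.map(item => item.fsPath));
  return storageDirs.filter(dir => !inUse.has(dir));
}
```

With n directories and m database items, this does roughly n + m operations rather than n × m, at the cost of the snapshot assumption described above.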
    };

    handleRemoveOrphanedDatabases = async (): Promise<void> => {
      logger.log('Removing orphaned databases.');
    - logger.log('Removing orphaned databases.');
    + logger.log('Removing orphaned databases from workspace storage.');
        }
      })
    );
    showAndLogErrorMessage(`Failed to delete orphaned databases:\n ${failures.join(' \n')}'. Must delete manually.`);
This should only be called when failures is non-empty.
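The guard being asked for could be as small as the sketch below. `reportFailures` and its `showError` callback are hypothetical; in the reviewed code the callback's role is played by `showAndLogErrorMessage`.

```typescript
// Hypothetical helper: `showError` stands in for the extension's error reporter.
// Returns true only when a message was actually shown.
function reportFailures(failures: string[], showError: (message: string) => void): boolean {
  // Nothing failed, so there is nothing to show the user.
  if (failures.length === 0) {
    return false;
  }
  showError(`Failed to delete orphaned databases:\n${failures.join('\n')}`);
  return true;
}
```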
    },
    {
      "command": "codeQLDatabases.removeOrphanedDatabases",
      "title": "Remove databases no longer imported into VS Code"
Clean up unused databases?
This is not really a user-facing command. But, sure.
Cleans orphan databases on startup. This commit also bumps the fs-extra dependency to get readdir with dirent objects.
Adds `asyncFilter` to filter arrays asynchronously.
Implemented as a command so a savvy user could assign a keyboard shortcut to it, but it is not accessible through the command palette.
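For context, an asynchronous filter of this kind is commonly written as in the sketch below. This is an illustration of the technique under the assumption that predicates can run concurrently; it is not necessarily the exact helper added by this PR.

```typescript
// Sketch of an async filter: the predicate returns a promise, so all predicates
// are started together and the array is filtered once every result is known.
async function asyncFilter<T>(
  arr: T[],
  predicate: (item: T) => Promise<boolean>
): Promise<T[]> {
  const results = await Promise.all(arr.map(item => predicate(item)));
  // results[index] holds the predicate's answer for arr[index].
  return arr.filter((_, index) => results[index]);
}
```

A caller could then write something like `await asyncFilter(dirs, isDatabaseDirectory)` to keep only the directories that pass an asynchronous check.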
Fixes #674.
Checklist
@github/docs-content-dsp has been cc'd in all issues for UI or other user-facing changes made by this pull request.