Opened 11 years ago
Last modified 13 months ago
#639 assigned enhancement
omega: should reindex a file when its access rights change
Reported by: | egarette | Owned by: | Olly Betts |
---|---|---|---|
Priority: | normal | Milestone: | 2.0.0 |
Component: | Omega | Version: | git master |
Severity: | normal | Keywords: | |
Cc: | | Blocked By: | |
Blocking: | | Operating System: | All |
Description
If we change the access rights on a file, omega doesn't reindex it, so a user who has lost access to the file can still see it in query results.
In ticket #632 I suggested using ctime instead of mtime, but that proposal is not suitable.
Here is Olly's reply:
The change from mtime to ctime will mean that the "last modified" time reported in the Omega UI will now in general not actually be the last time the contents of the file were changed.
I'm also slightly concerned that the mtime -> ctime change will result in reindexing files in many more cases - e.g. if I tar up a file tree and Xapian database and untar it on another machine (as a non-privileged user), the mtimes are preserved but the ctimes change. So this change would mean that omindex would have to reindex every document in this case (and without root access, I don't think one can avoid that).
I think we probably need to store the ctime separately (so lastmod still works as before) and make whether ctime or mtime is used for reindexing an option, or else find a better way to know when ACLs have changed - perhaps only checking the ACL for changes if the ctime has changed but the mtime hasn't.
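To make the mtime/ctime distinction in the reply above concrete, here is a small Python sketch. It is not part of Omega; the helper names `stamps` and `demo` and the file names are made up for illustration. It assumes a POSIX filesystem, where a permission change updates ctime but leaves mtime alone, and where unpacking a tar archive restores mtime but gives the new file a fresh ctime.

```python
import os
import tarfile
import tempfile

OLD_MTIME = 1_000_000_000  # arbitrary timestamp in the past (September 2001)


def stamps(path):
    """Return (mtime, ctime) of path in whole seconds."""
    st = os.stat(path)
    return int(st.st_mtime), int(st.st_ctime)


def demo():
    """Hypothetical demo: chmod and tar/untar leave mtime alone but not ctime."""
    with tempfile.TemporaryDirectory() as tmp:
        src_dir = os.path.join(tmp, "src")
        dst_dir = os.path.join(tmp, "dst")
        os.mkdir(src_dir)
        os.mkdir(dst_dir)

        src = os.path.join(src_dir, "doc.txt")
        with open(src, "w") as f:
            f.write("hello\n")
        # Pretend the contents were last modified long ago.
        os.utime(src, (OLD_MTIME, OLD_MTIME))

        # Changing permissions does not touch mtime, so an mtime-only
        # check cannot notice that access rights have changed.
        os.chmod(src, 0o600)
        mtime_after_chmod, ctime_after_chmod = stamps(src)

        # Tar the file up and unpack it elsewhere: mtime is restored from
        # the archive, but the unpacked file gets a brand new ctime.
        archive = os.path.join(tmp, "tree.tar")
        with tarfile.open(archive, "w") as tar:
            tar.add(src, arcname="doc.txt")
        # Use the "data" extraction filter where this Python supports it.
        kwargs = {"filter": "data"} if hasattr(tarfile, "data_filter") else {}
        with tarfile.open(archive) as tar:
            tar.extractall(dst_dir, **kwargs)
        mtime_copy, ctime_copy = stamps(os.path.join(dst_dir, "doc.txt"))

        return {
            "mtime_after_chmod": mtime_after_chmod,
            "ctime_after_chmod": ctime_after_chmod,
            "mtime_copy": mtime_copy,
            "ctime_copy": ctime_copy,
        }


if __name__ == "__main__":
    for name, value in demo().items():
        print(name, value)
```

In both cases mtime stays at the old value while ctime reflects the time of the chmod or the unpacking, which is why a ctime-only check would reindex the whole unpacked tree.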
Change History (7)
comment:1 by , 11 years ago
Component: | Other → Omega |
---|---|
Milestone: | → 1.3.3 |
comment:2 by , 10 years ago
Status: | new → assigned |
---|
[2853cdace3ab8ba4d23a1dd568f207b1bbbbb4b5] adds a --track-ctime option which stores ctime and uses it instead of mtime to decide if we reindex. But mtime is still used in the UI.
I realised we can actually easily check for the case when the file contents are the same and only the inode metadata has changed (newer ctime but same mtime), and in this case just update the terms and values for that metadata in the existing document - that's a lot less work, especially if a slow filter is involved (e.g. if we are doing OCR to get document text). I've not implemented this optimisation yet though.
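As a rough illustration of the optimisation described in this comment, the decision could be sketched as below. This is Python rather than omindex's C++, and `Stamps`, `Action`, `choose_action` and the `track_ctime` parameter are all hypothetical names, not Omega's actual code. It also simplifies by treating any difference in a timestamp as a change, which may not match the comparison omindex really makes.

```python
from enum import Enum
from typing import NamedTuple, Optional


class Stamps(NamedTuple):
    """Timestamps for one file (hypothetical record)."""
    mtime: int
    ctime: int


class Action(Enum):
    SKIP = "skip"
    UPDATE_METADATA = "update metadata only"
    REINDEX = "full reindex"


def choose_action(stored: Optional[Stamps], current: Stamps,
                  track_ctime: bool = True) -> Action:
    """Decide how much work a file needs, given stored and current stamps."""
    if stored is None:
        # Never indexed before: the document has to be built from scratch.
        return Action.REINDEX
    if current.mtime != stored.mtime:
        # Contents may have changed, so the (possibly slow) filter must run.
        return Action.REINDEX
    if track_ctime and current.ctime != stored.ctime:
        # Same contents, different inode metadata (e.g. permissions):
        # only the metadata terms and values need refreshing.
        return Action.UPDATE_METADATA
    return Action.SKIP


if __name__ == "__main__":
    stored = Stamps(mtime=100, ctime=100)
    print(choose_action(stored, Stamps(mtime=100, ctime=250)).value)
```

With `track_ctime=False` a ctime-only change is skipped, which is the behaviour the ticket originally reported.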
comment:3 by , 10 years ago
Blocking: | 632 added |
---|
comment:4 by , 10 years ago
Milestone: | 1.3.3 → 1.3.4 |
---|
I don't want to delay 1.3.3 any longer, so bumping the rest of this to 1.3.4.
comment:6 by , 9 years ago
Blocking: | 632 removed |
---|---|
Milestone: | 1.3.5 → 1.4.x |
Version: | → git master |
The thing left to do here is the optimisation for the case of "ctime changed, mtime unchanged". I don't think it makes much sense to block 1.4.0 by that - it's not a correctness issue. This also no longer blocks 632.
comment:7 by , 13 months ago
Milestone: | 1.4.x → 2.0.0 |
---|
It'd be good to finish off this work, but it's an optimisation rather than a correctness thing, and could be added in a stable release so postponing in the interests of actually getting a new stable release series started.
Marking to consider for 1.3.3.