Opened 16 years ago
Closed 15 years ago
#358 closed defect (fixed)
Omega: omindex eating up all available physical memory
Reported by: | Eric Voisard | Owned by: | Olly Betts |
---|---|---|---|
Priority: | normal | Milestone: | 1.0.17 |
Component: | Omega | Version: | 1.0.12 |
Severity: | normal | Keywords: | |
Cc: | | Blocked By: | |
Blocking: | | Operating System: | Linux |
Description
I'm having a recurring problem with Omega's indexing with omindex.
When I run omindex, it sometimes behaves as if it fails to recognize the extension of certain .doc and .pdf files, and skips them with an "Unknown extension ... - skipping" message. In the same run, omindex is otherwise perfectly able to index other files with the same extensions.
If I manually run antiword on a .doc file that failed previously, it works. If I narrow down the directory structure so the recursion and indexing are lighter and then run omindex again, it works on files that failed before. It never seems to fail on HTML and plain-text files (the built-in formats).
Each time a failure occurs and a file is skipped, a kernel error like the following one is recorded in /var/log/messages:
Apr 21 14:10:12 zen kernel: sh[4153]: segfault at ffffffffffffffff rip 00002ac7e7c4581f rsp 00007fffc3452de0 error 4
As reported by some other users who had the same problem, it may be that the system is running low on memory, so omindex cannot spawn the external converter.
I ran omindex while checking the system's memory usage. The system (SLES10) has 1 GB of RAM, roughly two thirds of which was used by other processes. omindex gradually consumed the remaining third until only about 10 MB were left free. At that point memory usage stabilized and the failures began. Remember that it doesn't fail on every subsequent .doc or .pdf file, only on some of them.
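The captures below were taken by hand with top and ps. A rough shell sketch for logging the same figures unattended (the 30-second interval and the log path are arbitrary choices, not part of the original test):

# Record omindex's memory usage and the system's free memory until omindex exits.
while pgrep -x omindex > /dev/null; do
    date
    ps -o pid,rss,vsz,pcpu,pmem,args -C omindex
    free -k
    sleep 30
done >> /tmp/omindex-mem.log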
--- Before running omindex:
top - 13:44:50 up 187 days, 21:46, 6 users, load average: 2.02, 2.03, 2.08
Tasks: 149 total, 2 running, 147 sleeping, 0 stopped, 0 zombie
Cpu(s): 50.0%us, 0.0%sy, 0.0%ni, 50.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 1027164k total, 654256k used, 372908k free, 7760k buffers
Swap: 4200988k total, 258284k used, 3942704k free, 404760k cached

  PID USER  PR  NI  VIRT   RES   SHR S %CPU %MEM     TIME+ COMMAND
 3268 root  34  19  261m  4656  2440 S  100  0.5 270524:02 zmd
    1 root  16   0   796    72    40 S    0  0.0   0:00.96 init
    2 root  RT   0     0     0     0 S    0  0.0   0:00.18 migration/0
    3 root  34  19     0     0     0 S    0  0.0   0:00.00 ksoftirqd/0
    4 root  RT   0     0     0     0 S    0  0.0   0:00.14 migration/1
    5 root  34  19     0     0     0 S    0  0.0   0:00.00 ksoftirqd/1
--- Beginning (omindex using 9860kB or less than 1% of RAM):
top - 13:45:35 up 187 days, 21:47, 6 users, load average: 2.16, 2.05, 2.09
Tasks: 152 total, 1 running, 151 sleeping, 0 stopped, 0 zombie
Cpu(s): 63.7%us, 6.5%sy, 0.0%ni, 29.4%id, 0.0%wa, 0.0%hi, 0.5%si, 0.0%st
Mem: 1027164k total, 678548k used, 348616k free, 8224k buffers
Swap: 4200988k total, 258284k used, 3942704k free, 421772k cached

  PID USER  PR  NI   VIRT   RES   SHR S %CPU %MEM     TIME+ COMMAND
 3268 root  34  19   261m  4656  2440 S   93  0.5 270524:47 zmd
  829 root  17   0  16864  6520  1788 D   20  0.6   0:00.64 omindex
28316 root  10  -5      0     0     0 S    1  0.0   1:12.45 cifsd
31328 root  16   0   5656  1260   876 R    1  0.1   0:14.33 top
    1 root  16   0    796    72    40 S    0  0.0   0:00.96 init

[evoisard@zen]/home/evoisard > date ; ps aux | grep omindex
Tue Apr 21 13:45:40 CEST 2009
root 829 8.7 0.9 20052 9860 pts/5 D+ 13:45 0:01 \
    /usr/local/bin/omindex --db /srv/xapian/test --follow --url /docs/test/ /srv/xapian/targets/test
--- During runtime, still working fine (omindex using 103200 kB or 10% of RAM):
top - 13:56:14 up 187 days, 21:58, 6 users, load average: 3.06, 2.98, 2.58
Tasks: 153 total, 1 running, 152 sleeping, 0 stopped, 0 zombie
Cpu(s): 60.7%us, 3.0%sy, 0.0%ni, 35.8%id, 0.0%wa, 0.0%hi, 0.5%si, 0.0%st
Mem: 1027164k total, 824360k used, 202804k free, 10340k buffers
Swap: 4200988k total, 258284k used, 3942704k free, 464760k cached

  PID USER  PR  NI  VIRT   RES   SHR S %CPU %MEM     TIME+ COMMAND
 3268 root  34  19  261m  4656  2440 S   99  0.5 270535:07 zmd
  829 root  17   0  110m  100m  1820 S   25 10.0   0:55.32 omindex
31328 root  16   0  5656  1260   876 R    1  0.1   0:16.99 top
    1 root  16   0   796    72    40 S    0  0.0   0:00.96 init
    2 root  RT   0     0     0     0 S    0  0.0   0:00.18 migration/0
    3 root  34  19     0     0     0 S    0  0.0   0:00.00 ksoftirqd/0

[evoisard@zen]/home/evoisard > date ; ps aux | grep omindex
Tue Apr 21 13:56:17 CEST 2009
root 829 8.5 10.0 113396 103200 pts/5 D+ 13:45 0:55 \
    /usr/local/bin/omindex --db /srv/xapian/test --follow --url /docs/test/ /srv/xapian/targets/test
--- Close to the end, documents being skipped and segfaults occurring (omindex using 369340kB or 36% of RAM):
top - 14:10:23 up 187 days, 22:12, 6 users, load average: 3.19, 3.22, 2.96
Tasks: 152 total, 2 running, 150 sleeping, 0 stopped, 0 zombie
Cpu(s): 94.6%us, 1.5%sy, 0.0%ni, 0.0%id, 3.0%wa, 0.0%hi, 1.0%si, 0.0%st
Mem: 1027164k total, 1017024k used, 10140k free, 996k buffers
Swap: 4200988k total, 258284k used, 3942704k free, 401204k cached

  PID USER  PR  NI  VIRT   RES   SHR S %CPU %MEM     TIME+ COMMAND
 3268 root  34  19  261m  4656  2440 S  100  0.5 270549:04 zmd
  829 root  18   0  370m  360m  1920 D   93 36.0   5:05.18 omindex
  154 root  15   0     0     0     0 S    1  0.0   0:08.67 kswapd0
31328 root  16   0  5656  1260   876 R    1  0.1   0:20.52 top
    1 root  16   0   796    72    40 S    0  0.0   0:00.96 init
    2 root  RT   0     0     0     0 S    0  0.0   0:00.18 migration/0

[evoisard@zen]/home/evoisard > date ; ps aux | grep omindex
Tue Apr 21 14:10:28 CEST 2009
root 829 20.5 35.9 379324 369340 pts/5 D+ 13:45 5:08 \
    /usr/local/bin/omindex --db /srv/xapian/test --follow --url /docs/test/ /srv/xapian/targets/test
When omindex terminates, all reserved resources are freed.
So it looks like omindex is somehow not releasing all the memory it uses at runtime. Sure, I could add more memory to the system, but wouldn't omindex then eat up all the extra memory too, and wouldn't the same problem come back if the directories to index grow in size?
I don't know if this memory is required for handling the database itself or if it's used for the runtime and the filtering/indexing jobs.
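A possibly relevant knob, untested here and only a suggestion: Xapian flushes pending database changes automatically after a fixed number of documents (XAPIAN_FLUSH_THRESHOLD, by default 10000), so if most of the growth is buffered changes rather than filter overhead, lowering it trades indexing speed for memory held in RAM. A sketch using an arbitrary value of 1000 and the same command line as above:

# Flush pending database changes more often so fewer of them are buffered in memory.
XAPIAN_FLUSH_THRESHOLD=1000 /usr/local/bin/omindex --db /srv/xapian/test \
    --follow --url /docs/test/ /srv/xapian/targets/test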
I don't know whether this behavior should be considered a bug, or whether the process could be optimized. I'll let the Xapian masters decide...
Anyway, many thanks for the wonderful work! Eric
Change History (5)
comment:1 by , 16 years ago
Version: | → 1.0.12 |
---|
I forgot to mention the version of Xapian/Omega I'm using. Eric
comment:2 by , 15 years ago
Assuming you're using GCC 3.4 or newer, could you try:
GLIBCXX_FORCE_NEW=1
export GLIBCXX_FORCE_NEW
And then run omindex.
This tells the C++ STL allocator not to hoard memory it has previously allocated, which might be at least part of the issue here.
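The variable can also be set for a single run by prefixing the omindex invocation, e.g. reusing the command line from the report above:

# Set GLIBCXX_FORCE_NEW only in the environment of this one omindex run.
GLIBCXX_FORCE_NEW=1 /usr/local/bin/omindex --db /srv/xapian/test --follow \
    --url /docs/test/ /srv/xapian/targets/test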
comment:3 by , 15 years ago
Milestone: | → 1.0.17 |
---|---|
Status: | new → assigned |
I think this Debian bug explains the issue here:
http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=548987
That's fixed in trunk r13572.
The other factor is, I believe, due to the C++ STL hoarding released memory, as I suggested in comment:2. In the absence of any feedback on that, I plan to backport the _SC_PHYS_PAGES change in r13572 for 1.0.17 and then close this ticket. If you (or anyone else) are still seeing issues after that, please supply the requested information and reopen this ticket.
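As an aside, the physical-memory figure that a sysconf(_SC_PHYS_PAGES)-based check works from can be inspected from the shell on glibc/Linux (an illustration only, not the actual change made in r13572):

# Physical RAM in kB, derived from the page count and page size reported by sysconf.
echo "$(( $(getconf _PHYS_PAGES) * $(getconf PAGE_SIZE) / 1024 )) kB"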
comment:4 by , 15 years ago
I added 1 GB (=> 2 GB) to this system, and now it runs fine. As it's now in production, it won't be easy to remove the memory and redo the tests. Hopefully I'll have time for this next week...
Thanks, Eric
comment:5 by , 15 years ago
Resolution: | → fixed |
---|---|
Status: | assigned → closed |
Backported for 1.0.17 in r13600.
I think it makes sense to close this ticket now. If the memory usage issue isn't STL hoarding, then please open a new ticket for it.