#91 closed defect (released)
javascript can block html indexing
| Reported by: | maarten | Owned by: | Olly Betts |
|---|---|---|---|
| Priority: | normal | Milestone: | |
| Component: | Omega | Version: | 0.9.6 |
| Severity: | normal | Keywords: | |
| Cc: | | Blocked By: | |
| Blocking: | | Operating System: | All |
Description
When using "omindex" to index a directory of HTML files, some JavaScript will stop the body from being indexed. For example, take the following simple page:
<html> <head> <script language="Javascript"> function test(i) {
if(1<i) row=2;
} </script> </head> <body> The javascript bug </body> </html>
This page can't be found afterwards. The problem lies in the "<". The program thinks it's opening a tag and therefore ignores all of the following text, since the "tag" is never closed for the rest of the document.
Attachments (1)
Change History (6)
comment:1 by , 19 years ago
| op_sys: | Linux → All |
|---|---|
| rep_platform: | PC → All |
| Status: | new → assigned |
comment:2 by , 19 years ago
| Resolution: | → fixed |
|---|---|
| Status: | assigned → closed |
Fixed in SVN (rev 7176).
I'll attach the patch so you can verify it, and use it if you wish.
by , 19 years ago
| Attachment: | omega-htmlparser-ignore-javascript-lessthan.patch added |
|---|
Patch to fix this bug
comment:4 by , 19 years ago
| Resolution: | fixed → verified |
|---|
comment:5 by , 19 years ago
| Operating System: | → All |
|---|---|
| Resolution: | verified → released |

Hmm, indeed. We're going to need to treat <script> specially, I think.
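For illustration, here is a minimal Python sketch of what treating <script> specially could look like in a text extractor. It is not Omega's parser and not the attached patch. The name `extract_text` and both helper patterns are hypothetical. Once an opening script (or style) tag is seen, the scanner jumps straight to the matching close tag, so a "<" inside the JavaScript is never read as the start of a tag. Comments, entities, and ">" inside quoted attribute values are deliberately left out.

```python
import re

# Raw-text elements: their content is skipped up to the matching close tag.
# (Hypothetical helper tables for this sketch, not taken from Omega.)
_RAW_CLOSE = {
    "script": re.compile(r"</script\s*>", re.IGNORECASE),
    "style": re.compile(r"</style\s*>", re.IGNORECASE),
}
# A tag starts with '<', an optional '/', and then a letter.
_TAG_NAME = re.compile(r"</?([A-Za-z][A-Za-z0-9]*)")


def extract_text(html: str) -> str:
    """Return the visible text of `html`, skipping tags and script/style bodies."""
    out = []
    pos = 0
    while pos < len(html):
        lt = html.find("<", pos)
        if lt == -1:
            out.append(html[pos:])  # no more tags: the rest is text
            break
        out.append(html[pos:lt])
        m = _TAG_NAME.match(html, lt)
        if m is None:
            # A '<' that does not start a tag name is kept as literal text.
            out.append("<")
            pos = lt + 1
            continue
        gt = html.find(">", m.end())
        if gt == -1:
            break  # unterminated tag: nothing more to extract
        pos = gt + 1
        out.append(" ")  # a tag separates the words on either side of it
        name = m.group(1).lower()
        is_open = html[lt + 1] != "/"
        if is_open and name in _RAW_CLOSE:
            # Skip the raw content without looking for tags inside it.
            close = _RAW_CLOSE[name].search(html, pos)
            pos = close.end() if close else len(html)
    # Collapse runs of whitespace into single spaces.
    return " ".join("".join(out).split())


page = (
    '<html> <head> <script language="Javascript"> function test(i) {\n'
    "if(1<i) row=2;\n"
    "} </script> </head> <body> The javascript bug </body> </html>"
)
print(extract_text(page))  # prints: The javascript bug
```

With the page from the description, only the body text comes out, so the document would stay findable by its body terms.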