Opened 13 years ago

Closed 10 years ago

#520 closed defect (fixed)

Wrap C++ iterators as PHP iterators

Reported by: Daniel Ménard Owned by: Olly Betts
Priority: normal Milestone: 1.3.2
Component: Xapian-bindings (PHP) Version: 1.2.3
Severity: minor Keywords:
Cc: Blocked By:
Blocking: Operating System: All

Description

Hi,

I'm not sure if this is a bug or just a precision to add to the documentation, but it is not possible to call XapianTermIterator::get_term() (or get_termfreq) if a previous call to skip_to() or next() has positioned the internal pointer after the last term of the database.

For example, this php script generates a GPF on my machine (windows, php bindings from flax.co.uk, php 5.3.3, Xapian 1.2.3):

<?php
$db = Xapian::inmemory_open();
$it = $db->allterms_begin();
$it->skip_to('term');
$it->get_term(); // or $it->get_term_freq();
?>

The same apply with a not-empty database: skip_to(last term) + next() + get_term() = fault.

Of course, I'm the culprit: I should test for "eof" before trying to dereference my iterator.

This is what I do, now, and it works well:

if (!$it->equals($db->allterms_end()) $it->get_term();

But I was surprised about Xapian generating a GPF and not an Exception, so I thought it was worth reporting.

Cheers,

Daniel

PS: I tested with Xapian 1.2.3, but the behavior is the same with version 1.0.16.

Change History (5)

comment:1 by Olly Betts, 13 years ago

Milestone: 1.2.x
Operating System: Microsoft WindowsAll

This behaviour is sort of expected. The semantics of C++ iterators are that trying to dereference or advance them past the end position is undefined behaviour, and PHP inherits this since it just puts a thin PHP class wrapper around the C++ iterator classes.

The best fix for this would be to wrap the C++ iterator classes as PHP5 iterators, which would do away with the ugly equals() method and allow looping over an iterator to be written as something like:

foreach ($db->allterms() as $it) {
    echo $it->get_term(), ": ", $it->get_term_freq(), "\n";
}

At the time we sorted out the PHP5 wrappers, there were several approaches from different people which emerged at about the same time. Paul Dixon's included PHP5 iterator wrappers, and is described here:

http://blog.dixo.net/2006/04/04/xapian-php5-wrapper/

But the iterator wrappers part didn't get merged in back then.

I've just spent a while looking at adding these in, but annoyingly it's not trivial to fit this in with how the SWIG generated wrappers work.

So I think for 1.2.4 I'll just document this issue, and this ticket can be for implementing PHP5 iterators (trunk r15223, 1.0 branch r15224).

Marking as 1.2.x - we should be able to do this such that the existing iterator wrappers continue to work, but we can mark these as deprecated and remove them at some point.

comment:2 by Olly Betts, 13 years ago

Owner: set to Olly Betts
Status: newassigned
Summary: TermIterator::get_term() can fault (core dump with php bindings on windows)Wrap C++ iterators as PHP iterators

comment:3 by Olly Betts, 11 years ago

Milestone: 1.2.x1.3.x

This isn't 1.2.x material now.

comment:4 by Olly Betts, 10 years ago

Milestone: 1.3.x1.3.2

I have a nearly complete patch for this, so marking for 1.3.2.

comment:5 by Olly Betts, 10 years ago

Resolution: fixed
Status: assignedclosed

Fixed in trunk r18254.

Note: See TracTickets for help on using tickets.