Get all documents from your KinoSearch1 index
I wanted a script to dump the entire contents of my KinoSearch1 index into a flat file. This turned out to be pretty easy using KinoSearch1::Index::IndexReader.
The following Gist shows how I was able to do this.
#!/usr/bin/perl
use strict;
use warnings;
use KinoSearch1::Index::IndexReader;

my $r = KinoSearch1::Index::IndexReader->new( invindex => '/path/to/index' );

# get one or more readers: a multi-segment index keeps its per-segment
# readers in sub_readers (note that this peeks into the object's hash)
my @readers = ref( $r->{sub_readers} ) eq 'ARRAY' ? @{ $r->{sub_readers} } : ($r);

for my $reader (@readers) {
    print "Segment " . $reader->get_seg_name . " has " . $reader->max_doc . " docs\n";

    # documents are numbered from 0 to max_doc - 1, so just loop
    # through them and get the documents
    for ( my $i = 0; $i < $reader->max_doc; $i++ ) {
        # A KinoSearch1::Document::Doc object
        my $doc = $reader->fetch_doc($i);

        # this will depend on how your index has been set up
        my $title = $doc->get_value('title');

        # print an empty line if the field is missing, to avoid an undef warning
        print defined $title ? "$title\n" : "\n";
    }
}

print "Total documents: " . $r->max_doc . " in " . scalar(@readers) . " segments\n";
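The script above only prints each title to STDOUT. To get an actual flat file with several fields per document, one option is to write tab-separated lines. The following is a minimal sketch, not part of the original Gist: the field names `title` and `url`, the helper names `format_record` and `write_flat_file`, and the sample records are all assumptions for illustration. It uses stand-in data so that it runs without an index.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical field list; replace with the fields defined in your own index.
my @fields = qw( title url );

# Turn one document's field values into a single tab-separated line.
# Tabs, carriage returns, and newlines inside a value are collapsed to a
# space so that each document stays on exactly one line.
sub format_record {
    my ( $values, $names ) = @_;
    return join "\t", map {
        my $v = defined $values->{$_} ? $values->{$_} : '';
        $v =~ s/[\t\r\n]+/ /g;
        $v;
    } @$names;
}

# Write a header row plus one line per record (hash ref) to $path.
sub write_flat_file {
    my ( $path, $records, $names ) = @_;
    open my $fh, '>:encoding(UTF-8)', $path or die "Cannot open $path: $!";
    print {$fh} join( "\t", @$names ), "\n";
    print {$fh} format_record( $_, $names ), "\n" for @$records;
    close $fh or die "Cannot close $path: $!";
    return scalar @$records;
}

# Inside the fetch_doc loop, each record could be built like this:
#   my %record = map { $_ => $doc->get_value($_) } @fields;
#   push @records, \%record;
# Stand-in data (assumed, not from a real index):
my @records = (
    { title => "First\tpost", url => 'http://example.com/1' },
    { title => 'Second post' },
);

# Preview the lines on STDOUT.
print format_record( $_, \@fields ), "\n" for @records;
```

Calling `write_flat_file( 'dump.tsv', \@records, \@fields )` would then write the same lines, preceded by a header row, to `dump.tsv`. Collapsing embedded tabs and newlines loses a little information but keeps the file trivially parseable with `split /\t/`.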
Pretty self-explanatory. Let me know if you have any questions.