Get all documents from your KinoSearch1 index

I wanted a script to dump the entire contents of my KinoSearch1 index into a flat file. It turned out to be pretty easy with KinoSearch1::Index::IndexReader.

The following Gist shows how I was able to do this.

#!/usr/bin/perl
use strict;
use warnings;

use KinoSearch1::Index::IndexReader;

my $r = KinoSearch1::Index::IndexReader->new( invindex => '/path/to/index' );

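# get one or more readers: an index with several segments keeps one reader
# per segment in sub_readers, otherwise $r itself is the only reader
# (note that this peeks at the object's internal hash)
my @readers = ref $r->{sub_readers} eq 'ARRAY' ? @{ $r->{sub_readers} } : ($r);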

for my $reader (@readers) {
    print "Segment "
      . $reader->get_seg_name . " has "
      . $reader->max_doc
      . " docs\n";

    # documents in a segment are numbered from 0 to max_doc - 1, so just
    # loop through them and fetch each one
    for ( my $i = 0; $i < $reader->max_doc; $i++ ) {
        # fetch_doc returns a KinoSearch1::Document::Doc object
        my $doc = $reader->fetch_doc($i);

        # 'title' is a field in my index; use whichever fields yours defines
        my $title = $doc->get_value('title');
        print defined $title ? "$title\n" : "\n";
    }

}
print "Total documents: " . $r->max_doc . " in " . @readers . " segments\n";
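
The script above prints only the title. For a real flat-file dump you will probably want several fields per line, so here is a minimal sketch of one way to format them. The `flat_line` helper and the `title` and `url` field names are my own illustration, not part of KinoSearch1, and the hashes stand in for values you would collect with `$doc->get_value`.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of KinoSearch1): turn one document's field
# values into a single tab-separated line.
sub flat_line {
    my ( $values, @fields ) = @_;
    return join "\t", map {
        my $v = defined $values->{$_} ? $values->{$_} : '';
        $v =~ s/[\t\r\n]+/ /g;    # keep each document on exactly one line
        $v;
    } @fields;
}

# Stand-in data; in the dump script each value would come from
# $doc->get_value($field_name) for the fields your index defines.
my @fields = qw( title url );
my @docs   = (
    { title => "First\tpost", url => 'http://example.com/1' },
    { title => 'No URL stored' },
);

print flat_line( $_, @fields ), "\n" for @docs;
```

Missing values become empty columns and embedded tabs or newlines are collapsed to a space, so every document stays on one line. Redirect the script's output to a file and you have your flat file.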

The script is pretty self-explanatory, but let me know if you have any questions.