Friday, April 1, 2016

pingest.pl Gets a New Option

If you use pingest.pl from MVLC's Evergreen utilities repository, you might be interested to know that it got a new option this week:

--pipe
         Read record IDs to reingest from standard input.
         This option conflicts with --start-id and/or --end-id.


This new option allows you to run a custom query to feed record ids to pingest.pl. For instance, assuming you have a query, saved in a text file called query.sql, that returns bibliographic record ids, you could use a command line like the following to ingest the records corresponding to the ids returned by the query:

    psql -q -t -f query.sql | pingest.pl --pipe

In the absence of the --pipe option, pingest.pl continues to use its internal query to determine what records to ingest.

If you are new here and don't know what all this record ingestion is about, "ingest" is Evergreen-speak for generating the indexes used for search, browse, facets, and record attributes. pingest.pl generates these indexes in parallel by splitting the records into batches and working on more than one batch at a time. Processing batches in parallel is usually faster than starting with the first record and going straight through to the end.
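
To make the idea of batching concrete, here is a minimal shell sketch. It is not how pingest.pl is implemented; it only illustrates the general pattern of reading ids from standard input, grouping them into batches, and running more than one batch at a time. The function name run_batches is hypothetical, echo is a placeholder for the real per-batch work, and the sketch assumes an xargs that supports -P (GNU and BSD xargs both do).

```shell
# Illustrative only: NOT pingest.pl's actual code.
# Read ids from standard input, group them into batches of $1 ids, and
# run up to $2 batches at the same time. `echo` stands in for the real
# work that would be done on each batch.
run_batches() {
    xargs -n "$1" -P "$2" echo "batch:"
}

# Seven ids, batches of three, two batches at a time.
# Because batches run concurrently, the output lines may appear in any order.
printf '%s\n' 1 2 3 4 5 6 7 | run_batches 3 2
```

With seven ids and a batch size of three, this produces three batches: two full ones and a final batch containing the single leftover id.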
