Tuesday, May 14, 2013

Strike That

Well, I've just been informed that I'm a liar, or misspoke if you prefer.

I didn't know when I made the previous post that we are actually still using 1,000 as our MaxRequestsPerChild setting. I guess I misunderstood what I was being told last week, and I certainly did not bother to check our configuration before making the previous blog post.

I apologize for any confusion this may have caused. However, the general observation that you may have to experiment with your Apache settings before you find the right combination for your server and its usage patterns still stands. You can't always expect the defaults to work right out of the box.

More Apache Fun

Another update to share with you what MVLC is doing with our Apache configuration for Evergreen.

Last week we found that, with MaxMemFree at 16 and MaxRequestsPerChild at 1,000, we were getting more texts from our monitoring software about the load being high on our Evergreen server. We thought this might have to do with more frequent turnover among the Apache child processes, so we adjusted MaxRequestsPerChild back up to 10,000. However, during overnight monitoring, we discovered that this made the situation worse, or at least put us back where we were before trying all of these changes.

In the end, we have set MaxRequestsPerChild to 5,000 while leaving MaxMemFree at 16. We've been running this configuration for several days, including over the weekend, and things seem to have really settled down on our server. You may have to experiment with the settings to find something that works for you if you think you are having this issue.

Tuesday, May 7, 2013

Update on the Apache Situation

Thomas just told me that he changed another Apache configuration variable that seems to have helped things. He set the MaxMemFree directive to 16 in our mpm_prefork configuration section. Like MaxRequestsPerChild, this setting helps keep memory in check: it caps, in kilobytes, how much free memory each Apache allocator may hold on to before releasing it with free().

I thought we'd share this in case anyone else is bumping their heads against memory issues with Apache.
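For anyone who wants to try the same thing, the two settings mentioned so far might look roughly like this in a prefork section. This is only a sketch: the IfModule wrapper and its placement are assumptions based on a typical Debian-style layout, not a copy of our actual file, and the other prefork directives are left out.

```apache
<IfModule mpm_prefork_module>
    # Recycle each child process after it has served this many requests.
    MaxRequestsPerChild 1000

    # Maximum free memory, in KB, an allocator may hold before calling free().
    MaxMemFree 16
</IfModule>
```

Apache needs a restart or graceful reload to pick up the change, and the right numbers will depend on your own server and usage.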

Saturday, May 4, 2013

What has been going on.

TL;DR: We've had trouble with the memory consumption of Apache processes on our Evergreen server since we did our latest update on April 14, 2013. Along the way to figuring this out we've had a few minor detours and fixed another bug. Our breakthrough came when one of us realized that the longer Apache processes ran, the more memory they used. We have made changes to our Apache configuration as a mitigation strategy. Basically, we have lowered our MaxRequestsPerChild from 10,000 down to 1,000. This appears to have helped, but only time will tell.
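One rough way to see that pattern for yourself is to list the Apache children with their age and resident memory side by side. The sketch below is not what we actually ran. It assumes a Linux system whose ps supports the etimes column and a process name of apache2 (it may be httpd elsewhere), and the function names are made up for illustration.

```python
import subprocess


def parse_ps(text):
    """Parse 'pid etimes rss' rows into (pid, elapsed_seconds, rss_kb) tuples."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 3:
            continue  # skip blank or malformed lines
        try:
            rows.append(tuple(int(f) for f in fields))
        except ValueError:
            continue  # skip rows that are not purely numeric
    return rows


def oldest_first(rows):
    """Sort rows so the longest-running processes come first."""
    return sorted(rows, key=lambda row: row[1], reverse=True)


def report(process_name="apache2"):
    """Print pid, age in seconds, and resident memory in MiB for each process."""
    # Assumption: procps-style ps; an empty header (col=) suppresses the header line.
    result = subprocess.run(
        ["ps", "-C", process_name, "-o", "pid=", "-o", "etimes=", "-o", "rss="],
        capture_output=True, text=True, check=False,
    )
    for pid, etimes, rss_kb in oldest_first(parse_ps(result.stdout)):
        print(f"{pid:>7} {etimes:>9}s {rss_kb / 1024:>9.1f} MiB")


if __name__ == "__main__":
    try:
        report()
    except FileNotFoundError:
        print("ps not found; this sketch expects a Linux system")
```

If the oldest children consistently sit at the top with the largest memory figures, that is the same growth pattern described above.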

Read on for the gory details....

Thursday, February 14, 2013

Another Backstage Authority Update

Looks like we haven't had much to say here at MVLC since October, but that's because we've been too busy doing lots of things to get around to making blog entries.

I just wanted to take a minute to let everyone know that the software for managing authority updates with Backstage Library Works got a little code update today. A command line option was added that allows you to just download the new files from Backstage's server. This is good if you want to download the files before the weekend and wait until next week to process them, or if you just don't feel like loading the authority and bibliographic updates right away.

As always, the code is here:

http://git.mvlcstaff.org/?p=jason/backstage.git;a=summary

Wednesday, October 3, 2012

Backstage Authority Update

This post is to announce some improvements to the software we use to import records from Backstage Library Works. The new features include the ability to run authority_control_fields.pl on updated bibs and a "rerun" option to allow you to run the software again in the event of a failure.

If you have found this software useful, then you might want to check out the latest changes with git and see what the improvements are.

Monday, September 17, 2012

Authority Control: After Action Report

The run of authority_control_fields.pl that I started at 7:00 pm on Saturday ran through Sunday and finished at 4:35 am this (Monday) morning. At 33 hours and 35 minutes, it took a little longer than I had hoped, but it finished well within the bounds of what I needed.

For those of you following along at home, there were some clean up issues this morning.

The output contained 1,699 lines about what appeared to be bib records that were missing subfield codes in various tags, mostly 400, 410 and 670. These lines were typically surrounded by messages about wide characters in warn.

I checked all of the reported bib records and the one thing that they all had in common was that they did not contain the datafield that was supposedly missing subfield entries. I mentioned this on IRC and Galen Charlton suggested that it could be bad authorities.

So, I modified my copy of authority_control_fields.pl to add print("$rec_id : $auth_id\n"); on or about line 461. This way it would print all of the bibliographic record ids and their matching authority record ids. I then wrote a script to take the list of bibs, run the modified authority_control_fields.pl on them, and capture the output to a file. This script ran each of the bad records individually using the --record parameter of authority_control_fields.pl. This run mysteriously produced no error output and all of the bibs now appear to be linked to authorities.

I then sorted the output of authority ids and uniquified the list. After checking the authorities by dumping their MARCXML to a file and going over it, none of them looked bad.
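The rerun and the sort-and-uniquify step could be sketched along these lines. This is not the script I actually used. The script path, the exact way --record takes its value, and the function names are all assumptions for illustration, and a real run would likely need whatever other options your installation requires.

```python
import re
import subprocess

# Matches the "rec_id : auth_id" lines produced by the added print statement.
PAIR = re.compile(r"^\s*(\d+)\s*:\s*(\d+)\s*$")


def authority_ids(output):
    """Return the sorted, de-duplicated authority ids found in the output."""
    ids = set()
    for line in output.splitlines():
        match = PAIR.match(line)
        if match:
            ids.add(int(match.group(2)))
    return sorted(ids)


def rerun(record_ids, script="./authority_control_fields.pl"):
    """Run the script once per bib record and return all captured output."""
    chunks = []
    for rec_id in record_ids:
        # Assumption: the record id is passed as a separate argument.
        result = subprocess.run(
            [script, "--record", str(rec_id)],
            capture_output=True, text=True, check=False,
        )
        chunks.append(result.stdout + result.stderr)
    return "".join(chunks)
```

Calling authority_ids(rerun(bib_ids)) would then give the list of authority ids to dump and inspect, ignoring any warning lines mixed into the output.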

Galen called this a "heisenbug" since the behavior seems to change as you observe it. However, I think the strange output may be due to some difference in the environment when I run jobs via at. I normally use the UTF-8 character set, and this may not be set in the environment when at runs a job.

The upshot of the above is, if you get errors when running your batched authority_control_fields.pl jobs, then run it again on the errored records. This may just fix those.
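If the environment really is the difference (which is only my suspicion), one way to rule it out is to set the locale explicitly for the job instead of relying on whatever at passes along. A minimal sketch, assuming the en_US.UTF-8 locale exists on the server; the helper name is hypothetical.

```python
import os


def with_utf8_locale(env=None, locale="en_US.UTF-8"):
    """Return a copy of the environment with LANG and LC_ALL set to locale."""
    merged = dict(os.environ if env is None else env)
    merged["LANG"] = locale
    merged["LC_ALL"] = locale  # LC_ALL overrides the other LC_* variables
    return merged
```

The result can be passed as the env argument of subprocess.run when launching the job, so an interactive run and a batched run see the same locale settings.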