<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	>

<channel>
	<title>Project 2061 Techlog</title>
	<atom:link href="http://techlog.p2061.org/feed/" rel="self" type="application/rss+xml" />
	<link>http://techlog.p2061.org</link>
	<description>Blogging from Project 2061 technology group</description>
	<pubDate>Thu, 14 Aug 2008 21:38:19 +0000</pubDate>
	<generator>http://wordpress.org/?v=2.6.1</generator>
	<language>en</language>
			<item>
		<title>Playing nice with CPU usage</title>
		<link>http://techlog.p2061.org/2008/08/14/playing-nice-with-cpu-usage/</link>
		<comments>http://techlog.p2061.org/2008/08/14/playing-nice-with-cpu-usage/#comments</comments>
		<pubDate>Thu, 14 Aug 2008 21:32:18 +0000</pubDate>
		<dc:creator>BrianS</dc:creator>
		
		<category><![CDATA[PHP]]></category>

		<category><![CDATA[System Administration]]></category>

		<guid isPermaLink="false">http://techlog.p2061.org/?p=86</guid>
		<description><![CDATA[I needed to do a mass resampling of around 280,000 images. There are a number of ways to do this, but I settled on doing it via PHP because the images are stored on our web server, the total size of the images is large (~10GB), and I didn&#8217;t want to kill my machine trying [...]]]></description>
			<content:encoded><![CDATA[<p>I needed to do a mass resampling of around 280,000 images. There are a number of ways to do this, but I settled on doing it via PHP because the images are stored on our web server, the total size of the images is large (~10GB), and I didn&#8217;t want to kill my machine trying to get it done.</p>
<p>PHP is ideal for a task such as this: parsing directories and subdirectories for images is easy; resampling using the built-in library (GD) is a breeze; specifying the destination as a subdirectory is simple. The one minus was processor usage. Performing image manipulation eats up the CPU in a big way.</p>
<p>Luckily, Linux systems have a built-in utility for addressing a situation like this: <code>nice</code>. <code>nice</code> will &#8220;run a program with modified scheduling priority.&#8221; I&#8217;m running the image manipulation script using the following command:</p>
<blockquote><p><code>nice --adjustment=19 php script.php</code></p></blockquote>
<p>If nothing else is going on the script will use whatever resources are available. When anything with a higher priority executes, that program will take precedence over the script with regard to system resources. The script should thus not affect the responsiveness of the web server. This is the reason I was searching for this kind of functionality.</p>
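<p>For illustration, here is a minimal sketch of the kind of script that could sit behind <code>script.php</code>. The original script is not shown, so the function names, the <code>resampled</code> output subdirectory, the 800-pixel width limit, and the JPEG-only filter are all assumptions. It relies on PHP&#8217;s SPL directory iterators and the GD functions <code>imagecreatefromjpeg</code>, <code>imagecopyresampled</code>, and <code>imagejpeg</code>.</p>

```php
<?php
// Sketch only: names, sizes, and the output directory are assumptions.

/** Recursively collect JPEG paths under $root, skipping earlier output. */
function findJpegs(string $root, string $outDirName = 'resampled'): array
{
    $paths = [];
    $files = new RecursiveIteratorIterator(
        new RecursiveDirectoryIterator($root, FilesystemIterator::SKIP_DOTS)
    );
    foreach ($files as $file) {
        if (!$file->isFile()) {
            continue;
        }
        // Do not resample files we already wrote on a previous run.
        if (basename($file->getPath()) === $outDirName) {
            continue;
        }
        if (preg_match('/\.jpe?g$/i', $file->getFilename())) {
            $paths[] = $file->getPathname();
        }
    }
    sort($paths);
    return $paths;
}

/** Resample one JPEG into an output subdirectory beside the source file. */
function resampleJpeg(string $src, int $maxWidth = 800, string $outDirName = 'resampled'): bool
{
    $image = @imagecreatefromjpeg($src);
    if ($image === false) {
        return false; // unreadable or not really a JPEG
    }
    $width  = imagesx($image);
    $height = imagesy($image);
    $scale  = min(1, $maxWidth / $width); // never enlarge
    $newW   = max(1, (int) round($width * $scale));
    $newH   = max(1, (int) round($height * $scale));

    $copy = imagecreatetruecolor($newW, $newH);
    imagecopyresampled($copy, $image, 0, 0, 0, 0, $newW, $newH, $width, $height);

    $outDir = dirname($src) . '/' . $outDirName;
    if (!is_dir($outDir) && !mkdir($outDir, 0755, true)) {
        return false;
    }
    return imagejpeg($copy, $outDir . '/' . basename($src), 85);
}
```

<p>A driver would simply loop over <code>findJpegs($root)</code> and call <code>resampleJpeg()</code> on each path, launched under <code>nice</code> as shown above.</p>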
]]></content:encoded>
			<wfw:commentRss>http://techlog.p2061.org/2008/08/14/playing-nice-with-cpu-usage/feed/</wfw:commentRss>
		</item>
		<item>
		<title>Items Utility: Data Conversion Ready</title>
		<link>http://techlog.p2061.org/2008/07/25/items-utility-data-conversion-ready/</link>
		<comments>http://techlog.p2061.org/2008/07/25/items-utility-data-conversion-ready/#comments</comments>
		<pubDate>Fri, 25 Jul 2008 14:54:04 +0000</pubDate>
		<dc:creator>BrianS</dc:creator>
		
		<category><![CDATA[Adobe Illustrator]]></category>

		<category><![CDATA[MySQL]]></category>

		<category><![CDATA[Web Development]]></category>

		<guid isPermaLink="false">http://techlog.p2061.org/?p=70</guid>
		<description><![CDATA[After creating a new structure for the current year&#8217;s piloting data there has been a bit of disconnect in development between the two data sets. There&#8217;s just not enough time to ensure that everything works across both data sets. There were basically two choices for moving forward: 1) construct a view of the new data [...]]]></description>
			<content:encoded><![CDATA[<p>After creating a new structure for the current year&#8217;s piloting data there has been a bit of disconnect in development between the two data sets. There&#8217;s just not enough time to ensure that everything works across both data sets. There were basically two choices for moving forward: 1) construct a view of the new data structure that mimics the old data structure; 2) port the old data structure to the new one and continue redevelopment of the scripts. I chose the latter, mainly because there are some improvements I&#8217;d like to make to the interface in the process.</p>
<p>I created a series of <a href="http://techlog.p2061.org/wp-content/uploads/2008/08/convert-from-packet_item_records-to-packet_data.sql">SQL statements</a> to run in MySQL that will convert the data from the packet_item_records and miscon_packet_refs tables to the packet_students, packet_data, and miscon_packetdata_refs tables. After the conversion I updated any pages that referenced the old data structure. So far in testing the data seems to have converted perfectly.</p>
<p>Minus one issue. Multiple selections for the answer choice questions (A, B, C, D) from 2006 were recorded as a generic <em>Multiple</em> rather than <em>Y+NS</em>, <em>N+NS</em>, etc. This value is not represented in the updated data format or on the data entry forms or summary tables. Rather than spend too much time addressing this issue, I&#8217;m going to leave these values empty for now. I don&#8217;t expect this to be a problem since the researchers are focused on data for the current pilot and field tests. Also, I&#8217;m keeping the old version of the piloting data and the scripts that interact with it online in case they&#8217;re needed.</p>
]]></content:encoded>
			<wfw:commentRss>http://techlog.p2061.org/2008/07/25/items-utility-data-conversion-ready/feed/</wfw:commentRss>
		</item>
		<item>
		<title>Creating Graphs in PHP</title>
		<link>http://techlog.p2061.org/2008/06/19/creating-graphs-in-php/</link>
		<comments>http://techlog.p2061.org/2008/06/19/creating-graphs-in-php/#comments</comments>
		<pubDate>Thu, 19 Jun 2008 20:13:03 +0000</pubDate>
		<dc:creator>BrianS</dc:creator>
		
		<category><![CDATA[PHP]]></category>

		<category><![CDATA[Web Development]]></category>

		<guid isPermaLink="false">http://techlog.p2061.org/?p=68</guid>
		<description><![CDATA[I was tasked with creating a demonstration of a dynamically generated graph for one of the grant proposals being worked on at present. This isn&#8217;t something we&#8217;ve really done before in PHP, but I had a feeling it would not be a unique requirement. A quick search revealed that a PEAR package exists for just [...]]]></description>
			<content:encoded><![CDATA[<p>I was tasked with creating a demonstration of a dynamically generated graph for one of the grant proposals being worked on at present. This isn&#8217;t something we&#8217;ve really done before in PHP, but I had a feeling it would not be a unique requirement. A quick search revealed that a PEAR package exists for just this purpose: <a href="http://pear.veggerby.dk/">Image_Graph</a>.</p>
<p>Installation was not too difficult, though I did have to ensure that a few dependencies were installed before PEAR would load the package. <a href="http://pear.veggerby.dk/documentation/">Documentation</a> for the module is not that great, but there are a number of <a href="http://pear.veggerby.dk/samples/">samples</a> and a fairly active <a href="http://pear.veggerby.dk/forum/">discussion forum</a>.</p>
<p>I did not find the graphing object very intuitive. It seems to me that a lot of the properties and functions are abstracted from the objects they affect. I suspect this was a design decision to allow for greater flexibility, but it may also be due to an older coding style. The current branch appears to have been developed in earnest starting in 2005, and it&#8217;s been a few years since a new release. Even so, my initial exposure leads me to believe the package may be stable enough for even a production-level project so long as sufficient QA has been performed.</p>
<p>It took me only a couple of days to hack together the demonstration. Though the requirements of the script were few, I decided to push the limits of my object-oriented PHP and create the graphing component as a class. Overall it was a good first try at producing something of this nature. You can find the demonstrator at <a href="http://flora.p2061.org/weather_data/">http://flora.p2061.org/weather_data/</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://techlog.p2061.org/2008/06/19/creating-graphs-in-php/feed/</wfw:commentRss>
		</item>
		<item>
		<title>Items Utility: Piloting Data Entry Updates</title>
		<link>http://techlog.p2061.org/2008/06/18/items-utility-piloting-data-entry-updates/</link>
		<comments>http://techlog.p2061.org/2008/06/18/items-utility-piloting-data-entry-updates/#comments</comments>
		<pubDate>Wed, 18 Jun 2008 22:05:41 +0000</pubDate>
		<dc:creator>BrianS</dc:creator>
		
		<category><![CDATA[PHP]]></category>

		<category><![CDATA[Web Development]]></category>

		<guid isPermaLink="false">http://techlog.p2061.org/?p=67</guid>
		<description><![CDATA[Piloting packets for this year are starting to come in and the data from these packets needs to be entered into the items utility. Because the questions in the packet have changed from previous years, and due to the rigid structure of the piloting data entry form and data tables, I decided to rewrite the [...]]]></description>
			<content:encoded><![CDATA[<p>Piloting packets for this year are starting to come in and the data from these packets needs to be entered into the items utility. Because the questions in the packet have changed from previous years, and due to the rigid structure of the piloting data entry form and data tables, I decided to rewrite the data entry form. The new format allows more flexibility on the front-end without requiring modification of the back-end. In addition I modified the way PHP interacts with the packet data from MySQL by utilizing a structured array to hold the data.</p>
<p><span id="more-67"></span></p>
<p><strong>Updating the interface<br />
</strong></p>
<p>First up is the form organization. Previously the forms were designed, I believe, with the researchers in mind. The most important data entry sections were placed at the top of the page with everything else following. Once we started using temps to enter the data, however, this format proved to be a hindrance to efficiency. So to improve efficiency I regrouped the form into logical sections: student responses, student demographics, and researcher notes. Plus, to allow for easier scanning I organized the student responses section so that it resembles the paper form.</p>
<p>In addition to organizing the form more logically, I wanted to make portions of the form collapsible. Since a lot of the data entry is performed by temps there is no reason for them to have to wade through the research notes, which are at the top of the page by request of the researchers. Brian W. has commented on the utility of Dreamweaver&#8217;s Spry widgets so I thought I&#8217;d give them a try. I must admit that they work quite nicely and are easy to implement, and I found just what I needed in the Collapsible Panel widget. The benefit of using the panels is that I could position the researcher notes at the top of the page without significantly hindering the ability of the temps to access the student responses by having this panel closed by default.</p>
<p>Which brings me to a minor change I made to the functionality of the panels: having the appropriate panels open when the data entry form is loaded (e.g. student responses always open; researcher notes closed unless otherwise requested by the user; demographics only open if no data has been entered). I used a combination of PHP and JavaScript to set the initial state. Unfortunately this seems to have caused some problems for Dreamweaver. When you open the packet data entry form in Dreamweaver the program appears to freeze. It appears Dreamweaver is attempting to process the Spry widgets and choking on the PHP-derived JavaScript variables. Eventually Dreamweaver informs you that &#8220;A script in file &#8230; has been running for a long time. Do you want to continue?&#8221; Answering no to this will return Dreamweaver to normal operation. I could get around this problem by setting the state of the panels after they have been initialized, but this produces a disorienting collapse of panels after the page has loaded.</p>
<p>The final interface modification was a small script that makes it possible to deselect radio buttons. The main purpose of developing this feature was to streamline the form which, in previous revisions, had an extra option to clear the selected response.</p>
<p><strong>Improving the foundation<br />
</strong></p>
<p>As I mentioned, while the form is still hard-coded (each question coded by hand) I did lay out the foundation for a more dynamic approach. First I redeveloped the table that stores the data related to a packet. I reused the packets and packet_items tables, but created a new table called packet_data that stores only the question number, answer, and any comments. The benefit here is that forms with varying numbers of questions can be stored in this one table. I also moved the student meta data into a separate table, a design decision more in line with database normalization practice.</p>
<p>Another modification I made was to use a structured array to define the questions in the packet. While this has not been fully developed as of yet, I have used it to significantly simplify the script that saves the data. Rather than specify each question and corresponding HTML form element and writing the query to insert that information in the database I designed the PHP to loop through the array and generate the SQL code on the fly.</p>
<p>Finally, I used another structured array to hold data related to the packet, packet items, and students while generating the packet data entry page. I do not utilize any methods to automate the form generation at present, though there is little to prevent this from being developed. One reason I decided to code the form by hand is to allow for a level of customization that is difficult to accommodate when automating the visual side of things. Even so, utilizing structured arrays allows for more consistency between PHP pages.</p>
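<p>To make the array-driven save step concrete, here is a rough sketch of how a question definition array can drive the SQL generation. Only the <code>packet_data</code> table name comes from the actual utility; the form field names, the column names, and the function names are assumptions, and the sketch builds a prepared-statement query with placeholders rather than reproducing whatever quoting the real script uses.</p>

```php
<?php
// Sketch only: the form field names and packet_data column names are assumptions.

// One entry per question; adding a question means adding a row here.
$questions = [
    ['number' => 1, 'answer_field' => 'q1_answer', 'comment_field' => 'q1_comment'],
    ['number' => 2, 'answer_field' => 'q2_answer', 'comment_field' => 'q2_comment'],
    ['number' => 3, 'answer_field' => 'q3_answer', 'comment_field' => 'q3_comment'],
];

/** Return a submitted value, or null when it is missing or blank. */
function fieldValue(array $form, string $name): ?string
{
    $value = trim((string) ($form[$name] ?? ''));
    return $value === '' ? null : $value;
}

/**
 * Build one multi-row INSERT for packet_data from the question definitions.
 * Returns [sql, params] for a prepared statement, or [null, []] if nothing was entered.
 */
function buildPacketDataInsert(int $studentId, array $questions, array $form): array
{
    $rows = [];
    $params = [];
    foreach ($questions as $question) {
        $answer  = fieldValue($form, $question['answer_field']);
        $comment = fieldValue($form, $question['comment_field']);
        if ($answer === null && $comment === null) {
            continue; // nothing entered for this question
        }
        $rows[] = '(?, ?, ?, ?)';
        array_push($params, $studentId, $question['number'], $answer, $comment);
    }
    if ($rows === []) {
        return [null, []];
    }
    $sql = 'INSERT INTO packet_data (student_id, question_number, answer, comments) VALUES '
        . implode(', ', $rows);
    return [$sql, $params];
}
```

<p>Because the loop never names a specific question, a packet with a different number of questions only needs a different <code>$questions</code> array.</p>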
<p><strong>Moving forward</strong></p>
<p>Data from years prior has not yet been ported into the new structure. I do not expect this task to require too much work, but I am not sure when I&#8217;ll be able to perform the task. At the very least this will be part of the redevelopment taking place around subversion if not sooner.</p>
<p>To fully accommodate the data in the utility I&#8217;ll also need to update the summary tables, the school report generator, and the packet data export. These updates need to be completed in time to support the item review, which will begin fairly soon.</p>
]]></content:encoded>
			<wfw:commentRss>http://techlog.p2061.org/2008/06/18/items-utility-piloting-data-entry-updates/feed/</wfw:commentRss>
		</item>
		<item>
		<title>Subversion Repository for Items</title>
		<link>http://techlog.p2061.org/2008/06/17/subversion-repository-for-items/</link>
		<comments>http://techlog.p2061.org/2008/06/17/subversion-repository-for-items/#comments</comments>
		<pubDate>Tue, 17 Jun 2008 21:30:15 +0000</pubDate>
		<dc:creator>BrianS</dc:creator>
		
		<category><![CDATA[Web Development]]></category>

		<guid isPermaLink="false">http://techlog.p2061.org/?p=69</guid>
		<description><![CDATA[I&#8217;m currently cleaning up the items utility so that it can be brought into subversion. I&#8217;ve already deleted unused files and done some reorganization of the site files. Next up I&#8217;m going to work on some interface modifications that I hope will simplify it as well as create some common navigational elements across pages. [...]]]></description>
			<content:encoded><![CDATA[<p>I&#8217;m currently cleaning up the items utility so that it can be brought into subversion. I&#8217;ve already deleted unused files and done some reorganization of the site files. Next up I&#8217;m going to work on some interface modifications that I hope will simplify it as well as create some common navigational elements across pages. The final step will involve unifying the code in a control structure similar to other recent projects.</p>
<p>Because this is more of a long-term project, it is being performed alongside any current development of the items utility. As a result, any modifications to the items utility should be brought to my attention so that they can be ported to the SVN working copy if necessary.</p>
]]></content:encoded>
			<wfw:commentRss>http://techlog.p2061.org/2008/06/17/subversion-repository-for-items/feed/</wfw:commentRss>
		</item>
		<item>
		<title>Using Subversion for Application Development and Deployment</title>
		<link>http://techlog.p2061.org/2008/05/21/using-svn-for-application-deployment/</link>
		<comments>http://techlog.p2061.org/2008/05/21/using-svn-for-application-deployment/#comments</comments>
		<pubDate>Wed, 21 May 2008 20:25:00 +0000</pubDate>
		<dc:creator>BrianS</dc:creator>
		
		<category><![CDATA[PHP]]></category>

		<category><![CDATA[System Administration]]></category>

		<guid isPermaLink="false">http://techlog.p2061.org/?p=65</guid>
		<description><![CDATA[I recently wanted to update our install of WordPress to the latest version. WordPress is a fairly easy install, and we could learn a thing or two about application set-up by examining their code. But I recently switched to using subversion to deploy and maintain our install. In just the little bit I&#8217;ve used subversion [...]]]></description>
			<content:encoded><![CDATA[<p>I recently wanted to update our install of WordPress to the latest version. WordPress is a fairly easy install, and we could learn a thing or two about application set-up by examining their code. But I recently switched to using subversion to deploy and maintain our install. In just the little bit I&#8217;ve used subversion so far, I believe development and deployment of our internal applications would be simplified by employing it for all our projects. Here&#8217;s a quick outline of the process, with examples based on my WordPress deployment.</p>
<p><span id="more-65"></span></p>
<p><strong>Workflow</strong></p>
<p>Since we&#8217;re not using an IDE that has integrated subversion capability, and because I wanted to keep the process as simple as possible, I decided to try a combined approach to development. A working copy of the project files is checked out from subversion to the development server. Dreamweaver uses the working copy for its remote. Code development will continue as usual in Dreamweaver. When a development milestone is reached the changes are committed to subversion. When the project is ready for deployment, we use tagging to mark a stable copy in the repository and check it out to the production server.</p>
<p>The main problem I&#8217;ve found with this setup right now is that there is no way to hide the subversion structure from Dreamweaver. Subversion creates a directory to house control files for each directory in the working copy. The subversion control directories can be cloaked, allowing Dreamweaver to function as if these directories don&#8217;t exist, but this is not as feasible for a project with lots of directories.</p>
<p>I found an extension that can automatically cloak any subversion directories, but it only works for the local copy. So if you&#8217;re willing to go through the process of downloading the entire site you can use the extension to cloak all subversion directories automatically. Read about the extension on <a href="http://www.adobe.com/cfusion/exchange/index.cfm?event=extensionDetail&amp;extid=1018603">Adobe Exchange</a>; I&#8217;ve <a href="http://techlog.p2061.org/wp-content/uploads/2008/05/cloakscm3.mxp">saved a copy</a> of the extension on the techlog.</p>
<p>I have set up subversion to ignore the typical Dreamweaver development files (lock files, notes, backups). This lets us use the Dreamweaver environment to its fullest without mucking up the subversion repository with unnecessary data.</p>
<p>We should also implement configuration options in a single file that is not part of the project files. We would still want to provide a sample configuration file, but the actual configuration file should be created during initial deployment by copying the sample and modifying the parameters. Using a sample file instead of actual configuration in the repository will allow us to upgrade the production server without having to worry about overwriting local settings. Overwriting the configuration is actually not much of a concern so long as the configuration file is not updated in the repository. But implementing this kind of setup does give us a little more security in that regard.</p>
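<p>As a rough sketch of that arrangement, the application could load its settings like this. The file names <code>config.php</code> and <code>config.sample.php</code>, the setting keys, and the <code>loadConfig</code> function are all assumptions for illustration; the idea is simply that the sample lives in the repository while the real file exists only on the server.</p>

```php
<?php
// Sketch only: the file names and setting keys are assumptions.

/**
 * Load settings from config.php, using config.sample.php for defaults.
 * Both files are expected to "return" a PHP array.
 */
function loadConfig(string $dir): array
{
    $actual = $dir . '/config.php';
    $sample = $dir . '/config.sample.php';

    if (!is_file($actual)) {
        throw new RuntimeException(
            "Missing $actual: copy $sample to config.php and edit the settings."
        );
    }

    $defaults = is_file($sample) ? require $sample : [];
    $local = require $actual;

    // Local values win; new keys added to the sample still get a default.
    return array_merge((array) $defaults, (array) $local);
}
```

<p>Merging the sample underneath the local file means an upgrade that introduces a new setting does not break a server whose <code>config.php</code> predates it.</p>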
<p><strong>Subversion setup</strong></p>
<p>First step in the process is to create a subversion repository. We&#8217;ll create this on the production server because we can see the production server from the development server, but not vice versa. To set up the subversion repository in /inet/svn, run the following command on the production server:</p>
<blockquote><p><code>svnadmin create /inet/svn</code></p></blockquote>
<p>Next we&#8217;ll populate the repository with a project. Since we&#8217;re just getting started with subversion we&#8217;ll utilize a well-known repository layout: project, project/branches, project/tags, project/trunk. Create the directory structure on the development server and place the current development files in project/trunk. There are a number of ways to access a subversion server, but for now we&#8217;ll use file-based access over ssh. If I create the project directory in /tmp, I would run the following command to import the project:</p>
<blockquote><p><code>svn import /tmp/benchmarks svn+ssh://production.server/inet/svn/benchmarks</code></p></blockquote>
<p><strong>Working with project files</strong></p>
<p>Now that we have the data in the repository we need to set up a working copy for Dreamweaver. First change to the directory where you want to store the working copy. Then run the following command to check out a working copy:</p>
<blockquote><p><code>svn checkout svn+ssh://production.server/inet/svn/benchmarks/trunk .</code></p></blockquote>
<p>Set up this directory as your remote in Dreamweaver and edit as usual.</p>
<p>At some point in the future you&#8217;ll want to record the changes in the subversion repository. This can be at any time in the development process, but at the very least it&#8217;ll need to be done when a project has reached a stable state and is ready for deployment. Run the following command from the working copy directory:</p>
<blockquote><p><code>svn commit</code></p></blockquote>
<p>Once the project has reached a stable state we need to have a way of maintaining that stable copy in the face of future modifications. This is where the branches/tags/trunk structure really comes in handy. The in-development files are always in /trunk. To create a stable release we merely need to create a copy in the /tags directory (where ver.no is the version number, such as 1.0):</p>
<blockquote><p><code>svn copy svn+ssh://production.server/inet/svn/benchmarks/trunk svn+ssh://production.server/inet/svn/benchmarks/tags/ver.no</code></p></blockquote>
<p><strong>Deploy the project</strong></p>
<p>Whether we&#8217;re talking about a home-brewed application or one from an external organization, the steps for deployment are the same. Check out a working copy or export a clean copy to the appropriate directory on the production server. I prefer to check out a working copy because it enables greater flexibility in deployment. If we need to modify a file in the production environment, a working copy enables non-destructive upgrades in the future. Only the files that have been modified in the repository will be updated during a version upgrade. Any conflicting updates will be noted by the subversion client so that they can be addressed. (Of course, this is only really a benefit if significant modifications have not been made to the project between versions.)</p>
<p>Updating a working copy to a new version is as simple as the following command (executed from the working copy directory):</p>
<blockquote><p><code>svn switch repository_address .</code></p></blockquote>
<p><strong>Belt and suspenders</strong></p>
<p>To avoid loss of our subversion repository we should consider creating a copy on our development server. This would only be used to restore the subversion repository in case of loss or corruption. This can be done using the <code>svnsync</code> command and should be done automatically (via <code>cron</code>). This may not be a necessity as much as a convenience, however, since the repository should be included in the nightly backup.</p>
<p><strong>General Reading</strong></p>
<p>What I attempted to provide here were the core commands needed to get started. There&#8217;s a lot more to working with subversion than what I covered, and I would recommend the following reading.</p>
<ul>
<li><a href="http://svnbook.red-bean.com/">Version Control with Subversion</a> (all you need to know)</li>
<li><a href="http://wiki.peopleaggregator.org/Tracking_PeopleAggregator_releases_with_Subversion_vendor_branches">Tracking PeopleAggregator releases with Subversion vendor branches</a></li>
<li><a href="http://lookfirst.com/2007/11/subversion-vendor-branches-howto.html">Subversion Vendor Branches Howto</a></li>
<li><a href="http://burtonini.com/blog/computers/svn-vendor-2005-05-04-13-55">Vendor Branches in Subversion</a></li>
</ul>
<p>I found the vendor branch concept a bit confusing, so I included a few extra links on that topic.</p>
]]></content:encoded>
			<wfw:commentRss>http://techlog.p2061.org/2008/05/21/using-svn-for-application-deployment/feed/</wfw:commentRss>
		</item>
		<item>
		<title>Expanding Search Terms for More Inclusive Results</title>
		<link>http://techlog.p2061.org/2008/05/14/expanding-search-terms-for-more-inclusive-results/</link>
		<comments>http://techlog.p2061.org/2008/05/14/expanding-search-terms-for-more-inclusive-results/#comments</comments>
		<pubDate>Wed, 14 May 2008 16:54:11 +0000</pubDate>
		<dc:creator>BrianS</dc:creator>
		
		<category><![CDATA[MySQL]]></category>

		<category><![CDATA[PHP]]></category>

		<category><![CDATA[Web Development]]></category>

		<guid isPermaLink="false">http://techlog.p2061.org/2008/05/14/expanding-search-terms-for-more-inclusive-results/</guid>
		<description><![CDATA[While working on the Benchmarks search I wanted to try and provide a feature I find useful on Google and other search engines: word form expansion (lemmatisation). A little research showed me that this would require more work than we really should be spending on search functionality. Especially considering that the built-in MySQL full [...]]]></description>
			<content:encoded><![CDATA[<p>While working on the Benchmarks search I wanted to try and provide a feature I find useful on Google and other search engines: word form expansion (<a href="http://en.wikipedia.org/wiki/Lemmatisation">lemmatisation</a>). A little research showed me that this would require more work than we really should be spending on search functionality, especially considering that the built-in MySQL full text search capability is sufficient for our needs. So I decided to focus on a feature that would still provide value but require little time: <a href="http://en.wikipedia.org/wiki/Word_stem">word stem</a> expansion.</p>
<p><span id="more-64"></span></p>
<p><strong>Purpose and Need</strong></p>
<p>As I mentioned, we&#8217;re using the <a href="http://dev.mysql.com/doc/refman/5.0/en/fulltext-search.html">full text search</a> capability of MySQL as the core of our search functionality. By default, MySQL performs a natural language full text search. This mode has a few restrictions that can affect the results that are returned, but most notable for our purposes is that a word provided by the user must match exactly a word in the record. What this means is that records with a variation of the word (such as a plural when a singular was provided) will not be returned. For example, if you search for &#8220;atom,&#8221; MySQL will not return records that have &#8220;atoms&#8221; or &#8220;atomic&#8221; but not &#8220;atom.&#8221;</p>
<p>I&#8217;ve found the easiest way to address this limitation is by performing the search using the keywords as typed, but then adding on results that have matches to the stem of the keywords. By using these two full text search methods in conjunction we can get the best results for our user.</p>
<p><strong>Finding the Stem</strong></p>
<p>The hardest part of this whole process was determining a way to find the stem for a word. Initially I developed a simple function that stripped plurality from a word. It was simplistic, but provided a good sample of how effective stemming could be. A little more research led me to a PHP extension that uses an established algorithm to determine a word stem. The extension is available from the PECL PHP library and is called &#8220;<a href="http://pecl.php.net/package/stem">stem</a>.&#8221; Installation is simple. On *nix you first install the extension:</p>
<blockquote><p><code>sudo pecl install stem</code></p></blockquote>
<p>Next, update the php.ini file to enable the extension by adding the following line:</p>
<blockquote><p><code>extension=stem.so</code></p></blockquote>
<p>Finally, restart the web server for the new setting to take effect:</p>
<blockquote><p><code>sudo /etc/init.d/apachectl graceful</code></p></blockquote>
<p>On Windows you first download the dll from <a href="http://pecl4win.php.net/ext.php/php_stem.dll">pecl4win.php.net</a> (PECL installation actually involves compiling the extension binary which is not quite as straightforward on Windows). Place the dll in your php/ext directory. Next, update the php.ini file to enable the extension by adding the following line:</p>
<blockquote><p><code>extension=stem.dll</code></p></blockquote>
<p>Finally, restart IIS for the settings to take effect.</p>
<p><strong>How to Use</strong></p>
<p>Performing a search with stem expansion using MySQL&#8217;s full text search functionality is actually quite easy. First we need to break the search terms up so that we can stem them. To do this we need to isolate each word in the search phrase. We can do this by splitting the search phrase into an array. It&#8217;s impossible to know exactly what the user will type, so we can be greedy and use something like:</p>
<blockquote><p><code>$araKeys = preg_split('/\W+/', $strSearch, -1, PREG_SPLIT_NO_EMPTY);</code></p></blockquote>
<p>Then we need to modify each keyword so that we have its stem:</p>
<blockquote><p><code>$araKeys = array_map('stem', $araKeys );</code></p></blockquote>
<p>Recombine the keywords into a new boolean search phrase:</p>
<blockquote><p><code>$strSearchBool = join('* ', $araKeys) . '*';</code></p></blockquote>
<p>Finally, we combine the original natural language search with a boolean mode search:</p>
<blockquote><p><code>MATCH(fulltext_index_columns) AGAINST ('$strSearch') OR MATCH(fulltext_index_columns) AGAINST ('$strSearchBool' IN BOOLEAN MODE)</code></p></blockquote>
<p>Order the results by the sum of the relevance values for the full text searches and you&#8217;re done:</p>
<blockquote><p><code>ORDER BY (MATCH(fulltext_index_columns) AGAINST ('$strSearch')) + (MATCH(fulltext_index_columns) AGAINST ('$strSearchBool' IN BOOLEAN MODE)) DESC</code></p></blockquote>
<p>The two combined provide a good sense of how well a result matches. Records that match the entered search terms exactly will bubble to the top of the list since they match on both the natural language and boolean searches. Records that match only on the stem-based boolean search will come next, and the more words they match the higher in this secondary list they will be. These stem-based matches would not be returned by the natural language search since the exact search terms were not present in the records.</p>
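<p>Pulled together, the PHP side of the steps above might look something like the following sketch. The function names are hypothetical, and <code>stripPlural()</code> is only a crude stand-in (in the spirit of my first plural-stripping attempt) so the code runs where the extension is not installed. It also assumes the extension&#8217;s <code>stem()</code> function can be called with just the word, as in the <code>array_map</code> call earlier.</p>

```php
<?php
// Sketch only: stripPlural() is a stand-in for the PECL extension's stemmer.

/** Very rough fallback stemmer: drops a trailing "s" from longer words. */
function stripPlural(string $word): string
{
    if (strlen($word) > 3 && substr($word, -1) === 's' && substr($word, -2) !== 'ss') {
        return substr($word, 0, -1);
    }
    return $word;
}

/** Turn a user's search phrase into a boolean-mode phrase of wildcarded stems. */
function buildBooleanSearch(string $strSearch, callable $stemmer): string
{
    // Split on runs of non-word characters and drop empty pieces.
    $araKeys = preg_split('/\W+/', $strSearch, -1, PREG_SPLIT_NO_EMPTY);
    // Stem each keyword; duplicates add nothing to the boolean search.
    $araKeys = array_unique(array_map($stemmer, $araKeys));
    if ($araKeys === []) {
        return '';
    }
    return implode('* ', $araKeys) . '*';
}

// Prefer the extension's stem() when it is loaded.
$stemmer = function_exists('stem') ? 'stem' : 'stripPlural';
$strSearchBool = buildBooleanSearch('atoms molecules', $stemmer);
```

<p>Passing the stemmer in as a callable keeps the phrase-building logic independent of whether the extension is available, which also makes it easy to swap in a different stemming function later.</p>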
<p><strong>Future Development</strong></p>
<p>MySQL has stated a desire to enhance full text searching with stem and proximity information. The implementation of these features will negate the need for this to be done via the hack described above. At that point this extra code should be removed, though I doubt it will cause problems with the search results if left in place.</p>
]]></content:encoded>
			<wfw:commentRss>http://techlog.p2061.org/2008/05/14/expanding-search-terms-for-more-inclusive-results/feed/</wfw:commentRss>
		</item>
		<item>
		<title>Items Utility: School Reports Update</title>
		<link>http://techlog.p2061.org/2008/05/13/items-utility-school-reports-update/</link>
		<comments>http://techlog.p2061.org/2008/05/13/items-utility-school-reports-update/#comments</comments>
		<pubDate>Tue, 13 May 2008 21:09:51 +0000</pubDate>
		<dc:creator>BrianS</dc:creator>
		
		<category><![CDATA[PHP]]></category>

		<category><![CDATA[Web Development]]></category>

		<category><![CDATA[assessment]]></category>

		<category><![CDATA[items utility]]></category>

		<guid isPermaLink="false">http://techlog.p2061.org/2008/05/13/items-utility-school-reports-update/</guid>
		<description><![CDATA[I made a few updates to the school reports at the request of the researchers.

Added a save feature so that packet/item selections for a report can be retrieved at a later date.
Added the ability to set thresholds that must be met before the script will include statistics in the report.
Updated tables to reflect changes in [...]]]></description>
			<content:encoded><![CDATA[<p>I made a few updates to the school reports at the request of the researchers.</p>
<ul>
<li>Added a save feature so that packet/item selections for a report can be retrieved at a later date.</li>
<li>Added the ability to set thresholds that must be met before the script will include statistics in the report.</li>
<li>Updated tables to reflect changes in displayed information and formatting.</li>
</ul>
<p>Of the above changes, the save feature is of particular note for its inscrutability. While the technique used to save/retrieve the report settings is adequate, the code that enables this functionality is not well implemented. Unfortunately, time constraints required a fast, rather than best, implementation.</p>
<p>As modifications are requested, this script is getting to be a little harder to work with. The code was created over the course of a week or two back in January. I was able to make quick work of it by using some concepts I initially worked out for summary table generation (and based on discussions with Brian W). While the data is stored in a pseudo object-oriented format, the script itself is fairly linear in design. If more modifications are requested the script may need some rewrites to enable a bit more flexibility.</p>
]]></content:encoded>
			<wfw:commentRss>http://techlog.p2061.org/2008/05/13/items-utility-school-reports-update/feed/</wfw:commentRss>
		</item>
		<item>
		<title>Optimizing MySQL Server Runtime Parameters</title>
		<link>http://techlog.p2061.org/2008/05/07/optimizing-mysql-server-runtime-parameters/</link>
		<comments>http://techlog.p2061.org/2008/05/07/optimizing-mysql-server-runtime-parameters/#comments</comments>
		<pubDate>Wed, 07 May 2008 18:55:25 +0000</pubDate>
		<dc:creator>BrianS</dc:creator>
		
		<category><![CDATA[MySQL]]></category>

		<category><![CDATA[System Administration]]></category>

		<guid isPermaLink="false">http://techlog.p2061.org/2008/05/07/optimizing-mysql-server-runtime-parameters/</guid>
		<description><![CDATA[Since we&#8217;ll be exposing MySQL to significantly more traffic (due mainly to the transition to a database-driven version of Benchmarks Online [dbBOL]) I decided to spend some time optimizing the server&#8217;s settings. There are a number of settings that can be tweaked to improve performance. I based my decisions on the information available from the [...]]]></description>
			<content:encoded><![CDATA[<p>Since we&#8217;ll be exposing MySQL to significantly more traffic (due mainly to the transition to a database-driven version of <cite>Benchmarks Online</cite> [dbBOL]) I decided to spend some time optimizing the server&#8217;s settings. There are a number of settings that can be tweaked to improve performance. I based my decisions on the information available from the references cited and the performance statistics reported by MySQL (run <code>SHOW GLOBAL STATUS</code> or use phpMyAdmin). MySQL has been running for 131 days as of the writing of this post (<a href="http://techlog.p2061.org/nonwp-files/server_status_080317.htm">see cached copy</a> of the runtime stats), so I expect the data will be a fairly good indication of the performance of MySQL under its current usage. Unfortunately, I expect the usage pattern to change significantly once dbBOL is released. As a result, some of the settings used will be based on expected usage patterns. At specific intervals after dbBOL is released we should examine the performance of MySQL based on the runtime stats to determine if additional tweaking needs to be performed. I recommend the following schedule: 1 week, 1 month, 3 months, then every 6 months.</p>
<p><span id="more-51"></span></p>
<p><strong>MyISAM Recovery</strong></p>
<p>MyISAM is a nice format for speed and has support for functionality not available in other MySQL storage engines (such as full text search). Unfortunately MyISAM is not nearly as robust as InnoDB. Since the data files are dealt with directly, sans transactions, a system crash can cause table corruption and loss of data (particularly if an <code>INSERT</code>/<code>UPDATE</code> operation was in progress). To ensure that the tables have not been corrupted at any time we can set <code>myisam-recover=BACKUP,FORCE</code>. This will tell MySQL to check a MyISAM table when it is opened, repair it if necessary, and make a backup of the table.</p>
<p>There are some drawbacks with this setting. First, if a row is corrupted the data from that row could be lost. That&#8217;s why we use the <code>BACKUP</code> option. Also, there can be a performance hit due to recovery operations, particularly if a large number of tables have to be repaired simultaneously, not to mention that the recovery check is done every time a table is opened.</p>
<p>Another method of checking the MyISAM tables we should consider is a cron job that checks the tables outside of MySQL. This would give us the benefit of automated repair (or at least notification) while mitigating possible performance bottlenecks.</p>
<p>References:</p>
<ul>
<li><a href="http://dev.mysql.com/doc/refman/5.0/en/server-options.html#option_mysqld_myisam-recover">MySQL Reference Manual: 5.1.2. Command Options</a></li>
<li><a href="http://dev.mysql.com/doc/refman/5.0/en/server-system-variables.html#option_mysqld_tmp_table_size">MySQL Reference Manual: 5.1.3. System Variables</a></li>
<li><a href="http://www.mysqlperformanceblog.com/2006/06/17/using-myisam-in-production/">MySQL Performance Blog: Using MyISAM in production</a></li>
<li><a href="http://www.mysqlperformanceblog.com/2006/07/30/mysql-crash-recovery/">MySQL Performance Blog: MySQL Crash Recovery</a></li>
</ul>
<p><strong>Thread Cache</strong></p>
<p>MySQL assigns a thread to each connection made by a client. Thread creation/destruction can be a relatively expensive process, but MySQL gets around this by implementing thread caching. Thread caching allows MySQL to reuse a thread once a connection is finished with it. When a connection closes, its thread is returned to the cache rather than destroyed, unless the cache already holds <code>thread_cache_size</code> threads. The thread cache does not have to be large enough to handle all simultaneous connections, particularly since maintaining a thread uses up system resources. The cache should be large enough, however, that the number of threads created stays small. To judge this, check the <code>Threads_created</code> server status variable.</p>
<p>We currently do not have thread caching enabled. Our <code>Threads_created</code> value is at 60,000 (~460 per day), which is extremely high. Unless you set up persistent connections to MySQL using the <code>mysql_pconnect()</code> function, PHP will open/close a connection each time a web page is loaded. I&#8217;m going to go with a value of 20. Assuming maximum concurrent connections could reach 25, this provides plenty of cached threads for average usage. If the number of threads created does not budge past the cache limit we should consider lowering the value somewhat to free up resources.</p>
<p>For a hint of how important thread caching is, see <a href="http://jeremy.zawodny.com/blog/archives/000173.html">MySQL, Linux, and Thread Caching</a> and <a href="http://www.epigroove.com/posts/63/optimize_mysql_the_thread_cache">Optimize MySQL: The Thread Cache</a>.</p>
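<p>The arithmetic behind those figures can be sketched in PHP. The function names are hypothetical, and the connection count in the second example is an assumed figure rather than one from our server; <code>Threads_created</code>, <code>Connections</code>, and <code>Uptime</code> are the status variables the inputs would come from.</p>
<pre><code>&lt;?php
// Hypothetical helpers; inputs are values copied from SHOW GLOBAL STATUS.

// Average number of threads created per day of uptime.
function threadsCreatedPerDay($intThreadsCreated, $intUptimeSeconds)
{
    return $intThreadsCreated / ($intUptimeSeconds / 86400);
}

// Fraction of connections that needed a brand new thread (lower is better).
function threadCacheMissRate($intThreadsCreated, $intConnections)
{
    if ($intConnections == 0) {
        return 0.0;
    }
    return $intThreadsCreated / $intConnections;
}

// 60,000 threads over 131 days of uptime, as quoted above.
echo round(threadsCreatedPerDay(60000, 131 * 86400)), "\n"; // 458

// Without a thread cache roughly every connection creates a thread, so the
// miss rate sits near 1. The connection count here is assumed for illustration.
echo threadCacheMissRate(60000, 60000), "\n"; // 1
</code></pre>
<p>Once the cache is enabled, a miss rate that falls well below 1 would suggest the cache is doing its job.</p>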
<p>References:</p>
<ul>
<li><a href="http://dev.mysql.com/doc/refman/5.0/en/connection-threads.html">MySQL Reference Manual: 7.5.7. How MySQL Uses Threads for Client Connections</a></li>
<li><a href="http://www.mysqlperformanceblog.com/2006/09/29/what-to-tune-in-mysql-server-after-installation/">MySQL Performance Blog: What to tune in MySQL Server after installation</a></li>
</ul>
<p><strong>Table Cache</strong></p>
<p>Opening a table can be a slow process as data descriptors are created and file headers are modified. To mitigate this MySQL uses a file-based table cache that maintains a table in an open state for future connections. A unique entry in the table cache is required for each concurrent access to a table (multiple users or multiple accesses by a single user in a query). Every time a table is opened an unused table cache entry for that table is sought. If none is found a new one is created. Once the table cache reaches the limit specified by the <code>table_cache</code> variable MySQL has to close old cached entries before opening new ones, adding even more time to table access. MySQL recommends sizing the table cache so that it can handle the largest number of concurrent connections multiplied by the largest number of tables accessed by a single query. This is at the high end. You can start lower and watch <code>Opened_tables</code> to see if the table cache is constantly swapping out tables. The faster <code>Opened_tables</code> rises the more urgently the table cache needs to be increased.</p>
<p>One caveat to consider when setting the table cache is the per-process file descriptor limit. Each cache entry holds files open on behalf of the MySQL server process. If the number of files held open by MySQL exceeds the limit allowed by the operating system no further files can be opened. MySQL does not fail gracefully in this situation and may, according to the documentation, &#8220;refuse connections, fail to perform queries, and be very unreliable.&#8221; You can find the system-wide limit by issuing the command <code>cat /proc/sys/fs/file-max</code>; the per-process limit is reported by <code>ulimit -n</code> when run as the user that starts MySQL. It&#8217;s very unlikely we&#8217;ll have a problem; the current value indicated by <code>file-max</code> is 50569. But the results of going over these limits appear to be fairly severe for MySQL, so it&#8217;s a good idea to check.</p>
<p>The table cache is currently set to 160 and is full, but the value of <code>Opened_tables</code> is rising slowly. Still, our maximum concurrent connections has already hit ten, so I believe we could easily see the table cache get overwhelmed once the database is exposed to a larger traffic base. If we assume maximum concurrent connections of 25 and a crazy join of 10 tables then we&#8217;re looking at a table cache of around 250. I&#8217;ll start with this number and watch the opened tables stat to see if the cache needs to go higher or can be lowered.</p>
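<p>As a rough sketch, the sizing rule and the rate at which tables are being opened can be expressed like this. Both function names are hypothetical, and the inputs would be read by hand from the server status output.</p>
<pre><code>&lt;?php
// Hypothetical helper: upper-end table_cache estimate using the rule of thumb
// of maximum concurrent connections times the most tables joined in one query.
function estimateTableCache($intMaxConnections, $intMaxTablesPerQuery)
{
    return $intMaxConnections * $intMaxTablesPerQuery;
}

// Hypothetical helper: tables opened per hour of uptime. A fast-rising value
// suggests the cache is too small and entries are being swapped out.
function openedTablesPerHour($intOpenedTables, $intUptimeSeconds)
{
    return $intOpenedTables / ($intUptimeSeconds / 3600);
}

// 25 concurrent connections and a 10-table join, as assumed above.
echo estimateTableCache(25, 10), "\n"; // 250
</code></pre>
<p>Comparing <code>openedTablesPerHour()</code> before and after the change should show whether the larger cache is actually reducing table opens.</p>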
<p>References:</p>
<ul>
<li><a href="http://dev.mysql.com/doc/refman/5.0/en/table-cache.html">MySQL Reference Manual: 7.4.8. How MySQL Opens and Closes Tables</a></li>
<li><a href="http://www.mysqlperformanceblog.com/2006/09/29/what-to-tune-in-mysql-server-after-installation/">MySQL Performance Blog: What to tune in MySQL Server after installation</a></li>
<li><a href="http://www.databasejournal.com/features/mysql/article.php/3367871">Database Journal: Optimizing the mysqld variables</a></li>
</ul>
<p><strong>Temporary Tables</strong></p>
<p>Temporary tables may be used by MySQL when performing queries. By default these tables are created in memory. However, there are two situations in which a temporary table will be written to disk, resulting in a bit of a performance hit: when a temporary table grows beyond the maximum allowed; and when a condition exists that prevents the use of an in-memory temporary table. The former situation is determined by the <code>tmp_table_size</code>/<code>max_heap_table_size</code> parameters. The latter is determined by table and query structure. <code>tmp_table_size</code> is specific to temporary tables while <code>max_heap_table_size</code> applies to all memory tables, so make sure that <code>max_heap_table_size</code> is at least as large as <code>tmp_table_size</code>.</p>
<p>The allowable size of temporary tables should be large enough to avoid writing to disk where possible, but small enough that memory is not eaten up. There is no provision to limit the number of temporary tables stored in memory. If there are many simultaneous connections and each connection is working with a large temporary table memory could be filled rather quickly.</p>
<p>You can determine whether or not your temporary tables are being created in memory by looking at the number of temporary tables that had to be written to disk (<code>Created_tmp_disk_tables</code>). Ours is hovering around 50% of the total number of temporary tables (<code>Created_tmp_tables</code>), but this isn&#8217;t enough information to make a decision about the optimal setting for <code>tmp_table_size</code>. What we don&#8217;t know is the reason a table is written to disk. That&#8217;s something that can only be determined using the <code>EXPLAIN</code> statement.</p>
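<p>The ratio itself is simple to compute. The sketch below uses a hypothetical function name and made-up counter values chosen only to reproduce the roughly 50% figure mentioned above.</p>
<pre><code>&lt;?php
// Hypothetical helper: share of internal temporary tables that were written to disk.
// Inputs are the Created_tmp_disk_tables and Created_tmp_tables status values.
function tmpDiskTableRatio($intCreatedTmpDiskTables, $intCreatedTmpTables)
{
    if ($intCreatedTmpTables == 0) {
        return 0.0;
    }
    return $intCreatedTmpDiskTables / $intCreatedTmpTables;
}

// Made-up counts for illustration.
printf("%.0f%%\n", 100 * tmpDiskTableRatio(5000, 10000)); // 50%
</code></pre>
<p>Tracking this number after each change to <code>tmp_table_size</code> gives a quick read on whether the larger limit is helping, even though it cannot say why a given table went to disk.</p>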
<p>Since the number of disk-based temporary tables is relatively high I&#8217;m going to increase the maximum size allowed for memory-based tables and see if that improves things.</p>
<p>References:</p>
<ul>
<li><a href="http://dev.mysql.com/doc/refman/5.0/en/server-system-variables.html#option_mysqld_tmp_table_size">MySQL Reference Manual: 5.1.3. System Variables</a></li>
<li><a href="http://dev.mysql.com/doc/refman/5.0/en/server-status-variables.html#option_mysqld_Created_tmp_disk_tables">MySQL Reference Manual: 5.1.5. Status Variables</a></li>
<li><a href="http://dev.mysql.com/doc/refman/5.0/en/internal-temporary-tables.html">MySQL Reference Manual: 7.5.9. How MySQL Uses Internal Temporary Tables</a></li>
<li><a href="http://www.databasejournal.com/features/mysql/article.php/3367871">Database Journal: Optimizing the mysqld variables</a></li>
</ul>
<p><strong>Query Cache</strong></p>
<p>For SELECT queries, the speed of the response can be affected by factors such as query structure and which columns are indexed. MySQL is able to provide fast results for often-run queries by storing the results in the query cache. The first time a query is run the query and its result set are stored in the query cache. Subsequent runs of the exact same query will pull the results from the cache (so long as the cached entry is available). There are some qualifications for the query cache to be used successfully. First, a query must match in a binary manner in order for MySQL to use a cached query. This means a query must match character for character, in a case-sensitive way, with the previous run. Second, the query results must not exceed a defined size (<code>query_cache_limit</code>, 1MB by default) or they will not be cached.</p>
<p>To enable the query cache set <code>query_cache_type = 1</code> and give <code>query_cache_size</code> a value. The best size of the query cache is a guessing game. I&#8217;m starting with a value of 32M. Though we&#8217;re likely to see little performance improvement on the web-based applications due to constant table updates (which invalidate any cached results for that table) we should see decent performance improvement for more static tables such as those used to serve dbBOL. As a result, judging the efficiency of the cache based on the usage data provided by MySQL will be somewhat difficult. Still, I would expect to see a relatively high value for <code>Qcache_hits</code> when compared to <code>Qcache_inserts</code> and a low value for <code>Qcache_lowmem_prunes</code>. It&#8217;s also important to keep an eye on <code>Qcache_free_blocks</code> in order to ensure the cache memory is not fragmented. This number should remain low and can be improved temporarily by issuing a <code>FLUSH QUERY CACHE</code> statement, which will &#8220;defragment&#8221; the query cache.</p>
<p>One thing to keep in mind is that any cached query will be dropped if a table it references is updated. SELECT queries against tables that are updated often should include the <code>SQL_NO_CACHE</code> attribute to prevent caching. This avoids the extra overhead of storing and dropping results that can rarely be reused.</p>
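<p>One rough way to compare <code>Qcache_hits</code> against <code>Qcache_inserts</code> is to look at how many hits the average cached query receives. The helper name and the counter values below are hypothetical, and this is only one of several possible indicators.</p>
<pre><code>&lt;?php
// Hypothetical helper: average number of cache hits per cached query.
// A value well above 1 means cached results are being reused; a value
// below 1 means the average entry is served less than once.
function queryCacheHitsPerInsert($intQcacheHits, $intQcacheInserts)
{
    if ($intQcacheInserts == 0) {
        return 0.0;
    }
    return $intQcacheHits / $intQcacheInserts;
}

// Made-up counter values for illustration.
echo queryCacheHitsPerInsert(9000, 3000), "\n"; // 3
</code></pre>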
<p>References:</p>
<ul>
<li><a href="http://dev.mysql.com/doc/refman/5.0/en/query-cache.html">MySQL Reference Manual: 7.5.4. The MySQL Query Cache</a></li>
<li><a href="http://dev.mysql.com/tech-resources/articles/mysql-query-cache.html">MySQL: A Practical Look at the MySQL Query Cache</a></li>
<li><a href="http://www.databasejournal.com/features/mysql/article.php/10897_3110171_1">Database Journal: MySQL&#8217;s Query Cache</a></li>
<li><a href="http://jayant7k.blogspot.com/2007/07/mysql-query-cache.html">Whatever&#8230;.: mysql query cache</a></li>
</ul>
<p><strong>Key Buffer</strong></p>
<p>MySQL says this is one of the most important performance tuning variables and recommends allocating as much memory as possible. However, since the key buffer is stored in RAM, a setting should be used that (in consideration with other settings) won&#8217;t cause the server to page memory. MySQL also recommends that the ratio <code>Key_reads</code>/<code>Key_read_requests</code> be less than 0.01.</p>
<p>The MySQL Performance Blog recommends up to 40% of your system memory, taking into account the size of the MyISAM table indexes and available memory.</p>
<p>Currently the setting for this parameter is 16M and our <code>Key_reads</code>/<code>Key_read_requests</code> ratio is very low (0.0002). So right now we seem to be doing great in regard to this value, but since MySQL recommends a high value I&#8217;m going to increase this to 24M or about 4.7% of system memory. Though this value is below the roughly 58M that the table indexes add up to, the maximum portion of the key buffer that has been used at any one time so far is only about 70% (based on a block size of 1K and <code>Key_blocks_used</code> showing 11K, meaning roughly 11M of 16M in the current setup).</p>
<p>Due to how MySQL uses the indexes, we should be O.K. with a key buffer size smaller than the sum of the indexes. As I understand it, MySQL divides indexes into blocks which allows MySQL to only access the part of an index it needs. Only used portions of a particular index need to be stored in the key buffer. Since some of the tables in the database are rarely (if ever) used their indexes won&#8217;t add to the overall key buffer usage.</p>
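<p>Both key buffer checks can be sketched as follows. The function names are hypothetical, the miss-ratio counters are made up to match the 0.0002 figure, and the usage example assumes 1K key cache blocks (the <code>key_cache_block_size</code> default).</p>
<pre><code>&lt;?php
// Hypothetical helpers; inputs come from the server status and variable listings.

// Share of index read requests that had to go to disk (target: under 0.01).
function keyCacheMissRatio($intKeyReads, $intKeyReadRequests)
{
    if ($intKeyReadRequests == 0) {
        return 0.0;
    }
    return $intKeyReads / $intKeyReadRequests;
}

// Fraction of the key buffer that has been in use, from block count and block size.
function keyBufferUsage($intKeyBlocksUsed, $intBlockSizeBytes, $intKeyBufferBytes)
{
    return ($intKeyBlocksUsed * $intBlockSizeBytes) / $intKeyBufferBytes;
}

// Made-up counters giving the 0.0002 ratio quoted above.
printf("%.4f\n", keyCacheMissRatio(200, 1000000)); // 0.0002

// About 11K blocks of 1K each in a 16M buffer.
printf("%.0f%%\n", 100 * keyBufferUsage(11 * 1024, 1024, 16 * 1024 * 1024)); // 69%
</code></pre>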
<p><em>Special note:</em> some tables have rather large indexes. These tables should be reviewed to determine if any optimizations to the table structure can be made.</p>
<p>References:</p>
<ul>
<li><a href="http://dev.mysql.com/doc/refman/5.0/en/server-system-variables.html#option_mysqld_key_buffer_size">MySQL Reference Manual: 5.1.3. System Variables</a></li>
<li><a href="http://dev.mysql.com/doc/refman/5.0/en/myisam-key-cache.html">MySQL Reference Manual: 7.4.6. The MyISAM Key Cache</a></li>
<li><a href="http://dev.mysql.com/doc/refman/5.0/en/server-parameters.html">MySQL Reference Manual: 7.5.2. Tuning Server Parameters</a></li>
<li><a href="http://www.mysqlperformanceblog.com/2006/09/29/what-to-tune-in-mysql-server-after-installation/">MySQL Performance Blog: What to tune in MySQL Server after installation</a></li>
<li><a href="http://www.databasejournal.com/features/mysql/article.php/3367871">Database Journal: Optimizing the mysqld variables</a></li>
</ul>
<p><strong>Sort Buffer</strong></p>
<p>The sort buffer is used whenever ORDER BY or GROUP BY is used in a query. If the amount of memory required to perform an operation of this type exceeds the value of <code>sort_buffer_size</code> MySQL has to sort the current working set, write the data out to disk, and start another working set. So not only do you have the problem of sorting parts of the result set separately, but then those groupings have to be combined on disk and sorted. Conceivably, then, a larger value would be beneficial. You can see how often a sort buffer has to be written out to disk and combined with additional buffered data by looking at the <code>Sort_merge_passes</code> status variable.</p>
<p>The sort buffer is a per-connection setting, meaning that each connection can allocate the amount specified by this value. As a result, care should be taken when setting this value in order to avoid eating up too much memory. Also, the MySQL Performance Blog has benchmarks showing a larger sort buffer actually hurting performance. With all this in mind it may be wise to use a relatively low value in the server settings and specify a larger value when necessary for a specific connection. Also, it may be advisable to perform some testing in various scenarios to see if there is an optimal minimum to cover most situations.</p>
<p>Still, MySQL recommends using a larger value to help improve sorting. Our current value is 512K. Our merge passes appear to be rising steadily (though not severely). So for now I&#8217;m doubling the value to 1M.</p>
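<p>Because the sort buffer is allocated per connection, it helps to estimate the worst case before raising it. The sketch below uses a hypothetical function name and the 25-connection ceiling assumed earlier in this post.</p>
<pre><code>&lt;?php
// Hypothetical helper: worst-case memory if every connection
// allocates a full sort buffer at the same moment.
function worstCaseSortMemory($intMaxConnections, $intSortBufferBytes)
{
    return $intMaxConnections * $intSortBufferBytes;
}

// 25 concurrent connections with the doubled buffer (2 x 512K).
$intBytes = worstCaseSortMemory(25, 2 * 512 * 1024);
echo $intBytes / (1024 * 1024), "M\n"; // 25M
</code></pre>
<p>The same multiplication applies to any other per-connection buffer, which is why small increases can add up quickly.</p>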
<p>References:</p>
<ul>
<li><a href="http://dev.mysql.com/doc/refman/5.0/en/server-system-variables.html#option_mysqld_sort_buffer_size">MySQL Reference Manual: 5.1.3. System Variables</a></li>
<li><a href="http://dev.mysql.com/doc/refman/5.0/en/server-status-variables.html#option_mysqld_Sort_merge_passes">MySQL Reference Manual: 5.1.5. Status Variables</a></li>
<li><a href="http://dev.mysql.com/doc/refman/5.0/en/order-by-optimization.html">MySQL Reference Manual: 7.2.11. ORDER BY Optimization</a></li>
<li><a href="http://www.mysqlperformanceblog.com/2007/08/18/how-fast-can-you-sort-data-with-mysql/">MySQL Performance Blog: How fast can you sort data with MySQL?</a></li>
<li><a href="http://www.mysqlperformanceblog.com/2006/06/06/are-larger-buffers-always-better/">MySQL Performance Blog: Are larger buffers always better?</a></li>
<li><a href="http://www.databasejournal.com/features/mysql/article.php/3367871">Database Journal: Optimizing the mysqld variables</a></li>
</ul>
<p><strong>Read Buffers</strong></p>
<p>MySQL uses read buffers when accessing table data. There are two settings to pay attention to here: <code>read_buffer_size</code> and <code>read_rnd_buffer_size</code>. MySQL allocates <code>read_buffer_size</code> when a sequential scan is performed. A sequential scan is when every row in a table is read and typically would be done when an index can&#8217;t be used to satisfy a query. MySQL allocates <code>read_rnd_buffer_size</code> when table rows are read in sorted order following a key sort.</p>
<p>As with the sort buffer, the read buffers are a per-connection setting. So setting this value with consideration of our available memory is important. Plus, the MySQL Performance Blog has found some performance issues with larger values for these settings, similar to the issues with the sort buffer. Once again these are variables that may be best increased as needed. And again, testing should be done to find an optimal minimum.</p>
<p>Since these settings can show something of a benefit we want to increase the values if possible. Until further testing and optimization can be done on our own system, however, I&#8217;ll keep them below the theorized threshold of decreasing returns, 256K.</p>
<p>References:</p>
<ul>
<li><a href="http://dev.mysql.com/doc/refman/5.0/en/server-system-variables.html#option_mysqld_read_buffer_size">MySQL Reference Manual: 5.1.3. System Variables</a></li>
<li><a href="http://dev.mysql.com/doc/refman/5.0/en/server-parameters.html">MySQL Reference Manual: 7.5.2. Tuning Server Parameters</a></li>
<li><a href="http://www.mysqlperformanceblog.com/2007/09/17/mysql-what-read_buffer_size-value-is-optimal/">MySQL Performance Blog: MySQL: what read_buffer_size value is optimal?</a></li>
<li><a href="http://www.mysqlperformanceblog.com/2006/06/06/are-larger-buffers-always-better/">MySQL Performance Blog: Are larger buffers always better?</a></li>
<li><a href="http://www.mysqlperformanceblog.com/2007/07/24/what-exactly-is-read_rnd_buffer_size/">MySQL Performance Blog: What exactly is read_rnd_buffer_size?</a></li>
<li><a href="http://www.mysqlperformanceblog.com/2007/09/12/read-buffers-mmap-malloc-and-mysql-performance/">MySQL Performance Blog: Read Buffers, mmap, malloc and MySQL Performance</a></li>
<li><a href="http://mysql-ha.com/2007/09/06/read-buffer-performance-hit/">MySQL-HA: Read Buffer performance hit</a></li>
<li><a href="http://www.databasejournal.com/features/mysql/article.php/3367871">Database Journal: Optimizing the mysqld variables</a></li>
</ul>
<p><strong>InnoDB</strong></p>
<p>The MySQL Performance Blog recommends a number of settings for enhancement of InnoDB performance. After much consideration, however, I believe it may be best if we forgo usage of the InnoDB storage engine where appropriate. I have a few reasons for this opinion:</p>
<ol>
<li>InnoDB tables are more resource intensive due to their use of transactions.</li>
<li>Fulltext indexing, a feature of MySQL that we have come to depend on for a number of applications, is not currently available for InnoDB tables.</li>
<li>Our current applications do not generally require the added safety of a transactional database.</li>
</ol>
<p>If we do decide to implement InnoDB for a future application (such as an online ordering system) we should revisit the optimization settings.</p>
<p>To ensure that tables are MyISAM by default I have set <code>default-storage-engine = MyISAM</code>. This can be overridden by specifying the table engine when creating a new table or by altering a table.</p>
<p><strong>Fin</strong></p>
<p>Once the settings have been changed in <code>/etc/my.cnf</code>, restart the server and check the server variables (<code>SHOW GLOBAL VARIABLES</code> or use phpMyAdmin) to ensure all settings have been implemented correctly.</p>
<p>Also, see my <a href="http://techlog.p2061.org/2007/07/13/optimizing-mysql/">earlier post on optimization</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://techlog.p2061.org/2008/05/07/optimizing-mysql-server-runtime-parameters/feed/</wfw:commentRss>
		</item>
		<item>
		<title>BOL Tricks</title>
		<link>http://techlog.p2061.org/2008/04/02/bol-tricks/</link>
		<comments>http://techlog.p2061.org/2008/04/02/bol-tricks/#comments</comments>
		<pubDate>Wed, 02 Apr 2008 21:56:43 +0000</pubDate>
		<dc:creator>BrianS</dc:creator>
		
		<category><![CDATA[Web Development]]></category>

		<category><![CDATA[benchmarks]]></category>

		<category><![CDATA[BOL]]></category>

		<category><![CDATA[easter eggs]]></category>

		<category><![CDATA[javascript]]></category>

		<guid isPermaLink="false">http://techlog.p2061.org/2008/04/02/bol-tricks/</guid>
		<description><![CDATA[In order to enable easier checking of Benchmarks Online against the print (aka 1993) version, I implemented a switch that allows you to specify the 1993 version as the default tab displayed. To enable the switch you have to visit the utility with the querystring bmtabver=1993 appended. So the following URL would enable the [...]]]></description>
			<content:encoded><![CDATA[<p>In order to enable easier checking of <em>Benchmarks</em> Online against the print (aka 1993) version, I implemented a switch that allows you to specify the 1993 version as the default tab displayed. To enable the switch you have to visit the utility with the querystring bmtabver=1993 appended. So the following URL would enable the switch:</p>
<p><a href="http://flora.p2061.org/benchmarks/index.php?bmtabver=1993">http://floradev.p2061.org/benchmarks/index.php?bmtabver=1993</a></p>
<p>To disable the switch change the value from 1993 to 2007 (or delete the &#8220;bmtabver&#8221; cookie from your browser).</p>
<p>This switch is not publicly noted.</p>
]]></content:encoded>
			<wfw:commentRss>http://techlog.p2061.org/2008/04/02/bol-tricks/feed/</wfw:commentRss>
		</item>
	</channel>
</rss>
