07.25.08

Items Utility: Data Conversion Ready

Posted in Adobe Illustrator, MySQL, Web Development at 9:54 am by BrianS

After creating a new structure for the current year’s piloting data there has been a bit of disconnect in development between the two data sets. There’s just not enough time to ensure that everything works across both data sets. There were basically two choices to moving forward: 1) construct a view of the new data structure that mimics the old data structure; 2) port the old data structure to the new one and continue redevelopment of the scripts. I chose the latter, mainly because there are some improvements I’d like to make to the interface in the process.

I created a series of SQL statements to run in MySQL that will convert the data from the packet_item_records and miscon_packet_refs tables to the packet_students, packet_data, and miscon_packetdata_refs tables. After the conversion I updated any pages that referenced the old data structure. So far in testing the data seems to have converted perfectly.

Minus one issue. Multiple selections for the answer choice questions (A, B, C, D) from 2006 were recorded as a generic Multiple rather than Y+NS, N+NS, etc. This value is not represented in the updated data format or on the data entry forms or summary tables. Rather then spend too much time addressing this issue I’m going to leave these values empty for now. I don’t expect this to be a problem since the researchers are focused on data for the current pilot and field tests. Also, I’m keeping the old version of the piloting data and scripts that interact with it online in case it’s needed.

05.14.08

Expanding Search Terms for More Inclusive Results

Posted in MySQL, PHP, Web Development at 11:54 am by BrianS

While working on the Benchmarks search I wanted to try and provide a feature I find useful on Google and other search engines: word form expansion (lemmatisation). A little research showed to me that this would require more work than we really should be spending on search functionality. Especially considering that the built-in MySQL full text search capability is sufficient for our needs. So I decided to focus on a feature that would still provide value but require little time: word stem expansion.

Read the rest of this entry »

05.07.08

Optimizing MySQL Server Runtime Parameters

Posted in MySQL, System Administration at 1:55 pm by BrianS

Since we’ll be exposing MySQL to significantly more traffic (due mainly to the transition to a database-driven version of Benchmarks Online [dbBOL]) I decided to spend some time optimizing the server’s settings. There are a number of settings that can be tweaked to improve performance. I based my decisions on the information available from the references cited and the performance statistics reported by MySQL (SQL SHOW VARIABLES or use PHPMyAdmin). MySQL has been running for 131 days as of the writing of this post (see cached copy of the runtime stats), so I expect the data will be a fairly good indication of the performance of MySQL under its current usage. Unfortunately, I expect the usage pattern to change significantly once dbBOL is released. As a result some of the settings used will be based on expected usage patterns. At specific intervals after dbBOL is released we should examine the performance of MySQL based on the runtime stats to determine if additional tweaking needs to be performed. I recommend the following schedule: 1 week, 1 month, 3 months, then every 6 months.

Read the rest of this entry »

02.11.08

Misconceptions misnumbering

Posted in MySQL, PHP at 1:57 pm by bwalker

Jill is working on a new topic (

Processes that shape the earth/Plate Tectonics Version II (PT)) which shares a number of misconceptions with another topic (Earth Science: Processes that shape the Earth / Plate Tectonics (EP)).

Problem: She changed the topic of those misconceptions believing it would be available to both topics. Once she realized this was not the case, she changed the topic back, but the numbering was off.

Solution: She provided me a list of which misconceptions were out of sync. Because adding a misconception involves updating multiple tables I actually had to add the misconceptions through the Item interface and then change the numbering on the backend.

Conclusion: The conclusion is that the conclusion is pretty obvious. This functionality needs to be revised. Changing the topic of a misconception has considerable consequences throughout the Items utility, so I think it may be necessary to add an additional step to this process. For example, have the current topic displayed along with a check box and a disabled/hidden topic menu. Then use the checkbox to enable/disable the menu and possibly some kind of textual warning (”Changing this is likely to break something.”).

Another note is that the ‘code’ column denoting the numbering sequence is not a unique column so we could have the numbering be editable on the interface. Although this could have some duplication complications.

I also found it strange that when submitting a new misconception, there is no notification and you are not redirected back to the miscon list. I found out the hard way that once you submit a misconception you are essentially in edit mode (not Add New Miscon anymore) and any changes made are being made to the record you just added. It wasn’t obvious to me the difference between Add and Edit mode.

Updates to field test registrant_list_demo

Posted in MySQL, PHP at 1:33 pm by bwalker

As I was meddling with registrant_list_demo.php I found that the demographics were not showing up for any of my test cases.  I was under the mistaken impression this data was obtained from an external server, but it is actually taken from a data dump from the nces stored on our server in a database called ‘demographics’ (go figure).  The demographics data appears to be sorted by zip code.  I am surmising that this data dump is requested from nces after the registration process is complete and an updated list of zip codes has been generated. 

I’m sure Mr. Sweeney can provide details of this request procedure upon his return.  The demographics script is working fine, but the demographics data is most likely out if date.  Cari indicated the demographics data will not be needed for a few weeks.

Updates to field test registrant_list

Posted in MySQL, PHP at 1:23 pm by bwalker

When I first opened up the register dir I found register.php, review.php and review.v1.php.  Based on script and notes in the script the registration and review registration processes were consolidatedby Mr. Sweeney in register.php, but perhaps at one point they were separate.  review.php and review.v1.php turned out to be the registrant listing scripts. 

This names were confusing so I renamed them to registrant_list.php and registrant_list_demo.php (demographics).

I added a filter (by year) to limit output results.  However it is not technically filtering by year.  There was already a date limit written into the existing query (2007-04-27, I think) which I used as the start range marker for 2007.  The start range for 2008 is 2008-01-01.  The messy part is that everything prior to 2007-04-27 is marked as 2006 in the filter when in fact the records only go back to 2007-04-17.  I expect to be working on this script again in the near future (see “Updates to field test registrant_list_demo”) at which point I will probably change the filter to just ‘Prior’.

11.29.07

Character confusion: Odd characters on a web page

Posted in MySQL, Web Development at 5:17 pm by BrianS

I was recently doing some clean-up on the content of the database-driven version of Benchmarks. One of the issues I was attempting to address was the presence of odd characters in the content. This is an attempt to explain how to discover what these characters are and how to find them.

Read the rest of this entry »

07.13.07

Optimizing MySQL

Posted in MySQL, System Administration at 3:19 pm by BrianS

Slow on the draw
Some of our users were complaining of slow performance on one of our web-based utilities. There are, unfortunately, a number of factors that can affect system performance. I haven’t spent much time learning the best practices for profiling a system to discover the source of bottlenecks, but my hunch was that the problem was in MySQL.

Read the rest of this entry »