Converting static pages to dynamic pages using PHP and MySQL, Part 2

The challenge:
Take Adam Walker’s Carrajina dictionary and stick it in a database so that Adam can add an entry whenever he wants without re-writing an html page and possibly messing up his html. One stricture is that Adam does not want to learn too much new stuff – no SQL, no PHP, just HTML. And if he can enter SAMPA and have it turn into IPA, that’s a plus!

Step 1 (data prep) is here

Step 2 involves importing data into a database. I  will skip blogging Step 2 for now. Putting data into a database is more complicated than taking it back out again. But I will blog it eventually. Promise.

Step 3: Getting the data back out of the database to generate the dictionary listings.

So, the first thing we need to do is connect to the database. I mentioned in a previous entry that PHP’s mysqli extension is the preferred extension to use with a MySQL database. Here’s how that works:

$host = "my_host"; //"localhost" or "http://mysql.host.com"...
$user = "my_username"; //an authorized user of MySQL
$password = "my_password"; //my_username's password
$database = "my_database"; //the database we want to use.

$mysqli = new mysqli($host, $user, $password, $database);

if ($mysqli->connect_error) {
    die('Connect Error ('. $mysqli->connect_errno .') '
            . $mysqli->connect_error);
}

After connecting to the database, the next step is writing a select query to find the appropriate data. Querying the database with mysqli uses the following steps, initialize the statement, prepare the (parameterized) statement, bind the variables to the parameterized statement, and execute the query. (More on this in a future post.) One of the nice things about using PHP (or any programming language, really) to make dynamic pages is that you do not have to write a complicated SQL query to get everything from the database at once. Instead, you can write a sequence of queries and use PHP to do things with the results.

Adam currently has his dictionary set up so that each letter is on its own page. To duplicate this, I’ve used the following queries:

public function getList($mysqli, $letter) {
   $like = $letter . '%';  //% is the SQL wildcard character

   $query = "SELECT english_entryID FROM english_entry
   WHERE head_entryID IS NULL  /* so we don't get subentries */
   AND english_term LIKE ? ORDER BY english_term ASC";

   $stmt = $mysqli->stmt_init();
   if ($stmt = $mysqli->prepare($query)) {
      $stmt->bind_param("s", $like);
      $stmt->execute();
      $stmt->bind_result($entryID);
      while ($stmt->fetch()) {
         $listIDs[] = $entryID;
      }
      $stmt->close();        
   }
   return $listIDs;
}

What this does, when called, is execute an SQL statement that looks in the table english_entry and finds all the english_entryIDs in the table that match the following conditions: head_entryID does not exist (and therefore this is not a subentry) and english_term starts with the passed $letter. The ‘?’ in the query is replaced by the variable $like, which is $letter followed by the ‘%’ sign to match anything. So, if $letter = ‘a’, then the SQL statement becomes:

SELECT english_entryID FROM english_entry WHERE head_entryID IS NULL
AND english_term LIKE 'a%' ORDER BY english_term ASC;

This SQL statement returns a list of english_entryIDs

english_entryID
1
2
3

which is then put into a PHP array array( [0] => 1, [1] =>2, [2] =>3, ...); for ease of use.

Now that we have the array of english_entryIDs that start with ‘a’ and are not subentries, we can use more code to get each actual entry. For this project, I have put all the code that gets an English language entry into a class called EnglishListing. The following code is then used on the dictionary pages to generate a listing.

//make sure $letter is lower case
$letter = strtolower($letter);

$eng = new EnglishListing();

$list = $eng->getList($mysqli, $letter);
foreach ($list as $entryID) {
    echo $eng->getEntry($mysqli, $entryID);
}

So after calling getList(), for each english_entryID returned by the getList function, we call getEntry:

public function getEntry($mysqli, $entryID) {
    //an entry has a term, a pronunciation,
    //one or more defs (num, pos, deftext),
    //and 0 or more subentries (just like an entry).
    $termrow = $this->getTerm($mysqli, $entryID);
    $defrow = $this->getDefs($mysqli, $entryID);
    $subrow = $this->getSubentry($mysqli, $entryID);

    return "<dt>$termrow</dt><dd>$defrow $subrow</dd>\n";
}//end function

OK. That code breaks up the process of getting an entry into 3 steps: getTerm(), getDefs(), and getSubentry().

getTerm() executes an SQL statement that returns english_term and english_pronunciation. It then wraps each of these in span tags and returns $termrow:

private function getTerm($mysqli, $entryID) {
   $termrow = "";
   $query = "SELECT english_term, english_pronunciation
              FROM english_entry WHERE english_entryID =?";

   $stmt = $mysqli->stmt_init();
   if ($stmt = $mysqli->prepare($query)) {
      $stmt->bind_param("i", $entryID);
      $stmt->execute();
      $stmt->bind_result($term, $pron);
      while ($stmt->fetch()) {
         $termrow .= "<span class='term'>$term</span>
                      <span class='pron'>$pron</span>";
      }
      $stmt->close();   
   }
   return $termrow;
}//end function

getDefs() executes an SQL statement that returns definition_number, part_of_speech, definition for each definition matched by the given english_entryID. Where only 1 definition exists, $defrow contains only the part_of_speech and definition values. Otherwise, it contains all the values. ‘.=’ in the PHP code concatenates the value to the existing variable. An entry with two definitions, then would have $defrow equal to “$definition_number. $part_of_speech $def. $definition_number. $part_of_speech $def. “.

private function getDefs($mysqli, $entryID) {
   $defrow = "";

   $query = "SELECT definition_number, part_of_speech, definition
      FROM english_definition
      WHERE english_entryID = ?
      ORDER BY definition_number ASC";

   if ($stmt = $mysqli->prepare($query)) {
      $stmt->bind_param("i", $entryID);
      $stmt->execute();
      $stmt->bind_result($defno, $pos, $def);
      $stmt->store_result(); //so I can use $stmt->num_rows
      while ($stmt->fetch()) {
         $defno = "<span class='defno'>$defno</span>";
         $pos = "<span class='pos'>$pos</span> ";
         $def = "<span class='deftext'>$def</span> ";
         if (($stmt->num_rows) == 1) {
            $defrow .= "$pos $def ";
         } else {
            $defrow .= "$defno. $pos $def ";
         }
      }//end while
      $stmt->close();        
   }//end if 

   return $defrow;
}//end function

getSubentry() starts by calling another function checkSubentry() which checks to see if the entry has any subentries. If there is a subentry, then checkSubentry() returns the english_entryIDs for the subentries. Otherwise it returns an empty array. getSubentry() then checks the length (number of values) of that array (must be greater than or equal to 1), and for each value in the array it calls getTerm() and getDefs().

private function getSubentry($mysqli, $entryID) {
   $subrow = "";

   $headentryIDs = $this->checkSubentry($mysqli, $entryID);

   if (count($headentryIDs) >= 1) {
      foreach ($headentryIDs as $entryID) {
         $subrow .= "<br />&#x2022; "
            . $this->getTerm($mysqli, $entryID);
         $subrow .= $this->getDefs($mysqli, $entryID);
      }
   } 

   return $subrow;
}//end function

private function checkSubentry($mysqli, $entryID) {
   $headentryIDs=array();

   $query = "SELECT english_entryID FROM english_entry
             WHERE head_entryID =?"; 

   $stmt = $mysqli->stmt_init();
   if ($stmt = $mysqli->prepare($query)) {
      $stmt->bind_param("i", $entryID);
      $stmt->execute();
      $stmt->bind_result($eentryID);
      while ($stmt->fetch()) {
         $headentryIDs[] = $eentryID;
       }
      $stmt->close();
   }
   return $headentryIDs;
}

That is the code for the EnglishListing class. The Carrajina side looks similar. There were some challenges in that ‘ch’ and ‘dj’ could not be listed under ‘c’ and ‘d’. I’ll talk about that later.

<?php

class CarrajinaListing {

public function getList($mysqli, $letter) {
   //extra letters ch, dj, ñ, rr
   $like = $letter . '%';  //% is the SQL wildcard character
   $like2 = ucfirst($letter) . '%'; 

   //BINARY means that MySQL doesn't ignore accented characters,
   //n-tilde, and the like. it also doesn't ignore capitalization
   //without the BINARY keyword, accents and capitalization are
   //ignored.
   $query = "SELECT carrajina_entryID FROM carrajina_entry ";

   switch ($letter) {
      case 'c':
         $query .= "WHERE (carrajina_term LIKE BINARY ?
                    OR carrajina_term LIKE BINARY ?) AND
                    (carrajina_term NOT LIKE 'ch%') ";
      break;
      case 'd':
         $query .= "WHERE (carrajina_term LIKE BINARY ?
                    OR carrajina_term LIKE BINARY ?) AND
                    (carrajina_term NOT LIKE 'dj%' ) ";
      break;
      default:
         $query .= "WHERE carrajina_term LIKE BINARY ?
                    OR carrajina_term LIKE BINARY ? ";
      break;
   }

   $query .= " ORDER BY carrajina_term ASC";
   $stmt = $mysqli->stmt_init();
   if ($stmt = $mysqli->prepare($query)) {
      $stmt->bind_param("s", $like);
      $stmt->execute();
      $stmt->bind_result($entryID);
      while ($stmt->fetch()) {
         $listIDs[] = $entryID;
      }
      $stmt->close();        
   }
   return $listIDs;
}//end function

private function getEntry($mysqli, $entryID) {
   //an entry has a term, a pronunciation,
   //one or more defs (num, pos, deftext),
   //one or more idioms (idiom, idiomdef),
   //an etymology, and sometimes a note
   $termrow = $this->getTerm($mysqli, $entryID);
   $defrow = $this->getDefs($mysqli, $entryID);
   $idioms = $this->getIdioms($mysqli, $entryID);
   $etymology = $this->getEtymology($mysqli, $entryID);
   $note = $this->getNote($mysqli, $entryID);

   return "<dt>$termrow</dt>
           <dd>$defrow $etymology $idioms $note</dd>\n";
}//end function

private function getTerm($mysqli, $entryID) {
   $query = "SELECT carrajina_term, carrajina_pronunciation
             FROM carrajina_entry WHERE carrajina_entryID = ?";
   $stmt = $mysqli->stmt_init();
   if ($stmt = $mysqli->prepare($query)) {
      $stmt->bind_param("i", $entryID);
      $stmt->execute();
      $stmt->bind_result($term, $pron);
      while ($stmt->fetch()) {
           $termrow .= "<span class='term'>$term</span>
                        <span class='pron'>$pron</span>";
      }
      $stmt->close();        
   }//end if 

   return $termrow;
}//end function

private function getDefs($mysqli, $entryID) {
   $defrow = "";

   $query = "SELECT definition_number, part_of_speech, definition
             FROM carrajina_definition WHERE carrajina_entryID = ?
             ORDER BY definition_number ASC";
   $stmt = $mysqli->stmt_init();
   if ($stmt = $mysqli->prepare($query)) {
      $stmt->bind_param("i", $entryID);
      $stmt->execute();
      $stmt->bind_result($defno, $pos, $def);
      $stmt->store_result(); //so I can use $stmt->num_rows
      while ($stmt->fetch()) {
         $defno = "<span class='defno'>$defno</span>";
         $pos = "<span class='pos'>$pos</span> ";
         $def = "<span class='deftext'>$def</span> ";
         if (($stmt->num_rows) == 1) {
            $defrow .= "$pos $def ";
         } else {
            $defrow .= "$defno. $pos $def ";
         }
      }//end while
      $stmt->close();        
   }//end if 

   return $defrow;
}//end function

private function getIdioms($mysqli, $entryID) {
   $idioms = "";

   $query = "SELECT carrajina_idiom, carrajina_idiom_definition
             FROM carrajina_idiom WHERE carrajina_entryID = ?
             ORDER BY carrajina_idiom ASC";
   $stmt = $mysqli->stmt_init();
   if ($stmt = $mysqli->prepare($query)) {
      $stmt->bind_param("i", $entryID);
      $stmt->execute();
      $stmt->bind_result($idiom, $idiomdef);
      while ($stmt->fetch()) {
         $idioms .= "<span class='idiom term'>&#x2022; $idiom</span>
                     <span class='idiomdef'>$idiomdef</span> ";
      }//end while
      $stmt->close();        
   }//end if 

   return $idioms;
}//end function

private function getEtymology($mysqli, $entryID) {
   $etymology = "";

   $query = "SELECT etymology FROM carrajina_entry
             WHERE carrajina_entryID =?";

   $stmt = $mysqli->stmt_init();
   if ($stmt = $mysqli->prepare($query)) {
      $stmt->bind_param("i", $entryID);
      $stmt->execute();
      $stmt->bind_result($etym);
      while ($stmt->fetch()) {
         $etymology .= "<span class='etymology'>$etym</span>";
      }//end while
      $stmt->close();        
   }//end if 

   return $etymology;
}//end function

private function getNote($mysqli, $entryID) {
   $note = "";

   $query = "SELECT note FROM carrajina_note
             WHERE carrajina_entryID =?";

   $stmt = $mysqli->stmt_init();
   if ($stmt = $mysqli->prepare($query)) {
      $stmt->bind_param("i", $entryID);
      $stmt->execute();
      $stmt->bind_result($cnote);
      while ($stmt->fetch()) {
         $note .= "<span class='note'>$cnote</span> ";
      }//end while
      $stmt->close();        
   }//end if
   return $note;
}//end function

}//end class
?>

I know there are things in here that I didn’t explain. Consider this your excuse to take me to task and ask questions in the comments.

Edited 9-20-2011 for aesthetic reasons.

Using mysqli and parameterized statements

mysqli is the PHP extension that is recommended for use with current versions (4.1.3 and later) of MySQL. The way to connect to a MySQL database using mysqli is:

$host = "my_host"; //"localhost" or "http://mysql.host.com"
$user = "my_username"; //an authorized user of the MySQL database
$password = "my_password"; //my_username's password
$database = "my_database"; //the database we want to use.

$mysqli = new mysqli($host, $user, $password, $database);

if ($mysqli->connect_error) {
    die('Connect Error (' . $mysqli->connect_errno . ') '
            . $mysqli->connect_error);
}

Keep note of that $mysqli variable. It gets used a lot.

Querying the database can be as simple as:

$mysqli->query("some valid SQL statement");

This will return TRUE if the SQL statement was successful. So, if you are not expecting any data to be returned from the server, this bit of code will work fine:

if ($mysqli->query("some valid SQL statement") === TRUE) {
    //do stuff, or not, like print a success statement
}
$mysqli->close();

If you are using a SELECT statement to get data from the server, then you can do so:

if ($result = $mysqli->query("some valid SQL statement")) {
    //do stuff with the data
    while ($row = $result->fetch_row()) {
        //do stuff with the data in the $row array
    }
    $result->close();
}
$mysqli->close();

In addition to fetch_row(), there is also fetch_array() and fetch_assoc(). The full list of $result functions is here.

But the true wonderfulness of mysqli comes from the parameterized queries.

$stmt = $mysqli->stmt_init();
if ($stmt->prepare("some valid SQL query")) {
    $stmt->bind_params("s...", $vars);
    $stmt->execute();
    $stmt->bind_result($vars);
    while ($stmt->fetch()) {
        //do stuff with the data
    }
    $stmt->close();
}
$mysqli->close();

Yeah, right, you say. It looks complicated, but really, it isn’t. And some parts are optional.

I mentioned earlier that parameterized queries were one line of defense against sql injection attacks because they force the database to interpret anything within the bound variable as data and not as SQL instructions. So, here is how that works.

First, we initialize the statement with $stmt = $mysqli->stmt_init();. This step is optional, but it is good practice. By initializing, the next function called on $stmt has to be $stmt->prepare().

$stmt->prepare() takes a query (either directly or as a variable) and prepares it for the server. If the query requires data input, such as in a WHERE clause or in an INSERT statement, then one uses ‘?’ markers as placeholders for the data. Like so:

SELECT FirstName, LastName FROM Friends WHERE FavoriteColor = ?;

In PHP:

$stmt = $mysqli->stmt_init();
if ($stmt->prepare("SELECT FirstName, LastName
    FROM Friends WHERE FavoriteColor = ?")) {
    ...

So far, so good. Make sure not to include the normal final semi-colon in the SQL statement.

The next step is to bind the parameters, or tell the database what value the ? has. This is done using $stmt->bind_param(); This function takes a minimum of two arguments. The first is a string with information on the type of data. The second is the variable that holds the data. The information string can consist of the following characters only: b, d, i, and s. These stand for blob, double, integer, and string. Since FavoriteColor is a word, we use ‘s’ to identify the input as a string. The number and sequence of characters in the string must correspond to the number and sequence of ?s in the query. The number and sequence of the data- holding variables must correspond to the same number and sequence.

$color = "purple";
$stmt = $mysqli->stmt_init();
if ($stmt->prepare("SELECT FirstName, LastName
    FROM Friends WHERE FavoriteColor = ?")) {
       $stmt->bind_param("s", $color);
       ...

OK so far? Want another example? How about:

INSERT INTO Friends (FirstName, LastName, FavoriteColor)
     VALUES (?, ?, ?);
//the data
$firstname = "David";
$lastname = "Peterson";
$color = "orange";

$stmt = $mysqli->stmt_init();
if ($stmt->prepare("INSERT INTO Friends (FirstName, LastName,
    FavoriteColor) VALUES (?, ?, ?)")) {
       $stmt->bind_param("sss", $firstname, $lastname, $color);
       ...

See, three ?s, three characters in the first argument, and three variables with our data.

Then we execute the statement, using $stmt->execute(), which sends our query to the database.

$color = "purple";
$stmt = $mysqli->stmt_init();
if ($stmt->prepare("SELECT FirstName, LastName FROM Friends
    WHERE FavoriteColor = ?")) {
       $stmt->bind_param("s", $color);
       $stmt->execute();
       ...

Since this SELECT statement is returning two values (FirstName and LastName), we can also bind the results. This part is optional, though still a good idea.

$color = "purple";
$stmt = $mysqli->stmt_init();
if ($stmt->prepare("SELECT FirstName, LastName FROM Friends
    WHERE FavoriteColor = ?")) {
       $stmt->bind_param("s", $color);
       $stmt->execute();
       $stmt->bind_result($firstname, $lastname);
       ...

After binding the results, we can use $stmt->fetch() to get the data from the result set. All that’s left to do is close the statement with $stmt->close(), freeing up resources, and close the connection with $mysqli->close().

$color = "purple";
$stmt = $mysqli->stmt_init();
if ($stmt->prepare("SELECT FirstName, LastName FROM Friends
    WHERE FavoriteColor = ?")) {
       $stmt->bind_param("s", $color);
       $stmt->execute();
       $stmt->bind_result($firstname, $lastname);
       while ($stmt->fetch()) {
           //do stuff with the data
           echo "$firstname $lastname";
       }
       $stmt->close();
}
$mysqli->close();

The other example:

//the data
$firstname = "David";
$lastname = "Peterson";
$color = "orange";

$stmt = $mysqli->stmt_init();
if ($stmt->prepare("INSERT INTO Friends (FirstName, LastName,
    FavoriteColor) VALUES (?, ?, ?)")) {
       $stmt->bind_param("sss", $firstname, $lastname, $color);
       $stmt->execute();
       $stmt->close();
}
$mysqli->close();

INSERT does not return any results for binding. If you want information regarding the success of the query, you can use $mysqli->affected_rows to get back a count of affected rows. Even more useful is $mysqli->insert_id which returns the auto-incremented ID for the new row (if there is one, and if you call it right away before inserting or updating anything else.)

If you chose to skip binding the results, then you cannot use $stmt->fetch(). Instead, you must use  $stmt->get_result() and then $result->fetch_row() or an equivalent. Like so:

$color = "purple";
$stmt = $mysqli->stmt_init();
if ($stmt->prepare("SELECT FirstName, LastName FROM Friends
    WHERE FavoriteColor = ?")) {
       $stmt->bind_param("s", $color);
       $stmt->execute();
       $result = $stmt->get_result();
       while ($row = $result->fetch_row()) {
           //do stuff with the data in the $row array
       }
       $result->close();
}
$mysqli->close();

And that’s how you use mysqli and parameterized statements.

Oct 19, 2011: Updated to fix typos in the method name.
Apr 19, 2014: Updated to add missing commas in the code.

Do you speak SQL? (Part 1)

SQL stands for Structured Query Language and it is used to talk to relational databases. While there is a standard for SQL, many databases also have their own dialects, so to speak. This is a quick intro to basic SQL, nothing fancy. In the examples, I will use the following sample data:

Friends table:

FirstName LastName FavoriteColor
James Smith blue
Mary Johnson green
David Williams red
Linda Jones purple
Daniel Brown green
Jenny Davis blue
Nancy Smith orange

SELECT

This is the most common statement used, as it gets data from the database. The basic syntax of a SELECT statement is:

SELECT FirstName, LastName, FavoriteColor FROM Friends;

Multiple fields are generally separated by commas, as are multiple tables in more advanced SELECT statements. The SELECT statement may also have optional WHERE and ORDER BY clauses and other refinements.

One of those refinements is the asterisk (*) used in place of the fields:

SELECT * FROM Friends;

* stands for all fields in the table. This certainly saves time, but can cause issues if the fields returned are not the fields expected, say because someone has changed the structure of the table. I generally never use * in my applications for that reason.

Another refinement is the DISTINCT keyword:

SELECT DISTINCT FavoriteColor FROM Friends;

This will return:

FavoriteColor
blue
green
red
purple
orange

Without the DISTINCT keyword, we would get:

FavoriteColor
blue
green
red
purple
green
blue
orange

WHERE limits the data returned to that which matches the expression in the WHERE clause.

SELECT FirstName, LastName FROM Friends WHERE FavoriteColor = 'purple';

FirstName LastName
Linda Jones

‘purple’ needs to be quoted because it is a string rather than a number. Numbers are not quoted.

Multiple conditions in a WHERE clause can be joined by the AND keyword. There is also an OR keyword.

SELECT FavoriteColor FROM Friends WHERE FirstName = 'Jenny' AND LastName = 'Davis';

FavoriteColor
blue

LIKE looks for data that matches a certain condition. LIKE has two wildcard characters: ‘_’ which matches any single character in a string, and ‘%’ which matches any number of characters in a string.

SELECT FirstName, LastName FROM Friends WHERE FirstName LIKE 'J%';

FirstName LastName
James Smith
Jenny Davis

ORDER BY sorts the returned data. ASC specifies ascending order and DESC specifies descending order.

SELECT FirstName, LastName FROM Friends ORDER BY LastName ASC;

FirstName LastName
Jenny Davis
Daniel Brown
Mary Johnson
Linda Jones
James Smith
Nancy Smith
David Williams

INSERT, UPDATE, DELETE

These are used to add, change, and delete rows in tables.

An INSERT statement adds a row to a table.  The syntax for an INSERT statement is:

INSERT INTO Friends (FirstName, LastName, FavoriteColor) VALUES ('David', 'Peterson', 'orange');

An UPDATE statement changes an existing row in a table.

UPDATE Friends SET LastName = 'Anderson' WHERE FirstName = 'Nancy';

A DELETE statement deletes a row from a table.

DELETE FROM Friends WHERE FirstName = 'David' AND LastName = 'Williams';

After executing these INSERT, UPDATE, and DELETE statements, our data looks like so:

Friends table:

FirstName LastName FavoriteColor
James Smith blue
Mary Johnson green
Linda Jones purple
Daniel Brown green
Jenny Davis blue
Nancy Anderson orange
David Peterson orange

A short (very short) XML primer

So, XML. It’s actually very easy. It’s a lot like HTML, except you get to make up your own tags. The rules are:

  1. Any text ought to be surrounded by tags: <my_tag>text</my_tag>
  2. Do not nest tags. <my_tag><bold_text>text</my_tag></bold_text> is not allowed. Use <my_tag><bold_text>text</bold_text></my_tag> instead.
  3. There needs to be a tag that starts and ends the document. This is generally called the root node.

That’s it, mostly.* Follow those rules and you will have well-formed XML. Valid XML, however, requires a set of tag definitions to validate against. Those rules are written in a dtd file. Like so:

<!ELEMENT my_tag (tags and stuff my_tag can contain) >

A dtd needs an ELEMENT declaration for every tag in your XML file. Note that in my previous post, I showed two dtds for two XML files. A line like so:

<!ELEMENT english_term (#PCDATA)>

means that the tag <english_term>...</english_term> contains text but no other tags. A line like so:

<!ELEMENT english_entry (english_term, pronunciation, definitions+, subentry*)>

lists the tags that <english_entry>...</english_entry> contains. Note that some of these tags have characters after them: +, *, ?. Those have a special meaning:

  • + means that 1 or more tags of this type can exist within the defined tag
  • * means that zero or more tags of this type can exist within the defined tag
  • ? means that zero or 1 tag of this type can exist within the defined tag
  • no mark means that 1 tag of this type must exist within the defined tag

There. That is basic (very basic) XML. Easy!

*Then there are attributes and namespaces and other optional complications.

Converting static pages to dynamic pages using PHP and MySQL, Part 1

The challenge:
Take Adam Walker’s Carrajina dictionary (http://carrajina.conlang.org/dicthome.html) and stick it in a database so that Adam can add an entry whenever he wants without re-writing an html page and possibly messing up his html. One stricture is that Adam does not want to learn too much new stuff – no SQL, no PHP, just HTML. And if he can enter SAMPA and have it turn into IPA, that’s a plus!

Step 1: get the html source code for the current dictionary pages. Examine them carefully and figure out the structure. Outcome:

For the English to Carthaginian side:

Each entry consists of an English word, followed by a pronunciation. Then come definitions: a number if there’s more than one, a part of speech, and the definition text – a set of glosses in Carrajina sometimes with explanation. In addition, some entries have subentries, which have an English word or phrase, the definition number if there’s more than one, and the definition text.

I turned this into XML to make it easier to analyze. Turning it into a spreadsheet might also have worked. The dtd for this part of the dictionary is:

<!ELEMENT english-carthaginian (english_entry+)>
<!ELEMENT english_entry (english_term, pronunciation, definitions+, subentry*)>
<!ELEMENT english_term (#PCDATA)>
<!ELEMENT pronunciation (#PCDATA)>
<!ELEMENT definitions (definition_entry+)>
<!ELEMENT definition_entry (definition_number, part_of_speech, definition) >
<!ELEMENT definition_number (#PCDATA) >
<!ELEMENT part_of_speech (#PCDATA) >
<!ELEMENT definition (#PCDATA) >
<!ELEMENT subentry (english_term, pronunciation?, definition+)>

Creating the database model, I decided to put the definitions in their own table. Also, since the subentry is more or less identical to a head entry, I decided to put them in the same table. The diagram:

data model for english-carthaginian tables

For the Carthaginian to English side:

<!ELEMENT carthaginian-english (entry+)>
<!ELEMENT carrajina_entry (carrajina_term, pronunciation, definitions+, etymology?, note?, idioms*)>
<!ELEMENT carrajina_term (#PCDATA)>
<!ELEMENT pronunciation (#PCDATA)>
<!ELEMENT definitions (definition_entry+)>
<!ELEMENT definition_entry (definition_number, part_of_speech, definition) >
<!ELEMENT definition_number (#PCDATA) >
<!ELEMENT part_of_speech (#PCDATA) >
<!ELEMENT definition (#PCDATA) >
<!ELEMENT etymology (#PCDATA)>
<!ELEMENT note (#PCDATA) >
<!ELEMENT idioms (idiom, idiom_definition)>
<!ELEMENT idiom  (#PCDATA)>
<!ELEMENT idiom_definition (#PCDATA) >

Again, I put the definitions in their own table. Despite the fact that English and Carthaginian definitions both have the same structure, I will keep them in separate tables. Since an entry can have multiple idioms associated with it, those will go in their own table, too. Since etymologies are common, I will keep those in the same table with the terms. Notes, however, are not very common, so I will put them in their own table. This is a space- and memory-saving move. It is perfectly valid to put notes in the same table as terms as well.

data model for carthaginian-english tables

The full model:

full data model

Note: ignore the VARCHAR(45). I haven’t decided on the length of these fields yet.

Questions? Comments? Mistakes?