MySQL tutorial

Tags: tutorials, webdevelopment.
By lucb1e on 2012-01-27 19:48:21 +0100

This is a beta version!

There are a lot of MySQL tutorials out there, but like with many tutorials, it stops at some point. It teaches you the basics and then either stops, gives only a sort of reference manual for more, or doesn't explain advanced concepts well. If you want to know more, you'll have to Google for it on your own, but there is almost never one tutorial from the beginning to the very end. Also, most tutorials are very practical. If [this], do [that], but never because of [reason].

In this MySQL tutorial I will attempt to explain it from the beginning up to where I know. And I'll attempt to explain the concept, not only the working of the commands.

Contents
1. Tables: How to efficiently store data in tables.
2. Notation and terminology: How some things are denoted, and some terminology used.
3. Relational database: What relations are, what their use is, how they work and what they do.
4. Data types: Different types of fields (numbers, text, timestamps...).
5. SQL queries: Basics of (My)SQL.
6. PHP: Using MySQL in PHP,
7. Erroring queries: What errors it may turn up look like.
8. More advanced queries: Selecting data from multiple tables at once.
End
Note to self about what more to write about.
Remarks


1. Tables

Tables are the core of SQL, not only MySQL. All data you save is stored in a twodimensional array you could say, like an Excel spreadsheet. This may seem very limiting to you at first, it did to me, but you need to get used to it. The process of saving data efficiently in tables is called normalisation. This is a term which is thrown often by wannabes and they make it sound hard, but it basically comes down to 'just do things properly'. I'll give some examples.

Table named "users", fields: "name", "email", "displayname".
nameemaildisplayname
Lucspam@lucb1e.comlucb1e
Zaphod Beeblebroxsomeone@example.comPresident Zaphod


This looks straigh-forward. And it is, this is the most basic example I can think of. Let's complicate it a bit!

Say you want users to be able to enter multiple e-mail addresses. You could solve it in two ways:
nameemaildisplayname
Lucspam@lucb1e.com-spam2@lucb1e.comlucb1e
Zaphod Beeblebroxsomeone@example.com-anotherMe@mail.tldPresident Zaphod


See the hyphen (-)? That way you can separate e-mail addresses, easy. But what if someone had a hyphen in his email address? "person-two@example.org". Well that'd be a problem, but you could "escape" it: "person\-two@example.org". This is not ideal though.

Another way would be to do this:
nameemailemail2displayname
Lucspam@lucb1e.comspam2@lucb1e.comlucb1e
Zaphod Beeblebroxsomeone@example.comanotherMe@mail.tldPresident Zaphod


That looks better. But, you can only use 2 email addresses now. And if a user only uses one, you will have some space taken on your disk which isn't actually used since the field is empty.

The correct way to do this is by using two tables:
Table 1: "users": "name", "displayname"
Table 2: "emails": "name", "address"

namedisplayname
Luclucb1e
Zaphod BeeblebroxPresident Zaphod

nameemail
Lucspam@lucb1e.com
Lucluc@example.com
Zaphod Beeblebroxpresident@example.com


This way might seem a bit weird to you, but this is the best way to do it. Two tables. There is only one problem with this though, let me illustrate.

Table 1: "users": "first_name", "last_name", "displayname"
Table 2: "emails": "first_name", "last_name", "email"
first_namelast_namedisplayname
ZaphodBeeblebroxPresident Beeblebrox

first_namelast_nameemail
ZaphodBeeblebroxpresident@example.org
ZaphodBeeblebroxzaphod@example.com


Something messed up here, don't you think? Indeed, the data is redundant. Redundancy is the worst thing you can have in a database.

Since there is no field which is unique, we need to include both the first and last name in the emails table. Only this way we know for sure which person we mean.

But there are people with the same first and last name... And whatever way, this is all messed up. What we need is an identifier. The ID field.

Table 1: "users": "id", "first_name", "last_name", "displayname"
Table 2: "emails": "userid", "email"
idfirst_namelast_namedisplayname
1ZaphodBeeblebroxPresident Beeblebrox

useridemail
1president@example.org
1zaphod@example.com


That is the way! This is the final correct way to do this.
Like this, normalisation looks pretty easy, and it is once you get used to it. But there are actually certain steps to take to make sure you don't forget anything in the process. I'll probably explain those later, but I hope you will do it automatically. All SQL tutorials I've ever followed started off with "Make a database, open phpmyadmin, make a table with this and this in it, enter this in php, and it does something". This more theoratical approach, which shows you how it is done ideally, might give you a better insight from the start. I didn't actually find normalisation hard at all, so hopefully you'll do it naturally :)


2. Notation and terminology

A short chapter on how these things are actually denoted. Let's take the last table as an example:

users: id, first_name, last_name, displayname
emails: userid, email

See the underline under the id field? This usually says "I'm the one unique field in this table, I identify this record". It's called the primary key. There are more special fields like indexed or unique, but they aren't really important at the moment. You might ask What do you need an id field for when there is another field which is unique for each record?, and that is a very good question. The answer is that an id is efficient (it usually takes 4 bytes in the database while a name can take as many as 50, but I'll get to this in part 4), it is easy to use, and everyone uses it. That last is hardly an argument, but it still is an advantage.

Sometimes you also need to underline the foreign keys, but that's usually just at school. I'll explain in part 3 what this type of key is.

Now some terminology
- Column: synonym for Field.
- Row: One record in the database.
- Record: One row in the database :D
- Row or record: Allright then. Remember the table examples in part 1? A horizontal row there is... well, a row. The vertical one (from top to bottom) is a column.

One more thing on notation
Sometimes a field is written down like this:
databasename.tablename.columnname
yourDatabase.users.first_name

More often only the databasename and tablename, or tablename and column name. Anyhow, they are separated by periods and always in this order (from database to table to column).


3. Relational database

MySQL is a relational database engine. This means that you can set relations between fields. For example the id field in the users table and the userid field in the emails table match, this is a relation.

Why should you add relations in a database? There is no use really if you are a robot. Humans however make mistakes, and if you enter impossible data it will tell you that you can't do that.
Another purpose for adding relations is readability. Again, if you are a robot there is no point in this, but humans simply can't compute. We find things much clearer visually.

For example, this is is a database for an unfinished project a couple weeks ago.
acl: object, item, permission
items: id, name
objecthierarchie: parent, child
objects: id, name, type(user|group)
permissions: id, name, type(allow|deny)

Any idea how these things relate? Probably not, I wouldn't have a clue either. A visual representation with relations is much clearer:


You still might not know what does what exactly, but a practiced eye gets the idea of the database much quicker now. It should also be clear to you how the fields relate, even if not what everything is for.

For the query though, the code you send to the database engine (MySQL) to get the data you want, it makes no difference whether you have relations in your database or not.

The foreign key I was talking about is the field which is linked to a primary key. Like users.id (primary key) was linked to emails.userd (foreign key).


4. Data types

Datatypes are , well, types of data. For example it would be smart to set a name to be text and the year of birth to be a number.

MySQL knows many datatypes, many of which you won't use for at least a year (if you are new to this). The most important are the following:
Int: A number without decimals. Can be roughly be between -2 billion and 2 billion. Takes 4 bytes or 32 bits.

Bigint: Twice the size of an integer, 64 bits. Can store numbers between –9223372036bln and 9223372036bln, ought to be enough for most purposes lol.

Float: A number with decimals, good for storing monetary values. 32 bits.

Double: Same as a float, but with 'double precision'. 64 bits. Usually not needed.

Varchar: Short texts up to 255 characters. A length is required, e.g. 50 characters, to set how long the field can be.

Text: Texts up to about 60KB.

Blob: Binary Large OBject, for storing things like images. You usually don't want to do this anyway, it's generally better to store it in files on the disk, but it is used sometimes so it's good to know what it is and how it works.

Enum: Set of choises. Best illustrated with an example: Table "support_requests", field "status", type "enum('open', 'in_process', 'closed')". The status field can be either of those 3, but never any other value. This is efficient, prevents you from making mistakse (typos), and makes the database more readable too.

I deliberately leave the date and time (and datetime) datatypes. They store dates and times in a readable format: "YYYY-MM-DD HH:MM:SS". This looks good, but only to you. If you want to add 5 days to this the computer is like "what?!"
Better is using a unix timestamp. Incase you never heard of it, it's the number of seconds since 1-1-1970 00:00. The current timestamp at the moment of writing is 1327670861 (1,327,670,861). This fits inside an integer (int) and is precise to one second, precise enough for almost everything. Calculating with this is easy, adding 5 days goes like this:
post_time = post_time + (3600 * 24 * 5).
3600 is the number of seconds in an hour, then times the number of hours in a day, times the number of days you want. PHP has unix timestamps closely integrated:
post_time = strtotime("5 days", post_time)
Will add 5 days to the current value of post_time.
Or displaying a time:
print date("Y-m-d H:i:s", post_time)
Will print (display) something like this: 2012-01-27 14:31:23

You can easily format a unix timestamp in PHP, other languages are often less integrated but still handle it easily. To be honest, I never searched a lot for howto use the date field in MySQL (so to display the date in MM-DD instead of YYYY-MM-DD, for example), but I think you'll have to get it trough some parser first. Also I don't think you can compare dates like this:
if (date_field1 < date_field2) then ...
With a unix timestamp, both values are numeric, you can compare it like any other number.

Example
users: id int, first_name varchar(50), last_name varchar(50), displayname varchar(50)
emails: userid int, email varchar(100)
messages: userid int, message text, post_time int

The number between parenthesis is the length of the varchar. Names can be max 50 characters, email addresses 100. This should usually be enough.


5. SQL queries

Now we're getting to the practical stuff. First of all, we need a way to communicate with the database. This can be done in many ways, I think the easiest is to sign up for free hosting. Another way is installing everything locally, but this is probably slightly harder.
For example, go to bplaced.net and sign up for an account. I've never had spam from them, so you can use your real e-mail address. You probably need to confirm that address too. (Why bplaced? They are one of the biggest, fastest and most reliable free hosts who offer a number of databases, ftp accounts, and webspace with PHP installed).
Once logged in, go to "MySQL datenbanken" and under "Datenbank hinzufügen" near the bottom you can add a database. By the way, I can't really read German either, but the website is intuitive enough to use without that I think. Also if you submit a support ticket in English, they will reply in English. In fact, they replied to me on a sunday evening at 8pm, try find a paid host who does that without extra cost.
Then go to phpmyadmin.yoursubdomain.bplaced.net and log in with yoursubdomain_yourdatabase and the password you set. Find a way to add a table (probably on the left, I'm not sure). If you can't find it, go to the databases tab on top, select the database, and then fill in the table name "users" and 3 fields.

As example, we are going to enter the database I used before:
users: id int, first_name varchar(50), last_name varchar(50)
emails: userid int, email varchar(100)
messages: userid int, message text, post_time int

The id has two special settings: auto_increment should be on and the index should be primary. Your screen will look something like this.
Leave the other fields and click Save. For now the users table is enough. Click the table users, and go to the tab SQL. Going into the table is useful because it now shows a list of columns on the right.

On the left, you can see it automatically generated a query for you. This query it made will give you the entire database, but since it is empty at the moment, there is no sense in running this query.

There are a few types of queries you need to know about now:
- select: Gets you data
- insert: Adds data
- update: Edits or updates data
- delete: Deletes data

In a query there are rougly three things: SQL, names and data. Data can't be changed, that's simply what you want to have in the database. Names, like fields or tablenames, are usually written in lowercase or with only the first letter uppercase. I'm for all-lowercase myself. SQL, the actual commands, are written in UPPERCASE. You can see this in the query it automatically generated. SQL itself is not actually case-sensitive, you can write "update this" as well as "UPDATE this". It is just a good thing to do, and many people actually regard it stupid not to do this.

Also, names should be between backticks (`name`, note that this is no apostrophe! That's this: 'data'). You can usually make them by the key to the left of your 1 and above your tab key. If you press it and nothing happens, press it again, there will probably appear two backticks now. It works the same as apostrophes.

Enough theory, let's get some data into this table! The syntax for an insert query is:
INSERT INTO tablename (field1, field2, ...) VALUES(data for field 1, data for field 2, ...)

Since the id field is automatically incrementing, we don't need to specify the id. Only the first- and lastname are going to be inserted. Run this query:
INSERT INTO `users` (`first_name`, `last_name`) VALUES('John', 'Smith')

You can use newlines as you want. Putting VALUES on a new line is no problem if you like to. Click the button "Go". It will tell you 1 row was added. Huray!
Note that in this example, the text in uppercase ("INSERT INTO", "VALUES") were SQL commands. The `users`, `first_name` and `last_name` were the fields (with backticks as I said). And lastly 'John' and 'Smith' was data. The parenthesis can be seen as a part of SQL, though you can hardly write them in uppercase.

Now we want to see that row. The most basic way is:
SELECT fields FROM tablename
SELECT `id`, `first_name` FROM `users`
Clicking Go will display the id and firstname. To go back to the SQL input, click the SQL tab again.
SELECT * FROM `users`
Clicking Go will show all fields, the * does that.

John decided to change genders. Let's update the database.
UPDATE table SET field=updated_value, otherfield=updated_value
UPDATE `users` SET `first_name` = 'Jane'

What this does is updating all records in the database. In this case it is just one, but if there were more records, everyone would get the first name "Jane". To prevent this, we can use a WHERE clause.

UPDATE `users` SET `first_name` = 'Jane' WHERE `id` = 1

Only records with the ID 1 (which is only 1 record since the ID is unique) will be updated now. Don't ever forget to use the WHERE clause here! It will lead to big trouble.

Lastly, we are going to delete some data. Here too, don't forget a WHERE! Any time you use a delete query, be sure to check it at least two times. By not doing so you can end up deleting all data in the table.
DELETE FROM table WHERE conditions
DELETE FROM `users` WHERE `id` = 1

Our table is now empty again.

This were the very basics. Let's add the other tables:
emails: userid int, email varchar(100)
messages: userid int, message text, post_time int
You can find the Create Table button on the left of the screen if I'm not mistaken. You should know how to do the rest.

Run the following query:
INSERT INTO users (first_name, last_name) VALUES('Jane', 'Doe'), ('Zaphod', 'Beeblebrox'), ('Luc', 'b1e');
INSERT INTO emails VALUES(2, 'jane.doe@gmail.com'), (3, 'Zaphod@beeblebrox.net'), (4, 'spam@lucb1e.com');
INSERT INTO messages VALUES(2, 'Hello, World!', 1327678227), (2, 'I\'m Jane', 1327678277), (4, 'Status update: Writing SQL tutorial', 1327678300);


There are a few new things here.
First of all, notice how we can insert multiple values at once.

Second, you might or might not have noticed before, there is a clear distinction between numbers and text. Text has to be between apostrophes or quotes ('' or "") and numbers can be written down the way they are. Also negative numbers (with a hyphen in front of them (-102)) and decimals (1.02) are no problem. Make sure you use a period (.) instead of a comma (,) to denote decimals though!

Third, not all fields have backticks. Programmers sometimes like to joke they're programmers because they are lazy and they like to make other things work for them. Backticks are not needed everywhere, so we often leave them.
If you have a field named "update" in your database which is a unix timestamp of when the last update was, then you must use backticks. "UPDATE users SET status='something', update=123" is not going to work. There must be backticks around update (the one in lowercase), but the other fields are not needed. It is good practice to not name fields the same as commands anyway, but you can always try to leave backticks. When something doesn't work, simply add them.

Fourth, you may have noticed the \' in the second message. This is to prevent the ' from ending the data accidentally.

Fifth, there are multiple commands separated by a semi-colon (;). This usually doesn't work, only PHPMyAdmin ran the queries separately.

Lastly, I wanted to note is that you can actually use spaces in column names, but then you must always use backticks around the field:
UPDATE `some table` SET `first name` = 'Goofy' WHERE id = 5

I personally find it annoying and concider spaces bad practice. Underscores (_) are better for readability, using casing (firstName) is quicker to type. Since this is a tutorial I'll use underscores (more readable), but I often use casing myself. Choose either way yourself, but be consistent. You can switch from one way to another between projects, but it's better to keep one style per project. Same goes for indentation style in PHP, but that's another subject.


6. PHP

Now we've got the basics, it's time to use it.
If you are familiar with FTP, log in via FTP on your bplaced account. If you are not, the easiest way is by going to net2ftp.com and enter the following details:
Server: yourSubdomain.bplaced.net
Username: yourSubdomain
Password: your password
Login. Other settings are fine by default.

Create a new file named sqltest.php or something, anything ending in .php is good. For the text in the file, enter this:
mysql_connect("localhost", "yourSubdomain_databasename", "database password") or die("Could not connect to database. Errormessage: " . mysql_error());
mysql_select_db("yourSubdomain_databasename") or die("Could not open database. Errormessage: " . mysql_error());
$result = mysql_query("SELECT * FROM users") or die("Error: " . mysql_error());
echo "End of script.";


Save the file and load http://yourSubdomain.bplaced.net/sqltest.php
You should get a page saying "End of script.". If there are any errors, it should give a clear enough errormessage what went wrong. If you get a page with an advertisement on it, look at the bottom. It probably says the page is not found, check if you typed it correctly and if you named it correctly. This advertisement is from bplaced to be able to provide you free hosting.

Having verified it works, we can expand our script to actually display something.
Add to the file (below everything else):
while ($row = mysql_fetch_array($result)) {
    echo "UserID: " . $row["id"] . "<br>";
    echo "First name: " . $row["first_name"] . "<br>";
    echo "<br>";
}


Save the script, and load the page on yoursubdomain.bplaced.net again (F5 if you left it open).

It should be clear enough what this code does. Going into detail about this would be more of a PHP tutorial, which I could do but I don't really feel like doing that at the moment lol.
Here is a script that will display tables a bit like phpmyadmin would. Replace the second part you just added (from "while (...) {" until the "}") with this:
if(mysql_num_rows($result)==0){die("No results.");}
$row=mysql_fetch_array($result);echo "<table><tr>";
$s=false;
foreach ($row as $column=>$value){if ($s==true){echo "<th>" . $column . "</th>";}$s=!$s;}echo "</tr>";
mysql_data_seek($result,0); while ($row = mysql_fetch_row($result)){echo "<tr>";foreach ($row as $column) {echo "<td>" . $column . "</td>";}}


This code is extremely ugly, but my admin panel still isn't optimized for writing decent code. It's not intended for reading anyway, the mysql_query() is the part we will be focusing on.

Save the file and verify that it's working by reloading the page again. You'll get a table with columns (including column names) and the resulting data.


7. Erroring queries

Just so you recognise what happens if something goes wrong (hopefully you haven't already seen this, I tried the above steps myself and I didn't get errors). Let's enter an erroring query in there:
mysql_query("GET ME COOKIES")
Important: Only edit the part between quotes (""), leave the "$result = " before and "or die" thing behind it. Those parts gather the results and display any errors!

If you run it now (save the file, (re)load the page), it will give you a syntax error:
"Error: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'GET ME COOKIES' at line 1"
It doesn't recognize the GET part and displays what comes behind that. Note how it says "at line 1", you can actually spread the query across multiple lines. You'll see that later.

Another erroring query:
"SELECT the_capital_of_France FROM users"
Run it, and you'll see it doesn't know the capital of France. Bplaced's servers never had geography in school I guess. It only knows the columns you defined in the users table of course, which are id, first_name and last_name.


8. More advanced queries

We want to display a list of messages posted by users. This can be done in two ways, a good way and a bad way. The bad way is done by all beginners though, don't feel bad if you use this. I've done it often enough ;)

The bad way:
$messages = mysql_query("SELECT userid, message FROM messages");
while ($msg = mysql_fetch_array($messages)) {
    $user = mysql_query("SELECT * FROM users WHERE id = " . $msg["userid"]);
    //This does: mysql_query("SELECT * FROM users WHERE id = X"),
    //where X is `messages`.`userid`
    $user = mysql_fetch_array($user);
    echo $user["first_name"] . " " . $user["last_name"] . " posted:<br />";
    echo $msg["message"] . "<br /><br />";
}


Note: if you want to test this code (you don't really have to), put it under the 'mysql_select_db' statement and cut/paste the rest to somewhere else. You'll need it later.

Why is this way bad?
There is a mysql_query inside the while loop here. If it is in any way possible to get the query outside a loop, please do. This is what Blake O'Hare had to say about this method:
"If you do this, your database administrator will hunt you down in the middle of the night and bludgeon you to death with a sack full of blown-out CPU fans from the MySQL server that he had to replace numerous times because of your poorly written code."

The pretty good way:
$messages = mysql_query("SELECT messages.message, users.first_name, users.last_name FROM users, messages WHERE messages.userid = users.id");
while ($message = mysql_fetch_array($messages)) {
    echo $message["first_name"] . " " . $message["last_name"] . " posted:
";
    echo $message["message"] . "<br /><br />";
}


Notice how everything you want is nicely collected by MySQL now, not PHP anymore. The $messages result is the only result set you need.
Why is this the 'pretty good way', not just the good way? JOINs are the way to use two tables, not putting two tables in a FROM. Don't ask me why, nobody ever gave me a reason, but everyone universally agrees. Also it is better to get used to JOINs, and when you need to edit the query you can just do this without rewriting it entirely.

The way you are supposed to do it:
Same code as above, only replace the query with this:
"SELECT messages.message, users.first_name, users.last_name FROM messages INNER JOIN users ON messages.userid = users.id"
It will give you the same result again.

How most webdevelopers would do it:
"SELECT m.message, u.first_name, u.last_name FROM messages m INNER JOIN users u ON m.userid = u.id"
The letter behind the tables we use (m and u) are abbrevations we can use. We could also do:
"SELECT msgs.message, u.first_name, u.last_name FROM messages msgs INNER JOIN users u on msgs.userid = u.id"
This makes no difference at all.


The End, for now.

Note to self, todo:
- More complex joins
- SELECT something AS anotherthing
- LIMIT
- ORDER BY
- Small PHP tutorial to get the basics to use this (or a link to that)
- AVG()
- Arithmetic
- Logic (OR, AND)
- LIKE
- Passwords
- Write about normalization
- Excersises
- SQL Injections
- LAST_INSERT_ID()
- UNION
- MySQL vs SQL vs ..?

(Meh that list is longer than I thought it would be)


Remarks.

While writing, I noticed that SELECT, UPDATE, DELETE and INSERT all are the same length. How odd I never noticed!

Thinking of a random e-mail address I found beeblebrox.net, a funny one-page website having the hacker logo as favicon (I often use it as profile image). Nice ^^

This is the longest blogpost I've made so far.
lucb1e.com
Another post tagged 'webdevelopment': CSRF: It's not trivial

Look for more posts tagged tutorials or webdevelopment.

Previous post - Next post