Anatomy of an HTML page – part 1

With more and more editors being expected to correct content on the web, it’s becoming increasingly useful for us to know at least the basics of HyperText Markup Language (HTML). In this post, John Espirian provides a simple introduction to the language of the web.

Whenever you view a website, your web browser converts HTML code into rendered text, images and other media. If you ever want to fix or amend the contents of a page, you’ll often need to change that HTML code. The very thought of this strikes fear into many hearts, but that needn’t be the case.

By understanding the basic anatomy of a page, we can soon get to grips with our content.

Elements: the building blocks of HTML

HTML is made up of a series of hierarchical elements. Most elements have an opening tag and a closing tag, with some content sandwiched in between. If you’re used to marking up chapters, headings and subheadings, this concept should be easy to grasp.

If this sounds alien to you, just think of HTML as a tree with the elements as its branches.

Here’s an example of the paragraph (<p>) element:

<p>The content goes between the opening and closing tags. This paragraph can be as long as I want it to be, and there's no problem with the content spilling over multiple lines. This whole block will be treated as a single paragraph.</p>

A small number of elements use a single tag. The HTML standard calls these void elements, but they’re often referred to as self-closing elements. Here’s an example of a self-closing element – the line break (<br>):

Line breaks insert new lines into<br>
the text. It's usually better to<br>
use paragraph elements instead.<br>

Note for nerds

In XHTML, self-closing elements need to be written with a forward slash (e.g. <br />), but we won’t worry about that in this series.
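
As a quick illustration of that note (a sketch only, and not something needed for this series), here are the two ways of writing a line break. HTML5 accepts both forms. The lines wrapped in <!-- and --> are HTML comments, which are covered in part 3.

```html
<!-- HTML style: a single tag with no slash -->
First line<br>
Second line

<!-- XHTML style: the same tag written with a forward slash -->
First line<br />
Second line
```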

Creating an HTML document

With a basic understanding of the sorts of elements that exist, we can start to build our own HTML page. First, let’s create the file that will contain our code:

  1. Create a new text file in a plain-text editor such as Notepad (Windows) or TextEdit (Mac).
  2. Save the empty file as sample.html.

Text editor recommendations

If you want a better plain-text editor than the one pre-installed on your machine, you won’t go wrong by trying one of these:

Both free, both excellent.

OK, we now have somewhere to place our HTML code. When we’re done editing the sample file, we can save the file, close it, and then double-click to re-open it in a web browser. (By default, .html files open automatically in a web browser.)

The simplest web documents start with this opening tag:

<html>

We can help web browsers to correctly display our content by declaring the HTML standard we’d like to use on our page. To use the latest HTML5 standard, we should start our page as follows:

<!DOCTYPE html>
<html>

Because <html> is a standard element, its opening tag needs a complementary closing tag. Therefore, at the end of the document, we’ll use this code:

</html>

We don’t need to close the <!DOCTYPE html> tag at the end of the document, because it’s a declaration rather than an element.

Building the hierarchy

Just as a real tree’s branches can have their own branches, elements in HTML can be nested within other elements. Think of the <html> element as the single root from which all those branches ultimately grow.
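
Here’s a minimal sketch of nesting, assuming the <em> element that’s introduced later in this post. The inner element opens and closes entirely inside the outer one, so the paragraph is the branch and the emphasised text is a smaller branch growing from it:

```html
<!-- <em> is nested inside <p>: it opens after <p> and closes before </p> -->
<p>This paragraph contains <em>a nested element</em> within it.</p>
```

The order matters: close the most recently opened element first, just as you would close inner brackets before outer ones.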

Our root <html> element must contain the following elements, in this order:

  1. <head> – includes information about the page
  2. <body> – includes information displayed on the page

Another useful analogy – one I’ll refer to in a future post – is to think of the <html> element as the ‘parent’ and the <head> and <body> elements as the ‘children’.

Let’s put this together and see what we’ve got:

<!DOCTYPE html>
<html>
<head></head>
<body></body>
</html>

The above code defines a basic, empty HTML page. Everything else we add to the page will now be nested within the <head> or <body> elements. These elements will therefore become the parents of their own child elements.

Page titles

We can add a title to the page by including a <title> element within the <head> element. Here’s how that looks:

<head>
<title>Page title goes here</title>
</head>

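Did you notice that I added a carriage return after the opening <head> tag? That was done to aid readability. In HTML, you can add spaces, tabs and carriage returns between tags pretty much anywhere and the code will still work, because browsers generally treat any run of whitespace as a single space when they display the page.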
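
To illustrate, here’s a sketch of the same <head> element written two ways. The indentation in the second version is only an illustrative choice, and the lines wrapped in <!-- and --> are HTML comments (covered in part 3). A browser should treat both versions identically:

```html
<!-- Everything on one line -->
<head><title>Page title goes here</title></head>

<!-- Spread over several lines and indented for readability -->
<head>
  <title>Page title goes here</title>
</head>
```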

You won’t see the page title appear within the main part of your web browser screen, but it’s often displayed at the very top of the browser window, as shown in this example from Google Chrome:

[Image: Page title in Google Chrome]

Paragraphs and formatting

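Most text on a page should be placed inside <p> elements within the <body> element. We can set text in bold or italics by using the <strong> and <em> elements respectively. Strictly speaking, these elements mark text as important or emphasised; bold and italics are simply how browsers display them by default. Here’s an example: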

<body>
<p>This text sits within a paragraph element.</p>
<p>Other elements are used to write text in <strong>bold</strong> or <em>italics</em>.</p>
</body>

The result

If we put all of this code together, we produce the following:

<!DOCTYPE html>
<html>
<head>
<title>Page title goes here</title>
</head>
<body>
<p>This text sits within a paragraph element.</p>
<p>Other elements are used to write text in <strong>bold</strong> or <em>italics</em>.</p>
</body>
</html>

Still struggling to understand how all these elements fit together? Here’s a view of the code that emphasises the tree structure I mentioned earlier:

[Image: HTML tree – another way of looking at our HTML document]
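
The same tree can also be sketched in text by indenting the code so that each child element sits one level to the right of its parent. The two-space indentation is just an illustrative choice; browsers ignore it, and the page displays exactly as before:

```html
<!DOCTYPE html>
<html>
  <head>
    <title>Page title goes here</title>
  </head>
  <body>
    <p>This text sits within a paragraph element.</p>
    <p>Other elements are used to write text in <strong>bold</strong> or <em>italics</em>.</p>
  </body>
</html>
```

Read from left to right, the indentation shows that <html> is the root, <head> and <body> are its children, and <title> and the two <p> elements are children of those in turn.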

To see how the code looks when viewed in a web browser, take a look at this side-by-side screenshot of my own code editor and web browser:

[Image: HTML code (left) and browser output (right)]

And if you wish to view the real HTML page in question, here it is: sample.html.

End of part 1

This post should have given you a small taster of what HTML is about. In part 2, I’ll show you some of the other commonly used elements. After that, we’ll move on to some more interesting topics.

Series list

  1. Part 1 – building blocks of HTML
  2. Part 2 – headings, images and lists
  3. Part 3 – comments, tables and special characters
  4. Part 4 – quoting and citing
  5. Part 5 – the <head> element
