Overview of MIME Messages

This section starts by explaining the overall purposes of MIME. Next it shows a simple MIME message and introduces the basic parts of a MIME message.

Purposes of MIME

With MIME, you can:

  • Make "tree-structured" message bodies that have many levels of parts and subparts. This is like the idea of a UNIX filesystem, an office file cabinet, and so on. This lets you send many different things (messages, graphics, sounds, and so on) in the same message.

    Like non-MIME messages, though, many MIME messages have just one part.

  • Handle multiple character sets, even within the same mail message.
  • Transport many different types of data in the same message.
  • Pass data reliably through standard mail transports. The sender's MIME agent converts text and data into 7-bit us-ascii with line length limits. The recipient's MIME-capable mail program converts the message contents into their original formats.
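
To make the "tree-structured" idea concrete, here is a minimal sketch that builds a small nested message and prints its outline. It uses Python's standard email package rather than MH; the outline helper, the message text, and the placeholder GIF bytes are invented for illustration.

```python
from email.message import EmailMessage

def outline(part, depth=0):
    """Return one indented line per body part, walking the tree depth-first."""
    lines = ["  " * depth + part.get_content_type()]
    if part.is_multipart():
        for sub in part.iter_parts():
            lines.extend(outline(sub, depth + 1))
    return lines

# Inner message: a caption plus a (placeholder) picture.
# Adding an attachment turns it into multipart/mixed.
inner = EmailMessage()
inner.set_content("Caption for the picture.")
inner.add_attachment(b"GIF89a", maintype="image", subtype="gif",
                     filename="photo.gif")

# Outer message: a note, with the whole inner message as a second part.
msg = EmailMessage()
msg.set_content("Top-level note.")
msg.make_mixed()      # convert text/plain into multipart/mixed
msg.attach(inner)     # a multipart inside a multipart

tree = outline(msg)
print("\n".join(tree))
```

The printed outline has two levels of indentation: the outer multipart holds a text part and a second multipart, which in turn holds its own text and image parts.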

Let's look at a MIME message as it would arrive in your MH inbox. The message in the next Example was encoded by the sender; this is how it is actually stored on disk by the recipient's MTA (mail transfer agent), waiting for someone to read it. This is also how the message will look to a person who does not have a MIME-capable MUA (mail user agent):

Example: Encoded MIME message

From: Jerry Peek <jerry@ora.com>
To: carlos@entelfam.cl
Subject: Un d'ia dif'icil
MIME-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: quoted-printable

Carlos, estoy en la casa de mi amigo.  Pero, =A1qu=E9 d=EDa dif=EDcil!
Tom=E9 un taxi entre al aeropuerto y el hotel.  "=A1Tenga cuidado!",
me dijo el chofer.  "=A1Esta parte de la ciudad es peligrosa!"  Fue
evidente para m=ED.  Yo o=EDa que la ciudad tiene partes malas, y esa
parte pareci=F3 as=ED.  Los edificios ten=EDan barrotes sobre sus
ventanas, hubo gente sospechosa en la calle, y todas las puertas estaban
cerrada con llave.  "=A1Vaya con di=F3s!"  =C9l se fue.

Before we dig into that message, here's one thing you should be aware of. RFC 1521, the MIME specification that MH supports, only tells how to transfer non-ASCII text and graphics in the message body. Although RFC 1521 adds fields like MIME-Version: to a message header, it doesn't tell how to put non-ASCII text in a message header. So, in my message Subject:, I had to write the word día as d'ia. (In the body, that word would have been encoded d=EDa.) RFC 1522 -- which MH doesn't support (at least, not yet) -- is a standard for non-ASCII text in the header.
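
For the curious, here is a rough sketch of what an RFC 1522 "encoded word" looks like and how a program that does support it could decode one. The encoded Subject: string below is my own construction for illustration (it is not something MH produced), and the decoding uses Python's standard email.header module.

```python
from email.header import decode_header

# Hypothetical RFC 1522 form of the Subject: text; in the "q" encoding
# an underscore stands for a space and =ED is the byte for í.
encoded_subject = "=?iso-8859-1?q?Un_d=EDa_dif=EDcil?="

# decode_header returns a list of (bytes, charset) pairs.
raw_bytes, charset = decode_header(encoded_subject)[0]
subject = raw_bytes.decode(charset)
print(subject)
```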

MIME Header Fields

The header field MIME-Version: 1.0 in the previous Example tells that the message is in MIME format. It's a signal to mail programs that the message meets the requirements of RFC 1521. (Unfortunately, a few mail programs add that header field to messages that aren't in MIME format.)

Here's an overview of the other header fields that a MIME message may have. I'll toss in some MIME philosophy along the way. We can't cover everything in the MIME spec; it's close to 100 pages long! Other sections of this chapter, and sections of later Chapters, have more MIME information. The Section More About MIME explains how to find all the gory details.

  • Content-type: tells you what kind of data is in the message. The content can be text, image (still pictures), video, or audio. Another content type is message, which means that the content is structured in the standard RFC 822 format; this can be used for forwarding messages. The application content type is for data meant to be handed to an external program -- for example, text for a PostScript printer or viewer.

    Finally, a message can be multipart, with several separate body parts. It's possible for the parts to be of different types (the Section on Multipart Messages explains the MIME syntax for multipart messages). For example, a message could start with a text part to describe what's in the message, then an audio part with a message from a photographer, followed by five of the photographer's pictures (in five image parts). It's also possible for each of the multipart body parts to be another complete multipart message -- with its own parts. That is, MIME messages can be recursive.

    A Content-type: must have a subtype. The type and subtype names have a slash (/) between them. For instance, image/gif is a picture in the GIF format; the type is image and the subtype is gif. The MIME Reference Guide lists many of the common content types and subtypes.

    Finally, a Content-type: can have optional parameters at the end, starting with a semicolon (;). For example, the charset= parameter in

    Content-type: text/plain; charset=iso-8859-1
    

    says that the message body uses the ISO-8859-1 character set. (The default charset is us-ascii.)

    The default Content-type: is text/plain.

  • The Content-transfer-encoding: field tells how the message data was encoded for transfer. Encoding gets the message safely from the person who sent it, across mail transfer agents and gateways, into your mailbox. The Section MIME Encoding lists the values that can go in the Content-transfer-encoding: field.
  • The Content-ID: and Content-Description: header fields help to describe what's in the message body. These two fields are most useful for the parts of a multipart message.

    The Content-ID: is a unique string, similar to the RFC 822 field Message-ID:, that no other message in the world is likely to have. A typical Content-ID: value is <1283.780402430.1@ora.com>. Each part of a multipart message has its own Content-ID:. Its main use is identifying external parts and cached body parts.

    The Content-Description: field describes the content in words, like Report on Zeta Meeting. The RFC 822 Subject: field describes the whole message. You can add a Content-Description: in the message header, but it's more useful for describing a part of a multipart message.
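
Here is a small sketch of how a program might pull those header fields apart. It uses Python's standard email package (not MH), and the short message is trimmed from the Example above; treat it as an illustration of the type/subtype/parameter syntax rather than of how MH itself parses headers.

```python
from email import message_from_string

raw = (
    "From: Jerry Peek <jerry@ora.com>\n"
    "MIME-Version: 1.0\n"
    "Content-Type: text/plain; charset=ISO-8859-1\n"
    "Content-Transfer-Encoding: quoted-printable\n"
    "\n"
    "=A1Tenga cuidado!\n"
)
msg = message_from_string(raw)

content_type = msg.get_content_type()       # type/subtype, lowercased
maintype = msg.get_content_maintype()       # the part before the slash
subtype = msg.get_content_subtype()         # the part after the slash
charset = msg.get_param("charset")          # optional parameter after ';'
encoding = msg["Content-Transfer-Encoding"]

# A message with no Content-Type: field falls back to the defaults.
bare = message_from_string("Subject: hi\n\nplain text\n")
default_type = bare.get_content_type()
default_charset = bare.get_param("charset", "us-ascii")

print(content_type, maintype, subtype, charset, encoding)
print(default_type, default_charset)
```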

The list of content subtypes changes a lot. Quite a few subtypes are for non-UNIX computers. This book doesn't cover every content subtype; examples use some of the most common subtypes from RFC 1521. Your MH setup probably doesn't need to support all subtypes:

  • When you send a message, unless you know that the recipient needs a particular kind of content, it's a good idea to use the simplest (most-widely-available) content type and subtype that you can. For instance, to send a non-text data file, you should probably use application/octet-stream instead of application/mac-binhex40 because the BinHex format is generally used for Macintosh files.
  • If you get a message with a content that your MH hasn't been set up to handle, you may be able to make sense of the message by ignoring the MIME information. Just tell MH not to run its mhn MIME-decoding program by setting the NOMHNPROC environment variable. Look at the message on your screen. If that isn't enough, and you don't think you'll get many messages of this type,

    In my opinion, it doesn't make sense to constantly reconfigure MH for every new content subtype unless you have a lot of spare time. :-)

MIME Encoding

MIME messages are designed to be readable by all existing RFC 822-compatible mail programs (although, of course, MUAs that don't understand MIME won't be able to interpret the MIME-specific parts). The messages may be sent through all kinds of networks and gateways, so MIME encodes messages that have non-ASCII parts. The Content-transfer-encoding: field tells the recipient the way a message was encoded -- and how to decode it. (Encoding usually isn't required for plain ASCII text.)

For instance, the previous Example message, from me to a friend in Chile, is in Spanish (more or less :-)). Spanish uses characters like ¡ and ñ that aren't part of the ASCII character set. My MIME-capable MUA could encode the message's non-ASCII characters to pass safely through an ASCII-only mail transfer system. For instance, the character ¡ was encoded as the three-character sequence =A1. When the message gets to Chile, my friend's MIME-capable mail reader will translate the message's encoded characters to the correct Spanish characters.
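
As a rough sketch of what my friend's mail reader does, the snippet below decodes part of one line from the Example. It uses Python's standard quopri module, which is an assumption of this illustration; MH does the same job with its own decoder.

```python
import quopri

# Part of a line from the Example message body, as it travels through the mail system.
encoded = b"Pero, =A1qu=E9 d=EDa dif=EDcil!"

# Step 1: undo the quoted-printable encoding to recover the 8-bit bytes.
raw_bytes = quopri.decodestring(encoded)

# Step 2: interpret those bytes in the charset named in Content-Type:.
text = raw_bytes.decode("iso-8859-1")
print(text)
```

Notice the two separate steps: the transfer encoding gets the bytes back, and the charset says what characters those bytes stand for.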

One important feature of MIME's encodings is that they are designed to leave as much of the message in plain ASCII text as possible. In general, MIME only translates the characters that some email transfer systems would munge. So, if the recipient doesn't have a MIME-capable MUA, the encoded text in the message will probably still make some sense. (It's possible to encode text messages so that people can't read them without decoding. But, unless you want to hide words or have another reason, one of the less-severe encodings will probably do the job. In general, MH chooses human-readable encoding for text messages.)

Of course, when MIME encodes a binary file (like a digitized picture) that people can't read in the first place, the encoded data won't be any easier for a person to read. MIME encoding is designed to get the data safely through almost every known mail transfer system and gateway. One of the major wins in MIME is that it was designed to work everywhere, including "broken" and "brain-damaged" systems. Instead of trying to impose a new standard on mail transfer systems, MIME works with existing systems -- and adapts to their eccentricities.

Although you don't need to understand how encoding works to use MIME, you should have a general idea of the types of encoding. So, even if you'd like to skip the technical details in the following list, please do skim it and learn the types of encoding. There are five encodings:

  • 7bit is the default. It means that the message contents are plain ASCII text. Lines must be "short" (1000 characters or less) and end with CR-LF (carriage return plus linefeed).
  • quoted-printable is used for text that is mostly 7-bit but which has a small percentage of 8-bit characters. (There's quoted-printable text in the previous Example.) For instance, characters with the eighth bit on in the ISO-8859-n sets should be encoded as quoted-printable. Each 8-bit character is encoded into three 7-bit characters: = (an equals sign) followed by the two-digit hexadecimal value of the character. So the ISO-8859-1 character ñ, which has the hex value F1 (that's 11110001 binary), would be encoded as =F1.

    To keep the message readable on non-MIME readers, characters that don't have the eighth bit set generally shouldn't be encoded. The = character itself must be encoded, though; it's encoded as =3D. Also, space and tab characters at the ends of lines must be encoded (as =20 and =09, respectively); this keeps broken gateways from eating them. If a line ends with = followed by CR-LF, those characters are ignored; this lets you continue ("wrap") a long line.

    Lines must be no more than 76 characters long, not counting the final CR-LF. Longer lines will be broken when the message is encoded and joined again by decoding.

    Quoted-printable text was designed to be (mostly) readable by people with non-MIME mail programs.
  • base64 is used for data and other text that was never meant to be read by humans -- or must be preserved verbatim. Every 3 octets (24 bits) are encoded into a 4-character sequence. The 64-character set was chosen carefully. It comes from ASCII characters that aren't munged by known gateways or transfer systems.
  • 8bit is data made of 8-bit characters with "short" lines that end in CR-LF. This isn't too useful yet because 8bit data can't be shipped reliably over standard SMTP mail transport. The new ESMTP standard (RFC 1651) and 8bitMIMEtransport extension (RFC 1652) handle 8-bit MIME messages.
  • binary is like 8bit, but without CR-LF line boundaries.
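
The quoted-printable and base64 rules in the list above can be sketched in a few lines. The qp_encode_line function is a hypothetical helper written for this illustration: it handles a single line, does not insert soft line breaks for lines over 76 characters, and is not the encoder MH uses. The base64 part relies on Python's standard base64 module.

```python
import base64

def qp_encode_line(text, charset="iso-8859-1"):
    """Quoted-printable-encode one line (without its CR-LF); no soft breaks."""
    data = text.encode(charset)
    out = []
    for i, byte in enumerate(data):
        is_last = i == len(data) - 1
        must_encode = (
            byte == 0x3D                      # the = character itself
            or byte > 126                     # DEL and anything with the eighth bit set
            or (byte < 32 and byte != 9)      # control characters other than tab
            or (is_last and byte in (0x20, 0x09))  # space or tab at end of line
        )
        out.append("=%02X" % byte if must_encode else chr(byte))
    return "".join(out)

qp_n = qp_encode_line("\u00f1")               # the character ñ
qp_equals = qp_encode_line("a=b")
qp_trailing = qp_encode_line("end ")
qp_greeting = qp_encode_line("\u00a1Tenga cuidado!")

# base64: every 3 octets (24 bits) become 4 characters.
b64_three = base64.b64encode(b"MIM")
b64_six = base64.b64encode(b"MIMEMH")

print(qp_n, qp_equals, qp_trailing, qp_greeting)
print(b64_three, len(b64_six))
```

Only the last space or tab on a line is encoded here; that is enough to keep the line from ending in bare whitespace, because the whitespace before it is then followed by a printable character.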

One of MIME's main goals is to make different email programs work with each other. To make interoperability more likely, the MIME designers tried to avoid having lots of different content-types. They tried even harder to avoid lots of different encodings. MIME content types and subtypes, as well as encodings, are registered with the IANA (Internet Assigned Numbers Authority). "Experimental" unofficial values start with X-, like X-pbm. A few experimental content-types and subtypes are in wide use. But in general, so that as many people as possible can read your message, try to avoid inventing new content types and subtypes. If you get the urge to create, the comp.mail.mime newsgroup and the info-mime mailing list are great places to work it out.

 

 

 

 
