Internet Message Format

Line Length Limit

  • Each line MUST be no more than 998 characters, excluding CRLF (998 + CR LF = 1000)
  • Each line SHOULD be no more than 78 characters, excluding CRLF (78 + CR LF = 80)
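The two limits can be checked mechanically. The following is a minimal sketch, not a complete validator; the function name `check_line_lengths` and the sample message are made up for illustration, and the input is assumed to be raw bytes with CRLF line endings.

```python
MAX_LINE = 998   # hard limit, excluding CRLF
SOFT_LINE = 78   # recommended limit, excluding CRLF


def check_line_lengths(raw: bytes):
    """Return two lists of 1-based line numbers: (over 998, over 78)."""
    too_long, over_recommended = [], []
    for number, line in enumerate(raw.split(b"\r\n"), start=1):
        if len(line) > MAX_LINE:
            too_long.append(number)
        elif len(line) > SOFT_LINE:
            over_recommended.append(number)
    return too_long, over_recommended


# Hypothetical sample: the second line is 108 characters long.
sample = b"Subject: short\r\n" + b"X-Long: " + b"a" * 100 + b"\r\n\r\nbody\r\n"
print(check_line_lengths(sample))
```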

Internet Message (E-mail)

  • Message
    • Header
    • Empty line (<CRLF><CRLF>) separating header and body
    • Body
  • Header field
    • (Header) Field Name
    • : (colon)
    • Field Body
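As a small illustration of this structure, Python's standard `email` package can split a raw message into header fields and body. The message text below is a made-up sample.

```python
from email import message_from_string

# Header fields, then an empty line, then the body.
raw = (
    "From: foo@example.com\r\n"
    "Subject: Test\r\n"
    "\r\n"
    "Hello.\r\n"
)

msg = message_from_string(raw)
header_fields = msg.items()   # list of (field name, field body) pairs
body = msg.get_payload()      # everything after the empty line

for name, value in header_fields:
    print(f"{name}: {value}")
print(body)
```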

Unstructured and Structured

  • Header Field Bodies
  • Unstructured: plain ASCII text (may be folded; non-ASCII text uses MIME encoded-words)
    • (e.g. Subject: Test)
  • Structured: a fixed-format value followed by semicolon-separated parameter=value pairs (may also be folded; RFC 2231 covers internationalized parameter values)
    • (e.g. Content-Type: text/plain; charset=iso-2022-jp)

Long Header

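  • A long header field can be split over several lines by inserting CRLF before white space: Folding White Space (FWS)
  • Examples (both forms represent the same field):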
       Subject: This is a test
    
       Subject: This
        is a test
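Unfolding reverses this: every CRLF that is immediately followed by a space or tab is removed. A minimal sketch, with `unfold` as an illustrative helper name:

```python
import re


def unfold(header_field: str) -> str:
    """Remove each CRLF that is immediately followed by white space."""
    return re.sub(r"\r\n(?=[ \t])", "", header_field)


folded = "Subject: This\r\n is a test"
print(unfold(folded))
```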
    

Comments

  • Text enclosed in parentheses is a comment (allowed in structured field bodies)
  • Example:
       From: foo@example.com (just for example)
    

Syntax

Date and Time

  • Date: Fri, 11 Dec 2009 10:55:59 -0800

    date-time = [ day-of-week "," ] date time [CFWS]
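The Date field above can be parsed and regenerated with the standard library. This sketch uses `email.utils` and the example value from this slide.

```python
from email.utils import format_datetime, parsedate_to_datetime

original = "Fri, 11 Dec 2009 10:55:59 -0800"

# Parse into a timezone-aware datetime (offset -08:00).
dt = parsedate_to_datetime(original)
print(dt.isoformat())

# Format it back into the Internet Message Format date syntax.
round_trip = format_datetime(dt)
print(round_trip)
```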

Address

  • address = mailbox / group
  • mailbox = name-addr / addr-spec
  • name-addr = [display-name] angle-addr
  • angle-addr = [CFWS] "<" addr-spec ">" [CFWS] / obs-angle-addr
  • addr-spec = local-part "@" domain

Address Examples

  • John Doe <jdoe@machine.example>
  • "Joe Q. Public" <john.q.public@example.com>
  • "Giant; \"Big\" Box" <sysservices@example.net>
  • jdoe@example.com (John Doe)
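The address examples above can be taken apart with `email.utils.parseaddr`, which returns a (display name, addr-spec) pair. This is only a sketch: how the comment form in the last example is reported as a name is up to the library, so the code prints the results rather than assuming them.

```python
from email.utils import parseaddr

examples = [
    'John Doe <jdoe@machine.example>',
    '"Joe Q. Public" <john.q.public@example.com>',
    '"Giant; \\"Big\\" Box" <sysservices@example.net>',
    'jdoe@example.com (John Doe)',
]

parsed = [parseaddr(text) for text in examples]

for display_name, addr_spec in parsed:
    # addr-spec = local-part "@" domain
    local_part, _, domain = addr_spec.partition("@")
    print(repr(display_name), local_part, domain)
```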

Message ID

  • Message-ID: <4AF3CD0C.7030109@is.kochi-u.ac.jp>
  • In-Reply-To: <4AF2736B.7080307@is.kochi-u.ac.jp>
  • References: <4AF2736B.7080307@is.kochi-u.ac.jp>

Trace Fields

  • Return-Path:
  • Received: from mail.example.com (post.example.com [192.168.0.1]) by mail.example.jp for <foo@example.jp>; Sat, 12 Dec 2009 08:04:10 +0900

MIME

  • Multipurpose Internet Mail Extensions
  • MIME-Version: 1.0

Content Type

  • Content-Type: type/subtype
  • text/plain, text/html
  • image/jpeg, image/gif, image/png ...
  • application/pdf, application/vnd.ms-excel, application/msword

Content Transfer Encoding

  • Content-Transfer-Encoding:
  • "7bit" / "8bit" / "binary" / "quoted-printable" / "base64"

Quoted Printable

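  • "=" followed by two hexadecimal digits encodes one byte
  • A literal "=" is encoded as =3D
  • Encoded lines MUST be no more than 76 characters long
  • "=" at the end of a line (=<CRLF>) is a soft line break: the text continues on the next line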

QP example

  • Example:
       Le Japon est un pays insulaire de l'Asie de l'Est. Situ=E9 dans l'Oc=E9an P=
       acifique, il se trouve =E0 l'est de la mer du Japon, de la R=E9publique pop=
       ulaire de Chine et de la Russie et au nord de Ta=EFwan. =C9tymologiquement,=
        les kanjis (ou id=E9ogrammes) qui composent le nom du Japon signifient =AB=
        lieu d'origine du soleil =BB ; c'est ainsi que le Japon est parfois d=E9si=
       gn=E9 comme le =AB pays du Soleil levant =BB.
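A fragment of the example can be decoded with the standard `quopri` module. This is a sketch under one assumption: the bytes are taken to be ISO-8859-1, which fits the =E9 (é), =AB and =BB values but is not stated on the slide.

```python
import quopri

# First line of the example; "=" + CRLF is a soft line break.
encoded = b"Situ=E9 dans l'Oc=E9an P=\r\nacifique"

decoded_bytes = quopri.decodestring(encoded)
decoded_text = decoded_bytes.decode("iso-8859-1")  # assumed charset
print(decoded_text)

# Encoding direction: a literal "=" becomes "=3D".
escaped = quopri.encodestring(b"a=b")
print(escaped)
```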
    

Base64

  • Alphabet: A-Z a-z 0-9 + / ("=" for padding)
  • Works on 24-bit units: 3 input bytes -> 4 output characters
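The 3-to-4 mapping and the padding rule can be seen with the standard `base64` module. The input strings are arbitrary samples.

```python
import base64

# 3 bytes (24 bits) map to exactly 4 characters, no padding needed.
full_unit = base64.b64encode(b"Man")

# 2 bytes and 1 byte leave an incomplete unit, padded with "=".
two_bytes = base64.b64encode(b"Ma")
one_byte = base64.b64encode(b"M")

print(full_unit, two_bytes, one_byte)
```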

Multipart Message

  • Content-Type: multipart/mixed; boundary="abc"
    • This is a multipart message
    • --abc
    • first part
    • --abc
    • second part
    • --abc--
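The layout above can be parsed with the `email` package. In this sketch the raw text adds an empty line after each boundary line, because every part starts with its own (here empty) header; the slide abbreviates that detail. The text before the first boundary is the preamble, which mail readers normally do not display.

```python
from email import message_from_string

raw = (
    "MIME-Version: 1.0\r\n"
    'Content-Type: multipart/mixed; boundary="abc"\r\n'
    "\r\n"
    "This is a multipart message\r\n"
    "--abc\r\n"
    "\r\n"
    "first part\r\n"
    "--abc\r\n"
    "\r\n"
    "second part\r\n"
    "--abc--\r\n"
)

msg = message_from_string(raw)
boundary = msg.get_boundary()
parts = [part.get_payload().strip() for part in msg.get_payload()]

print(msg.is_multipart(), boundary)
print(parts)
```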

Japanese Message

  • Content-Type: text/plain; charset=iso-2022-jp
  • ISO-2022: escape sequences switch between character sets
  • "English \x1b$BF|K\x5c8l\x1b(B"
  • <ESC>$B ... switch to JIS X 0208 (two bytes per character)
  • <ESC>(B ... switch back to ASCII
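The byte string on this slide can be reproduced with Python's built-in `iso-2022-jp` codec. The Japanese word is written with Unicode escapes so the example does not depend on the source file encoding.

```python
# "\u65e5\u672c\u8a9e" is the word 日本語 ("Japanese language").
text = "English \u65e5\u672c\u8a9e"

encoded = text.encode("iso-2022-jp")
print(encoded)

# The escape sequences mark the switch to JIS X 0208 and back to ASCII.
to_jis = b"\x1b$B"
to_ascii = b"\x1b(B"
print(to_jis in encoded, encoded.endswith(to_ascii))

decoded = encoded.decode("iso-2022-jp")
```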

Japanese Header

  • Subject: =?iso-2022-jp?b?GyRCRnxLXDhsGyhC?=
  • charset = iso-2022-jp
  • encoding = b (base64); q selects a quoted-printable-like encoding
  • decoded = "\x1b$BF|K\x5c8l\x1b(B"
  • =?(charset)?(b|q)?(encoded string)?=
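The Subject value above can be decoded, and a new one produced, with `email.header`. This is a sketch; the exact text that `Header.encode()` emits is left to the library, so the code only checks that it decodes back to the same string.

```python
from email.header import Header, decode_header, make_header

raw_subject = "=?iso-2022-jp?b?GyRCRnxLXDhsGyhC?="

# decode_header returns a list of (bytes, charset) chunks.
chunks = decode_header(raw_subject)
data, charset = chunks[0]
subject = data.decode(charset)   # the word 日本語
print(charset, data, subject)

# Encoding direction: build an encoded-word from a Unicode string.
encoded_subject = Header(subject, "iso-2022-jp").encode()
print(encoded_subject)
round_trip = str(make_header(decode_header(encoded_subject)))
```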

Japanese Encoding Problem

  • ISO-2022-JP (JIS) extensions
  • Shift_JIS / EUC-JP
  • Unicode (UTF-8)
  • If the text cannot be represented in strict ISO-2022-JP, use UTF-8
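The fallback rule can be sketched as a small helper. Two assumptions: `choose_charset` is an illustrative name, and Python's `iso-2022-jp` codec is used as a stand-in for "strict ISO-2022-JP". The second sample is the circled digit one (U+2460), a character found in vendor extensions but rejected by that codec.

```python
def choose_charset(text: str) -> str:
    """Pick iso-2022-jp when the text fits, otherwise fall back to utf-8."""
    try:
        text.encode("iso-2022-jp")
    except UnicodeEncodeError:
        return "utf-8"
    return "iso-2022-jp"


plain = "\u65e5\u672c\u8a9e"     # 日本語, representable in ISO-2022-JP
extended = "\u2460 \u65e5\u672c"  # starts with a circled digit one

print(choose_charset(plain))
print(choose_charset(extended))
```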

Homework for Master Students

  • Compose an e-mail message using the Python email package
  • Send the e-mail with the Python smtplib library
  • The e-mail should have a MIME-conformant Japanese subject and message body.
  • The deadline for the report is January 15, 2010.

Quiz of the Day

  • What are the two Content-Transfer-Encoding mechanisms for encoding 8-bit data into 7-bit ASCII text?