NAME

Image::MetaData::JPEG - Perl extension for showing/modifying JPEG (meta)data.

SYNOPSIS

use Image::MetaData::JPEG;

# Create a new JPEG file structure object
my $file = new Image::MetaData::JPEG($filename);
die 'Error: ' . Image::MetaData::JPEG::Error() unless $file;

# Get a list of references to JPEG segments
my @segments = $file->get_segments($regex, $do_indexes);

# Get the JPEG picture dimensions
my ($dim_x, $dim_y) = $file->get_dimensions();

# Show all JPEG segments and their content
print $file->get_description();

# Modify the DateTime tag for the main image
$file->set_Exif_data({'DateTime' => '1994:07:23 12:14:51'},
                     'IMAGE_DATA', 'ADD');

# Delete all meta-data segments (please, don't)
$file->drop_segments('METADATA');

# Rewrite file to disk after your modifications
$file->save('new_file_name.jpg');

... and a lot more methods for viewing/modifying meta-data, which
are accessed through the $file or $segments[$index] references.

DESCRIPTION

The purpose of this module is to read/modify/rewrite meta-data segments in JPEG (Joint Photographic Experts Group format) files, which can contain comments, thumbnails, Exif information (photographic parameters), IPTC information (editorial parameters) and similar data.

Each JPEG file is made of consecutive segments (tagged data blocks) and the actual raw picture data. Most of these segments specify parameters for decoding the picture data into a bitmap; some of them, namely the COMment and APPlication segments, contain instead meta-data, i.e., information about how the photo was shot (usually added by a digital camera) and additional notes from the photographer. These additional pieces of information are especially valuable for picture databases, since the meta-data can be saved together with the picture without resorting to additional database structures.

This module works by breaking a JPEG file into individual segments. Each file is associated with an Image::MetaData::JPEG structure object, which contains one Image::MetaData::JPEG::Segment object for each segment. Segments with a known format are then parsed, and their content can be accessed in a structured way for display. Some of them can even be modified and then rewritten to disk.
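
To make the idea of "consecutive segments" concrete, here is a minimal sketch in plain Perl that walks the markers of an in-memory JPEG stream. It is not the module's parser: the list_segments name and the toy stream are assumptions made for illustration, and standalone markers other than SOI and EOI (e.g. restart markers) are deliberately ignored.

```perl
use strict;
use warnings;

# Walk the marker segments of an in-memory JPEG stream and return pairs
# [marker byte, payload length]. The walk stops at SOS or EOI, because what
# follows a Start Of Scan is entropy-coded data, not length-prefixed segments.
sub list_segments {
    my ($stream) = @_;
    my ($pos, @found) = (0);
    while ($pos + 2 <= length $stream) {
        my ($ff, $marker) = unpack 'C C', substr($stream, $pos, 2);
        die "No marker at offset $pos\n" unless $ff == 0xFF;
        $pos += 2;
        # SOI (0xD8) and EOI (0xD9) carry neither a length nor a payload.
        if ($marker == 0xD8 || $marker == 0xD9) {
            push @found, [ $marker, 0 ];
            last if $marker == 0xD9;
            next;
        }
        # The 16-bit big-endian length field counts its own two bytes.
        my $length = unpack 'n', substr($stream, $pos, 2);
        push @found, [ $marker, $length - 2 ];
        last if $marker == 0xDA;    # SOS
        $pos += $length;
    }
    return @found;
}

# A toy stream (hypothetical): SOI, one COM segment holding "hello", EOI.
my $toy = "\xFF\xD8" . "\xFF\xFE" . pack('n', 2 + 5) . 'hello' . "\xFF\xD9";
printf "marker 0xFF%02X, %d payload bytes\n", @$_ for list_segments($toy);
```

The module does much more than this (it names the segments, keeps their data and parses the known formats), but the length-prefixed layout is what makes such a split possible in the first place.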

DETAILED PACKAGE DESCRIPTION

MANAGING A JPEG STRUCTURE OBJECT

* JPEG::new($input, $regex, $options)
* JPEG::Error()
* JPEG::get_segments($regex, $do_indexes)
* JPEG::drop_segments($regex)
* JPEG::insert_segments($segref, $pos, $overwrite)
* JPEG::get_description()
* JPEG::get_dimensions()
* JPEG::find_new_app_segment_position()
* JPEG::save($filename)

The first thing you need in order to interact with a JPEG picture is to create an Image::MetaData::JPEG structure object. This is done with a call to the new method, whose first argument is an input source, which can be a scalar, interpreted as a file name to be opened and read, or a scalar reference, interpreted as a pointer to an in-memory buffer containing a JPEG stream. This interface is similar to that of Image::Info, but no open file handle is (currently) accepted. The constructor then parses the picture content and stores its segments internally. The memory footprint is close to the size of the disk file plus a few tens of kilobytes.

my $file = new Image::MetaData::JPEG('a_file_name.jpg');
my $file = new Image::MetaData::JPEG(\ $a_JPEG_stream);

The constructor method accepts two optional arguments, a regular expression and an option string. If the regular expression is present, it is matched against segment names, and only those segments with a positive match are parsed (they are nonetheless stored); this allows for some speed-up if you just need partial information, but be sure not to miss something necessary; e.g., SOF segments are needed for reading the picture dimensions. For instance, if you just want to manipulate the comments, you could set the string to 'COM'.

my $file = new Image::MetaData::JPEG('a_file_name.jpg', 'COM');

The third argument, also optional, is an option string. If it matches the string 'FASTREADONLY', only the segments matching the regular expression are actually stored; also, everything which is found after a Start Of Scan is completely neglected. This allows for very large speed-ups, but, obviously, you cannot rebuild the file afterwards, so this is only for getting information fast, e.g., when doing a directory scan.

my $file = new Image::MetaData::JPEG('a_file.jpg', 'COM', 'FASTREADONLY');

If the $file reference remains undefined after this call, the file is to be considered not parseable by this module, and one should issue some error message and go to another file. An error message explaining the reason for the failure can be retrieved with the Error method:

die 'Error: ' . Image::MetaData::JPEG::Error() unless $file;

If the new call is successful, the returned reference points to an Image::MetaData::JPEG structure object containing a list of references to Image::MetaData::JPEG::Segment objects, which can be retrieved with the get_segments method. This method returns a list containing the references (or their indexes in the Segment references' list, if the second argument is the string INDEXES) to those Segments whose name matches the $regex regular expression. For instance, if $regex is 'APP', all application Segments will be returned. If you want only APP1 Segments you need to specify '^APP1$'. The output can become invalid after adding/removing any Segment. If $regex is undefined, all references are returned.

my @segments = $file->get_segments($regex, $do_indexes);

Similarly, if you are only interested in eliminating some segments, you can use the drop_segments method, which erases from the internal segment list all segments matching a given regular expression. If the regular expression is undefined or evaluates to the empty string, this method throws an exception, because I don't want users to erase the whole file just because they did not understand what they were doing. One should also remember that it is not wise to drop non-meta-data segments, because this in general invalidates the file. As a special case, if $regex is the string 'METADATA', all APP* and COM segments are erased.

$file->drop_segments('^APP1$');
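
The selection rules just described can be sketched on a plain list of segment names. This is only an illustration in core Perl, not the module's code: names_after_drop is a hypothetical helper, the segment names are illustrative, and the pattern used for 'METADATA' is an assumption about what "all APP* and COM segments" means.

```perl
use strict;
use warnings;

# Return the names that would survive a drop_segments-like filter.
# 'METADATA' is treated as shorthand for all APP* and COM segments.
sub names_after_drop {
    my ($regex, @names) = @_;
    die "Refusing to drop with an undefined or empty pattern\n"
        unless defined $regex && length $regex;
    $regex = '^(APP\d{1,2}|COM)$' if $regex eq 'METADATA';
    return grep { !/$regex/ } @names;
}

my @names = qw(SOI APP0 APP1 APP13 COM DQT SOF DHT SOS EOI);
print join(' ', names_after_drop('^APP1$', @names)), "\n";
print join(' ', names_after_drop('METADATA', @names)), "\n";
```

Note how the anchored pattern '^APP1$' leaves APP13 alone, while an unanchored 'APP1' would have removed both.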

Inserting a Segment into the picture's segment list is done with the insert_segments method, whose arguments are $segref, $pos and $overwrite. This method inserts the segments referenced by $segref into the current list of segments at position $pos. If $segref is undefined, the method fails silently. If $pos is undefined, the position is chosen automatically (using find_new_app_segment_position); if $pos is out of bounds, an exception is thrown; this happens also if $pos points to the first segment, and it is SOI. $segref may be a reference to a single segment or a reference to a list of segment references; everything else throws an exception. If $overwrite is defined, it must be the number of segments to overwrite during the splice.

$file->insert_segments([$my_comment_1, $my_comment_2], 3, 1);
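
The effect of $pos and $overwrite is easiest to see on a plain array. The following sketch uses Perl's built-in splice on a list of names; insert_at is a hypothetical helper and the bounds checks are a simplified reading of the rules above, not the module's implementation.

```perl
use strict;
use warnings;

# Insert @$new into @$list at $pos, overwriting $overwrite existing entries.
sub insert_at {
    my ($list, $new, $pos, $overwrite) = @_;
    die "Position $pos out of bounds\n" if $pos < 0 || $pos > @$list;
    die "Cannot insert in front of SOI\n"
        if $pos == 0 && @$list && $list->[0] eq 'SOI';
    splice @$list, $pos, $overwrite || 0, @$new;
    return $list;
}

my @list = qw(SOI APP0 APP1 COM DQT);
insert_at(\@list, [qw(COM_a COM_b)], 3, 1);
print "@list\n";    # the old entry at index 3 is replaced by the two new ones
```

With $overwrite left undefined, nothing is removed and the existing entries from $pos onwards simply shift towards the end.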

Getting a string describing the findings of the parsing stage is as easy as calling the get_description method. Those Segments whose parsing failed have the first line of their description stating the stopping error condition. Non-printable characters are replaced, in the string returned by get_description, by a slash followed by the two digit hexadecimal code of the character. The (x,y) dimensions of the JPEG picture are returned by get_dimensions from the Start of Frame (SOF*) Segment:

print $file->get_description();
my ($dim_x, $dim_y) = $file->get_dimensions();
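
For readers curious about where the dimensions come from: in the JPEG standard a Start Of Frame payload begins with one byte of sample precision, followed by the height and the width as 16-bit big-endian integers. A minimal sketch in core Perl follows; sof_dimensions is a hypothetical helper and the payload is built by hand, so this illustrates the layout rather than the module's own code.

```perl
use strict;
use warnings;

# Extract (width, height) from the payload of a Start Of Frame segment.
# Layout: precision (1 byte), height (2 bytes), width (2 bytes), ...
sub sof_dimensions {
    my ($payload) = @_;
    my ($precision, $height, $width) = unpack 'C n n', $payload;
    return ($width, $height);
}

# A hand-made payload: 8-bit samples, 1440 lines of 2160 samples, 3 components.
my $sof_payload = pack 'C n n C', 8, 1440, 2160, 3;
my ($width, $height) = sof_dimensions($sof_payload);
print "${width}x${height}\n";
```

This is also why the SOF segments must not be excluded by the constructor's regular expression if the dimensions are needed.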

If a new comment or application Segment is to be added to the file, the module provides a standard algorithm for deciding the location of the new Segment in the find_new_app_segment_position method. If a DHP Segment is present, the method returns its position; otherwise, it tries the same with SOF Segments; otherwise, it selects the position immediately after the last application or comment Segment. If even this fails, it returns the position immediately after the SOI Segment (i.e., 1).

my $new_position = $file->find_new_app_segment_position();
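
The fallback chain described above can be sketched on a list of segment names as follows. This is a reading of the prose, not the module's code: new_app_position is a hypothetical helper, the names are illustrative, and taking the first DHP or SOF match is an assumption.

```perl
use strict;
use warnings;

# Choose an insertion index for a new APP or COM segment.
sub new_app_position {
    my @names = @_;
    # 1) the position of a DHP segment; 2) otherwise that of a SOF segment
    for my $pattern (qr/^DHP$/, qr/^SOF/) {
        for my $i (0 .. $#names) { return $i if $names[$i] =~ $pattern }
    }
    # 3) otherwise, right after the last application or comment segment
    for my $i (reverse 0 .. $#names) {
        return $i + 1 if $names[$i] =~ /^(APP\d+|COM)$/;
    }
    # 4) otherwise, right after SOI
    return 1;
}

print new_app_position(qw(SOI APP0 DQT SOF DHT SOS)), "\n";
print new_app_position(qw(SOI APP0 COM DQT)), "\n";
print new_app_position(qw(SOI DQT)), "\n";
```

In a typical file a SOF segment is present, so new meta-data segments end up just in front of it.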

The data areas of each Segment in the in-memory JPEG structure object can be rewritten to a disk file or to an in-memory scalar, thus recreating the (possibly modified) JPEG picture. This is accomplished by the save method, accepting a filename or a scalar reference as argument; if the file name is undefined, it defaults to the file originally used to create the JPEG structure object. This method returns "true" (1) if it works, "false" (undefined) otherwise. Remember that if the file had initially been opened with the 'FASTREADONLY' option, it is not possible to save it, and this call fails immediately.

print "Creation of $newJPEG failed!" unless $file->save($newJPEG);

An example of how to proficiently use the in-memory feature to read the content of a JPEG thumbnail is the following (see later for get_Exif_data, and also do some error checking!):

my $thumbnail = $file->get_Exif_data('THUMBNAIL');
print Image::MetaData::JPEG->new($thumbnail)->get_description();

MANAGING A JPEG SEGMENT OBJECT

* JPEG::Segment::name
* JPEG::Segment::error
* JPEG::Segment::records
* JPEG::Segment::search_record([$dirref], $keys ...)
* JPEG::Segment::search_record_value([$dirref], $keys ...)
* JPEG::Segment::update()
* JPEG::Segment::reparse_as($new_name)
* JPEG::Segment::output_segment_data()
* JPEG::Segment::get_description()
* JPEG::Segment::size()

An Image::MetaData::JPEG::Segment object is created for each Segment found in the JPEG image during the creation of a JPEG object, and a parser routine is executed at the same time. The name member of a Segment object identifies the "nature" of the Segment (e.g. 'APP0', ..., 'APP15' or 'COM'). If any error occurs (in the Segment or in an underlying class), the parsing of that Segment is interrupted at some point and remains therefore incomplete: the error member of the relevant Segment object is then set to a meaningful error message. If no error occurs, the same variable is left undefined.

printf "Invalid %s!\n", $segment->{name} if $segment->{error};

The reference to the Segment object is returned in any case. In this way, a faulty Segment cannot inhibit the creation of a JPEG structure object; faulty segments cannot be edited or modified, basically because their structure could not be fully understood. They are always rewritten to disk unmodified, so that a file with corrupted or non-standard Segments can be partially edited without fear of damaging it. Once a Segment has successfully been built, its parsed information can be accessed directly through the records member: this is a reference to an array of JPEG::Record objects, an internal class modelled on Exif records (see the subsection MANAGING A JPEG RECORD OBJECT for further details).

my $records = $segment->{records};
printf "%s has %d records\n", $segment->{name}, scalar @$records;

If a specific record is needed, it can be selected with the help of the search_record method, which searches for a record with a given key (see JPEG::Record::key) in a given record directory, returning a reference to the record if the search was fruitful, the undefined value otherwise. The algorithm for the search is as follows: 1) a start directory is chosen by looking at the first argument: if it is an ARRAY ref it is popped out and used, otherwise the top-level directory is selected; 2) a string is created by joining all remaining arguments on '@', then it is exploded into a list of keys on the same character; 3) these keys are used for an iterative search starting from the initially chosen directory: all but the last key must correspond to $REFERENCE records. If $key is exactly "FIRST_RECORD" / "LAST_RECORD", the first/last record in the current dir is used.

my $segments = [ $file->get_segments('APP0') ];
print "I found it!\n" if $$segments[0]->search_record('Identifier');
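
The three-step search can be imitated on plain data structures. In the sketch below, records are ordinary hashes rather than JPEG::Record objects, find_record is a hypothetical stand-in for search_record, and (unlike the real method) the start directory is a mandatory first argument; the type code 13 for subdirectories is taken from the type table in the Record subsection.

```perl
use strict;
use warnings;

my $REFERENCE = 13;    # type code of a record holding a subdirectory

# Toy stand-in for search_record: $dir is an array ref of record hashes.
sub find_record {
    my ($dir, @args) = @_;
    # Join all key arguments on '@', then split again on the same character.
    my @keys = split /@/, join('@', @args);
    my $record;
    while (@keys) {
        my $key = shift @keys;
        if    ($key eq 'FIRST_RECORD') { $record = $dir->[0] }
        elsif ($key eq 'LAST_RECORD')  { $record = $dir->[-1] }
        else  { ($record) = grep { $_->{key} eq $key } @$dir }
        return undef unless $record;
        last unless @keys;
        # Every key but the last must lead to a subdirectory.
        return undef unless $record->{type} == $REFERENCE;
        $dir = $record->{values}[0];
    }
    return $record;
}

my $tree = [
    { key => 'Identifier', type => 2, values => ["Exif\000\000"] },
    { key => 'IFD0', type => $REFERENCE, values => [ [
        { key => 'Model', type => 2, values => ["KODAK\000"] },
        { key => 'SubIFD', type => $REFERENCE, values => [ [
            { key => 'ExifVersion', type => 7, values => ['0210'] },
        ] ] },
    ] ] },
];
my $hit = find_record($tree, 'IFD0@SubIFD', 'ExifVersion');
print $hit ? "found $hit->{values}[0]\n" : "not found\n";
```

Because the keys are joined before being split, 'IFD0@SubIFD', 'ExifVersion' and 'IFD0', 'SubIFD', 'ExifVersion' describe the same path.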

If you are interested only in the Record's value, you can use the search_record_value method, a simple wrapper around search_record(): it returns the record value (with get_value) if the search is successful, undef otherwise.

print "Its value is: ", $$segments[0]->search_record_value('Identifier');

If a Segment's content (i.e. its Records' values) is modified, it is necessary to dump it into the private binary data area of the Segment in order to have the modification written to disk at JPEG::save time. This is accomplished by invoking the update method (necessary only if you changed record values "by hand"; all "high-level" methods for changing a Segment's content in fact call "update" on their own). However, only Segments without errors can be updated (don't try to undef Segment::error unless you know what you are doing!); trying to update a segment with errors throws an exception. The same happens when trying to update a segment without update support or without records (this catches segments created with the 'NOPARSE' flag). In practice, never use this method unless you are writing an extension for this module.

$segment->update();

The reparse_as method re-executes the parsing of a Segment after changing the Segment name. This is very handy if you have a JPEG file with an application Segment which is "correct" except for its name. I used it the first time for a file having an ICC_profile Segment (normally in APP2) stored as APP13. Note that the name of the Segment is permanently changed, so, if the Segment is updated and the file is rewritten to disk, it will be "correct".

    for my $segment ($file->get_segments('APP13')) {
        # compare the record value, not the record reference
        my $id = $segment->search_record_value('Identifier');
        next unless $segment->{error} && defined $id && $id =~ /ICC_PROFILE/;
        $segment->reparse_as('APP2');
        $segment->update(); }

The current in-memory data area of a Segment can be output to a file through the output_segment_data method (except for entropy coded Segments, this includes the initial two bytes with the Segment identifier and the two bytes with the length if present); the argument is a file handle (this is likely to become more general in the future). If there are problems at output time (e.g., the segment content is too large), an exception is thrown.

    # the trailing 1 makes eval return true whenever no exception is thrown
    eval { $segment->output_segment_data($output_handle); 1 } or
        print "A terrible output error occurred! Help me.\n";

A string describing the parsed content of the Segment is obtained through the get_description method (this is the same string used by the get_description method of a JPEG structure object). If the Segment parsing stage was interrupted, this string includes the relevant error. The size method returns the size of the internal data area of a Segment object. This can be different from the length of the scalar returned by get_segment_data, because the identifier and the length are not included.

print $segment->get_description();
print 'Size is 4 + ' . $segment->size();

MANAGING A JPEG RECORD OBJECT

* JPEG::Record::key
* JPEG::Record::type
* JPEG::Record::values
* JPEG::Record::extra
* JPEG::Record::get_category()
* JPEG::Record::get_value($index)
* JPEG::Record::get_description($names)
* JPEG::Record::get($endianness)

The JPEG::Record class is an internal class for storing parsed information about a JPEG::Segment, inspired by Exif records. A Record is made up of four fields: key, type, values and extra. The "key" is the record's identifier; it is either numeric or textual (numeric keys can be translated with the help of the JPEG_lookup function in Tables.pm, included in this package). The "type" is obviously the type of stored info (like unsigned integers, ASCII strings and so on ...). "extra" is a helper field for storing additional information. Last, "values" is an array reference to the record content (almost always there is just one value). For instance, for a non-IPTC Photoshop record in APP13:

printf "The numeric key 0x%04x means %s\n",
  $record->{key}, JPEG_lookup('APP13@Photoshop_RECORDS', $record->{key});
printf "This record contains %d values\n", scalar @{$record->{values}};

A Record's type can be one among the following predefined constants:

 0  $NIBBLES    Two 4-bit unsigned integers (private)
 1  $BYTE       An 8-bit unsigned integer
 2  $ASCII      A variable length ASCII string
 3  $SHORT      A 16-bit unsigned integer
 4  $LONG       A 32-bit unsigned integer
 5  $RATIONAL   Two LONGs (numerator and denominator)
 6  $SBYTE      An 8-bit signed integer
 7  $UNDEF      A generic variable length string
 8  $SSHORT     A 16-bit signed integer
 9  $SLONG      A 32-bit signed integer (2's complem.)
10  $SRATIONAL  Two SLONGs (numerator and denominator)
11  $FLOAT      A 32-bit float (a single float)
12  $DOUBLE     A 64-bit float (a double float)
13  $REFERENCE  A Perl list reference (internal)

$UNDEF is used for not-better-specified binary data. A record of a numeric type can have multiple elements in its @{values} list ($NIBBLES implies an even number); an $UNDEF or $ASCII type record instead has only one element, but its length can vary. Last, a $REFERENCE record holds a single Perl reference to another record list: this allows for the construction of a sort of directory tree in a Segment. The category of a record can be obtained with the get_category method, which returns 'p' for Perl references, 'I' for integer types, 'S' for $ASCII and $UNDEF, 'R' for rational types and 'F' for floating point types.

    for my $record (@{$segment->{records}}) {
	print "Subdir found\n" if $record->get_category() eq 'p'; }
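
The mapping from type codes to categories stated above is small enough to be written out in full. The sketch below uses the numeric codes of the table; category_of and @TYPE_NAMES are hypothetical names for this illustration and are not part of the module.

```perl
use strict;
use warnings;

# Type names in the order of their numeric codes (0 .. 13).
my @TYPE_NAMES = qw(NIBBLES BYTE ASCII SHORT LONG RATIONAL SBYTE
                    UNDEF SSHORT SLONG SRATIONAL FLOAT DOUBLE REFERENCE);

# Map a numeric type code to its one-letter category.
sub category_of {
    my ($type) = @_;
    return 'p' if $type == 13;                   # Perl reference
    return 'S' if $type == 2  || $type == 7;     # ASCII and UNDEF strings
    return 'R' if $type == 5  || $type == 10;    # rational types
    return 'F' if $type == 11 || $type == 12;    # floating point types
    return 'I';                                  # every other type is integer
}

printf "%2d %-10s %s\n", $_, $TYPE_NAMES[$_], category_of($_)
    for 0 .. $#TYPE_NAMES;
```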

A human-readable description of a Record's content is the output of the get_description method. Its argument is a reference to an array of names, which are to be used as successive keys in a general hash keeping translations of numeric tags. No argument is needed if the key is already non-numeric (see the example of get_value for more details). In the output of get_description unreasonably long strings are trimmed and non-printing characters are replaced with their hexadecimal representation. Strings are then enclosed between delimiters, and null-terminated $ASCII strings have their last character chopped off (but a dot is added after the closing delimiter). $ASCII strings use a " as delimiter, while $UNDEF strings use '.

print $record->get_description($names);

In absence of "high-level" routines for collecting information, a Record's content can be read directly, either by accessing the values member or by calling the get_value method. get_value($index) returns the $index-th value in the value list; if the index is undefined (not supplied), the sum/concatenation of all values is returned. The index is checked for out-of-bound errors. The following code, an abridged version of Segment::get_description, shows how to proficiently use these methods and members.

    sub show_directory {
      my ($segment, $records, $names) = @_;
      my @subdirs = ();
      for my $record (@$records) {
        print $record->get_description($names);
        push @subdirs, $record if $record->get_category() eq 'p'; }
      foreach my $subdir (@subdirs) {
        my $directory = $subdir->get_value();
        push @$names, $subdir->{key};
        # $names is an array reference: print the path, not the reference
        printf "Subdir %s (%d records)\n", join('@', @$names), scalar @$directory;
        show_directory($segment, $directory, $names);
        pop @$names; } }
    show_directory($segment, $segment->{records}, [ $segment->{name} ]);

If the Record structure is needed in detail, one can resort to the get method; in list context this method returns (key, type, count, dataref). The data reference points to a packed scalar, ready to be written to disk. In scalar context, it returns "data", i.e. the dereferenced dataref. This is tricky (but handy for other routines). The argument specifies an endianness (this defaults to big endian).

my ($key, $type, $count, $dataref) = $record->get();
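
What "packed, with a given endianness" means can be shown with Perl's pack on a list of 16-bit values. This is only an illustration: pack_shorts is a hypothetical helper, and the use of the TIFF byte-order codes 'MM' (big endian) and 'II' (little endian) as the endianness argument is an assumption, not something stated for the get method.

```perl
use strict;
use warnings;

# Serialise 16-bit unsigned integers, big endian unless 'II' is requested.
sub pack_shorts {
    my ($values, $endianness) = @_;
    my $template = (defined $endianness && $endianness eq 'II') ? 'v*' : 'n*';
    return pack $template, @$values;
}

print unpack('H*', pack_shorts([ 42 ])), "\n";          # big endian bytes
print unpack('H*', pack_shorts([ 42 ], 'II')), "\n";    # little endian bytes
```

The same value therefore produces two different byte strings, which is why the endianness has to be known whenever a record is dumped.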

COMMENTS ("COM" segments)

* JPEG::get_number_of_comments()
* JPEG::get_comments()
* JPEG::add_comment($string)
* JPEG::set_comment($index, $string)
* JPEG::remove_comment($index)
* JPEG::remove_all_comments()
* JPEG::join_comments($separation, @selection)

Each "COM" Segment in a JPEG file contains a user comment, whose content is free format. There is however a limitation, because a JPEG Segment cannot be longer than 64KB; this limits the length of a comment to $max_length = (2^16 - 3) bytes. The number of comment Segments in a file is returned by get_number_of_comments, while get_comments returns a list of strings (each string is the content of a COM Segment); if no comments are present, they return zero and the empty list respectively.

my $number = $file->get_number_of_comments();
my @comments = $file->get_comments();

A comment can be added with the add_comment method, whose only argument is a string. If the string is too long, it is broken into multiple strings with length smaller than or equal to $max_length, and multiple comment Segments are added to the file. If there is already at least one comment Segment, the new Segments are created right after the last one. Otherwise, the standard position search of find_new_app_segment_position is applied.

$file->add_comment('a' x 100000);
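
The splitting rule can be sketched in a few lines of core Perl. split_comment is a hypothetical helper that works on byte strings; it shows how a 100000-byte comment like the one above ends up in two pieces, and makes no claim about how the module cuts the string internally.

```perl
use strict;
use warnings;

my $MAX_LENGTH = 2**16 - 3;    # 65533 bytes of payload per COM segment

# Break a string into chunks no longer than $MAX_LENGTH.
sub split_comment {
    my ($string) = @_;
    my @chunks;
    # four-argument substr removes the extracted piece from the local copy
    push @chunks, substr($string, 0, $MAX_LENGTH, '') while length $string;
    return @chunks;
}

my @chunks = split_comment('a' x 100000);
printf "%d chunks: %s\n", scalar @chunks, join ' + ', map { length } @chunks;
```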

An already existing comment can be replaced with the set_comment method. Its two arguments are an $index and a $string: the $index-th comment Segment is replaced with one or more new Segments based on $string (the index of the first comment Segment is 0). If $string is too big, it is broken down as in add_comment. If $string is undefined, the selected comment Segment is erased. If $index is out of bounds, a warning is printed out.

$file->set_comment(0, 'This is the new comment');

However, if you only need to erase a comment, you can call remove_comment with the Segment $index. If you want to remove all comments, call remove_all_comments.

$file->remove_comment(0);
$file->remove_all_comments();

It is known that some JPEG comment readers out there do not read past the first comment. So, the join_comments method, whose goal is obvious, can be useful. This method creates a string by joining all comments selected by the @selection index list (the $separation scalar is a string inserted at each junction point), and overwrites the first selected comment while deleting the others. An exception is thrown for each illegal comment index. Similar considerations as before on the string length apply. If no separation string is provided, it defaults to \n. If no index is provided in @selection, it is assumed that the method must join all the comments into the first one, and delete the others.

$file->join_comments('---', 2, 5, 8);
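
The bookkeeping behind such a call can be followed on a plain array of strings. In this sketch join_selected is a hypothetical helper acting on an array reference; it assumes that the selected indexes are distinct and ignores the length limit discussed before.

```perl
use strict;
use warnings;

# Join the selected comments into the first selected one, delete the others.
sub join_selected {
    my ($comments, $separation, @selection) = @_;
    $separation = "\n" unless defined $separation;
    @selection = (0 .. $#$comments) unless @selection;
    for my $index (@selection) {
        die "Illegal comment index $index\n"
            if $index < 0 || $index > $#$comments;
    }
    my ($first, @others) = @selection;
    $comments->[$first] = join $separation, @{$comments}[@selection];
    # delete from the highest index down, so earlier removals do not shift
    # the positions still to be removed
    splice @$comments, $_, 1 for sort { $b <=> $a } @others;
    return $comments;
}

my @comments = map { "c$_" } 0 .. 8;
join_selected(\@comments, '---', 2, 5, 8);
print "@comments\n";
```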

JFIF DATA ("APP0" segments)

* JPEG::get_app0_data()

APP0 Segments are written by older cameras adopting the JFIF (JPEG File Interchange Format) for storing images. JFIF uses the APP0 application Segment for inserting configuration data and an RGB packed (24-bit) thumbnail image. The format is described in appendix "STRUCTURE OF A JFIF APP0 SEGMENT", including the names of all possible tags. It is of course possible to access each APP0 Segment individually by means of the get_segments() and search_record() methods. A snippet of code for doing this is the following:

    for my $segment ($file->get_segments('APP0')) {
        my $iden = $segment->search_record('Identifier')->get_value();
        my $xdim = $segment->search_record('Xthumbnail')->get_value();
        my $ydim = $segment->search_record('Ythumbnail')->get_value();
        printf "Segment type: %s; dimensions: %dx%d\n",
                substr($iden, 0, -1), $xdim, $ydim;
        printf "%15s => %s\n", $_->{key}, $_->get_value()
                for @{$segment->{records}}; }
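
For orientation, the fixed part of a 'JFIF' APP0 payload can be decoded with a single unpack call. The layout follows the JFIF specification; parse_jfif is a hypothetical helper, and apart from Identifier, Xthumbnail and Ythumbnail the field names are chosen for this sketch and need not match the tags used by the module.

```perl
use strict;
use warnings;

# Decode the fixed-length header of a JFIF APP0 payload.
sub parse_jfif {
    my ($payload) = @_;
    my %field;
    @field{qw(Identifier MajorVersion MinorVersion Units
              XDensity YDensity Xthumbnail Ythumbnail)}
        = unpack 'a5 C C C n n C C', $payload;
    return undef unless $field{Identifier} eq "JFIF\000";
    # the packed RGB thumbnail takes three bytes per pixel
    $field{ThumbnailBytes} = 3 * $field{Xthumbnail} * $field{Ythumbnail};
    return \%field;
}

# A hand-made payload: version 1.02, dots per inch, 72x72, no thumbnail.
my $jfif = "JFIF\000" . pack('C C C n n C C', 1, 2, 1, 72, 72, 0, 0);
my $info = parse_jfif($jfif);
printf "JFIF %d.%02d, density %dx%d, thumbnail bytes: %d\n",
    @$info{qw(MajorVersion MinorVersion XDensity YDensity ThumbnailBytes)};
```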

However, if you want to avoid dealing directly with Segments, you can use the get_app0_data method, which returns a reference to a hash with a plain translation of the content of the first interesting APP0 segment (this is the first 'JFXX' APP0 segment, if present, the first 'JFIF' APP0 segment otherwise). Segments with errors are excluded. An empty hash means that no valid APP0 segment is present.

my $data = $file->get_app0_data();
printf "%15s => %s\n", $_, (($_ =~ /..Thumbnail/) ? '...' : $$data{$_})
    for keys %$data;

EXIF DATA ("APP1" segments)

* JPEG::retrieve_app1_Exif_segment($index)
* JPEG::provide_app1_Exif_segment()
* JPEG::remove_app1_Exif_info($index)
* JPEG::get_Exif_data($what, $type)
* JPEG::set_Exif_data($data, $what, $action)
* JPEG::forge_interoperability_IFD()
* JPEG::Segment::get_Exif_data($what, $type)
* JPEG::Segment::set_Exif_data($data, $what, $action)

The DCT Exif (Exchangeable Image File format) standard provides photographic meta-data in the APP1 section. Various tag-value pairs are stored in groups called IFDs (Image File Directories), where each group refers to a different kind of information; one can find data about how the photo was shot, GPS data, thumbnail data and so on ... (see appendix "STRUCTURE OF AN EXIF APP1 SEGMENT" for more details). This module provides a number of methods for managing Exif data without dealing with the details of the low level representation. Note that, given the complicated structure of an Exif APP1 segment (where extensive use of "pointers" is made), some digital cameras and graphic programs decide to leave some unused space in the JPEG file. The dump routines of this module, on the other hand, leave no unused space, so just calling update() on an Exif APP1 segment even without modifying its content can give you a smaller file (some tens of kilobytes can be saved).

In order to work on Exif data, an Exif APP1 Segment must be selected. The retrieve_app1_Exif_segment method returns a reference to the $index-th such Segment (the first Segment if the index is undefined). If no such Segment exists, the method returns the undefined reference. If $index is (-1), the routine returns the number of available APP1 Exif Segments (which is non-negative).

my $num = $file->retrieve_app1_Exif_segment(-1);
my $ref = $file->retrieve_app1_Exif_segment($num - 1);

If you want to be sure to have an Exif APP1 Segment, use the provide_app1_Exif_segment method instead, which forces the Segment to be present in the file, and returns its reference. The algorithm is the following: 1) if at least one Segment with these properties is already present, we are done; 2) if [1] fails, an APP1 segment is added and initialised with a big endian Exif structure. Note that there is no $index argument here.

my $ref = $file->provide_app1_Exif_segment();

If you want to eliminate the $index-th Exif APP1 Segment from the JPEG file segment list use the remove_app1_Exif_info method. As usual, if $index is (-1), all Exif APP1 Segments are affected at once; if $index is undefined, it defaults to -1, so both (-1) and undef cause all Exif APP1 segments to be removed. Be aware that the file won't be a valid Exif file after this.

$file->remove_app1_Exif_info(-1);

How to inspect your EXIF data

Once you have a Segment reference pointing to your favourite Exif Segment, you may want to have a look at the records it contains, by using the get_Exif_data method: it accepts two arguments ($what and $type) and returns the content of the APP1 segment packed in various forms. Error conditions (invalid $what's and $type's) manifest themselves through an undefined return value.

All Exif records are natively identified by numeric tags (keys), which can be "translated" into a human-readable form by using the Exif standard docs; only a few fields in the Exif APP1 preamble (they are not Exif records) are always identified by this module by means of textual tags. The $type argument selects the output format for the record keys (tags):

* NUMERIC: record tags are native numeric keys
* TEXTUAL: record tags are human-readable (default)

Of course, record values are never translated. If a numeric Exif tag is not known, a custom textual key is created with "Unknown_tag_" followed by its numerical value (this solves problems with non-standard tags). The subset of Exif tags returned by this method is determined by the value of $what, which can be one of:

$what         returned info                         returned type
--------------------------------------------------------------------
ALL           (default) everything but THUMBNAIL    ref. to hash of hashes
IMAGE_DATA    a merge of IFD0_DATA and SUBIFD_DATA  ref. to flat hash
THUMB_DATA    this is an alias for IFD1_DATA        ref. to flat hash
THUMBNAIL     the actual (un)compressed thumbnail   ref. to scalar
ROOT_DATA     header records (TIFF and similar)     ref. to flat hash
IFD0_DATA     primary image TIFF tags               ref. to flat hash
SUBIFD_DATA   Exif private tags                     ref. to flat hash
GPS_DATA      GPS data of the primary image         ref. to flat hash
INTEROP_DATA  interoperability data                 ref. to flat hash
IFD1_DATA     thumbnail-related TIFF tags           ref. to flat hash

Setting $what equal to 'ALL' returns a reference to a hash of hashes, a data structure very close to the Exif APP1 segment structure; in the top-level hash there is an entry for each IFD or subIFD, plus a special entry (key equal to 'APP1') containing some non Exif parameters and the thumbnail (if present). Each entry of the top-level hash is a pair ($name, $hashref), where $hashref points to a second-level hash containing a copy of all Exif records present in the $name IFD (sub)directory (if this directory is not present or contains no records, the second-level hash exists and is empty). Note that the Exif record values' format is not checked to be valid according to the Exif standard. This is, in some sense, consistent with the fact that also "unknown" tags are included in the output. This complicated structure is more easily explained by showing an example (see also the "VALID TAGS FOR EXIF APP1 DATA" section for details on possible records):

    my $hash_ref = $segment->get_Exif_data('ALL', 'TEXTUAL');

			 can give
    $hash_ref = {
           'APP1' => 
                { 'Signature'               => [ 42             ],
                  'Endianness'              => [ 'MM'           ],
                  'Identifier'              => [ "Exif\000\000" ],
                  'ThumbnailData'           => [ ... image ...  ], },
           'APP1@IFD1' =>
                { 'ResolutionUnit'          => [ 2              ],
                  'JPEGInterchangeFormatLength' => [ 3922       ],
                  'JPEGInterchangeFormat'   => [ 2204           ],
                  'Orientation'             => [ 1              ],
                  'XResolution'             => [ 72, 1          ],
                  'Compression'             => [ 6              ],
                  'YResolution'             => [ 72, 1          ], },
           'APP1@IFD0@SubIFD' =>
                { 'ApertureValue'           => [ 35, 10         ],
                  'PixelXDimension'         => [ 2160           ],
                    etc., etc. ....
                  'ExifVersion'             => [ '0210'         ], },
           'APP1@IFD0' =>
                { 'Model' => [ "KODAK DX3900 ZOOM DIGITAL CAMERA\000" ],
                  'ResolutionUnit'          => [ 2              ],
                    etc., etc. ...
                  'YResolution'             => [ 230, 1         ], },
           'APP1@IFD0@GPS' => {},
           'APP1@IFD0@SubIFD@Interop' =>
                { 'InteroperabilityVersion' => [ '0100'         ],
                  'InteroperabilityIndex'   => [ "R98\000"      ], }, };

Setting $what equal to '*_DATA' returns a reference to a flat hash, corresponding to one or more IFD (sub)dirs. For instance, 'IMAGE_DATA' is a merge of 'IFD0_DATA' and 'SUBIFD_DATA': this interface is simpler for the end-user, because there is only one level of dereferencing; also, he/she does not need to know the (sub)IFD names or to be aware of the partition of records related to the main image into two IFDs. If the (sub)directory is not present or contains no records, the returned hash exists and is empty. With reference to the previous example:

    my $hash_ref = $segment->get_Exif_data('IMAGE_DATA', 'TEXTUAL');

			 gives
    $hash_ref = {
           'ResolutionUnit'              => [ 2      ],
           'JPEGInterchangeFormatLength' => [ 3922   ],
           'JPEGInterchangeFormat'       => [ 2204   ],
           'Orientation'                 => [ 1      ],
           'XResolution'                 => [ 72, 1  ],
           'Compression'                 => [ 6      ],
           'YResolution'                 => [ 72, 1  ],
           'ApertureValue'               => [ 35, 10 ],
           'PixelXDimension'             => [ 2160   ],
              etc., etc. ....
           'ExifVersion'                 => [ '0210' ], };

Last, setting $what to 'THUMBNAIL' returns a reference to a copy of the actual Exif thumbnail image (this is not included in the set returned by 'THUMB_DATA'); if there is no thumbnail, a reference to the empty string is returned (the undefined value cannot be used, because it is assumed to correspond to an error condition here). Note that the referenced scalar may be quite large (on the order of 10 KB). If the thumbnail is in JPEG format (this corresponds to the 'Compression' property, in IFD1, being set to 6), you can create another JPEG picture object from it, as in the following example:

my $data_ref = $segment->get_Exif_data('THUMBNAIL');
my $thumb = new Image::MetaData::JPEG($data_ref);
print $thumb->get_description();

If you are only interested in reading Exif data in a standard configuration, you can skip the segment-search calls and directly use JPEG::get_Exif_data (a method of the JPEG class, so you only need a JPEG structure object). This is an interface to the method with the same name in the Segment class, acting on the first Exif APP1 Segment (if no such segment is present, the undefined value is returned) and passing the arguments through. Note that most JPEG files with Exif data contain at most one Exif APP1 segment, so you are not going to lose anything here. A snippet of code for visualising Exif data looks like this:

    # fetch the hash once: each() must iterate over the same hash
    my $all = $image->get_Exif_data('ALL') || {};
    while (my ($d, $h) = each %$all) {
      while (my ($t, $a) = each %$h) {
        printf "%-25s\t%-25s\t-> ", $d, $t;
        for (@$a) {
          # escape non-printable characters, shorten long values
          s/([\000-\037\177-\377])/sprintf '\\%02x', ord($1)/ge;
          $_ = substr($_, 0, 30) . ' ... ' if length $_ > 30;
          printf '%-5s', $_; }
        print "\n"; } }

How to modify your EXIF data

The APP1 Exif structure is quite complicated, and the number of different possible cases when trying to modify it is very large; therefore, designing a clean and intuitive interface for this task is not trivial. The following method calls are a proposal open to discussion with the end user (if he/she can find a cleaner interface with an acceptable cost for the developer...). Similarly to the "getter" case, there is a set_Exif_data method callable from a picture object, which does nothing more than look for the first Exif APP1 segment (creating it, if there is none) and invoke the method with the same name in the Segment class, passing its arguments through. So, the remainder of this section will concentrate on the Segment method. Let us discuss the guidelines for the Exif setter method(s).

Exif records are usually characterised by a numeric key (a tag); this was already discussed in the "getter" section. Since these keys, for valid records, can be translated from numeric to textual form and back, the end user is free to use whichever form better fits his/her needs. The two forms can even be mixed in the same "setter" call: the method will translate textual tags to numeric tags when possible, and reject the others; then, it will proceed as if all tags had been numeric from the very beginning. Records with unknown textual or numeric tags are always rejected.

The arguments to set_Exif_data are $data, $what and $action. The $data argument must be a reference to a flat hash, containing the key/value pairs of the records supplied by the user. The "value" part of each hash element can be an array reference (containing a list of values for the record; remember that some records are multi-valued) or a single scalar (this is internally converted to a reference to an array containing only the supplied scalar). If a record value is supposed to be a null terminated string, the user can supply a Perl scalar without the final null character (it will be inserted automatically).
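The conversion of record values just described can be pictured with a minimal pure-Perl sketch. The helper name normalise_record_values and its $is_string flag are hypothetical, introduced only for illustration; the module's internal routines may well work differently.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the module): turn a user-supplied
# record value into a reference to an array of values.
sub normalise_record_values {
    my ($value, $is_string) = @_;
    # a single scalar becomes a one-element array; arrays are copied
    my @values = ref $value eq 'ARRAY' ? @$value : ($value);
    # null terminated strings get their final "\000" if it is missing
    if ($is_string) {
        for my $v (@values) { $v .= "\000" unless $v =~ /\000\z/; }
    }
    return \@values;
}

my $date = normalise_record_values('1994:07:23 12:14:51', 1);
printf "%d value(s), %d bytes\n", scalar @$date, length $date->[0];
```

The user's data is copied rather than modified in place, so the hash passed to the setter stays untouched.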

The $what argument must be a scalar, and it selects the portion of the Exif APP1 segment concerned by the set_Exif_data call. So, obviously, the end user can modify only one section at a time; this is a simplification (for the developer of course) but also for the end user, because trying to set all Exif-like values in one go would require an offensively complicated data structure to specify the destination of each record (note that some records in different sections can have the same numerical tag, so a plain hash would not trivially work). Valid values for $what are:

$what         modifies ...                          $data type
--------------------------------------------------------------------
IMAGE_DATA    as IFD0_DATA and SUBIFD_DATA          ref. to flat hash
THUMB_DATA    this is an alias for IFD1_DATA        ref. to flat hash
THUMBNAIL     the actual (un)compressed thumbnail   ref. to scalar
ROOT_DATA     header records (endianness)           ref. to flat hash
IFD0_DATA     primary image TIFF tags               ref. to flat hash
SUBIFD_DATA   Exif private tags                     ref. to flat hash
GPS_DATA      GPS data of the primary image         ref. to flat hash
INTEROP_DATA  interoperability data in SubIFD       ref. to flat hash
IFD1_DATA     thumbnail-related TIFF tags           ref. to flat hash

The $action argument controls whether the setter adds ($action = 'ADD') records to a given data directory or replaces ($action = 'REPLACE') them. In the first case, each user-supplied record replaces the existing version of that record if present, and is simply inserted if it was not already present; existing records with no counterpart in the user-supplied $data hash remain untouched. In the second case, the record directory is cleared before inserting user data. Note that, since Exif and Exif-like records are non-repeatable in nature, there is no need for an 'UPDATE' action, as there is for IPTC (see "PHOTOSHOP AND IPTC DATA ("APP13" segments)").
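The difference between the two actions can be illustrated with a small pure-Perl model, in which a record directory is just a hash. This is only a sketch of the semantics described above: the function apply_exif_action is hypothetical and is not how the module stores or updates its records.

```perl
use strict;
use warnings;

# Hypothetical model of one record directory as a plain hash
# (tag => array of values); not the module's internal representation.
sub apply_exif_action {
    my ($directory, $data, $action) = @_;
    die "Unknown action $action\n" unless $action =~ /^(ADD|REPLACE)$/;
    # REPLACE clears the directory first; ADD keeps untouched records
    my %result = $action eq 'ADD' ? %$directory : ();
    # a user-supplied record always overrides the existing one
    $result{$_} = $data->{$_} for keys %$data;
    return \%result;
}

my $old = { DateTime => ['1990:01:01 00:00:00'], Orientation => [1] };
my $new = { DateTime => ['1994:07:23 12:14:51'] };
for my $action (qw(ADD REPLACE)) {
    my $result = apply_exif_action($old, $new, $action);
    print "$action: ", join(', ', sort keys %$result), "\n";
}
```

With 'ADD' the Orientation record survives, while with 'REPLACE' only the supplied DateTime record is left.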

The set_Exif_data routine first checks that the concerned segment is of the appropriate type (Exif APP1), that $data is a hash reference (a scalar reference for the thumbnail), and that $action and $what are valid. If $action is undefined, it defaults to 'REPLACE'. Then, an appropriate (sub)IFD is created, if absent, and all user-supplied records are checked for consistency (have a look at the appendixes for this). Last, records are set in increasing (numerical) tag order, and mandatory data are added, if not present. The return value of the setter routine is always a hash reference; in general it contains the records rejected by the specialised routines. If an error occurs at a very early stage of the setter, this reference contains a single entry with key='ERROR' and value set to some meaningful error message. So, returning a reference to an empty hash means that everything was OK. An example, concerning the very common task of changing the DateTime record, follows:

my $dt = '1994:07:23 12:14:51';
my $hash = $image->set_Exif_data({'DateTime' => $dt}, 'IMAGE_DATA', 'ADD');
print "DateTime record rejected\n" if %$hash;

Additional notes on set_Exif_data

Depending on $what, some of the following notes apply:

ROOT_DATA: the only modifiable item is the 'Endianness' (and it can only be set to big endian, 'MM', or little endian, 'II'); everything else is rejected (see "STRUCTURE OF AN EXIF APP1 SEGMENT" for further details). This only influences how the image is written back to disk (the in-memory representation is always native).

IMAGE_DATA: by specifying this target one can address the IFD0_DATA and SUBIFD_DATA targets at once. First, all records are tried in the IFD0, then, rejected records are tried into SubIFD (then, they are definitively rejected).

IFD0_DATA: see the "Canonical Exif 2.2 and TIFF 6.0 tags for IFD0 and IFD1", "Additional TIFF 6.0 tags not in Exif 2.2 for IFD0" and "Exif tags assigned to companies for IFD0 and IFD1" sections in the appendixes (this target refers to the primary image). The 'XResolution', 'YResolution', 'ResolutionUnit', and 'YCbCrPositioning' records are forced if not present (to [1,72], [1,72], 2 and 1 respectively). Note that the situation would be more complicated if we were dealing with uncompressed (TIFF) primary images.

SUBIFD_DATA: see the "Exif tags for the 0th IFD Exif private subdirectory" section in the appendixes. The 'ExifVersion', 'ComponentsConfiguration', 'FlashpixVersion', and 'ColorSpace' records are forced if not present (to '0220', '1230', '0100', and 1 respectively). Also, the 'PixelXDimension' and 'PixelYDimension' are added if necessary, with the real image dimensions calculated by get_dimensions(). The MakerNote record can be supplied by the user, but it is currently treated as a purely binary field (this could change in future, if and when maker notes are supported).

THUMB_DATA (or its alias IFD1_DATA): see the "Canonical Exif 2.2 and TIFF 6.0 tags for IFD0 and IFD1", "Additional TIFF 6.0 tags not in Exif 2.2 for IFD0" and "Exif tags assigned to companies for IFD0 and IFD1" sections in the appendixes (this target refers to thumbnail properties). The 'XResolution', 'YResolution', 'ResolutionUnit', 'YCbCrSubSampling', 'PhotometricInterpretation' and 'PlanarConfiguration' records are forced if not present (to [1,72], [1,72], 2, [2,1], 2 and 1 respectively). Note that some of these records are not necessary for all types of thumbnails, but JPEG readers will probably skip unnecessary information without problems.

GPS_DATA: see the "Exif tags for the 0th IFD GPS subdirectory" section in the appendixes. The 'GPSVersionID' record is forced, if it is not present at the end of the process, because it is mandatory (ver 2.2 is chosen). There are some record inter-correlations which are still neglected here (for instance, the 'GPSAltitude' record can be inserted without providing the corresponding 'GPSAltitudeRef' record).

INTEROP_DATA: see the "Exif tags for the 0th IFD Interoperability subdirectory" section in the appendixes. The 'InteroperabilityIndex' and 'InteroperabilityVersion' records are forced, if they are not present at the end of the process, because they are mandatory ('R98' and ver 1.0 are chosen). Note that an Interoperability subIFD should be made as standard as possible: if you just want to add it to the file, it is better to use the forge_interoperability_IFD method, which takes care of all values ('RelatedImageFileFormat' is set to 'Exif JPEG Ver. 2.2', and the dimensions are taken from get_dimensions()).

THUMBNAIL: $data must be a reference to a scalar containing the new thumbnail; if it points to an empty string, the thumbnail is erased (the undefined value DOES NOT erase the thumbnail, it generates instead an error). All thumbnail specific records (see "Canonical Exif 2.2 and TIFF 6.0 tags for IFD0 and IFD1") are removed, and only those corresponding to the newly inserted thumbnail are calculated and written back. Currently, it is not possible to insert an uncompressed thumbnail (this will probably happen in the form of a TIFF image), only JPEG ones are accepted (automatic records contain the type, length and offset).
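The two-step routing described in the IMAGE_DATA note (records are tried in IFD0 first, and those rejected there are tried in SubIFD) can be pictured with the following pure-Perl sketch. Both the helper route_image_data and the tiny %known tag lists are assumptions made for illustration: the real lists are those in the appendixes, and the module also validates record values, which is not modelled here.

```perl
use strict;
use warnings;

# Tiny stand-in tag lists, for illustration only: the real lists are
# those given in the appendixes and are much longer.
my %known = (IFD0   => { DateTime => 1, Orientation => 1 },
             SubIFD => { ExifVersion => 1, ApertureValue => 1 });

# Hypothetical helper: try every record in IFD0 first, then retry the
# rejected ones in SubIFD; what is still rejected is returned too.
sub route_image_data {
    my ($data) = @_;
    my (%accepted, %rejected);
    for my $tag (sort keys %$data) {
        # the first directory knowing this tag wins
        my ($dir) = grep { $known{$_}{$tag} } qw(IFD0 SubIFD);
        if (defined $dir) { $accepted{$dir}{$tag} = $data->{$tag} }
        else              { $rejected{$tag} = $data->{$tag} }
    }
    return (\%accepted, \%rejected);
}

my ($ok, $bad) = route_image_data({ DateTime    => '1994:07:23 12:14:51',
                                    ExifVersion => '0220',
                                    NoSuchTag   => 1 });
print "rejected: @{[ sort keys %$bad ]}\n";
```

As in the real setter, an empty hash of rejected records means that every record found a destination.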

PHOTOSHOP AND IPTC DATA ("APP13" segments)

* JPEG::retrieve_app13_segment($index, $what)
* JPEG::provide_app13_segment($what)
* JPEG::remove_app13_info($index, $what)
* JPEG::Segment::get_app13_data($type, $what)
* JPEG::Segment::set_app13_data($data, $action, $what)
* JPEG::get_app13_data($type, $what)
* JPEG::set_app13_data($data, $action, $what)

Adobe's Photoshop program, a de-facto standard for image manipulation, has long used the APP13 segment for storing non-graphical information, such as layers, paths, etc.; this includes editorial information modelled on IPTC/NAA recommendations (have a look at the appendices "VALID TAGS FOR PHOTOSHOP-STYLE APP13 DATA" and "Valid tags for IPTC data" for further details). This module provides a number of methods for managing Photoshop/IPTC data without dealing with the details of the low level representation (although sometimes this means taking some decisions on behalf of the end user). The interface is intentionally similar to the one for Exif data (see "EXIF DATA ("APP1" segments)").

The structure of the IPTC data block is managed in detail and separately from the rest, although this block is a sort of "sub-case" of Photoshop information; all public methods have a $what argument, which can be only 'PHOTOSHOP' or 'IPTC' [default], selecting which part of the APP13 segment you are working with. If $what is invalid, an exception is always raised.

In order to work on Photoshop/IPTC data, a suitable Photoshop-style APP13 Segment must first be selected. The retrieve_app13_segment method returns a reference to the $index-th Segment (the first Segment if $index is undefined) which contains information matching the $what argument; so, specifically, if $what is 'IPTC', there must be a non-void IPTC data block, while if $what is 'PHOTOSHOP' at least one non-IPTC data block must be present. If no such Segment exists, the method returns the undefined value. If $index is (-1), the routine returns the number of available suitable APP13 Segments (which is non-negative). Beware, the meaning of $index is influenced by the value of $what.

my $num_IPTC = $file->retrieve_app13_segment(-1, 'IPTC');
my $ref_IPTC = $file->retrieve_app13_segment($num_IPTC - 1, 'IPTC');

If you want to be sure to have an APP13 Segment suitable for the kind of information you want to write, use the provide_app13_segment method instead, which forces the Segment to be present in the file, and returns its reference. If at least one segment matching $what is already present, the first one is returned. Otherwise, the first Photoshop-like APP13 is adapted by inserting an appropriate subdirectory record (update() is called automatically). If no such segment exists, it is first created and inserted. Note that there is no $index argument here.

my $ref_Photoshop = $file->provide_app13_segment('PHOTOSHOP');

If you want to remove all traces of non-IPTC information from the $index-th APP13 Photoshop-style Segment, use the remove_app13_info method with $what set to 'PHOTOSHOP'. Conversely, if you want to remove IPTC information from the $index-th APP13 IPTC-enabled Segment, set $what to 'IPTC'. If, after this, the segment is empty, it is eliminated from the list of segments in the file. If $index is (-1), all APP13 Segments are affected at once. Beware, the meaning of $index is influenced by the value of $what.

$file->remove_app13_info(3, 'PHOTOSHOP');
$file->remove_app13_info(-1, 'IPTC');

How to inspect and modify your IPTC data

Once you have a Segment reference pointing to your favourite IPTC-enabled APP13 Segment, you may want to have a look at the records it contains. Use the get_app13_data method for this: its behaviour is controlled by the $type and $what arguments (here, $what is 'IPTC' of course). It returns a reference to a hash containing a copy of the list of IPTC records, if present, undef otherwise: each element of the hash is a pair (key, arrayref), where arrayref points to an array with the real values (some IPTC records are repeatable, so multiple values are possible). The record keys can be the native numeric keys ($type eq 'NUMERIC') or translated textual keys ($type eq 'TEXTUAL', default); in any case, the record values are untranslated. If a numeric key stored in the JPEG file is unknown, and a textual translation is requested, the name of the key becomes "Unknown_tag_$tag". Note that there is no check on the validity of IPTC records' values: their format is not checked, and one or multiple values can be attached to a single tag independently of its repeatability. This is, in some sense, consistent with the fact that "unknown" tags are also included in the output. If $type or $what is invalid, an exception is thrown.

my $hash_ref = $segment->get_app13_data('TEXTUAL', 'IPTC');

An example of a possible output from this call is the following:

$hash_ref = { 'DateCreated'        => [ '19890207' ],
              'ByLine'             => [ 'Interesting picture', 'really' ],
              'Category'           => [ 'POL' ],
              'OriginatingProgram' => [ 'Mapivi' ] };

The hash returned by get_app13_data can be edited and reinserted with the set_app13_data method, whose arguments are $data, $action and, as usual, $what. If $action or $what is invalid, an exception is generated. This method accepts IPTC data in various formats and updates the corresponding subdirectory in the segment. The key type of each entry in the input hash can be numeric or textual, independently of the others (the same key can appear in both forms, the corresponding values will be put together). The value of each entry can be an array reference or a scalar (you can use this as a shortcut for value arrays with only one value). The $action argument can be:

   - ADD : new records are added and nothing is deleted; however, if you
           try to add a non-repeatable record which is already present,
           the newly supplied value replaces the pre-existing value.
   - UPDATE : new records replace those characterised by the same tags,
           but the others are preserved. This makes it possible to modify
           some repeatable IPTC records without deleting the other tags.
   - REPLACE : all records present in the IPTC subdirectory are deleted
           before inserting the new ones (this is the default action).

If, after implementing the changes required by $action, the 'RecordVersion' record (dataset 0) is still undefined, it is added (with version = 2), because it is mandatory according to the IPTC standard. The return value is a reference to a hash containing the rejected key-value entries. The entries of %$data are not modified. An entry in the %$data hash can be rejected for various reasons (you might want to have a look at appendix "Valid tags for IPTC data" for further information): a) the tag is undefined or not known; b) the entry value is undefined or points to an empty array; c) the non-repeatability constraint is violated; d) the tag is marked as invalid; e) a value is undefined; f) the length of a value is invalid; g) a value does not match its mandatory regular expression.

$segment->set_app13_data($additional_data, 'ADD', 'IPTC');
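The three actions can be compared with a small pure-Perl model of an IPTC subdirectory. This is a sketch of the semantics described above, not the module's code: the function apply_iptc_action and its $repeatable argument (a hash marking the tags which may hold several values) are hypothetical, and record validation is left out.

```perl
use strict;
use warnings;

# Hypothetical model of an IPTC subdirectory as tag => [values];
# %$repeatable marks the tags which may hold several values.
sub apply_iptc_action {
    my ($records, $data, $action, $repeatable) = @_;
    $action = 'REPLACE' unless defined $action;
    die "Unknown action $action\n"
        unless $action =~ /^(ADD|UPDATE|REPLACE)$/;
    # REPLACE starts from scratch, ADD and UPDATE from a copy
    my %result;
    if ($action ne 'REPLACE') {
        $result{$_} = [ @{ $records->{$_} } ] for keys %$records;
    }
    for my $tag (keys %$data) {
        my $value  = $data->{$tag};
        my @values = ref $value eq 'ARRAY' ? @$value : ($value);
        # only ADD on a repeatable tag keeps the pre-existing values
        my $append = $action eq 'ADD' && $repeatable->{$tag};
        $result{$tag} = [ ($append ? @{ $result{$tag} || [] } : ()),
                          @values ];
    }
    return \%result;
}

my $old = { Keywords => ['donald'], ObjectName => ['old'] };
my $new = { Keywords => 'duck', ObjectName => 'prova' };
for my $action (qw(ADD UPDATE REPLACE)) {
    my $r = apply_iptc_action($old, $new, $action, { Keywords => 1 });
    print "$action: Keywords = @{ $r->{Keywords} }\n";
}
```

Only 'ADD' accumulates values for the repeatable Keywords tag; 'UPDATE' and 'REPLACE' differ in what happens to the tags which are not mentioned in the new data.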

A snippet of code for changing IPTC data looks like this:

    my $segment = $file->provide_app13_segment('IPTC');
    my $hashref = {
	ObjectName => 'prova',
    	ByLine     => 'ciao',
    	Keywords   => [ 'donald', 'duck' ],
    	SupplementalCategory => ['arte', 'scienza', 'diporto'] };
    $segment->set_app13_data($hashref, 'REPLACE', 'IPTC');

If you are only interested in reading IPTC data in a standard configuration, you can skip most of the previous calls and directly use JPEG::get_app13_data (a method in the JPEG class, so you only need a JPEG structure object). This is an interface to the method with the same name in the Segment class, acting on the first relevant APP13 Segment (if no such segment is present, the undefined value is returned) and passing the arguments through. Note that most JPEG files with Photoshop/IPTC data contain at most one APP13 segment, so you are not going to lose anything here. A snippet of code for visualising IPTC data looks like this:

my $hashref = $file->get_app13_data('TEXTUAL', 'IPTC');
while (my ($tag, $val_arrayref) = each %$hashref) {
	printf '%25s --> ', $tag;
	print "$_ " for @$val_arrayref; print "\n"; }

There is, of course, a symmetric JPEG::set_app13_data method, which writes data to the JPEG object without asking the user to bother about Segments: it uses the first available suitable Segment; if this is not possible, a new Segment is created and initialised (because the method uses provide_app13_segment() internally, and not retrieve_app13_segment() as JPEG::get_app13_data does).

$file->set_app13_data($hashref, 'UPDATE', 'IPTC');

How to inspect and modify your Photoshop data

The procedure for inspecting and modifying Photoshop data (i.e., non-IPTC data in a Photoshop-style APP13 segment) is analogous to that for IPTC data, but with $what set to 'PHOTOSHOP'. The whole description will not be repeated here (have a look at the "How to inspect and modify your IPTC data" section for it): this section only points out the differences. If you are not acquainted with the structure of an APP13 segment and its terminology (e.g., "resource data block"), have a look at the "STRUCTURE OF A PHOTOSHOP-STYLE APP13 SEGMENT" section.

About get_app13_data(), it should only be pointed out that resource block names are appended to the list of values for each tag (even if they are undefined), so the list length is always even. Things are more complicated for set_app13_data(): non-IPTC Photoshop specifications are less uniform than IPTC ones, and checking the correctness of user-supplied data would be an enumerative task. Currently, this module does not perform any syntax check on non-IPTC data, but this could change in the future (any contribution is welcome); only tags (or, as they are called in this case, "resource block identifiers") are checked for being in the allowed tags list (see "VALID TAGS FOR PHOTOSHOP-STYLE APP13 DATA"). The IPTC/NAA tag is of course rejected: IPTC data must be inserted with $what set to 'IPTC'.

Although not explicitly stated, it seems that non-IPTC Photoshop tags are non-repeatable (let me know if not so), so two resource blocks with the same tag shouldn't exist. For this reason, the 'UPDATE' action is changed internally to 'ADD'. Moreover, since the resource block structure is not explored, all resource blocks are treated as single-valued and the value type is $UNDEF. So, in the user-supplied data hash, if a tag key returns a data array reference, only the first element (which cannot be undefined) of the array is used as resource block value: if a second element is present, it is used as resource block name (which is otherwise set to the null string). Supplying more than two elements is an error and causes the record to be rejected.

    my $segment = $file->provide_app13_segment('PHOTOSHOP');
    my $hashref = {
	GlobalAngle    => pack('N', 0x1e),
        GlobalAltitude => pack('N', 0x1e),
        CopyrightFlag  => "\001",
	IDsBaseValue   => [ pack('N', 1), 'Layer ID Generator Base' ] };
    $segment->set_app13_data($hashref, 'ADD', 'PHOTOSHOP');
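The interpretation of each entry of the Photoshop data hash (first element as value, optional second element as resource block name, rejection otherwise) can be sketched in pure Perl as follows. The helper split_resource_entry is hypothetical and only mirrors the rules stated above; the tag check against the allowed list is not modelled.

```perl
use strict;
use warnings;

# Hypothetical helper: split a user-supplied Photoshop entry into
# (value, name); an empty list means that the entry is rejected.
sub split_resource_entry {
    my ($entry) = @_;
    my @items = ref $entry eq 'ARRAY' ? @$entry : ($entry);
    # reject empty or too long lists, and undefined values
    return () if @items < 1 || @items > 2 || ! defined $items[0];
    # a missing resource block name becomes the null string
    return ($items[0], defined $items[1] ? $items[1] : '');
}

my ($value, $name) =
    split_resource_entry([ pack('N', 1), 'Layer ID Generator Base' ]);
printf "%d bytes, name '%s'\n", length $value, $name;
```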

CURRENT STATUS

A lot of other routines for modifying other meta-data could be added in the future. The following is a list of the current status of various meta-data Segments (only APP and COM Segments).

Segment  Possible content           Status

* COM    User comments              parse/read/write
* APP0   JFIF data (+ thumbnail)    parse/read
* APP1   Exif or XMP data           parse/read[Exif]/write[Exif]
* APP2   FPXR data or ICC profiles  parse
* APP3   additional EXIF-like data  parse
* APP4   HPSC                       nothing
* APP12  PreExif ASCII meta         parse[devel.]
* APP13  IPTC and Photoshop data    parse/read/write
* APP14  Adobe tags                 parse

KNOWN BUGS

USE WITH CAUTION! THIS IS EXPERIMENTAL SOFTWARE!

This module is still experimental, and not yet finished. In particular, it is far from being well tested, and some interfaces could change depending on user feedback. Parsing of maker notes in the Exif section is not yet implemented. APP13 data spanning multiple Segments are not correctly read/written. Most APP12 Segments do not fit the structure parsed by parse_app12(); probably there is some standard I don't know.

OTHER PACKAGES

Other packages are available in the free software arena, with a feature set showing a large overlap with that found in this package; a probably incomplete list follows. However, none of them is completely satisfactory with respect to the package's objectives, which are: being a single package dealing with all types of meta-information in read/write mode in a JPEG (and possibly TIFF) file; depending on the least possible number of non standard packages and/or external programs or libraries; being open-source and written in Perl. Of course, most of these objectives are far from being reached ....

"ExifTool" and "Image::ExifTool" by Phil Harvey

This is a Perl script that extracts meta information from various image file types; it can read EXIF, IPTC, XMP and GeoTIFF formatted data as well as the maker notes of many digital cameras. The "exiftool" script is just a command-line interface to the Image::ExifTool module. This library is very complete, highly customisable and capable of organising the results in various ways, but cannot modify file data (it only reads).

"Image::IPTCInfo" by Josh Carter

This is a CPAN module for extracting IPTC image meta-data. It allows reading IPTC data (there is an XML and also an HTML output feature) and manipulating them through native Perl structures. This library does not implement a full parsing of the JPEG file, so I did not consider it a good base for the development of a full-featured module. Moreover, I don't like the separate treatment of keywords and supplemental categories.

"JPEG::JFIF" by Marcin Krzyzanowski, "Image::Exif" by Sergey Prozhogin and "exiftags" by Eric M. Johnston

JPEG::JFIF is a very small CPAN module for reading meta-data in JFIF/JPEG format files. In practice, it only recognises a subset of the IPTC tags in APP13, and the parsing code is not suitable for being reused for a generic JPEG segment. Image::Exif is just a perl wrapper around "exiftags", which is a program parsing the APP1 section in JPEG files for Exif meta-data (it supports a variety of MakerNotes). exiftags can also rewrite comments and date and time tags.

"Image::Info" and "Image::TIFF" by Gisle Aas

These CPAN modules extract meta information from a variety of graphic formats (including JPEG and TIFF). So, they are not specifically about JPEG segments: reported information includes file_media_type, file_extention, width, height, color_type, comments, Interlace, Compression, Gamma, LastModificationTime. For JPEG files, they additionally report from JFIF (APP0) and Exif (APP1) segments (including MakerNotes). These modules do not allow for editing.

"exif" by Martin Krzywinski and "exifdump.py" by Thierry Bousch

These are two basic scripts to extract EXIF information from JPEGs. The first script is written in Perl and targets Canon pictures. The second one is written in Python, and it only works on JPEG files beginning with an APP1 section after the SOI. So, they are much simpler than all other programs/libraries described here. Of course, they cannot modify Exif data.

"exifprobe" by Duane H. Hesser

This is a C program which examines and reports the contents and structure of JPEG and TIFF image files. It recognises all standard JPEG markers and reports the contents of any properly structured TIFF IFD encountered, even when entry tags are not recognised. Camera MakerNotes are included. GPS and GeoTIFF tags are recognised and entries printed in "raw" form, but are not expanded. The output is nicely formatted, with indentation and colouration; this program is a great tool for inspecting a JPEG/TIFF structure while debugging.

"libexif" by Lutz Müller

This is a library, written in C, for parsing, editing, and saving EXIF data. All EXIF tags described in EXIF standard 2.1 are supported. Libexif can only handle some maker notes, and even those not very well. It is used by a number of front-ends, including: exif (read-only command-line utility), gexif (a GTK+ frontend for editing EXIF data), gphoto2 (command-line frontend to libgphoto2, a library to access digital cameras), gtkam (a GTK+ frontend to libgphoto2), thirdeye (a digital photos organizer and driver for eComStation).

"jpegrdf" by Norman Walsh

This is a Java application for manipulating (read/write) RDF meta-data in the comment sections of JPEG images (is this the same thing which can be found in APP1 segments in XMP format?). It can also access and convert into RDF the Exif tags and a few other general properties. However, I don't want to rely on a Java environment being installed in order to be able to access these properties.

"OpenExif" by Eastman Kodak Company

This is an object-oriented interface written in C++ to Exif formatted JPEG image files. It is very complete and sponsored by a large company, so it is to be considered a sort of reference. The toolkit allows creating, reading, and modifying the meta-data in the Exif file. It also provides means of getting and setting the main image and the thumbnail image. OpenExif is also extensible, and Application segments can be added.

APPENDICES - SEGMENT STRUCTURES

STRUCTURE OF JPEG PICTURES

The structure of a well formed JPEG file can be described by the following pseudo production rules (for the sake of simplicity, some additional constraints between tables and SOF segments are neglected).

JPEG        --> (SOI)(misc)*(image)?(EOI)
(image)     --> (hierarch.)|(non-hier.)
(hierarch.) --> (DHP)(frame)+
(frame)     --> (misc)*(EXP)?(non-hier.)
(non-hier.) --> (SOF)(scan)+
(scan)      --> (misc)*(SOS)(data)*(ECS)(DNL)?
(data)      --> (ECS)(RST)
(misc)      --> (DQT)|(DHT)|(DAC)|(DRI)|(COM)|(APP)

(SOI) = Start Of Image
(EOI) = End Of Image
(SOF) = Start Of Frame header (10 types)
(SOS) = Start Of Scan header
(ECS) = Entropy Coded Segment (raw data, not a real segment)
(DNL) = Define Number of Lines segment
(DHP) = Define Hierarchical Progression segment
(EXP) = EXPansion segment
(RST) = ReSTart segment (8 types)
(DQT) = Define Quantisation Table
(DHT) = Define Huffman coding Table
(DAC) = Define Arithmetic coding Table
(DRI) = Define Restart Interval
(COM) = COMment segment
(APP) = APPlication segment

This package does not check that a JPEG file is really correct; it accepts a looser syntax, where segments and ECS blocks are just contiguous (basically, because it does not need to display the image!). All meta-data information is concentrated in the (COM) and (APP) Segments, except for some records in the (SOF) segment (e.g. image dimensions). For further details see

"Digital compression and coding of continuous-tone still images:
 requirements and guidelines", CCITT recommendation T.81, 09/1992,
The International Telegraph and Telephone Consultative Committee.
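The "looser syntax" mentioned above, where a file is read as a plain sequence of segments and ECS blocks, can be illustrated with a short pure-Perl scanner working on an in-memory byte string. This is only a sketch based on the general layout of JPEG markers (0xff followed by a code, then a 2-byte big-endian length for all segments except bare markers such as SOI and EOI); the function list_segments and the %name table are hypothetical and do not reproduce this package's parser.

```perl
use strict;
use warnings;

# Marker codes of a few segments (a small subset, for illustration)
my %name = (0xd8 => 'SOI', 0xd9 => 'EOI', 0xda => 'SOS',
            0xdb => 'DQT', 0xc4 => 'DHT', 0xfe => 'COM');

# Return [name, payload length] pairs for the segments in $$jpeg_ref;
# the raw data following an SOS segment is reported as 'ECS'.
sub list_segments {
    my ($jpeg_ref) = @_;
    my ($pos, $end, @list) = (0, length $$jpeg_ref);
    while ($pos + 1 < $end) {
        my ($ff, $code) = unpack 'CC', substr($$jpeg_ref, $pos, 2);
        die "No marker at offset $pos\n" unless $ff == 0xff;
        my $name = $name{$code} || ($code >= 0xe0 && $code <= 0xef
            ? 'APP' . ($code - 0xe0) : sprintf('x%02x', $code));
        $pos += 2;
        # SOI and EOI are bare markers, without length and payload
        if ($code == 0xd8 || $code == 0xd9) { push @list, [$name, 0]; next; }
        die "Truncated segment at offset $pos\n" if $pos + 2 > $end;
        # the 2-byte big-endian length includes the length field itself
        my $length = unpack 'n', substr($$jpeg_ref, $pos, 2);
        die "Bad length at offset $pos\n" if $length < 2;
        push @list, [$name, $length - 2];
        $pos += $length;
        next unless $code == 0xda;
        # after SOS, skip raw data up to the next real marker (0xff not
        # followed by 0x00, a restart marker 0xd0-0xd7 or a fill 0xff)
        my $rest = $pos < $end ? substr($$jpeg_ref, $pos) : '';
        my $ecs  = $rest =~ /\xff[^\x00\xd0-\xd7\xff]/ ? $-[0] : length $rest;
        push @list, ['ECS', $ecs];
        $pos += $ecs;
    }
    return @list;
}

my $jpeg = "\xff\xd8"                             # SOI
         . "\xff\xfe" . pack('n', 7) . 'hello'    # COM, 5 bytes of payload
         . "\xff\xda" . pack('n', 3) . "\x00"     # SOS (fake 1-byte header)
         . "ab\xff\x00cd"                         # entropy coded data
         . "\xff\xd9";                            # EOI
print join(' ', map { sprintf '%s(%d)', @$_ } list_segments(\$jpeg)), "\n";
```

Note that the scanner never checks the production rules: it only needs segment boundaries, which is all that is required for reaching the (COM) and (APP) Segments.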

STRUCTURE OF A JFIF APP0 SEGMENT

JFIF APP0 segments are an old standard used to store information about the picture dimensions and an optional thumbnail. The format of a JFIF APP0 segment is as follows (note that the size of thumbnail data is 3n, where n = Xthumbnail * Ythumbnail, and it is present only if n is not zero; only the first 8 records are mandatory):

    [Record name]    [size]   [description]
    ---------------------------------------
    Identifier       5 bytes  ("JFIF\000" = 0x4a46494600)
    MajorVersion     1 byte   major version (e.g. 0x01)
    MinorVersion     1 byte   minor version (e.g. 0x01 or 0x02)
    Units            1 byte   units (0: densities give aspect ratio
                                     1: density values are dots per inch
                                     2: density values are dots per cm)
    Xdensity         2 bytes  horizontal pixel density
    Ydensity         2 bytes  vertical pixel density
    Xthumbnail       1 byte   thumbnail horizontal pixel count
    Ythumbnail       1 byte   thumbnail vertical pixel count
    ThumbnailData   3n bytes  thumbnail image
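The record layout in the table above maps naturally onto a Perl unpack template. The following sketch decodes the payload of a JFIF APP0 segment (the bytes after the segment length) into a hash keyed by the record names of the table; the helper parse_jfif_app0 is hypothetical and is not the parser used by this package.

```perl
use strict;
use warnings;

# Hypothetical helper: decode the payload of a JFIF APP0 segment
# (the bytes following the segment length) into a hash reference.
sub parse_jfif_app0 {
    my ($payload) = @_;
    my @names = qw(Identifier MajorVersion MinorVersion Units
                   Xdensity Ydensity Xthumbnail Ythumbnail);
    # the eight mandatory records take 14 bytes altogether
    return if length $payload < 14;
    my %record;
    # 5-byte identifier, three bytes, two 16-bit big-endian, two bytes
    @record{@names} = unpack 'a5 C3 n2 C2', $payload;
    return unless $record{Identifier} eq "JFIF\000";
    # the optional thumbnail takes 3 bytes (RGB) for each pixel
    my $size = 3 * $record{Xthumbnail} * $record{Ythumbnail};
    return if length $payload < 14 + $size;
    $record{ThumbnailData} = substr($payload, 14, $size) if $size;
    return \%record;
}

my $payload = "JFIF\000" . pack('C3 n2 C2', 1, 2, 1, 72, 72, 0, 0);
my $jfif = parse_jfif_app0($payload);
printf "JFIF %d.%02d, density %dx%d (units %d)\n",
    @$jfif{qw(MajorVersion MinorVersion Xdensity Ydensity Units)};
```

The ThumbnailData key is present only when both thumbnail dimensions are non-zero, mirroring the rule that the thumbnail exists only if n is not zero.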

There is also an extended JFIF (only possible for JFIF versions 1.02 and above). In this case the identifier is not JFIF but JFXX. This extension allows for the inclusion of differently encoded thumbnails. The syntax in this case is modified as follows:

    [Record name]    [size]   [description]
    ---------------------------------------
    Identifier       5 bytes  ("JFXX\000" = 0x4a46585800)
    ExtensionCode    1 byte   (0x10 Thumbnail coded using JPEG
                               0x11 Thumbnail using 1 byte/pixel
                               0x13 Thumbnail using 3 bytes/pixel)

Then, depending on the extension code, there are other records to define the thumbnail. If the thumbnail is coded using a JPEG stream, a binary JPEG stream immediately follows the extension code (the byte count of this file is included in the byte count of the APP0 Segment). This stream conforms to the syntax for a JPEG file (SOI .... SOF ... EOI); however, no 'JFIF' or 'JFXX' marker Segments should be present:

[Record name]    [size]   [description]
---------------------------------------
JPEGThumbnail  ... bytes  a variable length JPEG picture

If the thumbnail is stored using one byte per pixel, after the extension code one should find a palette and an indexed RGB. The records are as follows (remember that n = Xthumbnail * Ythumbnail):

    [Record name]    [size]   [description]
    ---------------------------------------
    Xthumbnail       1 byte    thumbnail horizontal pixel count
    Ythumbnail       1 byte    thumbnail vertical pixel count
    ColorPalette   768 bytes   24-bit RGB values for the colour palette
                               (defining the colours represented by each
                                value of an 8-bit binary encoding)
    1ByteThumbnail   n bytes   8-bit indexed values for the thumbnail

If the thumbnail is stored using three bytes per pixel, there is no colour palette, so the previous fields simplify into:

[Record name]    [size]   [description]
---------------------------------------
Xthumbnail       1 byte    thumbnail horizontal pixel count
Ythumbnail       1 byte    thumbnail vertical pixel count
3BytesThumbnail 3n bytes   24-bit RGB values for the thumbnail
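
The record sizes of the two uncompressed JFXX variants can be summed up in a few lines. The following sketch computes how many bytes are expected after the extension code; the function name jfxx_thumbnail_bytes is an assumption made for this example.

```perl
use strict;
use warnings;

# Hypothetical helper: number of bytes expected after the JFXX
# extension code, following the record tables above.
sub jfxx_thumbnail_bytes {
    my ($code, $xthumb, $ythumb) = @_;
    my $n = $xthumb * $ythumb;
    return 2 + 768 + $n if $code == 0x11;  # dimensions, palette, indexes
    return 2 + 3 * $n   if $code == 0x13;  # dimensions, RGB triplets
    return;  # 0x10 (JPEG stream) has no fixed size; other codes unknown
}

printf "1 byte/pixel,  16x12: %d bytes\n", jfxx_thumbnail_bytes(0x11, 16, 12);
printf "3 bytes/pixel, 16x12: %d bytes\n", jfxx_thumbnail_bytes(0x13, 16, 12);
```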

STRUCTURE OF AN EXIF APP1 SEGMENT

Exif (Exchangeable Image File format) JPEG files use APP1 segments in order not to conflict with JFIF files (which use APP0). Exif APP1 segments store a great amount of information on photographic parameters for digital cameras and are the preferred way to store thumbnail images nowadays. They can also host an additional section with GPS data. Exif APP1 segments are made up of an identifier, a TIFF header and a sequence of IFDs (Image File Directories) and subIFDs. There are only two top-level IFDs (IFD0, for photographic parameters, and IFD1, for thumbnail parameters); they can be followed by thumbnail data. The structure is as follows:

[Record name]    [size]   [description]
---------------------------------------
Identifier       6 bytes   ("Exif\000\000" = 0x457869660000), not stored
Endianness       2 bytes   'II' (little endian) or 'MM' (big endian)
Signature        2 bytes   a fixed value = 42
IFD0_Pointer     4 bytes   offset of 0th IFD (usually 8), not stored
IFD0                ...    main image IFD
IFD0@SubIFD         ...    EXIF private tags (optional, linked by IFD0)
IFD0@SubIFD@Interop ...    Interoperability IFD (optional,linked by SubIFD)
IFD0@GPS            ...    GPS IFD (optional, linked by IFD0)
APP1@IFD1           ...    thumbnail IFD (optional, pointed to by IFD0)
ThumbnailData       ...    Thumbnail image (optional, 0xffd8.....ffd9)

So, each Exif APP1 segment starts with the identifier string "Exif\000\000"; this avoids a conflict with other applications using APP1, for instance XMP data. The three following fields (Endianness, Signature and IFD0_Pointer) constitute the so-called TIFF header. The offset of the 0th IFD in the TIFF header, as well as IFD links in the following IFDs, is given with respect to the beginning of the TIFF header (i.e. the address of the 'MM' or 'II' pair). This means that if the 0th IFD begins (as usual) immediately after the end of the TIFF header, the offset value is 8. An Exif segment is the only part of a JPEG file whose endianness is not fixed to big endian.
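
The TIFF header can be decoded as sketched below. The argument is assumed to start at the 'II' or 'MM' pair, that is, after the "Exif\000\000" identifier has been removed. The helper name parse_tiff_header and the returned keys are assumptions made for this example, not the module's own interface.

```perl
use strict;
use warnings;

# Hypothetical helper: decode the 8-byte TIFF header (Endianness,
# Signature, IFD0_Pointer) found at the start of $tiff.
sub parse_tiff_header {
    my ($tiff) = @_;
    return if length($tiff) < 8;
    my $order = substr($tiff, 0, 2);
    # unpack codes: 'v'/'V' are little-endian, 'n'/'N' are big-endian.
    my ($short, $long);
    if    ($order eq 'II') { ($short, $long) = ('v', 'V') }
    elsif ($order eq 'MM') { ($short, $long) = ('n', 'N') }
    else                   { return }
    # Skip the two endianness bytes, then read signature and offset.
    my ($signature, $ifd0) = unpack("x2 $short $long", $tiff);
    return unless $signature == 42;
    return { order => $order, short => $short, long => $long,
             ifd0_offset => $ifd0 };
}

# Usage: a big-endian header whose 0th IFD follows immediately.
my $tiff_hdr = parse_tiff_header('MM' . pack('n N', 42, 8));
print "byte order $tiff_hdr->{order}, ",
      "IFD0 at offset $tiff_hdr->{ifd0_offset}\n";
```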

If the thumbnail is present it is located after the 1st IFD. There are 3 possible formats: JPEG (only this is compressed), RGB TIFF, and YCbCr TIFF. It seems that JPEG and 160x120 pixels are recommended for Exif ver. 2.1 or higher (mandatory for DCF files). Since the size of a segment is recorded in 2 bytes, thumbnails are limited to slightly less than 64KB.

Each IFD block is a structured sequence of records, called, in the Exif jargon, Interoperability arrays. The beginning of the 0th IFD is given by the 'IFD0_Pointer' value. The structure of an IFD is the following:

[Record name]    [size]   [description]
---------------------------------------
                 2 bytes  number n of Interoperability arrays
               12n bytes  the n arrays (12 bytes each)
                 4 bytes  link to next IFD (can be zero)
                   ...    additional data area

The next_link field of the 0th IFD, if non-null, points to the beginning of the 1st IFD. The 1st IFD as well as all other sub-IFDs must have next_link set to zero. The thumbnail location and size are given by some interoperability arrays in the 1st IFD. The structure of an Interoperability array is:

[Record name]    [size]   [description]
---------------------------------------
                 2 bytes  Tag (a unique 2-byte number)
                 2 bytes  Type (one out of 12 types)
                 4 bytes  Count (the number of values)
                 4 bytes  Value Offset (value or offset)

The possible types are the same as for the Record class, except for nibbles and references (see "MANAGING A JPEG RECORD OBJECT"). Indeed, the Record class is modelled after interoperability arrays, and each interoperability array gets stored as a Record with given tag, type, count and values. The "value offset" field gives the offset from the TIFF header base where the value is recorded. It contains the actual value if it is not larger than 4 bytes (32 bits). If the value is shorter than 4 bytes, it is recorded in the lower end of the 4-byte area (smaller offsets). For further details see the section "VALID TAGS FOR EXIF APP1 DATA".
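
The IFD and Interoperability array layouts described above can be walked as in the following sketch. The helper name read_ifd is hypothetical; the caller is assumed to pass the unpack codes matching the byte order of the TIFF header ('n' and 'N' for 'MM', 'v' and 'V' for 'II'). The value offset field is returned as 4 raw bytes, because its interpretation depends on the type and count of each array.

```perl
use strict;
use warnings;

# Hypothetical helper: list the Interoperability arrays of the IFD
# found at $offset (relative to the beginning of the TIFF header).
sub read_ifd {
    my ($tiff, $offset, $short, $long) = @_;
    return if $offset + 2 > length($tiff);
    my $count = unpack($short, substr($tiff, $offset, 2));
    # 2 bytes for the count, 12 bytes per array, 4 bytes for the link.
    return if $offset + 2 + 12 * $count + 4 > length($tiff);
    my @entries;
    for my $i (0 .. $count - 1) {
        my $raw = substr($tiff, $offset + 2 + 12 * $i, 12);
        my ($tag, $type, $n) = unpack("$short $short $long", $raw);
        push @entries, { tag => $tag, type => $type, count => $n,
                         value_offset => substr($raw, 8, 4) };
    }
    my $next = unpack($long, substr($tiff, $offset + 2 + 12 * $count, 4));
    return (\@entries, $next);
}

# Usage: a big-endian TIFF block with one array, Orientation (0x112),
# type 3 (short), count 1, value 6 stored in the field itself.
my $ifd_demo = 'MM' . pack('n N', 42, 8)
             . pack('n', 1)
             . pack('n n N n x2', 0x0112, 3, 1, 6)
             . pack('N', 0);
my ($ifd_entries, $ifd_next) = read_ifd($ifd_demo, 8, 'n', 'N');
my $first = $ifd_entries->[0];
# A single short sits at the lower end of the 4-byte area.
printf "tag 0x%04x = %d, next IFD link = %d\n",
    $first->{tag}, unpack('n', $first->{value_offset}), $ifd_next;
```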

STRUCTURE OF A PHOTOSHOP-STYLE APP13 SEGMENT

Adobe's Photoshop program, a de facto standard for image manipulation, uses the APP13 segment for storing non-graphic information, such as layers, paths, IPTC data and more. The unit for this kind of information is called a "resource data block" (because these blocks hold data that was stored in the Macintosh's resource fork in early versions of Photoshop). The content of an APP13 segment is the string "Photoshop 3.0\000" followed by a sequence of resource data blocks; a resource block has the following structure:

[Record name]    [size]   [description]
---------------------------------------
(Type)           4 bytes  Photoshop always uses '8BIM'
(ID)             2 bytes  a unique identifier, e.g., "\004\004" for IPTC
(Name)             ...    a Pascal string (padded to make size even)
(Size)           4 bytes  actual size of resource data
(Data)             ...    resource data, padded to make size even

(A Pascal string is made up of a single byte, giving the string length, followed by the string itself, padded to make size even including the length byte; since the string length is explicit, there is no need for a terminating null character.) Valid Image Resource IDs are listed in the "VALID TAGS FOR PHOTOSHOP-STYLE APP13 DATA" section. In general a resource block contains only a few bytes, but there is an important block, the IPTC block, which can be quite large; the structure of this block is analysed in more detail in the "Structure of an IPTC data block" section.
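
A minimal sketch of how the resource block layout can be traversed follows; it pays attention to the two padding rules (Pascal string and data are both padded to an even size). The helper name parse_app13 and the returned structure are assumptions made for this example; the payload is assumed to be the segment content after the 2-byte segment length.

```perl
use strict;
use warnings;

# Hypothetical helper: split the payload of a Photoshop-style APP13
# segment into its resource data blocks.
sub parse_app13 {
    my ($payload) = @_;
    my $header = "Photoshop 3.0\000";
    return unless index($payload, $header) == 0;
    my $pos = length($header);
    my @blocks;
    # Each block needs at least type (4), ID (2) and name length (1).
    while ($pos + 7 <= length($payload)) {
        my ($type, $id, $name_len)
            = unpack('a4 n C', substr($payload, $pos, 7));
        last unless $type eq '8BIM';
        my $name = substr($payload, $pos + 7, $name_len);
        # Length byte plus string, rounded up to an even total.
        my $name_size = 1 + $name_len;
        $name_size++ if $name_size % 2;
        $pos += 6 + $name_size;
        last if $pos + 4 > length($payload);
        my $size = unpack('N', substr($payload, $pos, 4));
        $pos += 4;
        last if $pos + $size > length($payload);
        push @blocks, { id => $id, name => $name,
                        data => substr($payload, $pos, $size) };
        $pos += $size + ($size % 2);  # data is padded to an even size
    }
    return \@blocks;
}

# Usage: one block with ID 0x0404, an empty name and 3 data bytes.
my $app13_demo = "Photoshop 3.0\000"
               . '8BIM' . pack('n', 0x0404) . "\000\000"
               . pack('N', 3) . 'abc' . "\000";
printf "block 0x%04x, %d data bytes\n", $_->{id}, length($_->{data})
    for @{ parse_app13($app13_demo) };
```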

The reference document for the Photoshop file format is:
  "Adobe Photoshop 6.0: File Formats Specifications",
   Adobe System Inc., ver.6.0, rel.2, November 2000.

Another interesting source of information is:
  "'Solo' Image File Format. RichTIFF and its
   replacement by 'Solo' JFIF", version 2.0a,
   Coatsworth Comm. Inc., Brampton, Ontario, Canada

Structure of an IPTC data block

An IPTC/NAA resource data block of a Photoshop-style APP13 segment embeds an IPTC stream conforming to the standard defined by the International Press and Telecommunications Council (IPTC) and the Newspaper Association of America (NAA) for exchanging interoperability information related to various news objects. The data part of a resource block, an IPTC stream, is simply a sequence of units called datasets; neither a preamble nor a count is present. Each dataset consists of a unique tag header and a data field (the list of valid tags [dataset numbers] can be found in section "Valid tags for IPTC data"). A standard tag header is used when the data field size is less than 32768 bytes; otherwise, an extended tag header is used. The datasets do not need to show up in numerical order according to their tag. The structure of a dataset is:

[Record name]    [size]   [description]
---------------------------------------
(Tag marker)     1 byte   this must be 0x1c
(Record number)  1 byte   always 2 for 2:xx datasets
(Dataset number) 1 byte   this is what we call a "tag"
(Size specifier) 2 bytes  data length (< 32768 bytes) or length of ...
(Size specifier)  ...     data length (> 32767 bytes only)
(Data)            ...     (its length is specified before)

So, standard datasets have a 5-byte tag header; the last two bytes in the header contain the data field length, the most significant bit being always 0. For extended datasets instead, these two bytes contain the length of the (following) data field length, the most significant bit being always 1. The value of the most significant bit thus distinguishes "standard" from "extended"; in digital photographs, I assume that the datasets which are actually used (a subset of the standard) are always standard; therefore, we likely do not have the IPTC block spanning more than one APP13 segment. The record types defined by the IPTC-NAA standard are the following (but the "pseudo"-standard by Adobe for APP13 IPTC data is restricted to the first application record, 2:xx, I believe, because the enveloping structure is replaced by the resource data block):

    [Record name]                [dataset record number]
    ----------------------------------------------------
    Object Envelope Record:               1:xx
    Application Records:             2:xx through 6:xx
    Pre-ObjectData Descriptor Record:     7:xx
    ObjectData Record:                    8:xx
    Post-ObjectData Descriptor Record:    9:xx
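
The dataset layout described in this section, including the distinction between standard and extended tag headers, can be sketched as follows. The helper name parse_iptc is hypothetical, and limiting extended length fields to 4 bytes is a simplifying assumption of this example, not a rule taken from the standard.

```perl
use strict;
use warnings;

# Hypothetical helper: split an IPTC stream into its datasets.
sub parse_iptc {
    my ($stream) = @_;
    my $pos = 0;
    my @datasets;
    while ($pos + 5 <= length($stream)) {
        my ($marker, $record, $tag, $size)
            = unpack('C C C n', substr($stream, $pos, 5));
        last unless $marker == 0x1c;
        $pos += 5;
        if ($size & 0x8000) {
            # Extended dataset: the low 15 bits give the number of
            # bytes holding the real data length, which follows.
            my $len_bytes = $size & 0x7fff;
            # Assumption of this sketch: at most a 32-bit length.
            last if $len_bytes > 4;
            last if $pos + $len_bytes > length($stream);
            $size = 0;
            for my $byte (unpack('C*', substr($stream, $pos, $len_bytes))) {
                $size = ($size << 8) | $byte;
            }
            $pos += $len_bytes;
        }
        last if $pos + $size > length($stream);
        push @datasets, { record => $record, tag => $tag,
                          data => substr($stream, $pos, $size) };
        $pos += $size;
    }
    return \@datasets;
}

# Usage: a single standard dataset 2:120 carrying five bytes.
my $iptc_demo = pack('C C C n', 0x1c, 2, 120, 5) . 'hello';
printf "%d:%d = %s\n", @{$_}{qw(record tag data)}
    for @{ parse_iptc($iptc_demo) };
```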

  The reference document for the IPTC standard is:
        "IPTC-NAA: Information Interchange Model", version 4, 1-Jul-1999,
        Comité International des Télécommunications de Presse.

APPENDICES - TAG LISTS

VALID TAGS FOR EXIF APP1 DATA

The Japan Electronics and Information Technology Industries Association (JEITA) set up a standard for an exchange format for digital still camera pictures, known as EXIF. This standard defines a structure for embedding meta-data in a JPEG picture, to be written in the APP1 segment. The generalities about this structure are shown in the section "STRUCTURE OF AN EXIF APP1 SEGMENT"; this section and its subsections list the valid interoperability record tags as well as their format.

  The reference document for Exif 2.2 standard is:
    "Exchangeable image file format for digital still cameras:
     Exif Version 2.2", JEITA CP-3451, Apr 2002 
    Japan Electronic Industry Development Association (JEIDA)
  with Interoperability information being found in:
    "Design rule for Camera File system", (DCF), v1.0
     English Version 1999.1.7, Adopted December 1998
    Japan Electronic Industry Development Association (JEIDA)

  The TIFF (Tagged Image File format) standard documents are also useful:
    - "TIFF(TM) Revision 6.0, Final", June 3, 1992, Adobe Devel. Association
    - ISO 12639, "Graphic technology -- Prepress digital data exchange
          -- Tag image file format for image technology (TIFF/IT)"
    - ISO 12234-2, "Electronic still-picture imaging -- Removable memory
          -- Part 2: TIFF/EP image data format"
  as well as some updates and corrections:
    - DRAFT - TIFF CLASS F, October 1, 1991
    - DRAFT - TIFF Technical Note #2, 17-Mar-95 (updates for JPEG-in-TIFF)
    - "Adobe Pagemaker 6.0 TIFF Technical Notes", (1,2,3 and OPI), 14-Sep-1995

Canonical Exif 2.2 and TIFF 6.0 tags for IFD0 and IFD1

In general, IFD0 and IFD1 can host tags from the same set. These tags are divided into three categories: canonical, additional and registered to companies. The tags listed in the following table are to be considered canonical; they are described at length in the EXIF standard document, and can be found both in the IFD0 and in the IFD1 (some of them, in fact, must be present in both directories). The 'class' column carries the tag class; possible values are: A (image data structure), B (offsets), C (image data characteristics), D (other tags) and P (pointers to other IFDs). The two following columns show tag hexadecimal codes and names. The 'type' column specifies the (always unsigned) tag type: I (short or long), S (short), L (long), R (rational) and A (ASCII, always null terminated). The 'count' column carries the tag count ('-' for a variable count, either because it is a variable length string or because it depends on other tags).

The 'IFD0' and 'IFD1' columns specify the support level in the respective directory; each column comprises four letters, because both the primary image (IFD0) and the thumbnail (IFD1) can come in four varieties (uncompressed chunky, uncompressed planar, uncompressed YCC and JPEG compressed). This module currently focuses only on JPEG pictures (not TIFF), so only the fourth letter of the 'IFD0' column is interesting, but note that the thumbnail of a JPEG image can be uncompressed. The support level codes stand for: M (mandatory), R (recommended), O (optional), N (not_recorded) and J (included in JPEG marker and so not recorded).

The 'thumbnail-only' column has a 'T' for records which cannot be set/changed by the user except during a thumbnail update action (and some of them are calculated automatically anyway). Note that, in some cases, it is possible to set a tag when its support level is 'N' (e.g., the YCbCr stuff in IFD1): picture displaying programs should however simply ignore it. Some other tags, concerning offsets or thumbnail specific information, cannot be set by the module user (they are calculated automatically, more reliably): these are marked by 'calculated' in the notes, or by a 'T' in the thumbnail-only column.

   Hexadecimal code                count   IFD0 IFD1 thumbnail-only
class |  Tag name                 type |   supp.supp.| notes
|     |  |                           | |   |    |    | |
A   100  ImageWidth                  I 1   MMMJ MMMJ T (not JPEG) num columns
A   101  ImageLength                 I 1   MMMJ MMMJ T (not JPEG) num rows
A   102  BitsPerSample               S 3   MMMJ MMMJ T (not JPEG) 8,8,8
A   103  Compression                 S 1   MMMJ MMMM T 1(uncompr.) or 6(JPEG)
A   106  PhotometricInterpretation   S 1   MMMN MMMJ   2 (RGB) or 6 (YCbCr)
D   10e  ImageDescription            A -   RRRR OOOO   pure ASCII
D   10f  Make                        A -   RRRR OOOO   camera maker
D   110  Model                       A -   RRRR OOOO   camera model
B   111  StripOffsets                I -   MMMN MMMN   calculated
A   112  Orientation                 S 1   RRRR OOOO   1-8
A   115  SamplesPerPixel             S 1   MMMJ MMMJ T (not JPEG) 3 compon.
B   116  RowsPerStrip                I 1   MMMN MMMN T (not JPEG)
B   117  StripByteCounts             I -   MMMN MMMN T (not JPEG)
A   11a  XResolution                 R 1   MMMM MMMM   [72 default]
A   11b  YResolution                 R 1   MMMM MMMM   [72 default]
A   11c  PlanarConfiguration         S 1   OMOJ OMOJ   1 or 2
A   128  ResolutionUnit              S 1   MMMM MMMM   2 or 3
C   12d  TransferFunction            S 768 RRRR OOOO   .
D   131  Software                    A -   OOOO OOOO   .
D   132  DateTime                    A 20  RRRR OOOO   YYYY:MM:DD HH:MM:SS
D   13b  Artist                      A -   OOOO OOOO   .
C   13e  WhitePoint                  R 2   OOOO OOOO   .
C   13f  PrimaryChromaticities       R 6   OOOO OOOO   .
B   201  JPEGInterchangeFormat       L 1   NNNN NNNM   calculated
B   202  JPEGInterchangeFormatLength L 1   NNNN NNNM T (only JPEG)
C   211  YCbCrCoefficients           R 3   NNOO NNOO   see the sYCC standard
A   212  YCbCrSubSampling            S 2   NNMJ NNMJ   [2,1] or [2,2]
A   213  YCbCrPositioning            S 1   NNMM NNOO   1 or 2
C   214  ReferenceBlackWhite         R 6   OOOO OOOO   .
D  8298  Copyright                   A -   OOOO OOOO   one or two copyrights
P  8769  ExifOffset                  L 1   MMMM OOOO   calculated
P  8825  GPSInfo                     L 1   OOOO OOOO   calculated
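
Since only the fourth letter of a support column matters for JPEG compressed images, reading the table above can be reduced to a small lookup. This is only an illustrative sketch; the names %SUPPORT_LEVEL and jpeg_support_level are assumptions made for this example.

```perl
use strict;
use warnings;

# Support level codes, as explained in the text above.
my %SUPPORT_LEVEL = (
    M => 'mandatory',
    R => 'recommended',
    O => 'optional',
    N => 'not recorded',
    J => 'included in JPEG marker',
);

# Hypothetical helper: support level for a JPEG compressed image,
# given a four-letter code from the 'IFD0' or 'IFD1' column.
sub jpeg_support_level {
    my ($codes) = @_;
    return unless defined $codes && length($codes) == 4;
    # The fourth letter refers to the JPEG compressed variety.
    return $SUPPORT_LEVEL{ substr($codes, 3, 1) };
}

print 'DateTime in IFD0:   ', jpeg_support_level('RRRR'), "\n";
print 'ImageWidth in IFD0: ', jpeg_support_level('MMMJ'), "\n";
```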

Additional TIFF 6.0 tags not in Exif 2.2 for IFD0

The tags listed in the following table are present in the TIFF 6.0 standard and not in the Exif 2.2 standard. They are presented here just for reference, since some digital cameras or programs still include them, incorrectly, in the IFD0 (they are not present in IFD1, I assume). The 'class' column carries the tag class; possible values are: a (TIFF 6.0 tags for baseline TIFFs not in Exif 2.2), b (extensions to TIFF 6.0 specs not in Exif 2.2) and '-' (updates and corrections to TIFF 6.0). The two following columns show tag hexadecimal codes and names. The 'type' column specifies the tag type: i (byte or short), I (short or long), B (byte), S (short), L (long), R (rational), F (floating point numbers), D (double precision floating point numbers), '-' (unspecified, best fit) and A (ASCII, always null terminated). The 'count' column carries the tag count ('-' means that it is variable, either because it is a variable length string or because it depends on other tags).

   Hexadecimal code                count  notes
class |  Tag name                 type |  |
|     |  |                           | |  |
a    fe  NewSubfileType              L 1  TIFFs can hold multiple images
a    ff  SubFileType                 S 1  TIFFs can hold multiple images
a   107  Thresholding                S 1  for Graylevel to Black&White
a   108  CellWidth                   S 1  halftoning matrix support
a   109  CellLength                  S 1  halftoning matrix support
a   10a  FillOrder                   S 1  bits' logical order in a byte
b   10d  DocumentName                A -  document storage and retrieval
a   118  MinSampleValue              S -  only for statistical purposes
a   119  MaxSampleValue              S -  only for statistical purposes
b   11d  PageName                    A -  document storage and retrieval
b   11e  XPosition                   R 1  document storage and retrieval
b   11f  YPosition                   R 1  document storage and retrieval
a   120  FreeOffsets                 L -  not recommended for interchange
a   121  FreeByteCounts              L -  not recommended for interchange
a   122  GrayResponseUnit            S 1  for gray-scale images
a   123  GrayResponseCurve           S -  for gray-scale images
b   124  T4Options                   L 1  (group 3 options)
b   125  T6Options                   L 1  (group 4 options)
b   129  PageNumber                  S 2  document storage and retrieval
-   12c  ColorResponseUnit           S 1  [obsoleted in TIFF 6.0]
a   13c  HostComputer                A -  computer/OS used for creation
b   13d  Predictor                   S 1  differencing predictor
a   140  Colormap                    S -  RGB colour map
b   141  HalftoneHints               S 2  half tone hints
b   142  TileWidth                   I 1  tiled images
b   143  TileLength                  I 1  tiled images
b   144  TileOffsets                 L -  tiled images
b   145  TileByteCounts              I -  tiled images
-   146  BadFaxLines                 I 1  [TIFF class F draft]
-   147  CleanFaxData                S 1  [TIFF class F draft]
-   148  ConsecutiveBadFaxLines      I 1  [TIFF class F draft]
-   14a  SubIFDs                     L -  [Adobe TIFF technote 1]
b   14c  InkSet                      S 1  CMYK images
b   14d  InkNames                    A -  CMYK images
b   14e  NumberOfInks                S 1  CMYK images
b   150  DotRange                    i -  CMYK images
b   151  TargetPrinter               A -  CMYK images
a   152  ExtraSamples                S -  pixel extra components
b   153  SampleFormats               S -  data sample format
b   154  SMinSampleValue             - -  data sample format
b   155  SMaxSampleValue             - -  data sample format
b   156  TransferRange               S 6  image colourimetry
-   157  ClipPath                    B -  [Adobe TIFF technote 2]
-   158  XClipPathUnits              D 1  [Adobe TIFF technote 2]
-   159  YClipPathUnits              D 1  [Adobe TIFF technote 2]
-   15a  Indexed                     S 1  [Adobe TIFF technote 3]
-   15b  JPEGTables                  - -  [update (1995) for JPEG-in-TIFF]
-   15f  OPIProxy                    S 1  [Adobe TIFF technote (OPI)]
b   200  JPEGProc                    S 1  JPEG support
b   203  JPEGRestartInterval         S 1  JPEG support
b   205  JPEGLosslessPredictors      S -  JPEG support
b   206  JPEGPointTransforms         S -  JPEG support
b   207  JPEGQTables                 L -  JPEG support
b   208  JPEGDCTables                L -  JPEG support
b   209  JPEGACTables                L -  JPEG support
-   2bc  XML_Packet                  B -  [Adobe XMP technote 9-14-02]

Exif tags assigned to companies for IFD0 and IFD1

The tags listed in the following table, all with a value larger than 0x8000 (i.e. 32768), were requested by individual companies and assigned to them by the TIFF committee. At least, I think so: it is very difficult to find an official list for these tags, so they should be considered at the level of "rumours". This list also includes some TIFF/IT tags from ISO 12639 and some TIFF/EP tags from ISO 12234 (private Exif tags in JPEG APP1 originated from TIFF/EP, so there is a large intersection: TIFF/EP tags which are also Exif are not listed here).

Hexadecimal code                count  notes
   |  Tag name                 type |  |
   |  |                           | |  |
800d  ImageID                     A -  [Adobe TIFF technote            (OPI)]
80b9  RefPts                      ? ?  [Island Graphics                     ]
80ba  RegionTackPoint             ? ?  [Island Graphics                     ]
80bb  RegionWarpCorners           ? ?  [Island Graphics                     ]
80bc  RegionAffine                ? ?  [Island Graphics                     ]
80e3  Matteing                    S 1  [SGI      (obsoleted by ExtraSamples)]
80e4  DataType                    S -  [SGI      (obsoleted by SampleFormat)]
80e5  ImageDepth                  I 1  [SGI                    (z dimension)]
80e6  TileDepth                   I 1  [SGI               (subvolume tiling)]
8214  ImageFullWidth              L 1  [Pixar               (cropped images)]
8215  ImageFullLength             L 1  [Pixar               (cropped images)]
8216  TextureFormat               A -  [Pixar              (texture formats)]
8217  WrapModes                   A -  [Pixar              (texture formats)]
8218  FovCot                      F 1  [Pixar              (texture formats)]
8219  MatrixWorldToScreen         F 16 [Pixar              (texture formats)]
821a  MatrixWorldToCamera         F 16 [Pixar              (texture formats)]
827d  WriterSerialNumber          ? ?  [Eastman Kodak (device serial number)]
828d  CFARepeatPatternDim         S 2  [             ISO/DIS 12234-2 TIFF/EP]
828e  CFAPattern                  B -  [             ISO/DIS 12234-2 TIFF/EP]
828f  BatteryLevel               RA 1- [             ISO/DIS 12234-2 TIFF/EP]
830e  ModelPixelScaleTag          D 3  [SoftDesk                   (GeoTIFF)]
83bb  IPTC/NAA                   LA -  [             ISO/DIS 12234-2 TIFF/EP]
8480  IntergraphMatrixTag         D 16 [Intergraph, deprecated     (GeoTIFF)]
8482  ModelTiepointTag            D -  [Intergraph, aka Georef.Tag (GeoTIFF)]
84e0  Site                        A -  [               ISO/DIS 12639 TIFF/IT]
84e1  ColorSequence               A -  [               ISO/DIS 12639 TIFF/IT]
84e2  IT8Header                   A -  [               ISO/DIS 12639 TIFF/IT]
84e3  RasterPadding               S 1  [               ISO/DIS 12639 TIFF/IT]
84e4  BitsPerRunLength            S 1  [               ISO/DIS 12639 TIFF/IT]
84e5  BitsPerExtendedRunLength    S 1  [               ISO/DIS 12639 TIFF/IT]
84e6  ColorTable                  B -  [               ISO/DIS 12639 TIFF/IT]
84e7  ImageColorIndicator         B 1  [               ISO/DIS 12639 TIFF/IT]
84e8  BackgroundColorIndicator    B 1  [               ISO/DIS 12639 TIFF/IT]
84e9  ImageColorValue             B 1  [               ISO/DIS 12639 TIFF/IT]
84ea  BackgroundColorValue        B 1  [               ISO/DIS 12639 TIFF/IT]
84eb  PixelIntensityRange         B 2  [               ISO/DIS 12639 TIFF/IT]
84ec  TransparencyIndicator       B 1  [               ISO/DIS 12639 TIFF/IT]
84ed  ColorCharacterization       A -  [               ISO/DIS 12639 TIFF/IT]
84ee  HCUsage                     L 1  [               ISO/DIS 12639 TIFF/IT]
84ef  TrapIndicator               B 1  [               ISO/DIS 12639 TIFF/IT]
84f0  CMYKEquivalent              i -  [               ISO/DIS 12639 TIFF/IT]
84f1  Reserved_TIFF_IT_1          - -  [               ISO/DIS 12639 TIFF/IT]
84f2  Reserved_TIFF_IT_2          - -  [               ISO/DIS 12639 TIFF/IT]
84f3  Reserved_TIFF_IT_3          - -  [               ISO/DIS 12639 TIFF/IT]
85b8  FrameCount                  L 1  [Texas Instruments   (Sequence Count)]
85d8  ModelTransformationTag      D 16 [JPL Cartogr. App. Group    (GeoTIFF)]
8649  PhotoshopImageResources     B ?  [Adobe                    (Photoshop)]
8773  ICCProfile                  - -  [Inter Colour Consortium    (TIFF/IT)]
87af  GeoKeyDirectoryTag          S -  [SPOT Image Inc.            (GeoTIFF)]
87b0  GeoDoubleParamsTag          D -  [SPOT Image Inc.            (GeoTIFF)]
87b1  GeoAsciiParamsTag           A -  [SPOT Image Inc.            (GeoTIFF)]
87be  JBIGOptions                 ? ?  [Pixel Magic                         ]
8829  Interlace                   S 1  [             ISO/DIS 12234-2 TIFF/EP]
882a  TimeZoneOffset             SS -  [             ISO/DIS 12234-2 TIFF/EP]
882b  SelfTimerMode               S 1  [             ISO/DIS 12234-2 TIFF/EP]
885c  FaxRecvParams               L 1  [SGI                    (fax support)]
885d  FaxSubAddress               A -  [SGI                    (fax support)]
885e  FaxRecvTime                 L 1  [SGI                    (fax support)]
8871  FedExEDR                    ? ?  [FedEx                               ]
920b  FlashEnergy                 R -  [             ISO/DIS 12234-2 TIFF/EP]
920c  SpatialFrequencyResponse    - -  [             ISO/DIS 12234-2 TIFF/EP]
920d  Noise                       - -  [             ISO/DIS 12234-2 TIFF/EP]
920e  FocalPlaneXResolution       R 1  [             ISO/DIS 12234-2 TIFF/EP]
920f  FocalPlaneYResolution       R 1  [             ISO/DIS 12234-2 TIFF/EP]
9210  FocalPlaneResolutionUnit    S 1  [             ISO/DIS 12234-2 TIFF/EP]
9211  ImageNumber                 L 1  [             ISO/DIS 12234-2 TIFF/EP]
9212  SecurityClassification      A -  [             ISO/DIS 12234-2 TIFF/EP]
9213  ImageHistory                A -  [             ISO/DIS 12234-2 TIFF/EP]
9215  ExposureIndex               R -  [             ISO/DIS 12234-2 TIFF/EP]
9216  TIFF/EPStandardID           B 4  [             ISO/DIS 12234-2 TIFF/EP]
9217  SensingMethod               S 1  [             ISO/DIS 12234-2 TIFF/EP]
923f  StoNits                     D 1  [SGI                (LogLuv Encoding)]
935c  ImageSourceData             - -  [Adobe Photoshop                     ]
c4a5  PrintIM_Data                ? ?  [Epson                               ]
c44f  PhotoshopAnnotations        ? ?  [Adobe Photoshop                     ]
ffff  DCSHueShiftValues           ? ?  [Eastman Kodak                       ]

Exif tags for the 0th IFD Exif private subdirectory

The tags listed in the following table are all the Exif 2.2 private tags, i.e., those which populate the 0th IFD SubIFD; they are described at length in the EXIF standard document (but see also the non-standard Photoshop SubIFD tags at the end of this section). The 'class' column carries the tag class; possible values are: a (tags relating to version), b (image data characteristics), c (image configuration), d (user information), e (related file information), f (date and time), g (picture taking conditions) and h (other Exif 2.2 tags). The two following columns show tag hexadecimal codes and names. The 'type' column specifies the tag type: I (short or long), S (short), L (long), R (rational), SR (signed rational), U (undefined) and A (ASCII, always null terminated). The 'count' column obviously carries the tag count ('-' means that it is variable).

The 'SubIFD' column specifies the support level; it comprises four letters, because the primary image (IFD0) can come in four varieties (uncompressed chunky, uncompressed planar, uncompressed YCC and JPEG compressed). This module currently focuses only on JPEG pictures (not TIFF), so only the fourth letter is interesting. The support level codes stand for: M (mandatory), R (recommended), O (optional), and N (not recorded). Tags marked as 'calculated' in the notes must not be set by the module user, since they concern offsets and data types (which are calculated automatically, more reliably).

    Hexadecimal code                count SubIFD notes
 class |  Tag name                 type | support|
 |     |  |                           | |   |    |
 g  829a  ExposureTime                R 1   RRRR in seconds
 g  829d  FNumber                     R 1   OOOO see note 1)
 g  8822  ExposureProgram             S 1   OOOO valid values are 0-8
 g  8824  SpectralSensitivity         A -   OOOO see ASTM technical committee
 g  8827  ISOSpeedRatings             S -   OOOO see ISO 12232
 g  8828  OECF                        U -   OOOO see ISO 14524
 a  9000  ExifVersion                 U 4   MMMM see note 2)
 f  9003  DateTimeOriginal            A 20  OOOO see note 3)
 f  9004  DateTimeDigitized           A 20  OOOO see note 3)
 c  9101  ComponentsConfiguration     U 4   NNNM see note 4)
 c  9102  CompressedBitsPerPixel      R 1   NNNO compression rate
 g  9201  ShutterSpeedValue          SR 1   OOOO see note 1)
 g  9202  ApertureValue               R 1   OOOO see note 1)
 g  9203  BrightnessValue            SR 1   OOOO see note 1)
 g  9204  ExposureBiasValue          SR 1   OOOO see note 1)
 g  9205  MaxApertureValue            R 1   OOOO smallest ApertureValue
 g  9206  SubjectDistance             R 1   OOOO in meters
 g  9207  MeteringMode                S 1   OOOO valid values are 0-6 and 255
 g  9208  LightSource                 S 1   OOOO use 0-4,9-15,17-24 or 255
 g  9209  Flash                       S 1   RRRR see note 5)
 g  920a  FocalLength                 R 1   OOOO in millimetres
 g  9214  SubjectArea                 S -   OOOO see note 6)
 d  927c  MakerNote                   U -   OOOO maker-specific (not checked)
 d  9286  UserComment                 U -   OOOO see note 7)
 f  9290  SubSecTime                  A -   OOOO see note 8)
 f  9291  SubSecTimeOriginal          A -   OOOO see note 8)
 f  9292  SubSecTimeDigitized         A -   OOOO see note 8)
 a  a000  FlashpixVersion             U 4   MMMM see note 2)
 b  a001  ColorSpace                  S 1   MMMM valid values are 1 and 65535
 c  a002  PixelXDimension             I 1   NNNM picture X-dim from SOF
 c  a003  PixelYDimension             I 1   NNNM picture Y-dim from SOF
 e  a004  RelatedSoundFile            A 13  OOOO see note 9)
 h  a005  InteroperabilityOffset      L 1   NNNO calculated by the module
 g  a20b  FlashEnergy                 R 1   OOOO in BCPS
 g  a20c  SpatialFrequencyResponse    U -   OOOO see ISO 12233
 g  a20e  FocalPlaneXResolution       R 1   OOOO .
 g  a20f  FocalPlaneYResolution       R 1   OOOO .
 g  a210  FocalPlaneResolutionUnit    S 1   OOOO valid values are 2 and 3
 g  a214  SubjectLocation             S 2   OOOO in pixels
 g  a215  ExposureIndex               R 1   OOOO .
 g  a217  SensingMethod               S 1   OOOO valid values are 1-5,7 and 8
 g  a300  FileSource                  U 1   OOOO only allowed value is 3
 g  a301  SceneType                   U 1   OOOO only allowed value is 1
 g  a302  CFAPattern                  U -   OOOO see note 10)
 g  a401  CustomRendered              S 1   OOOO valid values are 0 and 1
 g  a402  ExposureMode                S 1   RRRR valid values are 0,1 and 2
 g  a403  WhiteBalance                S 1   RRRR valid values are 0 and 1
 g  a404  DigitalZoomRatio            R 1   OOOO .
 g  a405  FocalLengthIn35mmFilm       S 1   OOOO .
 g  a406  SceneCaptureType            S 1   RRRR valid values are 0,1,2 and 3
 g  a407  GainControl                 S 1   OOOO valid values are 0,1,2,3 & 4
 g  a408  Contrast                    S 1   OOOO valid values are 0,1 and 2
 g  a409  Saturation                  S 1   OOOO valid values are 0,1 and 2
 g  a40a  Sharpness                   S 1   OOOO valid values are 0,1 and 2
 g  a40b  DeviceSettingDescription    U -   OOOO see note 11)
 g  a40c  SubjectDistanceRange        S 1   OOOO valid values are 0,1,2 and 3
 h  a420  ImageUniqueID               A 33  OOOO matches /[0-9a-fA-F]+\000+/

 Notes:
 1) The camera information in the Exif standard conforms to the APEX
    (Additive System of Photographic Exposure) unit system. APEX is a
    convenient unit for expressing exposure (Ev). The relation of APEX
    to other units is essentially as follows:
    --------------------------------------------------------
    ApertureValue     (Av) = 2 log2(FNumber)
    ShutterSpeedValue (Tv) = - log2(ExposureTime)
    BrightnessValue   (Bv) =   log2(Brightness) + constant
    FilmSensitivity   (Sv) =   log2(ASA/3.125) [not in Exif]
    Exposure          (Ev) = Av + Tv = Bv + Sv
    --------------------------------------------------------
 2) A version tag is a sequence of four numerical characters representing
    the supported version of the standard (e.g., '0220' for version 2.2).
    Possible versions for Exif: 1.0, 1.1, 2.0, 2.1, 2.2 and 2.2.1.
    Possible versions for Flashpix: 1.0.
 3) A date-time tag value is a null terminated string of the form
    "YYYY:MM:DD HH:MM:SS" (note the space in the middle and the colon
    signs). If the tag is set, but the value is not meaningful, all
    numbers are set to spaces (replacing also the colons with spaces
    is permitted too).
 4) This tag indicates the channels of each component, arranged in order
    from the 1st component to the 4th. For uncompressed data the data
    arrangement is given in the PhotometricInterpretation tag. The four
    numeric characters must be in the range '0' - '6', and legal combina-
    tions are '4560' (if RGB uncompressed) and '1230' (all other cases).
 5) This tag indicates the status of flash when the image was shot.
    Bit 0 indicates the flash firing status, bits 1 and 2 indicate the
    flash return status, bits 3 and 4 indicate the flash mode, bit 5
    indicates whether the flash function is present, and bit 6 indicates
    "red eye" mode. The allowed decimal values for the bitmask are
    therefore 0, 1, 5, 7, 9, 13, 15, 16, 24, 25, 29, 31, 32, 65, 69,
    71, 73, 77, 79, 89, 93 and 95.
 6) This tag indicates the location and area of the main subject in the
    overall scene. Count can be 2 (a spot defined by two coordinates),
    3 (a circle defined by centre coordinates and diameter) and
    4 (a rectangle defined by its centre coordinates and its dimensions).
 7) The 'UserComment' tag must start with an 8 byte "ID code", which
    can be "ASCII\00\00\00", "JIS\00\00\00\00\00", "Unicode" or eight
    null bytes for "undefined". The ID code identifies the character
    code to be used in the following. A null terminator is not required.
 8) A sub-second-time tag value represents a fraction of a second, relative
    to the DateTime tag and other such tags, as an ASCII null-terminated
    string made of numeric characters; an arbitrary number of spaces can
    be appended to the numeric characters string. If sub-second data is not
    known the tag value may contain only spaces. The corresponding regular
    expression is /\d*\s*\000/.
 9) This tag is used to record the name of an audio file related to the
    image data: an ASCII string consisting of 8 characters + '.' + 3
    characters, terminated by NULL. The path is not recorded.
    The corresponding regular expression is /\w{8}\.\w{3}\000/.
10) This tag indicates the colour filter array (CFA) geometric pattern of
    the image sensor when a one-chip colour area sensor is used. The first
    four bytes must be interpreted as two shorts giving the horizontal (m)
    and vertical (n) repeat pixel units. Then, m x n bytes follow, giving
    the actual colour filter values (in the range 0-6).
11) This tag indicates information on the picture-taking conditions of a
    particular camera model, for a reader. The first four bytes must be
    interpreted as two shorts giving the number of display rows and columns.
    The following bytes must be interpreted as Unicode (UCS-2) streams,
    NULL terminated and including the signature. The specifics of the
    Unicode string are as given in ISO/IEC 10646-1. An approximation to
    the corresponding regular expression is /.{4}(\376\377(.{2})*\000\000)*/.
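
The relations in note 1) and the bit layout in note 5) can be sketched in a few lines of plain Perl. This is only an illustration and not part of this module: the function names (log2, aperture_value, shutter_speed_value, exposure_value, decode_flash) and the hash keys are hypothetical, and the code assumes that rational values have already been converted to ordinary numbers.

```perl
use strict;
use warnings;

# log2 is not a Perl builtin; derive it from the natural logarithm.
sub log2 { return log($_[0]) / log(2) }

# ApertureValue (Av) = 2 log2(FNumber)
sub aperture_value { my ($fnumber) = @_; return 2 * log2($fnumber) }

# ShutterSpeedValue (Tv) = - log2(ExposureTime), with the time in seconds
sub shutter_speed_value { my ($seconds) = @_; return -log2($seconds) }

# Exposure (Ev) = Av + Tv
sub exposure_value {
    my ($fnumber, $seconds) = @_;
    return aperture_value($fnumber) + shutter_speed_value($seconds);
}

# Split a Flash bitmask into the fields listed in note 5).
# The hash keys are illustrative names, not Exif terminology.
sub decode_flash {
    my ($flash) = @_;
    return {
        fired    =>  $flash       & 0x01,   # bit 0: firing status
        returned => ($flash >> 1) & 0x03,   # bits 1 and 2: return status
        mode     => ($flash >> 3) & 0x03,   # bits 3 and 4: flash mode
        function => ($flash >> 5) & 0x01,   # bit 5: flash function flag
        red_eye  => ($flash >> 6) & 0x01,   # bit 6: "red eye" mode
    };
}

# f/8 at 1/125 s
printf "Ev = %.2f\n", exposure_value(8, 1 / 125);

# 25 is one of the allowed Flash values
my $flash = decode_flash(25);
print "fired=$flash->{fired} mode=$flash->{mode}\n";
```

Since APEX values are logarithms, comparing them against stored tag values is best done with a small tolerance rather than with exact equality.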

Adobe's Photoshop program, at least from version 7.0 on, seems to add some non-standard tags to the Exif private tags subdirectory when processing raw camera pictures. The corresponding record values are all ASCII strings ($ASCII type), and each of them begins with a description of the tag itself. The following list may be incomplete:

Hexadecimal code      count
   |  Tag name     type |   begins with:
   |  |               | |   |
fde8  _OwnerName      A -   "Owner's Name: "  + null terminated string
fde9  _SerialNumber   A -   "Serial Number: " + null terminated string
fdea  _Lens           A -   "Lens: "          + null terminated string
fe4c  _RawFile        A -   "Raw File: "      + null terminated string
fe4d  _Converter      A -   "Converter: "     + null terminated string
fe4e  _WhiteBalance   A -   "White Balance: " + null terminated string
fe51  _Exposure       A -   "Exposure: "      + null terminated string
fe52  _Shadows        A -   "Shadows: "       + null terminated string
fe53  _Brightness     A -   "Brightness: "    + null terminated string
fe54  _Contrast       A -   "Contrast: "      + null terminated string
fe55  _Saturation     A -   "Saturation: "    + null terminated string
fe56  _Sharpness      A -   "Sharpness: "     + null terminated string
fe57  _Smoothness     A -   "Smoothness: "    + null terminated string
fe58  _MoireFilter    A -   "Moire Filter: "  + null terminated string
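
Since each of these values carries its own description, a reader may want to separate the descriptive prefix from the actual content. The following is a minimal sketch in plain Perl, assuming the value is available as a raw string with its null terminator; the function name split_described_value and the sample strings are hypothetical and are not part of this module.

```perl
use strict;
use warnings;

# Split a record value such as "Lens: <text>\0" into the self-describing
# label and the content that follows it. Returns an empty list if the
# value does not look like "Label: content".
sub split_described_value {
    my ($raw) = @_;
    (my $text = $raw) =~ s/\000+\z//;        # drop the null terminator
    my ($label, $content) = $text =~ /\A([^:]+): (.*)\z/s
        or return;
    return ($label, $content);
}

# Made-up sample value, for illustration only
my ($label, $content) = split_described_value("Lens: 24.0-70.0 mm\000");
print "$label => $content\n";
```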

Exif tags for the 0th IFD Interoperability subdirectory

If the main image is compressed (which is always the case for a JPEG picture), the "Design rule for Camera File system" recommendations suggest adding another IFD below SubIFD, the Interoperability IFD, pointed to by the InteroperabilityOffset tag; legal tags are listed in the following table. The first two columns show tag hexadecimal codes and names. The 'type' column specifies the tag type: I (short or long), U (undefined) and A (ASCII, always null terminated). The 'count' column specifies the value count ('-' means that it is variable). The "Index" and "Version" tags are mandatory if the subIFD is present, and they are automatically added by this module if necessary.

Hexadecimal code                count SubIFD notes
   |  Tag name                 type | suppt. |
   |  |                           | |   |    |
0001  InteroperabilityIndex       A 4 NNNM   R98 (THM would work for IFD1)
0002  InteroperabilityVersion     U 4 NNNM   e.g. '0100' means 1.00
1000  RelatedImageFileFormat      A - NNNO   e.g. 'Exif JPEG Ver. 2.1'
1001  RelatedImageWidth           I 1 NNNO   image X dimension
1002  RelatedImageLength          I 1 NNNO   image Y dimension

Exif tags for the 0th IFD GPS subdirectory

The following tags are used for GPS attributes in the GPS IFD, pointed to (if present) by the GPSInfo tag in IFD0 or IFD1. This standard was already used in TIFF/EP, and is now part of Exif 2.2. The first two columns show tag hexadecimal codes and names. The 'type' column specifies the tag type: B (byte), S (short), R (rational), U (undefined) and A (ASCII, always null terminated). The 'count' column specifies the value count ('-' means that it is variable). All GPS tags are optional in a JPEG or TIFF file, but the 'VersionID' tag must be present if the GPS IFD is present (a default 'VersionID' = (2,2,0,0), i.e. v.2.2, is automatically added by this module if necessary).

   Hexadecimal code                count   notes
      |  Tag name                 type |   |
      |  |                           | |   |
     00  GPSVersionID                B 4   mandatory
     01  GPSLatitudeRef              A 2   see note 1)
     02  GPSLatitude                 R 3   see note 2)
     03  GPSLongitudeRef             A 2   see note 1)
     04  GPSLongitude                R 3   see note 2)
     05  GPSAltitudeRef              B 1   0 (sea level) or 1 (absolute)
     06  GPSAltitude                 R 1   in metres
     07  GPSTimeStamp                R 3   hours, minutes and seconds
     08  GPSSatellites               A -   satellites used for measurement
     09  GPSStatus                   A 2   'A' (in progr.) or 'V' (interop.)
     0a  GPSMeasureMode              A 2   '2' (2-dim) or '3' (3-dim)
     0b  GPSDOP                      R 1   data degree of precision
     0c  GPSSpeedRef                 A 2   see note 3)
     0d  GPSSpeed                    R 1   speed of the GPS receiver
     0e  GPSTrackRef                 A 2   see note 4)
     0f  GPSTrack                    R 1   see note 5)
     10  GPSImgDirectionRef          A 2   see note 4)
     11  GPSImgDirection             R 1   see note 5)
     12  GPSMapDatum                 A -   geodetic survey data
     13  GPSDestLatitudeRef          A 2   see note 1)
     14  GPSDestLatitude             R 3   see note 2)
     15  GPSDestLongitudeRef         A 2   see note 1)
     16  GPSDestLongitude            R 3   see note 2)
     17  GPSDestBearingRef           A 2   see note 4)
     18  GPSDestBearing              R 1   see note 5)
     19  GPSDestDistanceRef          A 2   see note 3)
     1a  GPSDestDistance             R 1   distance to the destination point
     1b  GPSProcessingMethod         U -   see note 6), location finding
     1c  GPSAreaInformation          U -   see note 6), name of the GPS area
     1d  GPSDateStamp                A 11  YYYY:MM:DD
     1e  GPSDifferential             S 1   0 (without) or 1 (with) diff.corr.

Notes:
1) A latitude or longitude reference specifies a sign for another
   (related) latitude or longitude value tag. A latitude reference can be
   only 'N' (for North) or 'S' (for South); a longitude reference can be
   only 'E' (for East) or 'W' (for West).
2) A latitude or a longitude is stored as a sequence of three rational
   numbers (each rational number is the ratio of two unsigned long
   integers), representing degrees, minutes and seconds. A typical format
   is (dd/1, mm/1, ss/1). Sometimes, seconds are dropped in favour of
   fractions of minutes (usually with two decimal places); in this case
   the format is (dd/1, mmmm/100, 0/1).
3) A "speed (distance) reference" is the unit for the speed (distance)
   value stored in another (related) tag. The only allowed values are 'K'
   (for km/h or km), 'M' (for miles/h or miles) and 'N' (for knots or
   nautical miles; a knot is one nautical mile per hour).
4) A direction reference specifies how to interpret a following direction
   value. Only two references are possible: 'T' (for the true direction)
   or 'M' (for the magnetic direction).
5) A direction (of the pointed image, of the movement of the GPS
   receiver, etc.) is a decimal number specifying an angle. The
   allowed range is between 0.00 and 359.99.
6) The processing method and the area information are character strings,
   whose first character specifies the character code used: this is the
   first character of the 8-byte character code identification in the
   'UserComment' tag in the SubIFD, so 'A' means ASCII, 'J' means JIS,
   'U' means Unicode and a null character means undefined. Since the type
   is not ASCII, null termination is not required.
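
Notes 1) and 2) together describe how to turn a latitude or longitude into a signed decimal number of degrees. The sketch below shows the arithmetic in plain Perl. It is not part of this module: the function name gps_to_decimal is hypothetical, and the code assumes the three rationals are supplied as a flat list of six integers (numerator, denominator, ...).

```perl
use strict;
use warnings;

# Convert a latitude or longitude to signed decimal degrees.
# $ref  : 'N', 'S', 'E' or 'W' (a trailing null is tolerated)
# @pairs: six integers, i.e. three numerator/denominator pairs
#         (assumed layout) for degrees, minutes and seconds
sub gps_to_decimal {
    my ($ref, @pairs) = @_;
    die "expected three rationals\n" unless @pairs == 6;
    my @dms;
    while (my ($num, $den) = splice(@pairs, 0, 2)) {
        die "zero denominator\n" unless $den;
        push @dms, $num / $den;
    }
    my $degrees = $dms[0] + $dms[1] / 60 + $dms[2] / 3600;
    # South and West are negative by the usual convention
    return $ref =~ /\A[SW]/ ? -$degrees : $degrees;
}

# (dd/1, mm/1, ss/1) format
printf "%.4f\n", gps_to_decimal('N', 45, 1, 30, 1, 36, 1);

# (dd/1, mmmm/100, 0/1) format, with fractions of minutes
printf "%.4f\n", gps_to_decimal('W', 9, 1, 1530, 100, 0, 1);
```

Both formats mentioned in note 2) go through the same code path, because each rational is divided out before the degrees, minutes and seconds are combined.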

VALID TAGS FOR PHOTOSHOP-STYLE APP13 DATA

The structure of a Photoshop-style APP13 segment is introduced in section "STRUCTURE OF A PHOTOSHOP-STYLE APP13 SEGMENT". This section contains only the list of valid Image Resource ID's; note that not all file formats use all ID's, and some information may be stored somewhere else in the file. In the following list 'PS' stands for Photoshop, and 'Pstring' for Pascal string:

  Hexadecimal code                 notes
     |  Tag name                   |
     |  |                          |
   3e8  Photoshop2Info             [obsolete] (PS.2.0) General information
   3e9  MacintoshPrintInfo         [optional] Macintosh print manager info
   3eb  Photoshop2ColorTable       [obsolete] (PS.2.0) Indexed colour table
   3ed  ResolutionInfo             see appendix A in Photoshop SDK
   3ee  AlphaChannelsNames         as a series of Pstrings
   3ef  DisplayInfo                see appendix A in Photoshop SDK
   3f0  PStringCaption             [optional] the caption, as a Pstring
   3f1  BorderInformation          border width and units 
   3f2  BackgroundColor            see additional Adobe information
   3f3  PrintFlags                 labels, crop marks, colour bars, etc.
   3f4  BWHalftoningInfo           Gray-scale and multich. half-toning info
   3f5  ColorHalftoningInfo        Colour half-toning information
   3f6  DuotoneHalftoningInfo      Duo-tone half-toning information
   3f7  BWTransferFunc             Gray-scale and multich. transfer function
   3f8  ColorTransferFuncs         Colour transfer function
   3f9  DuotoneTransferFuncs       Duo-tone transfer function
   3fa  DuotoneImageInfo           Duo-tone image information
   3fb  EffectiveBW                effective black and white values
   3fc  ObsoletePhotoshopTag1      [obsolete] ??
   3fd  EPSOptions                 Encapsulated Postscript options
   3fe  QuickMaskInfo              channel ID plus initial state flag
   3ff  ObsoletePhotoshopTag2      [obsolete] ??
   400  LayerStateInfo             index of target layer (0 means bottom)
   401  WorkingPathInfo            should not be saved to the file
   402  LayersGroupInfo            for grouping layers together
   403  ObsoletePhotoshopTag3      [obsolete] ??
   404  IPTC/NAA                   see "Valid tags for IPTC data"
   405  RawImageMode               image mode for raw format files
   406  JPEGQuality                [private]
   408  GridGuidesInfo             see additional Adobe information
   409  ThumbnailResource          see additional Adobe information
   40a  CopyrightFlag              true if image is copyrighted
   40b  URL                        text string with a resource locator
   40c  ThumbnailResource2         see additional Adobe information
   40d  GlobalAngle                global lighting angle for effects layer
   40e  ColorSamplersResource      see additional Adobe information
   40f  ICCProfile                 see notes from Internat. Color Consortium
   410  Watermark                  one byte
   411  ICCUntagged                1 means intentionally untagged
   412  EffectsVisible             1 byte to show/hide all effects layers
   413  SpotHalftone               version, length and data
   414  IDsBaseValue               base value for new layers ID's
   415  UnicodeAlphaNames          length plus Unicode string
   416  IndexedColourTableCount    (PS.6.0) 2 bytes
   417  TransparentIndex           (PS.6.0) 2 bytes
   419  GlobalAltitude             (PS.6.0) 4 bytes
   41a  Slices                     (PS.6.0) see additional Adobe info
   41b  WorkflowURL                (PS.6.0) 4 bytes length + Unicode string
   41c  JumpToXPEP                 (PS.6.0) see additional Adobe info
   41d  AlphaIdentifiers           (PS.6.0) 4*(n+1) bytes
   41e  URLList                    (PS.6.0) structured Unicode URL's
   421  VersionInfo                (PS.6.0) see additional Adobe info
7d0-bb6 PathInfo_%3x               see additional Adobe info (saved path)
   bb7  ClippingPathName           see additional Adobe info
  2710  PrintFlagsInfo             see additional Adobe info

Valid tags for IPTC data

The structure of an IPTC stream is introduced in section "Structure of an IPTC data block". This section contains only the list of valid editorial IPTC tags (2:xx, application records). Numeric tag values (record keys), in the first column, are in decimal notation, and they are followed by tag names in the second column. The presence of 'N' in the third column means that the record is non-repeatable (i.e., there should not be two such records in the file). The following number or range in square brackets indicates valid lengths for the record data field. The final comment specifies additional format constraints, sometimes in natural language: "/regex/" means that the string must match the regular expression regex; "invalid" means that this valid IPTC tag is not used in JPEG pictures; other formats are specified in the notes. Note that IPTC strings are stored in records with an explicit length, so they do not need the final null character (they are not C-strings).

Decimal code                          size     notes
   |  Tag name             repeatable |        |
   |  |                             | |        |
   0  RecordVersion                 N [  2   ] binary, always 2 in JPEGs ?
   3  ObjectTypeReference           N [ 3-67 ] /\d{2}?:[\w\s]{0,64}?/
   4  ObjectAttributeReference        [ 4-68 ] /\d{3}?:[\w\s]{0,64}?/
   5  ObjectName                    N [ <=64 ] line (see note 1)
   7  EditStatus                    N [ <=64 ] line (see note 1)
   8  EditorialUpdate               N [  2   ] /01/
  10  Urgency                       N [  1   ] /[1-8]/
  12  SubjectReference                [13-236] complicated, see note 5
  15  Category                      N [ <=3  ] /[a-zA-Z]{1,3}?/
  20  SupplementalCategory            [ <=32 ] line (see note 1)
  22  FixtureIdentifier             N [ <=32 ] line without spaces
  25  Keywords                        [ <=64 ] line (see note 1)
  26  ContentLocationCode             [  3   ] /[A-Z]{3}?/
  27  ContentLocationName             [ <=64 ] line (see note 1)
  30  ReleaseDate                   N [  8   ] date (see note 2)
  35  ReleaseTime                   N [ 11   ] time (see note 3)
  37  ExpirationDate                N [  8   ] date (see note 2)
  38  ExpirationTime                N [ 11   ] time (see note 3)
  40  SpecialInstructions           N [ <=256] line (see note 1)
  42  ActionAdvised                 N [  2   ] /0[1-4]/
  45  ReferenceService                [ 10   ] "invalid" like 1:30
  47  ReferenceDate                   [  8   ] "invalid" like 1:70
  50  ReferenceNumber                 [  8   ] "invalid" like 1:40
  55  DateCreated                   N [  8   ] date (see note 2)
  60  TimeCreated                   N [ 11   ] time (see note 3)
  62  DigitalCreationDate           N [  8   ] date (see note 2)
  63  DigitalCreationTime           N [ 11   ] time (see note 3)
  65  OriginatingProgram            N [ 32   ] line (see note 1)
  70  ProgramVersion                N [ <=10 ] line (see note 1)
  75  ObjectCycle                   N [  1   ] /a|p|b/
  80  ByLine                          [ <=32 ] line (see note 1)
  85  ByLineTitle                     [ <=32 ] line (see note 1)
  90  City                          N [ <=32 ] line (see note 1)
  92  SubLocation                   N [ <=32 ] line (see note 1)
  95  Province/State                N [ <=32 ] line (see note 1)
 100  Country/PrimaryLocationCode   N [  3   ] /[A-Z]{3}?/
 101  Country/PrimaryLocationName   N [ <=64 ] line (see note 1)
 103  OriginalTransmissionReference N [ <=32 ] line (see note 1)
 105  Headline                      N [ <=256] line (see note 1)
 110  Credit                        N [ <=32 ] line (see note 1)
 115  Source                        N [ <=32 ] line (see note 1)
 116  CopyrightNotice               N [ <=128] line (see note 1)
 118  Contact                         [ <=128] line (see note 1)
 120  Caption/Abstract              N [<=2000] line with CR and LF 
 122  Writer/Editor                   [ <=32 ] line (see note 1)
 125  RasterizedCaption             N [ 7360 ] binary data (460x128 PBM)
 130  ImageType                     N [  2   ] /[0-49][WYMCKRGBTFLPS]/
 131  ImageOrientation              N [  1   ] /P|L|S/
 135  LanguageIdentifier            N [ 2-3  ] /[a-zA-Z]{2,3}?/
 150  AudioType                     N [  2   ] /[012][ACMQRSTVW]/
 151  AudioSamplingRate             N [  6   ] /\d{6}?/
 152  AudioSamplingResolution       N [  2   ] /\d{2}?/
 153  AudioDuration                 N [  6   ] duration (see note 4)
 154  AudioOutcue                   N [ <=64 ] line (see note 1)
 200  ObjDataPreviewFileFormat      N [  2   ] "invalid" like 1:20, binary
 201  ObjDataPreviewFileFormatVer   N [  2   ] "invalid" like 1:22, binary
 202  ObjDataPreviewData            N [<=256000B] "invalid", binary

 Notes:
 1) A "line" is made of printable characters from the ASCII table, i.e. all
    codes from "space" on, excluding the "delete" character. As a regular
    expression, this corresponds to /^[^\000-\037\177]*$/.
 2) A date is stored, following the ISO 8601 standard, as the eight character
    string 'CCYYMMDD', e.g. '19890317' indicates March 17th 1989. As a regular
    expression, this corresponds to /[0-2]\d\d\d(0\d|1[0-2])([0-2]\d|3[01])/.
 3) A time is stored, following the ISO 8601 standard, as the eleven character
    string 'HHMMSS+/-HHMM', e.g. '090000-0500' indicates 9AM, 5 hours behind
    the coordinated universal time. As a regular expression, this corresponds
    to /([01]\d|2[0-3])[0-5]\d[0-5]\d[\+-]([01]\d|2[0-3])[0-5]\d/.
 4) A "duration" is stored like a "time", but there is no time zone spec;
    this means that the string is only six characters wide (see also note 3).
 5) The complicated regular expression for the SubjectReference is the
    following: /[$validchar]{1,32}?:[01]\d{7}?(:[$validchar\s]{0,64}?){3}?/,
    where $validchar is '\040-\051\053-\071\073-\076\100-\176'.
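
The regular expressions in notes 1) to 3) can be used directly to check candidate values before writing them. The following sketch, in plain Perl, anchors them at both ends so that the whole string must match. The names %iptc_format and iptc_value_ok are hypothetical and not part of this module, and the optional maximum length stands for the upper bound in the 'size' column.

```perl
use strict;
use warnings;

# Anchored versions of the expressions given in notes 1) to 3).
my %iptc_format = (
    line => qr/\A[^\000-\037\177]*\z/,
    date => qr/\A[0-2]\d\d\d(0\d|1[0-2])([0-2]\d|3[01])\z/,
    time => qr/\A([01]\d|2[0-3])[0-5]\d[0-5]\d[\+-]([01]\d|2[0-3])[0-5]\d\z/,
);

# Return 1 if $value satisfies the named format and, when given,
# does not exceed $max_length characters; return 0 otherwise.
sub iptc_value_ok {
    my ($format, $value, $max_length) = @_;
    my $regex = $iptc_format{$format}
        or die "unknown format '$format'\n";
    return 0 if defined $max_length && length($value) > $max_length;
    return $value =~ $regex ? 1 : 0;
}

for my $check (['date', '19890317'],
               ['time', '090000-0500'],
               ['line', 'Rome', 32]) {
    my ($format, $value, $max) = @$check;
    printf "%-4s %-12s %s\n", $format, $value,
        iptc_value_ok($format, $value, $max) ? 'ok' : 'rejected';
}
```

Note that these expressions only check the shape of a value: the date expression, for example, still accepts a month or a day of '00'.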

AUTHOR

Stefano Bettelli, bettelli@cpan.org

COPYRIGHT AND LICENSE

Copyright (C) 2004 by Stefano Bettelli

This library is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License. See the COPYING and LICENSE files for the license terms.

SEE ALSO

perl(1), perlgpl(1), Image::IPTCInfo(3), JPEG::JFIF(3), Image::Exif(3), Image::Info(3)
