However, this is far from all the metadata embedded in an image. In fact, MediaWiki currently only extracts Exif metadata. Exif is arguably the most popular form of metadata, so if you're only going to extract one, Exif is a good choice. Every time you take a picture with your digital camera, it adds Exif data to your picture. Most of this data is technical: FNumber, shutter speed, camera model, etc. You can also encode things like Artist, copyright, and image description in Exif, but that is much rarer.
First of all, I'm fixing up the Exif support a little bit. Currently some of the Exif tags are not supported (Bug 13172). Most of these are fairly obscure tags no one really cares about, but there are some exceptions like GPSLatitude, GPSLongitude, and UserComment.
I'm also (among other things) adding support for IPTC-IIM tags. IPTC-IIM is a very old format for transmitting news stories between news agencies. Adobe adopted parts of this format for embedding metadata in JPEG files with Photoshop. Nowadays it's slowly being replaced by XMP, but many photos still use it. IPTC metadata tends to be more descriptive (title, author, etc.), whereas Exif metadata is more technical (aperture, shutter speed).
My code will also try to sort out conflicts. Sometimes the different metadata formats contain conflicting values. If an image has two different descriptions in the Exif and IPTC data, which should be displayed? Exif, IPTC, or both? Luckily for me, several companies involved in images got together and thought long and hard about that issue. They then produced a standard for how to act if there is a conflict [1]. For example, if the IPTC and Exif data conflict on the image description, then the Exif data wins.
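The precedence idea can be sketched in a few lines of Python. This is only an illustration, not the actual code from the branch: `PRECEDENCE` and `resolve_field` are hypothetical names, and the only rule taken from the text above is that Exif beats IPTC for the image description. The real standard has further rules that this sketch ignores.

```python
from typing import Dict, Optional

# Assumed precedence for the image description: Exif first, then IPTC.
# Only the "Exif beats IPTC" rule comes from the text; the names are hypothetical.
PRECEDENCE = ("exif", "iptc")


def resolve_field(values: Dict[str, Optional[str]]) -> Optional[str]:
    """Return the value from the highest-priority format that has a non-empty one."""
    for source in PRECEDENCE:
        value = values.get(source)
        if value is not None and value.strip():
            return value.strip()
    return None


# Made-up values, purely to show which one is picked.
print(resolve_field({"exif": "Description from Exif", "iptc": "Description from IPTC"}))
```

If the Exif value is missing or blank, the sketch falls back to the IPTC value instead of showing nothing.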
Consider [[File:2005-09-17 10-01 Provence 641 St Rémy-de-Provence - Glanum.jpg]]
On Commons, the metadata table looks like this:
Camera manufacturer | CASIO COMPUTER CO.,LTD |
---|---|
Camera model | EX-Z55 |
Exposure time | 1/800 sec (0.00125) |
F Number | f/4.3 |
Date and time of data generation | 14:21, 28 September 2005 |
Lens focal length | 5.8 mm |
Orientation | Normal |
Horizontal resolution | 72 dpi |
Vertical resolution | 72 dpi |
Software used | Microsoft Pro Photo Tools |
File change date and time | 14:21, 28 September 2005 |
Y and C positioning | 1 |
Exposure Program | Normal program |
Exif version | 2.21 |
Date and time of digitizing | 14:21, 28 September 2005 |
Image compression mode | 3.6666666666667 |
Exposure bias | 0 |
Maximum land aperture | 2.8 |
Metering mode | Pattern |
Light source | Unknown |
Flash | Flash did not fire, compulsory flash suppression |
Color space | sRGB |
Custom image processing | Normal process |
Exposure mode | Auto exposure |
White balance | Auto white balance |
Focal length in 35 mm film | 35 |
Scene capture type | Standard |
Contrast | Normal |
Saturation | Normal |
Sharpness | Normal |
North or south latitude | North latitude |
East or west longitude | East longitude |
But on my test wiki, the table looks like this:
Camera manufacturer | CASIO COMPUTER CO.,LTD |
---|---|
Camera model | EX-Z55 |
Exposure time | 1/800 sec (0.00125) |
F Number | f/4.3 |
Date and time of data generation | 14:21, 28 September 2005 |
Lens focal length | 5.8 mm |
Latitude | 43° 46′ 21.35″ N |
Longitude | 4° 50′ 1.34″ E |
Orientation | Normal |
Horizontal resolution | 72 dpi |
Vertical resolution | 72 dpi |
Software used | Microsoft Pro Photo Tools |
File change date and time | 14:21, 28 September 2005 |
Y and C positioning | Centered |
Exposure Program | Normal program |
Exif version | 2.21 |
Date and time of digitizing | 14:21, 28 September 2005 |
Meaning of each component | |
Image compression mode | 3.66666666667 |
Exposure bias | 0 |
Maximum land aperture | 2.8 |
Metering mode | Pattern |
Light source | Unknown |
Flash | Flash did not fire, compulsory flash suppression |
Supported Flashpix version | 0,100 |
Color space | sRGB |
File source | DSC |
Custom image processing | Normal process |
Exposure mode | Auto exposure |
White balance | Auto white balance |
Focal length in 35 mm film | 35 |
Scene capture type | Standard |
Scene control | None |
Contrast | Normal |
Saturation | Normal |
Sharpness | Normal |
Most notably, GPS information is now supported. As a note, the Wikipedia links for camera model are a Commons customization, which is why they don't appear in my test output.
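For the curious: Exif stores a coordinate as three rational numbers (degrees, minutes, seconds), with the hemisphere letter in a separate reference tag. Below is a rough Python sketch of turning such raw values into the display form shown in the table above. `format_coordinate` is a hypothetical helper, the raw rationals are made up to reproduce the displayed values, and rounding the seconds to two decimals is an assumption rather than a statement of what MediaWiki does.

```python
from fractions import Fraction


def format_coordinate(dms, ref):
    """Format Exif-style (degrees, minutes, seconds) rationals plus an N/S/E/W ref."""
    degrees, minutes, seconds = (Fraction(n, d) for n, d in dms)
    # Seconds often carry a fractional part; round for display only.
    return f"{int(degrees)}° {int(minutes)}′ {float(seconds):.2f}″ {ref}"


# Rationals are (numerator, denominator) pairs, as stored in Exif.
# These raw values are illustrative, picked to match the table above.
print(format_coordinate([(43, 1), (46, 1), (2135, 100)], "N"))
print(format_coordinate([(4, 1), (50, 1), (134, 100)], "E"))
```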
As another example, consider [[file:Pöstlingbahn TFXV.jpg]]. On Commons, it has no metadata extracted. (It does have some information about the image on the page, but this was all entered by hand.) On my test wiki, the following metadata table is generated:
Image title | Triebfahrzeug Nr. XV der Pöstlingbergbahn bei der Rangierfahrt an der Bergstation |
---|---|
Author | Erich Heuer |
Date and time of data generation | 8 April 2006 |
Copyright holder | http://creativecommons.org/licenses/by-sa/2.0/de/deed.de |
Headline | Pöstlingbergbahn Triebfahrzeug XV |
Special instructions | Eastman Kodak Company, Kodak CX7430; 1/181 sec; F 9.51; Farbmanagement; 640 x 526 Pixel |
Source | Erich Heuer, Dresden |
Object name | Pöstlingbahn TF XV |
City shown | Linz-Pöstlingberg |
Province or state shown | Oberösterreich |
Country shown | Republik Österreich |
Keywords | Bergbahn, Pöstlingbergbahn, Linz |
I'm almost done with IIM metadata, and plan to start working on XMP metadata soon. If you're curious, all the code is currently in the img_metadata branch. You can also look at the status page, which I will try to update occasionally.
Cheers,
Bawolff