Many files exported and imported by Geopsy software are based on XML structures, for instance page files (extension .page).
Those files are compressed TAR archives containing at least one file called contents.xml.
tar xvfz Legend_page.page
Note that option 'z' might not be necessary if your browser recognizes page files as gzip-compressed files and decompresses them silently when downloading. In either case, the extraction produces a file contents.xml. Some archives may also contain binary files named bin_data_1*. The format of those files is currently undocumented; refer to the source code for details.
contents.xml can be viewed in any web browser (e.g. Firefox).
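Before extracting, it can help to list what an archive contains and to unpack it into a separate directory. The following is a minimal sketch, not part of the Geopsy tools: sample.page and the directory name extracted are hypothetical, and the script builds a stand-in archive first so that it runs on its own. With a real file, skip that step and set PAGE to its path.

```shell
#!/bin/sh
set -eu

# Work in a scratch directory so that no existing file is touched.
WORK=$(mktemp -d)
cd "$WORK"

# Stand-in archive (hypothetical content), only there to make the sketch runnable.
printf '<SciFigs><type>Page</type></SciFigs>\n' > contents.xml
tar czf sample.page contents.xml
rm contents.xml

PAGE=sample.page

# List the archive members without extracting anything.
MEMBERS=$(tar tzf "$PAGE")
echo "$MEMBERS"

# Extract into a separate directory so that a contents.xml already present
# in the current directory is not overwritten.
mkdir extracted
tar xzf "$PAGE" -C extracted
ls extracted
```

Extracting into a dedicated directory is a design choice of this sketch: every archive stores a member called contents.xml, so unpacking two files in the same place would overwrite the first one.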
<SciFigs>
  <libVersion>2.3.0</libVersion>
  <type>Page</type>
  <GraphicSheet>
    <LegendWidget objectName="object">
      <objectName>object</objectName>
      <printX>1</printX>
      <printY>1</printY>
      <anchor>TopLeft</anchor>
      ...
The encoding of contents.xml is UTF-16, so it may not be directly editable in a text editor that does not support Unicode. However, most modern editors do:
- Notepad++ (Windows only): UTF-16 is automatically recognized.
- Vim: on most systems, UTF-16 is automatically recognized. If not, follow the instructions below.
- Kate or KWrite: UTF-16 is not automatically recognized up to version 4.4 (later versions not tested), but it can be specified manually in the menu (Tools/Encoding/Unicode/UTF-16) or on the command line:
kate contents.xml --encoding UTF-16
kwrite contents.xml --encoding UTF-16
- nano: UTF-16 might be supported, but a test with version 2.2.4 on a platform with LC_ALL=en_US.UTF8 was not successful.
- Notepad and Wordpad (Windows only): UTF-16 is not supported. Follow the instructions below.
iconv -f UTF-16 -t ASCII contents.xml > tmp && mv tmp contents.xml
Any special characters (non-ASCII characters) are lost in this transformation. This matters only for titles and texts displayed, for instance, in page files, when a language other than English is used. Note that editing the extracted contents.xml does not alter the original file, so all modifications can still be discarded.
Now contents.xml can be manipulated as an ASCII file.
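When titles or texts contain non-ASCII characters, converting to UTF-8 rather than ASCII is one way to keep them, since UTF-8 is also accepted when the archive is repacked. The sketch below is an illustration under assumptions, not a procedure taken from the Geopsy sources: the sample title and the temporary file name contents.utf8.xml are made up, and the script writes its own UTF-16 sample so that it runs on its own.

```shell
#!/bin/sh
set -eu

WORK=$(mktemp -d)
cd "$WORK"

# Hypothetical UTF-16 sample with one accented character
# (\303\251 is the UTF-8 byte sequence of "é").
printf '<title>donn\303\251es</title>\n' | iconv -f UTF-8 -t UTF-16 > contents.xml

# Peek at the first two bytes: a UTF-16 file usually starts with a
# byte order mark, ff fe or fe ff.
od -An -tx1 -N2 contents.xml

# Convert to UTF-8; the original is replaced only if iconv succeeds.
iconv -f UTF-16 -t UTF-8 contents.xml > contents.utf8.xml \
  && mv contents.utf8.xml contents.xml

cat contents.xml
```

The resulting file can be opened in any editor that handles UTF-8, and the accented characters survive the conversion.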
Saving modifications or compression
To pack contents.xml back into the original compressed file format, the complete archive must be reconstructed:
tar cvfz Legend_page.page contents.xml
In this case, contents.xml may be encoded in UTF-16, UTF-8 or ASCII; all three are accepted. If binary files were present in the original archive, they must be packed together with contents.xml. The order of the files is not critical.
tar cvfz Legend_page.page contents.xml bin_data_*
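The whole cycle (unpack, edit, repack) can be sketched as a single script. This is only an illustration under assumptions: the stand-in archive, the backup name Legend_page.orig.page, the directory unpacked and the edit itself (changing printX from 1 to 2) are hypothetical, and contents.xml is assumed to be ASCII or UTF-8 already, because sed does not match text in a UTF-16 file.

```shell
#!/bin/sh
set -eu

WORK=$(mktemp -d)
cd "$WORK"

# Stand-in for a real page file (hypothetical content), so the sketch runs alone.
printf '<SciFigs><printX>1</printX></SciFigs>\n' > contents.xml
printf 'binary placeholder' > bin_data_1
tar czf Legend_page.page contents.xml bin_data_1
rm contents.xml bin_data_1

# 1. Keep a backup of the original and unpack into a scratch directory.
cp Legend_page.page Legend_page.orig.page
mkdir unpacked
tar xzf Legend_page.page -C unpacked

# 2. Edit the XML (illustrative change only).
sed 's|<printX>1</printX>|<printX>2</printX>|' unpacked/contents.xml \
  > unpacked/contents.tmp
mv unpacked/contents.tmp unpacked/contents.xml

# 3. Repack all members, binary files included, from inside the directory
#    so that the archive stores bare file names.
(cd unpacked && tar czf ../Legend_page.page contents.xml bin_data_*)

# 4. List the members of the rebuilt archive.
tar tzf Legend_page.page
```

If the archive holds no bin_data_ files, drop the bin_data_* pattern from step 3, otherwise tar stops because the pattern matches nothing.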