Mathematica 9

This document discusses reading and writing Mathematica files. It describes how to store Mathematica expressions in external files, save multiple expressions, and save expressions in different formats like OutputForm. It also covers saving definitions of Mathematica objects in files.

Uploaded by

angusyoung1
Copyright
© Attribution Non-Commercial (BY-NC)

Wolfram Mathematica Tutorial Collection

DATA MANIPULATION

For use with Wolfram Mathematica 7.0 and later.

For the latest updates and corrections to this manual, visit reference.wolfram.com.

For information on additional copies of this documentation, visit the Customer Service website at www.wolfram.com/services/customerservice or email Customer Service at [email protected].

Comments on this manual are welcomed at [email protected].

Printed in the United States of America.

15 14 13 12 11 10 9 8 7 6 5 4 3 2

2008 Wolfram Research, Inc. All rights reserved. No part of this document may be reproduced or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording or otherwise, without the prior written permission of the copyright holder. Wolfram Research is the holder of the copyright to the Wolfram Mathematica software system ("Software") described in this document, including without limitation such aspects of the system as its code, structure, sequence, organization, look and feel, programming language, and compilation of command names. Use of the Software unless pursuant to the terms of a license granted by Wolfram Research or as otherwise authorized by law is an infringement of the copyright. Wolfram Research, Inc. and Wolfram Media, Inc. ("Wolfram") make no representations, express, statutory, or implied, with respect to the Software (or any aspect thereof), including, without limitation, any implied warranties of merchantability, interoperability, or fitness for a particular purpose, all of which are expressly disclaimed. Wolfram does not warrant that the functions of the Software will meet your requirements or that the operation of the Software will be uninterrupted or error free. As such, Wolfram does not recommend the use of the software described in this document for applications in which errors or omissions could threaten life, injury or significant loss. Mathematica, MathLink, and MathSource are registered trademarks of Wolfram Research, Inc. J/Link, MathLM, .NET/Link, and webMathematica are trademarks of Wolfram Research, Inc. Windows is a registered trademark of Microsoft Corporation in the United States and other countries. Macintosh is a registered trademark of Apple Computer, Inc. All other trademarks used herein are the property of their respective owners. Mathematica is not associated with Mathematica Policy Research, Inc.

Contents

Files, Streams, and External Operations
  Reading and Writing Mathematica Files
  External Programs
  Streams and Low-Level Input and Output
  Naming and Finding Files
  Files for Packages
  Manipulating Files and Directories
  Reading Textual Data
  Searching Files
  Searching and Reading Strings
  Binary Files
  Generating C and Fortran Expressions
  Splicing Mathematica Output into External Files

Importing and Exporting
  Importing and Exporting Data
  Importing and Exporting Files
  Exporting Graphics and Sounds
  Generating and Importing TeX
  Exchanging Material with the Web

Image Processing
  Image Creation and Representation
  Basic Image Manipulation
  Image Processing by Point Operations
  Image Processing by Area Operations

Files, Streams, and External Operations

Reading and Writing Mathematica Files

Storing Mathematica Expressions in External Files

You can use files on your computer system to store definitions and results from Mathematica. The most general approach is to store everything as plain text that is appropriate for input to Mathematica. With this approach, a version of Mathematica running on one computer system produces files that can be read by a version running on any computer system. In addition, such files can be manipulated by other standard programs, such as text editors.

<< file or Get["file"]                      read in a file of Mathematica input, and return the last expression in the file
FilePrint["file"]                           display the contents of a file
expr >> file or Put[expr, "file"]           write an expression to a file
expr >>> file or PutAppend[expr, "file"]    append an expression to a file

Reading and writing files.

This expands (x + y)^3, and outputs the result to a file called tmp.

In[1]:= Expand[(x + y)^3] >> tmp

Here are the contents of tmp. They can be used directly as input for Mathematica.

In[2]:= FilePrint["tmp"]

x^3 + 3*x^2*y + 3*x*y^2 + y^3

This reads in tmp, evaluating the Mathematica input it contains.

In[3]:= << tmp

Out[3]= x^3 + 3 x^2 y + 3 x y^2 + y^3

This shows the contents of the file factors.

In[1]:= FilePrint["ExampleData/factors"]

(* Factors of x^20 - 1 *)
(-1 + x)*(1 + x)*(1 + x^2)*(1 - x + x^2 - x^3 + x^4)*
 (1 + x + x^2 + x^3 + x^4)*(1 - x^2 + x^4 - x^6 + x^8)

This reads in the file, and returns the last expression in it.

In[2]:= << ExampleData/factors

Out[2]= (-1 + x) (1 + x) (1 + x^2) (1 - x + x^2 - x^3 + x^4) (1 + x + x^2 + x^3 + x^4) (1 - x^2 + x^4 - x^6 + x^8)

If Mathematica cannot find the file you ask it to read, it prints a message, then returns the symbol $Failed.

In[19]:= << faxors

Get::noopen : Cannot open faxors.

Out[19]= $Failed

When you read in a file with << file, Mathematica returns the last expression it evaluates in the file. You can avoid getting any visible result from reading a file by ending the last expression in the file with a semicolon, or by explicitly adding Null after that expression.

If Mathematica encounters a syntax error while reading a file, it reports the error, skips the remainder of the file, then returns $Failed. If the syntax error occurs in the middle of a package which uses BeginPackage and other context manipulation functions, then Mathematica tries to restore the context to what it was before the package was read.
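Because reading a missing file yields $Failed rather than stopping the computation, a program that loads a file may want to test for the file first. The following is a minimal sketch of that idea; the helper name loadOrDefault is hypothetical and not part of Mathematica, and the file faxors is assumed not to exist.

```mathematica
(* Hypothetical helper: read a file with Get, or return a default value
   when the file does not exist. Testing with FileExistsQ first avoids
   the Get::noopen message. *)
loadOrDefault[file_String, default_] :=
  If[FileExistsQ[file], Get[file], default]

(* "faxors" is assumed not to exist in the current directory,
   so the default value is returned *)
loadOrDefault["faxors", {}]
```

A file that exists but contains a syntax error would still make Get return $Failed, so a caller that needs to be careful could also compare the result with $Failed.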

Saving Multiple Mathematica Expressions

Mathematica input files can contain any number of expressions. Each expression, however, must start on a new line. The expressions may continue for as many lines as necessary. Just as in a standard interactive Mathematica session, the expressions are processed as soon as they are complete. Note that in a file, unlike an interactive session, you can insert a blank line at any point without effect.

When you use expr >>> file, Mathematica appends each new expression you give to the end of your file. If you use expr >> file, however, then Mathematica instead wipes out anything that was in the file before, and then puts expr into the file.

This writes an expression to the file tmp.

In[4]:= Factor[x^6 - 1] >> tmp

Here are the contents of the file.

In[5]:= FilePrint["tmp"]

(-1 + x)*(1 + x)*(1 - x + x^2)*(1 + x + x^2)

This appends another expression to the same file.

In[6]:= Factor[x^8 - 1] >>> tmp

Both expressions are now in the file.

In[7]:= FilePrint["tmp"]

(-1 + x)*(1 + x)*(1 - x + x^2)*(1 + x + x^2)
(-1 + x)*(1 + x)*(1 + x^2)*(1 + x^4)

If you are familiar with command-line operating systems, you will recognize the Mathematica redirection operators >>, >>> and << as being analogous to the command-line operators >, >> and <.
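Since << file returns only the last expression in a file, it can help to see how all the stored expressions might be recovered. The sketch below uses the function forms Put and PutAppend instead of the operators, and assumes that tmp is a scratch file name in the current directory.

```mathematica
(* Assumption: "tmp" is a scratch file in the current directory *)
Put[Factor[x^6 - 1], "tmp"];         (* same as  expr >> tmp,  overwrites the file *)
PutAppend[Factor[x^8 - 1], "tmp"];   (* same as  expr >>> tmp, appends to the file *)

(* Get evaluates the whole file but returns only the last expression *)
last = Get["tmp"]

(* ReadList reads every expression in the file and returns them as a list *)
all = ReadList["tmp"]
```

Here all should contain two factored polynomials, and last should be the second of them.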

Saving Mathematica Expressions in Different Formats


When you use either >> or >>> to write expressions to files, the expressions are usually given in Mathematica input format, so that you can read them back into Mathematica. Sometimes, however, you may want to save expressions in other formats. You can do this by explicitly wrapping a format directive such as OutputForm around the expression you write out.
This writes an expression to the file tmp in output format.

In[8]:= OutputForm[Factor[x^6 - 1]] >> tmp

The expression in tmp is now in output format.

In[9]:= FilePrint["tmp"]

                           2            2
(-1 + x) (1 + x) (1 - x + x ) (1 + x + x )


Saving Definitions of Mathematica Objects


One of the most common reasons for using files is to save definitions of Mathematica objects, to be able to read them in again in a subsequent Mathematica session. The operators >> and >>> allow you to save Mathematica expressions in files. You can use the function Save to save complete definitions of Mathematica objects, in a form suitable for execution in subsequent Mathematica sessions.
Save["file", symbol]                       save the complete definitions for a symbol in a file
Save["file", "form"]                       save definitions for symbols whose names match the string pattern form
Save["file", "context`"]                   save definitions for all symbols in the specified context
Save["file", {object1, object2, ...}]      save definitions for several objects

Saving definitions in plain text files.

This assigns a value to the symbol a.

In[51]:= a = 2 - x^2

Out[51]= 2 - x^2

You can use Save to write the definition of a to a file.

In[52]:= Save["afile", a]

Here is the definition of a that was saved in the file.

In[53]:= FilePrint["afile"]

a = 2 - x^2

This defines a function f which depends on the symbol a previously defined.

In[54]:= f[z_] := a^2 - 2

This saves the complete definition of f in a file.

In[55]:= Save["ffile", f]

The file contains not only the definition of f itself, but also the definition of the symbol a on which f depends.

In[56]:= FilePrint["ffile"]

f[z_] := a^2 - 2

a = 2 - x^2

This clears the definitions of f and a.

In[57]:= Clear[f, a]

You can reinstate the definitions you saved simply by reading in the file ffile.

In[58]:= << ffile

Out[58]= 2 - x^2

The function Save makes use of the output forms Definition and FullDefinition, which print as definitions of Mathematica symbols. In some cases, you may find it convenient to use these output forms directly.
The output form Definition[f] prints as the sequence of definitions that have been made for f.

In[59]:= Definition[f]

Out[59]= f[z_] := a^2 - 2

FullDefinition[f] includes definitions of the objects on which f depends.

In[60]:= FullDefinition[f]

Out[60]= f[z_] := a^2 - 2

         a = 2 - x^2

When you define a new object in Mathematica, your definition will often depend on other objects that you defined before. If you are going to be able to reconstruct the definition of your new object in a subsequent Mathematica session, it is important that you store not only its own definition, but also the definitions of other objects on which it depends. The function Save looks through the definitions of the objects you ask it to save, and automatically also saves all definitions of other objects on which it can see that these depend. However, in order to avoid saving a large amount of unnecessary material, Save never includes definitions for symbols that have the attribute Protected. It assumes that the definitions for these symbols are also built in. Nevertheless, with such definitions taken care of, it should always be the case that reading the output generated by Save back into a new Mathematica session will set up the definitions of your objects exactly as you had them before.
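The dependency tracking described above can be tried with a short chain of definitions. The following is only a sketch: the symbols scale, g and h and the file name ghfile are hypothetical names chosen for the illustration. Because Save appends to an existing file, any old copy is removed first.

```mathematica
(* Hypothetical definitions: h depends on g, and g depends on scale *)
scale = 3;
g[x_] := scale x;
h[x_] := g[x] + 1;

(* Save appends to an existing file, so remove any old copy first *)
If[FileExistsQ["ghfile"], DeleteFile["ghfile"]];

(* Only g and h are named; the definition of scale is saved as a dependency *)
Save["ghfile", {g, h}];

(* Clear everything, then restore the definitions from the file *)
Clear[g, h, scale];
Get["ghfile"];
h[2]   (* 3*2 + 1, that is 7, once all three definitions are restored *)
```

FilePrint["ghfile"] can be used to inspect exactly which definitions were written.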

Saving Mathematica Definitions in Encoded Form


When you create files for input to Mathematica, you usually want them to contain only plain text, which can be read or modified directly. Sometimes, however, you may want the contents of a file to be encoded so that they cannot be read or modified directly as plain text, but can be loaded into Mathematica. You can create encoded files using the Mathematica function Encode.
Encode["source", "dest"]                     write an encoded version of the file source to the file dest
<< dest                                      read in an encoded file
Encode["source", "dest", "key"]              encode with the specified key
Get["dest", "key"]                           read in a file that was encoded with a key
Encode["source", "dest", MachineID -> "ID"]  create an encoded file which can only be read on a machine with a particular ID

Creating and reading encoded files.

This writes an expression in plain text to the file tmp.

In[61]:= Factor[x^2 - 1] >> tmp

This writes an encoded version of the file tmp to the file tmp.x.

In[62]:= Encode["tmp", "tmp.x"]

Here are the contents of the encoded file. The only recognizable part is the special Mathematica comment at the beginning.

In[63]:= FilePrint["tmp.x"]

(*!1N!*)mcm _QZ9tcI1cfre*Wo8:) P

Even though the file is encoded, you can still read it into Mathematica using the << operator.

In[64]:= << tmp.x

Out[64]= (-1 + x) (1 + x)
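The keyed forms listed in the table above work the same way, except that the key has to be given again when the file is read. A minimal sketch, assuming scratch file names tmp and tmp.key.x and a placeholder key "secret" chosen only for this illustration:

```mathematica
(* Assumptions: "tmp" and "tmp.key.x" are scratch file names;
   "secret" is a placeholder key used only for this example *)
Put[Factor[x^2 - 1], "tmp"];
Encode["tmp", "tmp.key.x", "secret"];

(* The same key must be supplied to Get when reading the encoded file *)
decoded = Get["tmp.key.x", "secret"]
```

The value of decoded should match what reading the plain text file tmp would give.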

DumpSave["file.mx", symbol]                     save definitions for a symbol in internal Mathematica format
DumpSave["file.mx", "context`"]                 save definitions for all symbols in a context
DumpSave["file.mx", {object1, object2, ...}]    save definitions for several symbols or contexts
DumpSave["package`", objects]                   save definitions in a file with a specially chosen name

Saving definitions in internal Mathematica format.

If you have to read in very large or complicated definitions, you will often find it more efficient to store these definitions in internal Mathematica format, rather than as text. You can do this using DumpSave.

This saves the definition for f in internal Mathematica format.

In[22]:= DumpSave["ffile.mx", f]

Out[22]= {f}

You can still use << to read the definition in.

In[23]:= << ffile.mx

<< recognizes when a file contains definitions in internal Mathematica format, and operates accordingly. One subtlety is that the internal Mathematica format differs from one computer system to another. As a result, .mx files created on one computer cannot typically be read on another.

If you use DumpSave["package`", ...] then Mathematica will write out definitions to a file with a name like package.mx/system/package.mx, where system identifies your type of computer system.

This creates a file with a name that reflects the name of the computer system being used.

In[24]:= DumpSave["gffile`", f]

Out[24]= {f}

<< automatically picks out the file with the appropriate name for your computer system.

In[25]:= << gffile`
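One situation where the internal format can be useful is a function that has accumulated many stored values. The sketch below illustrates a round trip through an .mx file; the function fib and the file name fib.mx are hypothetical names used only for this example.

```mathematica
(* Hypothetical function that stores each value it computes *)
Clear[fib];
fib[0] = 0; fib[1] = 1;
fib[n_Integer /; n > 1] := fib[n] = fib[n - 1] + fib[n - 2];
fib[30];                        (* populate the stored values *)

(* Write all definitions of fib in the internal, system-dependent format *)
DumpSave["fib.mx", fib];

(* Clear the symbol, then restore its definitions from the .mx file *)
Clear[fib];
Get["fib.mx"];                  (* equivalent to  << fib.mx *)
fib[30]                         (* the 30th Fibonacci number, 832040 *)
```

As noted above, the resulting file should only be expected to load on the same type of computer system that wrote it.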

External Programs
On most computer systems, you can execute external programs or commands from within Mathematica. Often you will want to take expressions you have generated in Mathematica, and send them to an external program, or take results from external programs, and read them into Mathematica. Mathematica supports two basic forms of communication with external programs: structured and unstructured.
Structured communication      use MathLink to exchange expressions with MathLink-compatible external programs
Unstructured communication    use file reading and writing operations to exchange ordinary text

Two kinds of communication with external programs in Mathematica.

The idea of structured communication is to exchange complete Mathematica expressions with external programs which are specially set up to handle such objects. The basis for structured communication is the MathLink system, discussed in "MathLink and External Program Communication".

Unstructured communication consists of sending and receiving ordinary text from external programs. The basic idea is to treat an external program very much like a file, and to support the same kinds of reading and writing operations.

<< file                         read in a file
<< "!command"                   run an external command, and read in the output it produces
expr >> "!command"              feed the textual form of expr to an external command
ReadList["!command", Number]    run an external command, and read in a list of the numbers it produces

Some ways to communicate with external programs.

In general, wherever you might use an ordinary file name, Mathematica allows you instead to give a pipe, written as an external command, prefaced by an exclamation point. When you use the pipe, Mathematica will execute the external command, and send or receive text from it.

This sends the result from FactorInteger to the external program lpr. On many Unix systems, this program generates a printout.

In[1]:= FactorInteger[2^31 - 1] >> !lpr

This executes the external command echo $TERM, then reads the result as Mathematica input.

In[2]:= << "!echo $TERM"

Out[2]= xterm

With a text-based interface, putting ! at the beginning of a line causes the remainder of the line to be executed as an external command.

squares is an external program which prints numbers and their squares.

In[1]:= !squares 4
1 1
2 4
3 9
4 16

This runs the external command squares 4, then reads numbers from the output it produces.

In[3]:= ReadList["!squares 4", Number, RecordLists -> True]

Out[3]= {{1, 1}, {2, 4}, {3, 9}, {4, 16}}
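The squares program is specific to this tutorial, so the same idea can be tried with a command that most systems already provide. This sketch assumes a Unix-like shell in which the echo command is available.

```mathematica
(* Assumption: a Unix-like shell where the echo command is available *)

(* Run echo through a pipe and read its output as a list of numbers *)
nums = ReadList["!echo 1 4 9 16", Number]    (* {1, 4, 9, 16} *)

(* Read the same output as a list of words (strings) instead *)
words = ReadList["!echo 1 4 9 16", Word]
```

The second argument of ReadList controls how the text produced by the command is interpreted, exactly as it would for an ordinary file.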

One point to notice is that you can get away with dropping the double quotes around the name of a pipe on the right-hand side of << or >> if the name does not contain any spaces or other special characters.

Pipes in Mathematica provide a very general mechanism for unstructured communication with external programs. On many computer systems, Mathematica pipes are implemented using pipe mechanisms in the underlying operating system; in some cases, however, other interprocess communication mechanisms are used.

One restriction of unstructured communication in Mathematica is that a given pipe can only be used for input or for output, and not for both at the same time. In order to do genuine two-way communication, you need to use MathLink.

Even with unstructured communication, you can nevertheless set up somewhat more complicated arrangements by using "temporary files". The basic idea is to write data to a file, then to read it as needed.

OpenWrite[]    open a new file with a unique name in the default area for temporary files on your computer system

Opening a "temporary file".
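The write-then-read pattern for temporary files can be sketched as follows. The variable names tmpStream, tmpName and result are hypothetical; the important point is that Close returns the name of the file, which is the only way to find out where the unnamed temporary file was created.

```mathematica
(* OpenWrite[] with no arguments opens a uniquely named temporary file *)
tmpStream = OpenWrite[];
Write[tmpStream, Expand[(1 + x)^3]];

(* Close returns the name of the file, which is needed to read it back *)
tmpName = Close[tmpStream];

(* Read the expression back, then remove the file when finished *)
result = Get[tmpName];
DeleteFile[tmpName];
result
```

Between the Close and the Get, an external command could be run on the file, which is the kind of arrangement the text above has in mind.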

Particularly when you work with temporary files, you may find it useful to be able to execute external commands which do not explicitly send or receive data from Mathematica. You can do this using the Mathematica function Run.

Run["command", arg1, ...]    run an external command from within Mathematica

Running external commands without input or output.

This executes the external Unix command date. The returned value is an "exit code" from the operating system.

In[4]:= Run["date"]

Out[4]= 0

Note that when you use Run, you must not preface commands with exclamation points. Run simply takes the textual forms of the arguments you specify, then joins them together with spaces in between, and executes the resulting string as an external shell command.

It is important to realize that Run never "captures" any of the output from an external command. As a result, where this output goes is purely determined by your operating system. Similarly, Run does not supply input to external commands. This means that the commands can get input through any mechanism provided by your operating system. Sometimes external commands may be able to access the same input and output streams that are used by Mathematica itself. In some cases, this may be what you want. But particularly if you are using Mathematica with a front end, this can cause considerable trouble.

RunThrough["command", expr]    run command, using expr as input, and reading the output back into Mathematica

Running Mathematica expressions through external programs.

As discussed above, << and >> cannot be used to both send and receive data from an external program at the same time. Nevertheless, by using temporary files, you can effectively both send and receive data from an external program while still using unstructured communication.

The function RunThrough writes the text of an expression to a temporary file, then feeds this file as input to an external program, and captures the output as input to Mathematica. Note that in RunThrough, like Run, you should not preface the names of external commands with exclamation points.

This feeds the expression 789 to the external program cat, which in this case simply echoes the text of the expression. The output from cat is then read back into Mathematica.

In[5]:= RunThrough["cat", 789]

Out[5]= 789

SystemOpen["target"]    opens the specified file, URL or other target with the associated program on your computer system

Opening files with external programs.

This opens the URL using your system's preferred web browser.

In[6]:= SystemOpen["http://www.wolfram.com"]

SystemOpen uses settings in your operating system to determine how to open a URI or file. When opening files, it typically uses the same program that would be used if you double-clicked the file's icon.

Streams and Low-Level Input and Output


Files and pipes are both examples of general Mathematica objects known as streams. A stream in Mathematica is a source of input or output. There are many operations that you can perform on streams.

You can think of >> and << as "high-level" Mathematica input-output functions. They are based on a set of lower-level input-output primitives that work directly with streams. By using these primitives, you can exercise more control over exactly how Mathematica does input and output. You will often need to do this, for example, if you write Mathematica programs which store and retrieve intermediate data from files or pipes.

The basic low-level scheme for writing output to a stream in Mathematica is as follows. First, you call OpenWrite or OpenAppend to "open the stream", telling Mathematica that you want to write output to a particular file or external program, and in what form the output should be written. Having opened a stream, you can then call Write or WriteString to write a sequence of expressions or strings to the stream. When you have finished, you call Close to "close the stream".

"name"                     a file, specified by name
"!name"                    a command, specified by name
InputStream["name", n]     an input stream
OutputStream["name", n]    an output stream

Streams in Mathematica.

When you open a file or a pipe, Mathematica creates a "stream object" that specifies the open stream associated with the file or pipe. In general, the stream object contains the name of the file or the external command used in a pipe, together with a unique number. The reason that the stream object needs to include a unique number is that in general you can have several streams connected to the same file or external program at the same time. For example, you may start several different instances of the same external program, each connected to a different stream. Nevertheless, when you have opened a stream, you can still refer to it using a simple file name or external command name so long as there is only one stream associated with this object.
This opens an output stream to the file tmp.

In[1]:= stmp = OpenWrite["tmp"]

Out[1]= OutputStream[tmp, 36]

This writes a sequence of expressions to the file.

In[2]:= Write[stmp, a, b, c]

Since you only have one stream associated with file tmp, you can refer to it simply by giving the name of the file.

In[3]:= Write["tmp", x]

This closes the stream.

In[4]:= Close[stmp]

Out[4]= tmp

Here is what was written to the file.

In[5]:= FilePrint["tmp"]

abc
x

OpenWrite["file"]                        open an output stream to a file, wiping out the previous contents of the file
OpenWrite[]                              open an output stream to a new temporary file
OpenAppend["file"]                       open an output stream to a file, appending to what was already in the file
OpenWrite["!command"]                    open an output stream to an external command
Write[stream, expr1, expr2, ...]         write a sequence of expressions to a stream, ending the output with a newline (line feed)
WriteString[stream, str1, str2, ...]     write a sequence of character strings to a stream, with no extra newlines
Close[stream]                            tell Mathematica that you are finished with a stream

Low-level output functions.

When you call Write[stream, expr], it writes an expression to the specified stream. The default is to write the expression in Mathematica input form. If you call Write with a sequence of expressions, it will write these expressions one after another to the stream. In general, it leaves no space between the successive expressions. However, when it has finished writing all the expressions, Write always ends its output with a newline.
This reopens the file tmp.

In[6]:= stmp = OpenWrite["tmp"]

Out[6]= OutputStream[tmp, 37]

This writes a sequence of expressions to the file, then closes the file.

In[7]:= Write[stmp, a^2, 1 + b^2]; Write[stmp, c^3]; Close[stmp]

Out[7]= tmp

All the expressions are written in input form. The expressions from a single Write are put on the same line.

In[8]:= FilePrint["tmp"]

a^21 + b^2
c^3


Write provides a way of writing out complete Mathematica expressions. Sometimes, however, you may want to write out less structured data. WriteString allows you to write out any character string. Unlike Write, WriteString adds no newlines or other characters.

This opens the stream.

In[9]:= stmp = OpenWrite["tmp"]

Out[9]= OutputStream[tmp, 38]

This writes two strings to the stream.

In[10]:= WriteString[stmp, "Arbitrary output.\n", "More output."]

This writes another string, then closes the stream.

In[11]:= WriteString[stmp, " Second line.\n"]; Close[stmp]

Out[11]= tmp

Here are the contents of the file. The strings were written exactly as specified, including only the newlines that were explicitly given.

In[12]:= FilePrint["tmp"]

Arbitrary output.
More output. Second line.
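Because WriteString adds nothing of its own, it is a natural way to produce plain columns of data for other programs. As a sketch, the following writes a small table of numbers and their squares, then reads it back; the file name squares.txt and the variable out are assumptions made for this example.

```mathematica
(* Assumption: "squares.txt" is a scratch file in the current directory *)
out = OpenWrite["squares.txt"];

(* One line per number: the number, a space, its square, and a newline *)
Do[
  WriteString[out, ToString[n], " ", ToString[n^2], "\n"],
  {n, 4}
];
Close[out];

(* Read the numbers back, with one sublist per line of the file *)
table = ReadList["squares.txt", Number, RecordLists -> True]
(* {{1, 1}, {2, 4}, {3, 9}, {4, 16}} *)
```

Using Write here instead would put the separating space inside double quotes, since Write produces input form by default.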

Write[{stream1, stream2, ...}, expr1, ...]         write expressions to a list of streams
WriteString[{stream1, stream2, ...}, str1, ...]    write strings to a list of streams

Writing output to lists of streams.
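A minimal sketch of writing to a list of streams, assuming two scratch files named log1 and log2; the variable name channel is likewise chosen only for the illustration.

```mathematica
(* Assumption: "log1" and "log2" are scratch file names *)
channel = {OpenWrite["log1"], OpenWrite["log2"]};

(* Each call writes the same output to every stream in the list *)
Write[channel, x^2 + 1];            (* an expression, followed by a newline *)
WriteString[channel, "done\n"];     (* a raw string, written as given *)

(* Close every stream in the list *)
Scan[Close, channel];

(* Both files now have identical contents *)
ReadList["log1", String] === ReadList["log2", String]
```

Either file can be displayed with FilePrint to check what was written.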

An important feature of the functions Write and WriteString is that they allow you to write output not just to a single stream, but also to a list of streams. In using Mathematica, it is often convenient to define a channel which consists of a list of streams. You can then simply tell Mathematica to write to the channel, and have it automatically write the same object to several streams.

In a standard interactive Mathematica session, there are several output channels that are usually defined. These specify where particular kinds of output should be sent. Thus, for example, $Output specifies where standard output should go, while $Messages specifies where messages should go. The function Print then works essentially by calling Write with the $Output channel. Message works in the same way by calling Write with the $Messages channel. "The Main Loop" lists the channels used in a typical Mathematica session.

Note that when you run Mathematica through MathLink, a different approach is usually used. All output is typically written to a single MathLink link, but each piece of output appears in a packet which indicates what type it is.

In most cases, the names of files or external commands that you use in Mathematica correspond exactly with those used by your computer's operating system. On some systems, however, Mathematica supports various streams with special names.

"stdout"    standard output
"stderr"    standard error

Special streams used on some computer systems.

The special stream "stdout" allows you to give output to the standard output provided by the operating system. Note however that you can use this stream only with simple text-based interfaces to Mathematica. If your interaction with Mathematica is more complicated, then this stream will not work, and trying to use it may cause considerable trouble.

option name          default value

FormatType           InputForm             the default output format to use
PageWidth            78                    the width of the page in characters
NumberMarks          $NumberMarks          whether to include ` marks in approximate numbers
CharacterEncoding    $CharacterEncoding    encoding to be used for special characters

Some options for output streams.

You can associate a number of options with output streams. You can specify these options when you first open a stream using OpenWrite or OpenAppend.

This opens a stream, specifying that the default output format used should be OutputForm.

In[13]:= stmp = OpenWrite["tmp", FormatType -> OutputForm]

Out[13]= OutputStream[tmp, 39]

This writes expressions to the stream, then closes the stream.

In[14]:= Write[stmp, x^2 + y^2, " ", z^2]; Close[stmp]

Out[14]= tmp

The expressions were written to the stream in OutputForm.

In[15]:= FilePrint["tmp"]

 2    2   2
x  + y   z

Note that you can always override the output format specified for a particular stream by wrapping a particular expression you write to the stream with an explicit Mathematica format directive, such as OutputForm or TeXForm.

The option PageWidth gives the width of the page available for textual output from Mathematica. All lines of output are broken so that they fit in this width. If you do not want any lines to be broken, you can set PageWidth -> Infinity. Usually, however, you will want to set PageWidth to the value appropriate for your particular output device. On many systems, you will have to run an external program to find out what this value is. Using SetOptions, you can make the default rule for PageWidth be, for example, PageWidth :> <<"!devicewidth", so that an external program is run automatically to find the value of the option.
This opens a stream, specifying that the page width is 20 characters.
In[16]:= stmp = OpenWrite["tmp", PageWidth -> 20]

Out[16]= OutputStream[tmp, 40]

This writes out an expression, then closes the stream.
In[17]:= Write[stmp, Expand[(1 + x)^5]]; Close[stmp]

Out[17]= tmp

The lines in the expression written out are all broken so as to be at most 20 characters long.
In[18]:= FilePrint["tmp"]

1 + 5*x + 10*x^2 + 
 10*x^3 + 5*x^4 + 
 x^5


The option CharacterEncoding allows you to specify a character encoding that will be used for all strings which are sent to a particular output stream, whether by Write or WriteString. You will typically need to use CharacterEncoding if you want to modify an international character set, or prevent a particular output device from receiving characters that it cannot handle.

Options[stream]                          find the options that have been set for a stream
SetOptions[stream, opt1 -> val1, …]      reset options for an open stream

Manipulating options of streams.

This opens a stream with the default settings for options.
In[19]:= stmp = OpenWrite["tmp"]

Out[19]= OutputStream[tmp, 41]


This changes the FormatType option for the open stream.
In[20]:= SetOptions[stmp, FormatType -> TeXForm];

Options shows the options you have set for the open stream.
In[21]:= Options[stmp]

Out[21]= {BinaryFormat -> False, FormatType -> TeXForm, PageWidth -> 78, PageHeight -> 22, TotalWidth -> Infinity, TotalHeight -> Infinity, CharacterEncoding :> Automatic, NumberMarks :> $NumberMarks}

This closes the stream again.
In[22]:= Close[stmp]

Out[22]= tmp

Options[$Output]                          find the options set for all streams in the channel $Output
SetOptions[$Output, opt1 -> val1, …]      set options for all streams in the channel $Output

Manipulating options for the standard output channel.

At every point in your session, Mathematica maintains a list Streams[] of all the input and output streams that are currently open, together with their options. In some cases, you may find it useful to look at this list directly. Mathematica will not, however, allow you to modify the list, except indirectly through OpenRead and so on.
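As a rough sketch of how the list of open streams and per-stream options interact, the following opens a stream, changes one of its options, and watches the length of Streams[] change. The file name streams-demo.txt is hypothetical, and the example assumes that no other streams are opened or closed while it runs.

```mathematica
(* Hypothetical scratch file *)
file = FileNameJoin[{$TemporaryDirectory, "streams-demo.txt"}];

before = Length[Streams[]];        (* streams open before we start *)
s = OpenWrite[file];
SetOptions[s, PageWidth -> 40];    (* reset an option on the open stream *)
width = PageWidth /. Options[s];   (* read the option back *)
during = Length[Streams[]];        (* one more stream is now listed *)
Close[s];
after = Length[Streams[]];         (* closing removes it again *)

{width, during - before, after - before}
```

The list itself is only ever changed indirectly, here through OpenWrite and Close.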

Naming and Finding Files


Directory Operations
The precise details of the naming of files differ from one computer system to another. Nevertheless, Mathematica provides some fairly general mechanisms that work on all systems. Mathematica assumes that all your files are arranged in a hierarchy of directories. To find a particular file, Mathematica must know both what the name of the file is, and what sequence of directories it is in. At any given time, however, you have a current working directory, and you can refer to files or other directories by specifying where they are relative to this directory. Typically you can refer to files or directories that are actually in this directory simply by giving their names, with no directory information.

Directory[]            your current working directory
SetDirectory["dir"]    set your current working directory
ResetDirectory[]       revert to your previous working directory

Manipulating directories.

This gives a string representing your current working directory.
In[1]:= Directory[]

Out[1]= /users/sw

This sets your current working directory to be the Examples subdirectory.
In[2]:= SetDirectory["Examples"]

Out[2]= /users/sw/Examples

Now your current working directory is different.
In[3]:= Directory[]

Out[3]= /users/sw/Examples

This reverts to your previous working directory.
In[4]:= ResetDirectory[]

Out[4]= /users/sw

When you call SetDirectory, you can give any directory name that is recognized by your operating system. Thus, for example, on Unix-based systems, you can specify a directory one level up in the directory hierarchy using the notation .., and you can specify your "home" directory as ~.

Whenever you go to a new directory using SetDirectory, Mathematica always remembers what the previous directory was. You can return to this previous directory using ResetDirectory. In general, Mathematica maintains a stack of directories, given by DirectoryStack[]. Every time you call SetDirectory, it adds a new directory to the stack, and every time you call ResetDirectory it removes a directory from the stack.
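The stack behavior can be observed with a short round trip. This is only a sketch: it assumes $TemporaryDirectory exists and may be entered, and it compares stack depths rather than relying on the exact contents of DirectoryStack[].

```mathematica
start = Directory[];                         (* remember where we began *)

SetDirectory[$TemporaryDirectory];           (* adds to the directory stack *)
depthAfterSet = Length[DirectoryStack[]];

ResetDirectory[];                            (* removes from the directory stack *)
depthAfterReset = Length[DirectoryStack[]];

(* True if we are back in the starting directory *)
restored = (Directory[] === start)
```

Pairing every SetDirectory with a ResetDirectory in this way leaves the session's working directory unchanged for whatever code runs next.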


ParentDirectory[]         the parent of your current working directory
$InitialDirectory         the initial directory when Mathematica was started
$HomeDirectory            your home directory, if this is defined
$BaseDirectory            the base directory for systemwide files to be loaded by Mathematica
$UserBaseDirectory        the base directory for user-specific files to be loaded by Mathematica
$InstallationDirectory    the top-level directory in which your Mathematica installation resides

Special directories.

Finding a File
Whenever you ask for a particular file, Mathematica in general goes through several steps to try to find the file you want. The first step is to use whatever standard mechanisms exist in your operating system or shell.

Mathematica scans the full name you give for a file, and looks to see whether it contains any of the "metacharacters" *, $, ~, ?, [, ", and '. If it finds such characters, then it passes the full name to your operating system or shell for interpretation. This means that if you are using a Unix-based system, then constructions like name* and $VAR will be expanded at this point. But in general, Mathematica takes whatever was returned by your operating system or shell, and treats this as the full file name.

For output files, this is the end of the processing that Mathematica does. If Mathematica cannot find a unique file with the name you specified, then it will proceed to create the file.

If you are trying to get input from a file, however, then there is another round of processing that Mathematica does. What happens is that Mathematica looks at the value of the Path option for the function you are using to determine the names of directories relative to which it should search for the file. The default setting for the Path option is the global variable $Path.

Get["file", Path -> {"dir1", "dir2", …}]    get a file, searching for it relative to the directories dir_i
$Path                                       default list of directories relative to which to search for input files

Search path for files.


In general, the global variable $Path is defined to be a list of strings, with each string representing a directory. Every time you ask for an input file, what Mathematica effectively does is temporarily to make each of these directories in turn your current working directory, and then from that directory to try and find the file you have requested.
Here is a typical setting for $Path. The current directory (.) and your home directory (~) are listed first.
In[5]:= $Path

Out[5]= {., ~, /users/math/bin, /users/math/Packages}
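The effect of the Path option can be tried without touching the global $Path. In this sketch the file name path-demo.m is hypothetical; the file is first written into $TemporaryDirectory with Put, and Get is then asked to look for the bare name relative to that directory only.

```mathematica
name = "path-demo.m";   (* hypothetical file name *)

(* Store an expression in the temporary directory *)
Put[{1, 2, 3}, FileNameJoin[{$TemporaryDirectory, name}]];

(* Search for the bare name relative to the listed directory *)
result = Get[name, Path -> {$TemporaryDirectory}]
```

Giving Path explicitly like this is a local alternative to prepending a directory to $Path for the whole session.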

You can also use FindFile to locate a file.

FindFile["name"]       find the file with the specified name that would be loaded by Get and related functions
FileExistsQ["name"]    determine whether the file exists

Finding a file on the $Path.

FindFile searches all directories in $Path and returns the absolute name of the file that would be loaded by Get, Needs, and other functions. FileExistsQ tests whether the file with the given name exists.
In[5]:= FindFile["init.m"]

Out[5]= "C:\\Documents and Settings\\sw\\Application Data\\Mathematica\\Kernel\\init.m"

FindFile applied to a package name returns the absolute name of the init.m file from that package.
In[5]:= FindFile["Combinatorica`"]

Out[5]= "C:\\Program Files\\Wolfram Research\\Mathematica\\7.0\\AddOns\\Packages\\Combinatorica\\Kernel\\init.m"


Listing Contents of Directories


FileNames[]                                 list all files in your current working directory
FileNames["form"]                           list all files in your current working directory whose names match the string pattern form
FileNames[{"form1", "form2", …}]            list all files whose names match any of the form_i
FileNames[forms, {"dir1", "dir2", …}]       give the full names of all files whose names match forms in any of the directories dir_i
FileNames[forms, dirs, n]                   include files that are in subdirectories up to n levels down
FileNames[forms, dirs, Infinity]            include files in all subdirectories
FileNames[forms, $Path, Infinity]           give all files whose names match forms in any subdirectory of the directories in $Path

Getting lists of files in particular directories.

FileNames returns a list of strings corresponding to file names. When it returns a file that is not in your current directory, it gives the name of the file relative to the current directory. Note that all names are given in the format appropriate for the particular computer system on which they were generated.
Here is a list of all files in the current working directory whose names end with .m.
In[6]:= FileNames["*.m"]

Out[6]= {alpha.m, control.m, signals.m, test.m}

This lists files whose names start with a in the current directory, and in subdirectories with names that start with P.
In[7]:= FileNames["a*", {".", "P*"}]

Out[7]= {alpha.m, Packages/astrodata, Packages/astro.m, Previous/atmp}

The file name form you give to FileNames can use any of Mathematica's string pattern objects, typically combined with the ~~ operator.


This gives a list of all files in your current working directory whose names match the form Test*.m.
In[3]:= FileNames["Test*.m"]

Out[3]= {Test1.m, Test2.m, TestFinal.m}

This lists only those files with names of the form Testd.m, where d is a sequence of one or more digits.
In[3]:= FileNames["Test" ~~ DigitCharacter .. ~~ ".m"]

Out[3]= {Test1.m, Test2.m}
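The same pattern can be tried against a scratch directory, so that the result does not depend on what happens to be in your working directory. The directory name filenames-demo and the three file names are hypothetical, chosen to mirror the listing above.

```mathematica
(* Hypothetical scratch directory holding three small files *)
dir = FileNameJoin[{$TemporaryDirectory, "filenames-demo"}];
If[FileType[dir] =!= Directory, CreateDirectory[dir]];
Scan[Put[0, FileNameJoin[{dir, #}]] &, {"Test1.m", "Test2.m", "TestFinal.m"}];

(* Full names of files in dir matching Test<digits>.m *)
matches = FileNames["Test" ~~ DigitCharacter .. ~~ ".m", {dir}];

(* Keep only the last path element of each name *)
names = FileNameTake /@ matches
```

Because a directory list was given, FileNames returns names that include the directory; FileNameTake strips them back to bare file names.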

Composing a Filename
DirectoryName["file"]                        extract the directory name from a file name
ToFileName["directory", "name"]              assemble a full file name from a directory name and a file name
ParentDirectory["directory"]                 give the parent of a directory
ToFileName[{"dir1", "dir2", …}, "name"]      assemble a full file name from a hierarchy of directory names
ToFileName[{"dir1", "dir2", …}]              assemble a single directory name from a hierarchy of directory names

Manipulating file names.

You should realize that different computer systems may give file names in different ways. Thus, for example, Windows systems typically give names in the form dir:\dir\dir\name and Unix systems give names in the form dir/dir/name. The function ToFileName assembles file names in the appropriate way for the particular computer system you are using.
This gives the directory portion of the file name.
In[8]:= DirectoryName["Packages/Math/test.m"]

Out[8]= Packages/Math/

This constructs the full name of another file in the same directory as test.m.
In[9]:= ToFileName[%, "abc.m"]

Out[9]= Packages/Math/abc.m


FileNameSplit["name"]       split the file name into a list of directory and file names
FileNameJoin[{dir1, …}]     combine a list of directory and file names into the file name
FileNameTake["name", …]     extract part of the file name
FileNameDrop["name", …]     drop parts of the file name
FileNameDepth["name"]       get the number of path elements in the file name
$PathnameSeparator          path name separator used in your operating system

Manipulating file names.

Functions like FileNameSplit and FileNameJoin provide additional operations on file names. They respect the file name separator used by your operating system and will split the file name appropriately. FileNameJoin will by default use the $PathnameSeparator to produce the name in a canonical form suitable for your operating system. If you want to set up a collection of related files, it is often convenient to be able to refer to one file when you are reading another one. The global variable $Input gives the name of the file from which input is currently being taken. Using DirectoryName and ToFileName you can then conveniently specify the names of other related files.
$Input    the name of the file or stream from which input is currently being taken

Finding out how to refer to a file currently being read by Mathematica.
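A small sketch of the splitting and joining functions listed above, used to name a sibling of a given file. The path Packages/Math/test.m is purely illustrative and need not exist, since these functions operate on names only.

```mathematica
path = "Packages/Math/test.m";   (* illustrative name; no file is touched *)

(* Break the name into its path elements *)
parts = FileNameSplit[path];

(* Swap the last element to name another file in the same directory *)
sibling = FileNameJoin[Append[Most[parts], "abc.m"]];

(* Split the result again and count the elements of the original *)
siblingParts = FileNameSplit[sibling];
depth = FileNameDepth[path];

{siblingParts, depth}
```

Within a package file, the same approach could start from DirectoryName[$Input] instead of a literal path, so that related files are found relative to the file being read.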

One issue in handling files in Mathematica is that the form of file and directory names varies between computer systems. This means for example that names of files which contain standard Mathematica packages may be quite different on different systems. Through a sequence of conventions, it is however possible to read in a standard Mathematica package with the same command on all systems. The way this works is that each package defines a so-called Mathematica context, of the form name`name`. On each system, all files are named in correspondence with the contexts they define. Then when you use the command << name`name` Mathematica automatically translates the context name into the file name appropriate for your particular computer system.


Standard Filename Extensions


file.m     Mathematica expression file in plain text format
file.nb    Mathematica notebook file
file.mx    Mathematica definitions in DumpSave format

Typical names of Mathematica files.

If you use a notebook interface to Mathematica, then the Mathematica front end allows you to save complete notebooks, including not only Mathematica input and output, but also text, graphics and other material. It is conventional to give Mathematica notebook files names that end in .nb, and most versions of Mathematica enforce this convention.
FileBaseName["name"]     the base name of the file, without its extension
FileExtension["name"]    the extension of the file name

File name and extension.
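For instance, applied to an illustrative name (the file need not exist), the two functions in the table above behave as in this brief sketch:

```mathematica
path = "Packages/Math/test.m";   (* illustrative name only *)

base = FileBaseName[path];       (* name without directory or extension *)
ext = FileExtension[path];       (* extension without the leading dot *)

{base, ext}
```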

You can use FileBaseName and FileExtension to extract the name of the file and its extension. When you open a notebook in the Mathematica front end, Mathematica will immediately display the contents of the notebook, but it will not normally send any of these contents to the kernel for evaluation until you explicitly request this to be done. Within a Mathematica notebook, however, you can use the Cell menu in the front end to identify certain cells as initialization cells, and if you do this, then the contents of these cells will automatically be evaluated whenever you open the notebook.
The I in the cell bracket indicates that the second cell is an initialization cell that will be evaluated whenever the notebook is opened.

It is sometimes convenient to maintain Mathematica material both in a notebook which contains explanatory text, and in a package which contains only raw Mathematica definitions. You can do this by putting the Mathematica definitions into initialization cells in the notebook. Every time you save the notebook, the front end will then allow you to save an associated .m file which contains only the raw Mathematica definitions.


Files for Packages


When you create or use Mathematica packages, you will often want to refer to files in a system-independent way. You can use contexts to do this. The basic idea is that on every computer system there is a convention about how files corresponding to Mathematica contexts should be named. Then, when you refer to a file using a context, the particular version of Mathematica you are using converts the context name to the file name appropriate for the computer system you are on.

<<context`    read in the file corresponding to the specified context

Using contexts to specify files.

This reads in one of the standard packages that come with Mathematica.
In[1]:= << VectorAnalysis`

name.mx                      file in DumpSave format
name.mx/$SystemID/name.mx    file in DumpSave format for your computer system
name.m                       file in Mathematica source format
name/init.m                  initialization file for a particular directory
dir/…                        files in other directories specified by $Path

The typical sequence of files looked for by << name`.

Mathematica is set up so that << name` will automatically try to load the appropriate version of a file. It will first try to load a name.mx file that is optimized for your particular computer system. If it finds no such file, then it will try to load a name.m file containing ordinary system-independent Mathematica input.

If name is a directory, then Mathematica will try to load the initialization file init.m in that directory. The purpose of the init.m file is to provide a convenient way to set up Mathematica packages that involve many separate files. The idea is to allow you to give just the command << name`, but then to load init.m to initialize the whole package, reading in whatever other files are necessary.

Manipulating Files and Directories


CopyFile["file1", "file2"]      copy file1 to file2
RenameFile["file1", "file2"]    give file1 the name file2
DeleteFile["file"]              delete a file
FileByteCount["file"]           give the number of bytes in a file
FileDate["file"]                give the modification date for a file
SetFileDate["file"]             set the modification date for a file to be the current date
FileType["file"]                give the type of a file as File, Directory or None

Functions for manipulating files.

Different operating systems have different commands for manipulating files. Mathematica provides a simple set of file manipulation functions, intended to work in the same way under all operating systems. Notice that CopyFile and RenameFile give the final file the same modification date as the original one. FileDate returns modification dates in the {year, month, day, hour, minute, second} format used by DateList.

CreateDirectory["name"]                            create a new directory
DeleteDirectory["name"]                            delete an empty directory
DeleteDirectory["name", DeleteContents -> True]    delete a directory and all files and directories it contains
RenameDirectory["name1", "name2"]                  rename a directory
CopyDirectory["name1", "name2"]                    copy a directory and all the files in it

Functions for manipulating directories.
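The file and directory functions in the two tables above can be combined in a short round trip. This is a sketch under stated assumptions: the directory name manip-demo and the files a.txt and b.txt are hypothetical, and everything is created under $TemporaryDirectory and removed again at the end.

```mathematica
(* Hypothetical scratch directory; remove any leftover from an earlier run *)
dir = FileNameJoin[{$TemporaryDirectory, "manip-demo"}];
If[FileType[dir] === Directory, DeleteDirectory[dir, DeleteContents -> True]];
CreateDirectory[dir];

orig = FileNameJoin[{dir, "a.txt"}];
copy = FileNameJoin[{dir, "b.txt"}];

(* Write five characters into the original file *)
s = OpenWrite[orig]; WriteString[s, "hello"]; Close[s];

CopyFile[orig, copy];

(* Type and size of the copy, and type of the directory *)
info = {FileType[copy], FileByteCount[copy], FileType[dir]};

(* Remove the directory together with its contents *)
DeleteDirectory[dir, DeleteContents -> True];
afterDelete = FileType[dir];

{info, afterDelete}
```

Checking FileType before CreateDirectory avoids an error message if the directory is left over from a previous run.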


Reading Textual Data


With <<, you can read files which contain Mathematica expressions given in input form. Sometimes, however, you may instead need to read files of data in other formats. For example, you may have data generated by an external program which consists of a sequence of numbers separated by spaces. This data cannot be read directly as Mathematica input. However, the function ReadList can take such data from a file or input stream, and convert it to a Mathematica list.
ReadList["file", Number]    read a sequence of numbers from a file, and put them in a Mathematica list

Reading numbers from a file.

Here is a file of numbers.
In[1]:= FilePrint["ExampleData/numbers"]

11.1 22.2 33.3
44.4 55.5 66.6

This reads all the numbers in the file, and returns a list of them.
In[2]:= ReadList["ExampleData/numbers", Number]

Out[2]= {11.1, 22.2, 33.3, 44.4, 55.5, 66.6}

ReadList["file", {Number, Number}]               read numbers from a file, putting each successive pair into a separate list
ReadList["file", Table[Number, {n}]]             put each successive block of n numbers in a separate list
ReadList["file", Number, RecordLists -> True]    put all the numbers on each line of the file into a separate list

Reading blocks of numbers.

This puts each successive pair of numbers from the file into a separate list.
In[3]:= ReadList["ExampleData/numbers", {Number, Number}]

Out[3]= {{11.1, 22.2}, {33.3, 44.4}, {55.5, 66.6}}


This makes each line in the file into a separate list.
In[4]:= ReadList["ExampleData/numbers", Number, RecordLists -> True]

Out[4]= {{11.1, 22.2, 33.3}, {44.4, 55.5, 66.6}}
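The Table[Number, {n}] form from the table above can be tried on a scratch copy of the same data. The file name numbers-demo.txt is hypothetical; the contents are written with WriteString so that the example does not depend on the ExampleData directory being available.

```mathematica
(* Hypothetical scratch file with two lines of three numbers each *)
file = FileNameJoin[{$TemporaryDirectory, "numbers-demo.txt"}];
s = OpenWrite[file];
WriteString[s, "11.1 22.2 33.3\n44.4 55.5 66.6\n"];
Close[s];

(* Read the numbers in successive blocks of three *)
blocks = ReadList[file, Table[Number, {3}]]
```

Here the blocks happen to coincide with the lines of the file, but the grouping is by count rather than by line, unlike RecordLists -> True.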

ReadList can handle numbers which are given in Fortran-like "E" notation. Thus, for example, ReadList will read 2.5E+5 as 2.5×10^5. Note that ReadList can handle numbers with any number of digits of precision.
Here is a file containing numbers in Fortran-like "E" notation.
In[5]:= FilePrint["ExampleData/bignum"]

4.5E-5 7.8E4
2.5E2 -8.9

ReadList can handle numbers in this form.
In[6]:= ReadList["ExampleData/bignum", Number]

Out[6]= {0.000045, 78000., 250., -8.9}

ReadList["file", type]       read a sequence of objects of a particular type
ReadList["file", type, n]    read at most n objects

Reading objects of various types.

ReadList can read not only numbers, but also a variety of other types of object. Each type of object is specified by a symbol such as Number.
Here is a file containing text.
In[7]:= FilePrint["ExampleData/strings"]

Here is text. 
And more text.

This produces a list of the characters in the file, each given as a one-character string.
In[8]:= ReadList["ExampleData/strings", Character]

Out[8]= {H, e, r, e,  , i, s,  , t, e, x, t, .,  , 
, A, n, d,  , m, o, r, e,  , t, e, x, t, ., 
}

Here are the integer codes corresponding to each of the bytes in the file.
In[9]:= ReadList["ExampleData/strings", Byte]

Out[9]= {72, 101, 114, 101, 32, 105, 115, 32, 116, 101, 120, 116, 46, 32, 10, 65, 110, 100, 32, 109, 111, 114, 101, 32, 116, 101, 120, 116, 46, 10}


This puts the data from each line in the file into a separate list.
In[10]:= ReadList["ExampleData/strings", Byte, RecordLists -> True]

Out[10]= {{72, 101, 114, 101, 32, 105, 115, 32, 116, 101, 120, 116, 46, 32}, {65, 110, 100, 32, 109, 111, 114, 101, 32, 116, 101, 120, 116, 46}}

Byte                single byte of data, returned as an integer
Character           single character, returned as a one-character string
Real                approximate number in Fortran-like notation
Number              exact or approximate number in Fortran-like notation
Word                sequence of characters delimited by word separators
Record              sequence of characters delimited by record separators
String              string terminated by a newline
Expression          complete Mathematica expression
Hold[Expression]    complete Mathematica expression, returned inside Hold

Types of objects to read.

This returns a list of the words in the file strings.
In[11]:= ReadList["ExampleData/strings", Word]

Out[11]= {Here, is, text., And, more, text.}

ReadList allows you to read words from a file. It considers a word to be any sequence of characters delimited by word separators. You can set the option WordSeparators to specify the strings you want to treat as word separators. The default is to include spaces and tabs, but not to include, for example, standard punctuation characters. Note that in all cases successive words can be separated by any number of word separators. These separators are never taken to be part of the actual words returned by ReadList .
option name         default value            description

RecordLists         False                    whether to make a separate list for the objects in each record
RecordSeparators    {"\r\n", "\n", "\r"}     separators for records
WordSeparators      {" ", "\t"}              separators for words
NullRecords         False                    whether to keep zero-length records
NullWords           False                    whether to keep zero-length words
TokenWords          {}                       words to take as tokens

Options for ReadList.


This reads the text in the file strings as a sequence of words, using the letter e and . as word separators.
In[12]:= ReadList["ExampleData/strings", Word, WordSeparators -> {"e", "."}]

Out[12]= {H, r,  is t, xt,  , And mor,  t, xt}

Mathematica considers any data file to consist of a sequence of records. By default, each line is considered to be a separate record. In general, you can set the option RecordSeparators to give a list of separators for records. Note that words can never cross record separators. As with word separators, any number of record separators can exist between successive records, and these separators are not considered to be part of the records themselves.
By default, each line of the file is considered to be a record.
In[13]:= ReadList["ExampleData/strings", Record] // InputForm

Out[13]//InputForm= {"Here is text. ", "And more text."}

Here is a file containing three sentences ending with periods.
In[14]:= FilePrint["ExampleData/sentences"]

Here is text. And more.
And a second line.

This allows both periods and newlines as record separators.
In[15]:= ReadList["ExampleData/sentences", Record, RecordSeparators -> {".", "\n"}]

Out[15]= {Here is text,  And more, And a second line}

This puts the words in each sentence into a separate list.
In[16]:= ReadList["ExampleData/sentences", Word, RecordLists -> True, RecordSeparators -> {".", "\n"}]

Out[16]= {{Here, is, text}, {And, more}, {And, a, second, line}}

ReadList["file", Record, RecordSeparators -> {}]                                read the whole of a file as a single string
ReadList["file", Record, RecordSeparators -> {{"lsep1", …}, {"rsep1", …}}]      make a list of those parts of a file which lie between the lsep_i and the rsep_i

Settings for the RecordSeparators option.


Here is a file containing some text.
In[17]:= FilePrint["ExampleData/source"]

f[x] (: function f :)
g[x] (: function g :)

This reads all the text in the file source, and returns it as a single string.
In[18]:= InputForm[ReadList["ExampleData/source", Record, RecordSeparators -> {}]]

Out[18]//InputForm= {"f[x] (: function f :)\ng[x] (: function g :)\n"}

This gives a list of the parts of the file that lie between (: and :) separators.
In[19]:= ReadList["ExampleData/source", Record, RecordSeparators -> {{"(: "}, {" :)"}}]

Out[19]= {function f, function g}

By choosing appropriate separators, you can pick out specific parts of files.
In[20]:= ReadList["ExampleData/source", Record, RecordSeparators -> {{"(: function ", "["}, {" :)", "]"}}]

Out[20]= {x, f, x, g}

Mathematica usually allows any number of appropriate separators to appear between successive records or words. Sometimes, however, when several separators are present, you may want to assume that a null record or null word appears between each pair of adjacent separators. You can do this by setting the options NullRecords -> True or NullWords -> True .
Here is a file containing words separated by colons.
In[21]:= FilePrint["ExampleData/words"]

first:second::fourth:::seventh

Here the repeated colons are treated as single separators.
In[22]:= ReadList["ExampleData/words", Word, WordSeparators -> {":"}]

Out[22]= {first, second, fourth, seventh}

Now repeated colons are taken to have null words in between.
In[23]:= ReadList["ExampleData/words", Word, WordSeparators -> {":"}, NullWords -> True]

Out[23]= {first, second, , fourth, , , seventh}
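Combining NullWords with RecordLists gives a simple way to read delimited text in which empty fields must keep their position. The following is a sketch only: the file name fields-demo.txt and its comma-separated contents are hypothetical, and quoting or escaping of separators is not handled.

```mathematica
(* Hypothetical scratch file with empty fields between commas *)
file = FileNameJoin[{$TemporaryDirectory, "fields-demo.txt"}];
s = OpenWrite[file];
WriteString[s, "a,b,,d\ne,,g,h\n"];
Close[s];

(* One list per line; adjacent commas yield empty strings *)
rows = ReadList[file, Word, WordSeparators -> {","},
  NullWords -> True, RecordLists -> True];

InputForm[rows]
```

With the default NullWords -> False the empty fields would be dropped, and the remaining fields in each row would shift left.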


In most cases, you want words to be delimited by separators which are not themselves considered as words. Sometimes, however, it is convenient to allow words to be delimited by special token words, which are themselves words. You can give a list of such token words as a setting for the option TokenWords .
Here is some text.
In[24]:= FilePrint["ExampleData/language"]

22*a*b+56*c+13*a*d

This reads the text, using the specified token words to delimit words in the text.
In[25]:= ReadList["ExampleData/language", Word, TokenWords -> {"+", "*"}]

Out[25]= {22, *, a, *, b, +, 56, *, c, +, 13, *, a, *, d}

You can use ReadList to read Mathematica expressions from files. In general, each expression must end with a newline, although a single expression may go on for several lines.
Here is a file containing text that can be used as Mathematica input.
In[26]:= FilePrint["ExampleData/exprs"]

x + y + z
2^8

This reads the text in exprs as Mathematica expressions.
In[27]:= ReadList["ExampleData/exprs", Expression]

Out[27]= {x + y + z, 256}

This prevents the expressions from being evaluated.
In[28]:= ReadList["ExampleData/exprs", Hold[Expression]]

Out[28]= {Hold[x + y + z], Hold[2^8]}

ReadList can insert the objects it reads into any Mathematica expression. The second argument to ReadList can consist of any expression containing symbols such as Number and Word specifying objects to read. Thus, for example, ReadList["file", {Number, Number}] inserts successive pairs of numbers that it reads into lists. Similarly, ReadList["file", Hold[Expression]] puts expressions that it reads inside Hold.

If ReadList reaches the end of your file before it has finished reading a particular set of objects you have asked for, then it inserts the special symbol EndOfFile in place of the objects it has not yet read.

Here is a file of numbers.
In[29]:= FilePrint["ExampleData/numbers"]

11.1 22.2 33.3
44.4 55.5 66.6

The symbol EndOfFile appears in place of numbers that were needed after the end of the file was reached.
In[30]:= ReadList["ExampleData/numbers", {Number, Number, Number, Number}]

Out[30]= {{11.1, 22.2, 33.3, 44.4}, {55.5, 66.6, EndOfFile, EndOfFile}}

ReadList["!command", type]    execute a command, and read its output
ReadList[stream, type]        read any input stream

Reading from commands and streams.

This executes the Unix command date, and reads its output as a string.
In[31]:= ReadList["!date", String]

Out[31]= {Thu Mar 31 19:20:36 CST 2005}

OpenRead["file"]         open a file for reading
OpenRead["!command"]     open a pipe for reading
Read[stream, type]       read an object of the specified type from a stream
Skip[stream, type]       skip over an object of the specified type in an input stream
Skip[stream, type, n]    skip over n objects of the specified type in an input stream
Close[stream]            close an input stream

Functions for reading from input streams.

ReadList allows you to read all the data in a particular file or input stream. Sometimes, however, you want to get data a piece at a time, perhaps doing tests to find out what kind of data to expect next.

When you read individual pieces of data from a file, Mathematica always remembers the current point that you are at in the file. When you call OpenRead , Mathematica sets up an input stream from a file, and makes your current point the beginning of the file. Every time you read an object from the file using Read , Mathematica sets your current point to be just after the object you have read. Using Skip , you can advance the current point past a sequence of objects without actually reading the objects.
Here is a file of numbers.
In[32]:= FilePrint["ExampleData/numbers"]

11.1 22.2 33.3
44.4 55.5 66.6

This opens an input stream from the file.
In[33]:= snum = OpenRead["ExampleData/numbers"]

Out[33]= InputStream[ExampleData/numbers, 66]

This reads the first number from the file.
In[34]:= Read[snum, Number]

Out[34]= 11.1

This reads the second pair of numbers.
In[35]:= Read[snum, {Number, Number}]

Out[35]= {22.2, 33.3}

This skips the next number.
In[36]:= Skip[snum, Number]

And this reads the remaining numbers.
In[37]:= ReadList[snum, Number]

Out[37]= {55.5, 66.6}

This closes the input stream.
In[38]:= Close[snum]

Out[38]= ExampleData/numbers

You can use the options WordSeparators and RecordSeparators in Read and Skip just as you do in ReadList. Note that if you try to read past the end of file, Read returns the symbol EndOfFile.
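As a minimal sketch of reading a piece at a time, the loop below calls Read repeatedly and stops when it returns EndOfFile. The stream is created with StringToStream (covered in "Searching and Reading Strings") only so that the sketch needs no external file; the sample data and the names s, x and values are illustrative assumptions.

```mathematica
(* Illustrative data: the six numbers from the example file, held in a string *)
s = StringToStream["11.1 22.2 33.3\n44.4 55.5 66.6"];

values = {};
(* Read one number at a time; Read returns EndOfFile once the data is exhausted *)
While[(x = Read[s, Number]) =!= EndOfFile,
  AppendTo[values, x]
];
Close[s];

values   (* {11.1, 22.2, 33.3, 44.4, 55.5, 66.6} *)
```

The same loop should work unchanged on a stream opened with OpenRead, and the body of the loop is the natural place to test each item before deciding what to read next.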

Searching Files
FindList["file", "text"]                   get a list of all the lines in the file that contain the specified text
FindList["file", "text", n]                get a list of the first n lines that contain the specified text
FindList["file", {"text1", "text2", …}]    get lines that contain any of the text_i

Finding lines that contain specified text.

Here is a file containing some text.
In[1]:= FilePrint["ExampleData/textfile"]

Here is the first line of text.
And the second.
And the third. Here is the end.

This returns a list of all the lines in the file containing the text is.
In[2]:= FindList["ExampleData/textfile", "is"]

Out[2]= {Here is the first line of text., And the third. Here is the end.}

The text fourth appears nowhere in the file.
In[3]:= FindList["ExampleData/textfile", "fourth"]

Out[3]= {}

By default, FindList scans successive lines of a file, and returns those lines which contain the text you specify. In general, however, you can get FindList to scan successive records, and return complete records which contain specified text. As in ReadList, the option RecordSeparators allows you to tell Mathematica what strings you want to consider as record separators. Note that by giving a pair of lists as the setting for RecordSeparators, you can specify different left and right separators. By doing this, you can make FindList search only for text which is between specific pairs of separators.
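The pair-of-lists setting can be sketched as follows. The input string, the choice of "<" and ">" as separators, and the names s and found are assumptions made for illustration; a string stream stands in for a file so that the sketch is self-contained.

```mathematica
(* Illustrative input held in a string so the sketch needs no external file *)
s = StringToStream["skip <keep this> skip <and this one> skip"];

(* Left separator "<", right separator ">": only text between them forms a record *)
found = FindList[s, "this", RecordSeparators -> {{"<"}, {">"}}];
Close[s];

found
```

Under these assumptions the result should contain only the two records enclosed by the separators, since the text outside them never forms part of a record.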


This finds all sentences ending with a period which contain And.
In[4]:= FindList["ExampleData/textfile", "And", RecordSeparators -> {"."}]

Out[4]= {
And the second,
And the third}

option name         default value
RecordSeparators    {"\n"}           separators for records
AnchoredSearch      False            whether to require the text searched for to be at the beginning of a record
WordSeparators      {" ", "\t"}      separators for words
WordSearch          False            whether to require that the text searched for appear as a word
IgnoreCase          False            whether to treat lowercase and uppercase letters as equivalent

Options for FindList.

This finds only the occurrence of Here which is at the beginning of a line in the file.
In[5]:= FindList["ExampleData/textfile", "Here", AnchoredSearch -> True]

Out[5]= {Here is the first line of text.}

In general, FindList finds text that appears anywhere inside a record. By setting the option WordSearch -> True, however, you can tell FindList to require that the text it is looking for appears as a separate word in the record. The option WordSeparators specifies the list of separators for words.

The text th does appear in the file, but not as a word. As a result, the FindList fails.
In[6]:= FindList["ExampleData/textfile", "th", WordSearch -> True]

Out[6]= {}

FindList[{"file1", "file2", …}, "text"]    search for occurrences of the text in any of the file_i

Searching in multiple files.

This searches for third in two copies of textfile.
In[7]:= FindList[{"ExampleData/textfile", "ExampleData/textfile"}, "third"]

Out[7]= {And the third. Here is the end., And the third. Here is the end.}


It is often useful to call FindList on lists of files generated by functions such as FileNames.
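A minimal sketch of that combination follows. The directory name, the "*.txt" pattern and the search text are hypothetical placeholders; only FileNames and FindList themselves come from the text above.

```mathematica
(* Hypothetical directory and pattern; adjust both for your own files *)
dir = FileNameJoin[{$HomeDirectory, "notes"}];
files = FileNames["*.txt", dir];

(* Pass the whole list of file names to FindList; skip the call if nothing matched *)
hits = If[files === {}, {}, FindList[files, "third"]]
```

The explicit check for an empty list is a defensive choice in this sketch rather than something the text requires.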

FindList["!command", …]    run an external command, and find text in its output

Finding text in the output from an external program.

This runs the external Unix command date in a text-based interface.
In[8]:= !date

Thu Mar 31 19:20:36 CST 2006

This finds the time-of-day field in the date.
In[9]:= FindList["!date", ":", RecordSeparators -> {" "}]

Out[9]= {19:20:36}

OpenRead["file"]        open a file for reading
OpenRead["!command"]    open a pipe for reading
Find[stream, text]      find the next occurrence of text
Close[stream]           close an input stream

Finding successive occurrences of text.

FindList works by making one pass through a particular file, looking for occurrences of the text you specify. Sometimes, however, you may want to search incrementally for successive occurrences of a piece of text. You can do this using Find. In order to use Find, you first explicitly have to open an input stream using OpenRead. Then, every time you call Find on this stream, it will search for the text you specify, and make the current point in the file be just after the record it finds. As a result, you can call Find several times to find successive pieces of text.

This opens an input stream for textfile.
In[10]:= stext = OpenRead["ExampleData/textfile"]

Out[10]= InputStream[ExampleData/textfile, 76]

This finds the first line containing And.
In[11]:= Find[stext, "And"]

Out[11]= And the second.

Calling Find again gives you the next line containing And.
In[12]:= Find[stext, "And"]

Out[12]= And the third. Here is the end.

This closes the input stream.
In[13]:= Close[stext]

Out[13]= ExampleData/textfile

Once you have an input stream, you can mix calls to Find, Skip and Read. If you ever call FindList or ReadList, Mathematica will immediately read to the end of the input stream.

This opens the input stream.
In[14]:= stext = OpenRead["ExampleData/textfile"]

Out[14]= InputStream[ExampleData/textfile, 77]

This finds the first line which contains second, and leaves the current point in the file at the beginning of the next line.
In[15]:= Find[stext, "second"]

Out[15]= And the second.

Read can then read the word that appears at the beginning of the line.
In[16]:= Read[stext, Word]

Out[16]= And

This skips over the next three words.
In[17]:= Skip[stext, Word, 3]

Mathematica finds is in the remaining text, and prints the entire record as output.
In[18]:= Find[stext, "is"]

Out[18]= And the third. Here is the end.

This closes the input stream.
In[19]:= Close[stext]

Out[19]= ExampleData/textfile


StreamPosition[stream]                 find the position of the current point in an open stream
SetStreamPosition[stream, n]           set the position of the current point
SetStreamPosition[stream, 0]           set the current point to the beginning of a stream
SetStreamPosition[stream, Infinity]    set the current point to the end of a stream

Finding and setting the current point in a stream.

Functions like Read, Skip and Find usually operate on streams in an entirely sequential fashion. Each time one of the functions is called, the current point in the stream moves on. Sometimes, you may need to know where the current point in a stream is, and be able to reset it. On most computer systems, StreamPosition returns the position of the current point as an integer giving the number of bytes from the beginning of the stream.

This opens the stream.
In[20]:= stext = OpenRead["ExampleData/textfile"]

Out[20]= InputStream[ExampleData/textfile, 78]

When you first open the file, the current point is at the beginning, and StreamPosition returns 0.
In[21]:= StreamPosition[stext]

Out[21]= 0

This reads the first line in the file.
In[22]:= Read[stext, Record]

Out[22]= Here is the first line of text.

Now the current point has advanced.
In[23]:= StreamPosition[stext]

Out[23]= 31

This sets the stream position back.
In[24]:= SetStreamPosition[stext, 5]

Out[24]= 5

Now Read returns the remainder of the first line.
In[25]:= Read[stext, Record]

Out[25]= is the first line of text.

This closes the stream.
In[26]:= Close[stext]

Out[26]= ExampleData/textfile
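One use of these two functions is to look ahead in a stream without consuming anything. The sketch below assumes a helper named peekRecord, which is hypothetical and not a built-in function; it saves the current point, reads one record, and then restores the current point. A string stream (see "Searching and Reading Strings") is used so that no external file is needed.

```mathematica
(* peekRecord is a hypothetical helper, not a built-in function *)
peekRecord[stream_] := Module[{pos, rec},
  pos = StreamPosition[stream];      (* remember the current point *)
  rec = Read[stream, Record];        (* read ahead one record *)
  SetStreamPosition[stream, pos];    (* restore the current point *)
  rec
]

s = StringToStream["first line\nsecond line"];
peeked = peekRecord[s];          (* "first line" *)
actual = Read[s, Record];        (* "first line" again, since the peek did not advance *)
Close[s];

{peeked, actual}
```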

Searching and Reading Strings


Functions like Read and Find are most often used for processing text and data from external files. In some cases, however, you may find it convenient to use these same functions to process strings within Mathematica. You can do this by using the function StringToStream, which opens an input stream that takes characters not from an external file, but instead from a Mathematica string.

StringToStream["string"]    open an input stream for reading from a string
Close[stream]               close an input stream

Treating strings as input streams.

This opens an input stream for reading from the string.
In[1]:= str = StringToStream["A string of words."]

Out[1]= InputStream[String, 27]

This reads the first word from the string.
In[2]:= Read[str, Word]

Out[2]= A

This reads the remaining words from the string.
In[3]:= ReadList[str, Word]

Out[3]= {string, of, words.}

This closes the input stream.
In[4]:= Close[str]

Out[4]= String

Input streams associated with strings work just like those with files. At any given time, there is a current position in the stream, which advances when you use functions like Read. The current position is given as the number of characters from the beginning of the string by the function StreamPosition[stream]. You can explicitly set the current position using SetStreamPosition[stream, n].

Here is an input stream associated with a string.
In[5]:= str = StringToStream["123 456 789"]

Out[5]= InputStream[String, 28]

The current position is initially 0 characters from the beginning of the string.
In[6]:= StreamPosition[str]

Out[6]= 0

This reads a number from the stream.
In[7]:= Read[str, Number]

Out[7]= 123

The current position is now 3 characters from the beginning of the string.
In[8]:= StreamPosition[str]

Out[8]= 3

This sets the current position to be 1 character from the beginning of the string.
In[9]:= SetStreamPosition[str, 1]

Out[9]= 1

If you now read a number from the string, you get the 23 part of 123.
In[10]:= Read[str, Number]

Out[10]= 23

This sets the current position to the end of the string.
In[11]:= SetStreamPosition[str, Infinity]

Out[11]= 11

If you now try to read from the stream, you will always get EndOfFile.
In[12]:= Read[str, Number]

Out[12]= EndOfFile

This closes the stream.
In[13]:= Close[str]

Out[13]= String

Particularly when you are processing large volumes of textual data, it is common to read fairly long strings into Mathematica, then to use StringToStream to allow further processing of these strings within Mathematica. Once you have created an input stream using StringToStream, you can read and search the string using any of the functions discussed for files.

This puts the whole contents of textfile into a string.
In[14]:= s = First[ReadList["ExampleData/textfile", Record, RecordSeparators -> {}]]

Out[14]= Here is the first line of text.
And the second.
And the third. Here is the end.

This opens an input stream for the string.
In[15]:= str = StringToStream[s]

Out[15]= InputStream[String, 30]

This gives the lines of text in the string that contain is.
In[16]:= FindList[str, "is"]

Out[16]= {Here is the first line of text., And the third. Here is the end.}

This resets the current position back to the beginning of the string.
In[17]:= SetStreamPosition[str, 0]

Out[17]= 0

This finds the first occurrence of the in the string, and leaves the current point just after it.
In[18]:= Find[str, "the", RecordSeparators -> {" "}]

Out[18]= the

This reads the word which appears immediately after the.
In[19]:= Read[str, Word]

Out[19]= first

This closes the input stream.
In[20]:= Close[str]

Out[20]= String

Binary Files
Functions like Read and Write handle ordinary printable text. But in dealing with external data files or devices it is sometimes necessary to go to a lower level, and work directly with raw binary data. You can do this using BinaryRead and BinaryWrite.

BinaryRead[stream]                                     read one byte
BinaryRead[stream, type]                               read an object of the specified type
BinaryRead[stream, {type1, type2, …}]                  read a list of objects
BinaryWrite[stream, b]                                 write one byte
BinaryWrite[stream, {b1, b2, …}]                       write a sequence of bytes
BinaryWrite[stream, "string"]                          write the characters in a string
BinaryWrite[stream, x, type]                           write an object of the specified type
BinaryWrite[stream, {x1, x2, …}, type]                 write a sequence of objects
BinaryWrite[stream, {x1, x2, …}, {type1, type2, …}]    write objects of different types

Reading and writing binary data.

"Byte"                   8-bit unsigned integer
"Character8"             8-bit character
"Character16"            16-bit character
"Complex64"              IEEE single-precision complex number
"Complex128"             IEEE double-precision complex number
"Complex256"             IEEE quad-precision complex number
"Integer8"               8-bit signed integer
"Integer16"              16-bit signed integer
"Integer32"              32-bit signed integer
"Integer64"              64-bit signed integer
"Integer128"             128-bit signed integer
"Real32"                 IEEE single-precision real number
"Real64"                 IEEE double-precision real number
"Real128"                IEEE quad-precision real number
"TerminatedString"       null-terminated string of 8-bit characters
"UnsignedInteger8"       8-bit unsigned integer
"UnsignedInteger16"      16-bit unsigned integer
"UnsignedInteger32"      32-bit unsigned integer
"UnsignedInteger64"      64-bit unsigned integer
"UnsignedInteger128"     128-bit unsigned integer

Types supported in BinaryRead and BinaryWrite.

This writes a sequence of bytes to a file.
In[1]:= BinaryWrite["tmp", {97, 98, 99, 100, 101}]

Out[1]= tmp

BinaryWrite automatically opens a stream for the file. This closes it.
In[2]:= Close["tmp"];

This reads the first byte from the file, returning it as an integer.
In[3]:= BinaryRead["tmp"]

Out[3]= 97

This reads the second 8 bits in the file as a character.
In[4]:= BinaryRead["tmp", "Character8"]

Out[4]= b

This reads the next 32 bits as a 32-bit integer.
In[5]:= BinaryRead["tmp", "Integer32"]

Out[5]= EndOfFile

Like Read and Write, BinaryRead and BinaryWrite work with streams. But if you give a file name, they automatically open the specified file as a stream. To create a stream directly you can use OpenRead or OpenWrite. On some computer systems, the option setting BinaryFormat -> True is required to be used for any stream with BinaryRead and BinaryWrite, in order to prevent possible corruption from such issues as newline translation.

In using Mathematica you are normally completely insulated from the raw representation of data inside your computer. But with BinaryRead and BinaryWrite this is no longer so. One of the subtleties that then arises is that different computers may take the bytes that make up numbers to be in different orders, as specified by their setting for $ByteOrdering.

This writes a 32-bit integer to a file.
In[6]:= BinaryWrite["tmp2", 45671, "Integer32"]

Out[6]= tmp2

This closes the file.
In[7]:= Close["tmp2"];

This reads the integer back, but assumes an opposite byte ordering.
In[8]:= BinaryRead["tmp2", "Integer32", ByteOrdering -> -$ByteOrdering]

Out[8]= 1739718656
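To make the effect of byte ordering concrete, the following sketch fixes the ordering explicitly instead of relying on $ByteOrdering, so the bytes are the same on any machine. The scratch file name is an arbitrary assumption; ByteOrdering -> -1 means least significant byte first and ByteOrdering -> 1 means most significant byte first.

```mathematica
(* Scratch file in the temporary directory; the name is an arbitrary choice *)
file = FileNameJoin[{$TemporaryDirectory, "byteorder-sketch.bin"}];

(* Write 45671 (hexadecimal B267) as a 32-bit integer, least significant byte first *)
BinaryWrite[file, 45671, "Integer32", ByteOrdering -> -1];
Close[file];

bytes = BinaryReadList[file, "Byte"];                              (* {103, 178, 0, 0} *)
same = BinaryReadList[file, "Integer32", ByteOrdering -> -1];      (* {45671} *)
swapped = BinaryReadList[file, "Integer32", ByteOrdering -> 1];    (* {1739718656} *)

DeleteFile[file];
{bytes, same, swapped}
```

Reading the four bytes in the opposite order gives 103*2^24 + 178*2^16 = 1739718656, which is the value shown in Out[8].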

BinaryReadList["file"]                        read all the bytes in a file
BinaryReadList["file", type]                  read all the data, treating it as objects of a certain type
BinaryReadList["file", {type1, type2, …}]     treat the data as objects of a sequence of types
BinaryReadList["file", types, n]              read only the first n objects

Reading complete binary files.

This writes out a 128-bit real number.
In[9]:= BinaryWrite["tmp3", 5.67891, "Real128"]

Out[9]= tmp3

This reads back the bytes in the number.
In[10]:= BinaryReadList["tmp3", "Byte"]

Out[10]= {0, 0, 0, 0, 0, 0, 0, 224, 89, 187, 237, 66, 115, 107, 1, 64}

This reads back the bytes as a sequence of 32-bit real numbers.
In[11]:= BinaryReadList["tmp3", "Real32"]

Out[11]= {0., -3.68935×10^19, 118.866, 2.02218}

This treats the data as pairs containing a byte and a 32-bit real.
In[12]:= BinaryReadList["tmp3", {"Byte", "Real32"}]

Out[12]= {{0, 0.}, {0, -0.00332451}, {237, 4.32454×10^-38}, {64, EndOfFile}}

BinaryRead and BinaryWrite allow complete flexibility in reading and writing raw binary data. But in many practical applications one instead wants to work only with particular predefined formats. You can do this using Import and Export. In addition to many complex formats, Import and Export support files containing sequences of identical data elements, of the same types as in BinaryRead and BinaryWrite. They also support the "Bit" format, consisting of individual binary bits, represented as 0 or 1.
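A minimal sketch of this use of Import and Export follows, assuming a scratch file in the temporary directory (the name is arbitrary) and "Integer16" as the element type. The same file is then read again in the "Bit" format.

```mathematica
(* Scratch file; the name is an arbitrary choice *)
file = FileNameJoin[{$TemporaryDirectory, "ints-sketch.bin"}];

(* Write three numbers as a sequence of identical 16-bit integer elements *)
Export[file, {1, 2, 300}, "Integer16"];

data = Import[file, "Integer16"];   (* the same three integers *)
bits = Import[file, "Bit"];         (* one list element, 0 or 1, per binary bit *)

DeleteFile[file];
{data, Union[bits]}
```

Because every element has the same type, no type list has to be supplied element by element, as it would with BinaryRead.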

Generating C and Fortran Expressions


If you have special-purpose programs written in C or Fortran, you may want to take formulas you have generated in Mathematica and insert them into the source code of your programs. Mathematica allows you to convert mathematical expressions into C and Fortran expressions.
CForm[expr]          write out expr so it can be used in a C program
FortranForm[expr]    write out expr for Fortran

Mathematica output for programming languages.

Here is an expression, written out in standard Mathematica form.
In[1]:= Expand[(1 + x + y)^2]

Out[1]= 1 + 2 x + x^2 + 2 y + 2 x y + y^2

Here is the expression in Fortran form.
In[2]:= FortranForm[%]

Out[2]//FortranForm=
1 + 2*x + x**2 + 2*y + 2*x*y + y**2

Here is the same expression in C form. Macros for objects like Power are defined in the C header file mdefs.h that comes with most versions of Mathematica.
In[3]:= CForm[%]

Out[3]//CForm=
1 + 2*x + Power(x,2) + 2*y + 2*x*y + Power(y,2)

You should realize that there are many differences between Mathematica and C or Fortran. As a result, expressions you translate may not work exactly the same as they do in Mathematica. In addition, there are so many differences in programming constructs that no attempt is made to translate these automatically.
Compile[x, expr]    compile an expression into efficient internal code

A way to compile Mathematica expressions.

One of the common motivations for converting Mathematica expressions into C or Fortran is to try to make them faster to evaluate numerically. But the single most important reason that C and Fortran can potentially be more efficient than Mathematica is that in these languages one always specifies up front what type each variable one uses will be: integer, real number, array, and so on. The Mathematica function Compile makes such assumptions within Mathematica, and generates highly efficient internal code. Usually this code runs not much if at all slower than custom C or Fortran.
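As a short sketch of the idea, the compiled function below declares its argument to be a machine real, which is the kind of up-front type assumption described above. The polynomial and the name cf are illustrative choices.

```mathematica
(* Declare x as a machine real so that Compile can fix the type in advance *)
cf = Compile[{{x, _Real}}, 1 + 2 x + x^2];

compiled = cf[2.0];                        (* 9. *)

(* The ordinary, uncompiled expression gives the same value *)
direct = (1 + 2 x + x^2) /. x -> 2.0;      (* 9. *)

{compiled, direct}
```

When no type is given for an argument, Compile assumes a real number, so Compile[{x}, 1 + 2 x + x^2] would behave the same way here.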

Splicing Mathematica Output into External Files


If you want to make use of Mathematica output in an external file such as a program or document, you will often find it useful to splice the output automatically into the file.


Splice["file.mx"]              splice Mathematica output into an external file named file.mx, putting the results in the file file.x
Splice["infile", "outfile"]    splice Mathematica output into infile, sending the output to outfile

Splicing Mathematica output into files.

The basic idea is to set up the definitions you need in a particular Mathematica session, then run Splice to use the definitions you have made to produce the appropriate output to insert into the external files.

    #include "mdefs.h"

    double f(x)
    double x;
    {
    double y;

    y = <* Integrate[Sin[x]^5, x] *> ;

    return(2*y - 1) ;
    }

A simple C program containing a Mathematica formula.

    #include "mdefs.h"

    double f(x)
    double x;
    {
    double y;

    y = -5*Cos(x)/8 + 5*Cos(3*x)/48 - Cos(5*x)/80 ;

    return(2*y - 1) ;
    }

The C program after processing with Splice.


Importing and Exporting


Importing and Exporting Data
Import["file", "Table"]          import a table of data from a file
Export["file", list, "Table"]    export list to a file as a table of data

Importing and exporting tabular data.

This exports an array of numbers to the file out.dat.
In[1]:= Export["out.dat", {{5.7, 4.3}, {-1.2, 7.8}}]

Out[1]= out.dat

Here are the contents of the file out.dat.
In[2]:= FilePrint["out.dat"]

5.7 4.3
-1.2 7.8

This imports the contents of out.dat as a table of data.
In[3]:= Import["out.dat", "Table"]

Out[3]= {{5.7, 4.3}, {-1.2, 7.8}}

Import["file", "Table"] will handle many kinds of tabular data, automatically deducing the details of the format whenever possible. Export["file", list, "Table"] writes out data separated by spaces, with numbers given in C or Fortran-like form, as in 2.3E5 and so on.

Import["name.ext"]          import data assuming a format deduced from the file name
Export["name.ext", expr]    export data in a format deduced from the file name

Importing and exporting general data.

table formats               "CSV", "TSV", "XLS"
matrix formats              "HarwellBoeing", "MAT", "MTX"
specialized data formats    "DIF", "FITS", "HDF5", "MPS", "SDTS", etc.

Some common formats for tabular data.


Import and Export can handle not only tabular data, but also data corresponding to graphics, sounds, expressions and even whole documents. Import and Export can often deduce the appropriate format for data simply by looking at the extension of the file name for the file in which the data is being stored. "Exporting Graphics and Sounds" and "Importing and Exporting Files" discuss in more detail how Import and Export work. Note that you can also use Import and Export to manipulate raw files of binary data.
This imports a graphic in JPEG format.
In[4]:= Import["ExampleData/turtle.jpg"]

Out[4]= [graphic]

$ImportFormats    import formats supported on your system
$ExportFormats    export formats supported on your system

Finding the complete list of supported import and export formats.
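Since both variables are ordinary lists of strings, they can be queried with the usual list functions. The sketch below is illustrative only; the format name "CSV" and the variable names are arbitrary choices, and the results depend on your system.

```mathematica
(* Is a particular format available for import on this system? *)
csvOK = MemberQ[$ImportFormats, "CSV"];

(* Formats that can be both imported and exported *)
both = Intersection[$ImportFormats, $ExportFormats];

{csvOK, Length[both]}
```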

Importing and Exporting Files


Import["file", "List"]           import a one-dimensional list of data from a file
Export["file", list, "List"]     export list to a file as a one-dimensional list of data
Import["file", "Table"]          import a two-dimensional table of data from a file
Export["file", list, "Table"]    export list to a file as a two-dimensional table of data
Import["file", "CSV"]            import data in comma-separated format
Export["file", list, "CSV"]      export data in comma-separated format

Importing and exporting lists and tables of data.

This exports a list of data to the file out1.
In[1]:= Export["out1", {6.7, 8.5, -5.3}, "List"]

Out[1]= out1

Here are the contents of the file.
In[2]:= FilePrint["out1"]

6.7
8.5
-5.3

This imports the contents back into Mathematica.
In[3]:= Import["out1", "List"]

Out[3]= {6.7, 8.5, -5.3}

If you want to use data purely within Mathematica, then the best way to keep it in a file is usually as a complete Mathematica expression, with all its structure preserved, as discussed in "Reading and Writing Mathematica Files: Files and Streams". But if you want to exchange data with other programs, it is often more convenient to have the data in a simple list or table format.
This exports a two-dimensional array of data.
In[4]:= Export["out2.dat", {{5.6 10^12, 7.2 10^12}, {3, 5}}, "Table"]

Out[4]= out2.dat

When necessary, numbers are written in C or Fortran-like "E" notation.
In[5]:= FilePrint["out2.dat"]

5.6e12 7.2e12
3 5

This imports the array back into Mathematica.
In[6]:= Import["out2.dat", "Table"]

Out[6]= {{5.6×10^12, 7.2×10^12}, {3, 5}}

If you have a file in which each line consists of a single number, then you can use Import["file", "List"] to import the contents of the file as a list of numbers. If each line consists of a sequence of numbers separated by tabs or spaces, then Import["file", "Table"] will yield a list of lists of numbers. If the file contains items that are not numbers, then these are returned as Mathematica strings.

This exports a mixture of textual and numerical data.
In[7]:= Export["out3.dat", {{"first", 3.4}, {"second", 7.8}}]

Out[7]= out3.dat

Here is the exported data.
In[8]:= FilePrint["out3.dat"]

first 3.4
second 7.8

This imports the data back into Mathematica.
In[9]:= Import["out3.dat", "Table"]

Out[9]= {{first, 3.4}, {second, 7.8}}

With InputForm, you can explicitly see the strings.
In[10]:= InputForm[%]

Out[10]//InputForm= {{"first", 3.4}, {"second", 7.8}}

Import["file", "List"]               treat each line as a separate numerical or other data item
Import["file", "Table"]              treat each element on each line as a separate numerical or other data item
Import["file", "String"]             treat the whole file as a single character string
Import["file", "Text"]               treat the whole file as a single string of text
Import["file", {"Text", "Lines"}]    treat each line as a string of text
Import["file", {"Text", "Words"}]    treat each separated word as a string of text

Importing files in different formats.

This creates a file with two lines of text.
In[11]:= Export["out4.txt", {"The first line.", "The second line."}, {"Text", "Lines"}]

Out[11]= out4.txt

Here are the contents of the file.
In[12]:= FilePrint["out4.txt"]

The first line.
The second line.

This imports the whole file as a single string.
In[13]:= Import["out4.txt", "Text"] // InputForm

Out[13]//InputForm= "The first line.\nThe second line."

This imports the file as a list of lines of text.
In[14]:= Import["out4.txt", {"Text", "Lines"}] // InputForm

Out[14]//InputForm= {"The first line.", "The second line."}

This imports the file as a list of words separated by white space.
In[15]:= Import["out4.txt", {"Text", "Words"}] // InputForm

Out[15]//InputForm= {"The", "first", "line.", "The", "second", "line."}
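The "CSV" format listed at the start of this section works in the same way as "List" and "Table". The following is a minimal round-trip sketch; the scratch file name is an arbitrary assumption, and the data is the mixed textual and numerical example used earlier.

```mathematica
(* Scratch file; the name is an arbitrary choice *)
file = FileNameJoin[{$TemporaryDirectory, "csv-sketch.csv"}];

(* Export mixed text and numbers in comma-separated form, then read it back *)
Export[file, {{"first", 3.4}, {"second", 7.8}}, "CSV"];
back = Import[file, "CSV"];

DeleteFile[file];
InputForm[back]
```

As with "Table", items that are not numbers come back as Mathematica strings, which InputForm makes visible.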

Exporting Graphics and Sounds


Mathematica allows you to export graphics and sounds in a wide variety of formats. If you use the notebook front end for Mathematica, then you can typically just copy and paste graphics and sounds directly into other programs using the standard mechanism available on your computer system.
Export["name.ext", graphics]              export graphics to a file in a format deduced from the file name
Export["file", graphics, "format"]        export graphics in the specified format
Export["!command", graphics, "format"]    export graphics to an external command
Export["file", {g1, g2, …}, …]            export a sequence of graphics for an animation
ExportString[graphics, "format"]          generate a string representation of exported graphics

Exporting Mathematica graphics and sounds.

"EPS"      Encapsulated PostScript (.eps)
"PDF"      Adobe Acrobat portable document format (.pdf)
"SVG"      Scalable Vector Graphics (.svg)
"PICT"     Macintosh PICT
"WMF"      Windows metafile format (.wmf)

"TIFF"     TIFF (.tif, .tiff)
"GIF"      GIF and animated GIF (.gif)
"JPEG"     JPEG (.jpg, .jpeg)
"PNG"      PNG format (.png)
"BMP"      Microsoft bitmap format (.bmp)
"PCX"      PCX format (.pcx)
"XBM"      X window system bitmap (.xbm)
"PBM"      portable bitmap format (.pbm)
"PPM"      portable pixmap format (.ppm)
"PGM"      portable graymap format (.pgm)
"PNM"      portable anymap format (.pnm)
"DICOM"    DICOM medical imaging format (.dcm, .dic)
"AVI"      Audio Video Interleave format (.avi)

Typical graphics formats supported by Mathematica. Formats in the first group are resolution independent.

This generates a plot.
In[1]:= Plot[Sin[x] + Sin[Sqrt[2] x], {x, 0, 10}]

Out[1]= [graphic]

This exports the plot to a file in Encapsulated PostScript format.
In[2]:= Export["sinplot.eps", %]

Out[2]= sinplot.eps

When you export a graphic outside of Mathematica, you usually have to specify the absolute size at which the graphic should be rendered. You can do this using the ImageSize option to Export. ImageSize -> x makes the width of the graphic be x printer's points; ImageSize -> 72 x_i thus makes the width x_i inches. The default is to produce an image that is four inches wide. ImageSize -> {x, y} scales the graphic so that it fits in an x×y region.

ImageSize                Automatic    absolute image size in printer's points
"ImageTopOrientation"    Top          how the image is oriented in the file
ImageResolution          Automatic    resolution in dpi for the image

Options for Export.

Within Mathematica, graphics are manipulated in a way that is completely independent of the resolution of the computer screen or other output device on which the graphics will eventually be rendered. Many programs and devices accept graphics in resolution-independent formats such as Encapsulated PostScript (EPS). But some require that the graphics be converted to rasters or bitmaps with a specific resolution. The ImageResolution option for Export allows you to determine what resolution in dots per inch (dpi) should be used. The lower you set this resolution, the lower the quality of the image you will get, but also the less memory the image will take to store. For screen display, typical resolutions are 72 dpi and above; for printers, 300 dpi and above.
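As a hedged sketch of how the two options combine for a raster format, the following exports the plot used earlier as a PNG that is four inches wide at 300 dpi. The scratch file name and the particular option values are assumptions made for illustration.

```mathematica
plot = Plot[Sin[x] + Sin[Sqrt[2] x], {x, 0, 10}];

(* Scratch file; the name is an arbitrary choice *)
file = FileNameJoin[{$TemporaryDirectory, "sinplot-sketch.png"}];

(* 288 printer's points = 4 inches wide, rasterized at 300 dots per inch *)
Export[file, plot, ImageSize -> 288, ImageResolution -> 300];

(* Pixel dimensions of the exported raster *)
pixels = Import[file, "ImageSize"]
```

Lowering ImageResolution while keeping ImageSize fixed should give a raster with fewer pixels and a smaller file, in line with the trade-off described above.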
"DXF"    AutoCAD drawing interchange format (.dxf)
"STL"    STL stereolithography format (.stl)

Typical 3D geometry formats supported by Mathematica.

"WAV"     Microsoft wave format (.wav)
"AU"      μ law encoding (.au)
"SND"     sound file format (.snd)
"AIFF"    AIFF format (.aif, .aiff)

Typical sound formats supported by Mathematica.

Generating and Importing TeX


Mathematica notebooks provide a sophisticated environment for creating technical documents. But particularly if you want to merge your work with existing material in TeX, you may find it convenient to use TeXForm to convert expressions in Mathematica into a form suitable for input to TeX.
TeXForm[expr]    print expr in TeX input form

Mathematica output for TeX.

Here is an expression, printed in standard Mathematica form.
In[1]:= (x + y)^2/Sqrt[x y]

Out[1]= (x + y)^2/Sqrt[x y]

Here is the expression in TeX input form.
In[2]:= TeXForm[%]

Out[2]//TeXForm=
\frac{(x+y)^2}{\sqrt{x y}}

ToExpression["input", TeXForm]    convert TeX input to Mathematica

Converting TeX strings to Mathematica.

This converts a TeX string to Mathematica. Note the double backslashes needed in the string.
In[3]:= ToExpression["\\sqrt{x y}", TeXForm]

Out[3]= Sqrt[x y]

In addition to being able to convert individual expressions to TeX, Mathematica also provides capabilities for translating complete notebooks. These capabilities can usually be accessed from the File > Save As... menu in the notebook front end.


Exchanging Material with the Web


Export["file.html", nb]    save the notebook nb in HTML form

Converting notebooks to HTML.

Export has many options applying to HTML export that allow you to specify how notebooks should be converted for web browsers with different capabilities.
MathMLForm[expr]                      print expr in MathML form
MathMLForm[StandardForm[expr]]        use StandardForm rather than traditional mathematical notation
ToExpression["string", MathMLForm]    interpret a string of MathML as Mathematica input

Converting to and from MathML.

Here is an expression printed in MathML form.

In[1]:= MathMLForm[x^2/z]

Out[1]//MathMLForm=
<math>
 <mfrac>
  <msup>
   <mi>x</mi>
   <mn>2</mn>
  </msup>
  <mi>z</mi>
 </mfrac>
</math>

If you paste MathML into a Mathematica notebook, Mathematica will automatically try to convert it to Mathematica input. You can copy an expression from a notebook as MathML using the Copy As menu in the notebook front end.
Export["file.xml", expr]         export in XML format
Import["file.xml"]               import from XML
ImportString["string", "XML"]    import data from a string of XML

XML importing and exporting.

Somewhat like Mathematica expressions, XML is a general format for representing data. Mathematica automatically converts certain types of expressions to and from specific types of XML. MathML is one example. Another example is SVG for graphics.

If you ask Mathematica to import a generic piece of XML, it will produce a SymbolicXML expression. Each XML element of the form <elem attr='val'>data</elem> is translated to a Mathematica SymbolicXML expression of the form XMLElement["elem", {"attr" -> "val"}, {data}]. Once you have imported a piece of XML as SymbolicXML, you can use Mathematica's powerful symbolic programming capabilities to manipulate the expression you get. You can then use Export to export the result in XML form.
This generates a SymbolicXML expression, with an XMLElement representing the a element in the XML string.

In[2]:= ImportString["<a aa='va'>s</a>", "XML"]

Out[2]= XMLObject[Document][{}, XMLElement[a, {aa -> va}, {s}], {}]

There are now two nested levels in the SymbolicXML.

In[3]:= ImportString["<a><b bb='1'>ss</b><b bb='2'>ss</b></a>", "XML"]

Out[3]= XMLObject[Document][{}, XMLElement[a, {}, {XMLElement[b, {bb -> 1}, {ss}], XMLElement[b, {bb -> 2}, {ss}]}], {}]

This does a simple transformation on the SymbolicXML.

In[4]:= % /. "ss" -> XMLElement["c", {}, {"xx"}]

Out[4]= XMLObject[Document][{}, XMLElement[a, {}, {XMLElement[b, {bb -> 1}, {XMLElement[c, {}, {xx}]}], XMLElement[b, {bb -> 2}, {XMLElement[c, {}, {xx}]}]}], {}]

This shows the result as an XML string.

In[5]:= ExportString[%, "XML"]

Out[5]= <a>
         <b bb='1'>
          <c>xx</c>
         </b>
         <b bb='2'>
          <c>xx</c>
         </b>
        </a>
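Because SymbolicXML is an ordinary expression, pattern matching can also be used to pull data out of it. The following is a small sketch under the assumption that attributes appear as rules with string keys, as in the XMLElement form described above; the names xml and values are illustrative.

```mathematica
(* Sketch: collect the bb attribute of every b element. Names are illustrative. *)
xml = ImportString["<a><b bb='1'>ss</b><b bb='2'>ss</b></a>", "XML"];

(* Match XMLElement["b", attributes, contents] at any depth and
   return the value attached to the "bb" attribute *)
values = Cases[xml,
   XMLElement["b", {___, "bb" -> v_, ___}, _] :> v,
   Infinity]
```

For this input the result should be the list {"1", "2"}. Attribute values stay strings, so ToExpression is needed if numbers are wanted.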

Import["http://url", ...]    import a file from any accessible URL
Import["ftp://url", ...]     import a file from an FTP server

Importing data from web sources.

This imports a picture from a website.

In[6]:= Import["http://reference.wolfram.com/mathematica/ExampleData/ocelot.jpg"]


Image Processing
Mathematica now provides built-in support for both programmatic and interactive image processing, fully integrated with Mathematica's powerful mathematical and algorithmic capabilities. You can create and import images, manipulate them with built-in functions, apply linear and nonlinear filters to them, and visualize them in any number of ways.

Image Creation and Representation


Images can be created from numerical arrays, from Mathematica graphics via cut-and-paste methods, and from external sources via Import.
Image[data]       raster image with pixel values given by data
Import["file"]    import data from a file

Image creation functions.

The simplest method for creating an image object is to wrap Image around a matrix of real values ranging from 0 to 1.
Here is a one-channel image created from a matrix of numbers.

In[1]:= Image[{{0., 1., 0.}, {1., 0., 1.}, {0., 1., 0.}}]

Out[1]= (image)

You can also copy and paste or drag and drop an image from other applications. You can use Import to obtain an image from a file on the local file system or any accessible remote location.


This imports an image from the Mathematica documentation directory ExampleData.

In[10]:= i = Import["ExampleData/ocelot.jpg"]

Out[10]= (image)

Useful properties of an image can be obtained by calling the following functions.

ImageDimensions[image]    give the pixel dimensions of the raster associated with image
ImageChannels[image]      give the number of channels present in the data for image
ImageType[image]          give the type of values used for each pixel element in image
ImageQ[image]             give True if image has the form of a valid Image object and False otherwise
Options[symbol]           give the list of default options assigned to a symbol
ImageData[image]          give the array of pixel values in image

Image properties.

This returns the image dimensions.

In[11]:= ImageDimensions[i]

Out[11]= {200, 200}

Here is the setting of the ColorSpace option.

In[12]:= Options[i, ColorSpace]

Out[12]= {ColorSpace -> Grayscale}

The image's array of pixel values can be easily extracted using the function ImageData. By default, the function returns real values, but you can ask for a specific type using the optional "type" argument.


This returns a fragment of the image as a matrix of real values scaled to the range 0 to 1.

In[14]:= ImageData[ImageTake[i, {94, 97}, {54, 59}]] // MatrixForm

Out[14]//MatrixForm=
  0.772549   0.392157    0.0627451   0.203922   0.352941   0.372549
  0.560784   0.          0.164706    0.415686   0.415686   0.458824
  0.278431   0.0352941   0.286275    0.435294   0.427451   0.368627
  0.184314   0.0666667   0.32549     0.443137   0.54902    0.701961

Here is the same fragment as a matrix of integers in the range 0 to 255.

In[13]:= ImageData[ImageTake[i, {94, 97}, {54, 59}], "Byte"] // MatrixForm

Out[13]//MatrixForm=
  197   100    16    52    90    95
  143     0    42   106   106   117
   71     9    73   111   109    94
   47    17    83   113   140   179
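The two outputs are related by a factor of 255: each real value is the corresponding byte divided by 255. The following sketch checks this on a tiny synthetic image rather than on the example image; the names img, real, and bytes are illustrative.

```mathematica
(* Sketch: compare the real and "Byte" views of the same pixel data. *)
img = Image[{{0, 128, 255}}, "Byte"];   (* 1 x 3 single-channel byte image *)

real = ImageData[img];            (* reals in the range 0 to 1 *)
bytes = ImageData[img, "Byte"];   (* integers in the range 0 to 255 *)

(* Scaling the reals by 255 and rounding recovers the byte values *)
Round[255 real] == bytes
```

The comparison should return True, which is also why 197 in the byte matrix corresponds to 0.772549 in the real-valued one.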

In the case of multichannel images, the raw pixel data is represented by a 3D array arranged in one of two possible ways as determined by the option Interleaving.

This imports a color image.

In[1]:= i = Import["ExampleData/lena.tif"]

Out[1]= (image)

With the default setting Interleaving -> True, the data is organized as a 2D array of lists of color values, a triplet in the common case of images in RGB color space.

This shows the default data organization.

In[22]:= MatrixForm @ ImageData[ImageTake[i, {90, 93}, {50, 53}], "Byte"]

Out[22]//MatrixForm=

The option setting Interleaving -> False can be used to store and retrieve the raw data as a list of matrices, one for each of the color channels.
Here is a fragment of the example image arranged as a list of channel matrices.

In[23]:= MatrixForm /@ ImageData[ImageTake[i, {90, 93}, {50, 53}], "Byte", Interleaving -> False]

Out[23]= {
  30 33 31 33
  35 34 32 34
  37 40 38 35
  43 49 53 55
 ,
  24 28 28 31
  22 24 23 27
  24 26 24 21
  27 33 37 38
 ,
  30 33 32 35
  23 23 22 25
  16 18 16 13
  19 22 24 27
 }

A multichannel image can be split into a list of single-channel images and, conversely, a multichannel image can be created from any number of single-channel images.
This splits the example RGB color image into three grayscale images.

In[2]:= ColorSeparate[i]

Out[2]= {(image), (image), (image)}

In[3]:= First[Options[#, "ColorSpace"] & /@ %]

Out[3]= {ColorSpace -> Grayscale}
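The reverse direction, building a multichannel image from single-channel images, can be sketched with ColorCombine. This is an illustration on a small synthetic image, not on the example image; the names rgb, channels, and swapped are illustrative, and it assumes that a three-channel result is treated as RGB by default.

```mathematica
(* Sketch: split a 1 x 3 RGB image into channels and recombine them in reverse order. *)
rgb = Image[{{{1., 0., 0.}, {0., 1., 0.}, {0., 0., 1.}}}];

channels = ColorSeparate[rgb];   (* list of three single-channel images *)

(* Reversing the list swaps the red and blue channels *)
swapped = ColorCombine[Reverse[channels]];

ImageChannels[swapped]   (* number of channels in the recombined image *)
```

The recombined image should again have three channels, with the red pixel of the original now appearing blue and vice versa.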

Basic Image Manipulation


Consider the image manipulation operations that change the image dimensions by cropping or padding. These operations serve a variety of useful purposes. Cropping allows you to create a new image from a selected portion of a larger one, while padding is typically used to extend an image at the borders to ensure uniform treatment of the border pixels in many image processing tasks.
ImageTake[image, n]    give an image consisting of the first n rows of image
ImageCrop[image]       crop image by removing borders of uniform color
ImagePad[image, m]     pad image on all sides with m background pixels

Image cropping and padding operations.


This selects the first 50 rows of the example image.

In[24]:= ImageTake[i, 50]

Out[24]= (image)

ImageCrop conveniently complements ImageTake. Instead of specifying the exact number of rows or columns to be extracted, it allows you to define the desired dimensions of the resulting image, namely, the number of rows or columns that are to be retained. By default, the cropping operation is centered, thus an equal number of rows and columns are deleted from the edges of the image.
Here a 100×100 pixel region is extracted from the center of the example image.

In[27]:= ImageCrop[i, {100, 100}]

Out[27]= (image)

While ImageCrop is primarily used to reduce the dimensions of the source image, it is frequently desirable to pad an image to increase its dimensions. All the most common padding methods are supported.
This shows four different padding methods applied to the right edge of the example image.

In[33]:= Grid @ Partition[ImagePad[i, {{0, 50}, {0, 0}}, #] & /@ {0, "Reflected", "Fixed", "Periodic"}, 2]

Out[33]= (grid of four images)
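The effect of these three functions on image dimensions can be checked on a small synthetic image. The sketch below uses an illustrative 60×40 image named img instead of the example image; note that ImageDimensions reports width first, then height.

```mathematica
(* Sketch: how taking, cropping, and padding change the pixel dimensions. *)
img = Image[ConstantArray[0.5, {40, 60}]];   (* 40 rows, 60 columns *)

ImageDimensions[img]                        (* {60, 40} *)
ImageDimensions[ImageTake[img, 10]]         (* first 10 rows: {60, 10} *)
ImageDimensions[ImageCrop[img, {20, 20}]]   (* centered 20 x 20 region: {20, 20} *)
ImageDimensions[ImagePad[img, 5]]           (* 5 pixels on every side: {70, 50} *)
```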

It is frequently necessary to change the dimensions of an image by resampling or to reposition it in some manner. Functions that perform these basic geometric tasks are readily available.


ImageResize[image, w]    give a resized version of image that is w pixels wide
Thumbnail[image]         give a thumbnail version of image
ImageRotate[image]       rotate image counterclockwise by 90°
ImageReflect[image]      reverse image by top-bottom mirror reflection

Spatial operations.

Here, ImageResize is used to increase and diminish the size of the original image, respectively.

In[38]:= Row @ {ImageResize[i, 200], Spacer[10], ImageResize[i, 50]}

Out[38]= (two images)

ImageRotate is another common spatial operation. It results in an image whose pixel positions are all rotated counter-clockwise with respect to a pivot point centered on the image.
This rotates the example image by 30 degrees.

In[39]:= ImageRotate[i, Pi/6]

Out[39]= (image)

Several useful image processing tasks require nothing more than simple arithmetic operations between two images or an image and a constant. For example, you can change brightness by multiplying an image by a constant factor or by adding (subtracting) a constant to (from) an image. More interestingly, the difference of two images can be used to detect change and the product of two images can be used to hide or highlight regions in an image in a process called masking. For this purpose, three basic arithmetic functions are available.


ImageAdd[image, x]         add an amount x to each channel value in image
ImageSubtract[image, x]    subtract a constant amount x from each channel value in image
ImageMultiply[image, x]    multiply each channel value in image by a factor x

Arithmetic operations.

Here is an example of image blending using addition and multiplication.

In[17]:= ImageAdd[ImageMultiply[i, 2/3], ImageMultiply[(image), 1/3]]

Out[17]= (image)

Image Processing by Point Operations


Point operations constitute a simple but important class of image processing operations. These operations change the luminance values of an image and therefore modify how an image appears when displayed. The terminology originates from the fact that point operations take single pixels as inputs. This can be expressed as

$$g(i, j) = T[f(i, j)]$$

where T is a grayscale transformation that specifies the mapping between the input image f and the result g, and i, j denotes the row, column index of the pixel. Point operations are a one-to-one mapping between the original (input) and modified (output) images according to some function defining the transformation T.

Contrast Modification
Contrast modifying point operations frequently encountered in image processing include negation (grayscale or color), gamma correction, which is a power-law transformation, and linear or nonlinear contrast stretching.


Lighter[image, ...]     give a lighter version of an image
Darker[image, ...]      give a darker version of an image
ColorNegate[image]      give the negative of image, in which all colors have been negated
ImageAdjust[image]      adjust the levels in image, rescaling them to cover the range 0 to 1
ImageApply[f, image]    apply f to the list of channel values for each pixel in image

Selected point operators.

One of the simplest examples of a point transformation is negation. For a grayscale image f, the transformation is defined by $g(i, j) = 1 - f(i, j)$. It is applied to every pixel in the source image. In the case of multichannel images, the same transformation is applied to each color value of every pixel.

This shows the original example image and its digital negative.

In[6]:= GraphicsRow[{i, ColorNegate[i]}, ImageSize -> Medium]

Out[6]= (two images)

The function ImageAdjust can be used to perform most of the commonly needed contrast stretching and power-law transformations, while ImageApply enables you to realize any desired point transformation whatsoever.
This increases contrast using linear scaling.

In[37]:= ImageAdjust[i, 1.5]

Out[37]= (image)
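A power-law (gamma) transformation can also be written as an explicit point operation with ImageApply. The following is a minimal sketch on a synthetic one-row image; the helper gamma and the names img and p are illustrative and are not built-in functions.

```mathematica
(* Sketch: gamma correction g = f^p as a point operation. *)
gamma[x_, p_] := x^p;   (* hypothetical helper, not a built-in *)

img = Image[{{0., 0.25, 1.}}];   (* 1 x 3 single-channel image *)

(* An exponent below 1 brightens midtones; 0 and 1 are left unchanged *)
ImageData[ImageApply[gamma[#, 0.5] &, img]]
```

With the exponent 0.5 the middle pixel value 0.25 should map to about 0.5, while the end points of the range stay fixed.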

As an example of a nonlinear contrast stretching operation, consider the following transformation called sigma scaling. Assuming the default range of 0 to 1, the transformation is defined by

$$g(i, j) = \frac{1}{1 + e^{-(f(i, j) - m)/s}}$$

This defines the transformation.

In[10]:= f[x_, m_, s_] := 1/(1 + E^(-(x - m)/s))

Here are several plots of the transformation for different values of the variance parameter.

In[12]:= GraphicsRow[Plot[f[x, 0.5, #], {x, 0, 1}, PlotRange -> {0, 1}, Ticks -> False, ImageSize -> Tiny] & /@ {0.15, 0.1, 0.05, 0.01}]

Out[12]= (four plots)

This shows the effect of the transformation on the example image.

In[36]:= ImageApply[f[#, 0.5, 0.1] &, i]

Out[36]= (image)

Image binarization is the operation of converting a multilevel image into a binary image. In a binary image, each pixel value is represented by a single binary digit. In its simplest form, binarization, also called thresholding, is a point-based operation that assigns the value of 0 or 1 to each pixel of an image based on a comparison with some global threshold value t.

$$g(i, j) = \begin{cases} 1, & \text{if } f(i, j) \ge t \\ 0, & \text{if } f(i, j) < t \end{cases}$$

Thresholding is an attractive early processing step because it leads to significant reduction in data storage and results in binary images that are simpler to analyze. Binary images permit the use of powerful morphological operators for shape and structure-based analysis of image content. Binarization is also a form of image segmentation, as it divides an image into distinct regions.


Binarize[image]            create a binary image from image
ColorQuantize[image, n]    give an approximation to image that uses only n distinct colors

Quantization functions.

Color images are first converted to grayscale prior to thresholding. If the threshold value is not explicitly given, an optimal value is calculated using one of several well-known methods.
Here is the default binarization based on Otsu's method for optimal threshold selection.

In[2]:= Binarize[i]

Out[2]= (image)

Here ImageApply is used to return a color image in which each individual channel is binarized, resulting in a maximum of 8 distinct colors.

In[17]:= ImageApply[UnitStep[# - 0.5] &, i]

Out[17]= (image)
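The threshold rule itself can be tried on a plain array, without any image functions. This sketch uses a hypothetical helper named threshold; it relies only on UnitStep, which gives 1 for arguments greater than or equal to 0.

```mathematica
(* Sketch: global thresholding of an array of pixel values. *)
threshold[data_, t_] := UnitStep[data - t];   (* hypothetical helper *)

(* Values at or above the threshold 0.5 become 1, the rest 0 *)
threshold[{{0.2, 0.5, 0.8}}, 0.5]   (* {{0, 1, 1}} *)
```

Because UnitStep threads over lists, the same definition works for a single value, a matrix, or the 3D array of a multichannel image.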

Color Conversion
Four color spaces are currently supported: RGB (red, green, and blue), CMYK (cyan, magenta, yellow, and black), HSB (hue, saturation, and brightness), and grayscale.

The RGB color scheme is the most frequently used color representation in practice. The three so-called primary colors are combined (added) in various proportions to produce a composite, full-color image. The RGB color model is universally used in color monitors, video recorders, and cameras. Also, the human visual system is tuned to perceive color as a variable combination of these primary colors.

Pairs of primary colors added in equal amounts produce the secondary colors of light: cyan (C), magenta (M), and yellow (Y). These are the primary pigment colors used in the printing industry, hence the relevance of the CMY color model.

For image processing applications it is often useful to separate the color information from luminance. The HSB model has this property. Hue represents the dominant color as seen by an observer, saturation refers to the amount of dilution of the color with white light, and brightness defines the average luminance. The luminance component may, therefore, be processed independently of the image's color information.
ColorConvert[expr, colspace]    convert color specifications in expr to refer to the color space represented by colspace

Color conversion function.

This shows the conversion results from an RGB source to the remaining supported color spaces.

In[38]:= i = Image[{{{1, 0, 0}, {0, 1, 0}, {0, 0, 1}}}, ColorSpace -> "RGB"]

Out[38]= (image)

In[39]:= Column[InputForm[ColorConvert[i, #]] & /@ {"CMYK", "HSB", "Grayscale"}]

Out[39]= Image[{{{0., 1., 1., 0.}, {1., 0., 1., 0.}, {1., 1., 0., 0.}}}, "Real", ColorSpace -> "CMYK", Interleaving -> True]
         Image[{{{0., 1., 1.}, {0.3333333333333333, 1., 1.}, {0.6666666666666666, 1., 1.}}}, "Real", ColorSpace -> "HSB", Interleaving -> True]
         Image[{{0.299, 0.587, 0.114}}, "Real", ColorSpace -> "Grayscale", Interleaving -> None]

Note that the RGB -> Grayscale transformation uses the weighting coefficients recommended for U.S. broadcast TV (NTSC) and later incorporated into the CCIR 601 standard for digital video.
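The grayscale values 0.299, 0.587, and 0.114 in the output are these weighting coefficients applied to pure red, green, and blue. The following sketch reproduces the weighted sum on plain lists; the names weights and rgb are illustrative.

```mathematica
(* Sketch: grayscale value as a weighted sum of the R, G, B components. *)
weights = {0.299, 0.587, 0.114};

rgb = {{1, 0, 0}, {0, 1, 0}, {0, 0, 1}};   (* pure red, green, blue pixels *)

rgb . weights         (* one gray level per pixel: {0.299, 0.587, 0.114} *)
{1, 1, 1} . weights   (* the weights sum to 1, so white stays white *)
```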

Image Histogram
An important concept common to many image enhancement operations is that of a histogram, which is simply a count (or relative frequency, if normalized) of the gray levels in the image. Analysis of the histogram gives useful information about image contrast. Image histograms are important in many areas of image processing, most notably compression, segmentation, and thresholding.


ImageLevels[image]       give a list of pixel values and counts for each channel in image
ImageHistogram[image]    plot a histogram of the pixel levels for each channel in image

Image histogram functions.

This shows two different histogram visualization methods.

In[3]:= GraphicsRow[{ImageHistogram[i], ImageHistogram[i, Appearance -> "Separated"]}, ImageSize -> Medium]

Out[3]= (two histogram plots)
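The counting behind a histogram can be sketched directly from the pixel data. This illustration uses Tally on a tiny synthetic byte image rather than ImageLevels, so it shows the idea of a histogram and not the exact output format of the built-in function; the name img is illustrative.

```mathematica
(* Sketch: count how often each gray level occurs in a small image. *)
img = Image[{{0, 0, 255}, {128, 0, 255}}, "Byte"];

(* Flatten the pixel array, tally the distinct values, and sort by level *)
Sort[Tally[Flatten[ImageData[img, "Byte"]]]]   (* {{0, 3}, {128, 1}, {255, 2}} *)
```

Dividing the counts by the total number of pixels gives the normalized (relative frequency) form of the histogram.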

Image Processing by Area Operations


Most useful image processing operators are area based. Area-based operations calculate a new pixel value based on the values in a local, typically small, neighborhood. This is usually implemented through a linear or nonlinear filtering operation with a finite-sized operator (i.e., a filter). Without loss of generality, consider a centered and symmetric 3×3 neighborhood of the image pixel at position i, j, with value f[i, j]. A general area-based transformation can be expressed as

$$g[i, j] = T\begin{bmatrix} f[i-1, j-1] & f[i, j-1] & f[i+1, j-1] \\ f[i-1, j] & f[i, j] & f[i+1, j] \\ f[i-1, j+1] & f[i, j+1] & f[i+1, j+1] \end{bmatrix}$$

where g is the output image resulting from applying transformation T to the 3×3 centered neighborhoods of all the pixels in input image f. It should be noted that the spatial dimensions and geometry of the neighborhood are generally determined by the needs of the application. Examples of image processing region-based operations include noise reduction, edge detection, edge sharpening, image enhancement, segmentation, and more.
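The area-based transformation can be sketched on a plain matrix by collecting every 3×3 neighborhood and applying a function to each one. The helper areaOp below is hypothetical, not a built-in; it handles interior pixels only and makes no attempt at border padding.

```mathematica
(* Sketch: apply a transformation T to every 3 x 3 neighborhood of a matrix. *)
areaOp[T_, f_] := Map[T, Partition[f, {3, 3}, {1, 1}], {2}];   (* hypothetical helper *)

data = {{1, 2, 3, 4}, {5, 6, 7, 8}, {9, 10, 11, 12}};

areaOp[Max, data]                  (* neighborhood maximum: {{11, 12}} *)
areaOp[Mean[Flatten[#]] &, data]   (* neighborhood mean: {{6, 7}} *)
```

Partition with offset {1, 1} produces overlapping blocks, one per interior pixel, so the result is two rows and two columns smaller than the input. Choosing Max for T corresponds to a dilation-like operation, and the mean corresponds to a simple smoothing filter.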


Linear and Nonlinear Filtering


Linear image filtering using convolution is one of the most common methods of processing images. To achieve a desired result you must specify an appropriate filter. Tasks such as smoothing, sharpening, edge finding, zooming, and more are typical examples of image processing tasks that have convolution-based implementations. Other tasks, noise removal for example, are better accomplished using nonlinear processing techniques.
ImageFilter[f, image, r]     apply f to the range r of each pixel in each channel of image
ImageConvolve[image, ker]    give the convolution of image with kernel ker

General filtering operators.

Here is a typical blurring operation using one of the smoothing filters.

In[4]:= ImageConvolve[i, BoxMatrix[5]/121.]

Out[4]= (image)

The more general (but slower) ImageFilter function can be used in cases when traditional linear filtering is not possible and the desired operation is not implemented by any of the built-in filtering functions.

This calculates the maximum range of values within a small neighborhood of each pixel.

In[5]:= ImageFilter[Max[Flatten[#]] - Min[Flatten[#]] &, i, 1]

Out[5]= (image)


A large number of linear and nonlinear operators are available as built-in functions. Here is a partial listing.
Blur[image]                  give a blurred version of image
Sharpen[image]               give a sharpened version of image
MeanFilter[image, r]         replace every value by the mean value in its range r
GaussianFilter[image, r]     convolve with a Gaussian kernel of pixel radius r
MedianFilter[image, r]       replace every value by the median in its range r
MinFilter[image, r]          replace every value by the minimum in its range r
CommonestFilter[image, r]    replace each pixel with the most common pixel value in its range r

Common linear and nonlinear filtering operators.

One of the more common applications of linear filtering in image processing has been in the computation of approximations of discrete derivatives and consequently edge detection. The well-known methods of Prewitt, Sobel, and Canny are all essentially based on the calculation of two orthogonal derivatives at each point in an image and the gradient magnitude.
Here are the two Sobel filters.

In[6]:= sobelY = {{1, 2, 1}, {0, 0, 0}, {-1, -2, -1}}/4.;
        sobelX = {{1, 0, -1}, {2, 0, -2}, {1, 0, -1}}/4.;

This returns the edges of a grayscale image using Sobel filters.

In[7]:= Image[Sqrt[ImageData[ImageConvolve[(image), sobelX]]^2 + ImageData[ImageConvolve[(image), sobelY]]^2]]

Out[7]= (image)

As a second example, consider the task of removing the impulsive noise, which is called salt noise due to its visual appearance, from an image. This is a classic example contrasting the different outcomes resulting from a linear moving-average and a nonlinear moving-median calculation.
This creates a small image with impulsive noise.

In[13]:= Image[ReplacePart[ArrayPad[ConstantArray[160, {20, 20}], 15, 60], 255, RandomInteger[{1, 50}, {100, 2}]], "Byte"]

Out[13]= (image)

Here is the side-by-side comparison.

In[14]:= Row[{MeanFilter[%, 1], Spacer[5], MedianFilter[%, 1]}]

Out[14]= (two images)

Clearly, the median filter returns the better result.
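The same contrast appears in one dimension. The sketch below applies a moving average and a moving median of width 3 to a short list with a single impulse; the name signal is illustrative, and the values mirror the 160 background and 255 noise level used in the image above.

```mathematica
(* Sketch: moving average versus moving median on a list with one impulse. *)
signal = {160, 160, 255, 160, 160};

MovingAverage[signal, 3]   (* the impulse is smeared: {575/3, 575/3, 575/3} *)
MovingMedian[signal, 3]    (* the impulse is removed: {160, 160, 160} *)
```

The average spreads the outlier over every window that contains it, while the median discards it as long as it is a minority within the window.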

Morphological Processing
Mathematical morphology provides an approach to the processing of digital images that is based on the spatial structure of objects in a scene. In binary morphology, unlike linear and nonlinear operators discussed so far, morphological operators modify the shape of pixel groupings instead of their amplitude. However, in analogy with these operators, binary morphological operators may be implemented using convolution-like algorithms with the fundamental operations of addition and multiplication replaced by logical OR and AND.
Dilation[image, r]    give the dilation with respect to a range r square
Erosion[image, r]     give the erosion with respect to a range r square

Fundamental morphological operators.


This shows the dilation (left) and erosion (right) of the example image (center) using a 5×5 uniform structuring element.

In[8]:= b = Binarize[i];
        GraphicsRow[{Dilation[b, 2], b, Erosion[b, 2]}, ImageSize -> Medium]

Out[9]= (three images)

The definitions of binary morphology extend naturally to the domain of grayscale images with Boolean AND and OR becoming point-wise minimum and maximum operators, respectively. For a uniform, zero-valued structuring element, the dilation of an image f reduces to the following simple form:

$$g[i, j] = \max\begin{bmatrix} f[i-1, j-1] & f[i, j-1] & f[i+1, j-1] \\ f[i-1, j] & f[i, j] & f[i+1, j] \\ f[i-1, j+1] & f[i, j+1] & f[i+1, j+1] \end{bmatrix}$$

This shows the grayscale dilation (left) and erosion (right) of the example image (center) using a 5×5 uniform structuring element.

In[10]:= GraphicsRow[{Dilation[b, 2], b, Erosion[b, 2]}, ImageSize -> Medium]

Out[10]= (three images)

These operators can be used in combinations using a single structuring element or a list of such elements to perform many useful image processing tasks. A partial listing includes thinning, thickening, edge and corner detection, and background normalization.
This uses dilation and erosion to detect edges in a grayscale image.

In[17]:= g = ColorConvert[i, "Grayscale"];
         e = ImageSubtract[Dilation[g, 1], Erosion[g, 1]]

Out[18]= (image)


GeodesicDilation[marker, mask]    give the fixed point of the geodesic dilation of the image marker constrained by the image mask
GeodesicErosion[marker, mask]     give the fixed point of the geodesic erosion of the image marker constrained by the image mask
DistanceTransform[image]          give the distance transform of image, in which the value of each pixel is replaced by its distance to the nearest background pixel
MorphologicalComponents[image]    give an array in which each pixel of image is replaced by an integer index representing the connected foreground image component in which the pixel lies

Selected morphological functions.

An important category of morphological algorithms, called morphological reconstruction, is based on repeated application of dilation (or erosion) to a marker image, while the result of each step is constrained by a second image, the mask. The process ends when a fixed point is reached. Interestingly, many image processing tasks have a natural formulation in terms of reconstruction. Peak and valley detection, hole filling, region flooding, and hysteresis thresholding are just a few examples. The latter, also known as a double threshold, is an integral part of the widely used Canny edge detector. Pixels falling below the low threshold are rejected, pixels above the high threshold are accepted, while pixels in the intermediate range are accepted only if they are "connected" to the high threshold pixels. Connectivity may be established using a variety of algorithms, but reconstruction gives an effective and very simple solution.
Here are the low, high, and double threshold images, respectively.

In[36]:= {mask = Binarize[e, 0.45], mark = Binarize[e, 0.8], GeodesicDilation[mark, mask]}

Out[36]= {(image), (image), (image)}

This clears all the symbols.

In[37]:= Clear[b, g, i, e, mask, mark];
