Elements of Python programming

From AstroEd

At this step in our short course on Python for Physics and Astronomy you have Python running, and have seen how it works interactively and with executable files. Let's explore what we can do with simple useful programming. Some essential topics are

  • Getting data into and out of a program
  • Storing data as numbers and text
  • Accessing data efficiently in lists, tuples, and dictionaries
  • Performing logical and mathematical operations on data
  • Controlling program flow (coming up in the next section)


Input and output

Python accepts data from the command line when it starts an application, from locally stored files, from files or other input on the web, through ports (typically as serial or TCP/IP data), and from attached instruments that communicate through specialized device drivers.


Input from a console and keyboard

To have a Python program accept data from a user at a console, include lines like these in Python 2.7


newtext = raw_input()
print newtext

to take the raw input as text and print it. You can prompt for the input too

newtext = raw_input('Write what you like: ')
print 'This is what I like: ', newtext


In Python 2.7 there is also a Python command "input()" which evaluates the incoming text as Python code. In Python 3 this has changed, so some care is needed if you write for the new Python. In that case, rather than raw_input() you would use input(), and to get the effect of the old "input()", you would use eval(input()). Because eval() runs whatever the user types, it is best reserved for input you trust. You can see why using the newer Python 3 with older programs can raise some problems, though they are usually easy to fix.

The input is text, but suppose we want a number instead. If we know it's a number, then

newtext = raw_input('Input a number >> ')
x = float(newtext)
print 'My number was ', x


should do it. But, if you try this and input text that is not a number, the program will generate an error and respond with something like this

python input_number.py
Input a number >> x
Traceback (most recent call last):
  File "input_number.py", line 2, in <module>
    x = float(newtext)
ValueError: could not convert string to float: x

How would we know if we have a number, given arbitrary text in the data, and avoid this error? One way is to use isdigit() --

newtext = raw_input('Input a number >> ')
if newtext.isdigit():
  x = float(newtext)
  print 'My number was ', x
else:
  print 'That is not a number.'

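In this you see that isdigit() tests whether every character in newtext is a digit. It returns True or False, which the "if" statement uses to decide what to do with the data. Note that this test accepts only non-negative whole numbers: text such as "3.14" or "-2" contains characters that are not digits, so isdigit() returns False even though float() could convert it. We will look at such flow control more later.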
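
A more general test is to attempt the conversion and catch the error. The sketch below is written in Python 3 syntax (print with parentheses), and the helper name parse_number is only an illustration, not something from the course files.

```python
def parse_number(text):
    """Return text as a float, or None if it is not a valid number."""
    try:
        return float(text)
    except ValueError:
        return None

# Compare with isdigit(), which is True only for strings made entirely of digits.
for sample in ['12', '3.14', '-2', '1e3', 'x']:
    print(sample, sample.isdigit(), parse_number(sample))
```

With this approach decimals, signs, and exponents are all accepted, and only text that float() cannot convert is rejected.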

You may also want to read data by splitting a line of text into pieces, for example with something like this --

newtext = raw_input('Input the item name and quantity >>')
print newtext.split()

When you input "eggs 12" to this last one, newtext.split() will return ['eggs','12']. That is, it makes a list of the whitespace-separated items that are on the line, each still stored as text. You can now go through that list and look for the information you want, one entry at a time.
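
As a minimal sketch of going through such a list, the hypothetical helper parse_item below (Python 3 syntax) takes a line that is assumed to hold exactly a name and a whole-number quantity, and converts the second field to an integer.

```python
def parse_item(line):
    """Split a line like 'eggs 12' into a name and an integer quantity."""
    fields = line.split()
    if len(fields) != 2:
        raise ValueError('expected a name and a quantity, got: %r' % line)
    name = fields[0]
    quantity = int(fields[1])   # raises ValueError if not a whole number
    return name, quantity

name, quantity = parse_item('eggs 12')
print(name, quantity)
```

In an interactive program the argument would come from raw_input() (or input() in Python 3) rather than from a fixed string.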


Input from the command line

In Linux, MacOS, or other Unix-like environments you can pass information to a program on the command line. There's a straightforward way to do this in Python using sys:

import sys
# Expect exactly three arguments after the program name.
if len(sys.argv) != 4:
  sys.exit("Usage: convert_fits infile.fits outfile.fits newbitpix")
infits = sys.argv[1]
outfits = sys.argv[2]
newbitpix = sys.argv[3]   # still text; use int(newbitpix) for a number

Alternatively there is argparse, a standard command-line parsing module, and this example from the Python on-line tutorial to find x to the power y (x**y):

import argparse
parser = argparse.ArgumentParser(description="calculate X to the power of Y")
group = parser.add_mutually_exclusive_group()
group.add_argument("-v", "--verbose", action="store_true")
group.add_argument("-q", "--quiet", action="store_true")
parser.add_argument("x", type=int, help="the base")
parser.add_argument("y", type=int, help="the exponent")
args = parser.parse_args()
answer = args.x**args.y
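
To try an argparse parser without leaving the interpreter, parse_args() also accepts an explicit list of strings in place of the real command line. The reduced sketch below (Python 3 syntax) keeps only the two positional arguments; the values "2" and "10" are arbitrary test inputs.

```python
import argparse

parser = argparse.ArgumentParser(description="calculate X to the power of Y")
parser.add_argument("x", type=int, help="the base")
parser.add_argument("y", type=int, help="the exponent")

# The list stands in for typing "2 10" after the program name.
args = parser.parse_args(["2", "10"])
answer = args.x ** args.y
print(answer)
```

Because type=int is given, argparse converts the text for us and reports an error if the user supplies something that is not an integer.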


Printing to the display

When you have data or text to display, use "print" to have it appear on the console as the program executes. (In Windows, you may follow this with a raw_input() in Python 2.7, or an input() in Python 3, so that the console will not disappear before you read it.) In Linux or MacOS, you would usually run Python from a console, and the printed information appears on the display and remains visible after the program has finished. In Unix-like environments "print" writes to the standard output, stdout, which may be redirected to a file. For example, when running a Python program that generates output from the command line, you could write

python myprogram.py >> myfile.txt

and the output would be appended to myfile.txt instead of displaying (use > in place of >> to overwrite the file). Similarly, the output can be split to send the error information to a separate file

python myprogram.py 1> myfile.txt 2>myerrors.txt

sends the stdout to myfile.txt and stderr to myerrors.txt. The same redirection operators also work at the Windows command prompt.


To print text the command is

print 'This will print on the screen.\n'

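where the quoted text is sent to the display; single (') and double (") quotes have the same effect. The '\n' is a line feed. Because print already ends each line with a line feed, the extra '\n' here leaves a blank line after the text.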


To print variables simply use them in the print statement

x = 1
y = 2
z = 3
h = 'Help me!!'
print x,y,z,h,'\n'

will display the values of x, y, z, and h regardless of whether they are numbers or text.

1 2 3 Help me!! 

Of course, printing can be formatted. If you are familiar with C or Fortran, you'll find similarities that will help in creating formatted output. In this instance, it's also helpful to remember that "print" is converting internal data into displayed "text", so the formatting is really a way of controlling how some text and numerical data are mapped onto text that is then displayed.

Formatting is available in both versions 2.7 and 3 in two ways

  • Formatting expressions that are like C's printf are commonly used.
  • Formatting methods are unique to Python and use operators that act on text (strings).

Before we can really use these effectively we will need to explain what we mean by "strings", "integers", "floats" and other data types. But, here are a few examples that illustrate how this works. From Mark Lutz' Learning Python we have this summary


To format strings using expressions

  1. Insert a % "operator". To the left of it, put a format string containing one or more conversion targets, each of which starts with a % (for example %d).
  2. To the right of the % operator, provide the object (or objects, in parentheses) to be inserted into the format string on the left in place of the conversion targets.

Here's an example:

'That is %d %s cat!' % (1, 'fluffy')

which will print

That is 1 fluffy cat!
  

Here %d is an integer format in the style of C, and %s is a string format conversion code. Other common type codes are

s String
d Decimal integer
i Integer (same as d)
x Hex (also X for capital letters)
e Exponent (also E)
f Floating point

The general structure with formatting commands is

%[(name)][flags][width][.precision]typecode

For example, in interactive Python try

>> x=1.2345678901234567890

If you ask for "x"

>> x

Python will respond with

1.2345678901234567

to the precision of its floating point storage. You can format this by

>> '%6.2f'%x  (or with spaces for clarity, '%6.2f' % x, but no spaces after the first %)

to which Python will respond

'  1.23'

You see that it left 6 places for the text, used a precision of 2 decimal places, and right-adjusted the text to the field. If you ask for more precision than you've allowed, Python will expand the field as needed. To have the data left-adjusted, put a minus sign in the formatting like this

>> '%-6.2f'%x
'1.23  '

In a program, rather than interactively, expressions like these work in a print statement

>> print '%6.5e'%x
1.23457e+00


The alternative is the newer "format method" scheme, which was introduced for Python 3 and is also available in Python 2.6 and 2.7. In this the method acts on a string object to create a new string. Here's a very simple example of what it looks like to format the division of 25 by 7

>> '{0:.4f}'.format(25. / 7.)
'3.5714'

The first "0" is a position, and often there will be many similar {} to hold the data passed to format(). You see the familiar "f" character to tell Python how to treat the data.

>> '%.4f'% (25. / 7.)

would have the same effect.

Finally, there is also a built-in format() function that is perhaps clearer

>> format(3.5714, '.2f')
'3.57'

which is neat if there's only one variable.

In practice the % expression is the most commonly used, and it works in both Python 2.7 and Python 3.
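
The three styles can be compared side by side. This is a small sketch in Python 3 syntax; the variable names are arbitrary, and repr() is used only to make the padding spaces visible.

```python
x = 25. / 7.

a = '%.4f' % x                # formatting expression
b = '{0:.4f}'.format(x)       # format method
c = format(x, '.4f')          # built-in format() function
print(a, b, c)

# Width, precision, and alignment are available in each style.
right = '%8.2f' % x           # right-adjusted in 8 columns
left = '{0:<8.2f}'.format(x)  # left-adjusted in 8 columns
print(repr(right), repr(left))
```

All three give the same text for the same precision, so the choice is mostly a matter of readability.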

Input from a file and writing to files

It's more likely you will want to input data to a program from a file on your computer. Opening and reading a file in Python is very easy --

mydata = open('datafile.dat', 'r')

opens the file named datafile.dat for read-only, and assigns it to the object "mydata". You can read the data as text

mytext = mydata.read()

and the entire file is now contained in mytext. If you do this on the Python command line, and then enter "mytext", you'll see the contents of your file (with end of line characters like \n too).

As with any text, we can split it into parts with

mytext.split()

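which generates a list of whitespace-delimited data from the file, ignoring the end-of-line characters. Instead of read(), you can collect the individual lines with

mylines = mydata.readlines()

which returns a list with the lines, in order, as its items. (Once read() has reached the end of the file, a later readlines() on the same open file returns an empty list, so use one or the other.) Each line is itself text and can be split, for example

mylines[0].split()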

When you are finished reading the file, you close it with

mydata.close()

Similarly, to write a file you would open it for writing

mynewdata = open('newdata.dat', 'w')

write text to it

mynewdata.write('This is a line of text.\n')

and continue with other lines

mynewdata.write('1 2 3\n')

until you are finished

mynewdata.close()

The close() is essential because, without it, the computer's buffers may not flush the contents of the file to the disk.

Whenever you are writing data to a file, it may be formatted with the same techniques used in formatting displayed data.
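
A common way to make sure a file is closed is the "with" statement, which closes it automatically at the end of the indented block. The sketch below (Python 3 syntax) writes two lines and reads them back; it uses a temporary directory so that no real data file is touched, and the file name is just the one from the example above.

```python
import os
import tempfile

# A temporary directory keeps this sketch away from real data files.
path = os.path.join(tempfile.mkdtemp(), 'newdata.dat')

# "with" closes the file automatically, even if an error occurs inside.
with open(path, 'w') as outfile:
    outfile.write('This is a line of text.\n')
    outfile.write('%d %d %d\n' % (1, 2, 3))

with open(path, 'r') as infile:
    lines = infile.readlines()

# Convert the fields on the second line back into integers.
numbers = [int(field) for field in lines[1].split()]
print(lines[0].strip())
print(numbers)
```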


Numbers, text, and data types

So far we have seen integers (e.g. 0, 1, 2 ...), floating point (e.g. 3.14159), and strings (e.g. "deer in the headlights"). How are these quantities stored, and how do we access them in whole or part?


Binary numbers

Computer data are stored as a sequence of bits which may be 1 or 0. A byte is a sequence of 8 bits, and is taken to represent a number that is the sum of powers of 2:

0 0 0 0 0 0 0 1 = 2**0 = 1
0 0 0 0 0 0 1 0 = 2**1 = 2
0 0 0 0 0 1 0 0 = 2**2 = 4
0 0 0 0 1 0 0 0 = 2**3 = 8
0 0 0 1 0 0 0 0 = 2**4 = 16
0 0 1 0 0 0 0 0 = 2**5 = 32
0 1 0 0 0 0 0 0 = 2**6 = 64
1 0 0 0 0 0 0 0 = 2**7 = 128

In this way, any value from 0 to 255 can be represented by turning on the bits in the byte:

0 0 0 0 0 0 1 1 = 2**0 + 2**1 = 3

Binary numbers are often referred to in hexadecimal or "hex" code, which counts up to 15 and is given in powers of 16 rather than powers of 2. The counting sequence is 0,1,2,3,4,5,6,7,8,9,A,B,C,D,E,F. So, 4 bits of a byte can be a hex code, and it takes two of them to represent an 8-bit byte. In this terminology, decimal 1 is binary 1 and hex 1. Decimal 255 is binary 11111111 and hex FF.
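
Python can convert between these representations directly with the built-in functions bin(), hex(), and int(). The short sketch below (Python 3 syntax) also rebuilds a value by summing powers of 2, as in the table above; the variable names are arbitrary.

```python
n = 255
print(bin(n))                             # binary text form
print(hex(n))                             # hexadecimal text form
print(int('11111111', 2), int('FF', 16))  # text in base 2 or 16 back to a number

# Sum the powers of 2 for the bits that are set in 00000011.
bits = '00000011'
value = sum(2**i for i, bit in enumerate(reversed(bits)) if bit == '1')
print(value)
```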

To make longer numbers we string together bytes, and for a 64-bit computer, we treat these in chunks of 8 bytes per word. You can think of computer memory as a linear space with bits one after another, organized as bytes, and words, and sorted out by programs into text and numbers that are stored in these bits. Normally, you would not even worry over such workings, unless you need to "set a bit", control a logical decision, or sometimes control an instrument by changing internal values in a word. There are some "gotchas" to be at least aware of.

Negative integers are indicated by setting a sign bit. We see in the above example we start counting at 0 and go up to 255, but what if we want to have a -1? The standard way is to use the "most significant bit", that is the one on the far left in the display above, to indicate that the value is negative. Clearly after we reach 127 (decimal), and go one step higher, we turn on that bit. In the usual "two's complement" convention the most significant bit carries a weight of -128 rather than +128, so 1 0 0 0 0 0 0 0 is taken to be -128, and as we turn on more bits we count up from -128 to -1, which is 1 1 1 1 1 1 1 1. An integer stored in this way is said to be "signed" and runs from -128 to 127, just as one that runs from 0 to 255 is said to be "unsigned". The concept of signed and unsigned integers is not limited to 8-bit data, and is found even for much bigger integer storage allocation.
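
The standard struct module can show how the same 8 bits read as an unsigned or as a signed byte. This is a sketch in Python 3 syntax, assuming the two's complement convention that struct uses for its signed 'b' format; the helper name signed_byte is hypothetical.

```python
import struct

def signed_byte(pattern):
    """Interpret an 8-character bit string as a signed byte."""
    raw = struct.pack('B', int(pattern, 2))   # store as one unsigned byte
    return struct.unpack('b', raw)[0]         # reread the same byte as signed

for pattern in ['01111111', '10000000', '11111111']:
    print(pattern, int(pattern, 2), signed_byte(pattern))
```

The middle column is the unsigned reading and the last column is the signed reading of the identical bit pattern.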

Another gotcha is the order in which bits and bytes are arranged in memory to associate with the numbers they represent. Integers are stored in memory as a sequence of bytes, arranged in one of two simple orders

  • Little-endian -- increasing numeric significance with increasing memory addresses
  • Big-endian -- decreasing numeric significance with increasing memory addresses

The x86 processor architecture is little-endian: within a multi-byte "word", the least significant byte has the lowest memory address. Consequently, if we extend our pictorial representation of a binary number above to several bytes, the memory address increases from right to left. Knowledge of how this works at the machine level is needed to program microcontrollers in instrumentation.
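
Byte order can be inspected from Python with the struct module, where '<' requests little-endian and '>' big-endian packing. In this sketch (Python 3 syntax) the value 0x01020304 is chosen only because each of its four bytes is easy to recognize.

```python
import struct
import sys

value = 0x01020304
little = struct.pack('<I', value)   # least significant byte first
big = struct.pack('>I', value)      # most significant byte first

print(list(bytearray(little)))
print(list(bytearray(big)))
print(sys.byteorder)                # the native order of this machine
```

Binary data files and instrument protocols usually specify one order or the other, so stating it explicitly in the format string is safer than relying on the native order.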


Now that we see how integers are stored in memory, what about floating point? Fortunately, we rarely need to know the details, except that for most computers floating point numbers are stored in a succession of bytes, with most of the space allocated to the significant part of the number in a power of 2 format standardized by IEEE. In the common base b=2, finite numbers are stored as three integers: s = a sign (zero or one), c = a significand or coefficient, and q = an exponent. The numerical value of a finite number is then

$(-1)^{s} \times c \times b^{q}$


Floating point with 64-bit storage uses 52 bits for c and has nearly 16 digits of precision. The exponent (of 2) must be in the range from -1022 to +1023, limiting the magnitude of the largest values to about 10**308.
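
These limits can be read from sys.float_info. The sketch below (Python 3 syntax) assumes the usual IEEE 64-bit float; note that mant_dig counts one implicit leading bit in addition to the 52 stored bits.

```python
import sys

info = sys.float_info
print(info.mant_dig)   # bits in the significand, counting the implicit leading bit
print(info.dig)        # decimal digits that are always preserved
print(info.max)        # largest finite value
print(info.epsilon)    # gap between 1.0 and the next larger float

# Finite precision means that exact comparisons can fail.
total = 0.1 + 0.2
print(total == 0.3)
print(abs(total - 0.3) < 1e-12)
```

Comparing floating point values within a small tolerance, as in the last line, is usually safer than testing for equality.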


Finally, there is the issue of how to store text, that is, how to encode text into numbers that are stored in memory. Each letter is assigned to a bit pattern, that is to an 8-bit integer. The old standard "ASCII" encoding utilizes only the first 128 values (0 to 127), but this has been extended to encompass symbols and characters used in various languages. You can see the full list at www.ascii-code.com. As a short list, here are a few

Dec Hex Binary   Character
  0 00  00000000 Null
 10 0A  00001010 Line feed (\n)
 13 0D  00001101 Carriage return (\r)
 27 1B  00011011 Escape
 32 20  00100000 Space
 48 30  00110000 0
 49 31  00110001 1
 65 41  01000001 A
 66 42  01000010 B
 97 61  01100001 a
 98 62  01100010 b

In modern UTF-8, the Unicode encoding that is standard on the World Wide Web, every character of every language has its own fixed code. It is backward compatible with ASCII, but uses from one to four bytes (8 to 32 bits) per character. For our purposes, text storage may be regarded as a sequence of 8-bit bytes with each one representing a different character following the ASCII assignments.
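
The number of bytes a character needs in UTF-8 can be checked with encode(). In this sketch (Python 3 syntax) the sample characters are written as escape codes so the example itself stays plain ASCII; they are the letter A, an accented e, the Greek letter alpha, and the euro sign.

```python
samples = [u'A', u'\u00e9', u'\u03b1', u'\u20ac']

for ch in samples:
    encoded = ch.encode('utf-8')
    # code point, number of bytes, and the byte values themselves
    print(ord(ch), len(encoded), list(bytearray(encoded)))

lengths = [len(ch.encode('utf-8')) for ch in samples]
print(lengths)
```

The ASCII letter occupies a single byte with its ASCII value, while the others need two or three bytes.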


Integers

In Python the variables are dynamically typed. That is, their use determines the type of data they store. This is different from other languages, like C, where the type has to be stated before the variable is used. We have already seen this in the examples.

>> x = 1./3.
>> print x
0.333333333333
>> y = 1/3
>> print y
0
>> y = x
>> print y
0.333333333333

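When we calculate x its type is set to floating point because it is the result of dividing two floating point numbers. Print x and you see 12 significant digits, although the value is stored with the full precision of the machine. But when we calculate y as the division of two integers, we get 0, because in Python 2.7 dividing one integer by another gives an integer and the fractional part is discarded. (In Python 3, 1/3 returns a floating point number and 1//3 is the integer division.) Yet we can set y = x, and now y is a floating point number taking on the value of x.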

In all these cases the symbols refer to values stored in memory. They act like those values, not like "pointers" to the memory where the value is stored.

An integer may be found from a float by the int() operation on the float:

>> z = 11./9.
>> print z
1.22222222222
>> z = int(z)
>> print z
1
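
Division and conversion to integers behave differently for negative values, which is a frequent source of surprises. The sketch below uses Python 3 syntax, where / is always true division and // is floor division; the variable names are arbitrary.

```python
true_quotient = 1 / 3        # true division: a float in Python 3
floor_quotient = 1 // 3      # floor division: an integer
negative_floor = -7 // 2     # floors toward minus infinity
truncated = int(-3.5)        # int() truncates toward zero instead
rounded = int(round(2.7))    # round first to get the nearest integer

print(true_quotient, floor_quotient)
print(negative_floor, truncated, rounded)
```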

There are other basic operations on integers in Python, among them

float() turns an integer or numeric string into a floating point number
% means "modulo", that is the remainder after division, so 10%7 is 3
- negation
** raise to a power
int() creates an integer from a string or a float
long() creates a long integer from a string (Python 2 only; in Python 3 int() handles integers of any size)
abs() returns the absolute value of a number
math.factorial() returns the factorial of an integer (after "import math")


Floating point and math

Floating point numbers, like other variables, are dynamically typed. They have the full precision of the machine, typically 64-bits. Many of the useful math operations on floating point numbers are in the math package, and would require

import math

to access the functions. With that, you can use things like math.pi and math.e to return the values of pi and e, the trigonometric functions, and others less obvious but often needed. All of these need math. in front of them when you use them:

floor(x) the largest integer less than or equal to x
trunc(x) truncate x to an integral value

For example if x=-1.1

>> math.floor(x)
-2.0
>> math.trunc(x)
-1

Other functions in the package include

fabs(x) the absolute value
fsum() an iterable summation
isnan() checks for a NaN, not a number
isinf() checks for positive or negative infinity
log(x[,base]) where [,base] is optional and it defaults to e
log10(x)
pow(x,y) returns x**y
sqrt(x)
cos(x) and other similar ones
acos(x) and other similar ones
atan2(y,x) returns the angle with awareness of quadrant based on signs of y and x
degrees(x) returns angle in degrees given angle in radians
radians(x) returns angle in radians given angle in degrees
acosh(x) and other similar hyperbolic functions
erf(x) error function
erfc(x) complementary error function
gamma(x) gamma function
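
A few of these are worth trying, since their behavior is not obvious from the names. The sketch below is in Python 3 syntax, where math.floor() returns an integer rather than the float that Python 2.7 gives; the test values are arbitrary.

```python
import math

x = -1.1
print(math.floor(x), math.trunc(x), math.fabs(x))

# atan2 keeps track of the quadrant from the signs of y and x.
angle = math.degrees(math.atan2(1.0, -1.0))
print(angle)

# fsum avoids the round-off that accumulates in a plain sum().
values = [0.1] * 10
plain_total = sum(values)
exact_total = math.fsum(values)
print(plain_total, exact_total)

print(math.isnan(float('nan')), math.isinf(float('inf')))
```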


There is also a complex math library, which we will leave for another time.


Characters and text

Data may be strings, that is long sequences of characters. You do not have to allocate space for them before you make the assignment, unlike in some other languages. For example, you can write

>> mystring ='My kingdom for a horse (Richard III).'
>> print mystring
My kingdom for a horse (Richard III).

Strings are set off by single quotes ('), though a double quote (") will do too, and may be needed if the string includes a single quote. Triple quotes """ start a long string which continues until the next """ and are used when including blocks of text in a program.

When you create the string, Python allocates memory for it and then refers to that object with the symbol you use. As long as the symbol is in use, the object exists. As soon as the symbol is changed or removed, the object's memory space is freed by "garbage collection".

Individual characters in a string are accessed by referring to an index count. In the above example,

>> mystring[3]
'k'

while a range of characters is indicated with [3:7], where "3" is the starting place and "7" the character after the last one you are selecting

>> mystring[3:7]
'king'

The letter k is the fourth character in the string. Fourth? Yes, strings, like other variables in Python are zero-indexed. That is, the first element is [0], the second [1], and so on.


A long string can be separated into words with the split() method, which is built into every string, so nothing needs to be imported:

>> mystring = "A  string with numbers like 1, 2, and 3."
>> mystring.split()
['A',  'string', 'with', 'numbers', 'like', '1,', '2,', 'and', '3.']

There are built-in Python functions to convert a character to an integer and the reverse

>> ord('X')
88
>> chr(88)
'X'

While we can see single characters in a string with something like mystring[7], we cannot reassign mystring[7] to another character with "=". Strings are an immutable sequence, that is, they cannot be changed in place. To change a string, you create a new one, and assign the old name to the new string. Garbage collection then frees the memory that had been used by the old sequence.

>> S = 'yellow'
>> S = S + 'cats'
>> S
'yellowcats'
>> S = S[0:6]+' '+'cats'
>> S
'yellow cats'
>>S[0:6]
'yellow'
>>S[6:]
' cats'
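
An attempt to change a character in place shows what immutability means in practice. In this sketch (Python 3 syntax) the assignment raises a TypeError, which is caught so that the example runs to the end; the name T is arbitrary.

```python
S = 'yellow cats'
try:
    S[0] = 'Y'                 # strings cannot be changed in place
except TypeError as err:
    print('TypeError:', err)

T = 'Y' + S[1:]                # build a new string from pieces instead
print(T)
print(S)                       # the original is unchanged
```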

The index can be negative. Here, S[0] is the first element of the string, and S[-1] is the last element, S[-2] the second to last, and so on

>>S[-3]
'a'

You can replace parts of a string, really creating a new string from the old one and keeping the name, this way

>> S = 'interstellar dust'
>> S = S.replace('r d','r gas and d')
>> S
'interstellar gas and dust'

This brings us to the subject of lists and such.


Lists, tuples, dictionaries, and statements

Lists

In Python, a list is an ordered collection of objects that are referred to by an offset index, much like a string, but more powerful. Lists may contain numbers, strings, or other lists. They can be changed in place, and Python manages memory so that you do not have to think about that. While it seems that the objects are "in" the list, actually the list is a sequence of references to objects (like an array of pointers in C, but much easier to work on).

>> L = []

is an empty list.

>> L = [0, 10, 100, 1000]

is a list of 4 items indexed from 0 to 3

>> L[3]
1000
>> L[1:3]
[10, 100]

As with strings, when a range is specified in a list, the second value is the index for the entry after the last one.

>> L[2:4]
[100, 1000]

Notice that Python indicates lists with [ ] brackets in the assignment, and when it prints them.

Individual objects in a list can be changed in place. Given a list L = ['Earth','mass',1]

>> L[0] = 'Mars'
>> L[2] = '0.107'
>> L
['Mars', 'mass', '0.107']

Operations on lists include ones to find the length, and to repeat the list

>> L = [1, 2, 3]
>> len(L)
3
>> L * 3
[1, 2, 3, 1, 2, 3, 1, 2, 3]

You can grow a list with an append which pushes an object onto the "stack" that is the list

>> L.append(4)
>> L
[1, 2, 3, 4]

or by concatenating

>> L = L + [5]
>> L
[1, 2, 3, 4, 5]

and find the length of the list

>> len(L)
5

To remove the last item, pop it off the list. The function pop() returns the popped object and changes the list

>> L.pop()
5
>> L
[1, 2, 3, 4]

You can extend a list

>> L.extend([50, 6, 7, 8, 9, 10])
>> L
[1, 2, 3, 4, 50, 6, 7, 8, 9, 10]

Items can be removed from a list

>> L.remove(50)
>> L
[1, 2, 3, 4, 6, 7, 8, 9, 10]

You can test if an object is in a list

>> L = [1, 2, 3, 4, 5.82]
>> 3 in L
True
>> 5 in L
False
>> 5.82 in L
True

You can sort a list by value and by alphabetical ordering

>> L = [4, 7, 2.3, 75, 92, -10]
>> L.sort()
>> L
[-10, 2.3, 4, 7, 75, 92]
>> L = ['alpha', 'gamma', 'beta']
>> L.sort()
>> L
['alpha', 'beta', 'gamma']
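
Because sort() changes the list in place, it helps to know the built-in sorted(), which returns a new list and leaves the original alone. The sketch below (Python 3 syntax) also shows a sort key and the difference between a second name for a list and a copy of it; all names are arbitrary.

```python
L = [4, 7, 2.3, 75, 92, -10]
ordered = sorted(L)                      # new list; L itself is unchanged
largest_first = sorted(L, reverse=True)
print(L)
print(ordered)
print(largest_first)

names = ['gamma', 'Beta', 'alpha']
by_code = sorted(names)                  # capitals sort before lower case
by_letter = sorted(names, key=str.lower) # ignore case when comparing
print(by_code, by_letter)

# Two names can refer to the same list; a full slice makes a copy.
alias = L
copy = L[:]
L.append(0)
print(len(alias), len(copy))
```

The last line shows that the alias sees the appended item, while the copy made beforehand does not.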


Dictionaries

Dictionaries are data types that are indexed by a key, rather than an offset. Where in C you might program a loop to compare and search for a value or an item, with a dictionary in Python you simply ask for the item associated with a key. The item can be a value, a string, a list, or another dictionary. Dictionaries are simply unordered collections of objects that you can access by asking for the object's key. Here's a very simple example

>> messier = {1 : 'SNR', 2 : 'globular cluster', 51 : 'galaxy', 42 : 'nebula'}

creates a dictionary called "messier" with keys 1, 2, 51, and 42.

To find the data associated with the key, we query

>> messier[51]
'galaxy'

We add to it by assigning new keys

>> messier[41] = 'open cluster'
>> messier[41]
'open cluster'

In this example the key is an integer, but it could be a string. For example

>> starmass = {'Sun' : 1.0 , 'Sirius' : 2.02 , 'Rigel' : 18}
>> starmass['Sirius']
2.02

You can test if an entry is in the dictionary

>> 'Sirius' in starmass
True
>> 'Betelgeuse' in starmass
False

You can also change a dictionary as you go. Assigning to a key that is already present replaces its entry. For example, we may want to be more specific about Messier object 42, the Orion Nebula, which would be done with

messier[42] = 'nebula in Orion'
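
Some other common dictionary operations are looking up a key that may be missing, stepping through the entries, and deleting one. This is a sketch in Python 3 syntax that reuses the star masses from the example above.

```python
starmass = {'Sun': 1.0, 'Sirius': 2.02, 'Rigel': 18}

# get() returns a default instead of raising KeyError for a missing key.
missing = starmass.get('Betelgeuse', 'unknown')
print(missing)

# Sorting the keys gives a repeatable order for printing.
for name in sorted(starmass):
    print('%-8s %6.2f' % (name, starmass[name]))

del starmass['Rigel']          # remove an entry
remaining = sorted(starmass)
print(remaining)
```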


Tuples

A "tuple" is an ordered, immutable, group of objects. Like a string and a list, a tuple is accessed by offset. Tuples are distinguished by ()

>> T1 = (1,3,5,7,11,13)

They can be concatenated

>> T2 = (17,19)
>> T = T1 + T2
>> T
(1, 3, 5, 7, 11, 13, 17, 19)

Since a tuple is immutable, you can be sure that once it is created it will be the same in other parts of a program. It may be used as the key in a dictionary too. Compare this to a list, which is a data structure that may change.
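
As a sketch of a tuple used as a dictionary key, the hypothetical catalog below (Python 3 syntax) is indexed by a pair of coordinates in degrees. The coordinate values are approximate and are included only for illustration.

```python
# Hypothetical catalog keyed by (right ascension, declination) in degrees.
catalog = {(83.82, -5.39): 'M42', (202.47, 47.20): 'M51'}
print(catalog[(83.82, -5.39)])

# A tuple unpacks into separate names in one assignment.
ra, dec = (83.82, -5.39)
print(ra, dec)

T = (1, 3, 5)
try:
    T[0] = 2                   # tuples cannot be changed in place
except TypeError as err:
    print('TypeError:', err)
```

A list could not be used as the key here, because a key must not change after it is stored.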


Examples

For examples of Python illustrating input, output, data types, lists, and dictionaries, see the examples section.


Assignments

For the assigned homework to use these ideas, see the assignments section.