As part of our short course on Python for Physics and Astronomy we consider how users interact with their computing environment. A programming language such as Python provides tools to build code that computes scientific models, captures data, sorts it and analyzes it largely without operator action. In effect, once you have written the program, you point it at the data or task it is to do, and wait for it to return new science to you. This is the command line, or batch, model of computing and is at the core of large data science today. Indeed, from your handheld devices to supercomputers, the work that is done is for the most part autonomous. We have seen how Python has built-in components to accept input from the command line, the operating system, the computer that is hosting the program, and the Internet or cloud. What about the other side, the user's perspective on computing?
As an end user, would you prefer to move a mouse or tap a screen in order to select a file, or to type in the path and file name? What if you had to make operational decisions based on graphical output, or changing real world environments as data are collected? In modern computing, most of us interact with the machine and software through a graphical user interface or GUI.
In a Unix-like environment (Linux or macOS), the command line is an accessible and often preferred way to instruct a program on what to do. A typical program, as we've seen, might start like this example, which interpolates a data file and plots the result:
    import sys
    import numpy as np
    from scipy.interpolate import UnivariateSpline
    import matplotlib.pyplot as plt

    sfactorflag = True

    if len(sys.argv) == 1:
        print(" ")
        print("Usage: interpolate_data.py indata.dat outdata.dat nout [sfactor]")
        print(" ")
        sys.exit("Interpolate data with a univariate spline\n")
    elif len(sys.argv) == 4:
        # Three arguments: no smoothing factor was given
        infile = sys.argv[1]
        outfile = sys.argv[2]
        nout = int(sys.argv[3])
        sfactorflag = False
    elif len(sys.argv) == 5:
        # Four arguments: the last one is the smoothing factor
        infile = sys.argv[1]
        outfile = sys.argv[2]
        nout = int(sys.argv[3])
        sfactor = float(sys.argv[4])
    else:
        print(" ")
        print("Usage: interpolate_data.py indata.dat outdata.dat nout [sfactor]")
        print(" ")
        sys.exit("Interpolate data with a univariate spline\n")
It uses "sys" to parse the command line arguments into text and numbers that control what the program will do. If the program's first line is a shebang that directs the system to use the Python interpreter (for example, #!/usr/bin/env python), and the file is marked as executable by the user, it will run as a single command followed by arguments. In this case it would be something like
interpolate_data.py indata.dat outdata.dat nout sfactor
where indata.dat is a text-based data file of x,y pairs, one pair per line, outdata.dat is the interpolated file, nout is the number of points to be interpolated, and sfactor is an optional floating point smoothing factor. When you run this it will read the files, do the interpolation without further interaction, and (as written) plot a result as well as write out a data file. The rest of the code is
    # Take x,y coordinates from a plain text file
    # Open the file with data
    infp = open(infile, 'r')

    # Read all the lines into a list
    intext = infp.readlines()
    infp.close()

    # Split data text and parse into x,y values
    # Create empty lists
    xdata = []
    ydata = []
    i = 0
    for line in intext:
        try:
            # Treat the case of a plain text comma separated entry
            entry = line.strip().split(",")
            # Get the x,y values for these fields
            xval = float(entry[0])
            yval = float(entry[1])
            xdata.append(xval)
            ydata.append(yval)
            i = i + 1
        except (ValueError, IndexError):
            try:
                # Treat the case of a plain text blank space separated entry
                entry = line.strip().split()
                xval = float(entry[0])
                yval = float(entry[1])
                xdata.append(xval)
                ydata.append(yval)
                i = i + 1
            except (ValueError, IndexError):
                # Skip lines that hold no x,y pair (comments, headers, blanks)
                pass

    # How many points found?
    nin = i
    if nin < 1:
        sys.exit('No objects found in %s' % (infile,))

    # Import data into NumPy arrays
    x = np.array(xdata)
    y = np.array(ydata)
    # Function to interpolate the data with a univariate cubic spline
    if sfactorflag:
        f_interpolated = UnivariateSpline(x, y, k=3, s=sfactor)
    else:
        f_interpolated = UnivariateSpline(x, y, k=3)

    # Values of x for sampling inside the boundaries of the original data
    x_interpolated = np.linspace(x.min(), x.max(), nout)

    # New values of y for these sample points
    y_interpolated = f_interpolated(x_interpolated)
    # Create a plot with labeled axes
    # Passing the file name as num also sets the window title
    plt.figure(num=infile)
    plt.xlabel('X')
    plt.ylabel('Y')
    plt.title('Interpolation')
    plt.plot(x, y, color='red', linestyle='None', marker='.',
             markersize=10., label='Data')
    plt.plot(x_interpolated, y_interpolated, color='blue', linestyle='-',
             marker='None', label='Interpolated', linewidth=1.5)
    plt.legend()
    plt.minorticks_on()
    plt.show()
    # Open the output file
    outfp = open(outfile, 'w')

    # Write the interpolated data
    for i in range(nout):
        outline = "%f %f\n" % (x_interpolated[i], y_interpolated[i])
        outfp.write(outline)

    # Close the output file
    outfp.close()

    # Exit gracefully
    sys.exit()
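The hand-written checks on sys.argv at the top of this program could also be expressed with the argparse module from the standard library, which generates the usage message and converts the types for you. What follows is a minimal sketch rather than part of the original program: the helper name parse_arguments is hypothetical, the argument names simply mirror the usage message above, and it assumes that a missing sfactor should be reported as None.

```python
import argparse


def parse_arguments(argv=None):
    """Parse the arguments of interpolate_data.py (hypothetical helper).

    argv is a list of strings; when it is None, argparse reads sys.argv[1:].
    """
    parser = argparse.ArgumentParser(
        prog="interpolate_data.py",
        description="Interpolate data with a univariate spline",
    )
    parser.add_argument("infile", help="text file of x,y pairs, one per line")
    parser.add_argument("outfile", help="file that receives the interpolated points")
    parser.add_argument("nout", type=int, help="number of interpolated points")
    # nargs="?" makes this positional argument optional; None means "not given"
    parser.add_argument("sfactor", type=float, nargs="?", default=None,
                        help="optional smoothing factor")
    return parser.parse_args(argv)


# Demonstrate with an explicit argument list instead of the real command line
args = parse_arguments(["indata.dat", "outdata.dat", "200"])
print(args.infile, args.outfile, args.nout, args.sfactor)
```

Because UnivariateSpline takes s=None as its default, a value parsed this way could be passed straight to the spline as s=args.sfactor, which would make the sfactorflag bookkeeping unnecessary. Running the program with -h would print the usage text that argparse builds from the help strings.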