Actions

Difference between revisions of "Image processing with Python and SciPy"

From AstroEd

(Processing images with NumPy and SciPy)
 
(50 intermediate revisions by the same user not shown)
Line 1: Line 1:
Given that NumPy provides multidimensional arrays, and that there is core support through the Python Imaging Library and Matplotlib to display images and manipulate images in the Python environment, it's easy to take the next step and combine these for scientific image processing.  As part of our  [http://prancer.physics.louisville.edu/astrowiki/index.php/Python_for_Physics_and_Astronomy short course on Python for Physics and Astronomy] we begin by exploring how Python handle image input and output
+
Given that NumPy provides multidimensional arrays, and that there is core support through the Python Imaging Library and Matplotlib to display images and manipulate images in the Python environment, it's easy to take the next step and combine these for scientific image processing.  As part of our  [http://prancer.physics.louisville.edu/astrowiki/index.php/Python_for_Physics_and_Astronomy short course on Python for Physics and Astronomy] we begin by exploring how Python handles image input and output through pillow, scikit-image, and pyfits.  Once loaded, an image may be processed using library routines or by mathematical operations that would take advantage of the speed and conciseness of numpy and scipy.  Some of the resources mentioned here require Python >3.4, and at this time Python 3.6 is the current one.
  
  
== Python Imaging Library - PIL ==
+
== Pillow - An Imaging Library ==
  
Before we get into the broad area of image processing in Python, there is a caveat for users PIL in  Python 3. The essential [http://www.pythonware.com/products/pil/ Python Imaging Library(PIL) is not yet completely compatible with new version of Python.  Consequently the FITS tools we will need for astronomical image processing are also currently only supported in the mature versions of Python 2. The comments that follow are based on Python 2.7.
+
The Python Imaging Library (PIL) was developed for Python 2.x and provided functions to manipulate images, including reading, modifying and saving in various standard image formats in a package called "PIL".  With the coming of age of Python 3.x, a fork of the older version has evolved that is more suited for the new technologies and is in a package called "Pillow".  It continues to improve, and the features described here are tested with "Pillow 5.1" and Python 3.6 as of April 2018. Pillow will probably be on any packaged distribution of Python 3, or it may be installed with (note the capital "P")
  
PIL provides functions to manipulate images, including reading, modifying and saving in various standard image formats.  Its functions are documented in an [http://www.pythonware.com/library/pil/handbook/index.htm on-line manual] with a tutorial, and in this handy
+
pip install Pillow
[http://prancer.physics.louisville.edu/classes/650/python/pil/pil-handbook.pdf pdf guide].
 
  
As a simple starting example, suppose you have an image that was taken with the camera turned  so that "up" is to the side when the image is displayedHere's how you would rotate an image 90 degrees.
+
Pillow  includes the basics of image processing. with functions that are documented by the developers in a [https://pillow.readthedocs.io/en/5.1.x/handbook/index.html handbook] describing the methods and giving some examples.  Pillow uses the same "namespace" as PIL and older code should work, perhaps with a few modifications  to allow for recent developments. Most import for us, Pillow has routines to read and write conventional image formatsOnce an image has been read into a numpy array, the full power of Python is available to process it, and we can turn to Pillow again to save a processed image in  png or jpg or another format.  Flexibile Image Transport System (FITS)  files used for astronomy should be managed with astropy or pyfits.
  
  import Image as pil
+
As a simple starting example, suppose you have an image that was taken with the camera turned so that "up" is to the side when the image is displayed.  Here's how you would read it, rotate it 90 degrees, and write it out again using Pillow.
  
  parser= argparse.ArgumentParser(description = 'Rotate a png image 90 degrees')
+
  import os  
 +
  from PIL import Image as pil
  
if len(sys.argv) == 1:
+
  parser= argparse.ArgumentParser(description = 'Rotate a png image 90 degrees')
  sys.exit("Usage: png_image_rotate file.png ")
 
  exit()
 
elif len(sys.argv) == 2:
 
  infilename = sys.argv[1]
 
else:
 
  sys.exit("Usage: png_image_rotate file.png ")
 
  exit()  
 
  
myimage = pil.open(infilename)
+
  if len(sys.argv) == 1:
 +
    sys.exit("Usage: png_image_rotate file.png ")
 +
    exit()
 +
  elif len(sys.argv) == 2:
 +
    infilename = sys.argv[1]
 +
  else:
 +
    sys.exit("Usage: png_image_rotate file.png ")
 +
    exit()
 +
 
 +
  myimage = pil.open(infilename)
 
   
 
   
mirror = myimage.transpose(pil.ROTATE_90)
+
  mirror = myimage.transpose(pil.ROTATE_90)
outfilename = os.path.splitext(os.path.basename(infilename))[0]+'_r90.png'
+
  outfilename = os.path.splitext(os.path.basename(infilename))[0]+'_r90.png'
mirror.save(outfilename)
+
  mirror.save(outfilename)
  
  
Line 36: Line 38:
 
You can transpose an image left-right with
 
You can transpose an image left-right with
  
mirror = myimage.transpose(pil.FLIP_LEFT_RIGHT)
+
  mirror = myimage.transpose(pil.FLIP_LEFT_RIGHT)
  
 
or do both in one step with
 
or do both in one step with
  
mirror = myimage.transpose(Image.FLIP_LEFT_RIGHT).transpose(pil.ROTATE_90)
+
  mirror = myimage.transpose(Image.FLIP_LEFT_RIGHT).transpose(pil.ROTATE_90)
 +
 
 +
Processing is not limited to "PNG" files, though that file type is preferred because it is not a lossy storage option.  Python reads and writes "jpg" files too.  While PIL provides some essential functionality, for more practical uses in astronomy we need to read Flexible Image Transport or "FITS" files, and to enable numerical work on images in SciPy. Many of the processing functions you will find in Python Imaging Library (PIL) are also available in SciPy where we have precise mathematical control over their definitions and operation. Some more advanced techiques are available in SciPy too, courtesy of researchers who have contributed to SciKit as we will see.
 +
 
 +
Python's core routines dependent on matplotlib may be used to display an image, but these are designed for graphics, and limited by the constraints of the matplotlib interface.  With a little effort there are better choices.
 +
 
 +
 
 +
== SciKit Image ==
 +
 
 +
We've mentioned that [https://www.scipy.org/scikits.html SciKits] is a searchable index of highly specialized tools that are built on SciPy and NumPy.  Among them, [http://scikit-image.org/ scikit-image] is for image processing in Python. It is oriented toward extracting physical information from images, and has routines for reading, writing, and modifying images that are powerful, and fast.  Scikit-image is often compared to OpenCV, a collection of programs for computer vision that include live video.  Both are actively maintained and in many ways complementary, but for physics and astronomy scikit-image is more powerful at this time.
 +
 
 +
The scikit-image package is part of Anaconda and Enthought Python, and that would be a recommended platform for Windows.  However for Linux and Mac OSX,
 +
 
 +
  pip install scikit-image
 +
 
 +
should work.  The package usually requires local compilation of the code when installed this way. Qt bindings are provided by PyQt5, PySide or by qtpy, depending on licensing and software requirements.  PySide is less restrictive than PyQt and qtpy is an abstraction that draws on both.
 +
 
 +
  pip install PySide
 +
 
 +
 
 +
While scipy has included an image reader and writer, as of April 2018 this function is deprecated in the base code and rather than use pillow, we can turn to scikit-image. The module to read and write image is skimage.io
 +
 
 +
  import skimage.io
 +
  import numpy as np
 +
 
 +
and the command
 +
 
 +
  skimage.io.find_available_plugins()
 +
 
 +
will  provide a dictionary of the libraries that may be used to read various file types.  For example
 +
 
 +
imlibs = skimage.io.find_available_plugins()
 +
 
 +
and imlibs['pil'] will list the functions that the Python imaging library provides.  The package tries the libraries in order until it finds one that works:
 +
 
 +
  myimage = skimage.io.imread(filename).astype( np.float32)
 +
 
 +
will read an image and return a numpy array which by default will be an RGB image if the file is a png file, for example.  A greyscale image image be specified by including as_grey=True as an argument. A numpy image has a shape that for color has 3 values in each pixel
 +
 
 +
  print(myimage.shape)
 +
  (498, 680, 3)
 +
 
 +
as an example for an RGB image, or
 +
 
 +
  (498,680)
 +
 
 +
for a gray scale image.  Since numpy by default would store into a 64-bit float and matplotlib (the default display for skimage) requires 32-bit, we specify loading into a 32 bit array while planning ahead to seeing the result.
 +
 
 +
 
 +
Images may be saved:
 +
 
 +
  skimage.io.imsave(filename, nparray)
 +
 
 +
and the file type is determined by file name extension. 
 +
 
 +
Images may be displayed, but it takes two steps
 +
 
 +
  skimage.io.imshow(myimage)
 +
  skimage.io.show()
 +
 
 +
when invoking the default matplotlib plugin.  The display will look like one created by pyplot.  There is a simpler,  viewer module too, without pyplot toolbar.
 +
 
 +
  import skimage.viewer
 +
 
 +
  viewer = skimage.viewer.viewers.ImageViewer(myimage)
 +
  viewer.show
 +
 
 +
 
 +
We will use  routines from the scikit-image  package to identify stars in an image, adapting a program given by Eli Bressert in his book SciPy and NumPy to handle an image from one of our telescopes. The complete program is given in the Examples section below.
 +
 
 +
It begins with the requisite imports including the ones from SciKit
 +
 
 +
  import os
 +
  import sys
 +
  import argparse
 +
  import matplotlib.pyplot as mpl
 +
  import numpy as np
 +
  import astropy.io.fits as pyfits
 +
  import skimage.morphology as morph
 +
  import skimage.exposure as skie
 +
 
 +
It opens an FITS file and loads only part of the image
 +
 
 +
  img =  pyfits.getdata(infits)[1000:3000, 1000:3000]
 +
 
 +
to insure a square sample area and to limit the size of the search region. This is an alternative to reading a FITS image and also getting the header.
 +
 
 +
With this data the program creates a new image using an mapping arc sinh that captures the full dynamic range effectively.  It locates lower and upper bounds that should include only stars.  The parameters would probably have to be refined to optimize the extraction of stars from background.
 +
 
 +
  limg = np.arcsinh(img)
 +
  limg = limg / limg.max()
 +
  low = np.percentile(limg, 0.25)
 +
  high = np.percentile(limg, 99.5)
 +
  opt_img  = skie.exposure.rescale_intensity(limg, in_range=(low,high))
 +
 
 +
With the useful range determined, we create a new image that is scaled between the lower and upper limits that will be used for displaying the star map.  We search the arcsinh-stretched original image for local maxima and catalog those brighter than a threshold that is adjusted based on the image.
 +
 
 +
  lm = morph.is_local_maximum(limg)
 +
  x1, y1 = np.where(lm.T == True)
 +
  v = limg[(y1,x1)]
 +
  lim = 0.7
 +
  x2, y2 = x1[v > lim], y1[v > lim]
 +
 
 +
The list x2,y2 has the stars the were found. The rest is image display code to draw circles around the stars and create an image that shows where they are.
 +
 +
 
 +
[[File:m34_100s_log1000_rgb_20121104.png  | center | 400px]]
 +
 +
When the program processes a 100 second image of the open cluster M34, shown above, it identifies the stars  in the right panel below.
  
Processing is not limited to "PNG" files, though that file type is preferred because it is not a lossy storage option.  Python reads and writes "jpg" files too.  While PIL provides some essential functionality, for more practical uses in astronomy we need to read Flexible Image Transport or "FITS" files, and to enable numerical work on images in SciPy.
+
[[File:test_m34_stars.png | center | 600px]]
  
 
== Images with NumPy and SciPy ==
 
== Images with NumPy and SciPy ==
Line 48: Line 158:
 
SciPy can read jpg and png images directly, without using PIL.  With SciPy images are stored in numpy arrays, and we have direct access to the data for uses other than visualization.
 
SciPy can read jpg and png images directly, without using PIL.  With SciPy images are stored in numpy arrays, and we have direct access to the data for uses other than visualization.
  
import numpy as np
+
  import numpy as np
import matplotlib.pyplot as plt
+
  import matplotlib.pyplot as plt
from scipy.misc import imread, imsave
+
  import skimage.io.imread as imread
 +
  import skimage.io.imsave as imsave
 +
  #import skimage.io.imshow as imshow  is an alternative for display
  
image_data = imread('test.jpg').astype(np.float32)
+
  image_data = imread('test.jpg').astype(np.float32)
print 'Size: ', image_data.size
+
  print 'Size: ', image_data.size
print 'Shape: ', image_data.shape
+
  print 'Shape: ', image_data.shape
 
   
 
   
scaled_image_data = image_data / 255.
+
  scaled_image_data = image_data / 255.
  
# Save the modified image if you want to
+
  # Save the modified image if you want to
# imsave('test_out.png', scaled_image_data)
+
  # imsave('test_out.png', scaled_image_data)
  
plt.imshow(scaled_image_data)
+
  plt.imshow(scaled_image_data)
plt.show()
+
  plt.show()
  
exit()
+
  exit()
  
 
For our 512x512 color test image, this returns
 
For our 512x512 color test image, this returns
  
Size:  786432
+
  Size:  786432
Shape:  (512, 512, 3)
+
  Shape:  (512, 512, 3)
  
 
because the image is 512x512 pixels and has 3 planes -- red, green, and blue. When SciPy reads a jpg or png image it will separate the colors for you. The "image" is a data cube. In the imread line we control the data type within the Python environment.  Of course the initial data is typically 8 bits for color images from a cell phone camera, 16 bits for scientific images from a CCD, and perhaps 32 bits for processed images that require the additional dynamic range.
 
because the image is 512x512 pixels and has 3 planes -- red, green, and blue. When SciPy reads a jpg or png image it will separate the colors for you. The "image" is a data cube. In the imread line we control the data type within the Python environment.  Of course the initial data is typically 8 bits for color images from a cell phone camera, 16 bits for scientific images from a CCD, and perhaps 32 bits for processed images that require the additional dynamic range.
Line 75: Line 187:
 
We can display images with matplotlib.pyplot using [http://matplotlib.org/api/pyplot_api.html#matplotlib.pyplot.imshow imshow()] --
 
We can display images with matplotlib.pyplot using [http://matplotlib.org/api/pyplot_api.html#matplotlib.pyplot.imshow imshow()] --
  
imshow(X, cmap=None, norm=None, aspect=None, interpolation=None,
+
  imshow(X, cmap=None, norm=None, aspect=None, interpolation=None,
  alpha=None, vmin=None, vmax=None, origin=None, extent=None, **kwargs)
+
    alpha=None, vmin=None, vmax=None, origin=None, extent=None, **kwargs)
  
 
where "X" is the image array.  If X is 3-dimensional, imshow will display a color image.
 
where "X" is the image array.  If X is 3-dimensional, imshow will display a color image.
Line 89: Line 201:
 
We can slice the image data to  see each color plane by creating new arrays indexed from the original one.   
 
We can slice the image data to  see each color plane by creating new arrays indexed from the original one.   
  
import numpy as np
+
  import numpy as np
from scipy.misc import imread, imsave
+
  from scipy.misc import imread, imsave
import pylab as plt
+
  import pylab as plt
  
image_data = imread('test.jpg').astype(np.float32)
+
  image_data = imread('test.jpg').astype(np.float32)
scaled_image_data = image_data / 255.
+
  scaled_image_data = image_data / 255.
  
print 'Size: ', image_data.size
+
  print 'Size: ', image_data.size
print 'Shape: ', image_data.shape
+
  print 'Shape: ', image_data.shape
 
    
 
    
image_slice_red =  scaled_image_data[:,:,0]
+
  image_slice_red =  scaled_image_data[:,:,0]
image_slice_green =  scaled_image_data[:,:,1]
+
  image_slice_green =  scaled_image_data[:,:,1]
image_slice_blue =  scaled_image_data[:,:,2]
+
  image_slice_blue =  scaled_image_data[:,:,2]
  
print 'Size: ', image_slice_0.size
+
  print 'Size: ', image_slice_0.size
print 'Shape: ', image_slice_0.shape
+
  print 'Shape: ', image_slice_0.shape
  
plt.subplot(221)
+
  plt.subplot(221)
plt.imshow(image_slice_red, cmap=plt.cm.Reds_r)
+
  plt.imshow(image_slice_red, cmap=plt.cm.Reds_r)
  
plt.subplot(222)
+
  plt.subplot(222)
plt.imshow(image_slice_green, cmap=plt.cm.Greens_r')
+
  plt.imshow(image_slice_green, cmap=plt.cm.Greens_r')
  
plt.subplot(223)
+
  plt.subplot(223)
plt.imshow(image_slice_blue, cmap=plt.cm.Blues_r)   
+
  plt.imshow(image_slice_blue, cmap=plt.cm.Blues_r)   
  
plt.subplot(224)
+
  plt.subplot(224)
plt.imshow(scaled_image_data)   
+
  plt.imshow(scaled_image_data)   
  
plt.show()
+
  plt.show()
  
  
Line 125: Line 237:
 
[[File:single_image_slice.png | center | 600px ]]
 
[[File:single_image_slice.png | center | 600px ]]
  
== Astronomical FITS files with PyFITS ==
 
  
[http://www.stsci.edu/institute/software_hardware/pyfits PyFITS] is available from the Space Telescope Science Institute, and can be added easily to a Python installation that already has NumPy and SciPy.  As of January 2013, the current version 3.1.1 of PyFITS supports all the functions needed to manage image and table data in the standard Flexibile Image Transport System (FITS) files of astronomy.
+
This approach offers a template for displaying multidimensional computed or experimental data as an image created with Python.  Consider this short program that creates and displays an image with Gaussian noise:
  
Once a FITS file has been read, the header is accessible for reading, and the image data are available in a NumPy array. With PyFITS and NumPY you can read, extract information, modify, display and save image data.
+
  # Import the packages you need
 +
  import numpy as np
 +
  import matplotlib.pyplot as plt
 +
  from scipy.misc import imsave
 +
 
 +
  # Create the image data
 +
  image_data = np.zeros(512*512, dtype=np.float32).reshape(512,512)
 +
  random_data = np.random.randn(512,512)
 +
  image_data = image_data + 100.*random_data
 +
 +
  print 'Size: ', image_data.size
 +
  print 'Shape: ', image_data.shape
 +
  scaled_image_data = image_data / 255.
 +
 
 +
  # Save and display the image  
 +
  imsave('noise.png', scaled_image_data)
 +
  plt.imshow(scaled_image_data, cmap='gray')
 +
  plt.show()
 +
 
 +
  exit()
 +
 
 +
 
 +
Instead of random noise, you could create any functional data you wanted to display. For images with color, the NumPy array would have red, green, and blue planes.
  
  
=== Displaying a FITS file in Python ===
+
== Astronomical FITS Files ==  
  
There are many image display tools for astronomy, and perhaps the most widely used is [http://hea-www.harvard.edu/RD/ds9/site/Home.html ds9] which is available for Linux, MacOS, and Windows, as well as in source codeIt is useful to have a version on your computer when you are working with FITS images. However, sometimes we want to view an image while we are processing it in Python, and for that we could use a routine such as this one.
+
Astronomical image data are potentially complex and rich, for which quantitative  structures have been developed to standardize lossless storage of the data along with the metadata that describe its origin and previous processing.  While photographic images are often only 8 bits deep, mixed with red, green and blue in a single image,  and compressed to reduce file size, astronomical images are 16 or 32 bits deep in a single color.  Until recently, the file sizes needed for astronomy were unrivaled by commodity cameras, but in todays market of megapixel imaging on cell phones, the camera in your pocket also produces very large rich images.  The difference in handling them is that for science we want to preserve the data without loss, quantitatively calibrate and measure  the flux from the source and map that back to a specific angle in space, while for art or even for some academic uses, it is the beauty, color, and highlights shown in the image display that are important.  Commodity images are saved usually in compressed formats such as JPG, or uncompressed TIFF or proprietary binary formats.  For astronomy and other quantitative imaging work, the [https://fits.gsfc.nasa.gov/fits_primer.html Flexible Image Transport System (or FITS)] format is almost universal. It includes the image data, and a header describing the data.  FITS files may also be tables of data, or a cube of images in sequenceThe standards developed for creating these files are slowing evolving as the needs of big data in astronomy have grown.   Programmers and scientists at NASA, the Space Telescope Science Institute, and the academic community at large are contributing to libraries that enable reading, processing, and saving FITS files in Python, as well as in C, and Fortran.
  
 +
Once a FITS file has been read, the header its accessible as a Python dictionary of  the data contents, and the image data are  in a NumPy array. With Python using NumPy and SciPy you can read, extract information, modify, display, create  and save image data. 
  
import os
 
import sys
 
import argparse
 
import numpy as np
 
import pyfits
 
import matplotlib
 
import matplotlib.pyplot
 
  
Sometimes submodules are not loaded with the larger package, and we explictly ask for pyplot.
+
=== Reading and Writing a FITS File in Python ===
 +
 
 +
There are many image display tools for astronomy, and perhaps the most widely used is [http://hea-www.harvard.edu/RD/ds9/site/Home.html ds9] which is available for Linux, MacOS, and Windows, as well as in source code.  It is always useful to have a version on your computer when you are working with FITS images. A more versatile Java platform for astronomical image viewing  that also does and processing'  is now widely used for precision astronomical photometry where interactive analysis is needed. AstroImageJ is free and simple to install on most computers, and because it is also a powerful processor, for many purposes it is an all-in-one tool.  If you are working with FITS images, this is highly recommended: 
 +
 
 +
[http://www.astro.louisville.edu/software/astroimagej/index.html AstroImageJ] website and program source
 +
 
 +
[http://iopscience.iop.org/article/10.3847/1538-3881/153/2/77 Astronomical Journal] article describing Astroimagej
 +
 
 +
 
 +
However, sometimes we want perform specialized work on image data, and to view it while processing it in Python.  This routine demonstrates how to read a FITS file, inspect its header, and show the image on the computer display.  It is dependent on the "PyFITS" library developed at Space Telescope Science Institute and  incorporated into a larger package by the  [http://www.astropy.org/ AstroPy Project].  AstroPy's library is part of the Enthought and Canopy distributions of Python.  If you are using a system version of python, you may need to install it with pip
 +
 
 +
  pip install astropy
 +
 
 +
should do it.  The package have several dependences on python (it requires version 3.5 or higher at this time (April 2018), and on numpy and scipy.  It is also frequently updated, and while stable may push the "cutting edge" of distributions from conservative operating system distributors like OpenSuse.  That's a good reason, if you need this capability, to have a version of Python built with current sources, or to use a complete distribution such as Anaconda or Enthought.
 +
 
 +
A program to work with FITS files would begin by importing the packages that it needs
 +
 
 +
 
 +
  import os
 +
  import sys
 +
  import argparse
 +
  import numpy as np
 +
  import astropy.io.fits as pyfits
 +
  import matplotlib
 +
  import matplotlib.pyplot
 +
 
 +
Importing the FITS modules in this way makes the code backward compatible with the earlier versions of PyFITS.  Also, since sometimes submodules are not loaded with the larger package, we explicitly ask for the io.fits components, and for pyplot.
  
# Define a function for making a linear gray scale
+
  # Define a function for making a linear gray scale
def lingray(x, a=None, b=None):
+
  def lingray(x, a=None, b=None):
  """
+
  """
  Auxiliary function that specifies the linear gray scale.
+
  Auxiliary function that specifies the linear gray scale.
  a and b are the cutoffs : if not specified, min and max are used
+
  a and b are the cutoffs : if not specified, min and max are used
  """
+
  """
  if a == None:
+
  if a == None:
          a = np.min(x)
+
    a = np.min(x)
  if b == None:
+
  if b == None:
          b = np.max(x)
+
    b = np.max(x)
  return 255.0 * (x-float(a))/(b-a)
+
  return 255.0 * (x-float(a))/(b-a)
  
# Define a function for making a logarithmic gray scale
+
  # Define a function for making a logarithmic gray scale
def loggray(x, a=None, b=None):
+
  def loggray(x, a=None, b=None):
  """
+
    """
  Auxiliary function that specifies the logarithmic gray scale.
+
    Auxiliary function that specifies the logarithmic gray scale.
  a and b are the cutoffs : if not specified, min and max are used
+
    a and b are the cutoffs : if not specified, min and max are used
  """
+
    """
  if a == None:
+
    if a == None:
          a = np.min(x)
+
      a = np.min(x)
  if b == None:
+
    if b == None:
          b = np.max(x)           
+
      b = np.max(x)           
  linval = 10.0 + 990.0 * (x-float(a))/(b-a)
+
  linval = 10.0 + 990.0 * (x-float(a))/(b-a)
  return (np.log10(linval)-1.0)*0.5 * 255.0
+
  return (np.log10(linval)-1.0)*0.5 * 255.0
  
 
These functions may be used to rescale the data for display.  Scaling can be done with a colormap, or by modifying the data before displaying.  Here, we modify the data and save it in a temporary buffer for display.  In a more elaborate program with a full user interface, this scaling could be done interactively.
 
These functions may be used to rescale the data for display.  Scaling can be done with a colormap, or by modifying the data before displaying.  Here, we modify the data and save it in a temporary buffer for display.  In a more elaborate program with a full user interface, this scaling could be done interactively.
  
# Provide information to the argparse routine if we need it
+
  # Provide information to the argparse routine if we need it
parser= argparse.ArgumentParser(description = 'Display a
+
  parser= argparse.ArgumentParser(description = 'Display a fits image')
  fits image')
 
  
The argparse function offers more flexibility than the system routines, though here we only include this for future additions.  
+
The argparse function offers more flexibility than the system routines, though here we only include this for future additions. A simpler method is to parse the command line itself using the system utilities:
  
# Test for command line arguments
+
  # Test for command line arguments
if len(sys.argv) == 1:
+
  if len(sys.argv) == 1:
  sys.exit("Usage: display_fits infile.fits ")
+
    sys.exit("Usage: display_fits infile.fits ")
  exit()
+
    exit()
elif len(sys.argv) == 2:
+
  elif len(sys.argv) == 2:
  infits = sys.argv[1]
+
    infits = sys.argv[1]
else:
+
  else:
  sys.exit("Usage: display_fits infile.fits ")
+
    sys.exit("Usage: display_fits infile.fits ")
  exit()  
+
    exit()  
  
# Open the fits file and create an hdulist
+
  # Open the fits file and create an hdulist
inhdulist = pyfits.open(infits)  
+
  inhdulist = pyfits.open(infits)  
  
# Assign the input header in case it is needed later
+
  # Assign the input header in case it is needed later
inhdr = inhdulist[0].header
+
  inhdr = inhdulist[0].header
  
# Assign image data to a numpy array
+
  # Assign image data to a numpy array
image_data =  inhdulist[0].data
+
  image_data =  inhdulist[0].data
  
 
The header and data are now available.  We'll look at header information later.  For now, all we need are the values in the numpy data array. It will be indexed from [0,0] at the upper left of the data space, which would be the upper left of the displayed image.
 
The header and data are now available.  We'll look at header information later.  For now, all we need are the values in the numpy data array. It will be indexed from [0,0] at the upper left of the data space, which would be the upper left of the displayed image.
  
# Print information about the image
+
  # Print information about the image
print 'Size: ', image_data.size
+
  print 'Size: ', image_data.size
print 'Shape: ', image_data.shape
+
  print 'Shape: ', image_data.shape
  
 
For this example we use linear scaling.
 
For this example we use linear scaling.
  
# Show the image
+
  # Show the image
new_image_data = lingray(image_data)
+
  new_image_data = lingray(image_data)
new_image_min = 0.
+
  new_image_min = 0.
new_image_max = np.max(new_image_data)
+
  new_image_max = np.max(new_image_data)
matplotlib.pyplot.imshow(new_image_data, vmin = new_image_min,  
+
  matplotlib.pyplot.imshow(new_image_data, vmin = new_image_min, vmax = new_image_max, cmap ='gray')   
  vmax = new_image_max, cmap ='gray')   
+
  matplotlib.pyplot.show()   
matplotlib.pyplot.show()   
 
 
 
# Close the input image file and exit
 
inhdulist.close()
 
exit ()
 
 
 
It would be a straightforward exercise to add an interactive scale control and to read out the value of each pixel in this program.  With that, it has the basic functionality of ds9.
 
  
 +
  # Close the input image file and exit
 +
  inhdulist.close()
 +
  exit ()
  
 +
It would be a straightforward exercise to add an interactive scale control and to read out the value of each pixel in this program.  With that, it has the basic functionality of ds9 or the AstroImageJ viewer.
  
  
=== Dark subtraction and flat fielding ===
+
=== Correcting and Combining Images ===
  
 
There are  processing operations done on all "raw" astronomical images taken with [http://prancer.physics.louisville.edu/astrowiki/index.php/Use_a_CCD_Camera charge coupled device cameras]. Since the voltage that is digitized is proprotional to the number of photons that arrived at each pixel during the exposure, the data for that pixel should be proportional to the photon count, that is to the irradiance of photons/area-time times the area of the pixel times the exposure time.  An absolute conversion to the flux from the sources that are imaged requires correcting for
 
There are  processing operations done on all "raw" astronomical images taken with [http://prancer.physics.louisville.edu/astrowiki/index.php/Use_a_CCD_Camera charge coupled device cameras]. Since the voltage that is digitized is proprotional to the number of photons that arrived at each pixel during the exposure, the data for that pixel should be proportional to the photon count, that is to the irradiance of photons/area-time times the area of the pixel times the exposure time.  An absolute conversion to the flux from the sources that are imaged requires correcting for
Line 257: Line 406:
 
and has a shape that is (n,m) with the elements that we wanted.   
 
and has a shape that is (n,m) with the elements that we wanted.   
  
Median operations on a image stack remove random noise more effectively than averaging because one source of noise in CCD images is cosmic ray events that produce an occasional large signal at a pixel.  If we mean-averaged that data for that pixel the result would be too large because of the one singular event in the stack, but by median-averaging we discard that event without adversely affecting the new reference frame.
+
Median operations on a image stack remove random noise more effectively than averaging because one source of noise in CCD images is cosmic ray events that produce an occasional large signal at a pixel.  If we mean-averaged with an outlier in one pixel the result would be too large because of the one singular event in the stack, but by median-averaging we discard that event without adversely affecting the new reference frame.
  
 
A median of a stack of flat frames, all normalized so that they should be identical, will remove stars from the reference image as long as each contributing flat image in the stack is taken of a different star field.  Typically, these "sky flats" are  images taken at twilight, processed to remove the dark signal, normalized to unity, and then median averaged to remove stars and reduce random noise.   
 
A median of a stack of flat frames, all normalized so that they should be identical, will remove stars from the reference image as long as each contributing flat image in the stack is taken of a different star field.  Typically, these "sky flats" are  images taken at twilight, processed to remove the dark signal, normalized to unity, and then median averaged to remove stars and reduce random noise.   
Line 276: Line 425:
 
We would do this to create a final image that is effectively one long exposure, the sum of all the contributing image exposure times.  Because of guiding errors, cosmic rays, and weather,  one very long exposure is often not possible, but 10's or 100's of shorter exposures can be "co-added" after selecting the best ones and aligning them so that each pixel in the contributing image corresponds to the same place in the sky.
 
We would do this to create a final image that is effectively one long exposure, the sum of all the contributing image exposure times.  Because of guiding errors, cosmic rays, and weather,  one very long exposure is often not possible, but 10's or 100's of shorter exposures can be "co-added" after selecting the best ones and aligning them so that each pixel in the contributing image corresponds to the same place in the sky.
  
Finally, for spectroscopy the useful data in an image may be a sum over a region.  In that case we could index through the array and explicitly create the sum if needed, but in the ideal case of a spectrum we may want only the sum along a column for each element of a row.  For that, the function is
 
  
  spectrum = np.sum(image, axis)
+
=== Masked Image Operations ===
 +
 
 +
A [http://docs.scipy.org/doc/numpy/reference/maskedarray.generic.html Numpy array mask] is a boolean array that determines whether or not an operation is to be performed.  If you have an image in a array, the mask allows you to work on only part of the image, ignoring the other part.  This is useful for finding the mean of a selected region, or for computing a function that fits part of an image but ignores another part.  For example, consider an array
 +
 
 +
  x = np.arange(4*4).reshape(4,4)
 +
 
 +
which is a simple 4x4 image with values from 0 to 15:
 +
 
 +
print x
 +
array([[ 0,  1, 2,  3],
 +
      [ 4,  5,  6,  7],
 +
      [ 8,  9, 10, 11],
 +
      [12, 13, 14, 15]])
 +
 
 +
Make a copy of ths image which is a boolean mask
 +
 
 +
xmask = np.ma.make_mask(x,copy=True,shrink=True,dtype=np.bool)
 +
 
 +
and it will have all True  values except for the first entry, 0, which is interpreted False:
 +
 
 +
print xmask
 +
array([[False,  True,  True,  True],
 +
      [ True,  True,  True,  True],
 +
      [ True,  True,  True,  True],
 +
      [ True,  True,  True,  True]], dtype=bool)
 +
 
 +
You can set all values (or individual ones) to the state you need
 +
 
 +
xmask[:,:] = True
 +
 
 +
will make them all True, or
 +
 
 +
xmask[0,:] = False
 +
xmask[3,:] = False
 +
xmask[:,0] = False
 +
xmask[:,3] = False
 +
 
 +
will set the values around the perimeter False and leave the others True
 +
 
 +
print xmask
 +
array([[False, False, False, False],
 +
      [False,  True,  True, False],
 +
      [False,  True,  True, False],
 +
      [False, False, False, False]], dtype=bool)
 +
 
 +
 
 +
We apply the mask to the original data
 +
 
 +
mx = np.ma.masked_array(x, mask=xmask)
 +
print mx
 +
 +
[[0 1 2 3]
 +
[4 -- -- 7]
 +
[8 -- -- 11]
 +
[12 13 14 15]]
 +
 
 +
and see that the values that are masked by "True" are not included in the new masked array.  If we want the sum of the elements in the masked array, then
 +
 
 +
np.sum(mx)
 +
90
 +
 
 +
while
 +
 
 +
np.sum(x)
 +
120
  
which returns a numpy array that is the sum along the specified axis.  If there are regions in the image that should not be included in the sum, then the image could be masked before computing the sum. The result is a 1-d array in which each element is the signal at a different wavelength.  
+
There are many ways to create a mask, and to operate on masked arrays, described in the [http://docs.scipy.org/doc/numpy/reference/maskedarray.generic.html Numpy documentation].
  
Sometimes the spectral lines are not along a column but are still straight (if curved, then we need other ways to mask the array), in which case scipy.misc.imrotate can be used before the sum to align the spectrum with a row or column.
 
  
=== FITS headers ===
+
=== FITS Headers ===
  
 
FITS files contain a "header" that tells us what is in the file, and then data in a format that is defined by the header.  As we have seen, when a FITS file is read in NumPy with PyFITS, the data and the header are separate entities. Here's the complete header from an image taken with one of our telescopes:
 
FITS files contain a "header" that tells us what is in the file, and then data in a format that is defined by the header.  As we have seen, when a FITS file is read in NumPy with PyFITS, the data and the header are separate entities. Here's the complete header from an image taken with one of our telescopes:
Line 294: Line 505:
 
We will open an image ''v1701_00146_i.fits'' in interactive Python and look at the header.
 
We will open an image ''v1701_00146_i.fits'' in interactive Python and look at the header.
  
import numpy as np
+
  import numpy as np
import pyfits
+
  import astropy.io.fits as pyfits
infits='v1701_00146_i.fits'
+
  infits='v1701_00146_i.fits'
inhdulist = pyfits.open(infits)
+
  inhdulist = pyfits.open(infits)
inhdr = inhdulist[0].header
+
  inhdr = inhdulist[0].header
inhdr
+
  inhdr
  
  
SIMPLE  =                    T / file does conform to FITS standard
+
  SIMPLE  =                    T / file does conform to FITS standard
BITPIX  =                  16 / number of bits per data pixel
+
  BITPIX  =                  16 / number of bits per data pixel
NAXIS  =                    2 / number of data axes
+
  NAXIS  =                    2 / number of data axes
NAXIS1  =                4096 / length of data axis 1
+
  NAXIS1  =                4096 / length of data axis 1
NAXIS2  =                4096 / length of data axis 2
+
  NAXIS2  =                4096 / length of data axis 2
EXTEND  =                    T / FITS dataset may contain extensions
+
  EXTEND  =                    T / FITS dataset may contain extensions
COMMENT  FITS (Flexible Image Transport System) format  
+
  COMMENT  FITS (Flexible Image Transport System) format is defined in  'Astronomy
  is defined in  'Astronomy
+
  COMMENT  and Astrophysics', volume 376, page 359; bibcode: 2001A&A...376..359H
COMMENT  and Astrophysics', volume 376, page 359;  
+
  BZERO  =                32768 / offset data range to that of unsigned short
  bibcode: 2001A&A...376..359H
+
  BSCALE  =                    1 / default scaling factor
BZERO  =                32768 / offset data range to that of unsigned short
+
  EXPTIME =                100. / exposure time (seconds)
BSCALE  =                    1 / default scaling factor
+
  DATE-OBS= '2013-01-19T04:18:09.140' / date of observation (UT)
EXPTIME =                100. / exposure time (seconds)
+
  IMAGETYP= 'Light Frame'        / image type
DATE-OBS= '2013-01-19T04:18:09.140' / date of observation (UT)
+
  TARGET  = 'V1701  '          / target
IMAGETYP= 'Light Frame'        / image type
+
  INSTRUME= 'griz    '          / instrument
TARGET  = 'V1701  '          / target
+
  CCD-TEMP=              -9.921 / temperature (C)
INSTRUME= 'griz    '          / instrument
 
CCD-TEMP=              -9.921 / temperature (C)
 
 
   FILTER  = ' (3) i (700-825)'  / filter
 
   FILTER  = ' (3) i (700-825)'  / filter
TELESCOP= 'CDK20N  '          / telescope
+
  TELESCOP= 'CDK20N  '          / telescope
DATE    = '2013-01-19T04:20:08' / file creation date (YYYY-MM-DDThh:mm:ss UT)
+
  DATE    = '2013-01-19T04:20:08' / file creation date (YYYY-MM-DDThh:mm:ss UT)
  
 
The first entries tell us it is a simple image file, 4096x4096 pixels (16 megapixels) written with 16 integer data bits per pixel. The other entries provide information about the image data.  Therefore in dealing with FITS data we may need to change the first entries if the file is modified, and append new entries that annotate what has been done, or add to the archival notes.
 
The first entries tell us it is a simple image file, 4096x4096 pixels (16 megapixels) written with 16 integer data bits per pixel. The other entries provide information about the image data.  Therefore in dealing with FITS data we may need to change the first entries if the file is modified, and append new entries that annotate what has been done, or add to the archival notes.
  
The header in PyFITS is a Python dictionary.  If you want to know the exposure time, ask
+
The header in pyfits  is a Python dictionary.  If you want to know the exposure time, ask
  
inhdr['EXPTIME']
+
  inhdr['EXPTIME']
  
 
and Python responds
 
and Python responds
  
100.
+
  100.
  
 
or  
 
or  
  
inhdr['DATE-OBS']
+
  inhdr['DATE-OBS']
  
 
gets
 
gets
  
'2013-01-19T04:18:09.140'
+
  '2013-01-19T04:18:09.140'
  
 
This means that you can sort through the contents of headers in a Python program to find the exposures you need, identify the filters used, and see what processing has been done.  Most of the KEYWORDS shown above are standard, and those that are not can be easily added to specialized Python code.
 
This means that you can sort through the contents of headers in a Python program to find the exposures you need, identify the filters used, and see what processing has been done.  Most of the KEYWORDS shown above are standard, and those that are not can be easily added to specialized Python code.
  
When a new FITS image is written with PyFITS it contains only the bare necessities in the header -- the data type, some reference values for zero  and scaling if needed, the size of the array.  However you can copy items from the header of an image into images you create, and annotate the headers of your work to maintain a record of what has been done.
+
When a new FITS image is written with pyfits it contains only the bare necessities in the header -- the data type, some reference values for zero  and scaling if needed, the size of the array.  However you can copy items from the header of an image into images you create, and annotate the headers of your work to maintain a record of what has been done.
  
 
An output FITS image file  is created in steps from a NumPy data array outimage with
 
An output FITS image file  is created in steps from a NumPy data array outimage with
  
outhdu = pyfits.PrimaryHDU(outimage)
+
  outhdu = pyfits.PrimaryHDU(outimage)
  
 
which encapsulates an image in an "HDU" object.  The line
 
which encapsulates an image in an "HDU" object.  The line
  
outhdulist = pyfits.HDUList([outhdu])
+
  outhdulist = pyfits.HDUList([outhdu])
  
 
creates a list that contains the primary HDU which will have a default header
 
creates a list that contains the primary HDU which will have a default header
  
outhdr = outhdulist[0].header
+
  outhdr = outhdulist[0].header
  
 
Now you can append to this header or modify it
 
Now you can append to this header or modify it
  
history = 'This is what I did to the file.'
+
  history = 'This is what I did to the file.'
outhdr.append(('HISTORY',history))
+
  outhdr.append(('HISTORY',history))
more_history = 'I did this today.'
+
  more_history = 'I did this today.'
outhdr.append(('HISTORY',more_history))
+
  outhdr.append(('HISTORY',more_history))
  
 
since it is only a standard Python dictionary.
 
since it is only a standard Python dictionary.
  
  
For astronomical images headers may also include information that maps the image to the sky, a World Coordinate System or WCS header within the FITS header.  These keywords apply only if the spatial mapping of the image is unchanged, and in processing that shifts or distorts images the WCS header should not be copied.
+
For astronomical images headers may also include information that maps the image to the sky, a World Coordinate System or WCS header within the FITS header.  These keywords apply only if the spatial mapping of the image is unchanged, and in processing that shifts or distorts images the WCS header should not be copied. Setting the WCS header and interpreting it to obtain the celestial coordinates requires -- what else -- PyWCS also developed at the Space Telescope Institute and now included in AstroPy.
Setting the WCS header and intepreting it to obtain the celestial coordinates requires -- what else -- [http://stsdas.stsci.edu/astrolib/pywcs/index.html PyWCS] from the Space Telescope Institute.
+
 
 +
For example, to access the world coordinate system in a fits file, we would import the module
  
== Processing images ==
+
  from astropy.wcs import WCS
 +
 
 +
to have the same namespace as the original PyWCS and library functions for conversion to and from celestial coordinates and pixel coordinates in the image. There are many examples of this in a package of utilities we have developed here
 +
 
 +
  [http://www.astro.louisville.edu/software/alsvid/index.html Alsvid] Algorithms for Visualization and Processing of Image Data
 +
 
 +
 
 +
== Other Processing  ==
  
 
We have seen that there are many useful basic operations for image processing available simply through NumPy and PyFITS.  SciPy adds several others in the [http://docs.scipy.org/doc/scipy/reference/ndimage.html ndimage] package.  The functions include image convolution, various averaging or filtering algorithms, Fourier processing, image interpolation, and image rotation.
 
We have seen that there are many useful basic operations for image processing available simply through NumPy and PyFITS.  SciPy adds several others in the [http://docs.scipy.org/doc/scipy/reference/ndimage.html ndimage] package.  The functions include image convolution, various averaging or filtering algorithms, Fourier processing, image interpolation, and image rotation.
  
  
Since images are stored as arrays, there are some simple one-line ways to modify them.  Flipping an image top to bottom  or left to rignt is done with
+
Since images are stored as arrays, there are some simple one-line ways to modify them.  Flipping an image top to bottom  or left to right is done with
  
 
  import numpy as np
 
  import numpy as np
Line 393: Line 610:
 
  imsave(filename,rotated_image_data)
 
  imsave(filename,rotated_image_data)
  
Image rotation by other angles is somewhat more complex since it requires tranforming an image in a way that does not preserve its shape, or even the number of elements.  We will consider this with the help of scipy.ndimage later.
+
Image rotation by other angles is somewhat more complex since it requires tranforming an image in a way that does not preserve its shape, or even the number of elements.   
  
For example, arbitrary image rotation requires interpolation and will change the shape of an array.  There is a simple one-line command that does all this for you
+
There is a simple one-line command that does all this for you
  
 
   from scipy.misc import imrotate
 
   from scipy.misc import imrotate
Line 408: Line 625:
 
*'bicubic'
 
*'bicubic'
  
=== Adding SciKits ===
+
Finally, for spectroscopy the useful data in an image may be a sum over a region.  In that case we could index through the array and explicitly create the sum if needed, but in the ideal case of a spectrum we may want only the sum along a column for each element of a row.  For that, the function is
  
We've mentioned that SciKits is a searchable index of tools that are built on SciPy and NumPy. There is an imaging package that includes a very useful tool for astronomy to find local maxima (stars!!) in an image.  Look for
+
spectrum = np.sum(image, axis)
  
[http://scikits.appspot.com/scikit-image scikit-image]
+
which returns a numpy array that is the sum along the specified axis. If there are regions in the image that should not be included in the sum, then the image could be masked before computing the sum. The result is a 1-d array in which each element is the signal at a different wavelength.
  
The package may be available for your computer system, or if not, the instructions on the website will work for Linux and MacOS. Routines in SciKits are usually not available pre-compiled, which can limit their use in Windows. On an OpenSUSE installation, with SciPy already installed, you will need to run as root
+
Sometimes the spectral lines are not along a column but are still straight (if curved, then we need other ways to mask the array), in which case scipy.misc.imrotate can be used before the sum to align the spectrum with a row or column.
  
easy_install cython
 
easy_install scikit-image
 
  
where the first one is a Python processor for C programs that may (or may not) already be on your system.
 
  
We will use  this package to identify stars in an image, adapting a program
 
given by Eli Bressert in his book SciPy and NumPy to handle an image from one of our telescopes. The complete program is given in the Examples section below.
 
  
It begins with the requisite imports including the ones from SciKit
+
==  AstroImageJ and Alsvid ==
  
  import os
+
We have developed a collection of Python routines to do many of the routine astronomical image processing tasks such as dark subtraction, flat fielding, co-addition, and FITS header management through PyFITS and PyWCSThe current version of the "Alsvid" package is available for download:
import sys
 
import argparse
 
import matplotlib.pyplot as mpl
 
import numpy as np
 
import pyfits
 
import skimage.morphology as morph
 
  import skimage.exposure as skie
 
  
It opens an FITS file and loads only part of the image
 
  
img =  pyfits.getdata(infits)[1000:3000, 1000:3000]
+
[http://www.astro.louisville.edu/software/alsvid/index.html Alsvid]
  
to insure a square sample area and to limit the size of the search region. This is an alternative to reading a FITS image and also getting the header.
 
  
With this data the program creates a new image using an mapping arc sinh that captures the full dynamic range effectively.  It locates lower and upper bounds that should include only stars.  The parameters would probably have to be refined to optimize the extraction of stars from background.
 
  
limg = np.arcsinh(img)
+
== Examples ==
limg = limg / limg.max()
 
low = np.percentile(limg, 0.25)
 
high = np.percentile(limg, 99.5)
 
opt_img  = skie.exposure.rescale_intensity(limg, in_range=(low,high))
 
  
With the useful range determined, we create a new image that is scaled between the lower and upper limits that will be used for displaying the star map. We search the arcsinh-stretched original image for local maxima and catalog those brighter than a threshold that is adjusted based on the image.
+
For examples of Python illustrating image processing, see the [http://prancer.physics.louisville.edu/astrowiki/index.php/Python_examples examples] section.
  
lm = morph.is_local_maximum(limg)
 
x1, y1 = np.where(lm.T == True)
 
v = limg[(y1,x1)]
 
lim = 0.7
 
x2, y2 = x1[v > lim], y1[v > lim]
 
  
The list x2,y2 has the stars the were found. The rest is image display code to draw circles around the stars and create an image that shows where they are.
 
 
  
[[File:m34_100s_log1000_rgb_20121104.png  | center | 400px]]
 
 
   
 
   
When the program processes a 100 second image of the open cluster M34, shown above, it identifies the stars in the right panel below.
+
Alsvid is intended as a command line supplement to the powerful Java program [http://www.astro.louisville.edu/software/astroimagej/index.html  AstroImageJ] which provides real-time interactivity with astronomical image processing and precision photometry.  AstroImageJ is built on the orirignal [https://imagej.nih.gov/ij/ ImageJ], an image processing program developed at the National Institutes of Health and now maintained as public domain opensource resource. As such, this core component of AIJ offers many specialized tools for image analysis in the biological sciences which are equally useful in Physics and Astronomy. AstroImageJ along side versatile Python desktop processing is a powerful combination for astronomical image analysis.
 
 
[[File:test_m34_stars.png | center | 600px]]
 
 
 
== Examples ==
 
  
For examples of Python illustrating image processing, see the [http://prancer.physics.louisville.edu/astrowiki/index.php/Python_examples examples] section.
 
  
 
== Assignments ==
 
== Assignments ==
  
 
For the assigned homework to use these ideas, see the [http://prancer.physics.louisville.edu/astrowiki/index.php/Python_assignments assignments] section.
 
For the assigned homework to use these ideas, see the [http://prancer.physics.louisville.edu/astrowiki/index.php/Python_assignments assignments] section.

Latest revision as of 17:54, 14 April 2018

Given that NumPy provides multidimensional arrays, and that there is core support through the Python Imaging Library and Matplotlib to display images and manipulate images in the Python environment, it's easy to take the next step and combine these for scientific image processing. As part of our short course on Python for Physics and Astronomy we begin by exploring how Python handles image input and output through pillow, scikit-image, and pyfits. Once loaded, an image may be processed using library routines or by mathematical operations that would take advantage of the speed and conciseness of numpy and scipy. Some of the resources mentioned here require Python >3.4, and at this time Python 3.6 is the current one.


Pillow - An Imaging Library

The Python Imaging Library (PIL) was developed for Python 2.x and provided functions to manipulate images, including reading, modifying and saving in various standard image formats in a package called "PIL". With the coming of age of Python 3.x, a fork of the older version has evolved that is more suited for the new technologies and is in a package called "Pillow". It continues to improve, and the features described here are tested with "Pillow 5.1" and Python 3.6 as of April 2018. Pillow will probably be on any packaged distribution of Python 3, or it may be installed with (note the capital "P")

pip install Pillow

Pillow includes the basics of image processing. with functions that are documented by the developers in a handbook describing the methods and giving some examples. Pillow uses the same "namespace" as PIL and older code should work, perhaps with a few modifications to allow for recent developments. Most import for us, Pillow has routines to read and write conventional image formats. Once an image has been read into a numpy array, the full power of Python is available to process it, and we can turn to Pillow again to save a processed image in png or jpg or another format. Flexibile Image Transport System (FITS) files used for astronomy should be managed with astropy or pyfits.

As a simple starting example, suppose you have an image that was taken with the camera turned so that "up" is to the side when the image is displayed. Here's how you would read it, rotate it 90 degrees, and write it out again using Pillow.

 import os  
 from PIL import Image as pil
 parser= argparse.ArgumentParser(description = 'Rotate a png image 90 degrees')
 if len(sys.argv) == 1:
   sys.exit("Usage: png_image_rotate file.png ")
   exit()
 elif len(sys.argv) == 2:
   infilename = sys.argv[1]
 else:
   sys.exit("Usage: png_image_rotate file.png ")
   exit() 
 myimage = pil.open(infilename)

 mirror = myimage.transpose(pil.ROTATE_90)
 outfilename = os.path.splitext(os.path.basename(infilename))[0]+'_r90.png'
 mirror.save(outfilename)


The first part of this is standard form to get the image name on the command line and make it available to the program. The PIL is imported with Image, and appears in the code as "pil". This is an amazingly short program, because in opening the image the library handles all the conversions in formatting and stores the image internally so that you refer to it only by the name assigned when it is loaded. We operate on the image with the transpose function, which has an argument that controls what it does. Here we rotate the image 90 degrees, and then save it to a file with a new name. The saving operation converts the internal data back to the file format set by the extension used in the file name.

You can transpose an image left-right with

 mirror = myimage.transpose(pil.FLIP_LEFT_RIGHT)

or do both in one step with

 mirror = myimage.transpose(Image.FLIP_LEFT_RIGHT).transpose(pil.ROTATE_90)

Processing is not limited to "PNG" files, though that file type is preferred because it is not a lossy storage option. Python reads and writes "jpg" files too. While PIL provides some essential functionality, for more practical uses in astronomy we need to read Flexible Image Transport or "FITS" files, and to enable numerical work on images in SciPy. Many of the processing functions you will find in Python Imaging Library (PIL) are also available in SciPy where we have precise mathematical control over their definitions and operation. Some more advanced techiques are available in SciPy too, courtesy of researchers who have contributed to SciKit as we will see.

Python's core routines dependent on matplotlib may be used to display an image, but these are designed for graphics, and limited by the constraints of the matplotlib interface. With a little effort there are better choices.


SciKit Image

We've mentioned that SciKits is a searchable index of highly specialized tools that are built on SciPy and NumPy. Among them, scikit-image is for image processing in Python. It is oriented toward extracting physical information from images, and has routines for reading, writing, and modifying images that are powerful, and fast. Scikit-image is often compared to OpenCV, a collection of programs for computer vision that include live video. Both are actively maintained and in many ways complementary, but for physics and astronomy scikit-image is more powerful at this time.

The scikit-image package is part of Anaconda and Enthought Python, and that would be a recommended platform for Windows. However for Linux and Mac OSX,

 pip install scikit-image

should work. The package usually requires local compilation of the code when installed this way. Qt bindings are provided by PyQt5, PySide or by qtpy, depending on licensing and software requirements. PySide is less restrictive than PyQt and qtpy is an abstraction that draws on both.

 pip install PySide


While scipy has included an image reader and writer, as of April 2018 this function is deprecated in the base code and rather than use pillow, we can turn to scikit-image. The module to read and write image is skimage.io

 import skimage.io
 import numpy as np

and the command

 skimage.io.find_available_plugins()

will provide a dictionary of the libraries that may be used to read various file types. For example

imlibs = skimage.io.find_available_plugins()

and imlibs['pil'] will list the functions that the Python imaging library provides. The package tries the libraries in order until it finds one that works:

 myimage = skimage.io.imread(filename).astype( np.float32)

will read an image and return a numpy array which by default will be an RGB image if the file is a png file, for example. A greyscale image image be specified by including as_grey=True as an argument. A numpy image has a shape that for color has 3 values in each pixel

 print(myimage.shape)
 (498, 680, 3)

as an example for an RGB image, or

 (498,680)

for a gray scale image. Since numpy by default would store into a 64-bit float and matplotlib (the default display for skimage) requires 32-bit, we specify loading into a 32 bit array while planning ahead to seeing the result.


Images may be saved:

 skimage.io.imsave(filename, nparray)

and the file type is determined by file name extension.

Images may be displayed, but it takes two steps

 skimage.io.imshow(myimage)
 skimage.io.show()

when invoking the default matplotlib plugin. The display will look like one created by pyplot. There is a simpler, viewer module too, without pyplot toolbar.

 import skimage.viewer
 viewer = skimage.viewer.viewers.ImageViewer(myimage)
 viewer.show


We will use routines from the scikit-image package to identify stars in an image, adapting a program given by Eli Bressert in his book SciPy and NumPy to handle an image from one of our telescopes. The complete program is given in the Examples section below.

It begins with the requisite imports including the ones from SciKit

 import os
 import sys
 import argparse
 import matplotlib.pyplot as mpl
 import numpy as np
 import astropy.io.fits as pyfits
 import skimage.morphology as morph
 import skimage.exposure as skie

It opens an FITS file and loads only part of the image

 img =  pyfits.getdata(infits)[1000:3000, 1000:3000]

to insure a square sample area and to limit the size of the search region. This is an alternative to reading a FITS image and also getting the header.

With this data the program creates a new image using an mapping arc sinh that captures the full dynamic range effectively. It locates lower and upper bounds that should include only stars. The parameters would probably have to be refined to optimize the extraction of stars from background.

 limg = np.arcsinh(img)
 limg = limg / limg.max()
 low = np.percentile(limg, 0.25)
 high = np.percentile(limg, 99.5)
 opt_img  = skie.exposure.rescale_intensity(limg, in_range=(low,high))

With the useful range determined, we create a new image that is scaled between the lower and upper limits that will be used for displaying the star map. We search the arcsinh-stretched original image for local maxima and catalog those brighter than a threshold that is adjusted based on the image.

 lm = morph.is_local_maximum(limg)
 x1, y1 = np.where(lm.T == True)
 v = limg[(y1,x1)]
 lim = 0.7
 x2, y2 = x1[v > lim], y1[v > lim]

The list x2,y2 has the stars the were found. The rest is image display code to draw circles around the stars and create an image that shows where they are.


M34 100s log1000 rgb 20121104.png

When the program processes a 100 second image of the open cluster M34, shown above, it identifies the stars in the right panel below.

Test m34 stars.png

Images with NumPy and SciPy

SciPy can read jpg and png images directly, without using PIL. With SciPy images are stored in numpy arrays, and we have direct access to the data for uses other than visualization.

 import numpy as np
 import matplotlib.pyplot as plt
 import skimage.io.imread as imread
 import skimage.io.imsave as imsave
 #import skimage.io.imshow as imshow  is an alternative for display
 image_data = imread('test.jpg').astype(np.float32)
 print 'Size: ', image_data.size
 print 'Shape: ', image_data.shape

 scaled_image_data = image_data / 255.
 # Save the modified image if you want to
 # imsave('test_out.png', scaled_image_data)
 plt.imshow(scaled_image_data)
 plt.show()
 exit()

For our 512x512 color test image, this returns

 Size:  786432
 Shape:  (512, 512, 3)

because the image is 512x512 pixels and has 3 planes -- red, green, and blue. When SciPy reads a jpg or png image it will separate the colors for you. The "image" is a data cube. In the imread line we control the data type within the Python environment. Of course the initial data is typically 8 bits for color images from a cell phone camera, 16 bits for scientific images from a CCD, and perhaps 32 bits for processed images that require the additional dynamic range.

We can display images with matplotlib.pyplot using imshow() --

 imshow(X, cmap=None, norm=None, aspect=None, interpolation=None,
   alpha=None, vmin=None, vmax=None, origin=None, extent=None, **kwargs)

where "X" is the image array. If X is 3-dimensional, imshow will display a color image. Matplotlib has a tutorial on how to manage images. Here, we linearly scale the image data because for floating point imshow requires values between 0. and 1., and we know beforehand that the image is 8-bits with maximum values of 255.

Here's what a single test image displayed from this program looks like in Python.


Single image.png


We can slice the image data to see each color plane by creating new arrays indexed from the original one.

 import numpy as np
 from scipy.misc import imread, imsave
 import pylab as plt
 image_data = imread('test.jpg').astype(np.float32)
 scaled_image_data = image_data / 255.
 print 'Size: ', image_data.size
 print 'Shape: ', image_data.shape
  
 image_slice_red =  scaled_image_data[:,:,0]
 image_slice_green =  scaled_image_data[:,:,1]
 image_slice_blue =  scaled_image_data[:,:,2]
 print 'Size: ', image_slice_0.size
 print 'Shape: ', image_slice_0.shape
 plt.subplot(221)
 plt.imshow(image_slice_red, cmap=plt.cm.Reds_r)
 plt.subplot(222)
 plt.imshow(image_slice_green, cmap=plt.cm.Greens_r')
 plt.subplot(223)
 plt.imshow(image_slice_blue, cmap=plt.cm.Blues_r)  
 plt.subplot(224)
 plt.imshow(scaled_image_data)  
 plt.show()


For a colorful image you will see the differences between each slice --

Single image slice.png


This approach offers a template for displaying multidimensional computed or experimental data as an image created with Python. Consider this short program that creates and displays an image with Gaussian noise:

 # Import the packages you need
 import numpy as np
 import matplotlib.pyplot as plt
 from scipy.misc import imsave
 # Create the image data
 image_data = np.zeros(512*512, dtype=np.float32).reshape(512,512)
 random_data = np.random.randn(512,512)
 image_data = image_data + 100.*random_data

 print 'Size: ', image_data.size
 print 'Shape: ', image_data.shape 
 scaled_image_data = image_data / 255.
 # Save and display the image 
 imsave('noise.png', scaled_image_data)
 plt.imshow(scaled_image_data, cmap='gray')
 plt.show()
 exit()


Instead of random noise, you could create any functional data you wanted to display. For images with color, the NumPy array would have red, green, and blue planes.


Astronomical FITS Files

Astronomical image data are potentially complex and rich, for which quantitative structures have been developed to standardize lossless storage of the data along with the metadata that describe its origin and previous processing. While photographic images are often only 8 bits deep, mixed with red, green and blue in a single image, and compressed to reduce file size, astronomical images are 16 or 32 bits deep in a single color. Until recently, the file sizes needed for astronomy were unrivaled by commodity cameras, but in todays market of megapixel imaging on cell phones, the camera in your pocket also produces very large rich images. The difference in handling them is that for science we want to preserve the data without loss, quantitatively calibrate and measure the flux from the source and map that back to a specific angle in space, while for art or even for some academic uses, it is the beauty, color, and highlights shown in the image display that are important. Commodity images are saved usually in compressed formats such as JPG, or uncompressed TIFF or proprietary binary formats. For astronomy and other quantitative imaging work, the Flexible Image Transport System (or FITS) format is almost universal. It includes the image data, and a header describing the data. FITS files may also be tables of data, or a cube of images in sequence. The standards developed for creating these files are slowing evolving as the needs of big data in astronomy have grown. Programmers and scientists at NASA, the Space Telescope Science Institute, and the academic community at large are contributing to libraries that enable reading, processing, and saving FITS files in Python, as well as in C, and Fortran.

Once a FITS file has been read, the header its accessible as a Python dictionary of the data contents, and the image data are in a NumPy array. With Python using NumPy and SciPy you can read, extract information, modify, display, create and save image data.


Reading and Writing a FITS File in Python

There are many image display tools for astronomy, and perhaps the most widely used is ds9 which is available for Linux, MacOS, and Windows, as well as in source code. It is always useful to have a version on your computer when you are working with FITS images. A more versatile Java platform for astronomical image viewing that also does and processing' is now widely used for precision astronomical photometry where interactive analysis is needed. AstroImageJ is free and simple to install on most computers, and because it is also a powerful processor, for many purposes it is an all-in-one tool. If you are working with FITS images, this is highly recommended:

AstroImageJ website and program source

Astronomical Journal article describing Astroimagej


However, sometimes we want perform specialized work on image data, and to view it while processing it in Python. This routine demonstrates how to read a FITS file, inspect its header, and show the image on the computer display. It is dependent on the "PyFITS" library developed at Space Telescope Science Institute and incorporated into a larger package by the AstroPy Project. AstroPy's library is part of the Enthought and Canopy distributions of Python. If you are using a system version of python, you may need to install it with pip

 pip install astropy

should do it. The package have several dependences on python (it requires version 3.5 or higher at this time (April 2018), and on numpy and scipy. It is also frequently updated, and while stable may push the "cutting edge" of distributions from conservative operating system distributors like OpenSuse. That's a good reason, if you need this capability, to have a version of Python built with current sources, or to use a complete distribution such as Anaconda or Enthought.

A program to work with FITS files would begin by importing the packages that it needs


 import os
 import sys
 import argparse
 import numpy as np
 import astropy.io.fits as pyfits
 import matplotlib 
 import matplotlib.pyplot

Importing the FITS modules in this way makes the code backward compatible with the earlier versions of PyFITS. Also, since sometimes submodules are not loaded with the larger package, we explicitly ask for the io.fits components, and for pyplot.

 # Define a function for making a linear gray scale
 def lingray(x, a=None, b=None):
  """
  Auxiliary function that specifies the linear gray scale.
  a and b are the cutoffs : if not specified, min and max are used
  """
  if a == None:
    a = np.min(x)
  if b == None:
    b = np.max(x)
  return 255.0 * (x-float(a))/(b-a)
 # Define a function for making a logarithmic gray scale
 def loggray(x, a=None, b=None):
   """
   Auxiliary function that specifies the logarithmic gray scale.
   a and b are the cutoffs : if not specified, min and max are used
   """
   if a == None:
     a = np.min(x)
   if b == None:
     b = np.max(x)          
  linval = 10.0 + 990.0 * (x-float(a))/(b-a)
  return (np.log10(linval)-1.0)*0.5 * 255.0

These functions may be used to rescale the data for display. Scaling can be done with a colormap, or by modifying the data before displaying. Here, we modify the data and save it in a temporary buffer for display. In a more elaborate program with a full user interface, this scaling could be done interactively.

 # Provide information to the argparse routine if we need it
 parser= argparse.ArgumentParser(description = 'Display a fits image')

The argparse function offers more flexibility than the system routines, though here we only include this for future additions. A simpler method is to parse the command line itself using the system utilities:

 # Test for command line arguments
 if len(sys.argv) == 1:
   sys.exit("Usage: display_fits infile.fits ")
   exit()
 elif len(sys.argv) == 2:
   infits = sys.argv[1]
 else:
   sys.exit("Usage: display_fits infile.fits ")
   exit() 
 # Open the fits file and create an hdulist
 inhdulist = pyfits.open(infits) 
 # Assign the input header in case it is needed later
 inhdr = inhdulist[0].header
 # Assign image data to a numpy array
 image_data =  inhdulist[0].data

The header and data are now available. We'll look at header information later. For now, all we need are the values in the numpy data array. It will be indexed from [0,0] at the upper left of the data space, which would be the upper left of the displayed image.

 # Print information about the image
 print 'Size: ', image_data.size
 print 'Shape: ', image_data.shape

For this example we use linear scaling.

 # Show the image
 new_image_data = lingray(image_data)
 new_image_min = 0.
 new_image_max = np.max(new_image_data)
 matplotlib.pyplot.imshow(new_image_data, vmin = new_image_min, vmax = new_image_max, cmap ='gray')  
 matplotlib.pyplot.show()  
 # Close the input image file and exit
 inhdulist.close()
 exit ()

It would be a straightforward exercise to add an interactive scale control and to read out the value of each pixel in this program. With that, it has the basic functionality of ds9 or the AstroImageJ viewer.


Correcting and Combining Images

There are processing operations done on all "raw" astronomical images taken with charge coupled device cameras. Since the voltage that is digitized is proprotional to the number of photons that arrived at each pixel during the exposure, the data for that pixel should be proportional to the photon count, that is to the irradiance of photons/area-time times the area of the pixel times the exposure time. An absolute conversion to the flux from the sources that are imaged requires correcting for

  • Signal at each pixel with no light present -- the "dark" image
  • Signal at each pixel for the same irradiance/pixel -- the "flat" field
  • Non-linear responses
  • Absolute calibration to an energy or photon flux based on spectral response

We can do all of these in NumPy using its built-in array math. By taking an image with no light and subtracting it from an image, we have corrected for the dark response (as well as for electronic "bias"). Or, if needed, we can scale a dark image taken at a different exposure time from the image we are measuring, and then subtract that. We can divide an image by a reference flat to correct for pixel-to-pixel variations in response, for vignetting in an optical system (the non-linear fall-off of transmission across the field of view). We can scale images non-linearly to correct for non-linear amplifier responses and saturation or charge loss from each pixel.

For dark subtraction we take the difference of two images:

dark_corrected_image = raw_image - reference_dark_image

and for flat field correction we divide

final_image = dark_corrected_image / reference_flat_image

In NumPy there is no array indexing needed, and the operations are one-liners. Similarly, a non-linear second order correction or a scaling to physical units may be done on the entire array with

corrected_image = a * (final_image) + b * (final_image**2) 

The reference dark and flat images must be obtained beforehand. For example, a reference dark image may be a median average of many images taken with the same exposure time as the science image, but with the shutter closed. To perform the median operation on the arrays rather than sequentially on the elements, we stack all of the original individual dark images to make a 3-d stack of 2-d arrays. Using numpy arrays we would have

dark_stack = np.array([dark_1, dark_2, dark_3])

where dark_1, dark_2, and dark_3 are the original dark images. We need at least 3, or any odd number in the dark stack. If the images are m rows of n columns, and if we have k images in the stack, the stack will have a shape (k,n,m): k images, each of n rows, each of m columns. A median on the first axis of this stack returns the median value for each pixel in the stack --

dark_median = np.median(dark_stack, axis=0)

and has a shape that is (n,m) with the elements that we wanted.

Median operations on a image stack remove random noise more effectively than averaging because one source of noise in CCD images is cosmic ray events that produce an occasional large signal at a pixel. If we mean-averaged with an outlier in one pixel the result would be too large because of the one singular event in the stack, but by median-averaging we discard that event without adversely affecting the new reference frame.

A median of a stack of flat frames, all normalized so that they should be identical, will remove stars from the reference image as long as each contributing flat image in the stack is taken of a different star field. Typically, these "sky flats" are images taken at twilight, processed to remove the dark signal, normalized to unity, and then median averaged to remove stars and reduce random noise.

Fortunately, normalizing an image is very simple because

 image_mean = np.mean(image_data)

returns the mean value of all elements in the array. Divide an image by its mean to create a normalized image with unity mean

 normalized_image = image_data / image_mean


As long as the images have the same size you can sum them by simple addition

sum_image = image1 + image2 + image3 ...

We would do this to create a final image that is effectively one long exposure, the sum of all the contributing image exposure times. Because of guiding errors, cosmic rays, and weather, one very long exposure is often not possible, but 10's or 100's of shorter exposures can be "co-added" after selecting the best ones and aligning them so that each pixel in the contributing image corresponds to the same place in the sky.


Masked Image Operations

A Numpy array mask is a boolean array that determines whether or not an operation is to be performed. If you have an image in a array, the mask allows you to work on only part of the image, ignoring the other part. This is useful for finding the mean of a selected region, or for computing a function that fits part of an image but ignores another part. For example, consider an array

x = np.arange(4*4).reshape(4,4)

which is a simple 4x4 image with values from 0 to 15:

print x
array([[ 0,  1,  2,  3],
      [ 4,  5,  6,  7],
      [ 8,  9, 10, 11],
      [12, 13, 14, 15]])

Make a copy of ths image which is a boolean mask

xmask = np.ma.make_mask(x,copy=True,shrink=True,dtype=np.bool)

and it will have all True values except for the first entry, 0, which is interpreted False:

print xmask
array([[False,  True,  True,  True],
      [ True,  True,  True,  True],
      [ True,  True,  True,  True],
      [ True,  True,  True,  True]], dtype=bool)

You can set all values (or individual ones) to the state you need

xmask[:,:] = True

will make them all True, or

xmask[0,:] = False
xmask[3,:] = False
xmask[:,0] = False
xmask[:,3] = False

will set the values around the perimeter False and leave the others True

print xmask
array([[False, False, False, False],
      [False,  True,  True, False],
      [False,  True,  True, False],
      [False, False, False, False]], dtype=bool)


We apply the mask to the original data

mx = np.ma.masked_array(x, mask=xmask)
print mx

[[0 1 2 3]
[4 -- -- 7]
[8 -- -- 11]
[12 13 14 15]]

and see that the values that are masked by "True" are not included in the new masked array. If we want the sum of the elements in the masked array, then

np.sum(mx)
90

while

np.sum(x)
120

There are many ways to create a mask, and to operate on masked arrays, described in the Numpy documentation.


FITS Headers

FITS files contain a "header" that tells us what is in the file, and then data in a format that is defined by the header. As we have seen, when a FITS file is read in NumPy with PyFITS, the data and the header are separate entities. Here's the complete header from an image taken with one of our telescopes:


The first entries tell us it is a simple image file, 4096x4096 pixels (16 megapixels) written with 16 integer data bits per pixel. The other entries provide information about the image data. Therefore in dealing with FITS data we may need to change the first entries if the file is modified, and append new entries that annotate what has been done, or add to the archival notes.

We will open an image v1701_00146_i.fits in interactive Python and look at the header.

 import numpy as np
 import astropy.io.fits as pyfits
 infits='v1701_00146_i.fits'
 inhdulist = pyfits.open(infits)
 inhdr = inhdulist[0].header
 inhdr


 SIMPLE  =                    T / file does conform to FITS standard
 BITPIX  =                   16 / number of bits per data pixel
 NAXIS   =                    2 / number of data axes
 NAXIS1  =                 4096 / length of data axis 1
 NAXIS2  =                 4096 / length of data axis 2
 EXTEND  =                    T / FITS dataset may contain extensions
 COMMENT   FITS (Flexible Image Transport System) format  is defined in   'Astronomy
 COMMENT   and Astrophysics', volume 376, page 359;  bibcode: 2001A&A...376..359H
 BZERO   =                32768 / offset data range to that of unsigned short
 BSCALE  =                    1 / default scaling factor
 EXPTIME =                 100. / exposure time (seconds)
 DATE-OBS= '2013-01-19T04:18:09.140' / date of observation (UT)
 IMAGETYP= 'Light Frame'        / image type
 TARGET  = 'V1701   '           / target
 INSTRUME= 'griz    '           / instrument
 CCD-TEMP=               -9.921 / temperature (C)
 FILTER  = ' (3) i (700-825)'   / filter
 TELESCOP= 'CDK20N  '           / telescope
 DATE    = '2013-01-19T04:20:08' / file creation date (YYYY-MM-DDThh:mm:ss UT)

The first entries tell us it is a simple image file, 4096x4096 pixels (16 megapixels) written with 16 integer data bits per pixel. The other entries provide information about the image data. Therefore in dealing with FITS data we may need to change the first entries if the file is modified, and append new entries that annotate what has been done, or add to the archival notes.

The header in pyfits is a Python dictionary. If you want to know the exposure time, ask

 inhdr['EXPTIME']

and Python responds

 100.

or

 inhdr['DATE-OBS']

gets

 '2013-01-19T04:18:09.140'

This means that you can sort through the contents of headers in a Python program to find the exposures you need, identify the filters used, and see what processing has been done. Most of the KEYWORDS shown above are standard, and those that are not can be easily added to specialized Python code.

When a new FITS image is written with pyfits it contains only the bare necessities in the header -- the data type, some reference values for zero and scaling if needed, the size of the array. However you can copy items from the header of an image into images you create, and annotate the headers of your work to maintain a record of what has been done.

An output FITS image file is created in steps from a NumPy data array outimage with

 outhdu = pyfits.PrimaryHDU(outimage)

which encapsulates an image in an "HDU" object. The line

 outhdulist = pyfits.HDUList([outhdu])

creates a list that contains the primary HDU which will have a default header

 outhdr = outhdulist[0].header

Now you can append to this header or modify it

 history = 'This is what I did to the file.'
 outhdr.append(('HISTORY',history))
 more_history = 'I did this today.'
 outhdr.append(('HISTORY',more_history))

since it is only a standard Python dictionary.


For astronomical images headers may also include information that maps the image to the sky, a World Coordinate System or WCS header within the FITS header. These keywords apply only if the spatial mapping of the image is unchanged, and in processing that shifts or distorts images the WCS header should not be copied. Setting the WCS header and interpreting it to obtain the celestial coordinates requires -- what else -- PyWCS also developed at the Space Telescope Institute and now included in AstroPy.

For example, to access the world coordinate system in a fits file, we would import the module

 from astropy.wcs import WCS

to have the same namespace as the original PyWCS and library functions for conversion to and from celestial coordinates and pixel coordinates in the image. There are many examples of this in a package of utilities we have developed here

 Alsvid Algorithms for Visualization and Processing of Image Data


Other Processing

We have seen that there are many useful basic operations for image processing available simply through NumPy and PyFITS. SciPy adds several others in the ndimage package. The functions include image convolution, various averaging or filtering algorithms, Fourier processing, image interpolation, and image rotation.


Since images are stored as arrays, there are some simple one-line ways to modify them. Flipping an image top to bottom or left to right is done with

import numpy as np
np.flipud(image_data)
np.fliplr(image_data)

While we also have to add lines to read the file, update the header, and write it out again, the program to preform these operations is remarkably short. A template program is given in the examples below.

Rotating it by 90 degrees is also easy

np.rot90(image_data)

Rotated and flipped images may be saved either as FITS image (using PyFITS) or as png or jpg images using

from scipy.misc import imsave
imsave(filename,rotated_image_data)

Image rotation by other angles is somewhat more complex since it requires tranforming an image in a way that does not preserve its shape, or even the number of elements.

There is a simple one-line command that does all this for you

 from scipy.misc import imrotate
 angle = 22.5
 new_image = imrotate(image_data, angle, interp='bilinear')

The angle is in degrees. The interp string determines how the interpolation will be done, with self-explanatory options

  • 'nearest'
  • 'bilinear'
  • 'cubic'
  • 'bicubic'

Finally, for spectroscopy the useful data in an image may be a sum over a region. In that case we could index through the array and explicitly create the sum if needed, but in the ideal case of a spectrum we may want only the sum along a column for each element of a row. For that, the function is

spectrum = np.sum(image, axis)

which returns a numpy array that is the sum along the specified axis. If there are regions in the image that should not be included in the sum, then the image could be masked before computing the sum. The result is a 1-d array in which each element is the signal at a different wavelength.

Sometimes the spectral lines are not along a column but are still straight (if curved, then we need other ways to mask the array), in which case scipy.misc.imrotate can be used before the sum to align the spectrum with a row or column.



AstroImageJ and Alsvid

We have developed a collection of Python routines to do many of the routine astronomical image processing tasks such as dark subtraction, flat fielding, co-addition, and FITS header management through PyFITS and PyWCS. The current version of the "Alsvid" package is available for download:


Alsvid


Examples

For examples of Python illustrating image processing, see the examples section.



Alsvid is intended as a command line supplement to the powerful Java program AstroImageJ which provides real-time interactivity with astronomical image processing and precision photometry. AstroImageJ is built on the orirignal ImageJ, an image processing program developed at the National Institutes of Health and now maintained as public domain opensource resource. As such, this core component of AIJ offers many specialized tools for image analysis in the biological sciences which are equally useful in Physics and Astronomy. AstroImageJ along side versatile Python desktop processing is a powerful combination for astronomical image analysis.


Assignments

For the assigned homework to use these ideas, see the assignments section.