Spring 2016: PHY599 – Python for Scientists

Some useful links:

Timeline and scripts from the lecture:

Please note that lecture notes are no longer available here; example scripts, however, are available on this site. For lecture notes, please refer to https://github.com/mommermi/Introduction-to-Python-for-Scientists

  • Jan 22: some basics; data types, strings, lists
    Exercise: make Python work on your computer!
  • Jan 29: more basics; more lists, dictionaries, sets
    Exercise: Write a program that reads this data file with Flagstaff climate data and calculates the average temperature of every month as well as the amount of rain and snow that falls on each day of the year. Store your results in dictionaries temp_c, rain, and snow, such that temp_c['june'] gives the average temperature in June in degrees Celsius, and rain['jan 29'] and snow['jan 29'] give the amount of rain and snow that fell on that day. Note that the precipitation numbers in the data file are cumulative.
  • Feb 5: no class
  • Feb 12: advanced concepts: control flow, file i/o, modules, exceptions, classes
    Exercise: Write three different functions, each of which returns the first n numbers of the Fibonacci Sequence as a list. One of the functions has to use a for-loop, one has to use a while-loop, and the last one has to use a recursive approach (a function that calls itself over and over again).
  • Feb 19: modules, standard libraries: math, os, datetime, urllib2, subprocess
    Exercise: Write a script that grabs the NAU weather website every 5 minutes and extracts the current temperature, dew point, humidity, wind speed, and wind direction (hint: identify the corresponding HTML source code lines and extract the numbers). Write the data into a file (another hint: open the file in append mode so that you can interrupt and restart your running code without losing data).
  • Feb 26: numpy: basics, arrays
    Exercise: Calculate Euler’s number e from the series of factorials (see here) using two different approaches: (1) not using any numpy functionality at all (i.e., using a simple for-loop and lists), (2) using only numpy functionality (hint: you will have to write your own factorial function; do not use the scipy one). Evaluate the series to its n-th term for n=1e2, 1e3, 1e4 and determine for each approach its runtime and the residual relative to numpy.e. Try to make the numpy approach as fast as possible!
  • Mar 4: more numpy: masked arrays, structured arrays, file i/o
    Exercise: Use numpy functions to read MPCORB.DAT, the list of all known asteroids in the Solar System, into a structured array (here is some documentation on the structure of the file; be warned: this is a big file). Using as little code as possible, derive the following:
    1. How many asteroids have q<1.3 au and can be considered near-Earth asteroids (NEOs)? (q is the perihelion distance, defined as q=a(1-e), with semi-major axis a and eccentricity e.) How many are trans-Neptunian objects (TNOs, q>30 au)?
    2. How many asteroids have accurately known orbits (uncertainty parameter U=0)? How many of these are NEOs or TNOs?
    (Hint: only read in those columns from the file that you really need using the usecols option in genfromtxt.)
  • Mar 11: even more numpy: random, statistics, linalg, lambda function
    Exercise: Illustrate the central limit theorem by showing that for large values of lambda the Poisson distribution approaches the normal distribution. Use numpy functions to show the convergence, e.g., using histograms.
  • Mar 18: no class (spring break)
  • Mar 25: matplotlib; Example: scatter plotting image and reference catalogs (script, image catalog, reference catalog)
    Exercise: Implement the matching between both catalogs based on RA and Dec and try to derive the magnitude zeropoint of the image.
  • Apr 1: Example continued: match two catalogs and derive the magnitude zeropoint of an image (script, image catalog, reference catalog)
    Exercise: Catalog matching with for-loops will be slow for large amounts of data. Design a matching routine using numpy’s array functions (hint: you can calculate the distance of one object from the first catalog to all objects in the second catalog at one time) and then try to implement a matching routine that utilizes a kd-tree query. What are the runtimes for the individual routines? What are the runtimes if you match two catalogs with coordinates created from random numbers, where each catalog has 1e3, 1e4, 1e5, or 1e6 sources?
  • Apr 8: scipy.optimize; Example: fitting a model to data (script, data)
    Exercise: Imagine a binary asteroid (primary + secondary body), each of which rotates, and the secondary also orbits the primary. Derive the rotation periods of both asteroids from the data (cumulative flux from both bodies as a function of time) under the assumption that the secondary is always visible. Use the same method we used in class. Then try to implement a Lomb-Scargle Periodogram using scipy.signal.lombscargle. Do you get the same result?
  • Apr 15: scipy.interpolate; Example: interpolating 1d and 2d data (script)
  • Apr 22: jupyter, threading (threading template, example code)
  • Apr 29: no class
  • May 6: no class
  • May 13: no class
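
For the Feb 12 exercise, the three Fibonacci variants could be sketched roughly as follows. This is only one possible approach: it assumes the sequence starts with 0 and 1 (the exercise does not say), and the function names fib_for, fib_while, and fib_recursive are made up for illustration.

```python
def fib_for(n):
    """Return the first n Fibonacci numbers using a for-loop."""
    seq = []
    a, b = 0, 1  # assumption: the sequence starts 0, 1
    for _ in range(n):
        seq.append(a)
        a, b = b, a + b
    return seq


def fib_while(n):
    """Return the first n Fibonacci numbers using a while-loop."""
    seq = []
    a, b = 0, 1
    while len(seq) < n:
        seq.append(a)
        a, b = b, a + b
    return seq


def fib_recursive(n):
    """Return the first n Fibonacci numbers; the function calls itself."""
    if n <= 0:
        return []
    if n == 1:
        return [0]
    if n == 2:
        return [0, 1]
    seq = fib_recursive(n - 1)         # first n-1 numbers
    seq.append(seq[-1] + seq[-2])      # add the n-th number
    return seq


print(fib_for(10))
```

All three functions return the same list; fib_for(10) gives [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]. The recursive version builds the list once per call level, so it is limited by Python's recursion depth for large n.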
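
For the Feb 26 exercise, a minimal sketch of the two approaches might look like the code below. It assumes that "n terms" means k = 0 ... n-1 in the series e = sum over k of 1/k!. Instead of an explicit factorial function, the numpy version swaps in a cumulative product of reciprocals, since k! itself exceeds the floating-point range beyond k = 170. The names e_loop and e_numpy are hypothetical.

```python
import time

import numpy as np


def e_loop(n):
    """Sum the first n terms of sum_k 1/k! with a for-loop and a list."""
    terms = [1.0]                      # k = 0 term: 1/0! = 1
    for k in range(1, n):
        terms.append(terms[-1] / k)    # 1/k! = (1/(k-1)!) / k
    return sum(terms)


def e_numpy(n):
    """Same sum using numpy only (no explicit Python loop)."""
    # cumulative product of 1/1, 1/2, ..., 1/(n-1) yields 1/1!, 1/2!, ...
    inv_factorials = np.cumprod(1.0 / np.arange(1, n))
    return 1.0 + inv_factorials.sum()  # leading 1.0 is the k = 0 term


for n in (100, 1000, 10000):
    for func in (e_loop, e_numpy):
        start = time.perf_counter()
        value = func(n)
        runtime = time.perf_counter() - start
        print(f"{func.__name__:8s} n={n:6d} "
              f"residual={value - np.e:+.2e} runtime={runtime:.2e} s")
```

A single call is a noisy runtime measurement; repeating each call many times (e.g., with the timeit module) would give more reliable numbers.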
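
For the Mar 11 exercise, the convergence of the Poisson distribution toward the normal distribution can be checked numerically along these lines. This is a sketch, not a proof: it standardizes Poisson samples to z = (k - lambda)/sqrt(lambda), histograms them, and compares the histogram with the standard normal density. The bin layout, sample size, seed, and the function name max_histogram_deviation are arbitrary choices made for illustration.

```python
import numpy as np


def normal_pdf(z):
    """Standard normal probability density."""
    return np.exp(-0.5 * z**2) / np.sqrt(2.0 * np.pi)


def max_histogram_deviation(lam, size=100_000, seed=0):
    """Largest absolute difference between the histogram of standardized
    Poisson samples and the standard normal density at the bin centers."""
    rng = np.random.default_rng(seed)
    samples = rng.poisson(lam, size)
    z = (samples - lam) / np.sqrt(lam)       # mean 0, standard deviation 1
    edges = np.linspace(-4.0, 4.0, 17)       # 16 bins of width 0.5
    density, _ = np.histogram(z, bins=edges, density=True)
    centers = 0.5 * (edges[:-1] + edges[1:])
    return np.max(np.abs(density - normal_pdf(centers)))


for lam in (1, 10, 100, 1000):
    print(f"lambda={lam:5d}  max deviation={max_histogram_deviation(lam):.3f}")
```

The printed deviation should shrink as lambda grows; plotting density against normal_pdf(centers) with matplotlib shows the same trend visually.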
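
For the Apr 1 exercise, the numpy part of the catalog matching could be sketched as below, using broadcasting to compute all pairwise distances at once. The sketch makes several simplifying assumptions: coordinates are in degrees, separations are computed in a flat-sky approximation with a cos(Dec) correction, RA wrap-around at 0/360 degrees is ignored, and the full N1 x N2 distance matrix fits into memory (it will not for the largest catalog sizes, which is where a kd-tree becomes attractive). The function name match_catalogs and the tolerance parameter tol_arcsec are hypothetical.

```python
import numpy as np


def match_catalogs(ra1, dec1, ra2, dec2, tol_arcsec=1.0):
    """Match each source in catalog 1 to its nearest neighbor in catalog 2.

    Returns two index arrays (idx1, idx2) so that source idx1[i] in
    catalog 1 is matched to source idx2[i] in catalog 2.
    """
    ra1, dec1, ra2, dec2 = (np.asarray(x, dtype=float)
                            for x in (ra1, dec1, ra2, dec2))

    # Broadcasting: (N1, 1) against (1, N2) gives an (N1, N2) matrix.
    ddec = dec1[:, None] - dec2[None, :]
    dra = (ra1[:, None] - ra2[None, :]) * np.cos(np.radians(dec1))[:, None]
    dist = np.hypot(dra, ddec)               # separations in degrees

    nearest = np.argmin(dist, axis=1)        # closest catalog-2 source
    nearest_dist = dist[np.arange(len(ra1)), nearest]
    matched = nearest_dist <= tol_arcsec / 3600.0

    return np.nonzero(matched)[0], nearest[matched]


# Made-up coordinates, for illustration only.
idx1, idx2 = match_catalogs([10.0, 20.0, 30.0], [0.0, 5.0, -5.0],
                            [20.0001, 10.0001, 50.0], [5.0, 0.0, 0.0])
print(idx1, idx2)
```

In this made-up example, the first two sources of catalog 1 find counterparts within 1 arcsecond, and the third remains unmatched.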