A plethora of asteroids in TESS data

TESS is the Transiting Exoplanet Survey Satellite, launched by NASA in 2018 to identify the faint brightness variations in stars that are characteristic of planets orbiting and transiting in front of these stars.

https://upload.wikimedia.org/wikipedia/commons/c/c2/Transiting_Exoplanet_Survey_Satellite_artist_concept_%28transparent_background%29.png
A rendering of the TESS spacecraft. The instrument consists of four imaging cameras, each covering 24° x 24° of the sky.

Over the course of its nominal two-year mission, TESS will monitor more than 200,000 stars with the photometric accuracy necessary to identify exoplanet transits. In order to observe this huge number of stars, TESS points at different “Sectors” in the sky, covering a total of 85% of the entire sky. Each Sector, spanning 24° x 96°, is observed over a period of about one month.

TESS data are publicly available at MAST – including calibrated Full Frame Images, each stacked over a period of 30 min. Given the huge field of view of the TESS cameras, there should be an enormous number of asteroids in each frame at any given time. So I went on a data mining fishing expedition…

Extracting asteroids from TESS data

The idea is conceptually simple:

  • download and pre-process the TESS Full Frame Images for further analysis;
  • perform a background subtraction to remove stars and other fixed sources and improve the detection of moving objects (see the sketch after this list);
  • predict the locations of all ~800,000 known asteroids for the time of the observations, so that they can be identified in the frames;
  • perform photometry to measure their brightness.
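
To make the background-subtraction step concrete, here is a minimal sketch using numpy; the array frames and its dimensions are placeholders standing in for the stacked Full Frame Image cutouts, not the actual pipeline data structures:

import numpy as np

# placeholder for calibrated Full Frame Image cutouts from one camera,
# stacked along the time axis: shape (n_epochs, ny, nx)
rng = np.random.default_rng(42)
frames = rng.normal(loc=100.0, scale=10.0, size=(48, 256, 256))

# median-combine along the time axis to build a model of the static sky
background = np.nanmedian(frames, axis=0)

# subtracting the static sky removes stars and other fixed sources;
# moving objects such as asteroids remain in the residual frames
residuals = frames - background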

Putting all these parts together, we can measure the brightness variations of a large number of asteroids over a baseline of up to nearly one month. This is a unique data set: it enables the accurate measurement of rotational periods over a continuous stretch of time that is not accessible from the ground, since ground-based observations are restricted to night-time, whereas TESS observes 24/7.

This project is mainly a computational problem: processing the data from one of the four cameras for a single Sector takes about one day on a desktop machine. Multithreading has been implemented where possible to expedite the processing.

First results

A first cool result is shown in this short video clip:

This video shows the rotational lightcurve of asteroid 1693 Hertzsprung. The left plot shows a thumbnail image centered on the asteroid and the right plot shows the brightness of the target. As time progresses, the variation in brightness, which is caused by the asteroid’s irregular shape and rotation, becomes obvious. This specific asteroid has a rotational period of about 9 hrs. Gaps appear in the lightcurve when the asteroid comes too close to areas affected by image artifacts, which are highlighted with red overlays in the left plot.

I will add more results and visualizations in the near future.

Current status

Right now, we are working on the publication of a pilot study for this project, as well as a funding proposal to apply our data mining technique to the entire TESS data set.

Stay tuned!

 


Identifying Stars and Galaxies from Sloan Digital Sky Survey Photometric Data

This is a little toy project that started out of curiosity about what simple machine learning models are able to accomplish, and ended up being my contribution to this year’s Flagstaff Astronomy Symposium. I hope this project will serve as a motivation for astronomers to consider machine learning approaches in their work.

Motivation

The idea behind this project is that object classification is a common problem in astronomy. Are all these objects stars? Is this a spiral galaxy? Is that an elliptical galaxy? This sounds like a typical classification problem, so it should be easy to implement a machine learning model that is able to distinguish between these object types based on the imaging data, using methods similar to those used in face recognition.

Abell_2537
Galaxy cluster Abell 2537 as observed with the Hubble Space Telescope and showing a wide range of galaxy types (NASA/ESA/HST).

Given the data volume of large-scale imaging surveys, I was more interested in investigating if, instead of running on the actual imaging data, a model could be trained and run on reduced photometric data. This approach would significantly reduce the data volume as well as the computational complexity of the problem.

Data

Photometric data obtained with a range of different methods are available from surveys such as the Sloan Digital Sky Survey (SDSS). Luckily, the citizen science project Galaxy Zoo extracted several tens of thousands of galaxy candidates from SDSS images that were then classified by eye by humans. Using the classification data from Galaxy Zoo and photometric properties as measured by SDSS, a training set including 62,000 elliptical galaxies, 190,000 spiral galaxies, and 183,000 stars could be assembled for this project.

sdss_features
Example of feature engineering: the left-hand plot shows the quotient of the Petrosian effective radii containing 50% and 90% of the flux for the three different target types, whereas the right-hand plot shows the difference between the two properties. While the quotient nicely separates the three object types across a range of values, the difference only allows distinguishing stars from galaxies in general. The quotient is part of the final feature set used in the modeling.

By plotting similar features against each other for the three object types and engineering additional features, I was able to reduce the total number of features to be used in the model to only five.

Modeling

A simple Decision Tree classifier provides a level of accuracy very similar to more complex models while performing much faster, so I decided to go with it as my model of choice.
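
As a minimal sketch of this modeling step using scikit-learn – the training-set file name and most of the feature column names below are placeholders; only the Petrosian radius ratio corresponds to the engineered feature discussed above:

import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import f1_score, confusion_matrix

# hypothetical training set assembled from SDSS photometry and Galaxy Zoo labels
df = pd.read_csv('sdss_galaxyzoo_training.csv')

# engineered feature: ratio of the Petrosian radii containing 50% and 90%
# of the r-band flux
df['petro_ratio'] = df['petroR50_r'] / df['petroR90_r']

# placeholder feature set (five features in total)
features = ['petro_ratio', 'psfMag_r', 'modelMag_r', 'u_g', 'g_r']

X_train, X_test, y_train, y_test = train_test_split(
    df[features], df['class'], test_size=0.3, random_state=42)

clf = DecisionTreeClassifier(max_depth=10)
clf.fit(X_train, y_train)

pred = clf.predict(X_test)
print(f1_score(y_test, pred, average='weighted'))
print(confusion_matrix(y_test, pred))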

sdss_confusionmatrix
Best-fit model confusion matrix.

The best-fit f1-score for this model is 96.4%, which sounds pretty good. A look at the confusion matrix reveals that stars are identified with 100% confidence, but elliptical galaxies only with 90% confidence. The possible confusion between the two galaxy types is less than 10%, which is still a good result given the simplicity of the model.

Results

I test the trained model against SDSS data of galaxy cluster Abell 1631 and predict target types for each source with reliable photometric properties.

sdss_stars
Objects classified as stars from the Abell 1631 SDSS field using the model described here.

As expected, stars are identified without any ambiguity. The example stars shown look almost identical and show a steep brightness gradient as expected for point sources.

sdss_ellipticals
Objects classified as elliptical galaxies from the Abell 1631 SDSS field using the model.

Objects that were identified as elliptical galaxies show a significant degree of fuzziness compared to stars, and they lack the steep brightness gradient, as indicated by the shallower scaling of the images. The fuzzy and structure-less appearance is typical for elliptical galaxies.

sdss_spirals
Objects classified as spiral galaxies from the Abell 1631 SDSS field using the model.

Finally, objects that the model identifies as spiral galaxies usually show more structure and often a clearly linear orientation, which the elliptical galaxies are lacking. However, there are some cases that could just as well be classified as elliptical galaxies, as suggested by the confusion matrix.

Conclusions

Given the simplicity of the model, I think that it performs very well. It is definitely possible to distinguish between stars and galaxies, and even between some galaxy types, based on photometric properties alone, which is quite remarkable. Additional tweaking and a more elaborate modeling approach could probably improve the model’s accuracy and reduce the fraction of false positives.

Code, Discussion, and Acknowledgments

The code used in this project is available on github, together with a more in-depth discussion of the project and its flaws.

This project uses data from SDSS and Galaxy Zoo.

Travis-CI builds won’t start

I ran into trouble using Travis-CI and GitHub that I would like to spare you: after modifying the .travis.yml of my GitHub repository and pushing the changes to an already submitted pull request, the Travis-CI build checks just wouldn’t start. It’s not that the checks failed – they simply would not start.

The problem was that this made debugging pretty much impossible, as there was no way of retrieving information on what went wrong. Too many hours and about 100 commits to the pull request later, I think I figured out what went wrong:

.travis.yml does not like tabs. Using emacs as my Python IDE, I am used to hitting tab a lot, and emacs is smart enough to replace tabs with an equivalent number of spaces. However, that did not happen in the case of .travis.yml – tabs stayed as tabs.

So please watch out and keep in mind that tabs will cause issues here…
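
If you want to guard against this, a few lines of Python (or a simple grep) can flag literal tab characters before you push:

# flag literal tab characters in .travis.yml before committing
with open('.travis.yml') as f:
    for number, line in enumerate(f, 1):
        if '\t' in line:
            print('tab character in .travis.yml, line', number)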

How far is Ikea? A data science toy project.

And now for something completely different…

This toy project is the result of a harmless question that came up during a Sunday afternoon walk in the park:

Living in Flagstaff, Arizona, the closest Ikea location is in Tempe, which is a two-hour drive. How does this compare to the US population in general? Do the majority of Americans have a longer drive to get to the closest Ikea? Or a shorter one?

Answering these questions is not too hard if you have the right data and tools at hand. Using Python, open data, and open-source code, I pursued the following approach:

  1. Retrieve a list of all Ikea locations in the US.
  2. Build a grid across the (continental) US that allows me to…
  3. calculate the distance from every cell in this grid to the closest Ikea location (see the sketch after this list).
  4. Weight the resulting geographic distance distribution with the local population density.
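
Here is a minimal sketch of steps 2 and 3, using numpy and a haversine great-circle distance; the Ikea coordinates and the grid resolution are placeholders, not the actual data used in the analysis:

import numpy as np

def haversine(lat1, lon1, lat2, lon2):
    """great-circle distance in km between points given in degrees"""
    lat1, lon1, lat2, lon2 = map(np.radians, (lat1, lon1, lat2, lon2))
    a = (np.sin((lat2 - lat1) / 2)**2 +
         np.cos(lat1) * np.cos(lat2) * np.sin((lon2 - lon1) / 2)**2)
    return 2 * 6371 * np.arcsin(np.sqrt(a))

# placeholder Ikea locations (latitude, longitude in degrees)
ikea = np.array([[33.43, -111.97],   # Tempe, AZ
                 [34.07, -117.69]])  # somewhere in southern California

# coarse latitude/longitude grid across the continental US
lats, lons = np.meshgrid(np.arange(25, 50, 0.5), np.arange(-125, -66, 0.5))

# distance from every grid cell to its closest Ikea location
dist = np.min([haversine(lats, lons, la, lo) for la, lo in ikea], axis=0)

# Flagstaff to the Tempe location: roughly 200 km
print(haversine(35.20, -111.65, *ikea[0]))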

Let’s have a look at some results:

  • The following map shows the distance to the closest Ikea location for any location in the continental US. The black points indicate Ikea locations, the green star indicates the location of Flagstaff. Flagstaff is about 200 km from the Ikea in Tempe. There certainly are areas that are farther from the closest Ikea, especially in New Mexico and Montana. Some areas of Montana share the longest distance to any Ikea location in the US: 1050 km. Ikea_distances
  • There are obvious concentrations of Ikea locations on the East coast and West coast. Looking at the (logarithmic) population density distribution (Ikea_popdist), it is clear that these match highly populated areas. Also note how the other Ikea locations clearly correlate with densely populated areas.
  • By combining the information in the two plots above, we can build a cumulative distribution that shows how far Americans live from their closest Ikeas, weighted by population density: Ikea_cumpopdist
    Keep in mind that the curve shown in this plot is cumulative: for a given distance on the x-axis, it shows what fraction of Americans live within that distance of the closest Ikea location (a sketch of this calculation follows below). Flagstaff and the population median are shown in this plot for reference.
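
Step 4 then reduces to sorting the grid cells by distance and accumulating their population. A minimal sketch, assuming the dist array from the sketch above and a hypothetical pop array holding the population per grid cell:

import numpy as np

# hypothetical population counts per grid cell (same shape as `dist`)
pop = np.random.default_rng(1).uniform(0, 1e4, size=dist.shape)

# sort grid cells by distance to the closest Ikea and accumulate population
order = np.argsort(dist.ravel())
cumfrac = np.cumsum(pop.ravel()[order]) / pop.sum()

# population-weighted median distance to the closest Ikea location
median_distance = dist.ravel()[order][np.searchsorted(cumfrac, 0.5)]
print(median_distance)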

Results

The previous plot enables us to address the questions motivating this project:

  • The US population median distance from any Ikea location is 60 km; this means that 50% of Americans have to travel a longer distance, the other 50% a shorter distance. Given that the US spans about 4000 km from coast to coast, this is pretty decent coverage.
  • Assuming that people are prepared to drive between 30 min (~50 km) and 1 hr (~100 km) to buy things at Ikea, what fraction of the population does this cover? Based on these ranges, 46% to 60% of the population are covered.
  • And of course the most important question: What fraction of the US population has a longer drive to Ikea than people living in Flagstaff? The answer is: only 20%. Despite the seemingly short distance to Tempe on the above map, only a fifth of the US population has a longer drive to Ikea. This is somewhat depressing…

For details on this analysis and project, have a look at this jupyter notebook.

Solar System Data using astroquery

As mentioned in an earlier post, I published some code to query the JPL Horizons system, which provides ephemerides, orbital elements, and state vectors for Solar System bodies, in the framework of astroquery.

astroquery is an astropy affiliated package that provides access to a wide range of astronomical data catalogs, archives, and services within Python. Hence, astroquery is the logical place for Solar System-related query functionality.

As part of the sbpy project, a number of sub-modules have been added to astroquery or extended. Here, I would like to introduce some functionality briefly:

jplhorizons – JPL Horizons Queries

The JPL Horizons service provides ephemerides, state vectors, and orbital elements for every known body in the Solar System – including the SpaceX Roadster (object id -143205). If you would like to know where you could see the Roadster in the sky from your location (if you had a telescope large enough), jplhorizons can provide exactly that information:

from astroquery.jplhorizons import Horizons
from astropy.time import Time

# define epoch for query
epoch = Time('2019-02-03 20:09', format='iso')

# define target
# '-143205' is the Horizons-internal id for this object, use id_type='id'
# 'G37' is the IAU code for Lowell's Discovery Channel Telescope
obj = Horizons('-143205', id_type='id', location='G37', epochs=epoch.jd)

# query ephemerides
eph = obj.ephemerides()

print(eph['RA', 'DEC', 'delta', 'V'])
   RA       DEC          delta        V
   deg      deg           AU         mag
--------- -------- ---------------- -----
338.13194 -8.76581 2.43178151964055 28.98

So, the Roadster is 2.432 au from Earth with an apparent magnitude of 29.0? We might want to wait for it to come a little bit closer before observing it.

Obtaining orbital elements and state vectors for any object is just as easy. Simply define the target as a Horizons object and then use the corresponding query function. The output is always in the form of an astropy Table, supporting astropy units.
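
For example, here is a minimal sketch for Ceres; the location code '500@10' refers to the center of the Sun, so the results are heliocentric:

from astroquery.jplhorizons import Horizons
from astropy.time import Time

epoch = Time('2019-02-03 20:09', format='iso')

# heliocentric center ('500@10') for osculating elements and state vectors
ceres = Horizons('Ceres', location='500@10', epochs=epoch.jd)

elem = ceres.elements()   # osculating orbital elements
vec = ceres.vectors()     # cartesian state vectors

print(elem['a', 'e', 'incl'])
print(vec['x', 'y', 'z'])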

For more information, have a look at the documentation, or contact the author of jplhorizons.

jplsbdb – JPL Small-Body Database Browser Queries

The JPL Small-Body Database Browser is a query tool that provides pretty much all the data on a Solar System small body that is available on the JPL servers.

jplsbdb is easy to use:

from astroquery.jplsbdb import SBDB
SBDB.query('3552')
OrderedDict([('object', OrderedDict([('shortname', '3552 Don Quixote'), ('neo', True), ('orbit_class', OrderedDict([('name', 'Amor'), ('code', 'AMO')])), ('pha', False), ('spkid', '2003552'), ('kind', 'an'), ('orbit_id', '188'), ('fullname', '3552 Don Quixote (1983 SA)'), ('des', '3552'), ('prefix', None)])), ('signature', OrderedDict([('source', 'NASA/JPL Small-Body Database (SBDB) API'), ('version', '1.0')])), ('orbit', OrderedDict([('source', 'JPL'), ('cov_epoch', Unit("2.45657e+06 d")), ('moid_jup', Unit("0.441 AU")), ('t_jup', '2.315'), ('condition_code', '0'), ('not_valid_before', None), ('rms', '0.51'), ('model_pars', []), ('orbit_id', '188'), ('producer', 'Otto Matic'), ('first_obs', '1983-09-10'), ('soln_date', '2018-07-06 06:55:08'), ('two_body', None), ('epoch', Unit("2.4582e+06 d")), ('elements', OrderedDict([('e', '0.709'), ('e_sig', '4.8e-08'), ('a', Unit("4.26 AU")), ('a_sig', Unit("2.3e-08 AU")), ('q', Unit("1.24 AU")), ('q_sig', Unit("2e-07 AU")), ('i', Unit("31.1 deg")), ('i_sig', Unit("1.1e-05 deg")), ('om', Unit("350 deg")), ('om_sig', Unit("1e-05 deg")), ('w', Unit("316 deg")), ('w_sig', Unit("1.1e-05 deg")), ('ma', Unit("355 deg")), ('ma_sig', Unit("3.9e-06 deg")), ('tp', Unit("2.45825e+06 d")), ('tp_sig', Unit("3.5e-05 d")), ('per', Unit("3210 d")), ('per_sig', Unit("2.6e-05 d")), ('n', Unit("0.112 deg / d")), ('n_sig', Unit("9.2e-10 deg / d")), ('ad', Unit("7.27 AU")), ('ad_sig', Unit("4e-08 AU"))])), ('equinox', 'J2000'), ('data_arc', '12717'), ('not_valid_after', None), ('n_del_obs_used', None), ('sb_used', 'SB431-N16'), ('n_obs_used', '869'), ('comment', None), ('pe_used', 'DE431'), ('last_obs', '2018-07-05'), ('moid', Unit("0.334 AU")), ('n_dop_obs_used', None)]))])

The default query returns a lot of information – even more is available upon request. The output is provided in the form of an OrderedDict in a tree-like structure.

In order to get a better overview of the data available, jplsbdb provides a function to print the query information as a schematic:

from astroquery.jplsbdb import SBDB
data = SBDB.query('3552')
print(SBDB.schematic(data))
+-+ object:
| +-- shortname: 3552 Don Quixote
| +-- neo: True
| +-+ orbit_class:
| | +-- name: Amor
| | +-- code: AMO
| +-- pha: False
| +-- spkid: 2003552
| +-- kind: an
| +-- orbit_id: 188
| +-- fullname: 3552 Don Quixote (1983 SA)
| +-- des: 3552
| +-- prefix: None
+-+ signature:
| +-- source: NASA/JPL Small-Body Database (SBDB) API
| +-- version: 1.0
...

For more information, have a look at the documentation, or contact the author of jplsbdb.

jplspec – JPL Molecular Spectroscopy Catalog Queries

The JPL Molecular Spectroscopy Catalog provides spectroscopic information on a wide range of molecules. For a query example and additional information, I refer to the jplspec documentation.
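
As a rough sketch of what such a query looks like – the exact call signature and the molecule tag below ('28001 CO' for carbon monoxide) are modeled on the jplspec documentation and should be checked there:

from astropy import units as u
from astroquery.jplspec import JPLSpec

# query CO transitions between 100 and 200 GHz
result = JPLSpec.query_lines(min_frequency=100 * u.GHz,
                             max_frequency=200 * u.GHz,
                             min_strength=-500,
                             molecule='28001 CO')
print(result)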

mpc – Minor Planet Center Ephemerides Queries

The Minor Planet Center is the official clearing house for all asteroid and comet observations and provides a number of services and data sets for researchers. The astroquery submodule mpc includes functionality to query ephemerides, orbital elements, registered observatory locations, and observations reported to the MPC. For a full introduction to the provided services, I refer to the mpc documentation. Here, I simply present a short example of how to query the current ephemerides of asteroid Ceres as seen from the Discovery Channel Telescope (G37):

from astroquery.mpc import MPC

eph = MPC.get_ephemeris('Ceres', location='G37', number=1)
print(eph['Date', 'RA', 'Dec'])
          Date                  RA                 Dec        
                               deg                 deg        
----------------------- ------------------ -------------------
2019-02-03 21:57:08.000 241.90666666666667 -14.536111111111111

skybot – Searching for Solar System bodies in a region of the sky

Coming soon…

pyoorb – A Python interface for orbit calculations with OpenOrb

OpenOrb is a software suite for performing all sorts of orbit calculations, including orbit integration, ephemerides calculations, and ranging to determine orbit solutions based on observations. Coded in Fortran, OpenOrb is usually compiled locally and run from the command line.

A few years ago, LSST software developers started working on a Python wrapper for OpenOrb that uses the Fortran code as a library. In the framework of sbpy, I picked up their excellent groundwork to extend the available functionality for a broader audience.

pyoorb currently provides functions for orbit integration, ephemerides calculations, and orbital element transformations. Orbit ranging will be implemented in the near future. Since pyoorb uses the existing OpenOrb Fortran code as a basis, function input has to follow the specifications meticulously, without the dynamic typing that you are used to as a Python user. In order to make pyoorb functions more user-friendly, sbpy provides convenience functions for the use of pyoorb.

Consider the following examples, which require sbpy and pyoorb to be installed:

  • Calculate the ephemerides of asteroid Ceres for some point in time, using osculating orbital elements obtained from JPL Horizons:
import pyoorb as oo
from astropy.time import Time
from sbpy.data import Orbit, Ephem

# initialize pyoorb
oo.pyoorb.oorb_init()

# obtain current osculating elements for Ceres
orb = Orbit.from_horizons('Ceres')

# define epoch for which ephemerides are to be calculated
epoch = Time('2019-02-01 12:00', format='iso')

# calculate ephemerides
eph = Ephem.from_oo(orb, epoch)
print(eph['epoch', 'RA', 'DEC'])
  epoch          RA               DEC        
    d           deg               deg        
--------- ---------------- ------------------
2458516.0 241.143973372825 -14.38567839898754
  • Transform Ceres’ Keplerian orbit to cartesian coordinates, comet orbit elements, and back to Keplerian coordinates; we compare the semi-major axis of the original Keplerian orbit with that of the transformed orbit:

import pyoorb as oo
from sbpy.data import Orbit

# initialize pyoorb
oo.pyoorb.oorb_init()

# obtain current osculating elements for Ceres
keporb = Orbit.from_horizons('Ceres')

# transform orbital elements to cartesian coordinates
cartorb = keporb.oo_transform('CART')

# transform cartesian coordinates to comet orbit
comorb = cartorb.oo_transform('COM')

# transform comet orbit to Keplerian orbit
keporb2 = comorb.oo_transform('KEP')

print('discrepancy in semi-major axis:', keporb['a']-keporb2['a'])
discrepancy in semi-major axis: [0.] AU
  • Propagate Ceres’ orbit ten years into the future and compare the result to the orbit derived by JPL Horizons:

import pyoorb as oo
from astropy.time import Time
from sbpy.data import Orbit

# initialize pyoorb
oo.pyoorb.oorb_init()

now = Time.now()

# get current osculating elements from JPL Horizons
orbnow_jpl = Orbit.from_horizons('Ceres')

# propagate orbit ten years into the future
then_jd = now.jd + 10*365
orbthen = orbnow_jpl.oo_propagate(then_jd)

# compare to JPL Horizons orbit for the same date
orbthen_jpl = Orbit.from_horizons('Ceres', epochs=then_jd)

print('discrepancy in semi-major axis:',
      (orbthen['a']-orbthen_jpl['a']).to('m'))
discrepancy in semi-major axis: [51.80632224] m

For more information on OpenOrb and pyoorb, check out their github repo. Thanks to @bsipocz, @mjuric, and @rhiannonlynne, OpenOrb and pyoorb are now easily available through conda for Linux and MacOS. Give it a try!

If you are interested in using the sbpy convenience functions, have a look at the sbpy documentation.

 

Generating LaTeX Publication lists from NASA ADS

Compiling lists of publications for proposals, applications, or simply your CV can be a cumbersome process. If only there were some code that could produce a nicely formatted pdf from the information listed in the NASA ADS system… Like this one.

Et voilà:

from pylatexenc.latexencode import utf8tolatex
import ads

ads.config.token = ''  # your ADS token
author_name = ''  # your name
years = () # years to be queried: (start year, end year)
refereed = True # if True, only refereed publications will be queried;
                # if False, only non-refereed publications will be queried

latex_header = (
    '\\documentclass[11pt]{article}\n'
    '\\usepackage[inner=1in,outer=1in,top=1in,bottom=1in]{geometry}\n'
    '\\usepackage{hyperref}\n\n'
    '\\begin{document}\n\n'
    '\\section*{Publications}\n\n'
    '\\begin{itemize}\\itemsep 0pt\n')

latex_footer = (
    '\\end{itemize}\n'
    '\\end{document}\n')

def query_papers(author, refereed=None, years=None, rows=1000):
    """query papers from NASA ADS

    :param author: str, author name
    :param refereed: boolean or `None`, if `True`, only extract refereed
                     publications; if `False`, only extract not refereed
                     publications; if `None`, extract all; default: `None`
    :param years: tuple, list, or `None`, range of years to query or `None`,
                  default: `None`
    :param rows: int, maximum number of publications to extract

    :return: list of ads publication objects
    """
    # set query payload
    if refereed is None:
        q = ''
    elif refereed:
        q = 'property:refereed'
    elif not refereed:
        q = 'property:notrefereed'
    fq = 'database:astronomy'
    if years is not None:
        fq += " year:{0}-{1}".format(years[0], years[1])

    # perform query
    papers = ads.SearchQuery(author=author,
                             fq=fq,
                             q=q,
                             sort='pubdate',
                             rows=rows,
                             fl=['title', 'author', 'year', 'volume',
                                 'page', 'pub', 'identifier', 'citation'])

    return list(papers)

def create_latex(paper, name=None):
    """turn ads publication objects into strings using latex encoding

    :param paper: ads publication object
    :param name: string or `None`, name that will be highlighted in latex,
                 default: `None`

    :return: str, latex encoded string for paper
    """
    out = ''

    # put paper title in italic font
    title = '{\\it ' + utf8tolatex(paper.title[0]) + '}'

    # build author list
    if name is None:
        # treat all author names equally and list all of them
        authors = [utf8tolatex(paper.author[i])
                   for i in range(len(paper.author))]
        etal = False
    else:
        # highlight `name` in output string, if provided
        authors = []
        name_found = False
        dotdotdot = False
        etal = False
        for i in range(len(paper.author)):
            author = utf8tolatex(paper.author[i])
            if name in author:
                # `name` is the i-th author on this paper; highlight it
                name_found = True
                authors.append('{\\bf ' + author + '}')
            elif i < 3:
                # list the first three authors explicitly
                authors.append(author)
            elif i >= 3 and not name_found and not dotdotdot:
                # at least 3 authors and `name` is not among the first 3;
                # insert '...'
                authors.append('...\\ ')
                dotdotdot = True
            elif i >= 3 and not name_found and dotdotdot:
                # at least 3 authors, but this one is not `name`
                pass
            elif i >= 3 and name_found:
                # at least 3 authors and `name` has already been listed
                etal = True
                break

    # join author list and add 'et al.' if required
    if etal:
        authors = ', '.join(authors) + ' et al.'
    else:
        authors = ', '.join(authors)

    year = paper.year

    # create string with journal volume and page number
    pub = str(paper.pub)
    if paper.volume is not None:
        pub += ', ' + str(paper.volume)
    if paper.page is not None:
        pub += ', ' + str(paper.page[0])

    arxiv_link = ''
    for ident in paper.identifier:
        if 'arXiv:' in ident:
            arxiv_id = ident[6:]
            arxiv_link = ('\\href{https://arxiv.org/abs/' +
                          arxiv_id + '}{arxiv}')
        elif len(ident) == 10 and ident[4] == '.':
            arxiv_link = ('\\href{https://arxiv.org/abs/' +
                          ident + '}{arxiv}')

    # assemble output string as latex bullet list item
    out = ('\\item ' + authors + ' (' + year + '), ' + title +
           ', ' + pub)
    if arxiv_link != '':
        out += ', ' + arxiv_link

    # add number of citations, if available
    if paper.citation is not None and len(paper.citation) > 1:
        out += ', ' + str(len(paper.citation)) + ' citations'
    elif paper.citation is not None and len(paper.citation) == 1:
        out += ', ' + str(len(paper.citation)) + ' citation'

    return out

def fixme(out):
    """fix/reject citation substrings

    :param out: string containing publication information

    :return out: string
    """

    # words leading to a rejection
    reject = ['Erratum']
    for s in reject:
        if s in out:
            return ''

    # substrings to be replaced
    fix = {'<SUB>': '',
           '</SUB>': '',
           '': '',
           '': ''}
    for key, val in fix.items():
        if key in out:
            out = out.replace(key, val)

    out = out.replace('#', '\\#')

    return out

# pull references from ads
papers = query_papers(author_name, refereed=refereed, years=years)

# write results to file
with open('publication_list.tex', 'w') as outf:
    outf.write(latex_header + '\n\n')
    for paper in list(papers):
        ref = fixme(create_latex(paper, author_name))
        if len(ref) > 0:
            print(paper.author[0], paper.year)
            outf.write(ref + '\n\n')
    outf.write(latex_footer + '\n')

This script will produce the LaTeX source file for the pdf linked above. To run the code, you will need the following:

  • an account with the new ADS system and the private token you have been assigned
  • python 3 installed along with the modules ads and pylatexenc
  • pdflatex or latex installed to compile the resulting LaTeX file as a pdf or ps file, respectively

To create a publication list for yourself (or somebody else) follow these steps:

  1. download the code snippet and save it into a file, e.g., create_publist.py
  2. modify the highlighted lines in the code, i.e., the following items:
    • add your ADS token
    • add your name (this is the name that will be queried in ADS)
    • add the years to be queried, e.g., (2015, 2019) will query all publications published in the years 2015 till 2019
    • select whether you want a list of refereed publications or a list of non-refereed publications
  3. run the code in a terminal as: python create_publist.py
  4. compile the LaTeX source file: pdflatex publication_list.tex
  5. enjoy your publication list!

A few features of this script:

  • the number of authors listed is limited to 3; if there are more than 3 authors, the list is culled after 3 authors and “et al.” is added automatically
  • your name is automatically highlighted in bold and will never be culled from the author list
  • article titles are provided, as well as journal names, volume and page numbers
  • a hyperlink to the article on arxiv is automatically added (this was chosen over the official journal website since arxiv is open access)
  • if an article has been cited by others, the number of citations is provided
  • special characters are converted to LaTeX using pylatexenc
  • the fixme function allows rejecting publications based on buzzwords (e.g., ‘Erratum’, see the reject list in fixme) and it will replace certain substrings if the formatting in ADS is messed up (in that case, look up the substring causing trouble in the LaTeX source file and add it to the fix dictionary in fixme)

Have fun with this!