Example Stock Price Charts

Storing a short description of the data in the first lines of a CSV file is a useful convention: the description can later be used to annotate a graph or a report associated with the data in the file.

To follow along with the example, ensure your directory structure is similar to that presented in Chapter 2 in the section "Example: Directory Structure for the Book." Your base directory should be Ch4; within Ch4 there should be three subdirectories named src, data, and images. If you wish to use a different scheme, be sure to change the file path variable and the call to function savefig() in the script in Listing 4-3, which appears a little later in this section.

For this example you can download data from the NASDAQ stock exchange web site (http://www.nasdaq.com). Select the stock you wish to display on the intranet web site, for instance, the NASDAQ-100 (IXNDX) or your company's stock. You will be presented with a chart of the stock. When you click the chart, the NASDAQ web site presents the actual values used to create the chart. You can choose to download the file in Excel format; do so, and save it as Ch4/data/charts.xls.

If you open the file Ch4/data/charts.xls in a text editor, you'll notice that there's header information describing what each column means:











































In reality, the file format is a form of CSV, the separator being a tab instead of a comma. We can easily handle this with Python's csv module by specifying the delimiter to be a tab, '\t'. Listing 4-3 shows our implementation, stock_charts.py, which reads a stock chart file and presents a graph with the header information properly displayed. Be sure to save it in folder Ch4/src. The result will be a PNG image, stock_price.png, in directory Ch4/images.

Listing 4-3. stock_charts.py, Plotting NASDAQ charts.xls File

from pylab import *
import csv
from time import gmtime, mktime

# modify the following to point to your data file
filepath = '../data/charts.xls'

# read the entire CSV file and store it in an array of lists
data = []
for row in csv.reader(open(filepath), delimiter='\t'):
    data.append(row)

# split the data to header and values
header = data[0]
values = array(data[1:])

# the first column is date information in a string format
# we transform it to a day of year format
# notice that this will not work over year boundary (need to add 365)
yearday = zeros(len(values[:, 0]))
for i, day in enumerate(values[:, 0]):
    market_close_time = (int(day[6:]), int(day[:2]), int(day[3:5]),
                         16, 0, 0, 0, 0, 0)
    yearday[i] = gmtime(mktime(market_close_time)).tm_yday

# plot the price columns (Open, High, Low, Close/Last)
# the CSV fields are strings, so convert them to floats for plotting
figure()
for i in range(1, 5):
    plot(yearday, values[:, i].astype(float), label=header[i], linewidth=3)

# annotate the start and end dates
text(yearday[0], float(values[0, 1]), values[0, 0])
text(yearday[-1], float(values[-1, 1]), values[-1, 0])

legend()
ylabel('Stock price [USD]')
xlabel('Days from start of the year '+values[0, 0][6:])
title('NASDAQ-100 (IXNDX) Stock price, period %s-%s' %
      (values[-1, 0], values[0, 0]))
savefig('../images/stock_price.png')

We start by reading the CSV data file and passing a tab as a delimiter. The first line in variable data is the header information, describing what each column means: Date, Open, High, Low, Close/Last, and Volume. The remaining lines are the values to plot. We therefore split the variable data into header and values, accordingly. We also convert the values to a NumPy array using the function call array(). Using a NumPy array, the data will be easier to process and plot; more about NumPy in Chapter 7.
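The reading and splitting steps can be tried without the downloaded file. The following is a minimal sketch that uses only the standard library; the sample rows are made-up values for illustration, not real NASDAQ data, and the names sample, close_index, and closes are assumptions of this sketch. It uses plain lists in place of the NumPy array used in Listing 4-3.

```python
import csv
import io

# Hypothetical tab-separated sample in the same layout as charts.xls;
# the numbers are invented for illustration only.
sample = (
    "Date\tOpen\tHigh\tLow\tClose/Last\tVolume\n"
    "08/05/2008\t101.0\t103.5\t100.5\t103.0\t1200\n"
    "08/04/2008\t100.0\t102.0\t99.0\t101.0\t1000\n"
)

# Read every row; the delimiter is a tab rather than a comma.
data = list(csv.reader(io.StringIO(sample), delimiter='\t'))

# The first row is the header; the remaining rows are the values.
header = data[0]
values = data[1:]

# csv.reader returns strings, so convert a price column before using it.
close_index = header.index('Close/Last')
closes = [float(row[close_index]) for row in values]

print(header)
print(closes)
```

Because every field arrives as a string, any column used for arithmetic or plotting needs an explicit conversion such as the float() call shown here.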

The following is not so much an explanation of working with CSV files but is important to fully understand the script.

Next is the so-called linearization process. Much like in the GPS example of Chapter 1, data in charts.xls is not linear. The information is stock prices on a daily basis; however, stocks are not traded every day, weekends being the prime example but also holidays. If we plot the information as is, neglecting these "holes" in the data, the picture presented will be skewed. So instead, we need to choose a different time base, one that will take into consideration nontrade days. I chose to use the day-of-the-year value: January 1 is 1, January 2 is 2, ... December 31 is 365 or 366 (leap year dependent).

Since I don't want to get into the process of determining leap years or summing up the days in each month, I've decided to use the time module again. The idea here is to use the function gmtime() and, as a side effect, retrieve the day-of-the-year value. Function gmtime() receives a value representing the number of seconds elapsed since the epoch, a fixed point in time (see more about the epoch in Chapter 5). While this sounds even more complicated than calculating the day of the year, in reality it's easier because of function mktime(). Function mktime() receives a tuple of nine values, detailed previously, and returns the number of seconds since the epoch. So we first construct a tuple of those nine values, the first three being year, month, and day, which are known to us, and arbitrarily set the hour to 4 p.m. (which coincides with the end of trade). We leave the remaining fields zero. We then feed the number returned by mktime() to gmtime() and receive a new tuple, now properly populated with the day of the year, the eighth element of the tuple, accessible as tm_yday, which we save in vector yearday.
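The conversion just described can be isolated into a small function. The sketch below is illustrative only: the function names yearday_via_time and yearday_via_datetime are assumptions, not part of Listing 4-3. The first function follows the mktime() and gmtime() route from the text. The second is an alternative based on the datetime module, included because mktime() interprets the tuple as local time while gmtime() reports UTC, so in time zones far west of UTC the first route can land on the following day.

```python
from datetime import datetime
from time import gmtime, mktime


def yearday_via_time(day):
    """Day of the year for an 'MM/DD/YYYY' string, via mktime() and gmtime()."""
    # Nine-field time tuple: year, month, day, hour (4 p.m.), then zeros.
    market_close_time = (int(day[6:]), int(day[:2]), int(day[3:5]),
                         16, 0, 0, 0, 0, 0)
    # mktime() treats the tuple as local time; gmtime() reports UTC.
    return gmtime(mktime(market_close_time)).tm_yday


def yearday_via_datetime(day):
    """Same calculation with datetime, independent of the local time zone."""
    return datetime.strptime(day, '%m/%d/%Y').timetuple().tm_yday


print(yearday_via_time('08/04/2008'))
print(yearday_via_datetime('08/04/2008'))  # 217
```

On a machine whose clock is set to UTC, or to a zone reasonably close to it, both functions should agree.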

■ Note The script does not take into account data over more than one year. To accommodate for this, you could take into consideration the number of days in a year (365 or 366, depending on a leap year) and use the lowest year as a baseline.
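One possible way to handle data spanning more than one year is sketched below. It is not part of Listing 4-3; the function name days_from_baseline is an assumption, and the sketch subtracts dates with the datetime module rather than adding 365 or 366 explicitly, which has the same effect of counting days from January 1 of the lowest year.

```python
from datetime import date


def days_from_baseline(days):
    """Map 'MM/DD/YYYY' strings to day counts from January 1 of the lowest year."""
    parsed = [date(int(d[6:]), int(d[:2]), int(d[3:5])) for d in days]
    baseline = date(min(p.year for p in parsed), 1, 1)
    # Subtracting dates accounts for 365- and 366-day years automatically;
    # adding 1 keeps January 1 of the baseline year equal to 1.
    return [(p - baseline).days + 1 for p in parsed]


print(days_from_baseline(['12/31/2008', '01/02/2009']))  # [366, 368]
```

For dates that all fall within one year, the result equals the plain day-of-the-year value, so the sketch could stand in for the yearday calculation without changing single-year plots.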

We then plot the data and annotate the graph. For the legend, we use the header values of the CSV file stored in variable header. We also use actual values from the variable values to annotate the start and end of period on the graph itself, the title, and the x-axis label (see Figure 4-1).

■ Note If you look closely at the data in charts.xls, you'll notice that it's reversed, that is, backward in time. One of the side effects of using the day-of-the-year value is that values are now plotted from lower to higher values, that is, older times are on the left, and newer events are on the right. If you'd like to reverse this behavior, issue the command gca().axes.invert_xaxis().

[Chart: "NASDAQ-100 (IXNDX) Stock price, period 08/04/2008-09/02/2008"; x-axis: "Days from start of the year 2008"]

Figure 4-1. Stock price chart output
