LEVEL : Beginner

Jan 10, 2021
Start Python for Finance

Essential things
to unlock financial data

Part 3/6

Once your coding tools are in place, the next step is to find financial data. Fortunately, a vast amount of financial data circulates freely on the web. Where can you collect it, and how?

For a first test: wait for the window to load and click on the Run button.

 The result
you should expect

 You should see the following result:

  • Estimated launch time: 3 min.
  • Goal: download the financial data of the stocks constituting the S&P 500 index.

The ultimate goal is to be able to gather any type of financial data from the web.
There are several ways to do it. Here is what I have learned.

Making Finance less opaque

Where there is data smoke, there is business fire.
– Thomas Redman –

Make no mistake: data is the new gold. And financial institutions have understood this well.

  Do you have any idea of the average bank’s market data expenditure?

Well, a mid-sized institution can easily spend more than EUR 20 million per year.

Why so much, you may ask?

Data is information. The world of finance is so complex and opaque that information is key. Without relevant information, it is impossible to understand this world, adapt to it, grow stronger and survive.

Access to data is expensive for a financial institution. So how, at a beginner's level, do I get into the game? Is making Finance less opaque a reasonable objective?

Yes. It is paradoxical, but it is a fact: an incredible amount of financial data circulates on the web, and for free.

In the end, the question is not how to retrieve data, but which data to gather.

3 data sources used by hedge funds

Imagine you plan to create your own financial analyst robot.

You want it to analyze the financial markets and come back to you with the best opportunities.

To do this, the robot must match your needs and interests, and it must rely on 3 types of data.

The robot should use Market data

  If you are looking for opportunities in trade-related data, such as prices, volumes, trends or corporate events.

The robot should use Fundamental data

  If you are looking for opportunities in a business's financial statements, such as assets, liabilities or earnings.

  Or if you consider the overall state of the economy and factors such as interest rates, production, employment or GDP.

The robot should use Alternative data

  If you think opportunities may come from social networks, sentiment analysis, geolocation data or website scraping.


These 3 types of data will be discussed on this website. In the example code above, whose only objective is to get started, I use market data and website scraping.

Let me explain why and how below. 

How to collect market data in a few lines of code?

The ease of collecting financial data with Python is amazing. Here are 3 ways to do it.

1. A professional solution: Financial data provider

Some data providers make it easy to import financial data into Python.

(Quandl is one of those great tools. What is more, the Quandl API is free to use. We will talk about this later on this website.)

2. A response from the Python community: Libraries

One of the best known free financial data resources is Yahoo Finance.

Developers publish free libraries that give you a collection of functions and methods for accessing Yahoo Finance data.

  We implement it in the example code: I am using the yfinance library.

3. A solution of complete freedom: Web scraping

Sometimes, the data you are looking for is not available from data providers or through existing libraries.

However, you can find what you need on another website.

Good news: with Python, you can extract the information displayed on a web page. This method of extracting content from websites is commonly called web scraping.

  We implement it in the example code:

In our case, I did not find the list of stocks composing the S&P 500 in the yfinance library. This is why I extract the list of all stocks from the Wikipedia page.

To do this, I use a function from the pandas library. I develop this solution below.

(To go further in web scraping, we will also discuss the Beautiful Soup library later. It is considered one of the best tools for scraping.)
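
To give an idea of what a scraper does behind the scenes, here is a minimal sketch that reads the rows of an HTML table using only Python's standard library. The HTML snippet, the company names and the TableParser class are made up for illustration. This is not how pandas does it internally, just the general idea of turning markup into rows.

```python
from html.parser import HTMLParser

# Hypothetical HTML snippet standing in for a downloaded web page.
# The symbols and company names are invented for this example.
SAMPLE_HTML = """
<table>
  <tr><th>Symbol</th><th>Security</th></tr>
  <tr><td>AAA</td><td>Example Corp</td></tr>
  <tr><td>BBB</td><td>Sample Inc</td></tr>
</table>
"""


class TableParser(HTMLParser):
    """Collect the text of every table cell, row by row."""

    def __init__(self):
        super().__init__()
        self.rows = []          # rows that are fully read
        self._row = None        # row currently being read
        self._in_cell = False   # True while inside <td> or <th>

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._in_cell = True
            self._row.append("")

    def handle_endtag(self, tag):
        if tag in ("td", "th"):
            self._in_cell = False
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None

    def handle_data(self, data):
        # Only keep text that sits inside a cell
        if self._in_cell:
            self._row[-1] += data.strip()


parser = TableParser()
parser.feed(SAMPLE_HTML)

# First row is the header, the rest are the records
header, *records = parser.rows
print(header)
print(records)
```

In practice, a library does this work for you and also copes with messy real-world pages, which is why I rely on one in the example code.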

Putting it all together. My code in 4 steps.

As a reminder: the objective of my code is to be able to download the financial data of the stocks constituting the S&P 500.

Here are the 4 steps:

Step 1 – Requirements to make the code work

Import the pandas and yfinance libraries.


# --- STEP 1. Import the libraries
import pandas as pd
import yfinance as yf

Step 2 [Optional] – Download the price data of a single stock

I add this portion of code for illustrative purposes only.

It downloads the price history of a single stock: Google, for the period from 2019-10-05 to 2020-10-05.

'GOOG' is the ticker symbol of Google. A ticker symbol is a unique identifier under which a stock can be looked up and traded. Generally, it is an abbreviation of the company name.

The result is displayed by the print() function.


# --- STEP 2.  Download the price data of a single stock: Google's stock
DataFrameYahoo = yf.download('GOOG', start='2019-10-05', end='2020-10-05')
# Display the downloaded price table
print(DataFrameYahoo)
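
Once a price table is loaded, a few pandas operations are enough to start exploring it. The sketch below assumes a table indexed by date with a Close column. The prices are invented for illustration (they are not real Google quotes), and the exact layout returned by yfinance may differ depending on its version.

```python
import pandas as pd

# Made-up closing prices, for illustration only (not real market data)
prices = pd.DataFrame(
    {"Close": [100.0, 102.0, 101.0, 104.03]},
    index=pd.to_datetime(
        ["2020-01-02", "2020-01-03", "2020-01-06", "2020-01-07"]
    ),
)
prices.index.name = "Date"

# Keep only the rows between two dates (both bounds are included)
subset = prices.loc["2020-01-03":"2020-01-06"]

# Daily return: percentage change from one row to the next.
# The first value is NaN because there is no previous price.
returns = prices["Close"].pct_change()

print(subset)
print(returns)
```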

Step 3 – A simple web scraping in action

Extract the table listing all the stocks of the S&P 500 from the Wikipedia page: https://en.wikipedia.org/wiki/List_of_S%26P_500_companies

With the print() function, display the first 5 stocks of the table.


# --- STEP 3.  Get the list of the stock tickers composing the S&P 500
# gather all tables from the url link (pd.read_html needs an HTML parser such as lxml installed)
tables = pd.read_html('https://en.wikipedia.org/wiki/List_of_S%26P_500_companies')
# take columns Symbol and Security from the first table
df_wiki = tables[0][['Symbol','Security']]
# create the list of the first 5 stocks composing the S&P 500 in list_of_stocks
list_of_stocks = df_wiki['Symbol'].values.tolist()[:5]
# display the first 5 tickers
print(list_of_stocks)
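
One detail to keep in mind if you go beyond the first 5 stocks: Wikipedia writes share-class tickers with a dot (for example BRK.B), while Yahoo Finance expects a hyphen (BRK-B). Here is a minimal sketch of a conversion step. The helper name to_yahoo_symbol is my own, not part of pandas or yfinance.

```python
def to_yahoo_symbol(symbol: str) -> str:
    """Convert a dotted ticker such as 'BRK.B' to the hyphenated
    form used by Yahoo Finance ('BRK-B')."""
    return symbol.strip().upper().replace(".", "-")


# A few symbols as they appear in the Wikipedia table
wiki_symbols = ["MMM", "BRK.B", "BF.B"]
yahoo_symbols = [to_yahoo_symbol(s) for s in wiki_symbols]
print(yahoo_symbols)  # ['MMM', 'BRK-B', 'BF-B']
```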

Step 4 – Download the price data of multiple stocks

From the S&P 500 stock list extracted in step 3, download the prices of the first 5 stocks.
Finally, display the financial data table with the print() function.


# --- STEP 4.  Download financial data for the tickers in list_of_stocks (the first 5 stocks of the S&P 500).
# Download result is saved in df_stocks_data
df_stocks_data = yf.download(list_of_stocks,
                      period = "1y",
                      interval = "1d",
                      group_by = 'ticker')
# display the downloaded table
print(df_stocks_data)
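
With group_by = 'ticker', a multi-stock download comes back as a table with two levels of column names: the ticker first, then the field (Open, Close, and so on). The sketch below rebuilds a tiny table with that shape to show how to pick out one stock or one field. The tickers AAA and BBB and all the numbers are invented for illustration, and I am assuming this two-level layout rather than reproducing real yfinance output.

```python
import pandas as pd

# Hypothetical tickers and made-up prices, for illustration only
tickers = ["AAA", "BBB"]
fields = ["Open", "Close"]

# Two-level columns: (ticker, field)
columns = pd.MultiIndex.from_product([tickers, fields])
dates = pd.to_datetime(["2020-01-02", "2020-01-03"])

data = pd.DataFrame(
    [
        [10.0, 11.0, 20.0, 21.0],
        [11.0, 12.0, 21.0, 19.0],
    ],
    index=dates,
    columns=columns,
)

# All fields for a single stock: select on the first column level
aaa = data["AAA"]

# A single field for every stock: take a cross-section on the second level
closes = data.xs("Close", axis=1, level=1)

print(aaa)
print(closes)
```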

What are the next steps?

Having data is good. Using it is better.


Icons made by Freepik from www.flaticon.com