Free Trading/Stock Data Tested with Python in 2022
I finally sat down and tested several APIs to get started with our Reinforcement Learning algorithmic trading bot. I have also included the code in case you want to try these services yourself.
Free Stock Market Datasets/APIs I’ve Tested Recently
To begin with, there are a lot of data sources on the web, and I was astonished by how many companies make money just by selling data. I am fairly sure I read that this market is worth around 3 billion and growing every year.
Anyway, for research purposes, I have played around with free data sources such as Alpaca, AlphaVantage, YahooFinance, IEXFinance, Investing, Quandl (Nasdaq) and MarketStack.
Disclaimer: This post does not use any affiliate marketing.
1) Alpaca Markets
I started with Alpaca because I already had experience with their API.
For paper trading purposes it really is a great service: the documentation is well written and it is completely free.
TL;DR: Data is available as minute, hourly and daily bars, BUT only from 2015 onward, AND the data must be cleaned, because stock splits are not taken into account in the raw, unadjusted bars requested below :(
After creating your free account, start with the following code.
import alpaca_trade_api as tradeapi
import pandas as pd

def AlpacaData(Symbols=['AAPL']):
    BASE_URL = "https://paper-api.alpaca.markets"
    KEY_ID = "YOUR-ID"
    SECRET_KEY = "YOUR-KEY"
    # Instantiate REST API connection
    api = tradeapi.REST(key_id=KEY_ID, secret_key=SECRET_KEY, base_url=BASE_URL)
    barTimeframe = "1Day"  # 1Min, 5Min, 15Min, 1Hour, 1Day
    # Fetch the account and print its details
    account = api.get_account()
    print(account.id, account.equity, account.status)
    frames = []  # One DataFrame per downloaded symbol
    for Symbol in Symbols:
        # Request bars from 2010 onward (in my tests data only starts in 2015);
        # adjustment='raw' means prices are NOT adjusted for splits
        Alpaca_DataFrame = api.get_bars(Symbol, barTimeframe, start="2010-01-01", adjustment='raw').df
        Alpaca_DataFrame['Symbol'] = Symbol
        frames.append(Alpaca_DataFrame)
    # Write once after the loop so later symbols do not overwrite earlier ones
    pd.concat(frames).to_csv('01_Alpaca_Data.csv')
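Since the raw bars above ignore stock splits, the cleaning step could look roughly like the following. This is a minimal sketch in plain Python, not Alpaca's own adjustment logic: the function name adjust_for_splits, the (date, close) tuple layout and the sample prices are assumptions made for illustration. It divides every close quoted before a split date by the split ratio, so the series becomes continuous.

```python
from datetime import date

def adjust_for_splits(bars, splits):
    """Back-adjust closing prices for stock splits.

    bars:   list of (date, close) tuples, oldest first (hypothetical layout)
    splits: list of (effective_date, ratio) tuples, e.g. ratio 4.0 for a 4-for-1 split
    """
    adjusted = []
    for day, close in bars:
        factor = 1.0
        for split_day, ratio in splits:
            # Prices quoted before the split are divided by the ratio
            if day < split_day:
                factor *= ratio
        adjusted.append((day, close / factor))
    return adjusted

# Made-up prices around a 4-for-1 split effective on 2020-08-31
raw_bars = [
    (date(2020, 8, 27), 500.0),
    (date(2020, 8, 28), 480.0),
    (date(2020, 8, 31), 125.0),
]
clean_bars = adjust_for_splits(raw_bars, [(date(2020, 8, 31), 4.0)])
print(clean_bars)
```

If you keep the volume column, it would be multiplied by the same factor instead of divided.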
2) Yahoo! Finance (YFinance)
While I was using Google Spreadsheets, I often came across Google Finance, which interestingly gets its data from Yahoo! Finance. YFinance is among the best data sources you can find and is very popular in the community. Stock splits are taken into account, but unfortunately the long history is only available as end-of-day OHLCV (Open-High-Low-Close-Volume) data; intraday intervals only cover a short recent window.
I can tell you, if YFinance offered minute or hourly data over the same long history, it would become a standard for the whole market.
Check it out with the following code; you don’t even have to create an account:
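import datetime
import pandas as pd
import yfinance as yf

def YfinanceData(Symbols):
    # Daily bars from December 2015 onward
    start = datetime.datetime(2015, 12, 1)
    frames = []
    for Symbol in Symbols:
        Yfinance_DataFrame = yf.download(tickers=Symbol, interval='1d', start=start)
        Yfinance_DataFrame['Symbol'] = Symbol
        frames.append(Yfinance_DataFrame)
    # Write once after the loop so later symbols do not overwrite earlier ones
    pd.concat(frames).to_csv('02_Yfinance_Data.csv')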
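To make the OHLCV abbreviation concrete, here is a small sketch of how intraday bars collapse into one end-of-day bar: the open of the first bar, the highest high, the lowest low, the close of the last bar and the summed volume. The helper name to_daily_bar and the sample numbers are made up for illustration and say nothing about how Yahoo! Finance builds its data internally.

```python
def to_daily_bar(intraday_bars):
    """Collapse chronologically ordered intraday OHLCV dicts into one daily bar."""
    if not intraday_bars:
        raise ValueError("need at least one intraday bar")
    return {
        "open": intraday_bars[0]["open"],                       # first open of the day
        "high": max(bar["high"] for bar in intraday_bars),      # highest high
        "low": min(bar["low"] for bar in intraday_bars),        # lowest low
        "close": intraday_bars[-1]["close"],                    # last close of the day
        "volume": sum(bar["volume"] for bar in intraday_bars),  # total traded volume
    }

# Made-up hourly bars for a single trading day
hourly = [
    {"open": 100.0, "high": 101.5, "low": 99.5, "close": 101.0, "volume": 1200},
    {"open": 101.0, "high": 102.0, "low": 100.5, "close": 100.8, "volume": 900},
    {"open": 100.8, "high": 101.2, "low": 98.9, "close": 99.4, "volume": 1500},
]
daily = to_daily_bar(hourly)
print(daily)
```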
3) QUANDL/NASDAQ
Quandl was acquired by Nasdaq because it offers alternative data that Nasdaq did not have before. Its history goes back to the 2000s, but unfortunately stock splits are not taken into account here either. Without preprocessing this dataset, our machine learning model would produce results we could not rely on. Also, the symbol names are not the same as those used by Yahoo! Finance and others; they follow their own scheme, such as ‘WIKI/AAPL.4’…
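import nasdaqdatalink
import pandas as pd

def QuandlData(Symbols):  # Or NasdaqData, since Quandl was acquired by Nasdaq
    # Unfortunately, it has its own naming convention for symbols :(
    # so the incoming list is replaced by a hard-coded Quandl code
    Symbols = ['WIKI/AAPL.4']
    frames = []
    for Symbol in Symbols:
        Nasdaq_Dataframe = nasdaqdatalink.get(Symbol, start_date="2001-12-31", end_date="2021-12-31")
        Nasdaq_Dataframe['Symbol'] = Symbol
        frames.append(Nasdaq_Dataframe)
    # The target folder must already exist
    pd.concat(frames).to_csv('DataSets/CSVs/03_Nasdaq_Data.csv')
    # Optional: convert dates to Matplotlib numbers and the frame to a 2-D array
    # (requires `import matplotlib.dates as mdates`)
    # df.index = mdates.date2num(df.index)
    # data = df.reset_index().values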
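The naming convention is less mysterious once you split it up: a code such as 'WIKI/AAPL.4' consists of a database code, a dataset code and an optional column index. A tiny helper can translate plain tickers into that format and back. The function names below are hypothetical and not part of the nasdaqdatalink package; column 4 simply mirrors the example above.

```python
def to_quandl_code(ticker, database="WIKI", column=None):
    """Build a code like 'WIKI/AAPL.4' from a plain ticker symbol."""
    code = f"{database}/{ticker.upper()}"
    if column is not None:
        code += f".{column}"
    return code

def from_quandl_code(code):
    """Split a code like 'WIKI/AAPL.4' into (database, ticker, column)."""
    database, _, rest = code.partition("/")
    ticker, _, column = rest.partition(".")
    # The column index is optional, so it may be missing
    return database, ticker, (int(column) if column else None)

codes = [to_quandl_code(t, column=4) for t in ["AAPL", "msft"]]
print(codes)
print(from_quandl_code("WIKI/AAPL.4"))
```

This way the rest of your pipeline can keep using the same ticker list as for the other providers.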
4) IEXFinance
IEXCloud is also one of the most popular data sources on the market. It offers up to 15 years of history, takes stock splits into account and has well-written API documentation. The drawback is that the freemium account only gives you a limited number of credits each month, and while I was testing the API I already ran out of them… Nevertheless, it has the same OHLCV data AND some additional metrics, such as the percentage change of the open. Daily, hourly and minute bars are available.
It also has alternative data that you should definitely have a look at.
After creating your free account, feel free to test it with the following code:
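from datetime import datetime
import pandas as pd
from iexfinance.stocks import get_historical_data

def IEXCloudData(Symbols):
    frames = []
    for Symbol in Symbols:
        # datetime objects avoid the day/month ambiguity of strings like '23/06/2022'
        IEXCloud_Dataframe = get_historical_data(Symbol, output_format='pandas', token="YOUR-KEY",
                                                 start=datetime(2007, 1, 1),
                                                 end=datetime(2022, 6, 23))
        IEXCloud_Dataframe['Symbol'] = Symbol
        frames.append(IEXCloud_Dataframe)
    pd.concat(frames).to_csv('DataSets/CSVs/05_IEXCloud_Data.csv')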
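The extra metrics mentioned above, such as the percentage change of the open, are easy to derive yourself from any OHLCV source if you run out of credits. A minimal sketch, assuming a plain list of opening prices ordered oldest first; the function name and the numbers are illustrative and not taken from the IEXCloud API.

```python
def percent_changes(prices):
    """Return the percentage change between consecutive prices.

    The first entry is None because it has no predecessor.
    """
    if not prices:
        return []
    changes = [None]
    for previous, current in zip(prices, prices[1:]):
        changes.append((current - previous) / previous * 100.0)
    return changes

# Made-up opening prices for three consecutive days
opens = [100.0, 102.0, 99.96]
open_changes = percent_changes(opens)
print(open_changes)
```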
5) MarketStack
This service offers the same data as Yahoo! Finance, but its API documentation and usability really need improvement. Check out the code and you will understand what I mean.
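import pandas as pd
import requests

def MarketStackData(Symbols):
    # Replace YOURKEY with your own access key
    params = {'access_key': 'YOURKEY', 'date_from': '2015-01-01', 'date_to': '2022-06-23'}
    frames = []
    for Symbol in Symbols:
        r = requests.get(f'http://api.marketstack.com/v1/tickers/{Symbol}/intraday', params=params)
        # The bars are nested under data -> intraday in the JSON response
        frames.append(pd.DataFrame(r.json()['data']['intraday']))
    pd.concat(frames).to_csv('DataSets/CSVs/06_MarketStackData_Data.csv')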
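The awkward part is the response layout: judging from the code above, the bars sit one level deeper than you might expect, under data and then intraday. The sketch below shows that unpacking step on a hand-written sample payload, so it runs without an access key. Apart from data and intraday, the field names and all values in the payload are assumptions for illustration; check a real response before relying on them.

```python
import json

# Hand-written sample shaped like the response the code above expects (not real data)
sample_response = json.loads("""
{
  "data": {
    "symbol": "AAPL",
    "intraday": [
      {"date": "2022-06-23T15:00:00+0000", "open": 136.8, "close": 137.4},
      {"date": "2022-06-23T14:00:00+0000", "open": 136.1, "close": 136.8}
    ]
  }
}
""")

def extract_bars(payload):
    """Return the intraday bars, tagged with their symbol and sorted oldest first."""
    data = payload.get("data", {})
    symbol = data.get("symbol")
    # Copy each bar and attach the symbol so rows from several tickers can be mixed
    bars = [dict(bar, symbol=symbol) for bar in data.get("intraday", [])]
    # ISO timestamps with the same offset sort correctly as plain strings
    return sorted(bars, key=lambda bar: bar["date"])

bars = extract_bars(sample_response)
print(bars[0]["date"], bars[0]["symbol"], len(bars))
```

The resulting list of dictionaries can be passed straight to pd.DataFrame if you want the same CSV output as above.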
Conclusion:
I am going with Yahoo! Finance, as my ML/RL model will only be based on daily data. If you need hourly or minute-based stock data, go for Alpaca Markets or IEXCloud, as both have a good pricing policy.
Please let me know if you use any other free datasets for stock market data and algorithmic trading.
You can follow me on Twitter
Do not forget to check out: MLAlgotrading.com