Alphalens: An Alpha Factor Analysis Tool

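Alphalens is an open-source alpha factor analysis tool from Quantopian. It runs a fairly comprehensive analysis of an alpha factor and presents the results as charts. We only need to prepare the factor data in a simple form to get the corresponding charts and statistics.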

The open-source library can be found here: Alphalens. The documentation is here: AlphalensDoc.

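The latest version is Alphalens V0.1.2. Because Mindgo disables some libraries, it needs minor modifications before it can be used there. For convenience, I have packaged it into a single file, Alphalens_v12.py.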

We use the circulating market capitalization factor (CMC) as an example:

In [1]:
import pandas as pd
import numpy as np
import datetime

from alphalens_v12 import *

Import the required libraries.

In [3]:
start_date = '20130101'
end_date = '20171023'

trade_period = 'weekly'

market_start_date = datetime.datetime.strptime(start_date,'%Y%m%d') - datetime.timedelta(days=180) # Half a year earlier than the factor start date.
stock_set_start = get_index_stocks('000300.SH',market_start_date.strftime("%Y%m%d"))
stock_set_end = get_index_stocks('000300.SH',end_date)
#    stock_set_start = get_index_stocks('000001.SH',market_start_date.strftime("%Y%m%d"))
#    stock_set_start += get_index_stocks('399106.SZ', start_date)
#    
#    stock_set_end = get_index_stocks('000001.SH',end_date)
#    stock_set_end += get_index_stocks('399106.SZ', end_date)     
stock_list1 = list(set(stock_set_start).intersection(set(stock_set_end)))


trade_days1 = get_trade_days(start_date, end_date, None)


if trade_period == 'weekly':
    trade_day_all = trade_days1[trade_days1.weekday==4]
    trade_data_freq = 'week'
else:
    trade_day_all = trade_days1
    trade_data_freq = '1d'        

print("Total assets: %d, time_period: %d" % (len(stock_list1), len(trade_day_all)))
Total assets: 166, time_period: 232

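Set the backtest period and the stock pool. Here only the CSI 300 (000300.SH) is used: the pool is the set of stocks that are index constituents both half a year before the start date and on the end date. The backtest uses weekly data.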
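
The Friday filter in the cell above relies on `DatetimeIndex.weekday`, where Monday is 0 and Friday is 4. Here is a minimal sketch on a synthetic business-day calendar. The calendar is an assumption standing in for `get_trade_days`, whose return value is taken here to be a `DatetimeIndex`:

```python
import pandas as pd

# Synthetic calendar: all business days in January 2013 (no exchange holidays).
trade_days = pd.bdate_range("2013-01-01", "2013-01-31")

# weekday: Monday=0 ... Friday=4, so this keeps at most one date per week.
fridays = trade_days[trade_days.weekday == 4]
print(list(fridays.strftime("%Y-%m-%d")))
```

Note that a week whose Friday is not a trading day contributes no date at all under this filter.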

In [5]:
count = len(trade_day_all)
price = get_candle_stick(stock_list1, trade_day_all[-1].date().strftime("%Y%m%d"), fre_step = trade_data_freq, fields = ['close'],  skip_paused = False, bar_count = count)

price = pd.Panel.from_dict(price)
price = price.transpose(2,1,0)
price_close = price['close']
k_time_idx = price_close.index
k_assets = price_close.columns

Fetch the price data.
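
`pd.Panel` was deprecated in pandas 0.20 and removed in pandas 0.25, so the cell above only runs on older pandas versions. On newer versions, the same dates-by-assets close table can be built without `Panel`. The sketch below assumes `get_candle_stick` returns a dict mapping each symbol to a DataFrame indexed by date with a `close` column, which is what the `Panel.from_dict` call implies. The sample values are made up.

```python
import pandas as pd

dates = pd.to_datetime(["2017-10-13", "2017-10-20"])

# Hypothetical stand-in for the dict returned by get_candle_stick.
price = {
    "000001.SZ": pd.DataFrame({"close": [11.4, 11.6]}, index=dates),
    "600000.SH": pd.DataFrame({"close": [12.9, 12.7]}, index=dates),
}

# One column per asset, one row per date, aligned on the date index.
price_close = pd.DataFrame({sym: df["close"] for sym, df in price.items()})
print(price_close)
```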

In [6]:
time_str = k_time_idx.strftime('%Y-%m-%d')
df_fac = pd.DataFrame(index = k_time_idx)
for stk in k_assets:
    q = query(
    factor.date,
    factor.current_market_cap
    ).filter(
    factor.symbol == stk,
    factor.date.in_(time_str)
    )
    df_tmp = get_factors(q)
    df_tmp.columns = ['factor_date', stk]
    df_tmp = df_tmp.set_index('factor_date')
    df_fac = df_fac.join(df_tmp, how='left')

Fetch the circulating market cap data.
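
The loop above issues one query per stock and joins each result as a new column. If the factor values were instead available as a single long table, one row per date and symbol, the same wide layout could be produced with a pivot. This is an assumption for illustration, since nothing here shows the Mindgo query API returning data in that form. The column names and values in the sample are illustrative.

```python
import pandas as pd

# Hypothetical long-format factor table: one row per (date, symbol).
long_df = pd.DataFrame({
    "date": pd.to_datetime(
        ["2017-10-13", "2017-10-13", "2017-10-20", "2017-10-20"]),
    "symbol": ["000001.SZ", "600000.SH", "000001.SZ", "600000.SH"],
    "current_market_cap": [1.9e11, 3.6e11, 2.0e11, 3.5e11],
})

# Dates become the index and symbols the columns, matching the layout of df_fac.
df_fac = long_df.pivot(index="date", columns="symbol",
                       values="current_market_cap")
print(df_fac)
```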

In [7]:
df_fac = df_fac.astype(float)  # Note: cast to float (the np.float alias was removed in NumPy 1.24)
ah_factor_data = get_clean_factor_and_forward_returns(df_fac.stack(dropna=False), price_close, quantiles=7, groupby = None, by_group=False, periods = [1,3,5])

Extract and format the factor data, and handle factor quantiles and grouping.
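
Conceptually, this step pairs each (date, asset) factor value with the return realized over the following periods and assigns a per-date quantile bucket. The sketch below reproduces that idea in plain pandas on made-up data. It is not the Alphalens implementation, and details such as how Alphalens handles missing data or bin edges are not covered.

```python
import pandas as pd

dates = pd.date_range("2017-01-06", periods=4, freq="W-FRI")
assets = ["A", "B", "C", "D"]
prices = pd.DataFrame(
    [[10.0, 20.0, 30.0, 40.0],
     [11.0, 20.0, 27.0, 40.0],
     [11.0, 22.0, 27.0, 44.0],
     [12.1, 22.0, 27.0, 44.0]],
    index=dates, columns=assets)
factor = pd.DataFrame([[1.0, 2.0, 3.0, 4.0]] * 4, index=dates, columns=assets)

period = 1
# Forward return: price change from t to t+period, stored at t.
fwd = prices.shift(-period) / prices - 1

# Long format indexed by (date, asset); the last date has no forward return.
data = pd.DataFrame({"factor": factor.stack(), "fwd_ret": fwd.stack()}).dropna()
data.index.names = ["date", "asset"]

# Bucket the factor into 2 quantiles separately on each date (1 = lowest).
data["factor_quantile"] = data.groupby(level="date")["factor"].transform(
    lambda x: pd.qcut(x, 2, labels=False) + 1)

mean_ret = data.groupby("factor_quantile")["fwd_ret"].mean()
print(mean_ret)
```

The per-quantile mean of `fwd_ret` is the kind of quantity summarized in a returns analysis by quantile.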

Next, generate the report in "one click":

In [8]:
create_full_tear_sheet(ah_factor_data, by_group = False)
Quantiles Statistics
                          min           max          mean           std  \
factor_quantile                                                           
1                2.765230e+09  4.267630e+10  1.817348e+10  7.292697e+09   
2                1.030383e+10  5.444801e+10  2.531397e+10  8.022161e+09   
3                1.407545e+10  6.461347e+10  3.126956e+10  9.213377e+09   
4                2.006205e+10  8.551675e+10  3.951727e+10  1.154716e+10   
5                2.739112e+10  1.104275e+11  5.358679e+10  1.635579e+10   
6                3.951080e+10  2.233361e+11  9.174379e+10  3.358346e+10   
7                7.073752e+10  2.215094e+12  3.705376e+11  3.410461e+11   

                 count    count %  
factor_quantile                    
1                 5376  14.457831  
2                 5376  14.457831  
3                 5152  13.855422  
4                 5376  14.457831  
5                 5152  13.855422  
6                 5376  14.457831  
7                 5376  14.457831  
Returns Analysis
                                                    1       3       5
Ann. alpha                                     -0.133  -0.103  -0.108
beta                                           -0.137  -0.161  -0.154
Mean Period Wise Return Top Quantile (bps)     -7.413  -7.811  -7.196
Mean Period Wise Return Bottom Quantile (bps)  40.071  37.163  35.146
Mean Period Wise Spread (bps)                 -47.484 -45.915 -43.309
Information Analysis
                 1      3      5
IC Mean     -0.029 -0.038 -0.042
IC Std.      0.227  0.229  0.222
t-stat(IC)  -1.885 -2.466 -2.811
p-value(IC)  0.061  0.014  0.005
IC Skew      0.075  0.122  0.087
IC Kurtosis -0.376 -0.314  0.067
Ann. IR     -1.999 -2.615 -2.982
Turnover Analysis
                               1      3      5
Quantile 1 Mean Turnover   0.063  0.108  0.136
Quantile 2 Mean Turnover   0.136  0.225  0.285
Quantile 3 Mean Turnover   0.153  0.250  0.318
Quantile 4 Mean Turnover   0.129  0.219  0.277
Quantile 5 Mean Turnover   0.095  0.166  0.211
Quantile 6 Mean Turnover   0.051  0.088  0.109
Quantile 7 Mean Turnover   0.017  0.029  0.036
                                      1     3      5
Mean Factor Rank Autocorrelation  0.996  0.99  0.984
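
The Information Analysis table is built on the information coefficient (IC): for each date, the rank correlation between factor values and subsequent returns across assets. Below is a minimal sketch of that calculation on made-up data, using ranks followed by Pearson correlation (which is equivalent to Spearman correlation). The t-statistic shown is the ordinary one-sample statistic of the IC series against zero, included as an illustrative assumption rather than a copy of the library code.

```python
import numpy as np
import pandas as pd

dates = pd.to_datetime(["2017-01-06", "2017-01-13", "2017-01-20"])
idx = pd.MultiIndex.from_product([dates, ["A", "B", "C", "D"]],
                                 names=["date", "asset"])
data = pd.DataFrame({
    "factor": [1, 2, 3, 4] * 3,
    "fwd_ret": [0.04, 0.03, 0.02, 0.01,   # inverse ranking
                0.01, 0.02, 0.03, 0.04,   # aligned ranking
                0.02, 0.01, 0.04, 0.03],  # partially aligned ranking
}, index=idx)

def rank_ic(group):
    # Spearman correlation computed as Pearson correlation of the ranks.
    return group["factor"].rank().corr(group["fwd_ret"].rank())

# One IC value per date.
ic = data.groupby(level="date").apply(rank_ic)

ic_mean = ic.mean()
ic_std = ic.std(ddof=1)
t_stat = ic_mean / (ic_std / np.sqrt(len(ic)))
print(ic)
print(ic_mean, ic_std, t_stat)
```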

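For a linear factor, the factor value is linearly related to returns. A nonlinear factor requires sensible grouping (that is, finding a suitable explanatory variable to group by) so that the linear relationship can be examined within each group. Alphalens provides a grouping interface for this purpose.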
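
To illustrate why grouping matters, the made-up data below has a factor that is positively related to returns in one group and negatively in the other, so the pooled rank correlation is zero while the within-group correlations are perfect. This is plain pandas for illustration only. The group labels are hypothetical, and the sketch does not use the Alphalens grouping interface itself.

```python
import pandas as pd

df = pd.DataFrame({
    "group": ["A"] * 4 + ["B"] * 4,   # hypothetical group labels
    "factor": [1, 2, 3, 4, 1, 2, 3, 4],
    "fwd_ret": [0.01, 0.02, 0.03, 0.04,   # group A: return rises with factor
                0.04, 0.03, 0.02, 0.01],  # group B: return falls with factor
})

def rank_corr(frame):
    # Spearman correlation computed as Pearson correlation of the ranks.
    return frame["factor"].rank().corr(frame["fwd_ret"].rank())

pooled_ic = rank_corr(df)
group_ic = {name: rank_corr(g) for name, g in df.groupby("group")}
print(pooled_ic, group_ic)
```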

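Besides cross-sectional financial factors, technical indicators can also be analyzed with Alphalens, but they generally need to be converted into factor form beforehand.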
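
The source does not specify how that conversion is done. As one possible reading, a technical indicator can be computed per stock along the time axis and then standardized across stocks on each date, so that values are comparable in the cross section. The sketch below uses the deviation of the close from its moving average as the indicator. The window length, the choice of indicator, and the z-score step are all illustrative assumptions, and the prices are made up.

```python
import pandas as pd

dates = pd.date_range("2017-01-02", periods=4, freq="B")
close = pd.DataFrame({
    "A": [10.0, 10.5, 11.0, 13.0],
    "B": [20.0, 20.0, 20.0, 20.0],
    "C": [30.0, 29.0, 28.0, 27.0],
}, index=dates)

window = 3
# Time-series step: deviation of each close from its own 3-day moving average.
ma = close.rolling(window).mean()
indicator = close / ma - 1

# Cross-sectional step: z-score each date (row) across stocks.
row_mean = indicator.mean(axis=1)
row_std = indicator.std(axis=1)
factor = indicator.sub(row_mean, axis=0).div(row_std, axis=0)
print(factor)
```

The first `window - 1` rows are NaN because the moving average is not yet defined there. The resulting dates-by-assets table has the same layout as `df_fac` above.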
