LightFMについて”
概要
- matrix factorization形に落とし込むレコメンドライブラリ
- adagratや他のオプティマイザが使える
- ランクペアワイズやlogisticやベイジアンなどが使える
具体例
映画のデータセットから学習
データの読み込み
from lightfm import LightFM
from lightfm.datasets import fetch_movielens
from lightfm.evaluation import precision_at_k
# Load the MovieLens 100k dataset. Only five
# star ratings are treated as positive.
data = fetch_movielens(min_rating=5.0)
display(data)
display(data["train"].shape)
"""
{'item_feature_labels': array(['Toy Story (1995)', 'GoldenEye (1995)', 'Four Rooms (1995)', ...,
'Sliding Doors (1998)', 'You So Crazy (1994)',
'Scream of Stone (Schrei aus Stein) (1991)'], dtype=object),
'item_features': <1682x1682 sparse matrix of type '<class 'numpy.float32'>'
with 1682 stored elements in Compressed Sparse Row format>,
'item_labels': array(['Toy Story (1995)', 'GoldenEye (1995)', 'Four Rooms (1995)', ...,
'Sliding Doors (1998)', 'You So Crazy (1994)',
'Scream of Stone (Schrei aus Stein) (1991)'], dtype=object),
'test': <943x1682 sparse matrix of type '<class 'numpy.int32'>'
with 2153 stored elements in COOrdinate format>,
'train': <943x1682 sparse matrix of type '<class 'numpy.int32'>'
with 19048 stored elements in COOrdinate format>}
(943, 1682)
"""
データの例
display(data["train"].todense())
"""
matrix([[5., 0., 0., ..., 0., 0., 0.],
[0., 0., 0., ..., 0., 0., 0.],
[0., 0., 0., ..., 0., 0., 0.],
...,
[5., 0., 0., ..., 0., 0., 0.],
[0., 0., 0., ..., 0., 0., 0.],
[0., 5., 0., ..., 0., 0., 0.]], dtype=float32)
"""
学習と評価
# Instantiate and train the model
model = LightFM(loss='warp')
model.fit(data['train'], epochs=30, num_threads=2)
# Evaluate the trained model
test_precision = precision_at_k(model, data['test'], k=5).mean()
display(test_precision) # 0.05019815
推論
item_size = len(data["item_feature_labels"])
item_ids=np.array(range(item_size))
model.predict(user_ids=0, item_ids=item_ids)
google colab
cold start問題の対応方
if user_index is not None:
predictions = model.predict([user_index, ], np.array(target_item_indices))
else:
predictions = model.predict(0, np.array(target_item_indices), user_features=user_features)
user_features
に値を入れるuser_features
はmodel.get_user_representations(features)
にて生成できる- 参考