During the final stretch of the Zrive Applied Data Science program, I worked with two classmates to build a stock market asset ranking system for ETS Asset Management Factory.

The goal sounded simple on paper: produce a ranking of stocks by expected performance and demonstrate that the strategy can beat the S&P 500 in a realistic evaluation.

Problem framing

We framed the task as: given historical prices and financial variables, can we learn a signal that helps prioritize which assets are more likely to outperform the market?

In practice, the hard part is not training a model. It is setting up the experiment so that no future information leaks into training and the backtest resembles how the system would behave in production.

Approach

ETS provided a dataset covering historical prices and financial features. We built a pipeline with three focus areas:

1) Data and features

  • Clean missing values and standardize inputs.
  • Engineer technical indicators from raw price series.
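As a rough illustration of the feature step above, here is a minimal, dependency-free sketch of two common technical indicators: a trailing moving average and price momentum. The function names, window lengths, and sample prices are assumptions made for the example, not the features or data used in the project. Both indicators look only backwards, so each value depends solely on prices available at that point in time.

```python
from typing import List, Optional


def trailing_mean(prices: List[float], window: int) -> List[Optional[float]]:
    """Mean of the last `window` prices, including the current one."""
    if window <= 0:
        raise ValueError("window must be positive")
    out: List[Optional[float]] = []
    for i in range(len(prices)):
        if i + 1 < window:
            out.append(None)  # not enough history yet
        else:
            out.append(sum(prices[i + 1 - window : i + 1]) / window)
    return out


def momentum(prices: List[float], lookback: int) -> List[Optional[float]]:
    """Return over the last `lookback` periods: p[t] / p[t - lookback] - 1."""
    if lookback <= 0:
        raise ValueError("lookback must be positive")
    out: List[Optional[float]] = []
    for i in range(len(prices)):
        if i < lookback:
            out.append(None)  # no price that far back
        else:
            out.append(prices[i] / prices[i - lookback] - 1.0)
    return out


# Hypothetical closing prices for a single stock.
prices = [100.0, 102.0, 101.0, 105.0, 110.0]
sma3 = trailing_mean(prices, window=3)
mom2 = momentum(prices, lookback=2)
```

With a dataframe library, the same trailing mean is typically a one-liner (for example pandas' `Series.rolling(window).mean()`); the explicit loops here just make the "past data only" property easy to see.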

2) Learning objective

We treated this as a binary classification problem: will this stock outperform the market over the next horizon? That objective maps naturally onto a ranking, because the predicted probability of outperformance gives a score by which stocks can be sorted.
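To make the target concrete, the sketch below builds a binary "outperformed the benchmark" label from two aligned price series. The horizon length, function names, and sample prices are assumptions for illustration; the exact label definition used in the project is not described here. Note that the last `horizon` rows get no label, because their outcome lies in the future.

```python
from typing import List, Optional


def forward_return(prices: List[float], horizon: int) -> List[Optional[float]]:
    """Return from t to t + horizon; None where the future price is unknown."""
    n = len(prices)
    out: List[Optional[float]] = []
    for i in range(n):
        j = i + horizon
        out.append(prices[j] / prices[i] - 1.0 if j < n else None)
    return out


def outperform_labels(
    stock_prices: List[float], benchmark_prices: List[float], horizon: int
) -> List[Optional[int]]:
    """1 if the stock's forward return beats the benchmark's, else 0."""
    if len(stock_prices) != len(benchmark_prices):
        raise ValueError("price series must be aligned and of equal length")
    stock_fwd = forward_return(stock_prices, horizon)
    bench_fwd = forward_return(benchmark_prices, horizon)
    return [
        None if s is None or b is None else int(s > b)
        for s, b in zip(stock_fwd, bench_fwd)
    ]


# Hypothetical, date-aligned prices for one stock and the benchmark index.
stock = [10.0, 11.0, 12.0, 11.0, 13.0]
bench = [100.0, 105.0, 106.0, 110.0, 111.0]
labels = outperform_labels(stock, bench, horizon=2)
```

Rows with a `None` label would be dropped before training, since the label itself is information from the future relative to that row's date.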

We tried multiple algorithms, and LightGBM ended up being the best fit for this type of structured tabular data.
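Whatever classifier is used, the step from scores to a ranking is a sort. The sketch below assumes the model has already produced one outperformance score per ticker for a given date; the tickers, scores, and `top_k` value are hypothetical, and the portfolio construction rule actually used in the project may differ.

```python
from typing import Dict, List, Tuple


def rank_by_score(scores: Dict[str, float], top_k: int) -> List[Tuple[str, float]]:
    """Sort tickers by descending score and keep the top_k entries.

    Ties are broken alphabetically so the ranking is deterministic.
    """
    if top_k < 0:
        raise ValueError("top_k must be non-negative")
    ordered = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return ordered[:top_k]


# Hypothetical predicted probabilities of beating the market.
scores = {"AAA": 0.62, "BBB": 0.48, "CCC": 0.71, "DDD": 0.55}
top2 = rank_by_score(scores, top_k=2)
```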

3) Time-aware training and evaluation

We used a rolling (sliding window) training strategy to respect the time-series nature of the data. Each evaluation window only used information available up to that point, so the model never “peeks” into the future.
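A sliding-window scheme of this kind can be sketched as a small generator over time-ordered period indices. The function name, window sizes, and the optional `gap` parameter are assumptions for illustration, not the project's actual configuration. The `gap` is one way to keep training labels that look forward in time from overlapping the test window.

```python
from typing import Iterator, Optional, Tuple


def sliding_window_splits(
    n_periods: int,
    train_size: int,
    test_size: int,
    step: Optional[int] = None,
    gap: int = 0,
) -> Iterator[Tuple[range, range]]:
    """Yield (train, test) index ranges over time-ordered periods.

    Each train window ends strictly before its test window starts, and
    both windows slide forward by `step` periods (default: test_size).
    """
    if train_size <= 0 or test_size <= 0 or gap < 0:
        raise ValueError("sizes must be positive and gap non-negative")
    step = test_size if step is None else step
    if step <= 0:
        raise ValueError("step must be positive")
    start = 0
    while start + train_size + gap + test_size <= n_periods:
        train_end = start + train_size
        test_start = train_end + gap
        yield range(start, train_end), range(test_start, test_start + test_size)
        start += step


# Example: 10 periods, train on 4, evaluate on the next 2, then slide by 2.
splits = [
    (list(train), list(test))
    for train, test in sliding_window_splits(10, train_size=4, test_size=2)
]
```

In this example the generator produces three folds, and in every fold the largest training index is smaller than the smallest test index, which is the property that prevents look-ahead.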

Results

On the held-out two-year test period, our strategy outperformed the benchmark.

| Metric | Our model | S&P 500 | Delta |
| --- | --- | --- | --- |
| CAGR (2-year test) | 15% | 10% | +5 pp |
| Risk-adjusted returns | Higher | Baseline | - |

In headline terms, the strategy delivered a 15% CAGR over the two-year test period, versus 10% for the S&P 500.
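For readers less familiar with the metric, CAGR is the constant annual growth rate that would turn the starting value into the ending value over the period. The short sketch below shows the formula and what the two reported rates imply when compounded over two years. The helper names are made up for the example, and the growth multiples are simple arithmetic on the stated percentages, not additional reported results.

```python
def cagr(start_value: float, end_value: float, years: float) -> float:
    """Compound annual growth rate: (end / start) ** (1 / years) - 1."""
    if start_value <= 0 or years <= 0:
        raise ValueError("start_value and years must be positive")
    return (end_value / start_value) ** (1.0 / years) - 1.0


def growth_multiple(annual_rate: float, years: float) -> float:
    """Total growth factor from compounding `annual_rate` for `years`."""
    return (1.0 + annual_rate) ** years


# Implied two-year growth for the stated rates.
model_multiple = growth_multiple(0.15, 2)      # about 1.3225x
benchmark_multiple = growth_multiple(0.10, 2)  # about 1.21x

# Round trip: recover the annual rate from the compounded value.
recovered = cagr(100.0, 100.0 * model_multiple, 2)
```

So a 5 percentage point gap in CAGR compounds to roughly 32% versus 21% total growth over the two-year window.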

What I took away

  • The modeling choice mattered less than the evaluation design; time-aware validation was the difference between a demo and a trustworthy result.
  • For this kind of signal, LightGBM delivered a strong accuracy/compute tradeoff and made iteration fast.

Note: this is a student project and not investment advice.