Smart investment report: using neural networks to guide ETF investments

Vicky Costa
Feb 24
9 min read

Updated: Feb 25

I don't know about you, but in the financial world I keep running into the same questions: "When and where to invest?", "How to invest?", "Is this investment good?". As a Data Analyst who aims to become a Data Scientist, I decided I didn't want to rely on guesswork. So I developed a Machine Learning model to guide my monthly contributions into an ETF portfolio.

My process started long before the code. I studied ETF types and Hacienda fees to understand the impact on my income tax. I was looking for a practical answer: which asset should I choose so I don't have to worry about manual tax filings every year? That’s how I landed on accumulating ETFs, which automatically reinvest earnings and simplify tax life.

In this post, I will share the behind-the-scenes of this project, from the technical selection of assets to future scenario simulations.

💼 Asset selection: What are accumulating ETFs (Acc)?

For this test, I chose three ETFs (Exchange Traded Funds). If you are not from the financial world, imagine an ETF is like the Straw Hat Crew (One Piece): instead of fighting alone, you have an entire crew where each member has a different skill, but everyone is on the same boat.

In this world, there are two types of "rewards":

Distributing: It’s as if, at every island conquered, the crew divides the treasure and everyone spends their share. The money hits your account, but the Hacienda (the World Government) shows up immediately to collect taxes on that loot.
Accumulating: It’s like Luffy’s two-year training. The crew doesn’t spend the treasure now; they reinvest everything into training and ship upgrades to get stronger. You don’t see the "berries" in your hand today, but your power level (your net worth) increases exponentially.

I chose accumulating ETFs to simplify my tax life and focus on the long game. For our portfolio, the chosen ones are ICGA.DE (China), EXUS.DE (World ex-USA), and PPFB.DE (Gold).

📊 Unlocking the charts: what do SMA and RSI tell us?

For those who look at charts and only see crossed lines, don't worry! I used two classic indicators, grounded in professional technical analysis, to understand the market sentiment. Here is the technical "ABC" of how they work:

SMA (Simple Moving Average)
- What it is: According to the Corporate Finance Institute, the SMA is the arithmetic average of an asset's closing prices over a specific period. It is called "moving" because as the price changes, the average updates accordingly.
  - SMA20: Average of the last 20 days (tracks the price closely).
  - SMA50: Average of the last 50 days (smoother, shows the medium-term trend).
  - The logic: It is a lagging indicator. It doesn't predict the future, but it confirms where the "herd" is moving.
  - Crossover strategy: When the short SMA (20) crosses below the long SMA (50), the market gives us a warning signal (bearish).
RSI (Relative Strength Index)
- What it is: Developed by J. Welles Wilder in 1978, the RSI is a momentum oscillator. It measures both the speed and the rate of change of price movements on a scale from 0 to 100.
  - Overbought (> 70): The market rose too fast. The asset is "expensive" and might face a correction.
  - Oversold (< 30): The market dropped too much. The asset is "cheap" and might attract buyers.
  - Divergence: As a future Data Scientist, this is the part that fascinates me the most. If the price hits a new low, but the RSI starts to rise, we have a Bullish Divergence, a powerful signal that the downtrend is losing strength even before the price turns around.
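
To make the two indicators concrete, here is a minimal sketch in plain Python. It is not the code from my project: the function names are illustrative, and the 14-period RSI window is an assumption (it is Wilder's classic default, but the post does not state which period was used).

```python
from typing import List, Optional


def sma(prices: List[float], window: int) -> List[Optional[float]]:
    """Simple moving average; None until `window` prices are available."""
    out: List[Optional[float]] = []
    running = 0.0
    for i, price in enumerate(prices):
        running += price
        if i >= window:
            running -= prices[i - window]  # drop the price that left the window
        out.append(running / window if i >= window - 1 else None)
    return out


def bearish_crossovers(short: List[Optional[float]],
                       long: List[Optional[float]]) -> List[int]:
    """Indices where the short SMA goes from at-or-above to below the long SMA."""
    hits = []
    for i in range(1, len(short)):
        values = (short[i - 1], long[i - 1], short[i], long[i])
        if any(v is None for v in values):
            continue
        if values[0] >= values[1] and values[2] < values[3]:
            hits.append(i)
    return hits


def rsi(prices: List[float], period: int = 14) -> List[Optional[float]]:
    """RSI with Wilder's smoothing; None while there is not enough history."""
    out: List[Optional[float]] = [None] * len(prices)
    if len(prices) <= period:
        return out
    changes = [prices[i] - prices[i - 1] for i in range(1, len(prices))]
    avg_gain = sum(max(c, 0.0) for c in changes[:period]) / period
    avg_loss = sum(max(-c, 0.0) for c in changes[:period]) / period

    def to_rsi(gain: float, loss: float) -> float:
        if loss == 0:
            return 100.0  # convention used here when there were no losses
        return 100.0 - 100.0 / (1.0 + gain / loss)

    out[period] = to_rsi(avg_gain, avg_loss)
    for i in range(period, len(changes)):
        avg_gain = (avg_gain * (period - 1) + max(changes[i], 0.0)) / period
        avg_loss = (avg_loss * (period - 1) + max(-changes[i], 0.0)) / period
        out[i + 1] = to_rsi(avg_gain, avg_loss)
    return out
```

With daily closing prices in a list, `sma(prices, 20)` and `sma(prices, 50)` give the two averages, `bearish_crossovers` flags the days where the SMA20 drops below the SMA50, and `rsi(prices)` can be compared against the 70 and 30 levels.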

🔍 Analyzing our assets in practice

Based on these technical concepts, here is what the data showed me in February:

ICGA.DE (China): On Feb 12, we saw the SMA20 cross below the SMA50. Technically, we entered a downtrend. Since the RSI is still at 41 (neutral zone), there is no "oversold" signal yet. In other words: it could drop further.
EXUS.DE (World ex-USA): The RSI hit 73.57 on Feb 11. The market was too euphoric ("overbought"). My model suggested caution, and shortly after, we saw the price stabilize.

PPFB.DE (Gold): It started the month "stretched" (RSI > 80), but now the SMA20 is well above the SMA50, confirming a solid and healthy uptrend.

🧠 Why does this matter for my code?

For my Machine Learning model, these indicators are not just "tips"—they are features. I calculate the SMA and RSI programmatically and feed these values into my neural network. This way, it learns to identify patterns that would take the human eye hours to process.

🤖 Edge macro strategy: using GRU networks to filter market noise

To move beyond isolated technical indicators, I trained a Deep Learning model using GRU (Gated Recurrent Units) neural networks. The key differentiator here is the macro scenario analysis: the model processes variables such as the Dollar, VIX (fear index), and Interest Rates to understand the global context before suggesting a move.

By switching to a weekly analysis setup, I managed to filter out daily news "noise" and the notorious bull traps, focusing on the actual medium-term trend. This approach noticeably improved the results, with accuracy reaching 67% for EXUS.DE and 61% for PPFB.DE.

I opted for GRU over LSTM because GRU is computationally more efficient and delivers similar results for smaller financial time series, making it the optimal choice for this specific dataset.
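
The efficiency argument can be made concrete by counting trainable weights. In the textbook formulation, a GRU layer has three weight blocks (update gate, reset gate, candidate state), while an LSTM has four (input, forget, and output gates plus the candidate cell state). The sketch below is only that arithmetic: the 8 input features and 32 hidden units are hypothetical numbers, not my model's real configuration, and exact counts vary slightly between frameworks because some add a second bias vector.

```python
def recurrent_params(input_dim: int, hidden: int, blocks: int) -> int:
    """Each block has an input matrix, a recurrent matrix, and one bias vector."""
    per_block = hidden * input_dim + hidden * hidden + hidden
    return blocks * per_block


def gru_params(input_dim: int, hidden: int) -> int:
    return recurrent_params(input_dim, hidden, blocks=3)


def lstm_params(input_dim: int, hidden: int) -> int:
    return recurrent_params(input_dim, hidden, blocks=4)


# Hypothetical layer: 8 input features, 32 hidden units.
print(gru_params(8, 32))   # 3936
print(lstm_params(8, 32))  # 5248
```

Under these assumptions the GRU layer carries three quarters of the weights of an equally sized LSTM, which is where the lower training cost comes from.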

📊 Performance KPI: the reality test

How do I know if the model is truly performing or if it’s just luck? To answer that, I cross-referenced the Sharpe Ratio (risk-adjusted return) with the Correlation Matrix (how much the assets move together).

When analyzing the Sharpe Ratio, we follow these criteria:

Above 1.0: Excellent. Elite efficiency—the return significantly outweighs the risk.
Between 0.4 and 0.6: Market standard for moderate assets.
Below 0: Dangerous. The return does not justify the risk taken (better to stick with fixed income).

Asset	Annual Return	Sharpe Ratio	Max Drawdown
PPFB	23.59%	1.41	-13.01%
EXUS	15.30%	1.05	-16.21%
ICGA	-1.05%	-0.11	-52.08%
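
For readers who want to reproduce the two risk columns, here is a minimal sketch of how a Sharpe Ratio and a Max Drawdown are commonly computed. The weekly frequency (52 periods per year) and the 0% risk-free rate are assumptions for illustration; the figures in the table came from my own pipeline, whose settings may differ.

```python
import math
from statistics import mean, stdev
from typing import List


def annualized_sharpe(returns: List[float],
                      periods_per_year: int = 52,
                      risk_free_annual: float = 0.0) -> float:
    """Mean excess return over its standard deviation, scaled to one year."""
    rf_per_period = risk_free_annual / periods_per_year
    excess = [r - rf_per_period for r in returns]
    spread = stdev(excess)  # sample standard deviation
    if spread == 0:
        raise ValueError("returns have zero volatility")
    return mean(excess) / spread * math.sqrt(periods_per_year)


def max_drawdown(prices: List[float]) -> float:
    """Largest peak-to-trough drop, as a negative fraction (e.g. -0.5 = -50%)."""
    peak = prices[0]
    worst = 0.0
    for price in prices:
        peak = max(peak, price)
        worst = min(worst, price / peak - 1.0)
    return worst


# A price that climbs to 120 and then falls to 60 has lost half from its peak.
print(max_drawdown([100.0, 120.0, 60.0, 90.0]))  # -0.5
```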

Currently, our correlation is extremely high (> 0.84). In practice, Luffy’s crew is all on the same ship facing the same storm; if one asset sinks, the others will hardly stay dry.

As a future Data Scientist, I understand that while the model looks at the past, the analyst looks at the landscape: with China’s RSI at 41.78, the asset is in consolidation. Future potential remains on the radar, but the correlation matrix warns me that my next technical step is to seek assets that "move against the tide" to protect my capital.

📖 Reading guide:

0.70 to 1.00: High Correlation. Caution! Assets move together (Our current scenario).
0.30 to 0.69: Moderate Correlation.
Below 0.30: Low Correlation. The "Holy Grail" of diversification.
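
The reading guide above translates directly into code. This is a minimal sketch: `pearson` is the standard correlation coefficient, and `correlation_band` applies the three ranges from the guide. Feeding it periodic returns rather than raw prices is an assumption on my side, though it is the usual practice.

```python
import math
from typing import List


def pearson(x: List[float], y: List[float]) -> float:
    """Pearson correlation coefficient between two equally long series."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("series must have the same length (at least 2)")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def correlation_band(r: float) -> str:
    """Classify a correlation using the ranges from the reading guide."""
    if r >= 0.70:
        return "high"
    if r >= 0.30:
        return "moderate"
    return "low"


print(correlation_band(0.84))  # high
```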

🔮 Future projection and Monte Carlo simulation

The market is not a straight line. To move beyond mere assumptions, I simulated an initial investment of €1,000 with monthly contributions of €150. But instead of a basic linear calculation, I ran a Monte Carlo Simulation.

What exactly is this Monte Carlo simulation? Well, imagine you want to predict the outcome of a hiking trip. Instead of just calculating the "average time," you simulate the journey 500 times: in some versions, it rains; in others, the terrain is dry; in some, you twist your ankle; and in others, you find a shortcut.

The goal here is to simulate thousands of possible scenarios (crises, booms, and stability) to understand the real probability of success and how wealth behaves over the long term. In finance, Monte Carlo does exactly that:

It uses the historical volatility and past returns of the assets (Gold, China, World).
It generates thousands of random "paths" for the future.
What it’s for: It shows us that investing is not a fixed number, but a probability distribution. It helps us answer: "What is the real risk of ending up with less money than I invested?"
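
A simulation of this kind can be sketched in a few lines. The €1,000 start, the €150 monthly contribution, and the 500 paths come from this post; the monthly mean return (`mu_monthly`) and volatility (`sigma_monthly`) are placeholder values, and drawing returns from a normal distribution is a simplifying assumption. My real run used the historical statistics of the three ETFs, so this sketch will not reproduce my numbers.

```python
import random
from typing import List


def total_invested(months: int, initial: float = 1000.0,
                   monthly: float = 150.0) -> float:
    """Money paid in: the initial deposit plus one contribution per month."""
    return initial + monthly * months


def simulate_final_values(months: int, initial: float = 1000.0,
                          monthly: float = 150.0,
                          mu_monthly: float = 0.006,    # placeholder mean return
                          sigma_monthly: float = 0.04,  # placeholder volatility
                          n_paths: int = 500, seed: int = 42) -> List[float]:
    """Sorted final portfolio values, one per simulated path."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    finals = []
    for _ in range(n_paths):
        value = initial
        for _ in range(months):
            value *= 1.0 + rng.gauss(mu_monthly, sigma_monthly)  # random month
            value += monthly  # contribution at the end of the month
        finals.append(value)
    return sorted(finals)


def percentile(sorted_values: List[float], q: float) -> float:
    """Nearest-rank percentile of an already sorted list, q between 0 and 1."""
    index = int(round(q * (len(sorted_values) - 1)))
    return sorted_values[index]


finals = simulate_final_values(months=120)
print("invested:", total_invested(120))
print("worst case (10%):", round(percentile(finals, 0.10), 2))
print("median:", round(percentile(finals, 0.50), 2))
print("best case (90%):", round(percentile(finals, 0.90), 2))
```

The three percentiles correspond to the "Worst Case", "Probable Outcome", and "Best Case" columns of a results table.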

After 500 simulations, these are the results:

Timeframe	Worst Case (10%)	Probable Outcome (Median)	Best Case (90%)	Total Invested
1 Year	€2,469.25	€2,886.47	€3,433.86	€2,800.00
5 Years	€8,568.95	€11,630.49	€16,631.76	€10,000.00
10 Years	€16,699.45	€26,520.99	€41,421.31	€19,000.00

What the data tells us:

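Probability of Success: There is a 90% chance of ending with more than €16,699 in 10 years, even if the market faces poor conditions. Keep in mind that this floor sits below the €19,000 contributed, so the worst scenarios still end at a loss.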
Realistic Expectation: The most likely outcome (median) is ~€26,520, provided the contribution discipline is maintained.
Risk Management: These numbers assume asset continuity. While ETF liquidation risk exists, choosing funds from major asset managers minimizes this possibility, ensuring the primary focus remains on market volatility rather than product failure.

💎 Advantage: What is "Alpha"?

My code generated a projected Alpha of +€132.44 over the next 10 years. But what does this mean in practice?

In the financial world, Alpha is excess return. It is the profit you earn above what the market would naturally deliver on its own. In my case, this value represents the "discipline premium."

Imagine the market is euphoric, and everyone is blindly buying China (ICGA.DE). Without the model, you would invest €150 without a second thought. However, the model detects an RSI of 80 (overbought), and the SMA signals a downturn. The code warns you: "Do not buy now."

Scenario A (without the model): You buy for €150, and the asset drops 10% the next day. You immediately lose €15 of value.
Scenario B (with the model's Alpha): You save those €15 by waiting for the price to stabilize.

That "money that stayed in your pocket," or was gained by entering at the exact right moment, is your Alpha. Over the long run, this precision accumulated an extra €132.44. In practice, this bonus offsets all brokerage fees, effectively making your investment "zero cost."
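
The arithmetic behind this example is simple enough to check by hand. The sketch below only restates it; the helper names are illustrative and not part of my project.

```python
def avoided_loss(contribution: float, drop: float) -> float:
    """Value lost by buying right before a drop of `drop` (0.10 = 10%)."""
    return contribution * drop


def monthly_premium(total_alpha: float, years: int) -> float:
    """Spread a projected Alpha evenly over the monthly contributions."""
    return total_alpha / (years * 12)


print(round(avoided_loss(150.0, 0.10), 2))    # 15.0
print(round(monthly_premium(132.44, 10), 2))  # 1.1
```

In other words, €132.44 over 10 years works out to roughly €1.10 per monthly contribution.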

🧠 The Model's "Brain": Why trust these signals?

Most forecasting models fail because they only look at an asset's own past. My differentiator here is the Macro Edge. I don’t just train the model on price; I feed it the variables that move the world:

DX-Y.NYB (US Dollar Index): The global thermometer. When the dollar rises, the world shakes.
^VIX (Fear Index): Measures Wall Street's anxiety.
^TNX (10-Year Treasury Yield): The cost of money. If yields rise, risk assets (like China) suffer.

The Weekly GRU Architecture

Instead of analyzing daily data, which is full of "noise" and irrelevant news, I configured a 12-week window (roughly a quarter). This allows the GRU neural network to understand the economic cycle rather than just a random Tuesday's market hiccup.
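
A 12-week window can be built with a simple sliding slice. This is a minimal sketch with illustrative names: each row is assumed to hold one week's features (for example SMA, RSI, and the macro series), and the target for each window is assumed to be whatever happened in the week right after it, such as 1 if the price rose and 0 otherwise.

```python
from typing import List, Sequence, Tuple


def make_windows(rows: Sequence[Sequence[float]],
                 targets: Sequence[int],
                 window: int = 12) -> Tuple[List[List[Sequence[float]]], List[int]]:
    """Pair each run of `window` consecutive weeks with the following week's target."""
    if len(rows) != len(targets):
        raise ValueError("rows and targets must have the same length")
    samples, labels = [], []
    for end in range(window, len(rows)):
        samples.append(list(rows[end - window:end]))  # weeks end-window .. end-1
        labels.append(targets[end])                   # outcome of week `end`
    return samples, labels


# 15 weeks of toy data produce 3 training samples of 12 weeks each.
weeks = [[float(i)] for i in range(15)]
outcomes = [i % 2 for i in range(15)]
X, y = make_windows(weeks, outcomes)
print(len(X), len(X[0]), y)  # 3 12 [0, 1, 0]
```

Keeping the target strictly after the window matters: if the labelled week leaked into the inputs, the model would be looking at the answer.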

Because the model processed these variables, it generated the probabilities:

PPFB.DE (Gold): With a 62.88% probability of rising, the signal is STRONG BUY 🚀. The model detected that, given the current macro landscape, gold is the safe haven.
EXUS and ICGA: Both remained in the ~55% range, which for my model is a NEUTRAL ⚖️ signal. This means that while there is a slight upward bias, the macro risk (Dollar and Yields) is still too high for a heavy entry right now.

🎯 I am currently studying Data Science, and my goal with this project was to take knowledge out of textbooks and apply it to something real that impacts my financial future. The model is not a crystal ball, but I now have a clear compass.

Based on the data processed today, the system generated the following signals for early March:

🚦 Model Signals (March 2026 Forecast)

EXUS.DE (World ex-USA): 46.32% Upward Prob. | Status: WAIT. Downtrend detected. Best entry in 10 days (Target: 37.49 EUR).
PPFB.DE (Gold): 59.43% Upward Prob. | Status: BUY AUTHORIZED. Price in support zone.
ICGA.DE (China): 42.46% Upward Prob. | Status: WAIT. Downtrend detected. Best entry in 10 days (Target: 5.16 EUR).
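
To show how a probability can become a status label, here is a purely illustrative mapping. The cut-offs (0.60, 0.57, 0.50) are hypothetical values I picked so that they agree with the examples quoted in this post; they are not the thresholds in my code, which also takes trend and support zones into account.

```python
def signal_from_probability(p_up: float) -> str:
    """Map an upward probability (0 to 1) to a status, using hypothetical cut-offs."""
    if not 0.0 <= p_up <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    if p_up >= 0.60:
        return "STRONG BUY"
    if p_up >= 0.57:
        return "BUY AUTHORIZED"
    if p_up >= 0.50:
        return "NEUTRAL"
    return "WAIT"


for ticker, p in [("EXUS.DE", 0.4632), ("PPFB.DE", 0.5943), ("ICGA.DE", 0.4246)]:
    print(ticker, signal_from_probability(p))
```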

Unless there are "bombshell" news events (the famous black swans), the model should perform perfectly—or not, we'll find out soon enough (lol). I will validate this information on March 6th, 2026, and adjust the sails if necessary. After all, as Luffy would say: "No matter how hard or impossible it is, never lose sight of your goal."

📌 Update (Feb 25, 2026)

Edit: After further code review and refining the model's architecture, I expanded the macro features from 9 to 18 variables. This upgrade significantly changed the output for our analysis:

Expanded Scope: By including metrics like Credit Stress (HYG), Relative Strength (XLF/SPY), and Emerging Markets flow (EEM), the model is now robust enough to analyze assets across different markets, including Brazil (B3), Europe, and the US.
The China Turnaround: The most notable change was in ICGA.DE (China). While the earlier 9-variable model suggested a "Wait" signal, the new "Ultra-Macro" version identified a massive liquidity inflow, shifting the probability from 42% to 72.76%, a Strong Buy 🚀.
Alpha Impact: This refinement caused our projected Alpha to jump from €132 to an impressive €667.24 over 10 years, representing a monthly "discipline premium" of €5.56.