An opinionated research sandbox for turning raw market data into actionable stock signals.
This project combines technical indicators, valuation screens, point-in-time-inspired fundamentals, sentiment scoring, and multi-factor ranking into a notebook-first workflow for stock research. It is built for fast iteration: inspect a single name, simulate decisions, then scale the same logic across entire indexes.
The mission is to build a practical market intelligence engine that helps surface conviction faster than manual chart-reading, headline-scanning, and spreadsheet triage ever could.
The longer-term vision is to evolve this codebase into a serious retail-to-pro research platform:
- A place where price action, business quality, valuation, and market narrative are scored in one system.
- A repeatable workflow for moving from idea generation to signal validation to portfolio action.
- A modular base for future automation, dashboards, backtests, portfolio overlays, and factor research.
Most stock research workflows are fragmented:
- Charts live in one app.
- Financial statements live in another.
- News and social chatter live somewhere else.
- Conviction ends up living in gut feel.
This codebase closes that gap. It pulls those inputs into one Python-driven pipeline and converts them into ranked outputs you can inspect, challenge, and improve.
For an individual ticker, the project can:
- Fetch historical price data from Yahoo Finance.
- Compute technical indicators such as RSI, SMA, EMA, MACD, Bollinger Bands, ATR, VWAP, and stochastic oscillators.
- Score bullish and bearish technical conditions.
- Pull key valuation statistics and convert them into a fundamental score.
- Derive expanded financial quality, growth, risk, and balance-sheet metrics from statements.
- Optionally layer in sentiment from Reddit and recent news using OpenAI scoring.
- Blend everything into a final numeric signal and a human-readable label:
STRONG SELL, WEAK SELL, HOLD, WEAK BUY, STRONG BUY.
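As a rough illustration of the indicator step, the sketch below computes a simple moving average and a Wilder-style RSI with pandas. The function names `sma` and `rsi`, the 14-period default, and the sample prices are assumptions made for this example; they are not taken from `technical_analysis.py`.

```python
import pandas as pd


def sma(close: pd.Series, window: int) -> pd.Series:
    """Simple moving average over `window` observations."""
    return close.rolling(window=window).mean()


def rsi(close: pd.Series, period: int = 14) -> pd.Series:
    """Relative Strength Index using Wilder's smoothing (EMA with alpha = 1/period)."""
    delta = close.diff()
    gain = delta.clip(lower=0)    # positive moves only
    loss = -delta.clip(upper=0)   # negative moves, expressed as positive numbers
    avg_gain = gain.ewm(alpha=1 / period, min_periods=period, adjust=False).mean()
    avg_loss = loss.ewm(alpha=1 / period, min_periods=period, adjust=False).mean()
    rs = avg_gain / avg_loss
    return 100 - 100 / (1 + rs)


# Illustrative closing prices (made up for the example).
prices = pd.Series(
    [10, 11, 12, 11, 13, 14, 13, 15, 16, 15, 17, 18, 17, 19, 20, 19],
    dtype=float,
)
sma_5 = sma(prices, 5)
rsi_14 = rsi(prices)
```

The first `window - 1` SMA values are NaN by design, and RSI only becomes available once `period` price changes have been observed.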
The simulation workflow tests what would have happened if you had followed the generated signals over time.
It tracks:
- buy and sell decisions,
- cash balance,
- units held,
- average cost basis,
- portfolio value,
- total profit and loss.
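The bookkeeping above can be sketched in a few lines of Python. This is a minimal illustration, not the logic in `simulation.py`: the `Portfolio` class, the `simulate` function, and the sizing rule (spend half of the cash on WEAK BUY, all of it on STRONG BUY, sell everything on any sell signal) are assumptions chosen to keep the example small.

```python
from dataclasses import dataclass


@dataclass
class Portfolio:
    cash: float
    units: float = 0.0
    avg_cost: float = 0.0

    def buy(self, price: float, spend: float) -> None:
        spend = min(spend, self.cash)
        if spend <= 0:
            return
        total_cost = self.avg_cost * self.units + spend
        self.units += spend / price
        self.avg_cost = total_cost / self.units  # weighted average cost basis
        self.cash -= spend

    def sell_all(self, price: float) -> None:
        self.cash += self.units * price
        self.units = 0.0
        self.avg_cost = 0.0

    def value(self, price: float) -> float:
        return self.cash + self.units * price


def simulate(prices, signals, initial_funds=100.0):
    """Replay signals over prices and record the portfolio state at each step."""
    pf = Portfolio(cash=initial_funds)
    trace = []
    for price, signal in zip(prices, signals):
        if signal in ("WEAK BUY", "STRONG BUY"):
            fraction = 1.0 if signal == "STRONG BUY" else 0.5  # assumed sizing rule
            pf.buy(price, pf.cash * fraction)
        elif signal in ("WEAK SELL", "STRONG SELL"):
            pf.sell_all(price)
        value = pf.value(price)
        trace.append({
            "price": price, "signal": signal, "cash": pf.cash,
            "units": pf.units, "avg_cost": pf.avg_cost,
            "value": value, "pnl": value - initial_funds,
        })
    return trace


trace = simulate(
    [10.0, 8.0, 12.0, 12.0],
    ["WEAK BUY", "STRONG BUY", "HOLD", "STRONG SELL"],
)
```

In this toy run the second purchase at a lower price pulls the average cost basis below 10, and the final sale at 12 leaves the portfolio entirely in cash.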
The index search workflow expands the same logic across major benchmarks and ranks constituents by signal strength.
Currently supported index universes:
- dow30
- nasdaq100
- sp500
- ftse100
- ftse250
The project also caches index constituents locally to reduce repeated scraping and speed up scans.
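One simple way to implement this kind of cache is a JSON file per index with an age check. The sketch below is an assumption about how such a cache could look, not a copy of `dao.py`: `load_constituents`, the `fetch` callable, and the one-day default expiry are all hypothetical.

```python
import json
import tempfile
import time
from pathlib import Path


def load_constituents(index_name, fetch, cache_dir="cache", max_age_seconds=86_400):
    """Return tickers for `index_name`, reusing a JSON cache file while it is fresh."""
    path = Path(cache_dir) / f"{index_name}.json"
    if path.exists() and time.time() - path.stat().st_mtime < max_age_seconds:
        return json.loads(path.read_text())
    tickers = fetch(index_name)  # hypothetical callable that scrapes the constituent list
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(tickers))
    return tickers


# Demonstration with a stand-in fetcher that records how often it is called.
calls = []


def fake_fetch(name):
    calls.append(name)
    return ["AAPL", "MSFT"]


with tempfile.TemporaryDirectory() as tmp:
    first = load_constituents("dow30", fake_fetch, cache_dir=tmp)
    second = load_constituents("dow30", fake_fetch, cache_dir=tmp)  # served from cache
```

The second call never reaches the fetcher, which is the behaviour that keeps repeated index scans fast.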
This is not just an indicator notebook. The interesting part is the signal fusion:
- Technical analysis captures market behavior.
- Fundamental analysis captures valuation.
- Derived financial metrics capture quality, growth, and strength.
- Sentiment analysis captures narrative pressure.
- A multi-factor model turns those dimensions into a unified ranking score.
That structure makes the project much more extensible than a one-off notebook script.
The codebase is built around a simple idea:
Strong stock decisions should not depend on one lens.
Instead of betting everything on chart setups or valuation alone, this engine asks:
- Is the price trend constructive?
- Is the company cheap, expensive, or fairly priced?
- Is the business improving in quality and growth?
- Is the balance sheet resilient?
- Is the market narrative supportive or deteriorating?
When those signals align, the result should be more meaningful than any single factor on its own.
stock analysis/
|-- 01_specific_stock_minima_maxima.ipynb
|-- 02_specific_stock_simulation.ipynb
|-- 03_index_search.ipynb
|-- analysis_functions/
| |-- technical_analysis.py
| |-- fundamental_analysis.py
| |-- sentiment_analysis.py
| |-- multifactor_analysis.py
|-- analysis_interfaces/
| |-- interface_specific_stock.py
| |-- interface_index_search.py
|-- analysis_types/
| |-- prediction.py
| |-- simulation.py
|-- dao/
| |-- dao.py
|-- config/
| |-- config.py
| |-- example.config.ini
| |-- logging_config.py
|-- cache/
|-- output/
|-- portfolio/
dao/ is the data access layer. It is responsible for:
- Yahoo Finance price history
- Yahoo Finance key stats
- advanced statement-derived financial inputs
- Google News RSS ingestion
- Reddit post collection
- OpenAI sentiment scoring requests
- index constituent collection and caching
analysis_functions/ is the core analytics engine.
- technical_analysis.py: momentum, trend, volatility, and volume indicators
- fundamental_analysis.py: valuation, quality, growth, and financial-strength scoring
- sentiment_analysis.py: Reddit/news ingestion and sentiment aggregation
- multifactor_analysis.py: composite factor model with weighted ranking outputs
analysis_types/ is the transformation layer between raw analytics and usable outputs.
- prediction.py: builds the end-to-end prediction frame and final signal
- simulation.py: converts signals into trading simulations and portfolio traces
analysis_interfaces/ is the notebook-facing orchestration layer.
- interface_specific_stock.py: single ticker analysis and simulation helpers
- interface_index_search.py: index scan and ranking workflows
01_specific_stock_minima_maxima.ipynb covers focused single-name analysis. The notebook currently shows a quick research workflow around a chosen ticker, using a short period suited to inspecting recent signals.
Example parameters already present in the notebook:
- ticker = "NVDA"
- initial_funds = 100
- include_sentiment = False
- period = "5d"
02_specific_stock_simulation.ipynb runs a single-stock strategy simulation over a longer window.
Example parameters already present in the notebook:
- ticker = "TSLA"
- initial_funds = 100
- include_sentiment = False
- period = "1y"
03_index_search.ipynb batch-scans multiple indexes and ranks names by signal strength.
Example parameters already present in the notebook:
- index_names = ["dow30", "nasdaq100", "sp500", "ftse100", "ftse250"]
- per-index limits via limits_by_index
- include_sentiment = False
- period = "1y"
The final signal is built from four major components:
- Technical score
- Sentiment score
- Fundamental valuation score
- Multi-factor composite score
Those components are blended using configurable weights from config/config.ini.
The shipped example config exposes:
- technical weight
- sentiment weight
- fundamental weight
- multifactor weight
- strong/weak buy and sell thresholds
- technical indicator parameters
- cache configuration
- logging configuration
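To make the blending step concrete, here is a minimal sketch of a weighted blend followed by threshold-based labelling. The weight and threshold values, the assumption that every component score lies in [-1, 1], and the choice to renormalise weights when a component (such as sentiment) is missing are all illustrative guesses; the real values live in config/config.ini and the real logic in `prediction.py`.

```python
# Hypothetical weights and thresholds; the shipped config may use different values.
WEIGHTS = {"technical": 0.4, "sentiment": 0.1, "fundamental": 0.2, "multifactor": 0.3}
THRESHOLDS = {"strong_buy": 0.6, "weak_buy": 0.2, "weak_sell": -0.2, "strong_sell": -0.6}


def blend(scores, weights=WEIGHTS):
    """Weighted average of component scores, skipping components that are None."""
    used = {name: w for name, w in weights.items() if scores.get(name) is not None}
    total = sum(used.values())
    if total == 0:
        return 0.0
    # Renormalise so the remaining weights still sum to 1 (an assumed design choice).
    return sum(scores[name] * w for name, w in used.items()) / total


def label(score, thresholds=THRESHOLDS):
    """Map a numeric signal onto the five human-readable labels."""
    if score >= thresholds["strong_buy"]:
        return "STRONG BUY"
    if score >= thresholds["weak_buy"]:
        return "WEAK BUY"
    if score <= thresholds["strong_sell"]:
        return "STRONG SELL"
    if score <= thresholds["weak_sell"]:
        return "WEAK SELL"
    return "HOLD"


# Sentiment disabled, as with include_sentiment = False.
score = blend({"technical": 0.8, "sentiment": None, "fundamental": 0.5, "multifactor": 0.5})
signal = label(score)
```

With these made-up inputs the blended score is 0.57 / 0.9, roughly 0.63, which clears the assumed strong-buy cutoff.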
The multi-factor layer is one of the strongest parts of the repository.
It currently incorporates:
- value
- quality
- momentum
- sentiment
- risk
- liquidity
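A common way to combine factors like these is to standardise each one across the universe and take a weighted sum. The sketch below assumes that approach; the weights, the "higher is better" convention for every column, and the `composite_rank` function are hypothetical and may differ from what `multifactor_analysis.py` does.

```python
import pandas as pd

# Hypothetical factor weights (sum to 1.0).
FACTOR_WEIGHTS = {
    "value": 0.20, "quality": 0.20, "momentum": 0.20,
    "sentiment": 0.10, "risk": 0.15, "liquidity": 0.15,
}


def composite_rank(factors: pd.DataFrame, weights=FACTOR_WEIGHTS) -> pd.DataFrame:
    """Z-score each factor across tickers, blend with weights, and rank descending."""
    cols = list(weights)
    z = (factors[cols] - factors[cols].mean()) / factors[cols].std(ddof=0)
    z = z.fillna(0.0)  # a constant column yields 0/0; treat it as neutral
    out = factors.copy()
    out["composite"] = sum(z[col] * w for col, w in weights.items())
    out["rank"] = out["composite"].rank(ascending=False, method="first").astype(int)
    return out.sort_values("rank")


# Made-up factor scores where higher always means more attractive.
factors = pd.DataFrame(
    {
        "value": [1, 2, 3], "quality": [1, 2, 3], "momentum": [1, 2, 3],
        "sentiment": [2, 2, 2], "risk": [1, 2, 3], "liquidity": [3, 2, 1],
    },
    index=["AAA", "BBB", "CCC"],
    dtype=float,
)
ranked = composite_rank(factors)
```

Standardising first keeps a factor with a large numeric range from dominating the composite simply because of its units.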
Under the hood, that includes metrics such as:
- ROIC
- ROE
- gross and operating margins
- free cash flow margin
- debt ratios
- interest coverage
- EPS growth
- revenue growth
- EBITDA growth
- Piotroski F-Score
- Altman Z-Score
- 3/6/12 month returns
- historical volatility
- maximum drawdown
- Sharpe ratio
- Sortino ratio
- analyst recommendation score
- price target upside
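Several of the risk metrics in this list have standard textbook definitions. The sketch below shows one conventional way to compute maximum drawdown, Sharpe ratio, and Sortino ratio from a price series; the function names, the 252-trading-day annualisation, and the zero risk-free rate are assumptions for the example and not necessarily what the repository uses.

```python
import numpy as np
import pandas as pd


def max_drawdown(close: pd.Series) -> float:
    """Largest peak-to-trough decline, returned as a negative fraction."""
    running_peak = close.cummax()
    return float((close / running_peak - 1.0).min())


def sharpe_ratio(returns: pd.Series, risk_free_rate=0.0, periods_per_year=252) -> float:
    """Annualised mean excess return divided by its standard deviation."""
    excess = returns - risk_free_rate / periods_per_year
    return float(excess.mean() / excess.std(ddof=1) * np.sqrt(periods_per_year))


def sortino_ratio(returns: pd.Series, risk_free_rate=0.0, periods_per_year=252) -> float:
    """Like Sharpe, but penalises only downside deviation below the target."""
    excess = returns - risk_free_rate / periods_per_year
    downside = np.sqrt((excess.clip(upper=0) ** 2).mean())
    return float(excess.mean() / downside * np.sqrt(periods_per_year))


# Made-up closing prices: a rise, a 10% pullback from the peak, then a recovery.
close = pd.Series([100.0, 110.0, 99.0, 104.5, 120.0])
returns = close.pct_change().dropna()

mdd = max_drawdown(close)
sharpe = sharpe_ratio(returns)
sortino = sortino_ratio(returns)
```

Both ratio functions divide by a dispersion measure, so a series with no variation (or no negative returns, for Sortino) needs separate handling in real use.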
This project goes beyond naive “latest fundamentals everywhere” logic.
The prediction pipeline includes a point-in-time-style financial snapshot process that:
- reads historical financial statement dates,
- applies availability lags to approximate when reports would have been known,
- maps market observations to the latest available fundamentals at that point in time.
That is a meaningful upgrade over simplistic backtests that leak future information into old decisions.
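The three steps above map naturally onto `pandas.merge_asof`. The following is a minimal sketch of the idea rather than the repository's implementation: the column names, the `attach_point_in_time` function, and the flat 60-day availability lag are assumptions.

```python
import pandas as pd


def attach_point_in_time(prices: pd.DataFrame, fundamentals: pd.DataFrame,
                         lag_days: int = 60) -> pd.DataFrame:
    """Join each price row to the latest fundamentals that would have been known then."""
    known = fundamentals.copy()
    # Approximate the publication date as period end plus a fixed lag (assumption).
    known["available_date"] = known["period_end"] + pd.Timedelta(days=lag_days)
    known = known.sort_values("available_date")
    return pd.merge_asof(
        prices.sort_values("date"),
        known,
        left_on="date",
        right_on="available_date",
        direction="backward",  # only look at reports available on or before the date
    )


prices = pd.DataFrame({
    "date": pd.to_datetime(["2024-02-15", "2024-03-15", "2024-06-15"]),
    "close": [100.0, 105.0, 110.0],
})
fundamentals = pd.DataFrame({
    "period_end": pd.to_datetime(["2023-12-31", "2024-03-31"]),
    "eps": [1.0, 1.5],
})
snapshot = attach_point_in_time(prices, fundamentals)
```

In this example the February price row gets no fundamentals at all, because the year-end report is treated as unavailable until 60 days after the period closes. That missing value is the point: a naive join on the latest statement would have leaked it backwards.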
- Python
- Jupyter Notebook
- pandas
- numpy
- matplotlib
- yfinance
- requests
- BeautifulSoup (bs4)
- OpenAI Python SDK
- built-in logging with rotating file handlers
python -m venv .venv
.venv\Scripts\Activate.ps1
There is no pinned requirements.txt yet, so install the libraries used by the codebase directly:
pip install pandas numpy matplotlib yfinance requests beautifulsoup4 openai lxml html5lib notebook
Use the example config as your template:
Copy-Item config\example.config.ini config\config.ini
Then update config/config.ini with your local settings, especially:
- chatgpt_key
- signal weights
- buy/sell thresholds
- cache and logging preferences
Launch Jupyter and open any of the three notebooks:
jupyter notebook
Suggested progression:
- Start with 01_specific_stock_minima_maxima.ipynb to inspect a single ticker.
- Move to 02_specific_stock_simulation.ipynb to evaluate signal-driven trading behavior.
- Use 03_index_search.ipynb to scan broader universes and rank opportunities.
- cache/: cached index constituent lists
- logs/: rotating application logs
- output/: exported workflow outputs such as signal tables
- portfolio/: local portfolio state and portfolio-related artifacts
The configuration layer is intentionally broad, which makes the system easy to tune.
You can control:
- moving average windows
- RSI thresholds and period
- ATR thresholds and period
- Bollinger Band period
- stochastic oscillator parameters
- MACD fast/slow/signal periods
- signal component weights
- buy/sell cutoffs
- cache expiry
- log level and log location
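Settings like these are typically read with the standard-library `configparser`. The sketch below parses an inline INI string so that it runs without touching the file system; every section and key name in it is hypothetical and should be checked against config/example.config.ini before use.

```python
import configparser

# Hypothetical layout; the real example.config.ini may name things differently.
EXAMPLE_INI = """
[weights]
technical = 0.4
sentiment = 0.1
fundamental = 0.2
multifactor = 0.3

[rsi]
period = 14
overbought = 70
oversold = 30
"""

parser = configparser.ConfigParser()
parser.read_string(EXAMPLE_INI)

# Typed getters convert the raw strings stored in the INI file.
weights = {name: parser.getfloat("weights", name) for name in parser.options("weights")}
rsi_period = parser.getint("rsi", "period")

# `fallback` supplies a default when a section or key is absent.
cache_expiry_hours = parser.getint("cache", "expiry_hours", fallback=24)
```

Validating that the component weights sum to 1.0 at load time is a cheap guard against a mistyped config silently skewing every signal.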
This repository already has the bones of a much bigger system:
- a modular analytics core,
- reusable workflow interfaces,
- a scoring framework instead of hard-coded one-off decisions,
- simulation support,
- index-level batch processing,
- local caching and logging,
- a clean path toward APIs, dashboards, backtests, and scheduled jobs.
In other words: this is not just a collection of notebooks. It is the early-stage operating system for a serious stock research platform.
Natural next steps for the project:
- add a pinned requirements.txt or pyproject.toml
- add unit tests around signal math and factor scoring
- persist historical scan results for time-series comparison
- add benchmark-aware backtesting
- add portfolio allocation logic on top of single-name signals
- build a lightweight dashboard for ranked ideas and simulation summaries
- add scheduler support for recurring scans
- export richer CSV and JSON artifacts
- This project is best described as a research and experimentation engine, not a production trading system.
- Sentiment analysis requires an OpenAI API key and adds latency and cost.
- External data quality depends on Yahoo Finance, Google News RSS, Reddit, and page parsing stability.
- Financial signals are heuristic and should be validated before capital is allocated.
This repository is for research, education, and experimentation. It is not financial advice, not a guarantee of performance, and not a substitute for proper risk management.
If you want a codebase that can move from “interesting stock idea” to “ranked, explainable, testable signal,” this one is already pointed in the right direction.